1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2025-04-23 11:58:50 +02:00

60 Commits

Author SHA1 Message Date
David Steele
85fc3da4c3
Update CipherType/CipherMode to StringId.
As in 6cc521b, this allows option values and enums to be easily mapped together.
2021-04-28 11:36:20 -04:00
David Steele
aa72c19a83
Do not write files atomically or sync paths during backup copy.
There is no need to write the file atomically (e.g. via a temp file on Posix) because checksums are tested on resume after a failed backup. The path does not need be synced for each file because all paths are synced at the end of the backup.

This functionality was not lost during the migration -- it never existed in the Perl code, though these settings are used in restore. See 59f1353 where backupFile() was migrated to C.
2021-04-23 12:33:25 -04:00
David Steele
ed0d48f52c Add StringId type.
It is often useful to represent identifiers as strings when they cannot easily be represented as an enum/integer, e.g. because they are distributed among a number of unrelated modules or need to be passed to remote processes. Strings are also more helpful in debugging since they can be recognized without cross-referencing the source. However, strings are awkward to work with in C since they cannot be directly used in switch statements leading to less efficient if-else structures.

A StringId encodes a short string into an integer so it can be used in switch statements but may also be readily converted back into a string for debugging purposes. StringIds may also be suitable for matching user input providing the strings are short enough.

This patch includes a sample of StringId usage by converting protocol commands to StringIds. There are many other possible use cases. To list a few:

* All "types" in storage, filters. IO , etc. These types are primarily for identification and debugging so they fit well with this model.

* MemContext names would work well as StringIds since these are entirely for debugging.

* Option values could be represented as StringIds which would mean we could remove the functions that convert strings to enums, e.g. CipherType.

* There are a number of places where enums need to be converted back to strings for logging/debugging purposes. An example is protocolParallelJobToConstZ. If ProtocolParallelJobState were defined as:

typedef enum
{
    protocolParallelJobStatePending = STRID5("pend", ...),
    protocolParallelJobStateRunning = STRID5("run", ...),
    protocolParallelJobStateDone = STRID5("done", ...),
} ProtocolParallelJobState;

then protocolParallelJobToConstZ() could be replaced with strIdToZ(). This also applies to many enums that we don't covert to strings for logging, such as CipherMode.

As an example of usage, convert all protocol commands from strings to StringIds.
2021-04-20 15:22:42 -04:00
David Steele
442b2e41b1 Refactor info modules with inline getters/setters.
Extend the pattern introduced in 79a2d02c to the info modules.
2021-04-09 13:48:40 -04:00
David Steele
b715c70b46 Refactor storage modules with inline getters/setters.
Extended the pattern introduced in 79a2d02c to the storage modules: Storage, StorageRead, StorageWrite.
2021-04-07 14:04:38 -04:00
David Steele
2016fac0d9
Improve protocol handlers.
Make protocol handlers have one function per command. This allows the logic of finding the handler to be in ProtocolServer, isolates each command to a function, and removes the need to test the "not found" condition for each handler.
2021-03-16 13:09:34 -04:00
David Steele
ec347847e5 Pass cipher type directly to backupFileProtocol().
Cipher type was inferred from the presence of cipherSubPass rather than being passed explicitly in order to maintain compatibility with Perl backupFile().

Now that Perl is gone it makes sense to pass it explicitly, as we do elsewhere.
2021-03-12 15:19:32 -05:00
David Steele
e07040c2e4 Add HRN_STORAGE_TIME() harness macro.
Makes updating the time of a path/file more streamlined in tests.

Also update all tests where utime() was being used directly.
2021-03-12 12:54:34 -05:00
David Steele
3c85a497a6 Add backup delta unit test.
This test was added to take the place of another test, which turned out not to be workable.

Even so, it adds coverages at little cost so it seems worth keeping.
2021-03-11 14:40:14 -05:00
David Steele
dc1052f1da Remove extra spaces before macro continuation. 2021-03-11 14:11:21 -05:00
David Steele
c862e9654a
Log archive copy during backup.
Copying can be a fairly expensive operation so it makes sense to log it so the user gets some status during long copy operations.
2021-03-11 08:22:44 -05:00
David Steele
28301199eb Rename FUNCTION_HARNESS_RESULT*() macros to FUNCTION_HARNESS_RETURN*().
When the FUNCTION_*_RESULT*() macros were renamed to FUNCTION_*_RETURN_*() in the core code the test harness macros were missed.

Update them to make the naming consistent.
2021-03-10 18:42:22 -05:00
Cynthia Shang
13dc8e68d7 Make --repo optional for backup command.
If there are multiple repos and the --repo option is not specified then backup will automatically select the highest priority repo.
2021-02-26 14:49:50 -05:00
David Steele
456a300bb7 Remove too-verbose braces in switch statements.
The original intention was to enclose complex code in braces but somehow braces got propagated almost everywhere.

Document the standard for braces in switch statements and update the code to reflect the standard.
2021-01-26 12:10:24 -05:00
David Steele
f95850c546 Fix logical -> bitwise boolean operator in backup unit test.
This unset more than the storageFeatureCompress flag but the test was not affected.

Found on MacOS M1.
2021-01-24 08:12:31 -05:00
Cynthia Shang
f32eb9b94e
Partial multi-repository implementation.
Multi-repository implementations for the archive-push, check, info, stanza-create, stanza-upgrade, and stanza-delete commands.

Multi-repo configuration is disabled so there should be no behavioral changes between these commands and their current single-repo implementations.

Multi-repo documentation and integration tests are still in the multi-repo development branch. All unit tests work as multi-repo since they are able to bypass the configuration restrictions.
2021-01-21 15:21:50 -05:00
David Steele
aeee83044d
Fix resume after partial delete of backup by prior resume.
If files other than backup.manifest.copy were left in a backup path by a prior resume then the next resume would skip the backup rather than removing it. Since the backup path still existed, it would be found during backup label generation and cause an error if it appeared to be later than the new backup label. This occurred if the skipped backup was full.

The error was only likely on object stores such as S3 because of the order of file deletion. Posix file systems delete from the bottom up because directories containing files cannot be deleted. Object stores do not have directories so files are deleted in whatever order they are provided by the list command. However, the issue can be reproduced on a Posix file system by manually deleting backup.manifest.copy from a resumable backup path.

Fix the issue by removing the resumable backup if it has no manifest files. Also add a new warning message for this condition.

Note that this issue could be resolved by running expire or a new full backup.
2021-01-12 12:38:32 -05:00
David Steele
87996558d2
Replace double type with time in config module.
The C code does not use doubles to represent seconds like the Perl code did so time can be represented as an integer which reduces the number of data types that config has to understand.

Also remove Variant doubles since they are no longer used.

Note that not all double code was removed since we still need to display times to the user in seconds and it is possible for the times to be fractional. In the future this will likely be simplified by storing the original user input and using that value when the time needs to be displayed.
2020-12-09 08:59:51 -05:00
David Steele
d4211d3aaf Add retries to PostgreSQL sleep when starting a backup.
Inaccuracies in sleep time or clock skew might make a single sleep insufficient to reach the next second.

Add a few retries to make the process more reliable but still avoid an infinite loop if something is seriously wrong.
2020-12-02 22:41:14 -05:00
David Steele
63c12d2541 Fix spacing and typos in backup/manifest modules. 2020-11-09 16:26:43 -05:00
David Steele
d452e9cc38
Use zero-based indexes when referring to option indexes.
There were a number of places in the code where "hostId" was used, but hostId is just the option group index + 1 so this led to a lot of +1 and -1 to convert the id to an index and vice versa.

Instead just use the zero based index wherever possible. This is pretty much everywhere except when the host-id option is read or set, or where a message is being formatted for the user.

Also fix a bug in protocolRemoteParam() where remotes spawned from the main process could get process ids that were not 0. Only the locals should spawn remotes with process id > 0. This seems to have been harmless since the process id is only a label, but it could be confusing when debugging.
2020-10-26 10:25:16 -04:00
David Steele
76cfd8ca70
Allow [, #, and space as the first character in database names.
iniLoad() was trimming lines which meant that a leading space would not pass checksum validation when a manifest was reloaded. Remove the trims since files we write should never contain extraneous spaces. This further diverges the format for the functions that read conf files (e.g. pgbackrest.conf) and those that read info (e.g. manifest) files.

While we are at it also allow [ and # as initial characters. # was reserved for comments but we never put comments into info files. [ denotes a section but we can get around this by never allowing arrays as values in info files, so if a line ends in ] it must be a section. This is currently the case but enforce it by adding an assert to info/info.c.
2020-10-24 11:07:07 -04:00
David Steele
7d069a2b91 Remove indexed option constants.
These constants don't scale well as the index total is increased for an option.

The core code rarely uses these options and they are easily replaced with cfgOptionName().

The tests had started to make use of the constants, so provide functions that build the option name from the optionId and, optionally, the optionKey.
2020-10-19 14:03:48 -04:00
David Steele
927d9adbee
Improve PostgreSQL version identification.
Previously, catalog versions were fixed for all versions which made maintaining the catalog versions during PostgreSQL beta and release candidate cycles very painful. A version of pgBackRest which was functionally compatible was rendered useless by a catalog version bump in PostgreSQL.

Instead use only the control version to identify a PostgreSQL version when possible. Some older versions require a catalog version to positively identify a PostgreSQL version, so include them when required.

Since the catalog number is required to work with tablespaces it will need to be stored. There's already a copy of it in backup.info so use that (even though we have been ignoring it in the C versions).
2020-09-18 16:55:26 -04:00
David Christensen
9fd31913a8 Add hint about starting the stanza when WAL segment not found.
If a stop command has been issued the check command fails due to archiving timing out.

Provide a hint to document this situation and point the user in the proper direction.
2020-09-03 07:49:49 -04:00
David Steele
3e9dce0d76 Rename strPtr()/strPtrNull() to strZ()/strZNull().
We use the Z suffix in many functions to indicate that we are expecting a zero-terminated string so make this function conform to the pattern.

As a bonus the new name is a bit shorter, which is a good quality in a commonly-used function.
2020-07-30 07:49:06 -04:00
David Steele
3f0b41eb9c Add support for testing on 64-bit big-endian architectures.
In particular add support for s390x but we hope this will work for other 64-bit big-endian architectures.

Run basic unit tests on Travis CI for 390x.
2020-07-20 09:59:16 -04:00
David Steele
55277357b8 Reduce reliance on static checksums in unit tests.
Testing against static checksums is valuable but it can be become burdensome when supporting multiple architectures.

Reduce the number of tests we are doing against static checksums when the architecture can cause the checksum to vary.
2020-07-20 09:47:43 -04:00
David Steele
620a8d17cf
Automatic retry for backup, restore, archive-get, and archive-push.
If a local command, e.g. backupFile(), fails it will stop the entire process. Instead, retry local commands to deal with transient errors.

Remove special logic in the S3 storage driver to retry RequestTimeTooSkewed errors since this is now handled by the general retry mechanism in the places where it is most likely to happen, i.e. file read/write. Also, this error should have been entirely eliminated by the asynchronous TLS implementation.
2020-07-14 15:05:31 -04:00
David Steele
ea04ec7b3f Disable query parallelism in PostgreSQL sessions used for backup control.
There is no need to have parallelism enabled in a backup control session. In particular, 9.6 marks pg_stop_backup() as parallel-safe but an error will be thrown if pg_stop_backup() is run in a worker.
2020-06-25 08:02:48 -04:00
David Steele
f55cb386d4 Fix versions passed to HRNPQ_MACRO_OPEN_GE_92() test macro.
These were not noticed because currently 9.3 and 9.6 behave the same on open.
2020-06-24 18:33:20 -04:00
David Steele
45d9b03136
Add strCatZ().
strCat() did not follow our convention of appending Z to functions that accept zero-terminated strings rather than String objects.

Add strCatZ() to accept zero-terminated strings and update strCat() to accept String objects.

Use LF_STR where appropriate but don't use other String constants because they do not improve readability.
2020-06-24 12:09:24 -04:00
David Steele
3d74ec1190
Use PostgreSQL instead of postmaster where appropriate.
Using postmaster in messages was not very helpful since users rarely interact directly with the postmaster. Using PostgreSQL instead seems clearer.
2020-06-17 15:14:59 -04:00
David Steele
11c192f30e
Add hint when checksum delta is enabled after a timeline switch.
This warning is normal when restoring a backup or promoting a standby so add a hint to make that clear.
2020-06-16 13:20:01 -04:00
David Steele
fe829af4ec Remove exclamations from test data.
Three exclamations are commonly used to mark areas of the code that need attention before commit so having them in a test is distracting.
2020-05-28 10:27:45 -04:00
David Steele
688ec2a8f5 Use an extension to denote vendorized code.
Vendorized code is copied from another project when a library is not available and a git subproject won't work. Currently all the vendorized code is copied from PostgreSQL but it makes sense to have a more general mechanism for indicating vendorized code.

The .vendor extension will be used to denote vendorized code in the same way that .auto is used to denote auto-generated code.
2020-05-18 19:11:26 -04:00
David Steele
99405cbb15 Replace booleans with enums in compressType parameters.
This was an oversight in 438b957f which added multiple compression type support. The booleans were interpreted as none and gz which works fine for the CompressType enum until the position of gz or none changes.
2020-05-05 13:23:36 -04:00
David Steele
22ba1f02ce Convert storagePosixNew() to storagePosixNewP().
An upcoming feature requires new parameters for storagePosixNew() and this causes a lot of churn because almost every test creates a Posix storage object. Some refactoring in the tests might reduce this duplication but storagePosixNew() is collecting a lot of parameters so converting to storagePosixNewP() makes sense in any case.

There are relatively few call sites in the core code but they still benefit from better readability after this change.
2020-04-30 11:01:38 -04:00
David Steele
2e6938fad9 Restore works when PGDATA is a link.
Make the restore clean process look more like manifest build, i.e. do cleanup of each target root directory outside the main cleanup callback. This means some code duplication but removes the logic handling "dot" paths.

Add tests for both restore and backup (which already worked but was not tested).
2020-04-21 17:55:36 -04:00
David Steele
e5e81d3839 Only limit backup copy size for WAL-logged files.
The prior behavior introduced in dcddf3a5 could possibly lead to postgresql.conf or postgresql.auto.conf being truncated in the backup since they are copied via tmp files and could change size during the backup.

In general it seems safer to limit this feature to WAL-logged files which will be reconstructed during recovery.
2020-04-16 14:48:16 -04:00
David Steele
dcddf3a58b Limit backup file copy size to size reported at backup start.
If a file grows during the backup it will be reconstructed by WAL replay during recovery so there is no need to copy the additional data.

This also reduces the likelihood of seeing torn pages during the copy. Torn pages can still occur in the middle of the file, though, so they must be handled.
2020-03-19 13:16:05 -04:00
Cynthia Shang
73315268fd Fix typo. 2020-03-19 12:11:20 -04:00
David Steele
26c89b2c8c Improve testing of files that change size during the backup.
Files can change size during a backup so update and add tests to cover the various scenarios more thoroughly.
2020-03-18 13:40:16 -04:00
David Steele
4ec04e5163 Added redacted manifest to testBackupValidate().
The manifest is excellent for validation but including the entire manifest is too noisy and some values are architecture/algorithm dependent.

Output a redacted version that contains the most important information which can be improved on over time.
2020-03-18 10:10:10 -04:00
David Steele
307e741298 Test that shrunk file is backed up correctly.
It's possible, though rare, for a file to shrink during a backup.

There was no issue with the code but having a test is always a good idea.
2020-03-17 16:01:17 -04:00
David Steele
438b957f9c Add infrastructure for multiple compression type support.
Add compress-type option and deprecate compress option. Since the compress option is boolean it won't work with multiple compression types. Add logic to cfgLoadUpdateOption() to update compress-type if it is not set directly. The compress option should no longer be referenced outside the cfgLoadUpdateOption() function.

Add common/compress/helper module to contain interface functions that work with multiple compression types. Code outside this module should no longer call specific compression drivers, though it may be OK to reference a specific compression type using the new interface (e.g., saving backup history files in gz format).

Unit tests only test compression using the gz format because other formats may not be available in all builds. It is the job of integration tests to exercise all compression types.

Additional compression types will be added in future commits.
2020-03-06 14:41:03 -05:00
David Steele
4ab8943ca8 Use PG_PAGE_SIZE_DEFAULT constant instead of pageSize variable.
Page size is passed around a lot but in fact it can only have one value, PG_PAGE_SIZE_DEFAULT, which is checked when pg_control is loaded. There may be an argument for supporting multiple page sizes in the future but for now just use the constant to simplify the code.

There is also a significant performance benefit.  Because pageSize was being used in pageChecksumBlock() the main loop was neither unrolled nor vectorized (-funroll-loops -ftree-vectorize) as it is now with a constant loop boundary.
2020-03-05 09:14:27 -05:00
David Steele
9d48882268 Centralize PostgreSQL page header data structures.
These data structures were copied a few places (but only once in the core code) so put them in a place where everyone can use them.

To do this create a new file, static.auto.h, to contain data types and macros that have stayed the same through all the versions of PostgreSQL that we support.  This allows us to have single, non-versioned set of headers and code for stable data structures like page headers.

Migrate a few types from version.auto.h that are required for page header structures and pull the remaining types from PostgreSQL directly.

We had previously renamed xlog to wal so update those where required since we won't be modifying the PostgreSQL names anymore.
2020-03-04 13:31:27 -05:00
David Steele
3f77a83e73 Remove raw option for gz compression.
This was a minor optimization used in protocol layer compression.  Even though it was slightly faster, it omitted the crc-32 that is generated during normal compression which could lead to corrupt data after a bad network transmission.  This would be caught on restore by our checksum but it seems better to catch an issue like this early.

The raw option also made the function signature different than future compression formats which may not support raw, or require different code to support raw.

In general, it doesn't seem worth the extra testing to support a format that has minimal benefit and is seldom used, since protocol compression is only enabled when the transmitted data is uncompressed.
2020-02-27 12:19:40 -05:00
David Steele
ee351682da Rename "gzip" to "gz".
"gz" was used as the extension but "gzip" was generally used for function and type naming.

With a new compression format on the way, it makes sense to standardize on a single abbreviation to represent a compression format in the code.  Since the extension is standard and we must use it, also use the extension for all naming.
2020-02-27 12:09:05 -05:00