1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00
Commit Graph

1418 Commits

Author SHA1 Message Date
David Steele
dcddf3a58b Limit backup file copy size to size reported at backup start.
If a file grows during the backup it will be reconstructed by WAL replay during recovery so there is no need to copy the additional data.

This also reduces the likelihood of seeing torn pages during the copy. Torn pages can still occur in the middle of the file, though, so they must be handled.
2020-03-19 13:16:05 -04:00
Cynthia Shang
73315268fd Fix typo. 2020-03-19 12:11:20 -04:00
David Steele
d677b07081 Move coverage code to CoverageTest module.
This code needs some work, which will be easier if it is all in one place.
2020-03-19 12:07:51 -04:00
David Steele
26c89b2c8c Improve testing of files that change size during the backup.
Files can change size during a backup so update and add tests to cover the various scenarios more thoroughly.
2020-03-18 13:40:16 -04:00
David Steele
4ec04e5163 Added redacted manifest to testBackupValidate().
The manifest is excellent for validation but including the entire manifest is too noisy and some values are architecture/algorithm dependent.

Output a redacted version that contains the most important information which can be improved on over time.
2020-03-18 10:10:10 -04:00
David Steele
b8cd1b6790 Add TEST_RESULT_STR_Z_KEYRPL() test macro.
This macro will automatically do key replacement before the comparison. This saves the indentation required for an embedded function call.

Possibly TEST_RESULT_Z_KEYRPL() would also be useful but it will be added when needed.
2020-03-18 10:05:08 -04:00
David Steele
f2548f45ce Allow storage reads to be limited by bytes.
The current use case is reading files from the PostgreSQL cluster during backup.

A file may grow during backup but we only need to copy the number of bytes that were reported during the manifest build.  The rest will be rebuilt from the WAL during recovery so copying more is just a waste of space.

Limiting the copy sizes in backup will be part of a future commit.
2020-03-17 18:16:17 -04:00
David Steele
307e741298 Test that shrunk file is backed up correctly.
It's possible, though rare, for a file to shrink during a backup.

There was no issue with the code but having a test is always a good idea.
2020-03-17 16:01:17 -04:00
David Steele
9a47b88da3 Add links to custom coverage report.
When multiple files were missing coverage it could be hard to locate the coverage report for a specific file.

Add links for uncovered files to make this easier.

Also move table titles out of the table so they are valid html.
2020-03-16 20:02:36 -04:00
David Steele
f7dac144a6 Reduce variables extern'd by the common/log module in debug builds.
These days it is better to include the module in define.yaml when we need to poke at the internal implementation.

This doesn't quite work for the log test harness, so for now some variables will need to remain extern'd in debug builds.
2020-03-16 18:16:27 -04:00
David Steele
3fbfcba811 Forbid access to /tmp/pgbackrest in the Vagrantfile.
This matches the error that will be thrown in the vm=none test on Travis CI if a unit test writes to /tmp/pgbackrest.
2020-03-16 17:27:01 -04:00
David Steele
46911c64c1 Make storage and logging dry-run aware.
Enhance dry-run support added in 2fa69af8 by forbidding writes in the storage layer and adding prefixes to log messages.

The former will protect against mistakes in dry-run implementations and the latter will make it clear when a command was executed in dry-run mode.

Update expire unit tests with the new log prefix.
2020-03-16 17:24:21 -04:00
Cynthia Shang
2fa69af8da Add --dry-run option to the expire command.
Use dry-run to see which backups/archive would be removed by the expire command without actually removing anything.
2020-03-16 13:56:52 -04:00
David Steele
4328bc1ac6 Move raw coverage results to test/result/raw path.
These results were stored in the vagrant path along with a full copy of src.

Instead store the raw coverage data in test/result/raw and change source references to the files that already exist in [test-path]/repo.
2020-03-16 08:41:32 -04:00
David Steele
d702249507 Build binaries in the test path rather than the vagrant path.
It makes more sense to build in the test path since many developers won't have a vagrant path. Anyway, it's better not to modify the vagrant path since it belongs to vagrant.

Instead of installing the binary just mount it into the container from where it was built. This saves a bit of time and space.
2020-03-15 10:09:27 -04:00
David Steele
19d975346b Improve stability of command/check test module.
When pgbackrest was present this test behaved unexpectedly.

While the binary is not currently required for this test is might be in the future so fix the test to prevent a regression.
2020-03-15 09:59:22 -04:00
David Steele
959dce569b Update code classification and remove XS definition. 2020-03-14 18:30:24 -04:00
David Steele
213cc6e8be Move docker files to test/result. 2020-03-14 15:40:37 -04:00
David Steele
6827e248cd Move coverage results to test/result. 2020-03-14 15:29:42 -04:00
David Steele
75ff25f17f Move profile results to test/result. 2020-03-14 14:50:36 -04:00
David Steele
0f7fe55f72 Build packages on demand only and change build path.
Building packages is not a normal part of development so don't build packages by default. Instead build them in CI as needed.

Do the builds in test/result instead of .vagrant to be friendlier with hosts that are not running vagrant. Anyway, it's probably not a good idea to be creating files in the .vagrant path.
2020-03-14 14:35:09 -04:00
David Steele
5645c91ed5 Add comments to test/.gitignore. 2020-03-14 14:18:22 -04:00
David Steele
4cd060b7fe Generate src/build/aclocal.m4 automatically.
This file is required when macros from the autoconf archive are used in configure.ac
2020-03-14 12:48:08 -04:00
David Steele
9e80c5710e Use a checksum to build configure.ac more efficiently.
Building the configure.ac script can take multiple seconds depending on the state of the autoconf cache. Use a checksum to only rebuild when configure.ac has changed no matter how the timestamps have changed.
2020-03-14 12:39:29 -04:00
David Steele
748f9502eb Remove obsolete ignore. 2020-03-14 10:04:49 -04:00
David Steele
237a3da4d6 Configure and make improvements.
Configure:

* Use standard make variables, e.g. CFLAGS, rather than our own, e.g. CINCLUDE
* Add PG_CONFIG var for configuring custom pg_config location
* Don't error if xml_config or pg_config is missing (but error if libs/headers not found)
* Check for zlib.h header
* Check for lz4frame.h header when liblz4 is present

Make:

* Use gcc-style auto dependencies
* Put src list at the top since it is most frequently modified
* Add clean-all target to also remove auto-generated config files
2020-03-13 09:07:57 -04:00
David Steele
838ef4eca1 Move configure.ac to src/build.
This file is used to generate src/configure and is not required to make pgbackrest since src/configure is updated before distribution.

Move to src/build so it is out of the way.
2020-03-12 09:34:52 -04:00
David Steele
2ac9c19d4a Fix misleading comment. 2020-03-12 09:28:16 -04:00
David Steele
181fa1fc8b Detect changes in reference.xml for code auto-generation.
Changes to reference.xml can affect the command-line documentation built into the binary so changes must trigger an auto-generated code build during smart builds.
2020-03-12 09:27:44 -04:00
David Steele
0ba8062f5f Get package source files dynamically during package build.
The prior method was to build a special container to hold these files which meant they would get stale on development systems.  On CI the container was always rebuilt so failures would be seen there even when dev seemed to be working.

Instead get the package source when the package is built to ensure it is as up-to-date as possible.

This change was prompted by failures on the Ubuntu 12.04 container while getting the package source, probably due to an ancient version of git.  Package builds are no longer supported on that platform with the addition of lz4 compression so it didn't seem worth fixing.
2020-03-12 08:48:45 -04:00
David Steele
4a5bd002c0 Move pgBackRest::Version module to pgBackRestDoc::ProjectInfo.
The primary source for project info is now src/version.h.

The pgBackRestDoc::ProjectInfo module loads the project info from src/version.h at runtime so there is no need to update it.
2020-03-10 17:57:02 -04:00
David Steele
731b862e6f Rename BackRestDoc Perl module to pgBackRestDoc.
This is consistent with the way BackRest and BackRest test were renamed way back in 18fd2523.

More modules will be moving to pgBackRestDoc soon so renaming now reduces churn later.
2020-03-10 15:41:56 -04:00
David Steele
36d4ab9bff Move Perl modules out of lib directory.
This directory was once the home of the production Perl code but since f0ef73db this is no longer true.

Move the modules to test in most cases, except where the module is expected to be useful for the doc engine beyond the expected lifetime of the Perl test code (about a year if all goes well).

The exception is pgBackRest::Version which requires more work to migrate since it is used to track pgBackRest versions.
2020-03-10 15:12:44 -04:00
David Steele
c279a00279 Add lz4 compression support.
LZ4 compresses data faster than gzip but at a lower ratio.  This can be a good tradeoff in certain scenarios.

Note that setting compress-type=lz4 will make new backups and archive incompatible (unrestorable) with prior versions of pgBackRest.
2020-03-10 14:45:27 -04:00
David Steele
79cfd3aebf Remove LibC.
This was the interface between Perl and C introduced in 36a5349b but since f0ef73db has only been used by the Perl integration tests.  This is expensive code to maintain just for testing.

The main dependency was the interface to storage, no matter where it was located, e.g. S3.  Replace this with the new-introduced repo commands (d3c83453) that allow access to repo storage via the command line.

The other dependency was on various cfgOption* functions and CFGOPT_ constants that were convenient but not necessary.  Replace these with hard-coded strings in most places and create new constants for commonly used values.

Remove all auto-generated Perl code.  This means that the error list will no longer be maintained automatically so copy used errors to Common::Exception.pm.  This file will need to be maintained manually going forward but there is not likely to be much churn as the Perl integration tests are being retired.

Update test.pl and related code to remove LibC builds.

Ding, dong, LibC is dead.
2020-03-09 17:41:59 -04:00
David Steele
d3c83453de Add repo-create, repo-get, repo-put, and repo-rm commands.
These commands are generally useful but more importantly they allow removing LibC by providing the Perl integration tests an alternate way to work with repository storage.

All the commands are currently internal only and should not be used on production repositories.
2020-03-09 17:15:03 -04:00
David Steele
948835fb84 Update repo-ls command to work better with files.
If the command was passed a file it would return no results since it was originally intended to list files when passed a path.

However, as a general purpose command working directly with files makes sense.
2020-03-09 16:54:07 -04:00
David Steele
5e1291a29f Rename ls command to repo-ls.
This command only makes sense for the repository storage since other storage (e.g. pg and spool) must be located on a local Posix filesystem and can be listed using standard unix commands.  Since the repo storage can be located lots of places having a common way to list it makes sense.

Prefix with repo- to make the scope of this command clear.

Update documentation to reflect this change.
2020-03-09 16:41:04 -04:00
David Steele
f581edfa50 Remove valgrind suppressions made obsolete by f0ef73db. 2020-03-09 13:36:46 -04:00
David Steele
3c4f91b319 Remove Perl unit tests made obsolete in 434cd832.
These were replaced by C unit tests but not all the unit test setup code was removed in the Perl module.
2020-03-09 13:35:26 -04:00
David Steele
54bc3b454a Cleanup pgPageChecksum() test in postgres/interface module.
Some of the comments were wrong or inconsistent.

Update TEST_RESULT_U16_HEX() to the less-specific TEST_RESULT_UINT_HEX().
2020-03-06 15:01:50 -05:00
David Steele
438b957f9c Add infrastructure for multiple compression type support.
Add compress-type option and deprecate compress option. Since the compress option is boolean it won't work with multiple compression types. Add logic to cfgLoadUpdateOption() to update compress-type if it is not set directly. The compress option should no longer be referenced outside the cfgLoadUpdateOption() function.

Add common/compress/helper module to contain interface functions that work with multiple compression types. Code outside this module should no longer call specific compression drivers, though it may be OK to reference a specific compression type using the new interface (e.g., saving backup history files in gz format).

Unit tests only test compression using the gz format because other formats may not be available in all builds. It is the job of integration tests to exercise all compression types.

Additional compression types will be added in future commits.
2020-03-06 14:41:03 -05:00
David Steele
02aa03d1a2 Remove obsolete methods in pgBackRest::Storage::Storage module.
All the methods in this module will need to be implemented via the command-line in order to get rid of LibC, so the first step is to reduce the code in the module as much as possible.

First remove storageDb() and use storageTest() instead.  Then create storageTest() using pgBackRestTest::Common::Storage which has no dependencies on LibC.  Now the only storage using the LibC interface is storageRepo().

Remove all link functions since those operations cannot be performed on a repo unless it is Posix, in which case the LibC interface is not needed.  Same for owner().

Remove pathSync() because syncs are not required in the tests.  No test data is reused after a crash.

Path create/exists functions should never be explicitly performed on a repo so remove those.  File exists can be implemented by calling info() instead.

Remove encryption detection functions which were only used by Backup/Archive::Info reconstruct() which are now obsolete.

Remove all filters except pgBackRest::Storage::Filter::CipherBlock since they are not being used.  That also means there are no filters returning results so remove all the result code.

Move hashSize() and pathAbsolute() into pgBackRest::Storage::Base where they can be shared between pgBackRest::Storage::Storage and pgBackRestTest::Common::Storage.
2020-03-06 14:10:09 -05:00
David Steele
00647c7109 Remove Perl Db module and LibC dependencies.
This was mostly dead code except the DB_BACKUP_ADVISORY_LOCK constant, moved to the real/all test module, and the function that pulls info from pg_control, moved to ExpireEnvTest.pm.
2020-03-06 07:21:17 -05:00
David Steele
2e0fe25650 Remove dependency on LibC hash filter.
Perl provides Digest::SHA for hashing so there is no need to expose this via LibC anymore.
2020-03-05 18:34:59 -05:00
David Steele
e55443c890 Move logic from postgres/pageChecksum to command/backup/pageChecksum().
The postgres/pageChecksum module was designed as an interface to the C structs for the Perl code.  The new C code can do this directly so no need for an interface.

Move the remaining test for pgPageChecksum() into the postgres/interface test module.
2020-03-05 16:12:54 -05:00
David Steele
3796b74dca Use stock PostgreSQL page checksum implementation.
We were using a customized version which worked fine but was hard to merge with upstream changes.  Now this code is maintained much like the types in static.auto.h that we copy and check with each release.

The goal is to eventually build directly against PostgreSQL (either source or libcommon) and this brings us one step closer.
2020-03-05 14:23:01 -05:00
David Steele
1b647a1a22 Remove invalid page checksum test.
All zero pages should not have checksums.  Not only is this test invalid but it will not work with the stock page checksum implementation in PostgreSQL, which checks for zero pages.  Since we will be using that code verbatim soon this test needs to go.
2020-03-05 14:06:36 -05:00
David Steele
eb4347f20b Use static checksums in mock/all integration tests.
Using static values serves as a better cross-check against the page checksum code. The downside is that these checksums may not work with some big endian systems but in that case neither will the unit tests.

We can also remove the page checksum interface from LibC which brings us one step closer to eliminating it.
2020-03-05 13:56:20 -05:00
David Steele
4ab8943ca8 Use PG_PAGE_SIZE_DEFAULT constant instead of pageSize variable.
Page size is passed around a lot but in fact it can only have one value, PG_PAGE_SIZE_DEFAULT, which is checked when pg_control is loaded. There may be an argument for supporting multiple page sizes in the future but for now just use the constant to simplify the code.

There is also a significant performance benefit.  Because pageSize was being used in pageChecksumBlock() the main loop was neither unrolled nor vectorized (-funroll-loops -ftree-vectorize) as it is now with a constant loop boundary.
2020-03-05 09:14:27 -05:00
David Steele
91f321fb86 Rename old page*() functions to conform to new conventions.
The general convention now is to prefix PostgreSQL functions with "pg".
2020-03-04 14:24:40 -05:00
David Steele
a86253f112 Remove obsolete function pageChecksumBufferTest().
This function made validation faster in Perl because fewer calls (and buffer transformations) were required when all checksums were valid.

In C calling pageChecksumTest() directly is just as efficient so there is no longer a need for pageChecksumBufferTest().
2020-03-04 14:12:02 -05:00
David Steele
9d48882268 Centralize PostgreSQL page header data structures.
These data structures were copied a few places (but only once in the core code) so put them in a place where everyone can use them.

To do this create a new file, static.auto.h, to contain data types and macros that have stayed the same through all the versions of PostgreSQL that we support.  This allows us to have single, non-versioned set of headers and code for stable data structures like page headers.

Migrate a few types from version.auto.h that are required for page header structures and pull the remaining types from PostgreSQL directly.

We had previously renamed xlog to wal so update those where required since we won't be modifying the PostgreSQL names anymore.
2020-03-04 13:31:27 -05:00
David Steele
8ec41efb04 Improve poor man's regular expression common prefix generator.
The S3 driver depends on being able to generate a common prefix to limit the number of results from list commands, which saves on bandwidth.

The prior implementation could be tricked by an expression like ^ABC|^DEF where there is more than one possible prefix.  To fix this disallow any prefix when another ^ anchor is found in the expression.  [^ and \^ are OK since they are not anchors.

Note that this was not an active bug because there are currently no expressions with multiple ^ anchors.
2020-02-28 17:41:34 -05:00
Cynthia Shang
ceb050e950 Fix flapping test in real/all module.
The restore test function was passing strBackup to the restoreCompare function but when the restore is expected to pick a backup based on a timestamp, then strBackup may not be the one chosen.

Modified the code so that strBackupExpected is set based on the parameters passed to the function and this is then passed to restoreCompare.
2020-02-28 14:50:50 -05:00
David Steele
7d8c0d29fb Remove compress option from config tests.
This option was used for boolean testing but it will soon be deprecated and the semantics changed.  To reduce churn it seems easiest to just use other options for testing.  This will also be helpful when the option is eventually removed.
2020-02-27 14:51:40 -05:00
David Steele
dbf6255ab8 Remove compress/compress-level options from commands where unused.
These commands (e.g. restore, archive-get) never used the compress options but allowed them to be passed on the command line. Now they will error when these options are passed on the command line. If these errors occur then remove the unused options.
2020-02-27 12:25:32 -05:00
David Steele
3f77a83e73 Remove raw option for gz compression.
This was a minor optimization used in protocol layer compression.  Even though it was slightly faster, it omitted the crc-32 that is generated during normal compression which could lead to corrupt data after a bad network transmission.  This would be caught on restore by our checksum but it seems better to catch an issue like this early.

The raw option also made the function signature different than future compression formats which may not support raw, or require different code to support raw.

In general, it doesn't seem worth the extra testing to support a format that has minimal benefit and is seldom used, since protocol compression is only enabled when the transmitted data is uncompressed.
2020-02-27 12:19:40 -05:00
David Steele
ee351682da Rename "gzip" to "gz".
"gz" was used as the extension but "gzip" was generally used for function and type naming.

With a new compression format on the way, it makes sense to standardize on a single abbreviation to represent a compression format in the code.  Since the extension is standard and we must use it, also use the extension for all naming.
2020-02-27 12:09:05 -05:00
David Steele
5afd950ed9 Improve performance of MEM_CONTEXT*() macros.
The prior code used TRY...CATCH blocks to cleanup mem contexts when an error occurred. This included freeing new mem contexts that were still being initialized when the error occurred and ensuring that the prior memory context was restored.

This worked fine in production but it involved a lot of setjmp()/longjmp() calls that resulted in longer compilation times and sluggish performance under valgrind, profiling, and coverage testing.

Instead maintain a stack of new contexts and context switches that can be used to do cleanup after an error. Normally, the stack is not used for this purpose and pushing/popping is a cheap operation. In the prior implementation most of the TRY...CATCH logic needed to be run even on success.

One bonus is that the binary is about 8% smaller after this change.  Another benefit is that new contexts *must* be explicitly freed/discarded or an error will occur.  See info/manifest.c for an example of where this is useful outside the standard macros.
2020-02-26 21:15:39 -05:00
David Steele
cc743f2e04 Skip pg_internal.init temp file during backup.
If PostgreSQL crashes it can leave behind a pg_internal.init temp file with the pid as the extension, as discussed in https://www.postgresql.org/message-id/flat/20200131045352.GB2631%40paquier.xyz#7700b9481ef5b0dd5f09cc410b4750f6.  On restart this file is not cleaned up so it can persist for the lifetime of the cluster or until another process with the same id happens to write pg_internal.init.

This is arguably a bug in PostgreSQL, but in any case it makes sense not to backup this file.
2020-02-21 11:51:39 -05:00
David Steele
6353e9428d Error when archive-get/archive-push/restore are not run on a PostgreSQL host.
This error was lost during the migration to C.  The error that occurred instead (generally an SSH auth error) was hard to debug.

Restore the original behavior by throwing an error immediately if pg1-host is configured for any of these commands.  reset-pg1-host can be used to suppress the error when required.
2020-02-12 17:18:48 -07:00
David Steele
dac8119bf1 Add pgIsLocalVerify().
This functionality is required in commands other than restore, so centralize it.
2020-02-12 15:47:07 -07:00
David Steele
e2c304d473 Prevent defunct processes in asynchronous archive commands.
The main improvement is a double-fork to prevent zombie processes if the parent process exits after the (child) async process. This is a real possibility since the parent process sticks around to monitor the results of the async process.

In the first fork, ignore SIGCHLD in the very unlikely case that the async process exits before the first fork. This is probably only possible if the async process exits immediately, perhaps due to a chdir() failure. Set SIGCHLD back to default in the async process so waitpid() will work as expected.

Also update the comment on chdir() to more accurately reflect what is happening.

Finally, add a test in certain debug builds to ensure the first fork exits very quickly. This only works when valgrind is not in use because valgrind makes forking so slow that it is hard to tell if the async process performed work or not (in the case that the second fork goes missing and the async process is a direct child).
2020-02-12 12:17:23 -07:00
David Steele
43936c58a8 Fix resume when the resumable backup was created by Perl.
In this case the resumable backup should be ignored, but the C code was not able to load the partial manifest written by Perl since the format differs slightly. Add validations to catch this case and continue gracefully.
2020-02-11 19:44:06 -07:00
David Steele
44adf21c83 Consolidate archive async exec code.
Move duplicated code to the common module.  This will reduce copy and paste between the get and push modules when changes are made.
2020-02-10 21:30:43 -07:00
David Steele
0eaedc9a6a Improve async archive error file removal.
2a06df93 removed the error file so an old error would not be reported before the async process had a chance to try again.  However, if the async process was already running this might lead to a timeout error before reporting the correct error.

Instead, remove the error files once we know that the async process will start, i.e. after the archive lock has been acquired.

This effectively reverts 2a06df93.
2020-02-10 19:17:11 -07:00
David Steele
2a06df93f3 Remove async archive error file when not throwing an error.
This ensures that the error will not be thrown before the async process has a chance to retry.
2020-02-06 20:59:04 -08:00
David Steele
0f8ec3e478 Read HTTP content to eof when size/encoding not specified.
Generally, the content-size or content-encoding headers will be used to specify how much content should be expected.

There is a special case where the server sends 'Connection:close' without the content headers and the content may be read up until eof.

This appears to be an atypical usage but it is required by the specification.
2020-01-30 14:51:26 -07:00
Cynthia Shang
856980ae99 Auto-select backup set on restore when time target is specified.
Auto-selection is performed only when --set is not specified. If a backup set for the given target time cannot not be found, the latest (default) backup set will be used.

Currently a limited number of date formats are recognized and timezone names are not allowed, only timezone offsets.
2020-01-30 14:38:05 -07:00
Cynthia Shang
f46d1fa74c Add timezone calculations to time module.
Add tzPartsValid() and tzOffsetSecond() to calculate timezone offsets from user provided values.

Update epochFromParts() to accept a timezone offset in seconds.
2020-01-30 11:28:30 -07:00
David Steele
80687cbe74 Free TLS connection in common/io-http test.
The test that checks for no output from the server was leaving a connection open which valgrind was complaining about.

Wait on the server long enough to cause the error on the client then close the connection to free the memory.
2020-01-28 10:19:58 -07:00
David Steele
697150eaf8 Add more validations to the manifest on backup.
Validate that checksums exist for zero size files.  This means that the checksums for zero size files are explicitly set by backup even though they'll always be the same.  Also validate that zero length files have the correct checksum.

Validate that repo size is > 0 if size is > 0.  No matter what compression type is used a non-zero amount of data cannot be stored in zero bytes.
2020-01-26 23:07:07 -07:00
David Steele
7ab07dc580 Validate checksums are set in the manifest on backup/restore.
This is a modest start but it addresses the specific issue that was caused by the bug fixed in 45ec694a.  This validation will produce an immediate error rather than erroring out partway through the restore.

More validations are planned but this is the most important one and seems safest for this release.
2020-01-26 21:58:59 -07:00
David Steele
45ec694af2 Fix missing files corrupting the manifest.
If a file was removed by PostgreSQL during the backup (or was missing from the standby) then the next file might not be copied and updated in the manifest. If this happened then the backup would error when restored.

The issue was that removing files from the manifest invalidated the pointers stored in the processing queues.  When a file was removed, all the pointers shifted to the next file in the list, causing a file to be unprocessed.  Since the unprocessed file was still in the manifest it would be saved with no checksum, causing a failure on restore.

When process-max was > 1 then the bug would often not express since the file had already been pulled from the queue and updates to the manifest are done by name rather than by pointer.
2020-01-26 13:19:13 -07:00
David Steele
90abc3cf17 Use pkg-config instead of xml2-config for libxml2 build options.
pkg-config is a generic way to get build options rather than relying on a package-specific utility.

XML2_CONFIG can be used to override this utility for systems that do not ship pkg-config.
2020-01-24 10:08:05 -07:00
David Steele
b134175fc7 Use designated initializers to initialize structs.
Previously memNew() used memset() to initialize all struct members to 0, NULL, false, etc.  While this appears to work in practice, it is a violation of the C specification.  For instance, NULL == 0 must be true but neither NULL nor 0 must be represented with all zero bits.

Instead use designated initializers to initialize structs.  These guarantee that struct members will be properly initialized even if they are not specified in the initializer.  Note that due to a quirk in the C99 specification at least one member must be explicitly initialized even if it needs to be the default value.

Since pre-zeroed memory is no longer required, adjust memAllocInternal()/memReallocInternal() to return raw memory and update dependent functions accordingly.  All instances of memset() have been removed except in debug/test code where needed.

Add memMewPtrArray() to allocate an array of pointers and automatically set all pointers to NULL.

Rename memGrowRaw() to the more logical memResize().
2020-01-23 14:15:58 -07:00
David Steele
600a51815f Set client_encoding to UTF8 on PostgreSQL connect.
This is the only non-ASCII character encoding we have tested so make sure that's all we get from PostgreSQL.
2020-01-21 18:42:22 -07:00
David Steele
94842ccece Fix comment. 2020-01-21 11:59:25 -07:00
David Steele
03d434c7e1 Remove RHEL package patch now that it has been merged upstream.
Also revert 731ffcfb and update ContainerTest.pm for upstream changes.
2020-01-21 11:57:59 -07:00
David Steele
b89e6b7f69 Fix error in timeline conversion.
The timeline is required to verify WAL segments in the archive after a backup. The conversion was performed base 10 instead of 16, which led to errors when the timeline was ≥ 0xA.
2020-01-21 10:29:46 -07:00
David Steele
c630bda1c1 Remove Debian package patch now that it has been merged upstream. 2020-01-19 10:37:08 -07:00
David Steele
d9efbc3698 Add UTF8 strings to manifest and restore tests.
The most likely place to get UTF8 characters is in database names so make sure UTF8 works in the places where database names are processed.
2020-01-18 10:46:48 -07:00
David Steele
ec173f12fb Add MEM_CONTEXT_PRIOR() block and update current call sites.
This macro block encapsulates the common pattern of switching to the prior (formerly called old) mem context to return results from a function.

Also rename MEM_CONTEXT_OLD() to memContextPrior().  This violates our convention of macros being in all caps but memContextPrior() will become a function very soon so this will reduce churn.
2020-01-17 13:29:49 -07:00
David Steele
c6d6b7dbef Use MEM_CONTEXT_NEW_BEGIN() block instead of memContextNew().
A few places were using just memContextNew(), probably because they did not immediately need to create anything in the new context, but it's better if we use the same pattern everywhere, even if it results in a few extra mem context switches.
2020-01-17 11:58:41 -07:00
David Steele
e81629b442 Reclassify Perl and LibC code as test/harness.
These were still being included in the core totals but they are no longer used by core.
2020-01-15 13:53:30 -07:00
David Steele
2c0ba0820d v2.21: C Migration Complete
Bug Fixes:

* Fix options being ignored by asynchronous commands. The asynchronous archive-get/archive-push processes were not loading options configured in command configuration sections, e.g. [global:archive-get]. (Reviewed by Cynthia Shang. Reported by Urs Kramer.)
* Fix handling of \ in filenames. \ was not being properly escaped when calculating the manifest checksum which prevented the manifest from loading. Since instances of \ in cluster filenames should be rare to nonexistent this does not seem likely to be a serious problem in the field.

Features:

* pgBackRest is now pure C.
* Add pg-user option. Specifies the database user name when connecting to PostgreSQL. If not specified pgBackRest will connect with the local OS user or PGUSER, which was the previous behavior. (Contributed by Mike Palmiotto.)
* Allow path-style URIs in S3 driver.

Improvements:

* The backup command is implemented entirely in C. (Reviewed by Cynthia Shang.)
2020-01-15 13:21:52 -07:00
David Steele
8d3710b2fe Fix options being ignored by asynchronous commands.
The local, remote, archive-get-async, and archive-push-async commands were used to run functionality that was not directly available to the user. Unfortunately that meant they would not pick up options from the command that the user expected, e.g. backup, archive-get, etc.

Remove the internal commands and add roles which allow pgBackRest to determine what functionality is required without implementing special commands. This way the options are loaded from the expected command section.

Since remote is no longer a specific command with its own options, more manipulation is required when calling remote. This might be something we can improve in the config system but it may be worth leaving as is because it is a one-off, for now at least.
2020-01-15 12:24:58 -07:00
David Steele
a7738ebba3 Update comments in command/remote module. 2020-01-13 13:21:28 -07:00
David Steele
fe263e87b1 Allow path-style URIs in S3 driver.
Although path-style URIs have been deprecated by AWS, they may still be used with products like Minio because no additional DNS configuration is required.

Path-style URIs must be explicitly enabled since it is not clear how they can be auto-detected reliably.  More importantly, faulty detection could cause regressions in current installations.
2020-01-12 11:31:06 -07:00
David Steele
3f89ecf8d9 Add time to storage ls JSON output.
Time is supported in all drivers with the update to S3 at 61538f93, so it is now possible to add time to the ls command and have it work on all repo types.
2020-01-10 09:39:33 -07:00
David Steele
0c5c78e5e1 Make quoting in cfgExeParam() optional.
Parameter lists that are passed directly to exec*() do not need quoting when spaces are present.  Worse, the quotes will not be stripped and the option value will be garbled.

Unfortunately this still does not fix all issues with quoting since we don't know how it might need to be escaped to work with SSH command configuration.  The answer seems to be to pass the options in the protocol layer but that's beyond the scope of this commit.
2020-01-09 09:23:15 -07:00
David Steele
7de5ce23ad Add internal remote-type option.
This option was overloaded on the general type option but it makes sense to split this out since the meaning is pretty different.

Rename the values to conform to current standards, i.e. pg and repo, now that the Perl code won't care anymore.
2020-01-08 18:59:02 -07:00
David Steele
7a1871c341 Fix test log message to match pg-version parameter name.
It was confusing that this part of the log message did not match the parameter name, which made reproducing test failures from CI a little harder.
2020-01-08 09:54:44 -07:00
David Steele
61538f932c Parse dates in storageS3InfoList() and storageS3Info().
Previously dates were not being filled by these functions which was fine since dates were not used.

We plan to use dates for the ls command plus it makes sense for the driver to be complete since it will be used as an example.
2020-01-06 15:53:53 -07:00
David Steele
d2fb4f977c Add httpLastModifiedToTime() to parse HTTP last-modified header. 2020-01-06 15:24:49 -07:00
David Steele
a08298ce1b Add basic time management functions.
These are similar to what mktime() and strptime() do but they ignore the local system timezone which saves having to munge the TZ env variable to do time conversions.
2020-01-06 15:18:52 -07:00
David Steele
33e328abbf Remove unused LibC code.
The code was made obsolete by the migration to C.
2019-12-28 18:30:32 -07:00
David Steele
e72a9dd0d2 Add error parameter to cfgCommandId().
This allows commands to be checked for validity without generating an error.
2019-12-28 13:37:03 -07:00
David Steele
d41eea685a Change meaning of TEST_RESULT_STR() macro.
This macro was created before the String object existed so subsequent usage with String always included a lot of strPtr() wrapping.

TEST_RESULT_STR_Z() had already been introduced but a wholesale replacement of TEST_RESULT_STR() was not done since the priority was on the C migration.

Update all calls to (old) TEST_RESULT_STR() with one of the following variants: (new) TEST_RESULT_STR(), TEST_RESULT_STR_Z(), TEST_RESULT_Z(), TEST_RESULT_Z_STR().
2019-12-26 18:08:27 -07:00