mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00

63 Commits

Author SHA1 Message Date
David Steele
3e254f4cff Add IoFilter interface to CipherBlock object.
This allows CipherBlock to be used as a filter in an IoFilterGroup.  The C-style functions used by Perl are now deprecated and should not be used for any new code.

Also add functions to convert between cipher names and CipherType.
2018-11-28 12:42:36 -05:00
Cynthia Shang
f4a1751abc Improve JSON to Variant conversion and add Variant to JSON conversion.
Add boolean and one-dimensional list types to jsonToKv().

Add varToJson() and kvToJson() to convert Variants and KeyValues to JSON.

Contributed by Cynthia Shang.
2018-11-23 16:02:33 -05:00
David Steele
256b727a3d Add S3 storage driver.
Only the storageNewRead() and storageList() functions are currently implemented, but this is enough to enable S3 for the archive-get command.
2018-11-21 19:32:49 -05:00
David Steele
72252ed2a1 Add HttpClient object.
A robust HTTP client with pipelining support and automatic retries.

Using a single object to make multiple requests is more efficient because requests are pipelined whenever possible. Requests are automatically retried when the connection has been closed by the server. Any 5xx response is also retried.

Only the HTTPS protocol is currently supported.
2018-11-21 19:11:45 -05:00
David Steele
1dd06a6e46 Add TlsClient object.
A simple, secure TLS client intended to allow access to services that are exposed via HTTPS. We call it TLS instead of SSL because SSL methods are disabled so only TLS connections are allowed.

This object is intended to be used for multiple TLS connections against a service so tlsClientOpen() can be called each time a new connection is needed. By default, an open connection will be reused for pipelining so the user must be prepared to retry their transaction on a read/write error if the server closes the connection before it can be reused. If this behavior is not desirable then tlsClientClose() may be used to ensure that the next call to tlsClientOpen() will create a new TLS session.

Note that tlsClientRead() is non-blocking unless there are *zero* bytes to be read from the session in which case it will raise an error after the defined timeout. In any case the tlsClientRead()/tlsClientWrite()/tlsClientEof() functions should not generally be called directly. Instead use the read/write interfaces available from tlsClientIoRead()/tlsClientIoWrite().
2018-11-21 18:43:25 -05:00
David Steele
bc25db5667 Add interface objects for libxml2.
Add XmlDocument, XmlNode, and XmlNodeList objects as a thin interface layer on libxml2.

This interface is not intended to be comprehensive. Only a few libxml2 capabilities are exposed but more can be added as needed.
2018-11-20 20:40:11 -05:00
David Steele
8f857a975e Add constant macros to String object.
There are many places (and the number is growing) where a zero-terminated string constant must be transformed into a String object to be usable.  This pattern wastes time and memory, especially since the created string is generally used in a read-only fashion.

Define macros to create constant String objects that are initialized at compile time rather than at run time.
2018-11-10 09:37:12 -05:00
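
Note on 8f857a975e: the idea of a constant string object built at compile time can be sketched as below. This is only an illustration of the technique, not the project's String implementation. The ConstString type, the CONST_STRING_INIT and CONST_STRING_STATIC macros, and constStringEqZ() are hypothetical names invented for the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only string type (not the project's String object):
   a pointer to the characters plus a length that is known at compile time. */
typedef struct ConstString
{
    const char *buffer;
    size_t size;
} ConstString;

/* Build an initializer from a string literal. sizeof() on a literal counts the
   terminating NUL, so subtract one. This only works for literals, not pointers. */
#define CONST_STRING_INIT(literal) {.buffer = literal, .size = sizeof(literal) - 1}

/* Declare a file-scope constant that the compiler initializes, so no allocation
   or copy happens at run time. */
#define CONST_STRING_STATIC(name, literal) \
    static const ConstString name = CONST_STRING_INIT(literal)

CONST_STRING_STATIC(ARCHIVE_INFO_FILE, "archive.info");

/* Compare a constant string with a zero-terminated C string. */
static int constStringEqZ(const ConstString *str, const char *z)
{
    return strlen(z) == str->size && memcmp(str->buffer, z, str->size) == 0;
}
```

Because ARCHIVE_INFO_FILE is placed in read-only data by the compiler, it can be passed around by pointer wherever a read-only string is expected without any setup code.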
David Steele
df200bee2a Add regExpPrefix() to aid in static prefix searches.
The storageList() command accepts a regular expression as a filter.  This works fine for local filesystems where it is relatively cheap to get a complete list of files and filter them in code.  However, for remote filesystems like S3 it can be expensive to fetch a complete list of files only to discard the bulk of them locally.

S3 does not filter on regular expressions but it can accept a static prefix so this function extracts a prefix from a regular expression when possible.

Even a few characters can drastically reduce the amount of data that must be fetched remotely so the function does not try to be too clever.  It requires a ^ anchor and stops scanning when the first special character is found.
2018-11-09 16:50:22 -05:00
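
Note on df200bee2a: a minimal sketch of extracting a static prefix from an anchored regular expression. It is not the regExpPrefix() implementation. The function name, the set of special characters, the handling of quantifiers, and the blanket rejection of any expression containing | are all assumptions made to keep the sketch conservative.

```c
#include <stddef.h>
#include <string.h>

/* Copy the literal prefix of an anchored regular expression into out.
   Returns the prefix length, or 0 when no safe prefix can be extracted.
   Hypothetical helper, not the project's regExpPrefix(). */
static size_t regexStaticPrefix(const char *expression, char *out, size_t outSize)
{
    /* Characters that end the literal run (assumed set) */
    static const char special[] = ".*+?[](){}|\\$^";
    size_t length = 0;

    if (outSize == 0)
        return 0;

    out[0] = '\0';

    /* Without a leading ^ the match may start anywhere. An alternation may match
       text that does not begin with the prefix, so reject it outright. */
    if (expression == NULL || expression[0] != '^' || strchr(expression, '|') != NULL)
        return 0;

    const char *ptr = expression + 1;

    /* Copy literal characters until a special character or a full buffer */
    while (*ptr != '\0' && strchr(special, *ptr) == NULL && length + 1 < outSize)
        out[length++] = *ptr++;

    /* A quantifier makes the previous character optional, so drop it */
    if (length > 0 && (*ptr == '*' || *ptr == '?' || *ptr == '{'))
        length--;

    out[length] = '\0';
    return length;
}
```

For an expression such as ^abc[0-9]+ the sketch yields abc, which could then be sent to a remote store as a listing prefix while the full expression is still applied locally to the smaller result.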
David Steele
48d2795f31 Merge crypto/random module into crypto/crypto.
There wasn't enough code to justify a separate module/test and it seems to fit just fine in crypto/crypto.
2018-11-06 20:04:16 -05:00
David Steele
2cb312ef5a Add cryptoError() and update crypto code to use it.
This adds detail to error messages when available and improves code coverage.
2018-11-06 19:16:00 -05:00
David Steele
1f8931f732 Improve single test run performance.
Improve on 7794ab50 by including the build flag files directly into the Makefile as dependencies (even though they are not includes).  This simplifies some of the rsync logic and allows make to do what it does best.

Also split build flag files into test, harness, and build to reduce rebuilds.  Test flags are used to build test.c, harness flags are used to build the rest of the files in the test harness, and build flags are used for the files that are not directly involved in testing.
2018-11-03 16:34:04 -04:00
Cynthia Shang
34c63276cd Automatically enable backup checksum delta when anomalies (e.g. timeline switch) are detected.
There are a number of cases where a checksum delta is more appropriate than the default time-based delta:

* Timeline has switched since the prior backup
* File timestamp is older than recorded in the prior backup
* File size changed but timestamp did not
* File timestamp is in the future compared to the start of the backup
* Online option has changed since the prior backup

A practical example is that checksum delta will be enabled after a failover to standby due to the timeline switch.  In this case, timestamps can't be trusted and our recommendation has been to run a full backup, which can impact the retention schedule and requires manual intervention.

Now, a checksum delta will be performed if the backup type is incr/diff.  This means more CPU will be used during the backup but the backup size will be smaller and the retention schedule will not be impacted.

Contributed by Cynthia Shang.
2018-11-01 11:31:25 -04:00
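
Note on 34c63276cd: the anomaly checks listed in the commit message could be expressed roughly as follows. This is a sketch under stated assumptions, not the backup code itself (which was Perl at the time). The BackupState and FileState structs and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-backup state recorded with the prior backup */
typedef struct BackupState
{
    uint32_t timeline;                  /* PostgreSQL timeline at backup time */
    bool online;                        /* value of the online option */
} BackupState;

/* Hypothetical per-file state recorded with the prior backup */
typedef struct FileState
{
    uint64_t size;
    time_t modified;
} FileState;

/* Backup-level anomalies: timeline switch or a change in the online option */
static bool backupNeedsChecksumDelta(const BackupState *prior, const BackupState *current)
{
    return prior->timeline != current->timeline || prior->online != current->online;
}

/* File-level anomalies from the list above */
static bool fileNeedsChecksumDelta(
    const FileState *prior, const FileState *current, time_t backupStart)
{
    /* Timestamp is older than recorded in the prior backup */
    if (current->modified < prior->modified)
        return true;

    /* Size changed but the timestamp did not */
    if (current->size != prior->size && current->modified == prior->modified)
        return true;

    /* Timestamp is in the future compared to the start of the backup */
    if (current->modified > backupStart)
        return true;

    return false;
}
```

If either function returns true for an incr/diff backup, the remaining files would be compared by checksum rather than by timestamp.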
David Steele
03b9db9aa2 Fix error after log file open failure when processing should continue.
The C code was warning on failure and continuing but the Perl logging code was never updated with the same feature.

Rather than add the feature to Perl, just disable file logging if the log file cannot be opened.  Log files are always opened by C first, so this will eliminate the error in Perl.

Reported by vthriller.
2018-10-25 14:58:25 +01:00
David Steele
db8dce7adc Disable flapping archive/get unit on CentOS 6.
This test has been flapping since 9b9396c7.  It seems to be some kind of timing issue since all integration tests pass and this unit passes on all other VMs.  It only happens on Travis and is not reproducible in any development environment that we have tried.

For now, disable the test since the constant flapping is causing major delays in testing and quite a bit of time has been spent trying to identify the root cause.  We are actively developing these tests and hope the issue will be identified during the course of normal development.

A number of improvements were made to the tests while searching for this issue.  While none of them helped, it makes sense to keep the improvements.
2018-10-02 17:54:43 +01:00
David Steele
e66e68e324 Add cryptoHmacOne() for HMAC support.
There doesn't seem to be any need to implement this as a filter since current use cases (S3 authentication) work on small datasets.

So, use the single function method provided by OpenSSL for simplicity.
2018-09-27 09:20:47 +01:00
David Steele
bcca625062 Add bufHex() to Buffer object.
A general-purpose function for converting buffers to hex strings.
2018-09-26 22:33:48 +01:00
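
Note on bcca625062: converting a buffer to a hex string can be sketched as below. The function name, the caller-supplied output buffer, and the choice of lowercase digits are assumptions. The actual bufHex() works on the project's Buffer and String objects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Write the hex representation of data into out, which must hold two characters
   per byte plus a terminating NUL. Returns false when out is too small.
   Hypothetical helper, not the project's bufHex(). */
static bool bytesToHex(const unsigned char *data, size_t size, char *out, size_t outSize)
{
    static const char digit[] = "0123456789abcdef";

    if (outSize < size * 2 + 1)
        return false;

    for (size_t idx = 0; idx < size; idx++)
    {
        out[idx * 2] = digit[data[idx] >> 4];           /* high nibble */
        out[idx * 2 + 1] = digit[data[idx] & 0x0f];     /* low nibble */
    }

    out[size * 2] = '\0';
    return true;
}
```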
David Steele
d038b9a029 Support configurable WAL segment size.
PostgreSQL 11 introduces configurable WAL segment sizes, from 1MB to 1GB.

There are two areas that needed to be updated to support this: building the archive-get queue and checking that WAL has been archived after a backup.  Both operations require the WAL segment size to properly build a list.

Checking the archive after a backup is still implemented in Perl and has an active database connection, so just get the WAL segment size from the database.

The archive-get command does not have a connection to the database, so get the WAL segment size from pg_control instead.  This requires a deeper inspection of pg_control than has been done in the past, so it seemed best to copy the relevant data structures from each version of PostgreSQL and build a generic interface layer to address them.  While this approach is a bit verbose, it has the advantage of being relatively simple, and can easily be updated for new versions of PostgreSQL.

Since the integration tests generate pg_control files for testing, teach Perl how to generate files with the correct offsets for both 32-bit and 64-bit architectures.
2018-09-25 10:24:42 +01:00
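
Note on d038b9a029: the reason the segment size is needed to build a list of WAL segments is that the size determines how many segments fit under each 4GB "log" portion of the 24-character segment name. A minimal sketch, assuming PostgreSQL 9.3 or later (where no segment number is skipped) and a hypothetical function name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Compute the name of the WAL segment that follows segment. The name is
   timeline, log, and segment number, each as 8 hex digits. Returns false on
   invalid input. Hypothetical helper, not the project's walSegmentNext(). */
static bool walSegmentNextName(
    const char *segment, uint64_t segmentSize, char *out, size_t outSize)
{
    unsigned int timeline, major, minor;

    /* Segment sizes range from 1MB to 1GB */
    if (segmentSize < UINT64_C(0x100000) || segmentSize > UINT64_C(0x40000000))
        return false;

    if (outSize < 25 || strlen(segment) != 24 ||
        sscanf(segment, "%8X%8X%8X", &timeline, &major, &minor) != 3)
    {
        return false;
    }

    /* Each log covers 4GB, so the segment count per log depends on the size:
       256 for the default 16MB, 4096 for 1MB, 4 for 1GB */
    uint64_t segmentsPerLog = UINT64_C(0x100000000) / segmentSize;

    if ((uint64_t)minor + 1 >= segmentsPerLog)
    {
        major++;
        minor = 0;
    }
    else
        minor++;

    snprintf(out, outSize, "%08X%08X%08X", timeline, major, minor);
    return true;
}
```

With the default 16MB size, 0000000100000001000000FF is followed by 000000010000000200000000, but with a 1MB size the same segment is followed by 000000010000000100000100. A list built with the wrong size would therefore contain names that never exist or skip ones that do.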
Cynthia Shang
880fbb5e57 Add checksum delta for incremental backups.
Use checksums rather than timestamps to determine if files have changed.  This is useful in cases where the timestamps may not be trustworthy, e.g. when performing an incremental after failing over to a standby.

If checksum delta is enabled then checksums will be used for verification of resumed backups, even if they are full.  Resumes have always used checksums to verify the files in the repository; enabling delta performs checksums on the database files as well.

Note that the user must manually enable this feature in cases where it would be useful or just keep it enabled all the time.  A future commit will address automatically enabling the feature in cases where it seems likely to be useful.

Contributed by Cynthia Shang.
2018-09-19 11:12:45 -04:00
David Steele
03003562d8 Merge all posix storage tests into a single unit.
As we add storage drivers it's important to keep the tests for each completely separate.  Rather than have three tests for each driver, standardize on having a single test unit for each driver.
2018-09-17 11:45:41 -04:00
David Steele
8852622fa2 Fix missing test caused by a misplaced YAML tag. 2018-09-16 15:53:19 -04:00
David Steele
84ab787b1a Merge protocol storage helper into storage helper.
These are separated the same way in the Perl code where the remote storage driver is located in the Protocol module. However, in the C code the intention is to implement the remote storage driver as a regular driver in the storage layer rather than making a special case out of it.

So, merge the storage helpers. This also has the benefit of making the code a bit simpler.

Also separate storageSpool() and storageSpoolWrite() to make it clearer which operations require write access and to maintain consistency with the other storage helper functions.
2018-09-16 14:12:53 -04:00
David Steele
fd14ceb399 Rename posix driver files/functions for consistency.
The posix driver was developed over time and the naming is not very consistent.

Rename the files and functions to work well with other drivers and generally favor longer names since the driver functions are seldom (eventually never) used outside the driver itself.
2018-09-13 18:58:22 -04:00
David Steele
5aa458ffae Simplify debug logging by allowing log functions to return String objects.
Previously, debug log functions had to handle NULLs and truncate output to the available buffer size.  This was verbose for both coding and testing.

Instead, create a function/macro combination that allows log functions to return a simple String object.  The wrapper function takes care of the memory context, handles NULLs, and truncates the log string based on the available buffer size.
2018-09-11 18:32:56 -04:00
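
Note on 5aa458ffae: the wrapper pattern described in the commit message can be sketched as below. This is an illustration only. It uses malloc()/free() in place of the project's memory contexts and plain char pointers in place of String objects, and every name in it (ObjectToLogFn, objectToLogBuffer, ExampleFile, exampleFileToLog) is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A log function only formats its object and returns an allocated string */
typedef char *(*ObjectToLogFn)(const void *object);

/* The wrapper handles NULL objects, truncates to the buffer size, and frees the
   string returned by the log function. Returns the number of characters stored. */
static size_t objectToLogBuffer(
    const void *object, ObjectToLogFn toLog, char *buffer, size_t bufferSize)
{
    if (bufferSize == 0)
        return 0;

    char *text = object == NULL ? NULL : toLog(object);

    /* snprintf() truncates and always terminates when bufferSize > 0 */
    snprintf(buffer, bufferSize, "%s", text == NULL ? "null" : text);
    free(text);

    return strlen(buffer);
}

/* Example object and log function: no NULL checks or buffer handling needed */
typedef struct ExampleFile
{
    const char *name;
    unsigned int mode;
} ExampleFile;

static char *exampleFileToLog(const void *object)
{
    const ExampleFile *file = object;
    int size = snprintf(NULL, 0, "{name: %s, mode: %04o}", file->name, file->mode);
    char *result = malloc((size_t)size + 1);

    if (result != NULL)
        snprintf(result, (size_t)size + 1, "{name: %s, mode: %04o}", file->name, file->mode);

    return result;
}
```

The log function stays short because the NULL handling and truncation live in one place, which is the simplification the commit message describes.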
David Steele
9b9396c7b7 Migrate local, unencrypted, non-S3 archive-get command to C.
The archive-get command will only be executed in C if the repository is local, unencrypted, and type posix or cifs.  Admittedly a limited use case, but this is just the first step in migrating the archive-get command entirely into C.

This is a direct migration from the Perl code (including messages) to integrate as seamlessly with the remaining Perl code as possible.  It should not be possible to determine if the C version is running unless debug-level logging is enabled.
2018-09-11 15:42:31 -04:00
David Steele
6e9b6fdca9 Migrate control functions to detect stop files to C from Perl.
Basic functions to detect the presence of stanza or all stop files and error when they are present.

The functionality to detect stop files without error was not migrated. This functionality is only used by stanza-delete and will be migrated with that command.
2018-09-07 08:03:05 -07:00
David Steele
5bdaa35fa5 Migrate walIsPartial(), walIsSegment(), and walSegmentFind() from Perl to C.
Also refactor regular expression defines to make them more reusable.
2018-09-07 08:00:18 -07:00
David Steele
9660076093 Add helper for repository storage.
Implement rules for generating paths within the archive part of the repository. Add a helper function, storageRepo(), to create the repository storage based on configuration settings.

The repository storage helper is located in the protocol module because it will support remote file systems in the future, just as the Perl version does.

Also, improve the existing helper functions a bit using string functions that were not available when they were written.
2018-09-07 07:58:08 -07:00
David Steele
960ad73298 Info objects now parse JSON and use specified storage.
Use JSON code now that it is available and remove temporary hacks used to get things working initially.

Use passed storage objects rather than using storageLocal().  All storage objects in C are still local but this won't always be the case.

Also, move Postgres version conversion functions to postgres/info.c since they have no dependency on the info objects and will likely be useful elsewhere.
2018-09-06 10:12:14 -07:00
David Steele
cb4b715533 Add strReplaceChr() to String object. 2018-08-14 16:49:38 -04:00
David Steele
4a176681c3 Add cvtCharToZ() and macro for debugging char params. 2018-08-14 16:18:17 -04:00
David Steele
6643afe9a8 Add gzip compression/decompression filters for C. 2018-08-14 14:56:59 -04:00
David Steele
e3ff6b209d Filters can now produce output that differs from input.
This allows filters such as compression, encryption, etc. to be implemented.
2018-08-14 14:21:53 -04:00
Cynthia Shang
8ab2e72960 Migrate minimum set of code for reading archive.info files from Perl to C.
Contributed by Cynthia Shang.
2018-08-09 08:57:21 -04:00
David Steele
7993f1a966 Add basic C JSON parser. 2018-08-09 08:06:23 -04:00
David Steele
01aea0c067 Implement filters that do not modify the buffer.
Update cryptoHash to use the new interface.
2018-07-24 21:08:27 -04:00
David Steele
58e9f1e50c Refactor the common/log tests to not depend on common/harnessLog.
common/harnessLog was not ideally suited for general testing and made all the tests quite awkward. Instead, move all code used to test the common/log module into the logTest module and repurpose common/harnessLog to do log expect testing for all other tests in a cleaner way.

Add a few exceptions for config testing since the log levels are reset by default in config/parse.
2018-07-20 18:51:42 -04:00
David Steele
0ac176b722 Abstract IO layer out of the storage layer.
This allows the routines to be used for IO objects that do not have a storage representation.

Implement buffer read and write IO objects.
2018-07-19 16:04:20 -04:00
Cynthia Shang
0e6b927a17 Add uint64 variant type and supporting conversion functions.
Contributed by Cynthia Shang.
Reviewed by Stephen Frost.
2018-07-12 15:23:18 -04:00
David Steele
350b30fa49 Move cryptographic hash functions to C using OpenSSL. 2018-06-11 14:52:26 -04:00
David Steele
064ec757e9 Rename cipher module to the more general crypto. 2018-06-11 10:53:16 -04:00
David Steele
a385cb520b Update primary test environment (Vagrant and Docker) to Ubuntu 18.04. 2018-06-06 15:52:28 -04:00
David Steele
52bc073234 Add stack trace macros to all functions.
Low-level functions only include stack trace in test builds while higher-level functions ship with stack trace built-in. Stack traces include all parameters passed to the function but production builds only create the parameter list when the log level is set high enough, i.e. debug or trace depending on the function.
2018-05-18 11:57:32 -04:00
David Steele
0a860e0b60 Full branch coverage for command/help/help, common/error, common/ini, and common/log modules. 2018-05-05 09:38:09 -04:00
David Steele
90aadc6534 Full branch coverage for config module. 2018-05-04 12:49:25 -04:00
David Steele
54dd6f3ed4 Add asynchronous, parallel archive-get.
This feature maintains a queue of WAL segments to help reduce latency when PostgreSQL requests a WAL segment with restore_command.
2018-04-30 17:27:39 -04:00
David Steele
321a28f6b0 Add walSegmentNext() and walSegmentRange(). 2018-04-29 11:47:50 -04:00
David Steele
be02c67503 Add pgControlInfo() to read pg_control and determine the PostgreSQL version. 2018-04-29 11:20:51 -04:00
David Steele
8c6e2bdbc7 Add storageInfo() and track size in read objects. 2018-04-29 11:02:21 -04:00
David Steele
d44848baa0 Add strLstExists(), strLstExistsZ(), strSub(), and strSubN() to String and StringList objects. 2018-04-29 10:32:46 -04:00
David Steele
89d3476e32 Refactor archive common functions in preparation for parallel async archive-get. 2018-04-29 10:16:59 -04:00