pgbackrest

mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-12 10:04:14 +02:00

Author	SHA1	Message	Date
David Steele	ea9147e2e0	Reduce buffer-size default to 1MiB. The prior default was determined by benchmarking the Perl code prior to the 1.0 release. In general buffer allocation was more expensive in Perl so large buffers gave the best performance. This was due to multiple buffer allocations for each filter in an IO operation. The C code allocates fixed buffers for each IO operation so the cost for buffer allocation is lower than Perl. That being the case it made sense to benchmark the C code to determine the optimal buffer default. The performance/storage tests were used to measure the performance of a variety of filters. 1GiB of data was processed by each filter 10 times and the results of the tests were averaged. While most buffer sizes gave similar performance, 1MiB appeared to perform the best overall. Of course, different architectures are likely to yield different results but this seems like a sensible default. The buffer-size option may still need to be manually configured to give optimal results. 
Raw test data for reference:

4MB buffer (prior default)
copy time 1807ms, avg time 180ms, avg throughput: 5942MB/s
md5 time 14200ms, avg time 1420ms, avg throughput: 756MB/s
sha1 time 11431ms, avg time 1143ms, avg throughput: 939MB/s
sha256 time 23463ms, avg time 2346ms, avg throughput: 457MB/s
gzip -6 time 381199ms, avg time 38119ms, avg throughput: 28MB/s
lz4 -1 time 15484ms, avg time 1548ms, avg throughput: 693MB/s

1MB buffer (new default)
copy time 1760ms, avg time 176ms, avg throughput: 6100MB/s
md5 time 13739ms, avg time 1373ms, avg throughput: 781MB/s
sha1 time 11025ms, avg time 1102ms, avg throughput: 973MB/s
sha256 time 22539ms, avg time 2253ms, avg throughput: 476MB/s
gzip -6 time 372995ms, avg time 37299ms, avg throughput: 28MB/s
lz4 -1 time 15118ms, avg time 1511ms, avg throughput: 710MB/s

512K buffer
copy time 1782ms, avg time 178ms, avg throughput: 6025MB/s
md5 time 13724ms, avg time 1372ms, avg throughput: 782MB/s
sha1 time 10959ms, avg time 1095ms, avg throughput: 979MB/s
sha256 time 22982ms, avg time 2298ms, avg throughput: 467MB/s
gzip -6 time 378120ms, avg time 37812ms, avg throughput: 28MB/s
lz4 -1 time 15484ms, avg time 1548ms, avg throughput: 693MB/s

256K buffer
copy time 1805ms, avg time 180ms, avg throughput: 5948MB/s
md5 time 13706ms, avg time 1370ms, avg throughput: 783MB/s
sha1 time 11074ms, avg time 1107ms, avg throughput: 969MB/s
sha256 time 22588ms, avg time 2258ms, avg throughput: 475MB/s
gzip -6 time 372645ms, avg time 37264ms, avg throughput: 28MB/s
lz4 -1 time 16346ms, avg time 1634ms, avg throughput: 656MB/s	2020-05-19 16:58:49 -04:00
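The averages in the raw test data for commit ea9147e2e0 can be reproduced from the total times. The following is a minimal sketch, not the project's benchmark code. It assumes that each total covers 10 runs of 1GiB, that "MB" means 10^6 bytes, and that results are truncated to integers. The function names are hypothetical.

```python
# Sketch only: recompute the reported averages from the total times.
# Assumptions (not stated explicitly in the commit message):
#   - each filter processed 1GiB (2**30 bytes) per run, 10 runs in total
#   - "MB/s" uses decimal megabytes (10**6 bytes)
#   - results are truncated, not rounded
BYTES_PER_RUN = 1024 ** 3
RUNS = 10


def avg_time_ms(total_ms: int) -> int:
    """Average time per run in milliseconds (truncated)."""
    return total_ms // RUNS


def avg_throughput_mb_s(total_ms: int) -> int:
    """Average throughput in MB/s over all runs (truncated)."""
    total_bytes = BYTES_PER_RUN * RUNS
    bytes_per_second = total_bytes * 1000 // total_ms
    return bytes_per_second // 1_000_000


if __name__ == "__main__":
    # Total times taken from the 4MB and 1MB "copy" rows.
    for label, total_ms in (("4MB copy", 1807), ("1MB copy", 1760)):
        print(label, avg_time_ms(total_ms), avg_throughput_mb_s(total_ms))
```

Under these assumptions, the sketch matches the reported copy, md5, and gzip figures for the 4MB and 1MB buffers.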
David Steele	f773d909be	Improve storage filter performance tests. Improve the accuracy of the calculations in several areas with better integer expressions. Make the input buffer size configurable. Previously it was always 1mb, i.e. block size. Use a macro for output results to reduce code duplication.	2020-05-19 14:35:20 -04:00
David Steele	a3d9d9a387	Handle missing reason phrase in HTTP response. Reason phrases (e.g. OK) are optional in HTTP 1.1 but the space after the status code is not. When the reason phrase was missing the required space was trimmed along with the trailing CR leading to a format error. Rework the logic to preserve the space and allow empty reason phrases. Found while testing against the Backblaze S3-compatible API.	2020-05-19 08:20:33 -04:00
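The parsing issue fixed in commit a3d9d9a387 can be illustrated with a short sketch. This is not pgBackRest's C implementation. The function name `parse_status_line` is hypothetical, and the sketch only shows the idea of stripping the line terminator while keeping the space that follows the status code.

```python
from typing import Tuple


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split an HTTP/1.1 status line into (version, status code, reason).

    Illustrative sketch: the reason phrase may be empty, but the space
    after the status code is required.
    """
    # Strip only the line terminator. A general strip() would also remove
    # the required trailing space when the reason phrase is empty.
    line = line.rstrip("\r\n")

    version, sep, rest = line.partition(" ")
    if not sep or not version.startswith("HTTP/"):
        raise ValueError("invalid HTTP version in status line")

    code, sep, reason = rest.partition(" ")
    if not sep:
        raise ValueError("missing space after status code")
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("invalid status code")

    return version, int(code), reason


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 200 OK\r\n"))
    print(parse_status_line("HTTP/1.1 200 \r\n"))  # empty reason phrase
```

A status line with no space after the code, such as `HTTP/1.1 200` followed by CRLF, is rejected with `ValueError` in this sketch.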
David Steele	cffdadad92	Fix typo pp64le -> ppc64le. Travis-CI guessed the correct value but logged a warning.	2020-05-18 19:55:09 -04:00
David Steele	688ec2a8f5	Use an extension to denote vendorized code. Vendorized code is copied from another project when a library is not available and a git subproject won't work. Currently all the vendorized code is copied from PostgreSQL but it makes sense to have a more general mechanism for indicating vendorized code. The .vendor extension will be used to denote vendorized code in the same way that .auto is used to denote auto-generated code.	2020-05-18 19:11:26 -04:00
David Steele	a329afd3be	Add MD5 hash filter to performance tests.	2020-05-18 19:02:11 -04:00
David Steele	92c036b966	Add code count rule for valgrind suppression missed in `6be5ea33`. `6be5ea33` changed valgrind suppression file naming but failed to update the code count rules.	2020-05-18 18:09:41 -04:00
David Steele	ac5d46dc50	Increase buffer size for lz4 compression flush. Some lz4 versions between r131 and 1.7.5 did not return a sufficient buffer size from LZ4F_compressBound() to allow LZ4F_compressEnd() to complete reliably. While this issue was fixed in lz4 1.7.5 there are affected versions in supported distributions such as CentOS/RHEL 7. Use one of the hacks suggested in https://github.com/lz4/lz4/issues/290 to increase the buffer size enough for LZ4F_compressEnd() to complete. This means that a slightly larger buffer size is required for all versions but it seems worth it to (hopefully) fix the issue in all lz4 versions.	2020-05-16 18:25:31 -04:00
David Steele	f4e6e6bd80	Add missing cryptoInit() in cryptoHmacOne(). If cryptoInit() had not already been called then EVP_get_digestbyname() would fail. This does not appear to be a problem currently because of call order. Also, newer versions of OpenSSL auto-initialize.	2020-05-15 07:49:23 -04:00
David Steele	ea485e916b	Add jq to tools installed by Vagrantfile.	2020-05-14 18:45:23 -04:00
David Steele	ed5149c9be	Disable package builds for CentOS 7 which are broken upstream.	2020-05-14 18:27:26 -04:00
David Steele	e7ad795ffb	Move common HTTP headers to HTTP client. Some headers in the S3 driver were common HTTP headers that may be used by other drivers that utilize HTTP. Also change the order of HTTP_HEADER_TRANSFER_ENCODING to be alphabetical.	2020-05-13 19:00:48 -04:00
David Steele	4cbd1f1e7e	Fix incorrect whitespace.	2020-05-13 14:27:28 -04:00
David Steele	b5dd14e6f3	Make storage type more generic in the integration tests. Rather than bS3 use strStorage which can indicate more than two storage types. For the moment there are still only two storage types but this change is required before more can be added.	2020-05-12 18:55:20 -04:00
David Steele	9639a2c15f	Add missing do...while loop to harness macro.	2020-05-12 13:30:46 -04:00
Magnus Hagander	b8a5c3ac6f	Fix incorrect command in reference documentation. Also update process to command to be more consistent with the surrounding text.	2020-05-12 13:13:04 -04:00
David Steele	33cbdb78fd	Add ppc64le to Travis-CI build matrix. apt.postgresql.org provides packages for ppc64le so it's important that we support it. Rearrange jobs a bit based on current runtimes and importance. Also reduce the number of tests run for arm64 since it is slower than other architectures.	2020-05-11 10:10:22 -04:00
David Steele	86855e271d	Fix subtle timing issue in command/expire tests. `cdebfb09` added relative times to backup.info but a subtle issue was introduced that would cause the tests to fail if the time acquired by cmdExpire() was exactly the same as timeNow used to format backup.info. cmdExpire() was working correctly given the inputs, but the tests did not run predictably. This was found while running the tests with --no-valgrind --no-coverage which allows them to run a lot faster, thus exposing the timing issue.	2020-05-09 12:12:29 -04:00
David Steele	22d260ad53	Allow more tests to run outside of containers. These tests required sudo to achieve complete coverage. Add a new coverage exception, vm_covered, that applies to code that can only be covered in a container. When the test is run outside of a container code sections that require a container will be excluded with TEST_CONTAINER_REQUIRED and the coverage exception will be added to prevent a coverage error. This does require marking up the core code with vm_covered, which in some modules (e.g. common/io/tls/client) can be extensive. It's possible that some of these tests can be rewritten to be less dependent on sudo but no attempt was made to do that here. Only allow coverage summaries in a vm since coverage summaries outside a vm will not be complete, which was true even before this commit.	2020-05-09 09:17:33 -04:00
Stephen Frost	b4fc1804a8	Minor updates for bzip2 compression after more review. Update error types thrown by bzip2 to be more consistent with gzip. Update the bzip2 and gzip error default to be AssertError as that's the more common case in both, and add a 'break;' to the default clause -- we don't intend to be just falling through those case statements, even if the default is the last, we should be explicit about that. Clean up some tabs that snuck in, rename a variable to be more clear, and add some comments.	2020-05-08 16:27:54 -04:00
David Steele	14369c1c3c	Mark variables modified in TRY block as volatile. It's important that the values in these variables are maintained even after an exception is thrown, so they must be marked volatile. Found while testing on the ppc64le architecture.	2020-05-08 15:36:20 -04:00
Cynthia Shang	cdebfb09e0	Add time-based retention for full backups. The --repo-retention-full-type option allows retention of full backups based on a time period, specified in days. The new option will default to 'count' and therefore will not affect current installations. Setting repo-retention-full-type to 'time' will allow the user to use a time period, in days, to indicate full backup retention. Using this method, a full backup can be expired only if the time the backup completed is older than the number of days set with repo-retention-full (calculated from the moment the 'expire' command is run) and at least one full backup meets the retention period. If archive retention has not been configured, then the default settings will expire archives that are prior to the oldest retained full backup. For example, if there are three full backups ending in times that are 25 days old (F1), 20 days old (F2) and 10 days old (F3), then if the full retention period is 15 days, then only F1 will be expired; F2 will be retained because F3 is not at least 15 days old.	2020-05-08 15:25:03 -04:00
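The time-based retention rule in commit cdebfb09e0 can be sketched as follows. This is one reading of the commit message, not the actual cmdExpire() logic. The function name is hypothetical, and treating a backup exactly `retention_days` old as meeting the period is an assumption. The idea is that the newest full backup meeting the retention period is kept as an anchor, and only full backups older than the anchor are expired.

```python
from typing import Dict, List


def expired_full_backups(ages_days: Dict[str, int], retention_days: int) -> List[str]:
    """Return labels of full backups that may be expired (sketch only).

    ages_days maps a backup label to its age in days when 'expire' runs.
    Assumption: an age equal to retention_days meets the retention period.
    """
    # Backups old enough to satisfy the retention period.
    old_enough = {
        label: age for label, age in ages_days.items() if age >= retention_days
    }

    # If no backup meets the retention period, nothing can be expired.
    if not old_enough:
        return []

    # The newest backup that meets the period must be retained so that
    # the whole retention window stays covered.
    anchor_age = min(old_enough.values())

    # Everything strictly older than the anchor can be expired.
    return sorted(label for label, age in old_enough.items() if age > anchor_age)


if __name__ == "__main__":
    # The example from the commit message: only F1 is expired.
    print(expired_full_backups({"F1": 25, "F2": 20, "F3": 10}, 15))
```

With the ages from the commit message and a 15-day period, F2 becomes the anchor, so F2 and F3 are retained and only F1 is returned.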
David Steele	e873ad6da0	Update Minio version to 2020-05-06T23-23-25Z in tests/documentation. This release fixes the issue we submitted regarding an unquoted eTag: https://github.com/minio/minio/issues/9517	2020-05-07 17:26:46 -04:00
David Steele	faabf1227d	Update Fedora container to Fedora 32. This allows unit testing on gcc 10. Also fix an incorrect enum in the config/config unit test that was caught by the new compiler.	2020-05-07 11:06:56 -04:00
David Steele	6646446d2a	Use Z_STREAM_END to detect gz compression finish. Checking for free space in the output buffer worked, but if the buffer was completely filled then deflate() would need to be called again, which was wasteful and a bit confusing for debugging. Instead, use Z_STREAM_END to detect that compression is done. This change was inspired by the bz2 implementation in `a021c9fe` since bz2 does not allow BZ2_bzCompress() to be called after BZ_STREAM_END is returned. That made it obvious that gz would prefer the same implementation, even if it is more tolerant. The documentation at https://www.zlib.net/manual.html agrees.	2020-05-07 10:22:22 -04:00
David Steele	f8509ab76c	Don't allow sudo to disable core dumps in test containers. Newer versions of sudo output this message to stderr when run in a container: sudo: setrlimit(RLIMIT_CORE): Operation not permitted See https://github.com/sudo-project/sudo/issues/42 for details. A simple workaround is to prevent sudo from disabling core dumps. This seems safe enough because if sudo is segfaulting then core files are the least of our worries.	2020-05-07 07:38:28 -04:00
David Steele	3a75589855	Don't allow sudo to disable core dumps in documentation. Newer versions of sudo output this message to stdout when run in a container: sudo: setrlimit(RLIMIT_CORE): Operation not permitted See https://github.com/sudo-project/sudo/issues/42 for details. A simple workaround is to prevent sudo from disabling core dumps. This seems safe enough because if sudo is segfaulting then core files are the least of our worries.	2020-05-07 07:23:15 -04:00
David Steele	ad784e1997	Add libz-dev to required build packages in Debian documentation. This is apparently not installed by default in Ubuntu 20.04 as it was in prior versions.	2020-05-07 07:12:42 -04:00
David Steele	12a5d8a155	Add arm64 to Travis-CI build matrix. apt.postgresql.org will soon be providing packages for arm64 so it's important that we support it. Testing on multiple architectures also helps expose potential issues in popular architectures. See `10a5182d` for an example.	2020-05-06 19:11:28 -04:00
David Steele	6be5ea3388	Suppress Valgrind errors on a per-VM basis. There are a number of Valgrind errors on Ubuntu 12.04 which do not happen on newer distro versions. However, suppressions for these errors have masked legitimate issues in subsequent code. Instead, make suppressions VM specific so errors in other VMs are not masked.	2020-05-06 18:24:48 -04:00
David Steele	28967951ab	Fix leak in TlsClient object. sckClientOpen() is the most likely part of this code to error so move it up above SSL session creation to reduce the chance of a leak.	2020-05-06 18:17:50 -04:00
David Steele	e677929802	Fix leak in CipherBlock object. EVP_CIPHER_CTX_cleanup() was being called instead of EVP_CIPHER_CTX_free() so most of the memory was being freed but not all of it. This leak was masked by Valgrind suppressions which are only applicable to Ubuntu 12.04, which will be addressed in a future commit.	2020-05-06 18:09:11 -04:00
David Steele	10a5182d62	Simplify retry handling in tls/http/socket clients. Travis-CI arm64 was not happy with this pattern, perhaps because connected was being reset after a longjmp() even though it should have stayed with its originally initialized value of false. In any case, tlsClientOpen() ended up returning NULL on error rather than throwing an exception. The new pattern seems simpler and passes all tests unmodified, so even though the error was only seen in TlsClient it makes sense to propagate to the other clients.	2020-05-06 15:00:34 -04:00
David Steele	8aede3353c	Always use 127.0.0.1 on TLS tests outside of containers. Resolving localhost can vary based on the local network configuration so it is safer to just use a static IP. This was found while testing on Travis-CI arm64.	2020-05-06 14:49:03 -04:00
David Steele	3fe6ad5047	Build branches with -cit suffix on Travis-CI. This allows a branch to be targeted at only Travis-CI and not other CI services.	2020-05-06 10:23:42 -04:00
David Steele	4c6dbe17ea	Remove redundant Cirrus-CI branch filter.	2020-05-06 09:12:50 -04:00
Stephen Frost	a021c9fe05	Add bzip2 compression support. bzip2 is a widely available, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), while being around twice as fast at compression and six times faster at decompression. bzip2 is currently available on all supported platforms.	2020-05-05 16:49:01 -04:00
David Steele	98f30ef222	Move PostgreSQL 9.4 real integration tests to Ubuntu 18.04. PostgreSQL 9.4 packages for RHEL 6 were dropped from yum.p.o.	2020-05-05 15:00:13 -04:00
David Steele	99405cbb15	Replace booleans with enums in compressType parameters. This was an oversight in `438b957f` which added multiple compression type support. The booleans were interpreted as none and gz which works fine for the CompressType enum until the position of gz or none changes.	2020-05-05 13:23:36 -04:00
David Steele	d04c21ca83	Centralize String and Buffer constants in stringz.h. It's not clear how useful single-character zero-terminated constants are or if we want to propagate them through the code, but it at least makes sense to centralize the constants used by the Buffer and String objects.	2020-05-04 19:05:38 -04:00
David Steele	47aa765375	Add Zstandard compression support. Zstandard is a fast lossless compression algorithm targeting real-time compression scenarios at zlib-level and better compression ratios. It's backed by a very fast entropy stage, provided by Huff0 and FSE library. Zstandard version >= 1.0 is required, which is generally only available on newer distributions.	2020-05-04 15:25:27 -04:00
David Steele	1aaaa94253	Remove Ubuntu 19.04 container definition. Ubuntu 19.04 is no longer supported.	2020-05-04 14:02:25 -04:00
David Steele	39f5f3a0b4	Remove PostgreSQL 9.4 for Fedora 30 dropped from yum.p.o.	2020-05-04 13:12:52 -04:00
David Steele	64a21920e2	Move S3 initialization in user guide quickstart. The previous location was too late to allow --var=s3-all=y to work with --require=/repo-host, which depends on /quickstart/configure-archiving. Since the section is not included in production documentation, the position is not very important to flow so just move it to where it works.	2020-05-03 18:42:33 -04:00
David Steele	ef93249922	Add contributor for `816ba924` and reclassify as a bug.	2020-05-01 17:32:31 -04:00
David Steele	816ba9244f	Allow pg1-path to be optional for synchronous archive-push. If the WAL path is absolute then pg1-path should be optional but in fact it was required to load pg_control. Skip the pg_control check when pg1-path is not specified. The check against the stanza version/system-id remains to protect the repo from corruption.	2020-05-01 10:30:35 -04:00
David Steele	1d45282b97	Add missing spaces between while keyword and condition. Our convention is to have a space here but some were missed.	2020-05-01 09:31:50 -04:00
David Steele	28ab65df10	Remove unused struct member. Perhaps this was intended to verify the WAL size but was never implemented. Verifying the WAL size is probably a good idea so this member may be added back if the feature is implemented.	2020-05-01 09:08:37 -04:00
David Steele	22ba1f02ce	Convert storagePosixNew() to storagePosixNewP(). An upcoming feature requires new parameters for storagePosixNew() and this causes a lot of churn because almost every test creates a Posix storage object. Some refactoring in the tests might reduce this duplication but storagePosixNew() is collecting a lot of parameters so converting to storagePosixNewP() makes sense in any case. There are relatively few call sites in the core code but they still benefit from better readability after this change.	2020-04-30 11:01:38 -04:00
David Steele	baf8cb9068	Fix issue checking if file links are contained in path links. There is no conflict if the path containing a file link is a parent path of a path link. The Perl code apparently had this right but the migration to C missed it. Exclude this case when checking for link conflicts.	2020-04-30 10:47:09 -04:00

2789 Commits