mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00

1222 Commits

Author SHA1 Message Date
David Steele
4d84820021 Improve performance of info file load/save.
Info files required three copies in memory to be loaded (the original string, an ini representation, and the final info object). Not only was this memory inefficient but the Ini object does sequential scans when searching for keys making large files very slow to load.

This has not been an issue since archive.info and backup.info are very small, but it becomes a big deal when loading manifests with hundreds of thousands of files.

Instead of holding copies of the data in memory, use a callback to deliver the ini data directly to the object when loading. Use a similar method for save to avoid having an intermediate copy. Save is a bit more complex because sections/keys must be written in alphabetical order or older versions of pgBackRest will not calculate the correct checksum.

Also move the load retry logic to helper functions rather than embedding it in the Info object. This allows for more flexibility in loading and ensures that stack traces will be available when developing unit tests.

Reviewed by Cynthia Shang.
2019-09-06 13:48:28 -04:00
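The callback approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the pgBackRest implementation: iniLoadSketch(), IniLoadCallback, and LoadStats are hypothetical names, and the parser only handles plain [section] and key=value lines.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical callback type: receives one section/key/value triple at a time
typedef void (*IniLoadCallback)(void *data, const char *section, const char *key, const char *value);

// Parse ini text line by line and hand each key/value pair straight to the callback, so no
// intermediate ini object is built. The buffer is modified in place.
static void
iniLoadSketch(char *text, IniLoadCallback callback, void *data)
{
    char section[64] = "";

    for (char *line = text; line != NULL && *line != '\0';)
    {
        // Terminate the current line and remember where the next one starts
        char *next = strchr(line, '\n');

        if (next != NULL)
            *next++ = '\0';

        size_t length = strlen(line);

        if (line[0] == '[' && length >= 2 && line[length - 1] == ']')
        {
            // Section header, e.g. [db]
            snprintf(section, sizeof(section), "%.*s", (int)(length - 2), line + 1);
        }
        else
        {
            char *equal = strchr(line, '=');

            if (equal != NULL)
            {
                *equal = '\0';
                callback(data, section, line, equal + 1);
            }
        }

        line = next;
    }
}

// Example consumer (hypothetical): counts keys and remembers the last value seen
typedef struct LoadStats
{
    int keyTotal;
    char lastValue[64];
} LoadStats;

static void
loadStatsCallback(void *data, const char *section, const char *key, const char *value)
{
    (void)section;
    (void)key;

    LoadStats *stats = data;

    stats->keyTotal++;
    snprintf(stats->lastValue, sizeof(stats->lastValue), "%s", value);
}
```

The consumer decides what to keep from each pair, so the only full copy of the data that has to exist is the final object.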
David Steele
7334f30c35 Add helper function for adding CipherBlock filters to groups.
Reviewed by Cynthia Shang.
2019-09-06 13:35:28 -04:00
David Steele
8df7d68c8d Fix sudo missed in "Build pgBackRest as an unprivileged user".
286a106a updated the documentation to build pgBackRest as an unprivileged user, but the wget command was missed.  This command is not actually run, just displayed, because the release is not yet available when the documentation is built.

Update the wget command to run as the local user.
2019-09-03 18:28:53 -04:00
David Steele
005684bf1f Begin v2.18 development. 2019-09-03 17:53:50 -04:00
David Steele
ce2bf29998 v2.17: C Migrations and Bug Fixes
Bug Fixes:

* Improve slow manifest build for very large quantities of tables/segments. (Reported by Jens Wilke.)
* Fix exclusions for special files. (Reported by CluelessTechnologist, Janis Puris, Rachid Broum.)

Improvements:

* The stanza-create/update/delete commands are implemented entirely in C. (Contributed by Cynthia Shang.)
* The start/stop commands are implemented entirely in C. (Contributed by Cynthia Shang.)
* Create log directories/files with 0750/0640 mode. (Suggested by Damiano Albani.)

Documentation Bug Fixes:

* Fix yum.p.o package being installed when custom package specified. (Reported by Joe Ayers, John Harvey.)

Documentation Improvements:

* Build pgBackRest as an unprivileged user. (Suggested by Laurenz Albe.)
2019-09-03 16:39:32 -04:00
David Steele
0b5720c642 Fix yum.p.o package being installed when custom package specified.
The {[os-type-is-centos]} expression was missing parens which meant "and" expressions built on it would always evaluate true if the os-type was centos6.

Reported by Joe Ayers, John Harvey.
2019-09-03 14:34:49 -04:00
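The precedence problem in the commit above can be illustrated in C, where && binds tighter than ||. This is only an analogy: the documentation expression language is not C, the function names are hypothetical, and it is an assumption that the expansion joined two os-type comparisons with an "or" that binds more loosely than "and".

```c
#include <stdbool.h>

// How an unparenthesized expansion groups once an "and" term is appended: the first
// comparison stands alone, so centos6 makes the whole expression true.
static bool
installCommunityBroken(int centosVersion, bool noCustomPackage)
{
    return centosVersion == 6 || (centosVersion == 7 && noCustomPackage);
}

// With parens around the os-type test, the appended "and" term applies to both versions
static bool
installCommunityFixed(int centosVersion, bool noCustomPackage)
{
    return (centosVersion == 6 || centosVersion == 7) && noCustomPackage;
}
```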
Josh Soref
4a88791a0a Fix typos in the release notes.
Contributed by Josh Soref.
2019-08-26 12:29:43 -04:00
Josh Soref
545ccfa878 Fix typos in the documentation.
Contributed by Josh Soref.
2019-08-26 12:26:00 -04:00
Josh Soref
c2771e5469 Fix comment typos.
This includes some variable names in tests which don't seem important enough for their own commits.

Contributed by Josh Soref.
2019-08-26 12:05:36 -04:00
David Steele
01c2669b97 Fix exclusions for special files.
Prior to 2.16 the Perl manifest code would skip any file that began with a dot.  This was not intentional but it allowed PostgreSQL socket files to be located in the data directory.  The new C code in 2.16 did not have this unintentional exclusion so socket files in the data directory caused errors.

Worse, the file type error was being thrown before the exclusion check so there was really no way around the issue except to move the socket files out of the data directory.

Special file types (e.g. socket, pipe) will now be automatically skipped and a warning logged to notify the user of the exclusion.  The warning can be suppressed with an explicit --exclude.

Reported by CluelessTechnologist, Janis Puris, Rachid Broum.
2019-08-23 07:47:54 -04:00
David Steele
c002a2ce2f Move info file checksum to the end of the file.
Putting the checksum at the beginning of the file made it impossible to stream the file out when saving.  The entire file had to be held in memory while it was checksummed so the checksum could be written at the beginning.

Instead place the checksum at the end.  This does not break the existing Perl or C code since the read is not order dependent.

There are no plans to improve the Perl code to take advantage of this change, but it will make the C implementation more efficient.

Reviewed by Cynthia Shang.
2019-08-21 19:45:48 -04:00
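To see why a trailing checksum allows streaming, consider the sketch below. Each chunk is written as soon as it is produced while a running checksum is updated, and the checksum is emitted last. The StreamWriter names and the [checksum] section are hypothetical, and FNV-1a is used only as a small stand-in hash; it is not the checksum pgBackRest uses.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Hypothetical streaming writer with a running checksum (FNV-1a 32-bit as a stand-in)
typedef struct StreamWriter
{
    FILE *file;
    uint32_t checksum;
} StreamWriter;

static void
streamWriterInit(StreamWriter *writer, FILE *file)
{
    writer->file = file;
    writer->checksum = 2166136261u;                                 // FNV-1a offset basis
}

// Write a chunk immediately and fold it into the running checksum
static void
streamWriterWrite(StreamWriter *writer, const char *chunk)
{
    for (const char *ptr = chunk; *ptr != '\0'; ptr++)
    {
        writer->checksum ^= (unsigned char)*ptr;
        writer->checksum *= 16777619u;                              // FNV prime
    }

    fputs(chunk, writer->file);
}

// Emit the checksum last, after all content has already been streamed out
static void
streamWriterFinish(StreamWriter *writer)
{
    fprintf(writer->file, "\n[checksum]\nvalue=%08x\n", (unsigned int)writer->checksum);
}
```

With the checksum at the start of the file, streamWriterWrite() could not write anything until every chunk had been seen.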
Cynthia Shang
c733319063 The stanza-create/update/delete commands are implemented entirely in C.
Contributed by Cynthia Shang.
2019-08-21 16:26:28 -04:00
David Steele
286a106ae4 Build pgBackRest as an unprivileged user.
pgBackRest was being built by root in the documentation which is definitely not best practice.

Instead build as the unprivileged default container user.  Sudo privileges are still required to install.

Suggested by Laurenz Albe.
2019-08-20 09:46:29 -04:00
David Steele
9eaeb33c88 Fix slow manifest build for very large quantities of tables/segments.
storagePosixInfoList() processed each directory in a single memory context.  If the directory contained hundreds of thousands of files processing became very slow due to the number of allocations.

Instead, reset the memory context every thousand files to minimize the number of allocations active at once, improving both speed and memory consumption.

Reported by Jens Wilke.
2019-08-19 21:36:01 -04:00
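The periodic reset described in the commit above can be sketched with a toy bump allocator standing in for a memory context. Arena, arenaAlloc(), and processFiles() are hypothetical and do not reflect the pgBackRest memory context API; the point is only that resetting every thousand files caps the number of allocations live at once.

```c
#include <stddef.h>

#define ARENA_SIZE                                                  65536
#define RESET_INTERVAL                                              1000

// Hypothetical bump allocator standing in for a memory context
typedef struct Arena
{
    unsigned char buffer[ARENA_SIZE];
    size_t used;                                                    // Bytes handed out since last reset
    size_t allocLive;                                               // Allocations live since last reset
    size_t allocPeak;                                               // Highest number live at once
} Arena;

static void *
arenaAlloc(Arena *arena, size_t size)
{
    if (arena->used + size > ARENA_SIZE)
        return NULL;

    void *result = arena->buffer + arena->used;
    arena->used += size;

    if (++arena->allocLive > arena->allocPeak)
        arena->allocPeak = arena->allocLive;

    return result;
}

static void
arenaReset(Arena *arena)
{
    arena->used = 0;
    arena->allocLive = 0;
}

// Process fileTotal entries, resetting the arena every RESET_INTERVAL files. Returns the
// number of files processed (less than fileTotal only if an allocation fails).
static size_t
processFiles(Arena *arena, size_t fileTotal)
{
    size_t processed = 0;

    for (size_t fileIdx = 0; fileIdx < fileTotal; fileIdx++)
    {
        // Per-file scratch allocation
        if (arenaAlloc(arena, 32) == NULL)
            break;

        processed++;

        if (processed % RESET_INTERVAL == 0)
            arenaReset(arena);
    }

    return processed;
}
```

Without the reset, this toy arena would run out of space after 2048 files; with it, any number of files can be processed with at most a thousand scratch allocations outstanding.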
David Steele
41b6795a37 Create log directories/files with 0750/0640 mode.
The log directories/files were being created with a mix of modes depending on whether they were created in C or Perl.  In particular, the C code was creating log files with the execute bit set for the user and group which was just odd.

Standardize on 750/640 for both code paths.

Suggested by Damiano Albani.
2019-08-17 14:15:37 -04:00
Cynthia Shang
382ed92825 The start/stop commands are implemented entirely in C.
The Perl versions remain because they are still being used by the Perl stanza commands.  Once the stanza commands are migrated they can be removed.

Contributed by Cynthia Shang.
2019-08-09 15:17:18 -04:00
David Steele
efc62c9057 Begin v2.17 development. 2019-08-05 12:32:06 -04:00
David Steele
9e730c1bd6 v2.16: C Migrations and Bug Fixes
Bug Fixes:

* Retry S3 RequestTimeTooSkewed errors instead of immediately terminating. (Reported by sean0101n, Tim Garton, Jesper St John, Aleš Zelený.)
* Fix incorrect handling of transfer-encoding response to HEAD request. (Reported by Pavel Suderevsky.)
* Fix scoping violations exposed by optimizations in gcc 9. (Reported by Christian Lange, Ned T. Crigler.)

Features:

* Add repo-s3-port option for setting a non-standard S3 service port.

Improvements:

* The local command for backup is implemented entirely in C. (Contributed by David Steele, Cynthia Shang.)
* The check command is implemented partly in C. (Reviewed by Cynthia Shang.)
2019-08-05 12:03:04 -04:00
David Steele
3d3003e9ca The check command is implemented partly in C.
Implement switch WAL and archive check in C but leave the rest in Perl for now.

The main idea was to have some real integration tests for the new database code so the rest of the migration can wait.

Reviewed by Cynthia Shang.
2019-08-01 20:35:01 -04:00
David Steele
e4901d50d5 Add Db object to encapsulate PostgreSQL queries and commands.
Migrate functionality from the Perl Db module to C. For now this is just enough to implement the WAL switch check.

Add the dbGet() helper function to get Db objects easily.

Create macros in harnessPq to make writing pq scripts easier by grouping commonly used functions together.

Reviewed by Cynthia Shang.
2019-08-01 15:38:27 -04:00
David Steele
f9e1f3a798 Retry S3 RequestTimeTooSkewed errors instead of immediately terminating.
The cause of this error seems to be that a failed request takes so long that a subsequent retry at the http level uses outdated headers.

We're not sure if pgBackRest is to blame here (in one case a kernel downgrade fixed it, in another case an incorrect network driver was the problem) so add retries to hopefully deal with the issue if it is not too persistent.  If SSL_write() has long delays before reporting an error then this will obviously affect backup performance.

Reported by sean0101n, Tim Garton, Jesper St John, Aleš Zelený.
2019-08-01 14:28:30 -04:00
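A retry loop of the kind described in the commit above could be sketched as follows. All names here (S3Response, s3RequestWithRetry(), FakeServer) are hypothetical, and the fake request function exists only so the sketch is self-contained. The one S3 detail relied on is that RequestTimeTooSkewed is reported with HTTP status 403.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical response summary
typedef struct S3Response
{
    int httpCode;
    const char *errorCode;                                          // NULL on success
} S3Response;

typedef S3Response (*S3RequestFn)(void *context);

// Reissue the request while it fails with RequestTimeTooSkewed, up to retryMax retries. Each
// attempt calls request() again so headers are regenerated rather than reused.
static S3Response
s3RequestWithRetry(S3RequestFn request, void *context, unsigned int retryMax, unsigned int *attemptTotal)
{
    S3Response response;
    unsigned int attempt = 0;

    do
    {
        response = request(context);
        attempt++;
    }
    while (attempt <= retryMax && response.httpCode == 403 && response.errorCode != NULL &&
           strcmp(response.errorCode, "RequestTimeTooSkewed") == 0);

    if (attemptTotal != NULL)
        *attemptTotal = attempt;

    return response;
}

// Fake server for illustration: fails with a skew error a fixed number of times
typedef struct FakeServer
{
    unsigned int skewedRemaining;
} FakeServer;

static S3Response
fakeRequest(void *context)
{
    FakeServer *server = context;

    if (server->skewedRemaining > 0)
    {
        server->skewedRemaining--;
        return (S3Response){.httpCode = 403, .errorCode = "RequestTimeTooSkewed"};
    }

    return (S3Response){.httpCode = 200, .errorCode = NULL};
}
```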
David Steele
554d98746a Add repo-s3-port option for setting a non-standard S3 service port.
If this option is set then ports appended to repo-s3-endpoint or repo-s3-host will be ignored.

Setting this option explicitly may be the only way to use a bare ipv6 address with S3 (since multiple colons confuse the parser) but we plan to improve this in the future.
2019-07-25 17:36:51 -04:00
David Steele
d8ca0e5c5b Add Perl interface to C PgQuery object.
This validates that all current queries work with the new interface and removes the dependency on DBD::Pg.
2019-07-25 17:05:39 -04:00
David Steele
415542b4a3 Add PostgreSQL query client.
This direct interface to libpq allows simple queries to be run against PostgreSQL and supports timeouts.

Testing is performed using a shim that can use scripted responses to test all aspects of the client code.  The shim will be very useful for testing backup scenarios on complex topologies.

Reviewed by Cynthia Shang.
2019-07-25 14:50:02 -04:00
David Steele
59f135340d The local command for backup is implemented entirely in C.
The local process is now entirely migrated to C.  Since all major I/O operations are performed in the local process, the vast majority of I/O is now performed in C.

Contributed by David Steele, Cynthia Shang.
2019-07-25 14:34:16 -04:00
David Steele
3bdba4933d Fix incorrect handling of transfer-encoding response to HEAD request.
The HTTP server can use either content-length or transfer-encoding to indicate that there is content in the response.  HEAD requests do not include content but return all the same headers as GET.  In the HEAD case we were ignoring content-length but not transfer-encoding which led to unexpected eof errors on AWS S3.  Our test server, minio, uses content-length so this was not caught in integration testing.

Ignore all content for HEAD requests (no matter how it is reported) and add a unit test for transfer-encoding to prevent a regression.

Found by Pavel Suderevsky.
2019-07-17 16:49:42 -04:00
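The rule adopted in the commit above, that a HEAD response never has content regardless of which header announces it, can be sketched like this. ResponseHeaders and responseHasContent() are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical summary of the response headers relevant to content handling
typedef struct ResponseHeaders
{
    bool hasContentLength;
    unsigned long contentLength;
    bool chunked;                                                   // transfer-encoding: chunked
} ResponseHeaders;

static bool
responseHasContent(const char *verb, const ResponseHeaders *headers)
{
    // HEAD responses carry the same headers as GET but never any content, so ignore both
    // content-length and transfer-encoding
    if (strcmp(verb, "HEAD") == 0)
        return false;

    return headers->chunked || (headers->hasContentLength && headers->contentLength > 0);
}
```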
Cynthia Shang
6a89c1526e Revert a2dcdc07.
It is simpler to implement the required logic in stanza-delete rather than add complexity to this function.

Contributed by Cynthia Shang.
2019-07-10 12:04:25 -04:00
David Steele
a22a6dc08c Update contributor name. 2019-07-10 06:06:07 -04:00
Cynthia Shang
a2dcdc0711 Update lockStopTest() to optionally return a result rather than error.
Some commands (e.g. stanza-delete) would prefer to throw a customized error.

Contributed by Cynthia Shang.
2019-07-09 16:41:58 -04:00
David Steele
fc21013522 Fix scoping violations exposed by optimizations in gcc 9.
gcc < 9 makes all compound literals function scope, even though the C spec requires them to be invalid outside the current scope.  Since the compiler and valgrind were not enforcing this we had a few violations which caused problems in gcc >= 9.

Even though we are not quite ready to support gcc 9 officially, fix the scoping violations that currently exist in the codebase.

Reported by chrlange, Ned T. Crigler.
2019-07-05 16:25:28 -04:00
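The kind of scoping violation described in the commit above can be illustrated as follows. The Param type and paramValue() are hypothetical. The broken pattern is shown only in a comment because executing it is undefined behavior.

```c
#include <stdbool.h>

typedef struct Param
{
    const char *name;
    int value;
} Param;

// Broken pattern: the compound literal's lifetime ends with the block that contains it, so
// the pointer dangles afterwards. Older gcc happened to keep the storage alive for the whole
// function, which hid the bug.
//
//     const Param *param = &(Param){.name = "default", .value = 0};
//
//     if (override)
//     {
//         param = &(Param){.name = "override", .value = 1};
//     }
//
//     return param->value;                                         // Dangling when override is true

static int
paramValue(bool override)
{
    // Fixed: the storage is declared in the scope where it is used
    Param param = {.name = "default", .value = 0};

    if (override)
    {
        // Copied by value while the compound literal is still alive
        param = (Param){.name = "override", .value = 1};
    }

    return param.value;
}
```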
David Steele
c55009d0f9 Community yum package can be installed with --var=package=yum.
Like apt, the community yum package can now be installed instead of a user-specified package.
2019-06-27 14:39:11 -04:00
David Steele
4815752ccc Add Perl interface to C storage layer.
Maintaining the storage layer/drivers in two languages is burdensome.  Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C.  Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents.

The goal is to move the integration tests to C so this interface will eventually be removed.  That being the case, the interface was designed for maximum compatibility to ease the transition.  The result looks a bit hacky but we'll improve it as needed until it can be retired.
2019-06-26 08:24:58 -04:00
David Steele
466602387b Begin v2.16 development. 2019-06-25 08:42:20 -04:00
David Steele
6650d8144c v2.15: C Implementation of Expire
Bug Fixes:

* Fix archive retention expiring too aggressively. (Fixed by Cynthia Shang. Reported by Mohamad El-Rifai.)

Improvements:

* The expire command is implemented entirely in C. (Contributed by Cynthia Shang.)
* The local command for restore is implemented entirely in C.
* Remove hard-coded PostgreSQL user so $PGUSER works. (Suggested by Julian Zhang, Janis Puris.)
* Honor configure --prefix option. (Suggested by Daniel Westermann.)
* Rename repo-s3-verify-ssl option to repo-s3-verify-tls. The new name is preferred because pgBackRest does not support any SSL protocol versions (they are all considered to be insecure). The old name will continue to be accepted.

Documentation Improvements:

* Add FAQ to the documentation. (Contributed by Cynthia Shang.)
* Use wal_level=replica in the documentation for PostgreSQL ≥ 9.6. (Suggested by Patrick McLaughlin.)
2019-06-25 08:29:06 -04:00
David Steele
51fcaee43e Add host-repo-path variable internal replacement.
This variable needs to be replaced right before being used without being added to the cache since the host repo path will vary from system to system.

This is frankly a bit of a hack to get the documentation to build in the Debian packages for the upcoming release.  We'll need to come up with something more flexible going forward.
2019-06-25 07:58:38 -04:00
David Steele
c22e10e4a9 Honor configure --prefix option.
The --prefix option was entirely ignored and DESTDIR was a combination of DESTDIR and bindir.

Bring both in line with recommendations for autoconf and make as specified in https://www.gnu.org/software/make/manual/html_node/Directory-Variables.html and https://www.gnu.org/prep/standards/html_node/DESTDIR.html.

Suggested by Daniel Westermann.
2019-06-24 15:42:33 -04:00
Cynthia Shang
62715ebf2d Fix archive retention expiring too aggressively.
The problem appeared when repo1-archive-retention-type was set to diff.  In this case repo1-archive-retention ended up being effectively equal to one, which meant PITR was only possible from the last backup.  WAL required for consistency was still preserved for all backups.

This issue is not present in the C migration committed at 434cd832, which was written before this bug was reported.  Even so, we wanted to note this issue in the release notes in case any other users have been affected.

Fixed by Cynthia Shang.
Reported by Mohamad El-Rifai.
2019-06-19 17:49:38 -04:00
David Steele
a7d64bab7a Add FAQ on where to find old Debian/Ubuntu packages. 2019-06-18 19:02:09 -04:00
Cynthia Shang
e2d791394a Add FAQ to the documentation.
Contributed by Cynthia Shang.
2019-06-18 18:42:47 -04:00
David Steele
434cd83285 The expire command is implemented entirely in C.
This implementation duplicates the functionality of the Perl code but does so with different logic and includes full unit tests.

Along the way at least one bug was fixed, see issue #748.

Contributed by Cynthia Shang.
2019-06-18 15:19:20 -04:00
David Steele
0efdf2576f Remove hard-coded PostgreSQL user so $PGUSER works.
The PostgreSQL user was hard-coded to the OS user which libpq will automatically use if $PGUSER is not set, so this code was redundant and prevented $PGUSER from working when set.

Suggested by Julian Zhang, Janis Puris.
2019-06-18 07:35:34 -04:00
Cynthia Shang
c64c9c0590 Add backup management functions to InfoBackup.
Allow current backups to be listed and deleted.

Also expose some constants required by expire and stanza-* commands.

Contributed by Cynthia Shang.
2019-06-17 06:59:06 -04:00
Cynthia Shang
44bafc127d Rename info*New() functions to info*NewLoad().
These names more accurately reflect what the functions do and follow the convention started in Info and InfoPg.

Also remove the ignoreMissing parameter since it was never used.

Contributed by Cynthia Shang.
2019-06-17 06:47:15 -04:00
David Steele
6e809e578f Add tag to specify minio version to use for documentation build.
The new minio major release broke the build.  We'll need to figure that out but for now use the last major version, which is known to work.
2019-06-11 10:34:42 -04:00
David Steele
d7bd0c58cd Use wal_level=replica in the documentation for PostgreSQL >= 9.6.
The documentation was using wal_level=hot_standby which is a deprecated setting.

Also remove the reference to wal_level=archive since it is no longer supported and is not recommended for older versions.

Suggested by Patrick McLaughlin.
2019-06-05 07:27:24 -04:00
David Steele
64260b2e98 Build all docs with S3 using --var=s3-all=y
Force repo-type=s3 for all tests.  This is not currently the default for any OS builds.
2019-05-29 08:38:45 -04:00
David Steele
404284b90f Add internal flag for commands.
Allow commands to be skipped by default in the command help but still work if help is requested for the command directly.  There may be other uses for the flag in the future.

Update help for ls now that it is exposed.
2019-05-28 12:18:05 -04:00
David Steele
20e5b92f36 Add ls command.
Allows listing repo paths/files from the command-line, to be used primarily for testing and debugging.

This command is internal-only so the interface may change at any time without notice.
2019-05-28 10:03:48 -04:00
David Steele
3e1b06acaa Use minio as local S3 emulator in documentation.
The documentation was relying on a ScalityS3 container built for testing which wasn't very transparent.  Instead, use the stock minio container and configure it in the documentation.

Also, install certificates and CA so that TLS verification can be enabled.
2019-05-27 07:37:20 -04:00
David Steele
ec9622cde8 Use the git log to ease release note management.
The release notes are generally a direct reflection of the git log.  So, ease the burden of maintaining the release notes by using the git log to determine what needs to be added.

Currently only non-dev items are required to be matched to a git commit but the goal is to account for all commits.

The git history cache is generated from the git log but can be modified to correct typos and match the release notes as they evolve.  The commit hash is used to identify commits that have already been added to the cache.

There's plenty more to do here.  For instance, links to the commits for each release item should be added to the release notes.
2019-05-22 18:54:49 -04:00