pgbackrest

mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00

Author	SHA1	Message	Date
David Steele	02aa03d1a2	Remove obsolete methods in pgBackRest::Storage::Storage module. All the methods in this module will need to be implemented via the command-line in order to get rid of LibC, so the first step is to reduce the code in the module as much as possible. First remove storageDb() and use storageTest() instead. Then create storageTest() using pgBackRestTest::Common::Storage which has no dependencies on LibC. Now the only storage using the LibC interface is storageRepo(). Remove all link functions since those operations cannot be performed on a repo unless it is Posix, in which case the LibC interface is not needed. Same for owner(). Remove pathSync() because syncs are not required in the tests. No test data is reused after a crash. Path create/exists functions should never be explicitly performed on a repo so remove those. File exists can be implemented by calling info() instead. Remove encryption detection functions which were only used by Backup/Archive::Info reconstruct() which are now obsolete. Remove all filters except pgBackRest::Storage::Filter::CipherBlock since they are not being used. That also means there are no filters returning results so remove all the result code. Move hashSize() and pathAbsolute() into pgBackRest::Storage::Base where they can be shared between pgBackRest::Storage::Storage and pgBackRestTest::Common::Storage.	2020-03-06 14:10:09 -05:00
David Steele	2e0fe25650	Remove dependency on LibC hash filter. Perl provides Digest::SHA for hashing so there is no need to expose this via LibC anymore.	2020-03-05 18:34:59 -05:00
David Steele	ee351682da	Rename "gzip" to "gz". "gz" was used as the extension but "gzip" was generally used for function and type naming. With a new compression format on the way, it makes sense to standardize on a single abbreviation to represent a compression format in the code. Since the extension is standard and we must use it, also use the extension for all naming.	2020-02-27 12:09:05 -05:00
David Steele	7d8068f27b	Don't decode manifest data when it is generated on a remote. Decoding a manifest from the JSON provided by C to the hash required by Perl is an expensive process. If manifest() was called on a remote it was being decoded into a hash and then immediately re-encoded into JSON for transmission over the protocol layer. Instead, provide a function for the remote to get the raw JSON which can be transmitted as is and decoded in the calling process instead. This makes remote manifest calls as fast as they were before 2.16, but local calls must still pay the decoding penalty and are therefore slower. This will continue to be true until the Perl storage interface is retired at the end of the C migration. Note that for reasonable numbers of tables there is no detectable difference. The case in question involved 250K tables with a 10 minute decode time (which was being doubled) on a fast workstation.	2019-09-03 12:30:45 -04:00
Josh Soref	c2771e5469	Fix comment typos. This includes some variable names in tests which don't seem important enough for their own commits. Contributed by Josh Soref.	2019-08-26 12:05:36 -04:00
David Steele	4815752ccc	Add Perl interface to C storage layer. Maintaining the storage layer/drivers in two languages is burdensome. Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C. Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents. The goal is to move the integration tests to C so this interface will eventually be removed. That being the case, the interface was designed for maximum compatibility to ease the transition. The result looks a bit hacky but we'll improve it as needed until it can be retired.	2019-06-26 08:24:58 -04:00
David Steele	cb00030ee3	Remove dead code missed in `1b486847`. This commit removed all Perl references to spool storage but some stuff was left behind.	2019-05-08 18:58:07 -04:00
David Steele	32ca27a20b	Simplify storage object names. Remove "File" and "Driver" from object names so they are shorter and easier to keep consistent. Also remove the "driver" directory so storage implementations are visible directly under "storage".	2019-05-03 15:46:15 -04:00
David Steele	d211c2b8b5	Fix possible truncated WAL segments when an error occurs mid-write. The file write object destructors called close() and finalized the file even if it was not completely written. This was an issue in both the C and Perl code. Rewrite the destructors to simply free resources (like file handles) rather than calling the close() method. This leaves the temp file in place for filesystems that use temp files. Add unit tests to prevent regression. Reported by blogh.	2019-02-15 11:52:39 +02:00
David Steele	ef9dc89e08	Update Storage::Local->list() to accept an undefined path. The Perl code has a tendency to generate absolute paths even when they are not needed. This change helps the C and Perl storage work together via the protocol layer.	2019-01-16 18:49:12 +02:00
David Steele	e73416e9e3	Change file ownership only when required. Previously chown() would be called even when no ownership changes were required. In most cases changes are not required and it seems better to perform an extra stat() rather than an extra chown(). Also add unit tests for owner() since there weren't any.	2018-12-05 17:56:47 -05:00
David Steele	bf873be4aa	Redact authentication header when throwing S3 errors. The authentication header contains the access key (not the secret key) so don't include it in errors that can be seen at any log level. Suggested by Brad Nicholson.	2018-12-05 12:51:13 -05:00
David Steele	1ad67644da	Remove request for S3 object info directly after putting it. After a file is copied during backup the size is requested from the storage in case it differs from what was written so that repo-size can be reported accurately. This is useful for situations where compression is being done by the filesystem (e.g. ZFS) and what is stored can differ in size from what was written. In S3 the reported size will always be exactly what was written so there is no need to check the size and doing so immediately can cause problems because the new file might not appear in list commands. This has not been observed on S3 (though it seems to be possible) but it has been reported on the Swift S3 gateway. Add a driver capability to determine if size needs to be called after a file is written and if not then simply use the number of bytes written for repo-size. Reported by Matt Kunkel.	2018-11-30 10:38:02 -05:00
David Steele	801e2a5a2c	Rename PGBACKREST/BACKREST constants to PROJECT. This brings consistency between the C and Perl constants and allows for easier code reuse.	2018-11-24 19:05:03 -05:00
David Steele	cca7a4ffd4	Retry all S3 5xx errors rather than just 500 internal errors. We were already retrying 500 errors but 503 (rate-limiting) errors were not being retried and would cause an instant failure which aborted the command. There are only two 5xx errors currently implemented by S3 but instead of adding 503 simply retry all 5xx errors. This is consistent with the HTTP definition of this error class, "the server failed to fulfill an apparently valid request." Suggested by Craig A. James.	2018-10-30 16:45:42 -04:00
Cynthia Shang	b6b2c915b2	Allow hashSize() to run on remote storage. Apparently we never needed to run this function remotely. It will be needed by the backup checksum delta feature, so implement it now. Contributed by Cynthia Shang.	2018-09-18 11:39:48 -04:00
David Steele	c688bc8627	Improve support for special characters in filenames. % characters caused issues in backup/restore due to filenames being appended directly into a format string. Reserved XML characters (<>&') caused issues in the S3 driver due to improper escaping. Add a file with all common special characters to regression testing.	2018-09-10 10:54:34 -04:00
David Steele	80ef6fce75	Fix missing URI encoding in S3 driver. File names with uncommon characters (e.g. @) caused authentication failures due to S3 encoding them correctly while the S3 driver did not. Reported by Dan Farrell.	2018-09-10 10:47:00 -04:00
David Steele	375ff9f9d2	Ignore all files in a linked tablespace directory except the subdirectory for the current version of PostgreSQL. Previously an error would be generated if other files were present and not owned by the PostgreSQL user. This hasn't been a big deal in practice but it could cause issues. Also add tests to make sure the same logic applies with links to files, i.e. all other files in the directory should be ignored. This was actually working correctly, but there were no tests for it before.	2018-08-31 16:06:40 -04:00
Andrew Schwartz	1bd98b61df	Fix non-compliant ISO-8601 timestamp format in S3 authorization headers. AWS and some gateways tolerated space-padded rather than zero-padded hours while others did not. Fixed by Andrew Schwartz.	2018-07-01 08:17:27 -04:00
David Steele	350b30fa49	Move cryptographic hash functions to C using OpenSSL.	2018-06-11 14:52:26 -04:00
Yogesh Sharma	6a40c916d4	Add repo-s3-token option to allow temporary credentials tokens to be configured. pgBackRest currently has no way to request new credentials so the entire command (e.g. backup, restore) must complete before the credentials expire. Contributed by Yogesh Sharma.	2018-05-02 14:06:40 -04:00
David Steele	71ba08f579	Use path list in the backup manifest to do restore path syncs. Remove recursive path sync functionality since it is no longer used.	2018-05-01 11:05:37 -04:00
David Steele	54dd6f3ed4	Add asynchronous, parallel archive-get. This feature maintains a queue of WAL segments to help reduce latency when PostgreSQL requests a WAL segment with restore_command.	2018-04-30 17:27:39 -04:00
David Steele	4744eb9387	Add storagePathRemove() and use it in the Perl Posix driver. This implementation should be faster because it does not stat each file. It simply assumes that most directory entries are files, so it attempts an unlink() first. If the entry is reported by error codes to be a directory then it attempts an rmdir().	2018-04-11 08:21:09 -04:00
David Steele	348278bb68	Make backup directory sync more efficient. Scanning the entire backup directory can be very expensive if there are a lot of small tables. The backup manifest contains the backup directory list so use it to perform syncs instead of scanning the backup directory.	2018-04-03 21:30:15 -04:00
David Steele	5890272247	Fix directory syncs running recursively when only the specified directory should be synced. Reported by Craig A. James.	2018-04-03 18:12:03 -04:00
David Steele	599e41a251	Improve S3 delete performance. The constant S3_BATCH_MAX had been replaced with a hard-coded value of 2, probably during testing.	2018-02-18 14:54:32 -05:00
David Steele	7cf955425e	The C library is now required. This eliminates conditional loading and eases development of new library features.	2017-11-26 17:45:00 -05:00
David Steele	b8746f368d	Inflate performance improvement for gzip filter and full unit test coverage.	2017-11-14 15:12:31 -05:00
Cynthia Shang	b03c26968a	Repository encryption support. Contributed by Cynthia Shang.	2017-11-06 12:51:12 -05:00
David Steele	8d6a08a32b	Library code for repository encryption support.	2017-11-03 13:57:58 -04:00
David Steele	8674a4f7ae	Allow functions with sensitive options to be logged at debug level with redactions. Previously, functions with sensitive options had to be logged at trace level to avoid exposing them. Trace level logging may still expose secrets so use with caution.	2017-10-24 12:35:36 -04:00
David Steele	d989cf8ac2	Replace dynamically built class hierarchies in I/O layer with fixed parent() calls.	2017-10-22 19:07:17 -04:00
David Steele	1f120f3fce	Improve performance of list requests on S3. Any beginning literal portion of a filter expression is used to generate a search prefix which often helps keep the request small enough to avoid rate limiting. Suggested by Mihail Shvein.	2017-10-20 14:10:16 -04:00
David Steele	eea2ccc3ab	Add HTTP retries to harden against transient S3 network errors.	2017-09-03 16:48:41 -04:00
David Steele	206415d4c7	Fixed an issue that could cause compression to abort on growing files. Reported by Jesper St John, Aleksandr Rogozin.	2017-08-30 16:34:05 -04:00
David Steele	1e0ed07455	Configuration rules are now pulled from the C library when present.	2017-08-25 16:47:47 -04:00
David Steele	61c38f5808	Fixed authentication issue in S3 retry.	2017-08-09 11:27:09 -04:00
David Steele	038d47bcc0	Retry when S3 returns an internal error (500).	2017-08-08 17:15:01 -04:00
David Steele	156fd4d54d	Add bIgnoreMissing parameter to Local->manifest().	2017-07-25 12:44:38 -04:00
David Steele	f3b62d2d67	Fixed misleading error message when a file was opened for write in a missing directory.	2017-06-27 17:07:12 -04:00
David Steele	918c1c6f49	Add s3-repo-ca-path and s3-repo-ca-file options. The options accommodate systems where CAs are not automatically found by IO::Socket::SSL, i.e. RHEL7, or to load custom CAs. Suggested by Scott Frazer.	2017-06-22 18:22:49 -04:00
David Steele	f596702c5b	Improve S3 error reporting.	2017-06-21 20:46:49 -04:00
David Steele	f6d4457d58	Full/Synthetic test refactor. * Combine hardlink and non/compressed in synthetic tests to reduce test time and improve coverage. * Change log level of hardlink logging to detail. * Cast size in S3 manifest to integer.	2017-06-15 15:32:10 -04:00
David Steele	051c961151	S3 repository support.	2017-06-12 10:52:32 -04:00
David Steele	de7fc37f88	Storage and IO layer refactor: Refactor storage layer to allow for new repository filesystems using drivers. (Reviewed by Cynthia Shang.) Refactor IO layer to allow for new compression formats, checksum types, and other capabilities using filters. (Reviewed by Cynthia Shang.)	2017-06-09 17:51:41 -04:00

47 Commits