pgbackrest

mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00

Author	SHA1	Message	Date
David Steele	c279a00279	Add lz4 compression support. LZ4 compresses data faster than gzip but at a lower ratio. This can be a good tradeoff in certain scenarios. Note that setting compress-type=lz4 will make new backups and archive incompatible (unrestorable) with prior versions of pgBackRest.	2020-03-10 14:45:27 -04:00
David Steele	d3c83453de	Add repo-create, repo-get, repo-put, and repo-rm commands. These commands are generally useful but more importantly they allow removing LibC by providing the Perl integration tests an alternate way to work with repository storage. All the commands are currently internal only and should not be used on production repositories.	2020-03-09 17:15:03 -04:00
David Steele	5e1291a29f	Rename ls command to repo-ls. This command only makes sense for the repository storage since other storage (e.g. pg and spool) must be located on a local Posix filesystem and can be listed using standard unix commands. Since the repo storage can be located in many places, having a common way to list it makes sense. Prefix with repo- to make the scope of this command clear. Update documentation to reflect this change.	2020-03-09 16:41:04 -04:00
David Steele	438b957f9c	Add infrastructure for multiple compression type support. Add compress-type option and deprecate compress option. Since the compress option is boolean it won't work with multiple compression types. Add logic to cfgLoadUpdateOption() to update compress-type if it is not set directly. The compress option should no longer be referenced outside the cfgLoadUpdateOption() function. Add common/compress/helper module to contain interface functions that work with multiple compression types. Code outside this module should no longer call specific compression drivers, though it may be OK to reference a specific compression type using the new interface (e.g., saving backup history files in gz format). Unit tests only test compression using the gz format because other formats may not be available in all builds. It is the job of integration tests to exercise all compression types. Additional compression types will be added in future commits.	2020-03-06 14:41:03 -05:00
David Steele	e55443c890	Move logic from postgres/pageChecksum to command/backup/pageChecksum(). The postgres/pageChecksum module was designed as an interface to the C structs for the Perl code. The new C code can do this directly so no need for an interface. Move the remaining test for pgPageChecksum() into the postgres/interface test module.	2020-03-05 16:12:54 -05:00
David Steele	a86253f112	Remove obsolete function pageChecksumBufferTest(). This function made validation faster in Perl because fewer calls (and buffer transformations) were required when all checksums were valid. In C calling pageChecksumTest() directly is just as efficient so there is no longer a need for pageChecksumBufferTest().	2020-03-04 14:12:02 -05:00
David Steele	3f77a83e73	Remove raw option for gz compression. This was a minor optimization used in protocol layer compression. Even though it was slightly faster, it omitted the crc-32 that is generated during normal compression which could lead to corrupt data after a bad network transmission. This would be caught on restore by our checksum but it seems better to catch an issue like this early. The raw option also made the function signature different than future compression formats which may not support raw, or require different code to support raw. In general, it doesn't seem worth the extra testing to support a format that has minimal benefit and is seldom used, since protocol compression is only enabled when the transmitted data is uncompressed.	2020-02-27 12:19:40 -05:00
David Steele	ee351682da	Rename "gzip" to "gz". "gz" was used as the extension but "gzip" was generally used for function and type naming. With a new compression format on the way, it makes sense to standardize on a single abbreviation to represent a compression format in the code. Since the extension is standard and we must use it, also use the extension for all naming.	2020-02-27 12:09:05 -05:00
David Steele	44adf21c83	Consolidate archive async exec code. Move duplicated code to the common module. This will reduce copy and paste between the get and push modules when changes are made.	2020-02-10 21:30:43 -07:00
Cynthia Shang	856980ae99	Auto-select backup set on restore when time target is specified. Auto-selection is performed only when --set is not specified. If a backup set for the given target time cannot be found, the latest (default) backup set will be used. Currently a limited number of date formats are recognized and timezone names are not allowed, only timezone offsets.	2020-01-30 14:38:05 -07:00
David Steele	d2fb4f977c	Add httpLastModifiedToTime() to parse HTTP last-modified header.	2020-01-06 15:24:49 -07:00
David Steele	a08298ce1b	Add basic time management functions. These are similar to what mktime() and strptime() do but they ignore the local system timezone which saves having to munge the TZ env variable to do time conversions.	2020-01-06 15:18:52 -07:00
David Steele	f0ef73db70	pgBackRest is now pure C. Remove embedded Perl from the distributed binary. This includes code, configure, Makefile, and packages. The distributed binary is now pure C. Remove storagePathEnforceSet() from the C Storage object which allowed Perl to write outside of the storage base directory. Update mock/all and real/all integration tests to use storageLocal() where they were violating this rule. Remove "c" option that allowed the remote to tell if it was being called from C or Perl. Code to convert options to JSON for passing to Perl (perl/config.c) has been moved to LibC since it is still required for Perl integration tests. Update build and installation instructions in the user guide. Remove all Perl unit tests. Remove obsolete Perl code. In particular this included all the Perl protocol code which required modifications to the Perl storage, manifest, and db objects that are still required for integration testing but only run locally. Any remaining Perl code is required for testing, documentation, or code generation. Rename perlReq to binReq in define.yaml to indicate that the binary is required for a test. This had been the actual meaning for quite some time but the key was never renamed.	2019-12-13 17:55:41 -05:00
David Steele	1f2ce45e6b	The backup command is implemented entirely in C. For the most part this is a direct migration of the Perl code into C except as noted below. A backup can now be initiated from a linked directory. The link will not be stored in the manifest or recreated on restore. If a link or directory does not already exist in the restore location then a directory will be created. The logic for creating backup labels has been improved and it should no longer be possible to get a backup label earlier than the latest backup even with timezone changes or clock skew. This has never been an issue in the field that we know of, but we found it in testing. For online backups all times are fetched from the PostgreSQL primary host (before only copy start was). This doesn't affect backup integrity but it does prevent clock skew between hosts affecting backup duration reporting. Archive copy now works as expected when the archive and backup have different compression settings, i.e. when one is compressed and the other is not. This was a long-standing bug in the Perl code. Resume will now work even if hardlink settings have been changed. Reviewed by Cynthia Shang.	2019-12-13 17:14:26 -05:00
David Steele	d3132dae26	Add functions for building new manifests. New manifests are built before a backup is performed. Reviewed by Cynthia Shang.	2019-12-08 18:43:47 -05:00
David Steele	2cfde18755	Add pgLsnFromStr(), pgLsnToStr(), and pgLsnToWalSegment().	2019-12-08 14:19:47 -05:00
David Steele	d2587250da	Add backup functions to Db object. These functions implement the database backup functionality for all supported versions.	2019-12-07 18:44:06 -05:00
David Steele	158e439689	Remove obsolete Perl archive code. This should have been removed in `a1c13a50` but was missed.	2019-11-26 17:16:45 -05:00
David Steele	ab65ffdfac	Add protocolStorageType*() to manage protocol storage types. Abstract the string representation of storage types that are passed over the protocol layer.	2019-11-23 10:22:11 -05:00
David Steele	09e129886e	Add storageInfoList() support to remote storage driver.	2019-11-16 17:47:42 -05:00
David Steele	edcc7306a3	Add TIME parameter debug type. Previously we were using int64_t to debug time_t but this may not be right depending on how the compiler represents time_t, e.g. it could be a float. Since a mismatch would have caused a compiler error we are not worried that this has actually happened, and anyway the worst case is that the debug log would be wonky. The primary benefit, aside from correctness, is that it makes choosing a parameter debug type for time_t obvious.	2019-11-08 09:46:00 -05:00
David Steele	bcd3e4953a	Make perl/exec test container required. This test fails in some cases when --vm=none but it's not worth investigating since this code will be going away soon.	2019-10-10 22:10:20 -04:00
Cynthia Shang	a1c13a50dd	The check command is implemented entirely in C. Note that building the manifest on each host has been temporarily removed. This feature will likely be brought back as a non-default option (after the manifest code has been fully migrated to C) since it can be fairly expensive.	2019-10-08 18:04:09 -04:00
David Steele	45881c74ae	Allow most unit tests to run outside of a container. Three major changes were required to get this working: 1) Provide the path to pgbackrest in the build directory when running outside a container. Tests in a container will continue to install and run against /usr/bin/pgbackrest. 2) Set a per-test lock path so tests don't conflict on the default /tmp/pgbackrest path. Also set a per-test log-path while we are at it. 3) Use localhost instead of a custom host for TLS test connections. Tests in containers will continue to update /etc/hosts and use the custom host. Add infrastructure and update harnessCfgLoad*() to get the correct exe and paths loaded for testing. Since new tests are required to verify that running outside a container works, also rework the tests in Travis CI to provide coverage within a reasonable amount of time. Mainly, break up the doc tests by VM and run an abbreviated unit test suite on co6 and co7.	2019-10-08 12:06:30 -04:00
David Steele	29e132f5e9	PostgreSQL 12 support. Recovery settings are now written into postgresql.auto.conf instead of recovery.conf. Existing recovery_target* settings will be commented out to help avoid conflicts. A comment is added before recovery settings to identify them as written by pgBackRest since it is unclear how, in general, old settings will be removed. recovery.signal and standby.signal are automatically created based on the recovery settings.	2019-10-01 13:20:43 -04:00
David Steele	a58635ac02	Move C performance tests out of unit tests. Performance tests were being done in unit tests until there was a better place to put them. Now there is, so move them there.	2019-09-28 14:24:27 -04:00
David Steele	004ff99a2d	Identify Perl performance test by appending -perl. This is intended to differentiate the upcoming C performance tests from the Perl performance tests that will eventually be migrated.	2019-09-28 13:17:21 -04:00
David Steele	cb62bebadf	Use bsearch() on sorted lists rather than an iterative method. bsearch() is far more efficient than an iterative approach except in the most trivial cases. For now insert will reset the sort order to none and the list will need to be resorted before bsearch() can be used. This is necessary because item pointers are not stable after a sort, i.e. they can move around. Until lists are stable it's not a good idea to surprise the caller by mixing up their pointers on insert.	2019-09-28 10:08:20 -04:00
David Steele	451ae397be	The restore command is implemented entirely in C. For the most part this is a direct migration of the Perl code into C. There is one important behavioral change with regard to how file permissions are handled. The Perl code tried to set ownership as it was in the manifest even when running as an unprivileged user. This usually just led to errors and frustration. The C code works like this: If a restore is run as a non-root user (the typical scenario) then all files restored will belong to the user/group executing pgBackRest. If existing files are not owned by the executing user/group then an error will result if the ownership cannot be updated to the executing user/group. In that case the file ownership will need to be updated by a privileged user before the restore can be retried. If a restore is run as the root user then pgBackRest will attempt to recreate the ownership recorded in the manifest when the backup was made. Only user/group names are stored in the manifest so the same names must exist on the restore host for this to work. If the user/group name cannot be found locally then the user/group of the PostgreSQL data directory will be used and finally root if the data directory user/group cannot be mapped to a name. Reviewed by Cynthia Shang.	2019-09-26 07:52:02 -04:00
David Steele	c969137021	Migrate backup manifest load/save to C. The backup manifest stores a complete list of all files, links, and paths in a backup along with metadata such as checksums, sizes, timestamps, etc. A list of databases is also included for selective restore. The purpose of the manifest is to allow the restore command to confidently reconstruct the PostgreSQL data directory and ensure that nothing is missing or corrupt. It is also useful for reporting, e.g. size of backup, backup time, etc. For now, migrate enough functionality to implement the restore command. Reviewed by Cynthia Shang.	2019-09-23 13:50:46 -04:00
David Steele	1049632873	Add user module for managing system users/groups. Centralize the management of users and groups. Also update Posix storage driver where users/groups were already in use.	2019-09-08 20:11:51 -04:00
David Steele	0a96764cb8	Remove most references to PostgreSQL control and catalog versions. The control and catalog versions were stored in a variety of places in the optimistic hope that they would be useful. In fact they never were. We can't remove them from the backup.info and backup.manifest files due to backwards compatibility concerns, but we can at least avoid loading and storing them in C structures. Add functions to the PostgreSQL interface which will return the control and catalog versions for any supported version of PostgreSQL to allow backwards compatibility for backup.info and backup.manifest. These functions will be useful in other ways, e.g. generating the tablespace identifier in PostgreSQL >= 9.0.	2019-09-07 18:04:39 -04:00
David Steele	4d84820021	Improve performance of info file load/save. Info files required three copies in memory to be loaded (the original string, an ini representation, and the final info object). Not only was this memory inefficient but the Ini object does sequential scans when searching for keys making large files very slow to load. This has not been an issue since archive.info and backup.info are very small, but it becomes a big deal when loading manifests with hundreds of thousands of files. Instead of holding copies of the data in memory, use a callback to deliver the ini data directly to the object when loading. Use a similar method for save to avoid having an intermediate copy. Save is a bit complex because sections/keys must be written in alpha order or older versions of pgBackRest will not calculate the correct checksum. Also move the load retry logic to helper functions rather than embedding it in the Info object. This allows for more flexibility in loading and ensures that stack traces will be available when developing unit tests. Reviewed by Cynthia Shang.	2019-09-06 13:48:28 -04:00
Cynthia Shang	c733319063	The stanza-create/update/delete commands are implemented entirely in C. Contributed by Cynthia Shang.	2019-08-21 16:26:28 -04:00
Cynthia Shang	53f27da3a6	Add checkDbConfig() to compare pgBackRest/PostgreSQL configs. Checking the PostgreSQL-reported path and version against the pgBackRest configuration helps ensure that pgBackRest is operating against the correct cluster. In Perl this functionality was in the Db object, but check seems like a better place for it in C. Contributed by Cynthia Shang.	2019-08-21 15:41:52 -04:00
Cynthia Shang	fa640f22ad	Allow Info* objects to be created from scratch in C. Previously, info files (e.g. archive.info, backup.info) were created in Perl and only loaded in C. The upcoming stanza commands in C need to create these files so refactor the Info* objects to allow new, empty objects to be created. Also, add functions needed to initialize each Info* object to a valid state. Contributed by Cynthia Shang.	2019-08-21 15:12:00 -04:00
David Steele	7d97d49f41	Add MostCommonValue object. Calculate the most common value in a list of variants. If there is a tie then the first value passed to mcvUpdate() wins. mcvResult() can be called multiple times because it does not end processing, but there is a cost to calculating the result each time since it is not stored.	2019-08-18 20:46:34 -04:00
David Steele	8aa1e552b0	Add backup type conversion functions. Convert back and forth between the string and enum representations of backup types.	2019-08-18 20:09:44 -04:00
Cynthia Shang	382ed92825	The start/stop commands are implemented entirely in C. The Perl versions remain because they are still being used by the Perl stanza commands. Once the stanza commands are migrated they can be removed. Contributed by Cynthia Shang.	2019-08-09 15:17:18 -04:00
David Steele	3d3003e9ca	The check command is implemented partly in C. Implement switch WAL and archive check in C but leave the rest in Perl for now. The main idea was to have some real integration tests for the new database code so the rest of the migration can wait. Reviewed by Cynthia Shang.	2019-08-01 20:35:01 -04:00
David Steele	e4901d50d5	Add Db object to encapsulate PostgreSQL queries and commands. Migrate functionality from the Perl Db module to C. For now this is just enough to implement the WAL switch check. Add the dbGet() helper function to get Db objects easily. Create macros in harnessPq to make writing pq scripts easier by grouping commonly used functions together. Reviewed by Cynthia Shang.	2019-08-01 15:38:27 -04:00
Cynthia Shang	03b28da1ca	Rename control/control module to control/common. This is more consistent with how other common modules are named. Contributed by Cynthia Shang.	2019-07-31 11:35:58 -04:00
David Steele	f8b0676fd6	Allow modules to be included for testing without requiring coverage. Sometimes it is useful to get at the internals of a module that is not being tested for coverage in order to provide coverage for another module that is being tested. The include directive allows this. Update modules that had previously been added to coverage that only need to be included.	2019-07-25 20:15:06 -04:00
David Steele	415542b4a3	Add PostgreSQL query client. This direct interface to libpq allows simple queries to be run against PostgreSQL and supports timeouts. Testing is performed using a shim that can use scripted responses to test all aspects of the client code. The shim will be very useful for testing backup scenarios on complex topologies. Reviewed by Cynthia Shang.	2019-07-25 14:50:02 -04:00
David Steele	59f135340d	The local command for backup is implemented entirely in C. The local process is now entirely migrated to C. Since all major I/O operations are performed in the local process, the vast majority of I/O is now performed in C. Contributed by David Steele, Cynthia Shang.	2019-07-25 14:34:16 -04:00
David Steele	38ba458616	Add IoSink filter. Discard all data passed to the filter. Useful for calculating size/checksum on a remote system when no data needs to be returned. Update ioReadDrain() to automatically use the IoSink filter.	2019-07-18 08:42:42 -04:00
David Steele	9836578520	Remove perl critic and coverage. No new Perl code is being developed, so these tools are just taking up time and making migrations to newer platforms harder. There are only a few Perl tests remaining with full coverage so the coverage tool does not warn of loss of coverage in most cases. Remove both tools and associated libraries.	2019-07-05 16:55:17 -04:00
David Steele	b9b21315ea	Updates for openssl 1.1.1. Some HTTP error tests were failing after the upgrade to openssl 1.1.1, though the rest of the unit and integration tests worked fine. This seemed to be related to the very small messages used in the error testing, but it pointed to an issue with the code not being fully compliant, made worse by auto-retry being enabled by default. Disable auto-retry and implement better error handling to bring the code in line with openssl recommendations. There's no evidence this is a problem in the field, but having all the tests pass seems like a good idea and the new code is certainly more robust. Coverage will be complete in the next commit when openssl 1.1.1 is introduced.	2019-07-02 22:09:12 -04:00
David Steele	4815752ccc	Add Perl interface to C storage layer. Maintaining the storage layer/drivers in two languages is burdensome. Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C. Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents. The goal is to move the integration tests to C so this interface will eventually be removed. That being the case, the interface was designed for maximum compatibility to ease the transition. The result looks a bit hacky but we'll improve it as needed until it can be retired.	2019-06-26 08:24:58 -04:00
Cynthia Shang	b498188f01	Error on db history mismatch when expiring. Amend commit `434cd832` to error when the db history in archive.info and backup.info do not match. The Perl code would attempt to reconcile the history by matching on system id and version but we are not planning to migrate that code to C. It's possible that there are users with mismatches but if so they should have been getting errors from info for the last six months. It's easy enough to manually fix these files if there are any mismatches in the field. Contributed by Cynthia Shang.	2019-06-24 11:59:44 -04:00
David Steele	039e515a31	Allow protocol compression when read/writing remote files. If the file is compressible (i.e. not encrypted or already compressed) it can be marked as such in storageNewRead()/storageNewWrite(). If the file is being read from/written to a remote it will be compressed in transit using gzip. Simplify filter group handling by having the IoRead/IoWrite objects create the filter group automatically. This removes the need for a lot of NULL checking and has a negligible effect on performance since a filter group needs to be created eventually unless the source file is missing. Allow filters to be created using a VariantList so filter parameters can be passed to the remote.	2019-06-24 10:20:47 -04:00
David Steele	434cd83285	The expire command is implemented entirely in C. This implementation duplicates the functionality of the Perl code but does so with different logic and includes full unit tests. Along the way at least one bug was fixed, see issue #748. Contributed by Cynthia Shang.	2019-06-18 15:19:20 -04:00
David Steele	ceafd8e19d	Migrate page checksum filter to C. This filter exactly mimics the behavior of the Perl filter so is a drop-in replacement. The filter is not integrated yet since it requires the Perl-to-C storage layer interface coming in a future commit.	2019-06-17 07:52:03 -04:00
David Steele	ced42d6511	Add HTTP client cache. This cache manages multiple HTTP clients and returns one to the caller that is not busy. It is the responsibility of the caller to indicate when they are done with a client. If returnContent is set then the client will automatically be marked done. Also add special handling for HEAD requests to recognize that content-length is informational only and no content is expected.	2019-06-11 10:48:22 -04:00
David Steele	20e5b92f36	Add ls command. Allows listing repo paths/files from the command-line, to be used primarily for testing and debugging. This command is internal-only so the interface may change at any time without notice.	2019-05-28 10:03:48 -04:00
David Steele	a474ba54c5	Refactoring path support in the storage module. Not all storage types support paths as a physical thing that must be created/destroyed. Add a feature to determine which drivers use paths and simplify the driver API as much as possible given that knowledge and by implementing as much path logic as possible in the Storage object. Remove the ignoreMissing parameter from pathSync() since it is not used and makes little sense. Create a standard list of error messages for the drivers to use and apply them where the code was modified -- there is plenty of work still to be done here.	2019-05-26 12:41:15 -04:00
David Steele	1bc84c6474	The local command for restore is implemented entirely in C. This is just the part of restore run by the local helper processes, not the entire command. Even so, various optimizations in the code (like pipelining and optimizations for zero-length files) should make the restore command faster on object stores.	2019-05-20 17:07:37 -04:00
Cynthia Shang	a839830333	Add most unimplemented functions to the remote storage driver. Add pathCreate(), pathRemove(), pathSync(), and remove() to the driver. Contributed by Cynthia Shang.	2019-05-20 16:19:14 -04:00
David Steele	2d2bec842a	Improve coverage in perl/exec module.	2019-05-13 13:36:24 -04:00
David Steele	c99c7c458b	Add pathExists() to Storage object. The S3 driver did not get an implementation since S3 has a weak notion of paths, and it is not currently required.	2019-05-09 08:28:58 -04:00
David Steele	cb00030ee3	Remove dead code missed in `1b486847`. This commit removed all Perl references to spool storage but some stuff was left behind.	2019-05-08 18:58:07 -04:00
David Steele	f1eea23121	Add macros for object free functions. Most of the Free() functions are pretty generic so add macros to make creating them as easy as possible. Create a distinction between Free() functions that the caller uses to free memory and callbacks that free third-party resources. There are a number of cases where a driver needs to free resources but does not need a normal *Free() because it is handled by the interface. Add common/object.h for macros that make object maintenance easier. This pattern can also be used for many more object functions.	2019-05-03 18:52:54 -04:00
David Steele	32ca27a20b	Simplify storage object names. Remove "File" and "Driver" from object names so they are shorter and easier to keep consistent. Also remove the "driver" directory so storage implementations are visible directly under "storage".	2019-05-03 15:46:15 -04:00
David Steele	8c712d89eb	Improve type safety of interfaces and drivers. The function pointer casting used when creating drivers made changing interfaces difficult and led to slightly divergent driver implementations. Unit testing caught production-level errors but there were a lot of small issues and the process was harder than it should have been. Use void pointers instead so that no casts are required. Introduce the THIS_VOID and THIS() macros to make dealing with void pointers a little safer. Since we don't want to expose void pointers in header files, driver functions have been removed from the headers and the various driver objects return their interface type. This cuts down on accessor methods and the vast majority of those functions were not being used. Move functions that are still required to .intern.h. Remove the special "C" crypto functions that were used in libc and instead use the standard interface.	2019-05-02 17:52:24 -04:00
David Steele	52b0b81976	Add storageInfoList() to get detailed info about all entries in a path. The function provides all the file/path/link information required to build a backup manifest. Also update storageInfo() to provide the same information for a single file.	2019-04-23 19:33:55 -04:00
David Steele	f492f0571b	Add *Save() functions to most Info objects. At the same time change the way that load constructors work (and are named) so that Ini objects do not persist after the constructors complete. infoArchiveSave() is excluded from this commit since it is just a trivial call to infoPgSave() and won't be required soon.	2019-04-23 17:08:34 -04:00
David Steele	cddb0c05b4	Add iniSave() and iniMove() to Ini object. iniSave() sorts alphabetically to maintain compatibility with the expect tests, but we plan to change this behavior when the migration is complete.	2019-04-23 13:03:22 -04:00
David Steele	81f652137c	Add separate functions to encode/decode each JSON type. In most cases the JSON type is known so this is more efficient than converting to Variant first, both in terms of memory and time. Also rename some of the existing functions for consistency.	2019-04-22 18:41:01 -04:00
David Steele	1adcbc5c91	Add unsigned int Variant type. This is better than using (unsigned int)varUInt64() because bounds checking is performed.	2019-04-19 11:22:43 -04:00
Cynthia Shang	a7281878ac	Migrate backupRegExp() to C. Removed the "anchor" parameter because it was never used in any calls in the Perl code so it was just a dead parameter that always defaulted to true. Contributed by Cynthia Shang.	2019-04-15 08:29:25 -04:00
David Steele	df12cbb162	Fix C code to recognize host:port format like Perl does. This was not an intentional feature in Perl, but it works, so it makes sense to implement the same syntax in C. This is a break from other places where a -port option is explicitly supplied, so it may make sense to support both styles going forward. This commit does not address that, however. Reported by Kyle Nevins.	2019-04-10 17:48:34 -04:00
David Steele	1b48684713	The archive-push command is implemented entirely in C. This new implementation should behave exactly like the old Perl code with the exception of updated log messages. Remove as much of the Perl code as possible without breaking other commands.	2019-03-29 13:26:33 +00:00
David Steele	abba2bd132	Add strLstMergeAnti() for merge anti-joins. We deal with some pretty big lists in archive-push so a nested-loop anti-join looked like it would not be efficient enough. This merge anti-join should do the trick even though both lists must be sorted first.	2019-03-25 20:35:20 +04:00
David Steele	e938a89250	Add WAL info to PostgreSQL interface. This allows the WAL header to be read for any supported version of PostgreSQL.	2019-03-19 19:44:06 +04:00
David Steele	856a369b86	Add file write to the S3 storage driver. Now that repositories are writable the storage drivers that don't yet support file writes need to be updated to do so. Note that the part size for multi-part upload has not been defined as a proper constant. This will become an option in the near future so it doesn't seem worth creating a constant that we might then forget to remove.	2019-03-17 22:00:54 +04:00
David Steele	8ebc6d6c34	Add file write to the remote storage driver. Now that repositories are writable the storage drivers that don't yet support file writes need to be updated to do so.	2019-03-16 21:50:19 +04:00
David Steele	2d386cd266	Move WAL path prefix logic into walPath(). This logic is used by both archive-push and archive-get.	2019-03-16 16:14:10 +04:00
David Steele	b2b2cf0511	Fix issues with remote/local command logging options. Logging was being enabled on local/remote processes even if --log-subprocess was not specified, so fix that. Also, make sure that stderr is enabled at error level as it was on Perl. This helps expose error information for debugging. For remotes, suppress log and lock paths since these are not applicable on remote hosts. These options should be set in the local config if they need to be overridden.	2019-03-16 15:00:02 +04:00
David Steele	982b47c5ec	Add CIFS storage driver. This driver borrows heavily from the Posix driver. At this point the only difference is that CIFS does not allow explicit directory fsyncs so they need to be suppressed. At some point the CIFS diver will also omit link support. With the addition of this driver repository storage is now writable.	2019-03-14 13:28:33 +04:00
David Steele	2ef5ad70a2	Move crypto module to common/crypto. It makes sense for the crypto code to be in common since it is not pgBackRest-specific. Also combine the crypto tests into a single module.	2019-03-10 13:27:30 +02:00
David Steele	95597be81e	Move compress module to common/compress. It makes sense for the compression code to be in common since it is not pgBackRest-specific.	2019-03-10 13:11:20 +02:00
David Steele	2f63babe9d	Move help/help test module to command/help.	2019-03-10 11:55:01 +02:00
David Steele	d441061168	Create test matrix for mock/all to increase coverage and reduce tests. The same test configurations are run on all four test VMs, which seems a real waste of resources. Vary the tests per VM to increase coverage while reducing the total number of tests. Be sure to include each major feature (remote, s3, encryption) in each VM at least once.	2019-03-02 15:01:02 +02:00
David Steele	f7d1d4400f	Create test matrix for mock/expire to increase coverage and reduce tests. The same test configurations are run on all four test VMs, which seems a real waste of resources. Vary the tests per VM to increase coverage while reducing the total number of tests.	2019-03-01 19:04:26 +02:00
David Steele	91622942c2	Create test matrix for mock/archive-stop to increase coverage and reduce tests. The same test configurations are run on all four test VMs, which seems a real waste of resources. Vary the tests per VM to increase coverage while reducing the total number of tests. Be sure to include each major feature (remote, s3, encryption) in each VM at least once.	2019-03-01 17:12:41 +02:00
David Steele	db4b447be8	The archive-get command is implemented entirely in C. This new implementation should behave exactly like the old Perl code with the exception of a few updated log messages. Remove as much of the Perl code as possible without breaking other commands.	2019-02-27 23:03:02 +02:00
David Steele	9367cc461c	Migrate local command to C. The C local is only used for C commands in the main process. Some tweaking of the existing protocolGet() command was required. Originally the idea was to share the function for local and remote requests but the differences (as in Perl) were too great to make that practical.	2019-02-27 22:34:21 +02:00
David Steele	35abd4cd95	Add ProtocolParallel* objects for parallelizing commands. Allows commands to be easily parallelized if the jobs are broken up into discrete, non-overlapping chunks.	2019-02-27 21:10:52 +02:00
David Steele	35acfae7c2	Add ProtocolCommand object. This formalizes the creation of protocol commands, which was previously done by creating KeyValue objects manually.	2019-02-27 19:48:30 +02:00
David Steele	3a05359087	Create test matrix for mock/stanza to increase coverage and reduce tests. The same test configurations are run on all four test VMs, which seems a real waste of resources. Vary the tests per VM to increase coverage while reducing the total number of tests. Be sure to include each major feature (remote, s3, encryption) in each VM at least once.	2019-02-24 07:42:41 +02:00
David Steele	2f081f3ec7	Rename test modules for consistency. The conventions for command and info tests have shifted in the C modules, though not even all the C modules got the message.	2019-02-23 18:51:52 +02:00
David Steele	d489eb87f7	Create test matrix for mock/archive to increase coverage and reduce tests. The same test configurations are run on all four test VMs, which seems a real waste of resources. Vary the tests per VM to increase coverage while reducing the total number of tests. Be sure to include each major feature (remote, s3, encryption) in each VM at least once.	2019-02-23 15:59:39 +02:00
David Steele	ae86e6d5b2	Add missing ToLog() coverage to String, List, and PgControl. Missing coverage is exposed in the next commit which disables test tracing by default.	2019-02-22 11:31:37 +02:00
David Steele	6866ff031a	Add exists() to remote storage.	2019-02-20 22:43:02 +02:00
David Steele	da628be8a8	Migrate remote command to C. Prior to this the Perl remote was used to satisfy C requests. This worked fine but since the remote needed to be migrated to C anyway there was no reason to wait. Add the ProtocolServer object and tweak ProtocolClient to work with it. It was also necessary to add a mechanism to get option values from the remote so that encryption settings could be read and used in the storage object. Update the remote storage objects to comply with the protocol changes and add the storage protocol handler. Ideally this commit would have been broken up into smaller chunks but there are cross-dependencies in the protocol layer and it didn't seem worth the extra effort.	2019-02-19 20:57:38 +02:00
David Steele	db08656537	Rename FUNCTION_DEBUG_* and consolidate ASSERT_* macros for consistency. Rename FUNCTION_DEBUG_* macros to FUNCTION_LOG_* to more accurately reflect what they do. Further rename FUNCTION_DEBUG_RESULT* macros to FUNCTION_LOG_RETURN* to make it clearer that they return from the function as well as logging. Leave FUNCTION_TEST_* macros as they are. Consolidate the various ASSERT* macros into a single ASSERT macro that is always compiled out of production builds. It was difficult to figure out when an assert would be checked with all the different types in play. When ASSERTs are compiled in they will always be checked regardless of the log level -- tying these two concepts together was not a good idea.	2019-01-21 17:41:59 +02:00
David Steele	d245f8eb42	The info command is implemented entirely in C. The C info code has already been committed but this commit wires it into main. Also remove the info Perl code and tests since they are no longer called.	2019-01-21 13:51:45 +02:00
David Steele	7355248d6b	Add remote storage objects. This is a partial implementation of remote storage with just enough functionality to get the info command working. The client is written in C but the server is still in Perl, which limits progress until a C server is written.	2019-01-18 22:04:37 +02:00
David Steele	88201f37a3	Add ProtocolClient object and helper functions. This is a complete protocol client implementation in C. Currently there is no C server implementation so the C client is talking to a Perl server. This won't work very long, though, as the protocol format, even though in JSON, has a lot of language-specific structure. While it would be possible to maintain compatibility between C and Perl it's probably not worth the effort in the long run. Just as in Perl there are helper functions to make constructing protocol objects easier. Currently only repository remotes are supported.	2019-01-18 21:32:51 +02:00
David Steele	9cac403f61	Add Exec object. Executes a child process and allows the calling process to communicate with it using read/write io. This object is specially tailored to implement the protocol layer and may or may not be applicable to general-purpose execution.	2019-01-18 11:45:40 +02:00
David Steele	06d41b4dc0	Add cfgExecParam() to generate parameters for executing commands. Parameters for the local/remote commands are based on parameters that are passed to the current command. Generate parameters for the new command based on the intersection of parameters between the current command and the command to be executed.	2019-01-17 22:29:19 +02:00
David Steele	ecd56105e6	Add IoHandleRead and IoHandleWrite objects. General i/o objects for reading and writing file descriptors, in particular those that can block. In other words, these are not generally to be used with file descriptors for actual files, but rather pipes, sockets, etc.	2019-01-17 22:08:31 +02:00
David Steele	1de22cac2b	Rename common/io/handle module to common/io/handleWrite. ioHandleWriteOneStr() will become a helper function for the IoHandleWrite object.	2019-01-06 14:37:39 +02:00
David Steele	26c888873e	Merge common/typeVariantListTest module into common/typeVariantTest. These modules are closely related so it makes sense for them to be merged.	2019-01-01 18:14:43 +02:00
David Steele	07b9176f25	Merge common/typeStringListTest module into common/typeStringTest. These modules are closely related so it makes sense for them to be merged.	2019-01-01 18:05:13 +02:00
Cynthia Shang	205525b607	Migrate local info command to C. The info command will only be executed in C if the repository is local, i.e. not located on a remote repository host. S3 is considered "local" in this case. This is a direct migration from Perl to integrate as seamlessly with the remaining Perl code as possible. It should not be possible to determine if the C version is running unless debug-level logging is enabled. Contributed by Cynthia Shang.	2018-12-13 16:22:34 -05:00
Cynthia Shang	e6ef40e8a3	Add infoBackup object to encapsulate the backup.info file. The infoBackup object is the counterpart to the infoArchive object which encapsulates the archive.info file. Currently the object is read-only, i.e. it is not possible to create a new or modify an existing backup.info file. There are a number of constants that will also be used in the infoManifest object so go ahead and create a module to contain them so they don't need to be moved later. Contributed by Cynthia Shang.	2018-12-13 15:46:18 -05:00
Cynthia Shang	2f15a90d18	Add infoArchiveIdHistoryMatch() to the InfoArchive object. Match a PostgreSQL system identifier and version to a pgBackRest archive id. Contributed by Cynthia Shang.	2018-12-10 18:45:57 -05:00
Cynthia Shang	80a3e21521	Add strSizeFormat() to String object. Converts sizes in bytes to a more human-readable form, e.g. 1KB, 1.1GB. Contributed by Cynthia Shang.	2018-12-10 16:11:51 -05:00
David Steele	e73416e9e3	Change file ownership only when required. Previously chown() would be called even when no ownership changes were required. In most cases changes are not required and it seems better to perform an extra stat() rather than an extra chown(). Also add unit tests for owner() since there weren't any.	2018-12-05 17:56:47 -05:00
David Steele	3e254f4cff	Add IoFilter interface to CipherBlock object. This allows CipherBlock to be used as a filter in an IoFilterGroup. The C-style functions used by Perl are now deprecated and should not be used for any new code. Also add functions to convert between cipher names and CipherType.	2018-11-28 12:42:36 -05:00
Cynthia Shang	f4a1751abc	Improve JSON to Variant conversion and add Variant to JSON conversion. Add boolean and one-dimensional list types to jsonToKv(). Add varToJson() and kvToJson() to convert Variants and KeyValues to JSON. Contributed by Cynthia Shang.	2018-11-23 16:02:33 -05:00
David Steele	256b727a3d	Add S3 storage driver. Only the storageNewRead() and storageList() functions are currently implemented, but this is enough to enable S3 for the archive-get command.	2018-11-21 19:32:49 -05:00
David Steele	72252ed2a1	Add HttpClient object. A robust HTTP client with pipelining support and automatic retries. Using a single object to make multiple requests is more efficient because requests are pipelined whenever possible. Requests are automatically retried when the connection has been closed by the server. Any 5xx response is also retried. Only the HTTPS protocol is currently supported.	2018-11-21 19:11:45 -05:00
David Steele	1dd06a6e46	Add TlsClient object. A simple, secure TLS client intended to allow access to services that are exposed via HTTPS. We call it TLS instead of SSL because SSL methods are disabled so only TLS connections are allowed. This object is intended to be used for multiple TLS connections against a service so tlsClientOpen() can be called each time a new connection is needed. By default, an open connection will be reused for pipelining so the user must be prepared to retry their transaction on a read/write error if the server closes the connection before it can be reused. If this behavior is not desirable then tlsClientClose() may be used to ensure that the next call to tlsClientOpen() will create a new TLS session. Note that tlsClientRead() is non-blocking unless there are zero bytes to be read from the session in which case it will raise an error after the defined timeout. In any case the tlsClientRead()/tlsClientWrite()/tlsClientEof() functions should not generally be called directly. Instead use the read/write interfaces available from tlsClientIoRead()/tlsClientIoWrite().	2018-11-21 18:43:25 -05:00
David Steele	bc25db5667	Add interface objects for libxml2. Add XmlDocument, XmlNode, and XmlNodeList objects as a thin interface layer on libxml2. This interface is not intended to be comprehensive. Only a few libxml2 capabilities are exposed but more can be added as needed.	2018-11-20 20:40:11 -05:00
David Steele	8f857a975e	Add constant macros to String object. There are many places (and the number is growing) where a zero-terminated string constant must be transformed into a String object to be usable. This pattern wastes time and memory, especially since the created string is generally used in a read-only fashion. Define macros to create constant String objects that are initialized at compile time rather than at run time.	2018-11-10 09:37:12 -05:00
David Steele	df200bee2a	Add regExpPrefix() to aid in static prefix searches. The storageList() command accepts a regular expression as a filter. This works fine for local filesystems where it is relatively cheap to get a complete list of files and filter them in code. However, for remote filesystems like S3 it can be expensive to fetch a complete list of files only to discard the bulk of them locally. S3 does not filter on regular expressions but it can accept a static prefix so this function extracts a prefix from a regular expression when possible. Even a few characters can drastically reduce the amount of data that must be fetched remotely so the function does not try to be too clever. It requires a ^ anchor and stops scanning when the first special character is found.	2018-11-09 16:50:22 -05:00
David Steele	48d2795f31	Merge crypto/random module into crypto/crypto. There wasn't enough code to justify a separate module/test and it seems to fit just fine in crypto/crypto.	2018-11-06 20:04:16 -05:00
David Steele	2cb312ef5a	Add cryptoError() and update crypto code to use it. This adds detail to error messages when available and improves code coverage.	2018-11-06 19:16:00 -05:00
David Steele	1f8931f732	Improve single test run performance. Improve on `7794ab50` by including the build flag files directly into the Makefile as dependencies (even though they are not includes). This simplifies some of the rsync logic and allows make to do what it does best. Also split build flag files into test, harness, and build to reduce rebuilds. Test flags are used to build test.c, harness flags are used to build the rest of the files in the test harness, and build flags are used for the files that are not directly involved in testing.	2018-11-03 16:34:04 -04:00
Cynthia Shang	34c63276cd	Automatically enable backup checksum delta when anomalies (e.g. timeline switch) are detected. There are a number of cases where a checksum delta is more appropriate than the default time-based delta: timeline has switched since the prior backup; file timestamp is older than recorded in the prior backup; file size changed but timestamp did not; file timestamp is in the future compared to the start of the backup; online option has changed since the prior backup. A practical example is that checksum delta will be enabled after a failover to standby due to the timeline switch. In this case, timestamps can't be trusted and our recommendation has been to run a full backup, which can impact the retention schedule and requires manual intervention. Now, a checksum delta will be performed if the backup type is incr/diff. This means more CPU will be used during the backup but the backup size will be smaller and the retention schedule will not be impacted. Contributed by Cynthia Shang.	2018-11-01 11:31:25 -04:00
David Steele	03b9db9aa2	Fix error after log file open failure when processing should continue. The C code was warning on failure and continuing but the Perl logging code was never updated with the same feature. Rather than add the feature to Perl, just disable file logging if the log file cannot be opened. Log files are always opened by C first, so this will eliminate the error in Perl. Reported by vthriller.	2018-10-25 14:58:25 +01:00
David Steele	db8dce7adc	Disable flapping archive/get unit on CentOS 6. This test has been flapping since `9b9396c7`. It seems to be some kind of timing issue since all integration tests pass and this unit passes on all other VMs. It only happens on Travis and is not reproducible in any development environment that we have tried. For now, disable the test since the constant flapping is causing major delays in testing and quite a bit of time has been spent trying to identify the root cause. We are actively developing these tests and hope the issue will be identified during the course of normal development. A number of improvements were made to the tests while searching for this issue. While none of them helped, it makes sense to keep the improvements.	2018-10-02 17:54:43 +01:00
David Steele	e66e68e324	Add cryptoHmacOne() for HMAC support. There doesn't seem to be any need to implement this as a filter since current use cases (S3 authentication) work on small datasets. So, use the single function method provided by OpenSSL for simplicity.	2018-09-27 09:20:47 +01:00
David Steele	bcca625062	Add bufHex() to Buffer object. A general-purpose function for converting buffers to hex strings.	2018-09-26 22:33:48 +01:00
David Steele	d038b9a029	Support configurable WAL segment size. PostgreSQL 11 introduces configurable WAL segment sizes, from 1MB to 1GB. There are two areas that needed to be updated to support this: building the archive-get queue and checking that WAL has been archived after a backup. Both operations require the WAL segment size to properly build a list. Checking the archive after a backup is still implemented in Perl and has an active database connection, so just get the WAL segment size from the database. The archive-get command does not have a connection to the database, so get the WAL segment size from pg_control instead. This requires a deeper inspection of pg_control than has been done in the past, so it seemed best to copy the relevant data structures from each version of PostgreSQL and build a generic interface layer to address them. While this approach is a bit verbose, it has the advantage of being relatively simple, and can easily be updated for new versions of PostgreSQL. Since the integration tests generate pg_control files for testing, teach Perl how to generate files with the correct offsets for both 32-bit and 64-bit architectures.	2018-09-25 10:24:42 +01:00
Cynthia Shang	880fbb5e57	Add checksum delta for incremental backups. Use checksums rather than timestamps to determine if files have changed. This is useful in cases where the timestamps may not be trustworthy, e.g. when performing an incremental after failing over to a standby. If checksum delta is enabled then checksums will be used for verification of resumed backups, even if they are full. Resumes have always used checksums to verify the files in the repository; enabling delta performs checksums on the database files as well. Note that the user must manually enable this feature in cases where it would be useful or just keep it enabled all the time. A future commit will address automatically enabling the feature in cases where it seems likely to be useful. Contributed by Cynthia Shang.	2018-09-19 11:12:45 -04:00
David Steele	03003562d8	Merge all posix storage tests into a single unit. As we add storage drivers it's important to keep the tests for each completely separate. Rather than have three tests for each driver, standardize on having a single test unit for each driver.	2018-09-17 11:45:41 -04:00
David Steele	8852622fa2	Fix missing test caused by a misplaced YAML tag.	2018-09-16 15:53:19 -04:00
David Steele	84ab787b1a	Merge protocol storage helper into storage helper. These are separated the same way in the Perl code where the remote storage driver is located in the Protocol module. However, in the C code the intention is to implement the remote storage driver as a regular driver in the storage layer rather than making a special case out of it. So, merge the storage helpers. This also has the benefit of making the code a bit simpler. Also separate storageSpool() and storageSpoolWrite() to make it clearer which operations require write access and to maintain consistency with the other storage helper functions.	2018-09-16 14:12:53 -04:00
David Steele	fd14ceb399	Rename posix driver files/functions for consistency. The posix driver was developed over time and the naming is not very consistent. Rename the files and functions to work well with other drivers and generally favor longer names since the driver functions are seldom (eventually never) used outside the driver itself.	2018-09-13 18:58:22 -04:00
David Steele	5aa458ffae	Simplify debug logging by allowing log functions to return String objects. Previously, debug log functions had to handle NULLs and truncate output to the available buffer size. This was verbose for both coding and testing. Instead, create a function/macro combination that allows log functions to return a simple String object. The wrapper function takes care of the memory context, handles NULLs, and truncates the log string based on the available buffer size.	2018-09-11 18:32:56 -04:00
David Steele	9b9396c7b7	Migrate local, unencrypted, non-S3 archive-get command to C. The archive-get command will only be executed in C if the repository is local, unencrypted, and type posix or cifs. Admittedly a limited use case, but this is just the first step in migrating the archive-get command entirely into C. This is a direct migration from the Perl code (including messages) to integrate as seamlessly with the remaining Perl code as possible. It should not be possible to determine if the C version is running unless debug-level logging is enabled.	2018-09-11 15:42:31 -04:00
David Steele	6e9b6fdca9	Migrate control functions to detect stop files to C from Perl. Basic functions to detect the presence of stanza or all stop files and error when they are present. The functionality to detect stop files without error was not migrated. This functionality is only used by stanza-delete and will be migrated with that command.	2018-09-07 08:03:05 -07:00
David Steele	5bdaa35fa5	Migrate walIsPartial(), walIsSegment(), and walSegmentFind() from Perl to C. Also refactor regular expression defines to make them more reusable.	2018-09-07 08:00:18 -07:00
David Steele	9660076093	Add helper for repository storage. Implement rules for generating paths within the archive part of the repository. Add a helper function, storageRepo(), to create the repository storage based on configuration settings. The repository storage helper is located in the protocol module because it will support remote file systems in the future, just as the Perl version does. Also, improve the existing helper functions a bit using string functions that were not available when they were written.	2018-09-07 07:58:08 -07:00
David Steele	960ad73298	Info objects now parse JSON and use specified storage. Use JSON code now that it is available and remove temporary hacks used to get things working initially. Use passed storage objects rather than using storageLocal(). All storage objects in C are still local but this won't always be the case. Also, move Postgres version conversion functions to postgres/info.c since they have no dependency on the info objects and will likely be useful elsewhere.	2018-09-06 10:12:14 -07:00
David Steele	cb4b715533	Add strReplaceChr() to String object.	2018-08-14 16:49:38 -04:00
David Steele	4a176681c3	Add cvtCharToZ() and macro for debugging char params.	2018-08-14 16:18:17 -04:00
David Steele	6643afe9a8	Add gzip compression/decompression filters for C.	2018-08-14 14:56:59 -04:00
David Steele	e3ff6b209d	Filters can now produce output that differs from input. This allows filters such as compression, encryption, etc. to be implemented.	2018-08-14 14:21:53 -04:00
Cynthia Shang	8ab2e72960	Migrate minimum set of code for reading archive.info files from Perl to C. Contributed by Cynthia Shang.	2018-08-09 08:57:21 -04:00
David Steele	7993f1a966	Add basic C JSON parser.	2018-08-09 08:06:23 -04:00
David Steele	01aea0c067	Implement filters that do not modify the buffer. Update cryptoHash to use the new interface.	2018-07-24 21:08:27 -04:00
David Steele	58e9f1e50c	Refactor the common/log tests to not depend on common/harnessLog. common/harnessLog was not ideally suited for general testing and made all the tests quite awkward. Instead, move all code used to test the common/log module into the logTest module and repurpose common/harnessLog to do log expect testing for all other tests in a cleaner way. Add a few exceptions for config testing since the log levels are reset by default in config/parse.	2018-07-20 18:51:42 -04:00
David Steele	0ac176b722	Abstract IO layer out of the storage layer. This allows the routines to be used for IO objects that do not have a storage representation. Implement buffer read and write IO objects.	2018-07-19 16:04:20 -04:00
Cynthia Shang	0e6b927a17	Add uint64 variant type and supporting conversion functions. Contributed by Cynthia Shang. Reviewed by Stephen Frost.	2018-07-12 15:23:18 -04:00
David Steele	350b30fa49	Move cryptographic hash functions to C using OpenSSL.	2018-06-11 14:52:26 -04:00
David Steele	064ec757e9	Rename cipher module to the more general crypto.	2018-06-11 10:53:16 -04:00

273 Commits