pgbackrest

mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00

Author	SHA1	Message	Date
Cynthia Shang	f96c54c4ba	Add info command set option for detailed text output. The additional details include databases that can be used for selective restore and a list of tablespaces and symlinks with their default destinations. This information is not included in the JSON output because it requires reading the manifest which is too IO intensive to do for all manifests. We plan to include this information for JSON in a future release.	2019-09-30 12:39:38 -04:00
David Steele	451ae397be	The restore command is implemented entirely in C. For the most part this is a direct migration of the Perl code into C. There is one important behavioral change with regard to how file permissions are handled. The Perl code tried to set ownership as it was in the manifest even when running as an unprivileged user. This usually just led to errors and frustration. The C code works like this: If a restore is run as a non-root user (the typical scenario) then all files restored will belong to the user/group executing pgBackRest. If existing files are not owned by the executing user/group then an error will result if the ownership cannot be updated to the executing user/group. In that case the file ownership will need to be updated by a privileged user before the restore can be retried. If a restore is run as the root user then pgBackRest will attempt to recreate the ownership recorded in the manifest when the backup was made. Only user/group names are stored in the manifest so the same names must exist on the restore host for this to work. If the user/group name cannot be found locally then the user/group of the PostgreSQL data directory will be used and finally root if the data directory user/group cannot be mapped to a name. Reviewed by Cynthia Shang.	2019-09-26 07:52:02 -04:00
David Steele	c969137021	Migrate backup manifest load/save to C. The backup manifest stores a complete list of all files, links, and paths in a backup along with metadata such as checksums, sizes, timestamps, etc. A list of databases is also included for selective restore. The purpose of the manifest is to allow the restore command to confidently reconstruct the PostgreSQL data directory and ensure that nothing is missing or corrupt. It is also useful for reporting, e.g. size of backup, backup time, etc. For now, migrate enough functionality to implement the restore command. Reviewed by Cynthia Shang.	2019-09-23 13:50:46 -04:00
David Steele	174cb7b3af	Add strPathAbsolute() and strLstRemoveIdx(). strPathAbsolute() generates an absolute path from an absolute base path and an absolute/relative path. strLstRemoveIdx() is a support function based on lstRemoveIdx().	2019-09-19 22:42:28 -04:00
David Steele	15d04ca19c	Add recursion and JSON output to the ls command. These features finally make the ls command practical. Currently the JSON contains only name, type, and size. We may add more fields in the future, but these seem like the minimum needed to be useful.	2019-09-12 16:29:50 -04:00
David Steele	e45baa1830	Add sorting, filters, and recursion to storageInfoList(). These are needed for the ls command and are also useful for testing.	2019-09-12 16:03:05 -04:00
David Steele	506c10f7f2	Sort and find improvements to List and StringList objects. Push the responsibility for sort and find down to the List object by introducing a general comparator function that can be used for both sorting and finding. Update insert and add functions to return the item added rather than the list. This is more useful in the core code, though numerous updates to the tests were required.	2019-09-12 12:04:25 -04:00
David Steele	f4f21d0df7	Add groupIdFromName() and userIdFromName() to user module. Update StorageWritePosix to use the new functions. A side effect is that storageWritePosixOpen() will no longer error when the user/group name does not exist. It will simply retain the original user/group, i.e. the user that executed the restore. In general this is a feature since completing a restore is more important than setting permissions exactly from the source host. However, some notification of this omission to the user would be beneficial.	2019-09-10 13:02:05 -04:00
David Steele	1049632873	Add user module for managing system users/groups. Centralize the management of users and groups. Also update Posix storage driver where users/groups were already in use.	2019-09-08 20:11:51 -04:00
David Steele	4d84820021	Improve performance of info file load/save. Info files required three copies in memory to be loaded (the original string, an ini representation, and the final info object). Not only was this memory inefficient but the Ini object does sequential scans when searching for keys making large files very slow to load. This has not been an issue since archive.info and backup.info are very small, but it becomes a big deal when loading manifests with hundreds of thousands of files. Instead of holding copies of the data in memory, use a callback to deliver the ini data directly to the object when loading. Use a similar method for save to avoid having an intermediate copy. Save is a bit complex because sections/keys must be written in alpha order or older versions of pgBackRest will not calculate the correct checksum. Also move the load retry logic to helper functions rather than embedding it in the Info object. This allows for more flexibility in loading and ensures that stack traces will be available when developing unit tests. Reviewed by Cynthia Shang.	2019-09-06 13:48:28 -04:00
David Steele	7334f30c35	Add helper function for adding CipherBlock filters to groups. Reviewed by Cynthia Shang.	2019-09-06 13:35:28 -04:00
David Steele	5c314df098	Rename infoManifest module to manifest. The manifest is not an info file so if anything it should be called backupManifest. But that seems too long for such a commonly used object so manifest seems better. Note that unlike Perl there is no storage manifest method so this stands as the only manifest in the C code, as befits its importance.	2019-09-05 19:53:00 -04:00
David Steele	7ade3fc1c3	Move constants from the infoManifest module to the infoBackup module. These constants should be kept separate because the implementation of any info file might change in the future and only the interface should be expected to remain consistent. In any case, infoBackup requires Variant constants while infoManifest uses String constants so they are not shareable. Modern compilers should combine the underlying const char * constants.	2019-09-02 21:09:43 -04:00
Cynthia Shang	c733319063	The stanza-create/update/delete commands are implemented entirely in C. Contributed by Cynthia Shang.	2019-08-21 16:26:28 -04:00
Cynthia Shang	53f27da3a6	Add checkDbConfig() to compare pgBackRest/PostgreSQL configs. Checking the PostgreSQL-reported path and version against the pgBackRest configuration helps ensure that pgBackRest is operating against the correct cluster. In Perl this functionality was in the Db object, but check seems like a better place for it in C. Contributed by Cynthia Shang.	2019-08-21 15:41:52 -04:00
Cynthia Shang	fa640f22ad	Allow Info* objects to be created from scratch in C. Previously, info files (e.g. archive.info, backup.info) were created in Perl and only loaded in C. The upcoming stanza commands in C need to create these files so refactor the Info* objects to allow new, empty objects to be created. Also, add functions needed to initialize each Info* object to a valid state. Contributed by Cynthia Shang.	2019-08-21 15:12:00 -04:00
Cynthia Shang	6a09d9294d	Require storage when calling pgControlFromFile(). Previously storageLocal() was being used internally but loading pg_control from remote storage is often required. Also, storagePg() is more appropriate than storageLocal() for all current usage. Contributed by Cynthia Shang.	2019-08-21 11:29:30 -04:00
David Steele	7d97d49f41	Add MostCommonValue object. Calculate the most common value in a list of variants. If there is a tie then the first value passed to mcvUpdate() wins. mcvResult() can be called multiple times because it does not end processing, but there is a cost to calculating the result each time since it is not stored.	2019-08-18 20:46:34 -04:00
Cynthia Shang	382ed92825	The start/stop commands are implemented entirely in C. The Perl versions remain because they are still being used by the Perl stanza commands. Once the stanza commands are migrated they can be removed. Contributed by Cynthia Shang.	2019-08-09 15:17:18 -04:00
David Steele	3d3003e9ca	The check command is implemented partly in C. Implement switch WAL and archive check in C but leave the rest in Perl for now. The main idea was to have some real integration tests for the new database code so the rest of the migration can wait. Reviewed by Cynthia Shang.	2019-08-01 20:35:01 -04:00
David Steele	e4901d50d5	Add Db object to encapsulate PostgreSQL queries and commands. Migrate functionality from the Perl Db module to C. For now this is just enough to implement the WAL switch check. Add the dbGet() helper function to get Db objects easily. Create macros in harnessPq to make writing pq scripts easier by grouping commonly used functions together. Reviewed by Cynthia Shang.	2019-08-01 15:38:27 -04:00
Cynthia Shang	03b28da1ca	Rename control/control module to control/common. This is more consistent with how other common modules are named. Contributed by Cynthia Shang.	2019-07-31 11:35:58 -04:00
David Steele	d8ca0e5c5b	Add Perl interface to C PgQuery object. This validates that all current queries work with the new interface and removes the dependency on DBD::Pg.	2019-07-25 17:05:39 -04:00
David Steele	415542b4a3	Add PostgreSQL query client. This direct interface to libpq allows simple queries to be run against PostgreSQL and supports timeouts. Testing is performed using a shim that can use scripted responses to test all aspects of the client code. The shim will be very useful for testing backup scenarios on complex topologies. Reviewed by Cynthia Shang.	2019-07-25 14:50:02 -04:00
David Steele	59f135340d	The local command for backup is implemented entirely in C. The local process is now entirely migrated to C. Since all major I/O operations are performed in the local process, the vast majority of I/O is now performed in C. Contributed by David Steele, Cynthia Shang.	2019-07-25 14:34:16 -04:00
David Steele	38ba458616	Add IoSink filter. Discard all data passed to the filter. Useful for calculating size/checksum on a remote system when no data needs to be returned. Update ioReadDrain() to automatically use the IoSink filter.	2019-07-18 08:42:42 -04:00
David Steele	eee67db4d6	Allow pg storage to be remote. None of the currently migrated commands needed remote pg storage but now backup, check, stanza-* will need it.	2019-07-17 14:09:50 -04:00
David Steele	3e1062825d	Allow multiple filters to be pushed to the remote and return results. Previously only a single filter could be pushed to the remote since order was not being maintained. Now the filters are strictly ordered. Results are returned from the remote and set in the local IoFilterGroup so they can be retrieved. Expand remote filter support to include all filters.	2019-07-15 16:49:46 -04:00
David Steele	4bffa0c5bb	Add test function to create the S3 bucket instead of using aws cli. Eventually the idea is to remove the dependency on aws cli since Python is a big install.	2019-06-26 15:02:30 -04:00
David Steele	4815752ccc	Add Perl interface to C storage layer. Maintaining the storage layer/drivers in two languages is burdensome. Since the integration tests require the Perl storage layer/drivers we'll need them even after the core code is migrated to C. Create an interface layer so the Perl code can be removed and new storage drivers/features introduced without adding Perl equivalents. The goal is to move the integration tests to C so this interface will eventually be removed. That being the case, the interface was designed for maximum compatibility to ease the transition. The result looks a bit hacky but we'll improve it as needed until it can be retired.	2019-06-26 08:24:58 -04:00
David Steele	c22e10e4a9	Honor configure --prefix option. The --prefix option was entirely ignored and DESTDIR was a combination of DESTDIR and bindir. Bring both in line with recommendations for autoconf and make as specified in https://www.gnu.org/software/make/manual/html_node/Directory-Variables.html and https://www.gnu.org/prep/standards/html_node/DESTDIR.html. Suggested by Daniel Westermann.	2019-06-24 15:42:33 -04:00
David Steele	039e515a31	Allow protocol compression when reading/writing remote files. If the file is compressible (i.e. not encrypted or already compressed) it can be marked as such in storageNewRead()/storageNewWrite(). If the file is being read from/written to a remote it will be compressed in transit using gzip. Simplify filter group handling by having the IoRead/IoWrite objects create the filter group automatically. This removes the need for a lot of NULL checking and has a negligible effect on performance since a filter group needs to be created eventually unless the source file is missing. Allow filters to be created using a VariantList so filter parameters can be passed to the remote.	2019-06-24 10:20:47 -04:00
David Steele	434cd83285	The expire command is implemented entirely in C. This implementation duplicates the functionality of the Perl code but does so with different logic and includes full unit tests. Along the way at least one bug was fixed, see issue #748. Contributed by Cynthia Shang.	2019-06-18 15:19:20 -04:00
David Steele	ceafd8e19d	Migrate page checksum filter to C. This filter exactly mimics the behavior of the Perl filter so is a drop-in replacement. The filter is not integrated yet since it requires the Perl-to-C storage layer interface coming in a future commit.	2019-06-17 07:52:03 -04:00
Cynthia Shang	c64c9c0590	Add backup management functions to InfoBackup. Allow current backups to be listed and deleted. Also expose some constants required by expire and stanza-* commands. Contributed by Cynthia Shang.	2019-06-17 06:59:06 -04:00
David Steele	fdd375b63d	Integrate S3 storage driver with HTTP client cache. This allows copying from one S3 object to another. We generally try to avoid doing this but there are a few cases where it is needed and the tests do it quite a bit. One thing to look out for here is that reads require the HTTP client to be explicitly released by calling httpClientDone(). This means that the number of clients could grow if they are not released properly. The HTTP statistics will hopefully alert us if this is happening.	2019-06-11 16:26:32 -04:00
David Steele	ced42d6511	Add HTTP client cache. This cache manages multiple HTTP clients and returns one to the caller that is not busy. It is the responsibility of the caller to indicate when they are done with a client. If returnContent is set then the client will automatically be marked done. Also add special handling for HEAD requests to recognize that content-length is informational only and no content is expected.	2019-06-11 10:48:22 -04:00
David Steele	12bca3c43e	Add CPPFLAGS to compile rules. This should silence the last of the Debian package warnings.	2019-06-01 09:28:31 -04:00
David Steele	20e5b92f36	Add ls command. Allows listing repo paths/files from the command-line, to be used primarily for testing and debugging. This command is internal-only so the interface may change at any time without notice.	2019-05-28 10:03:48 -04:00
David Steele	3b3327eae6	Move TLS/HTTP statistics output to command/command. This module already has the filtering required to keep these messages from being displayed by default for commands that output to stdout (e.g. info).	2019-05-28 09:50:59 -04:00
David Steele	a474ba54c5	Refactor path support in the storage module. Not all storage types support paths as a physical thing that must be created/destroyed. Add a feature to determine which drivers use paths and simplify the driver API as much as possible given that knowledge and by implementing as much path logic as possible in the Storage object. Remove the ignoreMissing parameter from pathSync() since it is not used and makes little sense. Create a standard list of error messages for the drivers to use and apply them where the code was modified -- there is plenty of work still to be done here.	2019-05-26 12:41:15 -04:00
David Steele	38f28bd520	Log TLS and HTTP statistics on exit. These stats measure how efficiently TLS and HTTP are reusing connections (i.e. pipelining).	2019-05-26 12:32:49 -04:00
David Steele	1bc84c6474	The local command for restore is implemented entirely in C. This is just the part of restore run by the local helper processes, not the entire command. Even so, various optimizations in the code (like pipelining and optimizations for zero-length files) should make the restore command faster on object stores.	2019-05-20 17:07:37 -04:00
David Steele	f1eea23121	Add macros for object free functions. Most of the Free() functions are pretty generic so add macros to make creating them as easy as possible. Create a distinction between Free() functions that the caller uses to free memory and callbacks that free third-party resources. There are a number of cases where a driver needs to free resources but does not need a normal *Free() because it is handled by the interface. Add common/object.h for macros that make object maintenance easier. This pattern can also be used for many more object functions.	2019-05-03 18:52:54 -04:00
David Steele	4a20d44c6b	Add common/macro.h for general-purpose macros. Add GLUE() macro which is useful for creating identifiers. Move MACRO_TO_STR() here and rename it STRINGIFY(). This appears to be the standard name for this type of macro and it is also an awesome name.	2019-05-03 17:49:57 -04:00
David Steele	32ca27a20b	Simplify storage object names. Remove "File" and "Driver" from object names so they are shorter and easier to keep consistent. Also remove the "driver" directory so storage implementations are visible directly under "storage".	2019-05-03 15:46:15 -04:00
David Steele	8c712d89eb	Improve type safety of interfaces and drivers. The function pointer casting used when creating drivers made changing interfaces difficult and led to slightly divergent driver implementations. Unit testing caught production-level errors but there were a lot of small issues and the process was harder than it should have been. Use void pointers instead so that no casts are required. Introduce the THIS_VOID and THIS() macros to make dealing with void pointers a little safer. Since we don't want to expose void pointers in header files, driver functions have been removed from the headers and the various driver objects return their interface type. This cuts down on accessor methods and the vast majority of those functions were not being used. Move functions that are still required to .intern.h. Remove the special "C" crypto functions that were used in libc and instead use the standard interface.	2019-05-02 17:52:24 -04:00
David Steele	027c263871	Add configure script for improved multi-platform support. Use autoconf to provide a basic configure script. WITH_BACKTRACE is yet to be migrated to configure and the unit tests still use a custom Makefile. Each C file must include "build.auto.h" before all other includes and defines. This is enforced by test.pl for includes, but it won't detect incorrect define ordering. Update packages to call configure and use standard flags to pass options.	2019-04-26 08:08:23 -04:00
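
The ownership rules described in the restore commit (451ae397be) can be sketched roughly as below. This is only an illustration of the decision order stated in that message, not the pgBackRest code: restoreOwnerSelect() and its parameters are hypothetical names, only user ids are handled (groups would follow the same pattern), and the error raised when existing files cannot be re-owned is left out.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

// Hypothetical helper (not part of pgBackRest): pick the uid that a restored
// file should be owned by.
//   execUid      - uid of the user executing the restore
//   manifestUser - user name recorded in the manifest, or NULL if none
//   pgPathUid    - uid that owns the PostgreSQL data directory
uid_t
restoreOwnerSelect(uid_t execUid, const char *manifestUser, uid_t pgPathUid)
{
    // Non-root restore: every file belongs to the executing user
    if (execUid != 0)
        return execUid;

    // Root restore: use the manifest name when it exists on this host
    if (manifestUser != NULL)
    {
        const struct passwd *entry = getpwnam(manifestUser);

        if (entry != NULL)
            return entry->pw_uid;
    }

    // Otherwise use the data directory owner if it maps to a local name
    if (getpwuid(pgPathUid) != NULL)
        return pgPathUid;

    // Last resort: root
    return 0;
}
```

A caller would typically pass geteuid() as execUid and the st_uid of the data directory (from stat()) as pgPathUid.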

48 Commits
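
The tie-breaking behavior described for the MostCommonValue object (7d97d49f41) can be illustrated with a small sketch. It makes several simplifying assumptions: plain int values stand in for Variants, storage is a fixed-size array, and the names McvSketch, mcvSketchUpdate(), and mcvSketchResult() are hypothetical rather than the actual pgBackRest API. As in the commit message, the result is recomputed on every call and the first value added wins a tie.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MCV_SKETCH_MAX                                              64

// Hypothetical stand-in for MostCommonValue using int values
typedef struct McvSketch
{
    int value[MCV_SKETCH_MAX];                  // Distinct values in first-seen order
    unsigned int count[MCV_SKETCH_MAX];         // Times each value was added
    size_t size;                                // Number of distinct values
} McvSketch;

// Add a value, incrementing its count if it has been seen before
void
mcvSketchUpdate(McvSketch *mcv, int value)
{
    assert(mcv != NULL);

    for (size_t idx = 0; idx < mcv->size; idx++)
    {
        if (mcv->value[idx] == value)
        {
            mcv->count[idx]++;
            return;
        }
    }

    assert(mcv->size < MCV_SKETCH_MAX);

    mcv->value[mcv->size] = value;
    mcv->count[mcv->size] = 1;
    mcv->size++;
}

// Compute the most common value. Returns false when no values were added.
// The result is not cached, so each call rescans the list.
bool
mcvSketchResult(const McvSketch *mcv, int *result)
{
    assert(mcv != NULL);
    assert(result != NULL);

    if (mcv->size == 0)
        return false;

    size_t best = 0;

    // Strictly greater keeps the earliest value when counts are tied
    for (size_t idx = 1; idx < mcv->size; idx++)
    {
        if (mcv->count[idx] > mcv->count[best])
            best = idx;
    }

    *result = mcv->value[best];
    return true;
}
```

Because values are kept in first-seen order and the comparison is strict, adding 7, 3, 3, 7 yields 7, and adding one more 3 changes the result to 3.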