1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00
Commit Graph

56 Commits

Author SHA1 Message Date
David Steele
f0ef73db70 pgBackRest is now pure C.
Remove embedded Perl from the distributed binary.  This includes code, configure, Makefile, and packages.  The distributed binary is now pure C.

Remove storagePathEnforceSet() from the C Storage object which allowed Perl to write outside of the storage base directory.  Update mock/all and real/all integration tests to use storageLocal() where they were violating this rule.

Remove "c" option that allowed the remote to tell if it was being called from C or Perl.

Code to convert options to JSON for passing to Perl (perl/config.c) has been moved to LibC since it is still required for Perl integration tests.

Update build and installation instructions in the user guide.

Remove all Perl unit tests.

Remove obsolete Perl code.  In particular this included all the Perl protocol code which required modifications to the Perl storage, manifest, and db objects that are still required for integration testing but only run locally.  Any remaining Perl code is required for testing, documentation, or code generation.

Rename perlReq to binReq in define.yaml to indicate that the binary is required for a test.  This had been the actual meaning for quite some time but the key was never renamed.
2019-12-13 17:55:41 -05:00
David Steele
8dfe0e48e2 Use more general error code when tablespace linked into PGDATA.
The specific error code was not that useful since we also test the error message which contains details of the link error.
2019-12-02 10:49:25 -05:00
David Steele
01aefc563d Update Perl page checksum expression.
This expression determines which files contain page checksums but it was also including the directory above the relation directories.  In a real PostgreSQL installation this not a problem because these directories don't contain any files.

However, our tests place a file in `base` which the Perl code thought should have page checksums while the new C code says no.

Update the expression to document the change and avoid churn in the expect logs later.
2019-11-25 07:37:09 -05:00
David Steele
29e132f5e9 PostgreSQL 12 support.
Recovery settings are now written into postgresql.auto.conf instead of recovery.conf.  Existing recovery_target* settings will be commented out to help avoid conflicts.

A comment is added before recovery settings to identify them as written by pgBackRest since it is unclear how, in general, old settings will be removed.

recovery.signal and standby.signal are automatically created based on the recovery settings.
2019-10-01 13:20:43 -04:00
Josh Soref
c2771e5469 Fix comment typos.
This includes some variable names in tests which don't seem important enough for their own commits.

Contributed by Josh Soref.
2019-08-26 12:05:36 -04:00
David Steele
01c2669b97 Fix exclusions for special files.
Prior to 2.16 the Perl manifest code would skip any file that began with a dot.  This was not intentional but it allowed PostgreSQL socket files to be located in the data directory.  The new C code in 2.16 did not have this unintentional exclusion so socket files in the data directory caused errors.

Worse, the file type error was being thrown before the exclusion check so there was really no way around the issue except to move the socket files out of the data directory.

Special file types (e.g. socket, pipe) will now be automatically skipped and a warning logged to notify the user of the exclusion.  The warning can be suppressed with an explicit --exclude.

Reported by CluelessTechnologist, Janis Puris, Rachid Broum.
2019-08-23 07:47:54 -04:00
David Steele
488fb67294 Force PostgreSQL versions to string for newer versions of JSON:PP.
Since 2.91 JSON::PP has a bias for saving variables that look like numbers as numbers even if they were declared as strings.

Force versions to strings where needed by appending ''.

Update the json-pp-perl package on Ubuntu 18.04 to 2.97 to provide test coverage.
2019-07-05 17:25:01 -04:00
blogh
e4e2606fce Add additional options to backup.manifest for debugging purposes.
Add the buffer-size, compress-level, compress-level-network, and process-max options to the backup:option section in backup.manifest to aid in debugging.

It may also make sense to propagate these options up to backup.info so they can be displayed in the info command, but for now this is deemed sufficient.

Contributed by blogh.
2019-03-10 11:03:52 +02:00
Cynthia Shang
34c63276cd Automatically enable backup checksum delta when anomalies (e.g. timeline switch) are detected.
There are a number of cases where a checksum delta is more appropriate than the default time-based delta:

* Timeline has switched since the prior backup
* File timestamp is older than recorded in the prior backup
* File size changed but timestamp did not
* File timestamp is in the future compared to the start of the backup
* Online option has changed since the prior backup

A practical example is that checksum delta will be enabled after a failover to standby due to the timeline switch.  In this case, timestamps can't be trusted and our recommendation has been to run a full backup, which can impact the retention schedule and requires manual intervention.

Now, a checksum delta will be performed if the backup type is incr/diff.  This means more CPU will be used during the backup but the backup size will be smaller and the retention schedule will not be impacted.

Contributed by Cynthia Shang.
2018-11-01 11:31:25 -04:00
Cynthia Shang
880fbb5e57 Add checksum delta for incremental backups.
Use checksums rather than timestamps to determine if files have changed.  This is useful in cases where the timestamps may not be trustworthy, e.g. when performing an incremental after failing over to a standby.

If checksum delta is enabled then checksums will be used for verification of resumed backups, even if they are full.  Resumes have always used checksums to verify the files in the repository, enabling delta performs checksums on the database files as well.

Note that the user must manually enable this feature in cases were it would be useful or just keep in enabled all the time.  A future commit will address automatically enabling the feature in cases where it seems likely to be useful.

Contributed by Cynthia Shang.
2018-09-19 11:12:45 -04:00
David Steele
375ff9f9d2 Ignore all files in a linked tablespace directory except the subdirectory for the current version of PostgreSQL.
Previously an error would be generated if other files were present and not owned by the PostgreSQL user.  This hasn't been a big deal in practice but it could cause issues.

Also add tests to make sure the same logic applies with links to files, i.e. all other files in the directory should be ignored.  This was actually working correctly, but there were no tests for it before.
2018-08-31 16:06:40 -04:00
David Steele
70514061fd Fix issue where relative links in $PGDATA could be stored in the backup with the wrong path.
Relative link paths were being combined with the paths of previous links (relative or absolute) due to the $strPath variable being modified in the current iteration rather than simply being passed to the next level of recursion.

This issue did not affect absolute links and relative tablespace links were caught by other checks, though the error was confusing.

Reported by Cynthia Shang.
2018-08-30 16:27:36 -04:00
David Steele
14cde54b37 Limit manifest build recursion (i.e. links followed) to sixteen levels to detect link loops. 2018-08-28 16:27:10 -04:00
David Steele
a6cecf7d5e Prevent manifest from being built more than once. 2018-08-28 16:22:30 -04:00
David Steele
bef58a7974 Allow arbitrary directories and/or files to be excluded from a backup.
Misuse of this feature can lead to inconsistent backups so read the --exclude documentation carefully before using.
2018-08-27 15:51:05 -04:00
Cynthia Shang
eb30d88b6a Allow zero-size files in backup manifest to reference a prior manifest regardless of timestamp delta.
Contributed by Cynthia Shang.
2018-08-24 16:50:33 -04:00
Cynthia Shang
bec4c176dc Exclude temporary and unlogged relation (table/index) files from backup.
Implemented using the same logic as the patches adding this feature to PostgreSQL, 8694cc96 and 920a5e50. Temporary relation exclusion is enabled in PostgreSQL ≥ 9.0. Unlogged relation exclusion is enabled in PostgreSQL ≥ 9.1, where the feature was introduced.

Contributed by Cynthia Shang.
2018-07-30 18:53:34 -04:00
Cynthia Shang
0acf705416 Require PostgreSQL catalog version when instantiating a Manifest object (and not loading it from disk).
Contributed by Cynthia Shang.
2018-07-16 17:25:15 -04:00
David Steele
350b30fa49 Move cryptographic hash functions to C using OpenSSL. 2018-06-11 14:52:26 -04:00
David Steele
5e090ba305 Fix failure in manifest build when two or more files in PGDATA are linked to the same directory.
Reported by Vitaliy Kukharik.
2018-05-02 12:19:54 -04:00
David Steele
be90028100 Rename db-* options to pg-* and backup-* options to repo-* to improve consistency.
* repo-* options are now indexed although only one is allowed.
* List deprecated option names in documentation and command-line help.
2018-02-03 18:27:38 -05:00
Cynthia Shang
00f58ec8c0 Fixed inability to restore a single database contained in a tablespace using --db-include.
Fixed by Cynthia Shang.
2018-01-30 16:13:54 -05:00
Cynthia Shang
bd74711ceb Add unit tests for the Manifest module.
Also minor changes to Manifest module, mostly for test reproducibility.

Contributed by Cynthia Shang.
2017-11-28 11:44:24 -05:00
Cynthia Shang
b03c26968a Repository encryption support.
Contributed by Cynthia Shang.
2017-11-06 12:51:12 -05:00
David Steele
6343fdd584 Additional backup exclusions.
* Exclude contents of pg_snapshots, pg_serial, pg_notify, and pg_dynshmem from backup since they are rebuilt on startup.
* Exclude pg_internal.init files from backup since they are rebuilt on startup.
2017-09-04 08:26:57 -04:00
David Steele
fcb7c6fd1d PostgreSQL 10 support. 2017-09-01 12:29:34 -04:00
David Steele
1e0ed07455 Configuration rules are now pulled from the C library when present. 2017-08-25 16:47:47 -04:00
David Steele
d5c1f02c72 Include archive_status directory in online backups.
The archive_status directory is now recreated on restore to support PostgreSQL 8.3 which does not recreate it automatically like more recent versions do.

Also fixed log checking after PostgreSQL shuts down to include FATAL messages and disallow immediate shutdowns which can throw FATAL errors in the log.

Reported by Stephen Frost.
2017-07-24 07:57:47 -04:00
David Steele
2310e423e9 Fixed an issue that prevented tablespaces from being backed up on PostgreSQL ≤ 8.4.
The integration tests that were supposed to prevent this regression did not work as intended.  They verified the contents of a table in the (supposedly) restored tablespace, deleted the table, and then deleted the tablespace.  All of this was deemed sufficient to prove that the tablespace had been restored correctly and was valid.

However, PostgreSQL will happily recreate a tablespace on the basis of a single full-page write, at least in the affected versions.  Since writes to the test table were replayed from WAL with each recovery, all the tests passed even though the tablespace was missing after the restore.

The tests have been updated to include direct comparisons against the file system and a new table that is not replayed after a restore because it is created before the backup and never modified again.

Versions ≥ 9.0 were not affected due to numerous synthetic integration tests that verify backups and restores file by file.
2017-06-27 16:47:40 -04:00
David Steele
de7fc37f88 Storage and IO layer refactor:
Refactor storage layer to allow for new repository filesystems using drivers. (Reviewed by Cynthia Shang.)
Refactor IO layer to allow for new compression formats, checksum types, and other capabilities using filters. (Reviewed by Cynthia Shang.)
2017-06-09 17:51:41 -04:00
David Steele
3d84f2ce5e Improvements to Ini.pm.
* Refactor Ini.pm to facilitate testing.
* Complete statement/branch coverage for Ini.pm.
* Improved functions used to test/munge manifest and info files.
2017-04-10 13:24:45 -04:00
David Steele
02730526fc Fixed an issue where databases created with a non-default tablespace would raise bogus warnings about pg_filenode.map and pg_internal.init not being page aligned.
Reported by blogh.
2017-03-02 13:50:29 -05:00
David Steele
36a5349b1c Added the --checksum-page option.
This option allows pgBackRest to validate page checksums in data files when checksums are enabled on PostgreSQL >= 9.3. Note that this functionality requires a C library which may not initially be available in OS packages. The option will automatically be enabled when the library is present and checksums are enabled on the cluster.
2016-12-12 18:54:07 -05:00
David Steele
a850335015 Simplified the result hash of File->manifest(), Db->tablespaceMapGet(), and Db->databaseMapGet(). 2016-11-30 14:36:39 -05:00
David Steele
dd621081b9 Fixed an issue where tablespace paths with the same prefix would cause an invalid link error.
Reported by Nikhilchandra Kulkarni.
2016-11-07 16:37:16 +02:00
David Steele
f43e5bc52d Removed extraneous use lib directives from Perl modules.
Suggested by Devrim Gündüz.
2016-11-04 13:56:26 +02:00
David Steele
a701309453 Converted Perl threads to processes. 2016-09-06 09:35:02 -04:00
David Steele
bcdb5cdac8 Fixed a issue where tablespaces were copied from the master during standby backup. 2016-09-04 09:19:44 -04:00
David Steele
2feaaf225e Exclude contents of $PGDATA/pg_replslot directory. 2016-09-04 09:13:13 -04:00
David Steele
5ada189a92 Backup from a standby cluster.
A connection to the primary cluster is still required to start/stop the backup and copy files that are not replicated, but the vast majority of files are copied from the standby in order to reduce load on the master.
2016-08-25 11:25:46 -04:00
David Steele
cd6278e5af Revert some backup exclusions until they have been tested more thoroughly. 2016-08-24 12:27:48 -04:00
David Steele
f1412baccf Exclude directories during backup that are cleaned, recreated, or zeroed by PostgreSQL at startup.
These include (depending on the version where they were introduced): pgsql_tmp, pg_dynshmem, pg_notify, pg_replslot, pg_serial, pg_snapshots, pg_stat_tmp, pg_subtrans. The postgresql.auto.conf.tmp file is now excluded in addition to files that were already excluded: backup_label.old, postmaster.opts, postmaster.pid, recovery.conf, recovery.done.
2016-08-16 09:35:16 -04:00
David Steele
1e0f15f425 Improve error message for links that reference links in manifest build. 2016-08-15 17:23:37 -04:00
David Steele
f9fa1270b2 Fixed #236: Recursive user tablespace symlink.
A tablespace link that referenced another link would not produce an error, but instead skip the tablespace entirely.
2016-08-15 17:11:45 -04:00
David Steele
17b79d6279 Database version refactoring.
* Refactor db version constants into a separate module.
* Update synthetic backup tests to PostgreSQL 9.4.
2016-08-11 22:35:24 -04:00
David Steele
bff262ac47 Removed all OP_* function constants that were used only for debugging, not in the protocol, and replaced with __PACKAGE__. 2016-08-11 17:32:28 -04:00
David Steele
34afe5e85b Fixed issue with tablespace link checking.
* Tablespace paths that had $PGDATA as a substring would be identified as a subdirectories of $PGDATA even when they were not.
* Also hardened relative path checking a bit.
2016-08-09 09:05:27 -04:00
David Steele
a3b8808f94 Fixed an issue where the contents of pg_xlog were being copied if the directory was symlinked. 2016-07-29 18:44:53 -04:00
David Steele
cc2a8777d5 User/group permissions improvements.
Improved handling of users/groups captured during backup that do not exist on the restore host. Also explicitly handle the case where user/group is not mapped to a name.
2016-06-26 21:01:20 -04:00
David Steele
23a3911830 Stop using pg_xlogfile_name().
The pg_xlogfile_name() function is no longer used to construct WAL filenames from LSNs. While this function is convenient it is not available on a standby. Instead, the archive is searched for the LSN in order to find the timeline. If due to some misadventure the LSN appears on multiple timelines then an error will be thrown, whereas before this condition would have passed unnoticed.
2016-06-24 08:06:20 -04:00