There are a number of cases where a checksum delta is more appropriate than the default time-based delta:
* Timeline has switched since the prior backup
* File timestamp is older than recorded in the prior backup
* File size changed but timestamp did not
* File timestamp is in the future compared to the start of the backup
* Online option has changed since the prior backup
A practical example is that checksum delta will be enabled after a failover to standby due to the timeline switch. In this case, timestamps can't be trusted and our recommendation has been to run a full backup, which can impact the retention schedule and requires manual intervention.
Now, a checksum delta will be performed if the backup type is incr/diff. This means more CPU will be used during the backup but the backup size will be smaller and the retention schedule will not be impacted.
Contributed by Cynthia Shang.
PostgreSQL 11 introduces configurable WAL segment sizes, from 1MB to 1GB.
There are two areas that needed to be updated to support this: building the archive-get queue and checking that WAL has been archived after a backup. Both operations require the WAL segment size to properly build a list.
Checking the archive after a backup is still implemented in Perl and has an active database connection, so just get the WAL segment size from the database.
The archive-get command does not have a connection to the database, so get the WAL segment size from pg_control instead. This requires a deeper inspection of pg_control than has been done in the past, so it seemed best to copy the relevant data structures from each version of PostgreSQL and build a generic interface layer to address them. While this approach is a bit verbose, it has the advantage of being relatively simple, and can easily be updated for new versions of PostgreSQL.
Since the integration tests generate pg_control files for testing, teach Perl how to generate files with the correct offsets for both 32-bit and 64-bit architectures.
Use checksums rather than timestamps to determine if files have changed. This is useful in cases where the timestamps may not be trustworthy, e.g. when performing an incremental after failing over to a standby.
If checksum delta is enabled then checksums will be used for verification of resumed backups, even if they are full. Resumes have always used checksums to verify the files in the repository, enabling delta performs checksums on the database files as well.
Note that the user must manually enable this feature in cases were it would be useful or just keep in enabled all the time. A future commit will address automatically enabling the feature in cases where it seems likely to be useful.
Contributed by Cynthia Shang.
% characters caused issues in backup/restore due to filenames being appended directly into a format string.
Reserved XML characters (<>&') caused issues in the S3 driver due to improper escaping.
Add a file with all common special characters to regression testing.
Previously an error would be generated if other files were present and not owned by the PostgreSQL user. This hasn't been a big deal in practice but it could cause issues.
Also add tests to make sure the same logic applies with links to files, i.e. all other files in the directory should be ignored. This was actually working correctly, but there were no tests for it before.
Relative link paths were being combined with the paths of previous links (relative or absolute) due to the $strPath variable being modified in the current iteration rather than simply being passed to the next level of recursion.
This issue did not affect absolute links and relative tablespace links were caught by other checks, though the error was confusing.
Reported by Cynthia Shang.
Implemented using the same logic as the patches adding this feature to PostgreSQL, 8694cc96 and 920a5e50. Temporary relation exclusion is enabled in PostgreSQL ≥ 9.0. Unlogged relation exclusion is enabled in PostgreSQL ≥ 9.1, where the feature was introduced.
Contributed by Cynthia Shang.
A regression in v0.82 removed the timestamp comparison when deciding which files from the aborted backup to keep on resume. All resumed backups should be considered inconsistent. A resumed backup can be identified by checking the log for the message "aborted backup of same type exists, will be cleaned to remove invalid files and resumed".
Reported by David Youatt, Yogesh Sharma, Stephen Frost.
Low-level functions only include stack trace in test builds while higher-level functions ship with stack trace built-in. Stack traces include all parameters passed to the function but production builds only create the parameter list when the log level is set high enough, i.e. debug or trace depending on the function.
Configuration files are loaded from the directory specified by the --config-include-path option.
Add --config-path option for overriding the default base path of the --config and --config-include-path option.
Contributed by Cynthia Shang.
The Perl process was exiting directly when called but that interfered with proper locking for the forked async process. Now Perl returns results to the C process which handles all errors, including signals.
Now only two types of locks can be taken: archive and backup. Most commands use one or the other but the stanza-* commands acquire both locks. This provides better protection than the old command-based locking scheme.
Scanning the entire backup directory can be very expensive if there are a lot of small tables. The backup manifest contains the backup directory list so use it to perform syncs instead of scanning the backup directory.
Move command begin to C except when it must be called after another command in Perl (e.g. expire after backup). Command begin logs correctly for complex data types like hash and list. Specify which commands will log to file immediately and set the default log level for log messages that are common to all commands. File logging is initiated from C.
If the backup cannot map a group to a name it stores the group in the manifest as false then uses either the owner of $PGDATA to set the group during restore or failing that the group of the current user. This logic was not working correctly because the selected group was overwriting the user on restore leaving the group undefined and the user incorrectly set to the group. (Reported by Jeff McCormick.)
The existing static files would not work with 32-bit or big-endian systems so create functions to generate these files dynamically rather than creating a bunch of new static files.
Previously, functions with sensitive options had to be logged at trace level to avoid exposing them. Trace level logging may still expose secrets so use with caution.
* Exclude contents of pg_snapshots, pg_serial, pg_notify, and pg_dynshmem from backup since they are rebuilt on startup.
* Exclude pg_internal.init files from backup since they are rebuilt on startup.