This mode is not actually necessary, considering that the core of
pg_rman is taking differential and full backups, the server being
afterwards in charge of recovering the necessary WAL segments from the
archive.
Regression tests and documentation are updated in accordance with the
changes.
In order to keep only the core of pg_rman for incremental/differential
backup, this looks necessary and makes the code simpler. Including
server log files in a backup could be subject to discussion as well,
as for example a Postgres base backup does not include them, simply
because in this case the server instance is not aware of the log files.
The previous algorithm was smart enough to remove full backups older
than the given number of generations, but not smart enough to remove
incremental and archive backups. This resulted in the backup list
keeping a set of incremental and archive backups older than the oldest
full backup retained. As it is useless to keep them, the deletion
algorithm is made smarter to take that into account and remove all of
them cleanly, only when necessary.
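
The generation rule described above can be sketched as follows. This is
only an illustration, not the pg_rman code: the BackupEntry type, the
mark_old_generations name, the newest-first ordering and the convention
that a count of zero or less means "keep everything" are all
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types, reduced to what the rule needs. */
typedef enum { BACKUP_FULL, BACKUP_INCREMENTAL, BACKUP_ARCHIVE } BackupMode;

typedef struct
{
    BackupMode mode;
    bool       delete_me;   /* set when the backup falls out of retention */
} BackupEntry;

/*
 * Mark every backup older than the oldest retained full backup.
 * "backups" is assumed to be sorted from newest to oldest.
 * keep_generations <= 0 is treated as "infinite" (assumption).
 * Returns the number of entries marked for deletion.
 */
size_t
mark_old_generations(BackupEntry *backups, size_t n, int keep_generations)
{
    size_t marked = 0;
    int    full_seen = 0;

    for (size_t i = 0; i < n; i++)
    {
        backups[i].delete_me = false;

        if (keep_generations <= 0)
            continue;           /* nothing is ever removed */

        if (full_seen >= keep_generations)
        {
            /* Older than the last retained full backup: any mode goes. */
            backups[i].delete_me = true;
            marked++;
            continue;
        }

        if (backups[i].mode == BACKUP_FULL)
            full_seen++;
    }
    return marked;
}
```

With two generations kept, the two newest full backups and everything
newer than the second one survive; all older entries are marked,
whatever their mode.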
Diffs were generated because wc puts some spaces before its output,
which is in this case a number of lines. This does not impact
regression tests on Linux/Unix.
This has the merit of putting all the documentation of the project under
a single banner, and of centralizing the whole project in a single place
at code level.
Documentation can be compiled by setting the variables ASCIIDOC and
XMLTO. As the PostgreSQL extension system is not that smart for doc
generation, a custom Makefile path is used to install man pages into
a folder that can be used directly in MANPATH.
Backups could be removed even if the generation number was set to
infinite, without regard for the calculated day threshold. Backups are
removed if they satisfy either the generation or the day threshold.
This commit simplifies the way backup sizes are saved internally by
reusing the same variable for incremental and full backups, which were
using separate, mutually exclusive variables, resulting in a couple
of bytes wasted all the time. This was also reflected by a useless
column in the output table of subcommand "show".
Having files satisfy both conditions seems somewhat awkward, as users
would usually choose either the number of generations to keep or the
number of days to keep the files. Hence deletion of a backup is bypassed
only when both parameters are set to infinite.
At the same time, correct some typos and mistakes in the deletion
code.
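
As a rough illustration of the retention rule (remove when either
threshold is exceeded, bypass only when both are infinite), here is a
minimal sketch. The function name, the KEEP_INFINITE sentinel and the
way generation and age are passed in are hypothetical and not taken
from pg_rman itself.

```c
#include <stdbool.h>

/* Hypothetical sentinel meaning "no limit" for a retention parameter. */
#define KEEP_INFINITE (-1)

/*
 * generation: 1 for backups belonging to the newest full backup,
 *             2 for the one before, and so on (assumption).
 * age_days:   age of the backup in whole days.
 */
bool
backup_is_removable(int generation, long age_days,
                    int keep_generations, int keep_days)
{
    bool too_many_generations;
    bool too_old;

    /* Deletion is bypassed only when both parameters are infinite. */
    if (keep_generations == KEEP_INFINITE && keep_days == KEEP_INFINITE)
        return false;

    too_many_generations = keep_generations != KEEP_INFINITE &&
                           generation > keep_generations;
    too_old = keep_days != KEEP_INFINITE && age_days > keep_days;

    /* Either threshold being exceeded is enough to remove the backup. */
    return too_many_generations || too_old;
}
```

An infinite parameter never triggers a removal by itself, so setting
only one of the two limits leaves the other criterion out of the
decision.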
Backups from standbys should use a method based on the replication
protocol, in a way similar to what is done in pg_basebackup, as they
cannot use pg_start/stop_backup. As I am not sure what the right
approach would be, it is better for the time being to block backups
taken from a standby. This does not penalize functionality though, as
taking disk snapshots is not forbidden either, and a user can still
recover from that. This commit removes at the same time some home-made
functions that created custom backup label files. This is not reliable,
especially if the Postgres core format for this file changes across
versions. Removing them will at least prevent some bugs.
The file name of a WAL segment was generated using the API of
xlog_internal.h called XLogFileName, which is based on XLogSegNo and
not XLogRecPtr as the previous code assumed. This led to incorrect
backups, actually with too many WAL files, in the archive code path
because the analysis was based on a completely wrong name. This commit
fixes at the same time an issue in search_next_wal where the function
could loop for too long, eating much CPU when looking for the next WAL
file.
Regression tests are passing cleanly with this patch.
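
To make the segment-number versus record-pointer distinction concrete,
here is a small standalone sketch. It does not include xlog_internal.h;
lsn_to_segno and wal_file_name are local, hypothetical helpers that
mirror the naming scheme as I understand it, and the 16MB segment size
is an assumption (the default build setting).

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed default WAL segment size of 16MB. */
#define WAL_SEG_SIZE    (UINT64_C(16) * 1024 * 1024)
/* Number of segments covered by one 4GB "xlog id". */
#define SEGS_PER_XLOGID (UINT64_C(0x100000000) / WAL_SEG_SIZE)

/* Convert a WAL byte position (record pointer) into a segment number. */
uint64_t
lsn_to_segno(uint64_t lsn)
{
    return lsn / WAL_SEG_SIZE;
}

/*
 * Build the 24-character segment file name from a timeline ID and a
 * segment number. The caller provides a buffer of at least 25 bytes.
 */
void
wal_file_name(char *buf, size_t len, uint32_t tli, uint64_t segno)
{
    snprintf(buf, len, "%08X%08X%08X",
             (unsigned int) tli,
             (unsigned int) (segno / SEGS_PER_XLOGID),
             (unsigned int) (segno % SEGS_PER_XLOGID));
}
```

Passing a record pointer where a segment number is expected skips the
division by the segment size, which yields a name far ahead of the real
segment; that is consistent with too many WAL files being picked up.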
This commit makes the presence of a full backup mandatory when doing
an incremental or archive backup on an existing timeline. If none is
present, the process will now simply error out and not take any backup.
It looks safer to use that as a default, so that the user is forced
to take a full backup once a recovery has been done.
Database backup also contained the following condition when doing an
incremental backup:
prev_backup->tli != current.tli
This means that an incremental backup cannot be taken if there is not
already a full backup present in the same timeline. The same condition
should also be used for archive backup but it didn't seem to be the
case...
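
A minimal sketch of the check, shared by the incremental and archive
paths, could look like this. BackupInfo is reduced to the single field
the condition needs, and needs_full_backup_first is a hypothetical
name; the real structures carry much more state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down backup descriptor. */
typedef struct
{
    uint32_t tli;   /* timeline ID the backup was taken on */
} BackupInfo;

/*
 * prev_backup is the latest full backup found in the catalog, or NULL
 * when there is none. Returns true when an incremental or archive
 * backup must be refused because no full backup exists on the
 * current timeline.
 */
bool
needs_full_backup_first(const BackupInfo *prev_backup, uint32_t current_tli)
{
    return prev_backup == NULL || prev_backup->tli != current_tli;
}
```

The caller would error out when this returns true, for both backup
modes, instead of silently taking a backup that cannot be restored on
its own.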
This bug was introduced by some older code. It looks like it will be
necessary to re-create a battery of regression tests to test all those
things automatically, as former tests contain nothing to test archive
mode directly.
Those macros were mainly used in code paths where they didn't make that
much sense, heavily complicating the code. Correct at the same time some
code comments.
It was unclear what was being errored out at the beginning of the
process. But it turns out that it is just necessary to check whether
the running backup is archive-only or not, then return a NULL file
list before continuing the process. This should be part of some safety
checks though.
The documentation found on the internet is rather unclear about the role
and the goal of this feature, which looks more like a kludge to cover
the fact that most of the system XLOG functions do not work on standby
nodes. Now that this restriction has been removed by using the control
file to look for the current timeline, this feature is not needed.
The system function used up to now was pg_xlogfile_name_offset, which
cannot be used on a node in recovery, and it was the only way present
to fetch the timeline ID of a backup, either incremental or full. So
instead scan the control file of the server and fetch the timeline from
that. This also removes the restriction that prevented a backup from
being taken on a standby node. The next step is to make it possible
to take backups from streams.
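
For illustration only, here is one way to obtain the timeline without
any SQL function. Note that this is a different technique from the
commit, which reads the control file itself: the sketch parses the text
printed by pg_controldata, so that it stays self-contained and does not
depend on the version-specific control file layout. parse_timeline_id
is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the timeline ID from pg_controldata output.
 * Returns 1 and stores the value in *tli on success, 0 otherwise.
 */
int
parse_timeline_id(const char *controldata_text, uint32_t *tli)
{
    static const char label[] = "Latest checkpoint's TimeLineID:";
    const char   *p = strstr(controldata_text, label);
    unsigned int  value;

    if (p == NULL)
        return 0;       /* label not found in the text */

    /* %u skips the padding spaces that follow the label. */
    if (sscanf(p + strlen(label), "%u", &value) != 1)
        return 0;

    *tli = (uint32_t) value;
    return 1;
}
```

Since nothing here goes through a server connection, the same lookup
works whether the node is a primary or a standby in recovery.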
It is just troublesome to have to type a subcommand for something that
could be merged into a single table. The output could be made in a
smarter way though...