pgbackrest

mirror of https://github.com/pgbackrest/pgbackrest.git synced 2026-06-20 01:17:49 +02:00

Author	SHA1	Message	Date
David Steele	d72bea9fe7	Fetch the libssh2 session from storage in SFTP read/write objects. The SFTP read and write objects cached the libssh2 session and sftp session pointers passed to their constructors. Add storageSftpSession() and storageSftpSessionSftp() getters and fetch the session from the storage object instead, removing the session parameters from storageReadSftpNew() and storageWriteSftpNew(). Fetching the session on demand means the read and write objects always use the storage object's current session rather than a copy taken at construction. This will be used in a future commit to pick up a new session after transparently reopening a connection the server dropped while idle.	2026-06-19 15:24:17 +07:00
David Steele	7da81569a0	Remove the unused sftpHandle field from the SFTP storage driver. StorageSftp carried an sftpHandle field that production code never set. Directory listing opens a local handle, and each read/write object opens and owns its own handle in its open function, so this field was always NULL. Despite that it was passed to storageReadSftpNew() and storageWriteSftpNew(), which immediately overwrite their own sftpHandle when the file is opened, and was "closed" in storageSftpLibSsh2SessionFreeResource(). Both paths were dead. Remove the field, the constructor parameter (and its function log), and the close block in the free resource. Drop the unit tests that populated the field to exercise the removed close path, and extend the free-resource branch test so the sftp_shutdown EAGAIN retry loop stays covered.	2026-06-19 14:12:01 +07:00
David Steele	af586e230e	Throw real SFTP read-open errors instead of masking them as missing. Previously storageReadSftpOpen() only inspected the failure when the libssh2 error was LIBSSH2_ERROR_SFTP_PROTOCOL or LIBSSH2_ERROR_EAGAIN. Any other error (e.g. a socket error such as LIBSSH2_ERROR_SOCKET_RECV) fell through silently, leaving a NULL handle and causing the storage layer to report the file as missing. Throw on any open failure except a genuine missing file (SFTP_PROTOCOL + LIBSSH2_FX_NO_SUCH_FILE), which is still returned to the caller via the result so ignoreMissing continues to work. This brings read-open error handling in line with storageWriteSftpOpen().	2026-06-19 12:39:40 +07:00
David Steele	1c57106e2d	Simplify error handling in SFTP storageWriteSftpOpen(). The nested LIBSSH2_ERROR_SFTP_PROTOCOL check called storageSftpEvalLibSsh2Error() identically in both branches, so collapse it into a single condition that throws FileMissingError for the no-such-file case and otherwise falls through to storageSftpEvalLibSsh2Error(). This is safe because storageSftpEvalLibSsh2Error() already omits the sftp error detail when the error is not a protocol error, and libssh2_sftp_last_error() can be read unconditionally since it has no side effects.	2026-06-19 11:36:42 +07:00
David Steele	b2106121e6	Use a storage feature to gate remove errorOnMissing. The Azure, GCS, and S3 drivers each asserted that errorOnMissing was never set in their remove function, hardcoding in every driver the fact that it cannot detect a missing file on remove. That capability was implied by an assert scattered across the drivers rather than declared anywhere. Add a storageFeatureFileRemoveMissing feature that a driver sets when it can report a file as missing on remove, and enable it for the Posix driver (but not CIFS, which shares the driver). Move the check into storageRemove(), which now asserts errorOnMissing is set only when the storage supports the feature, and drop the redundant per-driver asserts along with the now-unused errorOnMissing log parameter.	2026-06-18 13:28:31 +07:00
David Steele	9ea20ad9d0	Bring SFTP storageRemove() in line with the other repository storage drivers. The SFTP driver, like the Azure, GCS, and S3 drivers, is only ever used for repositories, and no repository code path requests errorOnMissing on remove. Those cloud drivers therefore assert that errorOnMissing is never set, but storageSftpRemove() still honored it. Assert that errorOnMissing is never set and drop it from the remove logic, so a missing file is always ignored and every other libssh2/SFTP result is raised. This matches the cloud drivers and removes a code path that was never exercised in production.	2026-06-18 12:32:20 +07:00
David Steele	3ce84b97a7	Change storage for storageMoveP() call in storage/posix unit test. Since the source file will be deleted the specified storage needs to be the same storage as the source file and writable. Currently there is a loophole that allows this to work but it will be enforced in a future commit.	2026-06-18 00:30:29 +07:00
David Steele	c869963f8c	Error on SFTP remove failures other than missing file. storageSftpRemove() only raised an error for a libssh2 LIBSSH2_ERROR_SFTP_PROTOCOL result; any other result (e.g. a transport failure such as LIBSSH2_ERROR_SOCKET_SEND) was silently treated as a successful removal unless errorOnMissing was set. A send failure says nothing about whether the file was removed -- the request may never have reached the server -- so the file could remain on the repository while pgBackRest believed it was gone. Raise an error for every result except a genuine LIBSSH2_FX_NO_SUCH_FILE when errorOnMissing is not set, which is the only case where a failed remove legitimately maps to success.	2026-06-18 00:05:05 +07:00
David Steele	ce33913a80	Unify the file remove error message. storageRemove() (Posix and SFTP) threw the ad-hoc "unable to remove '%s'" while path remove used STORAGE_ERROR_PATH_REMOVE_FILE "unable to remove file '%s'", so the same failure was reported two different ways depending on the entry point. Replace STORAGE_ERROR_PATH_REMOVE_FILE with STORAGE_ERROR_FILE_REMOVE and use it for both single-file and recursive path removal, so every file removal failure now reads "unable to remove file '%s'". The SFTP path remove detail is also separated with a colon ("...': libssh sftp [...]") to match similar messages instead of a bare space.	2026-06-17 22:44:13 +07:00
Christophe Pettus	7fd8bc89a1	Fix potential buffer overrun in error module. errorInternalThrow() called strncpy(stackTraceBuffer, stackTrace, n - 1) followed by messageBuffer[n - 1] = '\0' -- a copy-paste bug that wrote the NUL terminator into the wrong buffer. Since both buffers are ERROR_MESSAGE_BUFFER_SIZE the errant write was in-bounds and messageBuffer was already terminated, so the bug was silent in practice. But when stackTrace's length >= sizeof(stackTraceBuffer) - 1, strncpy() does not null-terminate, leaving stackTraceBuffer non-terminated and exposing errorContext.error.stackTrace to over-read by any consumer.	2026-06-17 12:22:42 +07:00
David Steele	5b77f9bc22	Refactor SFTP unit tests. Replace the verbose, hand-built HrnLibSsh2 script entries in the SFTP unit tests with a set of per-function HRN_LIBSSH2_* response macros (one per libssh2 shim function). Each macro names its function, bakes in the fixed parameters the production code always passes, and defaults the common result, so a test supplies only the values that vary as trailing designated initializers. The harness now defaults omitted values rather than requiring every field to be spelled out: libssh2_session_hostkey() defaults length/type/value; libssh2_sftp_stat_ex() defaults an omitted .attr to a regular file with mode 0640, defaults .flags to the standard attribute set (adding a size for regular files), and defaults .uid/.gid to the test user/group. Add HRN_LIBSSH2_ATTR_EXISTENCE (path exists but reports no attributes) and HRN_LIBSSH2_OWNER_ROOT (report ownership by root) sentinels, plus HRN_LIBSSH2_DIR/FILE/LINK/FIFO() helpers that OR a file type with an octal mode. Replace the NULL-terminated script array and hrnLibSsh2ScriptSet(array) with HRN_LIBSSH2_SCRIPT_SET(...), which computes the script length so no terminator entry is needed; hrnLibSsh2ScriptSet() now takes an explicit size. Rebuild HRNLIBSSH2_MACRO_STARTUP/SHUTDOWN() and HOSTKEY_HASH_ENTRY() on top of the new macros, and stat_ex now verifies the requested stat_type (via .follow) instead of scripting it as a parameter. Reorganize the tests themselves: split bundled comment groups into TEST_TITLE sections, split scripts per section, and drop redundant connect/disconnect setup. Remove tests that duplicate Posix tests for common code. Remove duplicative SFTP tests. Rename HrnLibSsh2 fields for clarity (attrPerms -> attr/mode, symlinkExTarget -> target). These changes cut sftpTest.c roughly in half.	2026-06-17 12:15:48 +07:00
David Steele	ffa325b8a4	Add batch delete for Azure storage. Path remove (used by expire and friends) issued one HTTP DELETE per file. Use the Azure Blob Batch API to remove up to 256 objects per request, as the GCS driver already does, by sending the deletes as multipart sub-requests. Azure's batch parser is stricter than the MIME/OData spec it is based on: it requires the body to begin with the boundary delimiter (no leading CRLF preamble) and splits header lines on ": ", so the multipart builder now emits the opening delimiter without a leading CRLF and writes the MIME part headers and embedded sub-request headers in the "Name: value" form. The multipart request code is shared with the GCS driver, which accepts either form. Azure omits the blank line that terminates the headers of an empty-body sub-response, relying on the boundary's CRLF as the terminator, so multipart response header parsing now allows eof to end the header block. Sub-requests that fail (not 2xx or 404) are retried individually. A failed part is mapped back to the original request by its position in the response rather than the echoed content-id header, since Azure omits content-id on some error responses.	2026-06-17 11:24:30 +07:00
Shubham ( Kira )	34bba828c6	Harden HTTP chunked response parsing. Tolerate chunk extensions by stripping everything from the first ';' on the chunk-size line before parsing the hex size, and consume any chunk trailers (and the blank line that terminates them) following the terminating zero-size chunk so the connection is left aligned for reuse. Treat the transfer-encoding value case-insensitively so "Chunked" is accepted as "chunked". Reject a response that sets both transfer-encoding and content-length even when content-length is 0. The previous check keyed off contentSize > 0, so a zero content-length slipped through. Track whether the header was present with a dedicated flag and reject the ambiguous combination uniformly, as RFC 7230 permits.	2026-06-14 16:41:31 +07:00
David Steele	bc6d399a6a	PostgreSQL 19beta1 support. Add PostgreSQL 19 as an unreleased version (release: false) and vendor its control/checkpoint structures, catalog and control versions, and XLOG_PAGE_MAGIC. PostgreSQL 19 stores data_checksum_version in pg_control as a four-state ChecksumStateType enum (OFF, VERSION, INPROGRESS_OFF, INPROGRESS_ON) rather than the prior 0/1 value. Validate the field against the version-appropriate maximum and clear the in-progress states to 0, since page checksums cannot be relied on while checksums are being enabled or disabled. Consumers of pageChecksumVersion therefore continue to see only 0 or 1. Add the PG19 test harness and the supporting Perl/CI plumbing (DbVersion, VmTest, container build) and adjust the integration test matrix.	2026-06-14 16:18:33 +07:00
David Steele	dde263d699	Clarify zero-size chunk comment in HTTP response parsing. The previous "still zero" wording leaned on the reader tracking that contentRemaining was zero on entry and remained zero after parsing the next chunk size. State the actual meaning instead: a zero-size chunk terminates the response.	2026-06-14 10:14:19 +07:00
David Steele	e166a33f48	Prevent recursive exit() on signal. musl 1.2.6 intentionally crashes when exit() is called recursively, which happened when a signal arrived while exit() was already in progress, e.g. when a server terminated a child that was already exiting. Set a flag when exit is in progress so exitOnSignal() ignores the signal and allows the in-flight exit() to complete. Reset the flag in exitInit() since a forked child may inherit it from a parent that was exiting. Also call exitSafe() before notifying the parent in the server tests so a signal sent in response to the notification cannot arrive before the exit in progress flag is set. Add Alpine 3.24 to CI to exercise the unit tests against musl 1.2.6, which is where this crash was found.	2026-06-13 12:46:46 +07:00
David Steele	aca779a6ec	Run integration tests on Alpine 3.21 (musl libc). Drop the c-only restriction for the a321 CI job so the full unit and integration suites run on musl libc, exercising the integration tests (including SFTP) against Alpine in addition to glibc. Apply the ssh-rsa HostKeyAlgorithms/PubkeyAcceptedAlgorithms workaround to a321 as well as u22, since Alpine 3.21 ships OpenSSH 9.x which no longer offers the SHA-1 ssh-rsa host-key algorithm by default and the libssh2 client requires it (otherwise the SFTP handshake fails key exchange with LIBSSH2_ERROR_KEX_FAILURE). Suppress the libssh2_session_init_ex and libssh2_session_handshake "possibly lost" leaks reported by valgrind during SFTP integration. These are persistent allocations tied to the session lifetime and are flagged only on the Linux CI runner where valgrind wraps the integration test binary. The suppressions go in valgrind.suppress.none because integration tests always run with vm none. Generalize hrnHostPgBinPath() to probe the Debian, RHEL, and Alpine PostgreSQL bin paths in turn rather than hardcoding two, and throw a clear assert if none match. Add a321 to the default VM list, install PostgreSQL 15/16/17 on Alpine, and point VMDEF_PGSQL_BIN at the Alpine layout. Rebuild the a321 base image accordingly.	2026-06-13 12:18:33 +07:00
Artur Zakirov	760dd8db69	Fix Alpine group conflicts in CI containers. Handle /etc/group entries with non-empty member fields when renaming TEST_GROUP_ID and only create the Alpine group when it does not already exist.	2026-06-12 21:34:45 +07:00
David Steele	3ce9fb9563	Update Fedora CI container to Fedora 44.	2026-06-12 21:04:06 +07:00
David Steele	cb694fcb4d	Update (almost) EOL Debian 11 to Debian 12 in CI. Debian 11 will be EOL just after the next release but it is also a blocker for some planned work due to old package versions. It seems fine to just expire it a bit early. Also update the integration tests to run Debian 12 on Posix since Azure is not supported on i386.	2026-06-12 20:57:16 +07:00
David Steele	3e2e3b1ff0	Add code count exclusion for SVG files.	2026-06-12 11:46:58 +07:00
David Steele	464f3020a9	Move ignoreMissing out of the storage read drivers. Previously each read driver decided whether a missing file was an error, which duplicated the ignoreMissing logic across the Posix, SFTP, remote, S3, Azure, and GCS drivers. Now driver open() simply reports whether the file exists and StorageRead throws FileMissingError when missing files are not ignored. Since the client now makes this decision, ignoreMissing no longer needs to be passed through the remote protocol and a missing file is reported locally rather than as an error raised from the remote.	2026-06-12 10:11:49 +07:00
David Steele	2b7825ad46	Add valgrind suppressions for libbacktrace unwind false positives. When libbacktrace is enabled, throwing an error calls backtrace_full(), which unwinds the stack with libgcc's _Unwind_Backtrace. On aarch64 the unwinder (and glibc's _dl_find_object, which it calls to look up unwind tables) branches on values valgrind considers uninitialised. Since tests run under valgrind with --exit-on-first-error=yes, the false positive aborted any test that happened to trip it, e.g. storage/sftp. Suppress Cond and Value8 errors that originate inside _Unwind_Backtrace when called from backtrace_full.	2026-06-11 12:10:35 +07:00
David Steele	eaac49e3c2	Move issue template to new format required by Github. .github/ISSUE_TEMPLATE.md is no longer filling new issues even though it should still be working according to the documentation. Rather than fight the system just move to the new format.	2026-06-09 13:27:53 +07:00
ShivakumarAmbigiTR	6c518001c2	Add support for S3 Outposts. Add a repo-s3-service option that controls the SigV4 signing service name. Defaults to 's3' for standard S3 endpoints. Set to 's3-outposts' when using an S3 Outposts endpoint. The signing service is used in the credential scope, HMAC signing key derivation, and authorization header. The option accepts free-form input to support future AWS service variants.	2026-06-09 12:37:48 +07:00
David Steele	a953aa74b5	Invert storage read/write interface ownership. Previously, drivers constructed StorageRead/StorageWrite objects directly and stored metadata in a shared interface struct. Now, StorageRead/StorageWrite create the driver via storageInterfaceNewReadP()/NewWriteP() and mediate between IoRead/IoWrite and the driver. Drivers return opaque objects and own their metadata independently. This loosens the tight coupling between drivers and the StorageRead/StorageWrite layer. The remote write driver replaces its back-pointer to StorageWrite with a filterGroup callback, eliminating the circular dependency. It makes retry in StorageRead much more readable. Also move the logic for testing whether a file version could not be found out of the drivers and into StorageRead.	2026-06-09 11:23:15 +07:00
David Steele	71f52f7d92	Remove extraneous pathSync test in the SFTP storage driver. The driver does not implement path sync so there is no reason to test the behavior.	2026-06-08 17:15:27 +07:00
David Steele	7d4c0d58ed	Test the Posix driver for syncPath rather than StorageWrite. The syncPath value in StorageWrite is for informational purposes and does not determine if the path is actually synced or not. Instead probe the Posix driver to make sure that syncPath is disabled so there is no error on CIFS.	2026-06-08 17:08:58 +07:00
David Steele	0c62043419	Rename execOne() to execOneExpect(). This clears the way to use the shorter version in the core code.	2026-06-04 15:07:50 +07:00
David Steele	beb189fbb3	Add repo-s3-key-type=pod-id documentation missed in `79544f64`.	2026-06-04 11:31:44 +07:00
David Steele	c02ddc3ad9	Update lock-threads action version. Also remove unused PR settings.	2026-06-01 12:59:46 +09:00
David Steele	e1cbd5b55c	New CI container builds. These should have been updated in `742fff17` when libsystemd was added but doing it now includes the last PostgreSQL minor releases.	2026-05-29 12:15:28 +09:00
GLFNSE	1e6d23fea7	Add user/group caching for faster manifest build. On systems where uid/gid lookups are routed to a remote name service (sssd, systemd-userdbd, LDAP, etc.), every getpwuid()/getgrgid() call incurs a Unix socket round-trip. This dominates the manifest build phase for clusters with millions of files, even though the data files almost always share a single owner. Add a small fixed-size (16-entry) per-process cache for userNameFromId() and groupNameFromId(). Linear scan is faster than a hash table at this size. Negative results (unknown ids) are also cached. Cache overflow falls through to uncached lookups.	2026-05-29 10:36:48 +09:00
David Steele	2cc9898fa1	Remove extraneous const.	2026-05-28 22:17:30 +09:00
Andrew Jackson	742fff174a	Add systemd notify integration. Allow systemd to have a better understanding as to when pgBackRest has finished starting or is stopping. This implementation is based off of the existing implementations in PostgreSQL [1] and PgBouncer [2]. PgBouncer also has an implementation for `notify-reload` but this is not implemented here as it is a very recent feature [3] that is unavailable on many Linux distributions. [1] https://www.postgresql.org/message-id/flat/CAFj8pRA4%3DhVj-d%3D8O7PSMjopsFUHPcAftd5tLqFC_xb035hNQA%40mail.gmail.com#e346a6189d8b0ed44c745c4aaaef587f [2] https://github.com/pgbouncer/pgbouncer/commit/3816a0073f09944a6f7eaa278d2226ca4942b911 [3] https://github.com/systemd/systemd/blob/fa6d3bffe30064c4d4092b3daa749465f08d35fb/NEWS#L6176	2026-05-19 11:37:48 +09:00
David Steele	3a6c67183c	C11 is now the minimum C standard. This standard is over fifteen years old and the features we are interested in seem well supported on popular compilers. The main advantage is that static_assert() will now display the specified message on error rather than the ever-cryptic `negative width in bit-field '__error_if_negative'`. Now that we can depend on having static_assert() we can replace our STATIC_ASSERT_STMT() macro. Replace our ALIGN_OF() macro with alignof(). Replace our FN_NO_RETURN macro with noreturn. Include stdnoreturn.h in build.h to avoid needing to include it in many header files. Use an anonymous union in common/type/json.c where it simplifies syntax. Other uses of union seem better as they are.	2026-05-19 10:19:58 +09:00
David Steele	6a40144d91	Add sponsors and announcement that pgBackRest will continue.	2026-05-18 20:27:00 +09:00
David Steele	91b8dd6035	Add news page. Move recent announcements that were added to the top of the homepage.	2026-05-18 20:14:45 +09:00
David Steele	e0c934d231	Fix documentation typos.	2026-05-18 10:05:02 +09:00
David Steele	892b60bc79	Add dark mode favicon. In dark mode the black favicon was barely visible. Use a white favicon in dark mode instead. Also, use the new SVG logo for the favicon and update logo.png to the new style.	2026-05-17 11:56:56 +09:00
David Steele	7ad4fd21c1	Remove attribution for armchair icon since it is now licensed.	2026-05-16 16:55:12 +09:00
David Steele	684f12c6cb	Remove information about v1 releases. These releases do not support any current version of PostgreSQL so they are of limited value.	2026-04-27 13:17:20 +09:00
Will Morland	77312b33ba	Add per-repo backup progress to info command output. When backups are running on multiple repositories simultaneously, the info command now reports per-repo progress in addition to the existing overall progress. A new repo array is included in JSON output for backup locks. This avoids confusing progress jumps when one repo finishes before another.	2026-04-14 14:51:42 +07:00
David Steele	ce8bb04d5b	Limit CI permissions on the repository to read.	2026-04-14 12:40:33 +07:00
David Steele	2ef8673534	Remove CodeQL CI job. This job has never surfaced any useful data and now it is failing, so remove it. It appears that CodeQL can now be automated directly within the Github interface, so that seems like a better route if we decide to reenable it.	2026-04-14 12:34:32 +07:00
David Steele	afd6939656	Allow FreeBSD tests to continue running even if one fails.	2026-04-13 18:22:49 +07:00
David Steele	c823ce2f7c	Update GitHub Actions versions.	2026-04-13 18:19:41 +07:00
David Steele	45987413d3	Migrate Cirrus CI tests to Github Actions. Cirrus CI is shutting down on June 1 so migrate all tests. This could have been done before, probably, but it was not clear how to run FreeBSD on Github Actions. The cross-platforms-actions action solves that problem. Fix a couple of minor test issues found on MacOS. Also remove the dead make-cmd option. This has not been valid since the migration to meson.	2026-04-13 18:11:36 +07:00
David Steele	d3cdff17f5	Migrate documentation block rendering to C. This also requires a fair amount of support code that cannot be removed from Perl yet.	2026-04-10 13:26:21 +07:00
David Steele	ff69fcb671	Remove obsolete Perl code.	2026-04-07 10:20:29 +07:00
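
The buffer termination issue described in 7fd8bc89a1 above can be illustrated with a minimal sketch. This is not the pgBackRest code: copyTruncate() is a hypothetical helper, and the only point shown is that the terminator must be written to the same buffer that strncpy() just filled, because strncpy() leaves the destination unterminated when the source is too long.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not a pgBackRest function): copy src into dst, truncating when needed, and always
// NUL-terminate dst itself
static void
copyTruncate(char *const dst, const size_t dstSize, const char *const src)
{
    assert(dst != NULL && src != NULL && dstSize > 0);

    // strncpy() writes no terminator when strlen(src) >= dstSize - 1
    strncpy(dst, src, dstSize - 1);

    // Terminate the buffer that was just written, not a neighboring buffer of the same size
    dst[dstSize - 1] = '\0';
}
```

With an 8-byte destination, a 10-character source is truncated to 7 characters plus the terminator, while a shorter source is copied whole.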

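The chunk extension handling described in 34bba828c6 above (strip everything from the first ';' on the chunk-size line, then parse the hex size) might look roughly like the following. This is a sketch under assumptions, not the pgBackRest implementation: chunkSizeParse() is a hypothetical name, the line is assumed to arrive without its trailing CRLF, and the 16-digit limit is chosen here only so the result fits in a uint64_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: parse a chunk-size line (CRLF already removed), ignoring any chunk extension
static bool
chunkSizeParse(const char *const line, uint64_t *const size)
{
    assert(line != NULL && size != NULL);

    // The hex size is everything before the first ';' (or the whole line when there is no extension)
    size_t hexLen = strcspn(line, ";");

    // Drop whitespace between the size and the extension
    while (hexLen > 0 && (line[hexLen - 1] == ' ' || line[hexLen - 1] == '\t'))
        hexLen--;

    // Reject an empty size and anything too long to fit in 64 bits
    if (hexLen == 0 || hexLen > 16)
        return false;

    uint64_t result = 0;

    for (size_t idx = 0; idx < hexLen; idx++)
    {
        const char c = line[idx];
        unsigned int digit;

        if (c >= '0' && c <= '9')
            digit = (unsigned int)(c - '0');
        else if (c >= 'a' && c <= 'f')
            digit = (unsigned int)(c - 'a') + 10;
        else if (c >= 'A' && c <= 'F')
            digit = (unsigned int)(c - 'A') + 10;
        else
            return false;

        result = (result << 4) | digit;
    }

    *size = result;
    return true;
}
```

A zero result marks the terminating chunk; per the commit message, any trailers and the blank line after it still need to be consumed before the connection is reused.
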

4848 Commits