1
0
mirror of https://github.com/pgbackrest/pgbackrest.git synced 2025-07-07 00:35:37 +02:00
Commit Graph

26 Commits

Author SHA1 Message Date
c7a66ac1af Improve memory usage of mem contexts.
Each mem context can track child contexts, allocations, and a callback. Before this change memory was allocated for tracking all three even if they were not used for a particular context. This made mem contexts unsuitable for String and Variant objects since they are plentiful and need to be as small as possible.

This change allows mem contexts to be configured to track any combination of child contexts, allocations, and a callback. In addition, the mem context can be configured to track a single child context and/or allocation, which saves memory and is a common use case.

Another benefit is that Variants can own objects (e.g. KeyValue) that they encapsulate. All of this makes memory accounting simpler because mem contexts have names while allocations do not. No more memory is used than before since Variants and Strings still had to store the memory context they were originally allocated in so they could be easily freed.

Update the String and Variant objects to use this new functionality. The custom strFree() and varFree() functions are no longer required and can now be a wrapper around objFree().

Lastly, this will allow strMove() and varMove() to be implemented and used in cases where strDup() and varDup() are being used to move a String or Variant to a new context. Since this will be a bit noisy it is saved for a future commit.
2022-05-18 10:52:01 -04:00
7ae5478d98 Remove most _Z constants.
Most of the time these were not making the code any clearer.

For cases where they were used to construct Strings and Buffers, replace with constants.

Also cleanup unused Buffers and Strings.
2022-05-05 12:09:21 -04:00
d7e45f12a5 Refactor httpRequestNew().
Move httpRequestProcess() outside of the object mem context. In case it ever returns a value we don't want that to end up in the object context.
2022-04-26 10:49:04 -04:00
59a5373cf8 Handle TLS servers that do not close connections gracefully.
Some TLS server implementations will simply close the socket rather than correctly closing the TLS connection. This causes problems when connection: close is specified with no content-length or chunked encoding and we are forced to read to EOF. It is hard to know if this is a real EOF or a network error.

In cases where we can parse the content and (hopefully) ensure it is correct, allow the closed socket to serve as EOF. This is not ideal, but the change in 8e1807c means that currently working servers with this issue will stop working after 2.35 is installed, which seems too risky.
2022-03-02 11:38:52 -06:00
a79034ae2f Add read range to all storage drivers.
The range feature allows reading out an arbitrary chunk of a file and will be important for efficient small file support.

Now that all drivers are required to support ranges remove the storageFeatureLimitRead feature flag that was implemented only by the Posix driver.
2022-01-11 14:42:53 -05:00
c38e2d3170 Add verb to HTTP error output.
This makes it easier to debug HTTP errors.
2021-12-08 15:00:19 -05:00
b7e17d80ea More efficient memory allocation for Strings and String Variants.
The vast majority of Strings are never modified so for most cases allocate memory for the string with the object. This results in one allocation in most cases instead of two. Use strNew() if strCat*() functions are needed.

Update varNewStr() in the same way since String Variants can never be modified. This results in one allocation in all cases instead of three. Also update varNewStrZ() to use STR() instead of strNewZ() to save two more allocations.
2021-10-07 19:43:28 -04:00
475b57c89b Allow additional memory to be allocated with a mem context.
The primary benefit is that objects can allocate memory for their struct with the context, which saves an additional allocation and makes it easier to read context/allocation dumps. Also, the memory context does not need to be stored with the object since it can be determined using the object pointer.

Object pointers cannot be moved, so this means whatever additional memory is allocated cannot be resized. That makes the additional memory ideal for object structs, but not so much for allocating a list that might change size.

Mem contexts can no longer be reused since they will probably be the wrong size so their memory is freed on memContextFree(). This still means fewer allocations and frees overall.

Interfaces still need to be freed by mem context so the old objMove() and objFree() have been preserved as objMoveContext() and objFreeContext(). This will be addressed in a future commit.
2021-09-01 11:10:35 -04:00
d30ec9c9ae Replace OBJECT_DEFINE_MOVE() and OBJECT_DEFINE_FREE() with inlines.
Inline functions are more efficient and if they are not used are automatically omitted from the binary.

This also makes the implementation of these functions easier to find and removes the need for a declaration. That is, the complete implementation is located in the header rather than being spread between the header and C file.
2021-04-08 10:04:57 -04:00
cc85c4f03d Replace OBJECT_DEFINE_GET() with *Pub struct pattern.
This macro was originally intended to simplify the creation of simple getters but it has been superseded by the pattern introduced in 79a2d02c.

Remove instances of OBJECT_DEFINE_GET() to avoid confusion with the new pattern.
2021-04-07 14:27:57 -04:00
088662d986 GCS support for repository storage.
GCS and GCS-compatible object stores can now be used for repository storage.
2021-03-05 12:13:51 -05:00
66a4ff496a Encode path before passing to HttpRequest.
GCS requires mixed encoding in the path so encoding inside HttpRequest does not work.

Instead, require the path to be correctly encoded before being passed to HttpRequest.
2021-02-19 09:05:32 -05:00
1b4b3538cc Rename uri to path where appropriate in HTTP and storage modules.
The path was originally named uri due to the canonicalized path being called "canonicalized uri" in the S3 authentication documentation. The name got propagated everywhere from there.

This is not correct for general usage, however, so rename to path when describing the path component of an HTTP request.
2021-02-19 08:22:50 -05:00
67d444b9e8 Add bufEmpty().
This seems more readable than bufUsed() == 0, just like 7d6c0319.
2021-02-01 09:22:01 -05:00
7d6c0319f0 Add lstEmpty(), strLstEmpty(), and varLstEmpty().
This seems more readable than lst*Size() == 0.

Hopefully this will also eliminate usage of lst*Size() > 0/lst*Size() != 0 variants for the inverse.
2021-01-29 14:27:56 -05:00
959f77cd6a Add general-purpose statistics collector.
Currently each module that needs to collect statistics implements custom code to do so. This is cumbersome.

Create a general purpose module for collecting and reporting statistics. Statistics are output in the log at detail level, but there are other uses they could be put to eventually.

No new functionality is added. This is just a drop-in replacement for the current statistics, with the advantage of being more flexible.

The new stats are slower because they involve a list lookup, but performance testing shows stats can be updated at about 40,000/ms which seems fast enough for our purposes.
2020-08-20 14:04:26 -04:00
de0f8c2654 Add user-agent to HTTP requests. 2020-08-18 10:01:24 -04:00
fbee6ec170 Add support for HTTP/1.0.
HTTP/1.0 connections are closed by default after a single response. Other than that, treat 1.0 the same as 1.1.

HTTP/1.0 allows different date formats that we can't parse but for now, at least, we don't need any date headers from 1.0 requests.
2020-08-14 13:11:33 -04:00
3e9dce0d76 Rename strPtr()/strPtrNull() to strZ()/strZNull().
We use the Z suffix in many functions to indicate that we are expecting a zero-terminated string so make this function conform to the pattern.

As a bonus the new name is a bit shorter, which is a good quality in a commonly-used function.
2020-07-30 07:49:06 -04:00
50ff5d905e Update comment and parameter in HttpRequest. 2020-07-15 13:54:01 -04:00
574f36c9d2 Rename httpRequest() to httpRequestResponse() and fix comment. 2020-07-14 15:14:41 -04:00
91c7adc834 Allow redactions for HTTP queries.
The Azure storage driver exposes secrets in the query when using SAS authorization. These secrets can show up during logging or when an error occurs.

Allow redaction of queries to prevent secrets from being exposed in logs and errors.
2020-07-14 13:09:48 -04:00
eaa05fdc49 Write HTTP request as a buffer to hide secrets.
The prior method of writing headers as strings could expose secrets in trace level logs.

Instead write the entire request as a buffer to prevent secrets from being logged and also reduce the amount of logging.
2020-07-08 15:07:29 -04:00
3f4371d7a2 Azure support for repository storage.
Azure and Azure-compatible object stores can now be used for repository storage.

Currently only shared key authentication is supported but SAS will be added soon.
2020-07-02 16:24:34 -04:00
1416dc3225 Add missing end braces in debug log functions.
These had no effect on functionality but made debug messages harder to read.
2020-06-26 16:44:10 -04:00
c5892d1291 Asynchronous S3 multipart upload.
When uploading large files the upload is split into multiple parts which are assembled at the end to create the final file. Previously we waited until each part was acknowledged before starting on the processing (i.e. compression, etc.) of the next part.

Now, the request for each part is sent while processing continues and the response is read just before sending the request for the next part. This asynchronous method allows us to continue processing while the S3 server formulates a response.

Testing from outside AWS in a high-bandwidth, low-latency environment showed a 35% improvement in the upload time of 1GB files. The time spent waiting for multipart notifications was reduced by ~300% (this measurement included the final part which is not uploaded asynchronously).

There are still some possible improvements: 1) the creation of the multipart id could be made asynchronous when it looks like the upload will need to be multipart (this may incur cost if the upload turns out not to be multipart). 2) allow more than one async request (this will use more memory).

A fair amount of refactoring was required to make the HTTP responses asynchronous. This may seem like overkill but having well-defined request, response, and session objects will also be advantageous for the upcoming HTTP server functionality.

Another advantage is that the lifecycle of an HttpSession is better defined. We only want to reuse sessions that complete the request/response cycle successfully, otherwise we consider the session to be in a bad state and would prefer to start clean with a new one. Previously, this required complex notifications to mark a session as "successfully done". Now, ownership of the session is passed to the request and then the response and only returned to the client after a successful response. If an error occurs anywhere along the way the session will be automatically closed by the object destructor when the request/response object is freed (depending on which one currently owns the session).
2020-06-24 13:44:00 -04:00