mirror of https://github.com/pgbackrest/pgbackrest.git synced 2024-12-14 10:13:05 +02:00
pgbackrest/test/expect
David Steele c5892d1291
Asynchronous S3 multipart upload.
When large files are uploaded, the upload is split into multiple parts, which are assembled at the end to create the final file. Previously we waited until each part was acknowledged before starting on the processing (i.e. compression, etc.) of the next part.

Now, the request for each part is sent while processing continues and the response is read just before sending the request for the next part. This asynchronous method allows us to continue processing while the S3 server formulates a response.
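The ordering described above can be sketched as a small simulation. This is only an illustration, not pgBackRest code: `UploadLog`, `logEvent`, and `uploadParts` are hypothetical names, and no network I/O is performed. Events are recorded so the interleaving of processing, sending, and reading responses can be inspected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event log used only to make the ordering visible */
#define EVENT_MAX 64

typedef struct UploadLog
{
    char events[EVENT_MAX][16];
    size_t count;
} UploadLog;

void
logEvent(UploadLog *log, const char *type, unsigned int part)
{
    assert(log->count < EVENT_MAX);
    snprintf(log->events[log->count++], sizeof(log->events[0]), "%s%u", type, part);
}

/* Upload partTotal parts while keeping at most one request in flight */
void
uploadParts(UploadLog *log, unsigned int partTotal)
{
    unsigned int pending = 0;                   /* 0 means no request is in flight */

    for (unsigned int part = 1; part <= partTotal; part++)
    {
        /* Processing (compression, etc.) overlaps with the pending request */
        logEvent(log, "process", part);

        /* Read the previous response just before sending the next request */
        if (pending != 0)
            logEvent(log, "recv", pending);

        logEvent(log, "send", part);
        pending = part;
    }

    /* The final part has nothing left to overlap with, so wait for it directly */
    if (pending != 0)
        logEvent(log, "recv", pending);
}
```

For three parts, the recorded order is `process1`, `send1`, `process2`, `recv1`, `send2`, `process3`, `recv2`, `send3`, `recv3`: the response for each part is read only after the next part has been processed, except for the last one.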

Testing from outside AWS in a high-bandwidth, low-latency environment showed a 35% improvement in the upload time of 1GB files. The time spent waiting for multipart notifications was reduced by ~300% (this measurement included the final part which is not uploaded asynchronously).

There are still some possible improvements: 1) make the creation of the multipart id asynchronous when it looks like the upload will need to be multipart (this may incur cost if the upload turns out not to be multipart), and 2) allow more than one async request (this will use more memory).

A fair amount of refactoring was required to make the HTTP responses asynchronous. This may seem like overkill but having well-defined request, response, and session objects will also be advantageous for the upcoming HTTP server functionality.

Another advantage is that the lifecycle of an HttpSession is better defined. We only want to reuse sessions that complete the request/response cycle successfully; otherwise we consider the session to be in a bad state and would prefer to start clean with a new one. Previously, this required complex notifications to mark a session as "successfully done". Now, ownership of the session is passed to the request and then to the response, and it is only returned to the client after a successful response. If an error occurs anywhere along the way, the session will be automatically closed by the object destructor when the request/response object is freed (depending on which one currently owns the session).
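A minimal sketch of this ownership hand-off follows. All `Demo*` types and functions are hypothetical and are not the pgBackRest API; explicit free functions stand in for the object destructor mentioned above, and a session is just a heap-allocated struct.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoSession { int id; } DemoSession;

typedef struct DemoClient
{
    DemoSession *idle;                          /* session available for reuse, or NULL */
    int opened;                                 /* sessions created so far */
    int closed;                                 /* sessions closed so far */
} DemoClient;

typedef struct DemoRequest { DemoClient *client; DemoSession *session; } DemoRequest;
typedef struct DemoResponse { DemoClient *client; DemoSession *session; } DemoResponse;

static void
demoSessionClose(DemoClient *client, DemoSession *session)
{
    free(session);
    client->closed++;
}

/* The client hands its idle session (or a new one) to the request */
DemoRequest
demoRequestNew(DemoClient *client)
{
    DemoSession *session = client->idle;
    client->idle = NULL;

    if (session == NULL)
    {
        session = malloc(sizeof(DemoSession));
        assert(session != NULL);
        session->id = ++client->opened;
    }

    return (DemoRequest){.client = client, .session = session};
}

/* The request hands the session to the response */
DemoResponse
demoResponseNew(DemoRequest *request)
{
    DemoResponse response = {.client = request->client, .session = request->session};
    request->session = NULL;
    return response;
}

/* Only a successfully completed response returns the session to the client */
void
demoResponseDone(DemoResponse *response)
{
    response->client->idle = response->session;
    response->session = NULL;
}

/* Freeing an object closes the session only if that object still owns it */
void
demoRequestFree(DemoRequest *request)
{
    if (request->session != NULL)
    {
        demoSessionClose(request->client, request->session);
        request->session = NULL;
    }
}

void
demoResponseFree(DemoResponse *response)
{
    if (response->session != NULL)
    {
        demoSessionClose(response->client, response->session);
        response->session = NULL;
    }
}
```

Because exactly one object holds the session pointer at any time, the error path needs no "successfully done" flag: freeing whichever object owns the session closes it, and the client creates a fresh session for the next request.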
2020-06-24 13:44:00 -04:00
mock-all-001.log Use PostgreSQL instead of postmaster where appropriate. 2020-06-17 15:14:59 -04:00
mock-all-002.log Use PostgreSQL instead of postmaster where appropriate. 2020-06-17 15:14:59 -04:00
mock-archive-001.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-archive-002.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-archive-stop-001.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-archive-stop-002.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-expire-001.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-expire-002.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-stanza-001.log Rename most instances of master to primary in tests. 2020-06-16 14:06:38 -04:00
mock-stanza-002.log Asynchronous S3 multipart upload. 2020-06-24 13:44:00 -04:00
real-all-001.log Simplify test matrix for real/all tests. 2020-06-23 13:44:29 -04:00