Filters had different ideas about what "done" meant and this added complication to the group filter processing. For example, gzip decompression would detect end of stream and mark the filter as done before it had been flushed.
Improve the IoFilter interface to give a consistent definition of done across all filters, i.e. no filter can be done until it has started flushing no matter what the underlying driver reports. This removes quite a bit of tricky logic in the processing loop which tried to determine when a filter was "really" done.
Also improve management of the input buffers by pointing directly to the prior output buffer (or the caller's input) to eliminate loops that set/cleared these buffers.
Most of the *Free() functions are pretty generic so add macros to make creating them as easy as possible.
Create a distinction between *Free() functions that the caller uses to free memory and callbacks that free third-party resources. There are a number of cases where a driver needs to free resources but does not need a normal *Free() because it is handled by the interface.
Add common/object.h for macros that make object maintenance easier. This pattern can also be used for many more object functions.
The function pointer casting used when creating drivers made changing interfaces difficult and led to slightly divergent driver implementations. Unit testing caught production-level errors but there were a lot of small issues and the process was harder than it should have been.
Use void pointers instead so that no casts are required. Introduce the THIS_VOID and THIS() macros to make dealing with void pointers a little safer.
Since we don't want to expose void pointers in header files, driver functions have been removed from the headers and the various driver objects return their interface type. This cuts down on accessor methods and the vast majority of those functions were not being used. Move functions that are still required to .intern.h.
Remove the special "C" crypto functions that were used in libc and instead use the standard interface.
Add bufDup() and bufNewUsedC().
Arrange bufNewC() params to match bufNewUsedC() since they have always seemed backward.
Fix bufHex() to only render the used portion of the buffer and fix some places where used was not being set correctly.
Use a union to make macro assignments for all legal values without casting. This is much more likely to catch bad assignments.
This greatly reduces calls to filter processing, which is a performance benefit, but also makes the trace logs smaller and easier to read.
However, this means that ioWriteFlush() will no longer work with filters since a full flush of IoFilterGroup would require an expensive reset. Currently ioWriteFlush() is not used in this scenario so for now just add an assert to ensure it stays that way.
These are more efficient than creating buffers in place when needed.
After replacement discovered that bufNewStr() and BufNewZ() were not being used in the core code so removed them. This required using the macros in tests which is not the usual pattern.
Since the introduction of blocking read drivers (e.g. IoHandleRead, TlsClient) the non-blocking drivers have used the same rules for determining maximum buffer size, i.e. read only as much as requested. This is necessary so the blocking drivers don't get stuck waiting for data that might not be coming.
Instead mark blocking drivers so IoRead knows how much buffer to allow for the read. The non-blocking drivers can now request the maximum number of bytes allowed by buffer-size.
Add production checks to ensure no filter gets a zero-size input buffer.
Also, optimize the case where a filter returns no output. There's no sense in running downstream filters if they have no new input.
The IoRead object was passing zero-length buffers into the filter processing code but not all the filters were happy about getting them.
In particular, the gzip compression filter failed if it was given no input directly after it had flushed all of its buffers. This made the problem rather intermittent even though a zero-length buffer was being passed to the filter at the end of every file. It also explains why tweaking compress-level or buffer-size allowed the file to go through.
Since this error was happening after all processing had completed, there does not appear to be any risk that successfully processed files were corrupted.
Reported by brunre01, jwpit, Tomasz Kontusz, guruguruguru.
Some IO objects have file descriptors which can be useful for monitoring with select().
It might also be useful to expose handles for write objects but there is currently no use case.
There was a lot of extra boilerplate involved in setting up pipes so that is now automated.
In some cases testing with multiple children is useful so allow that as well.
Rather than create _P/_PP variants for every type that needs to pass/return pointers, create FUNCTION_*_P/PP() macros that will properly pass or return any single/double pointer types.
There remain a few unresolved edge cases such as CHARPY but this handles the majority of types well.
Rename FUNCTION_DEBUG_* macros to FUNCTION_LOG_* to more accurately reflect what they do. Further rename FUNCTION_DEBUG_RESULT* macros to FUNCTION_LOG_RETURN* to make it clearer that they return from the function as well as logging. Leave FUNCTION_TEST_* macros as they are.
Consolidate the various ASSERT* macros into a single ASSERT macro that is always compiled out of production builds. It was difficult to figure out when an assert would be checked with all the different types in play. When ASSERTs are compiled in they will always be checked regardless of the log level -- tying these two concepts together was not a good idea.
General i/o objects for reading and writing file descriptors, in particular those that can block. In other words, these are not generally to be used with file descriptors for actual files, but rather pipes, sockets, etc.
If InOut filters were placed next to each other then the second filter would never get a NULL input signaling it to flush. This arrangement only worked if the second filter had some other indication that it should flush, such as a decompression filter where the flush is indicated in the input stream.
This is not a live issue because currently no InOut filters are chained together.
TlsClient introduced a non-blocking read which is required to read protocol messages that are linefeed-terminated rather than a known size. However, in many cases the expected number of bytes is known in advance so in that case it is more efficient to have tlsClientRead() block until all the bytes are read.
Add block parameter to all read functions and use it when a blocking read is required. For most read functions this is a noop, i.e. if the read function never blocks then it can ignore the parameter.
In passing, set the log level of storageNew*() functions to debug to expose more high-level I/O operations.
These interfaces previously used the memory context of the object they were associated with and did not have their own destructors.
There are times when it is useful to free the interface without also freeing the underlying object so give IoRead and IoWrite their own memory contexts and destructors.
In passing fix a comment type in bufferRead.c.
By default the IoWrite object does not write until the output buffer is full but this is a problem for protocol messages that must be sent in order to get a response.
ioWriteFlush() is not called internally by IoWrite but can be used at any time to immediately write all bytes from the output buffer without closing the IoWrite object.
ioReadLine() calls ioRead(), which aggressively tries to fill the output buffer, but this doesn't play well with blocking reads.
Give ioReadLine() an option that tells it to read only what is available. That doesn't mean the function will never block but at least it won't do so by reading too far.
Allow a single linefeed-terminated line to be read or written. This is useful for various protocol implementations, including HTTP and pgBackRest's protocol.
On read the maximum line size is limited to buffer-size to prevent runaway memory usage in case a linefeed is not found. This seems fine for HTTP but we may need to revisit this decision when implementing the pgBackRest protocol. Another option would be to increase the minimum buffer size (currently 16KB).
This constructor creates a Buffer object directly from a zero-terminated string. The old way was to create a String object first, then convert that to a Buffer using bufNewStr().
Updated in all places that used the old pattern.
Fixed parameter constructors made adding new interface functions a burden, so we switched to using structs to define interfaces in the storage module at c49eaec7.
While propagating this pattern to the IO interfaces it became obvious that the existing variable parameter function pattern (begun in the storage module) was more succinct and consistent with the existing code.
So, use variable parameter functions to define all interfaces. This assumes that the non-interface parameters will be fixed, which seems reasonable for low-level code.