2017-08-18 16:52:05 -07:00
|
|
|
/*
|
2017-09-01 18:28:35 -07:00
|
|
|
* Copyright (c) 2016-present, Yann Collet, Facebook, Inc.
|
2016-08-30 10:04:33 -07:00
|
|
|
* All rights reserved.
|
|
|
|
*
|
2017-08-18 16:52:05 -07:00
|
|
|
* This source code is licensed under both the BSD-style license (found in the
|
|
|
|
* LICENSE file in the root directory of this source tree) and the GPLv2 (found
|
|
|
|
* in the COPYING file in the root directory of this source tree).
|
2017-09-08 00:09:23 -07:00
|
|
|
* You may select, at your option, one of the above-listed licenses.
|
2016-08-30 10:04:33 -07:00
|
|
|
*/
|
|
|
|
|
2017-09-01 18:28:35 -07:00
|
|
|
#ifndef ZSTD_OPT_H
|
|
|
|
#define ZSTD_OPT_H
|
2016-02-10 14:26:30 +01:00
|
|
|
|
2017-09-01 18:28:35 -07:00
|
|
|
#if defined (__cplusplus)
|
|
|
|
extern "C" {
|
|
|
|
#endif
|
2016-02-03 19:53:29 +01:00
|
|
|
|
first implementation of delayed update for btlazy2
This is a pretty nice speed win.
The new strategy consists in stacking new candidates as if it was a hash chain.
Then, only if there is a need to actually consult the chain, they are batch-updated,
before starting the match search itself.
This is supposed to be beneficial when skipping positions,
which happens a lot when using lazy strategy.
The baseline performance for btlazy2 on my laptop is :
15#calgary.tar : 3265536 -> 955985 (3.416), 7.06 MB/s , 618.0 MB/s
15#enwik7 : 10000000 -> 3067341 (3.260), 4.65 MB/s , 521.2 MB/s
15#silesia.tar : 211984896 -> 58095131 (3.649), 6.20 MB/s , 682.4 MB/s
(only level 15 remains for btlazy2, as this strategy is squeezed between lazy2 and btopt)
After this patch, and keeping all parameters identical,
speed is increased by a pretty good margin (+30-50%),
but compression ratio suffers a bit :
15#calgary.tar : 3265536 -> 958060 (3.408), 9.12 MB/s , 621.1 MB/s
15#enwik7 : 10000000 -> 3078318 (3.249), 6.37 MB/s , 525.1 MB/s
15#silesia.tar : 211984896 -> 58444111 (3.627), 9.89 MB/s , 680.4 MB/s
That's because I kept `1<<searchLog` as a maximum number of candidates to update.
But for a hash chain, this represents the total number of candidates in the chain,
while for the binary, it represents the maximum depth of searches.
Keep in mind that a lot of candidates won't even be visited in the btree,
since they are filtered out by the binary sort.
As a consequence, in the new implementation,
the effective depth of the binary tree is substantially shorter.
To compensate, it's enough to increase `searchLog` value.
Here is the result after adding just +1 to searchLog (level 15 setting in this patch):
15#calgary.tar : 3265536 -> 956311 (3.415), 8.32 MB/s , 611.4 MB/s
15#enwik7 : 10000000 -> 3067655 (3.260), 5.43 MB/s , 535.5 MB/s
15#silesia.tar : 211984896 -> 58113144 (3.648), 8.35 MB/s , 679.3 MB/s
aka, almost the same compression ratio as before,
but with a noticeable speed increase (+20-30%).
This modification makes btlazy2 more competitive.
A new round of paramgrill will be necessary to determine which levels are impacted and could adopt the new strategy.
2017-12-28 16:58:57 +01:00
|
|
|
#include "mem.h" /* U32 */
|
2017-11-07 16:15:23 -08:00
|
|
|
#include "zstd.h" /* ZSTD_CCtx, size_t */
|
|
|
|
|
first implementation of delayed update for btlazy2
This is a pretty nice speed win.
The new strategy consists in stacking new candidates as if it was a hash chain.
Then, only if there is a need to actually consult the chain, they are batch-updated,
before starting the match search itself.
This is supposed to be beneficial when skipping positions,
which happens a lot when using lazy strategy.
The baseline performance for btlazy2 on my laptop is :
15#calgary.tar : 3265536 -> 955985 (3.416), 7.06 MB/s , 618.0 MB/s
15#enwik7 : 10000000 -> 3067341 (3.260), 4.65 MB/s , 521.2 MB/s
15#silesia.tar : 211984896 -> 58095131 (3.649), 6.20 MB/s , 682.4 MB/s
(only level 15 remains for btlazy2, as this strategy is squeezed between lazy2 and btopt)
After this patch, and keeping all parameters identical,
speed is increased by a pretty good margin (+30-50%),
but compression ratio suffers a bit :
15#calgary.tar : 3265536 -> 958060 (3.408), 9.12 MB/s , 621.1 MB/s
15#enwik7 : 10000000 -> 3078318 (3.249), 6.37 MB/s , 525.1 MB/s
15#silesia.tar : 211984896 -> 58444111 (3.627), 9.89 MB/s , 680.4 MB/s
That's because I kept `1<<searchLog` as a maximum number of candidates to update.
But for a hash chain, this represents the total number of candidates in the chain,
while for the binary, it represents the maximum depth of searches.
Keep in mind that a lot of candidates won't even be visited in the btree,
since they are filtered out by the binary sort.
As a consequence, in the new implementation,
the effective depth of the binary tree is substantially shorter.
To compensate, it's enough to increase `searchLog` value.
Here is the result after adding just +1 to searchLog (level 15 setting in this patch):
15#calgary.tar : 3265536 -> 956311 (3.415), 8.32 MB/s , 611.4 MB/s
15#enwik7 : 10000000 -> 3067655 (3.260), 5.43 MB/s , 535.5 MB/s
15#silesia.tar : 211984896 -> 58113144 (3.648), 8.35 MB/s , 679.3 MB/s
aka, almost the same compression ratio as before,
but with a noticeable speed increase (+20-30%).
This modification makes btlazy2 more competitive.
A new round of paramgrill will be necessary to determine which levels are impacted and could adopt the new strategy.
2017-12-28 16:58:57 +01:00
|
|
|
void ZSTD_updateTree(ZSTD_CCtx* ctx, const BYTE* ip, const BYTE* iend, U32 nbCompares, U32 mls); /* used in ZSTD_loadDictionaryContent() */
|
|
|
|
|
2017-09-06 15:56:32 -07:00
|
|
|
size_t ZSTD_compressBlock_btopt(ZSTD_CCtx* ctx, const void* src, size_t srcSize);
|
|
|
|
size_t ZSTD_compressBlock_btultra(ZSTD_CCtx* ctx, const void* src, size_t srcSize);
|
2016-06-27 01:31:35 +02:00
|
|
|
|
2017-09-06 15:56:32 -07:00
|
|
|
size_t ZSTD_compressBlock_btopt_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize);
|
|
|
|
size_t ZSTD_compressBlock_btultra_extDict(ZSTD_CCtx* ctx, const void* src, size_t srcSize);
|
2016-06-27 01:31:35 +02:00
|
|
|
|
2017-09-01 18:28:35 -07:00
|
|
|
#if defined (__cplusplus)
|
2016-02-09 23:00:41 +01:00
|
|
|
}
|
2017-09-01 18:28:35 -07:00
|
|
|
#endif
|
2016-06-27 01:31:35 +02:00
|
|
|
|
2017-09-01 18:28:35 -07:00
|
|
|
#endif /* ZSTD_OPT_H */
|