mirror of https://github.com/pbnjay/grate.git synced 2024-12-13 13:58:27 +02:00
Commit Graph

26 Commits

Author SHA1 Message Date
Jeremy Jay
a5be267bf7 more tweaks to memory usage in xls this time
did not reduce total allocations much (bytes.Reader is more efficient
than I thought), but reduced walltime from 99s to 55s for a large collection
2021-02-13 00:06:04 -05:00
Jeremy Jay
f990af649d interface isn't necessary here 2021-02-12 14:37:08 -05:00
Jeremy Jay
b953163a8d tweaks to reduce memory usage in xlsx
on a test dataset, usage goes from 6.0GB => 3.3GB
and walltime improves from 31.6s => 19.4s

largest remaining driver is the slow/hungry xml.Decoder
2021-02-12 13:44:46 -05:00
Jeremy Jay
e244917a51 standardize and simplify debugging statements 2021-02-12 10:50:33 -05:00
Jeremy Jay
bbe0d2dbb4 don't load all sheets at once in xlsx 2021-02-12 10:48:57 -05:00
Jeremy Jay
26e4b46e83 more consistent error handling 2021-02-12 10:44:23 -05:00
Jeremy Jay
f25b853fdf adding a simple-format reader for csv/tsv 2021-02-12 01:00:35 -05:00
Jeremy Jay
749e19458a allow openers to have a priority 2021-02-12 00:59:01 -05:00
Jeremy Jay
88daa500dc accidentally implemented xlsx. refactor to include it 2021-02-12 00:50:50 -05:00
Jeremy Jay
b048fa8aad refactor common excel stuff 2021-02-11 01:28:46 -05:00
Jeremy Jay
8812c44704 why not? initial implementation of xlsx 2021-02-11 01:16:02 -05:00
Jeremy Jay
ee3b4224e0 refactor xls2tsv and add some features 2021-02-10 01:34:35 -05:00
Jeremy Jay
77213bcf00 rudimentary date parsing/identification 2021-02-10 01:33:50 -05:00
Jeremy Jay
b239d8f2cb unsigned int overflow is no fun 2021-02-10 01:32:41 -05:00
Jeremy Jay
4848de04e3 handle uint32s better 2021-02-09 13:59:58 -05:00
Jeremy Jay
1c0ac5b92c don't forget the first row 2021-02-09 01:28:22 -05:00
Jeremy Jay
41ac03347f implement merged cells w/ sentinel placeholder values 2021-02-09 01:11:41 -05:00
Jeremy Jay
4418175c85 just when I thought I was out, they pull me back in 2021-02-08 23:10:24 -05:00
Jeremy Jay
bf6d144fa3 improve error handling/drop panics 2021-02-08 15:36:08 -05:00
Jeremy Jay
80c3b4cc81 many bugfixes and edge cases, impl most cell types 2021-02-08 11:02:37 -05:00
Jeremy Jay
f794a5ef9b allow seeking in SliceReader 2021-02-07 23:52:37 -05:00
Jeremy Jay
c01ac5f01b tool to create tsv from xls 2021-02-07 22:53:28 -05:00
Jeremy Jay
bde3f21cfc working sheet parser 2021-02-07 22:51:20 -05:00
Jeremy Jay
b17e0da28d working biff8 parser and rc4 decryption 2021-02-05 11:30:38 -05:00
Jeremy Jay
a477a30993 fix and export SliceReader. update test 2021-02-05 10:49:40 -05:00
Jeremy Jay
986d4d9116 initial implementation of CFB doc format 2021-02-02 15:05:55 -05:00