My problem
- 1,000s of giant historical gzip'd text files
- Needed to be able to search them
zgrep is cool!
- ...but slow
godbolt:~ $ ls -hl some-file.csv.gz
505M Oct 14 14:38 some-file.csv.gz
godbolt:~ $ time zgrep AAPL.O some-file.csv.gz | wc -l
864221
Executed in 27.58 secs
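Why so slow? zgrep is essentially sugar for "decompress the whole stream, then grep it", so every query re-pays the full decompression cost of the 505M archive. A tiny stand-in (hypothetical demo data, since the real CSV isn't included) shows the shape of the pipeline:

```shell
# Build a tiny stand-in for the big CSV (hypothetical data):
printf 'AAPL.O,100\nMSFT.O,200\nAAPL.O,150\n' | gzip > demo.csv.gz

# What zgrep does, roughly: decompress everything, grep the stream.
# On a 505M archive, every single query streams the whole file.
gzip -dc demo.csv.gz | grep -c 'AAPL.O'   # prints 2
```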
Enter zq
godbolt:~ 5.2s $ time zq some-file.csv.gz AAPL.O | wc -l
864221
Executed in 4.40 secs
The catch
godbolt:~ $ time zindex -d , -f 1 some-file.csv.gz
Executed in 108.26 secs
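The one-off build cost amortizes quickly, though. Using the timings above, each indexed query saves roughly 23 seconds, so the 108-second build pays for itself in about five searches:

```shell
# Break-even point: index build time / per-query saving
awk 'BEGIN { printf "%.1f\n", 108.26 / (27.58 - 4.40) }'   # prints 4.7
```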
Dirty secret
godbolt:~ $ ls -hl some-file.csv.gz.zindex
2.3G Oct 14 14:43 some-file.csv.gz.zindex
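The 2.3G isn't a bug: indexing on field 1 means the index records, per line, the field-1 key plus enough location data to seek back to that line, so its size tracks line count and key cardinality rather than the compressed size. On a toy file (hypothetical data again) you can see the per-key entry counts such an index has to hold:

```shell
printf 'AAPL.O,100\nMSFT.O,200\nAAPL.O,150\n' | gzip > demo.csv.gz

# Roughly one index entry per (key, line); tally entries per key:
gzip -dc demo.csv.gz | cut -d, -f1 | sort | uniq -c | sort -rn
```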
More realistic
godbolt:~ $ zindex ~/some-file.csv.gz
godbolt:~ 14.6s $ ls -lh ~/some-file.csv.gz.zindex
629K Oct 14 16:58 some-file.csv.gz.zindex