Merge lp:~alexander-s-m/zim/cache_speedup into lp:~jaap.karssenberg/zim/pyzim
Status: | Merged |
---|---|
Merged at revision: | 789 |
Proposed branch: | lp:~alexander-s-m/zim/cache_speedup |
Merge into: | lp:~jaap.karssenberg/zim/pyzim |
Diff against target: | 10 lines (+1/-0), 1 file modified: zim/index.py (+1/-0) |
To merge this branch: | bzr merge lp:~alexander-s-m/zim/cache_speedup |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Jaap Karssenberg | Pending | ||
Review via email: mp+272524@code.launchpad.net |
Commit message
Tremendously speed up reindexing by disabling sqlite synchronous mode
By default, sqlite asks the operating system to ensure that all content is safely written to the disk surface before continuing. If the underlying filesystem is not mounted with the nobarrier option and resides on a rotational (spinning) disk, flushing index.db to disk becomes the bottleneck of a full reindex, and speed may drop to about 3 pages per second.
Disabling sync makes reindexing 50-100 times faster in such cases.
The amount of data written to disk is reduced as well, which is desirable on SSD devices. On a test notebook with 430 pages, the amount of data written during a single reindex drops from 35MB to 300-400KB.
We don't need extra caution when dealing with a cache. Even if it gets corrupted (which is especially unlikely if we perform our operations fast), it can be regenerated at any time.
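For illustration, here is a minimal sketch of disabling synchronous mode using Python's standard sqlite3 module. The function name open_cache is hypothetical, and the actual one-line change in zim/index.py may be written differently; only the `PRAGMA synchronous=OFF` statement itself is standard SQLite.

```python
import sqlite3


def open_cache(path):
    """Open an SQLite cache database with synchronous writes disabled.

    Hypothetical helper for illustration; not the code from zim/index.py.
    """
    conn = sqlite3.connect(path)
    # With synchronous=OFF, SQLite hands data to the OS and continues
    # without waiting for it to reach the disk surface. An OS crash or
    # power loss may corrupt the file, which is acceptable for a cache
    # that can be rebuilt.
    conn.execute('PRAGMA synchronous=OFF')
    return conn
```

The setting applies per connection, so it would have to be issued each time the database is opened.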
Description of the change
A small change, but it tremendously speeds up reindexing.
I have a rather big main notebook with 400+ pages which I sync across several machines, so I have to reindex it quite often.
On many configurations reindexing takes a very long time. It was worst on a laptop with btrfs on an HDD: 2:10 (130 seconds)!
But disabling sqlite's default, extra-cautious behavior makes reindexing *much* faster. With this patch, reindexing my notebook on the laptop takes just 2 seconds.
There is a tiny chance that index.db might become corrupted if the OS crashes during reindexing, but it is negligible, and even if that happens we can recreate index.db from scratch later at no cost.
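As a rough sketch of the "recreate from scratch" fallback, the snippet below checks a cache file with SQLite's standard `PRAGMA integrity_check` and discards the file if the check fails. The function name open_or_reset_cache is hypothetical and this is not how zim itself handles a damaged index; it only shows one way such a recovery could look, assuming the caller triggers a full reindex afterwards.

```python
import os
import sqlite3


def open_or_reset_cache(path):
    """Return a connection to the cache at `path`.

    If the file fails SQLite's integrity check, delete it and return a
    connection to a fresh, empty database. Hypothetical helper; the
    caller is assumed to rebuild the index afterwards.
    """
    conn = None
    try:
        conn = sqlite3.connect(path)
        row = conn.execute('PRAGMA integrity_check').fetchone()
        if row is not None and row[0] == 'ok':
            return conn
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a valid database.
        pass
    if conn is not None:
        conn.close()
    if os.path.exists(path):
        os.remove(path)
    return sqlite3.connect(path)
```

Because the index is only a cache of the notebook's pages, throwing the file away loses nothing that cannot be regenerated.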