Zim

Merge lp:~alexander-s-m/zim/cache_speedup into lp:~jaap.karssenberg/zim/pyzim

Proposed by Alexander Mescheryakov
Status: Merged
Merged at revision: 789
Proposed branch: lp:~alexander-s-m/zim/cache_speedup
Merge into: lp:~jaap.karssenberg/zim/pyzim
Diff against target: 10 lines (+1/-0)
1 file modified
zim/index.py (+1/-0)
To merge this branch: bzr merge lp:~alexander-s-m/zim/cache_speedup
Reviewer            Review Type    Date Requested    Status
Jaap Karssenberg                                     Pending
Review via email: mp+272524@code.launchpad.net

Commit message

Tremendously speed up reindexing by disabling sqlite synchronous mode

By default, sqlite asks the operating system to ensure that all content is safely written to the disk surface before continuing. If the underlying filesystem is not mounted with the nobarrier option and resides on a rotational device, flushing index.db to disk becomes the bottleneck of a full reindex; speed might be about 3 pages per second.

Disabling sync makes reindexing 50-100 times faster in such cases.
The amount of data written to disk is reduced as well, which is desirable on SSD devices. On a test notebook with 430 pages, the amount of data written during a single reindex is reduced from 35MB to 300-400KB.

We don't need extra caution when dealing with a cache. Even if it gets corrupted (which is especially unlikely if we perform our operations fast), it can be regenerated at any time.
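
For illustration, here is a minimal sketch of what the pragma does on a connection and how to check that it took effect. `connect_fast` and `synchronous_level` are hypothetical helper names and are not part of zim; the only sqlite feature relied upon is the `PRAGMA synchronous` statement.

```python
import sqlite3


def connect_fast(path):
    """Open an sqlite database with synchronous mode turned off.

    Hypothetical helper, loosely modelled on the _connect() method
    touched by this branch.
    """
    db = sqlite3.connect(path, detect_types=sqlite3.PARSE_DECLTYPES)
    # Do not wait for the OS to flush data to the disk surface on commit.
    db.execute('PRAGMA synchronous=OFF;')
    db.row_factory = sqlite3.Row
    return db


def synchronous_level(db):
    """Return the current setting: 0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA."""
    return db.execute('PRAGMA synchronous;').fetchone()[0]
```

The setting applies to the connection, not to the database file, so it has to be issued each time the index is opened.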

Description of the change

A small change, but it tremendously speeds up reindexing.

I have a rather big main notebook with 400+ pages which I sync across several machines, and therefore I have to reindex it quite often.

And on many configurations reindexing takes a very long time. It was worst on a laptop with btrfs on an HDD: 2:10 (130 seconds)!!!

But disabling sqlite's default extra-cautious behavior makes reindexing *much* faster. With this patch, reindexing my notebook on the laptop takes just 2 seconds.

There is a tiny chance that index.db might become corrupted if the OS crashes during reindexing, but it is negligible, and even if it happens we can recreate index.db from scratch later at no cost.
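
To reproduce this kind of comparison on another machine, a rough benchmark could look like the sketch below. It is not the measurement reported here: `time_inserts`, the `pages` table and the commit-per-row pattern are assumptions chosen to imitate an indexer that commits after each page. Results depend heavily on the filesystem and device, and a temporary directory on tmpfs will show little or no difference.

```python
import os
import sqlite3
import tempfile
import time


def time_inserts(synchronous, n_pages=50):
    """Insert n_pages rows, committing after each one.

    Returns (elapsed_seconds, row_count). Hypothetical helper.
    """
    fd, path = tempfile.mkstemp(suffix='.db')
    os.close(fd)
    try:
        db = sqlite3.connect(path)
        db.execute('PRAGMA synchronous=%s;' % synchronous)
        db.execute('CREATE TABLE pages (name TEXT)')
        start = time.time()
        for i in range(n_pages):
            db.execute('INSERT INTO pages VALUES (?)', ('page%d' % i,))
            db.commit()  # each commit triggers a sync unless synchronous=OFF
        elapsed = time.time() - start
        count = db.execute('SELECT COUNT(*) FROM pages').fetchone()[0]
        db.close()
        return elapsed, count
    finally:
        os.remove(path)


if __name__ == '__main__':
    for mode in ('FULL', 'OFF'):
        seconds, rows = time_inserts(mode)
        print('%s: %d rows in %.3f s' % (mode, rows, seconds))
```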
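
The "recreate from scratch" argument can be sketched as follows. This is not how zim handles a damaged index; `open_cache` and the `rebuild` callback are hypothetical names, and `PRAGMA integrity_check` is used here only as one possible way to detect damage (it can be slow on a large database).

```python
import os
import sqlite3


def open_cache(path, rebuild):
    """Open the cache at path; if it is missing or damaged, recreate it.

    rebuild(db) is a caller-supplied function that repopulates an empty
    database and commits. Hypothetical helper, for illustration only.
    """
    is_new = not os.path.exists(path)
    db = sqlite3.connect(path)
    try:
        db.execute('PRAGMA synchronous=OFF;')
        result = db.execute('PRAGMA integrity_check;').fetchone()[0]
        healthy = (result == 'ok')
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a valid database.
        healthy = False
    if is_new or not healthy:
        # The cache holds nothing that cannot be regenerated: drop it.
        db.close()
        if os.path.exists(path):
            os.remove(path)
        db = sqlite3.connect(path)
        db.execute('PRAGMA synchronous=OFF;')
        rebuild(db)
    return db
```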

lp:~alexander-s-m/zim/cache_speedup updated
785. By Alexander Meshcheryakov <email address hidden>

Fixed indentation in previous commit

Preview Diff

=== modified file 'zim/index.py'
--- zim/index.py 2015-02-07 14:28:16 +0000
+++ zim/index.py 2015-09-26 22:49:12 +0000
@@ -526,6 +526,7 @@
     def _connect(self):
         self.db = sqlite3.connect(
             str(self.dbfile), detect_types=sqlite3.PARSE_DECLTYPES)
+        self.db.execute('PRAGMA synchronous=OFF;')
         self.db.row_factory = sqlite3.Row
         self.db_commit = DBCommitContext(self.db)
