Merge lp:~koschmieder/zim/zim into lp:~jaap.karssenberg/zim/pyzim
| Status: | Needs review |
|---|---|
| Proposed branch: | lp:~koschmieder/zim/zim |
| Merge into: | lp:~jaap.karssenberg/zim/pyzim |
| Diff against target: | 379 lines (+375/-0), 1 file modified: zim/plugins/quicksearch.py (+375/-0) |
| To merge this branch: | bzr merge lp:~koschmieder/zim/zim |
| Related bugs: | |

| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Jaap Karssenberg | | | Disapprove |

Review via email: mp+257936@code.launchpad.net
Commit message
Description of the change
My Zim plugin which is inspired by Tomboy/Gnote and provides a search pane which shows results as you type.
Lukas M. Koschmieder (koschmieder) wrote:
Hi Jaap,
thank you for your quick response.
I would also prefer to re-use the search module but I fear that it doesn't meet my requirements.
These are my main concerns:
First, I rate/weight search results by certain criteria and don't see how I could fit my rating system into your search logic.
Second, the key feature of my plug-in is that it shows search results as you type. Judging by how long the default search dialog takes to process a query, I suspect that the underlying logic is too slow/bloated for my needs.
Please let me know what you think. I will try to re-write my plug-in, if you're positive that my concerns are unjustified.
Regards,
Lukas
Jaap Karssenberg (jaap.karssenberg) wrote:
Hi Lukas,
With respect to the rating system, I'm open to improvements. Just specify
in more detail what you think should be different in the rating and we can
have a look at it. My objective is putting most relevant results on the
top, but I didn't optimize it much.
Search-as-you type for content search will indeed not work in the current
module. But loading the whole notebook in memory to solve that is not the
right way to go. You should either research using the SQLite database as a
search engine to achieve that, or think of optimizations like using
search-as-you-type for cached content like page names and search by word
for content. That could certainly work.
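To make the page-name half of that suggestion concrete, here is a minimal sketch of search-as-you-type against an SQLite table. It is an illustration only: the `pages` table, its `name` column, and the helpers `build_demo_index` and `find_page_names` are hypothetical and do not reflect the schema of Zim's real index database.

```python
import sqlite3

def build_demo_index(names):
    """Create an in-memory database with a hypothetical 'pages' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (name TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO pages (name) VALUES (?)", [(n,) for n in names])
    return conn

def find_page_names(conn, fragment, limit=20):
    """Return page names containing 'fragment' (LIKE ignores ASCII case)."""
    # Escape LIKE wildcards so the typed text is matched literally.
    escaped = fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT name FROM pages WHERE name LIKE ? ESCAPE '\\' ORDER BY name LIMIT ?",
        ("%" + escaped + "%", limit),
    )
    return [row[0] for row in rows]

conn = build_demo_index(["Home", "Projects:Zim", "Projects:Zim:Plugins", "Journal:2015"])
matches = find_page_names(conn, "zim")
```

Because the query touches only the name column and is capped with `LIMIT`, it can be re-run on every keystroke without reading any page content.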
Have you looked at the search language we use? If you use keywords for
searches, searching names will be really fast.
Regards,
Jaap
On Sat, May 2, 2015 at 12:05 AM, Lukas M. Koschmieder <email address hidden>
wrote:
> Hi Jaap,
>
> thank you for your quick response.
>
> I would also prefer to re-use the search module but I fear that it doesn't
> meet my requirements.
>
> These are my main concerns:
>
> First, I rate/weight search results by certain criteria and don't see how
> I could fit my rating system into your search logic.
>
> Second, the key feature of my plug-in is that it shows search results as
> you type. Judging by how long the default search dialog takes to process a
> query, I suspect that the underlying logic is too slow/bloated for my needs.
>
> Please let me know what you think. I will try to re-write my plug-in, if
> you're positive that my concerns are unjustified.
>
> Regards,
>
> Lukas
Lukas M. Koschmieder (koschmieder) wrote:
Hi Jaap,
If I understand you correctly, then your main issue is the fact that my plug-in loads the entire notebook in memory. Would you allow me to keep my own search function / rating system if I got rid of the problematic page buffer / results cache?
> With respect to the rating system, I'm open to improvements. Just specify
> in more detail what you think should be different in the rating and we can
> have a look at it. My objective is putting most relevant results on the
> top, but I didn't optimize it much.
My rating system is inspired by Tomboy/Gnote. The search function prioritizes "title matches" over "path matches" over "page matches". This concept may sound simple, but it has proven useful in my daily routine. However, I'm not sure whether this method is suited to improving our default results system.
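The ordering described here can be sketched independently of Zim. The two constants mirror the ones in the proposed plug-in; the function names `score_page` and `rank`, and the `(name, text)` tuple layout, are assumptions made for this illustration.

```python
import sys

# Title matches outrank path matches, which outrank any text match count.
TITLE_MATCH = sys.maxsize
PATH_MATCH = TITLE_MATCH - 1

def score_page(name, text, query):
    """Return a score for one page, or None if nothing matches.

    'name' is a colon-separated page path such as 'Projects:Zim';
    its last component is treated as the title.
    """
    query = query.lower()
    title = name.rsplit(":", 1)[-1].lower()
    if query in title:
        return TITLE_MATCH
    if query in name.lower():
        return PATH_MATCH
    count = text.lower().count(query)
    return count if count > 0 else None

def rank(pages, query):
    """Sort (name, text) pairs by descending score, dropping non-matches."""
    scored = [(name, score_page(name, text, query)) for name, text in pages]
    hits = [(name, s) for name, s in scored if s is not None]
    return [name for name, s in sorted(hits, key=lambda item: item[1], reverse=True)]

pages = [
    ("Journal:Today", "notes about zim and more zim"),
    ("Zim:Roadmap", "plans"),
    ("Projects:Zim", "overview"),
]
ranking = rank(pages, "zim")
```

A page whose text mentions the query many times still sorts below any title or path match, which is the behaviour the paragraph above describes.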
With respect to improvements, it would be nice if ParseTree could also provide a case-sensitive count function.
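As a rough idea of what a count function with a case-sensitivity switch could look like, the sketch below walks a plain `xml.etree.ElementTree` tree. It is not Zim's `ParseTree` API; `count_in_tree` and its `case_sensitive` parameter are hypothetical names.

```python
import xml.etree.ElementTree as ET

def count_in_tree(root, query, case_sensitive=True):
    """Count occurrences of 'query' in the text and tail of every element."""
    if not query:
        return 0
    needle = query if case_sensitive else query.lower()
    total = 0
    for element in root.iter():
        # Text lives both inside an element (.text) and after it (.tail).
        for chunk in (element.text, element.tail):
            if chunk:
                haystack = chunk if case_sensitive else chunk.lower()
                total += haystack.count(needle)
    return total

tree = ET.fromstring("<page><p>Zim is a wiki. <b>zim</b> rocks, ZIM too.</p></page>")
exact = count_in_tree(tree, "zim", case_sensitive=True)
loose = count_in_tree(tree, "zim", case_sensitive=False)
```

Keeping both modes behind one public function would spare callers from reaching into private tree attributes.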
> Have you looked at the search language we use? If you use keywords for
> searches, searching names will be really fast.
My plug-in iterates over all pages in the notebook, for each page greps its parse tree and then uses the count function to get the number of matches. Is there a faster alternative?
Unmerged revisions
- 758. By Lukas M. Koschmieder
  Added plug-in
Preview Diff
=== added file 'zim/plugins/quicksearch.py'
--- zim/plugins/quicksearch.py 1970-01-01 00:00:00 +0000
+++ zim/plugins/quicksearch.py 2015-04-30 17:05:53 +0000
@@ -0,0 +1,375 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# (c) 2015 Lukas M. Koschmieder (koschmieder@gmx.net)
+
+import gtk
+import logging
+import sys
+import gobject
+
+from zim.plugins import extends, PluginClass, WindowExtension
+from zim.notebook import Path
+from zim.gui.widgets import BOTTOM_PANE, PANE_POSITIONS, ScrolledWindow
+from zim.signals import ConnectorMixin
+
+logger = logging.getLogger('zim.plugins.quicksearch')
+
+class QuickSearchPlugin(PluginClass):
+    plugin_info = {
+        'name': _('Quick Search'),
+        'description': _('''This plugin provides a search side pane which shows search results as you type (similar to Tomboy or Google).'''),
+        'author': 'Lukas M. Koschmieder (koschmieder@gmx.net)',
+        #'help': 'Plugins:Quick Search',
+    }
+
+    plugin_preferences = (
+        ('pane', 'choice', _('Position in the window'), BOTTOM_PANE, PANE_POSITIONS),
+    )
+
+@extends('MainWindow')
+class QuickSearchWindowExtension(WindowExtension):
+
+    def __init__(self, plugin, window):
+        '''
+        Plug-in constructor.
+
+        @plugin QuickSearchPlugin object
+        @window zim.plugins.MainWindow object
+        '''
+
+        WindowExtension.__init__(self, plugin, window)
+
+        self.widget = QuickSearchWidget(self.window.ui.notebook, self.uistate, self.window.ui)
+
+        self.on_preferences_changed(plugin.preferences)
+        self.connectto(plugin.preferences, 'changed', self.on_preferences_changed)
+
+    def on_preferences_changed(self, preferences):
+        '''
+        This function is called when the user edits our plug-in preferences.
+        It re-adds our plug-in widget to Zim according to these new preferences.
+
+        @preferences zim.dicts.ConfigDict object
+        '''
+
+        if self.widget is None:
+            return
+
+        try:
+            self.window.remove(self.widget)
+        except ValueError:
+            pass
+
+        self.window.add_tab(_('Search'), self.widget, preferences['pane'])
+
+    def teardown(self):
+        '''
+        Plug-in destructor.
+        '''
+
+        self.window.remove(self.widget)
+        self.widget.disconnect_all()
+        self.widget = None
+
+class QuickSearchWidget(ConnectorMixin, gtk.VBox):
+    '''
+    This class is the main part of our plug-in.
+    It mainly comprises of the following GUI elements and the underlying logic:
+    - a gtk.Entry (search field)
+    - a gtk.TreeView (results list)
+    - a gtk.Label (status text)
+
+    @todo Toggle between "search result view" (if a keyword has been entered)
+    and "navigation view" (if the search field is empty).
+    '''
+
+    TITLE_MATCH = int(sys.maxsize)
+    PATH_MATCH = TITLE_MATCH - 1
+
+    '''Persistent UI state dictionary IDs'''
+    UISTATE_CASE_SENSITIVE = "case_sensitive"
+
+    def __init__(self, notebook, uistate, ui):
+        '''
+        Constructor.
+
+        @notebook zim.notebook object
+        @uistate zim.dicts.ConfigDict object
+        @ui zim.gui.GtkInterface object
+        '''
+
+        gtk.VBox.__init__(self)
+
+        self.notebook = notebook
+        self.ui = ui
+        self.uistate = uistate
+
+        '''States'''
+        self.is_searching = False
+        self.has_query_changed = False
+        self.has_index_changed = True
+
+        '''Page buffer for faster page iteration'''
+        self.buffered_pages = list()
+        '''Results buffer to quickly restore previous search results'''
+        self.buffered_results = dict()
+
+        '''Create GUI elements'''
+        self.query_entry = gtk.Entry()
+        self.options_hbox = gtk.HBox()
+        self.tomboy_mode = gtk.CheckButton("Show title/path matches")
+        self.case_sensitive_checkbox = gtk.CheckButton("Case-sensitive")
+        self.options_hbox.pack_start(self.tomboy_mode, expand=False)
+        self.options_hbox.pack_start(self.case_sensitive_checkbox, expand=False)
+        self.results_treeview = QuickSearchTreeView(self.ui)
+        self.results_scrolledwindow = ScrolledWindow(self.results_treeview)
+        self.status_label = gtk.Label("")
+
+        '''Add GUI elements to widget'''
+        self.pack_start(self.query_entry, expand=False)
+        self.pack_start(self.options_hbox, expand=False)
+        self.pack_start(self.results_scrolledwindow, expand=True)
+        self.pack_start(self.status_label, expand=False)
+
+        '''Edit GUI elements'''
+        self.results_scrolledwindow.set_sensitive(False)
+        self.tomboy_mode.set_active(True)
+        self.status_label.set_text("")
+
+        '''Callback connections'''
+        self.query_entry.connect_object('changed', self.__class__.on_query_changed, self)
+        self.case_sensitive_checkbox.connect_object('clicked', self.__class__.on_case_sensitive_changed, self)
+        self.tomboy_mode.connect_object('clicked', self.__class__.on_tomboy_mode_changed, self)
+        self.connectto_all(self.notebook.index, (
+            ('page-inserted', self.on_index_changed),
+            ('page-deleted', self.on_index_changed),
+        ))
+
+    def on_tomboy_mode_changed(self):
+        self.on_query_changed()
+
+    def on_case_sensitive_changed(self):
+        self.on_query_changed()
+
+    def on_query_changed(self):
+        '''
+        This function is called when the user modifies the search field.
+        It (re)starts the search function.
+        '''
+
+        logging.debug("Callback: Query changed!")
+
+        if self.is_searching:
+            self.has_query_changed = True
+        else:
+            gobject.idle_add(self.search)
+
+        return True
+
+    def on_index_changed(self, path, signal):
+        '''
+        This function is called when a page is added/removed to/from the notebook.
+        It tells our widget to update its page buffer.
+
+        @path zim.index.Index object
+        @signal zim.index.IndexPath object
+        '''
+
+        logging.debug("Callback: Index changed!")
+
+        self.has_index_changed = True
+
+        return True
+
+    def _update_buffered_pages(self):
+        '''
+        This function synchronizes our page buffer with the notebook.
+
+        @todo Optimize function to only add/remove new/deleted page
+        '''
+
+        self.has_index_changed = False
+
+        logging.debug("Updating page buffer")
+
+        self.buffered_pages = list()
+        for page in self.notebook.walk():
+            self.buffered_pages.append(page)
+
+    def search(self):
+        '''
+        This is our search function.
+        '''
+
+        self.is_searching = True
+
+        is_case_sensitive = self.case_sensitive_checkbox.get_active()
+        is_tomboy_mode = self.tomboy_mode.get_active()
+
+        if is_case_sensitive:
+            query = self.query_entry.get_text()
+        else:
+            query = self.query_entry.get_text().lower()
+
+        logging.debug("Starting new query: " + query)
+
+        '''Skip search if query is empty
+        @todo Enter "navigation tree mode"
+        '''
+        if(query == ""):
+            self.results_treeview.set_model(gtk.ListStore(str, str, int))
+            self.results_scrolledwindow.set_sensitive(False)
+            self.status_label.set_text("Total: " + str(len(self.buffered_pages)) + " pages")
+            self.is_searching = False
+            return False
+
+        '''Update page buffer and clear results buffer if page index has changed'''
+        if self.has_index_changed:
+            self.has_index_changed = False
+            self.buffered_results.clear()
+            self._update_buffered_pages()
+        buffered_pages = self.buffered_pages
+
+        '''Try to restore buffered results'''
+        liststore = self.buffered_results.get((is_case_sensitive, is_tomboy_mode, query))
+        if liststore != None:
+            self.results_treeview.set_model(liststore)
+            results_count = len(liststore)
+            if results_count == 0:
+                self.results_scrolledwindow.set_sensitive(False)
+            else:
+                self.results_scrolledwindow.set_sensitive(True)
+            self.status_label.set_text("Matches: " + str(results_count) + " pages")
+            logging.debug("Query results restored: " + query)
+            self.is_searching = False
+            return False
+
+        '''Start search'''
+        results = list()
+
+        for page in buffered_pages:
+
+            while gtk.events_pending():
+                gtk.main_iteration_do(False)
+
+            if self.has_query_changed:
+                self.has_query_changed = False
+                logging.debug("Stopping out-dated query: " + query)
+                return True
+
+            '''Title match'''
+            if is_tomboy_mode:
+
+                if is_case_sensitive:
+                    page_basename = page.basename
+                else:
+                    page_basename = page.basename.lower()
+
+                if query in page_basename:
+                    results.append([page.name, "Title match", self.TITLE_MATCH])
+                    continue
+
+            '''Path match'''
+            if is_tomboy_mode:
+
+                if is_case_sensitive:
+                    page_name = page.name
+                else:
+                    page_name = page.name.lower()
+
+                if query in page_name:
+                    results.append([page.name, "Path match", self.PATH_MATCH])
+                    continue
+
+            '''Text match'''
+            try:
+                parsetree = page.get_parsetree()
+                if parsetree is None:
+                    continue
+
+                query_count = 0
+
+                if is_case_sensitive:
+                    query_count = parsetree.count(query)
+                else:
+                    for element in parsetree._etree.getiterator(): # Sorry!
+                        if element.text:
+                            query_count += element.text.lower().count(query)
+                        if element.tail:
+                            query_count += element.tail.lower().count(query)
+
+                if query_count == 1:
+                    results.append([page.name, str(query_count) + " match", query_count])
+                    continue
+                if query_count > 0:
+                    results.append([page.name, str(query_count) + " matches", query_count])
+                    continue
+            except:
+                pass
+
+        '''Sort results by 3rd column ("query score")'''
+        results.sort(key=lambda x: int(x[2]), reverse=True)
+
+        liststore = gtk.ListStore(str, str, int)
+        for r in results:
+            liststore.append(r)
+        self.results_treeview.set_model(liststore)
+
+        '''Add result to results buffer'''
+        self.buffered_results[(is_case_sensitive, is_tomboy_mode, query)] = liststore
+
+        results_count = len(results)
+        if results_count == 0:
+            self.results_scrolledwindow.set_sensitive(False)
+        else:
+            self.results_scrolledwindow.set_sensitive(True)
+
+        self.status_label.set_text("Matches: " + str(results_count) + " pages")
+
+        logging.debug("Query has been processed: " + query)
+
+        self.is_searching = False
+
+        return False
+
+class QuickSearchTreeView(gtk.TreeView):
+    '''
+    This class represents our results list GUI element.
+    '''
+
+    def __init__(self, ui):
+        '''
+        Constructor.
+
+        @ui zim.gui.GtkInterface object: Used to open the corresponding page if the users clicks on a result
+        '''
+        gtk.TreeView.__init__(self, gtk.ListStore(str, str, int)) # page name, match description, match score
+
+        self.ui = ui
+
+        p_col = gtk.TreeViewColumn('Page', gtk.CellRendererText(), text=0) # page name
+        m_col = gtk.TreeViewColumn('Matches', gtk.CellRendererText(), text=1) # match description
+
+        p_col.set_sort_column_id(0) # page name
+        m_col.set_sort_column_id(2) # match score
+
+        p_col.set_expand(True)
+        m_col.set_expand(False)
+
+        self.append_column(p_col)
+        self.append_column(m_col)
+
+        self.set_search_column(0)
+        self.set_search_column(1)
+
+        self.connect('row-activated', self._do_open_page)
+
+    def _do_open_page(self, view, path, col):
+        '''
+        This callback function is called when the user clicks on a row in the results list.
+        It opens the corresponding page.
+        '''
+
+        page = Path(self.get_model()[path][0].decode('utf-8'))
+        self.ui.open_page(page)
+
Hi Lukas,
I'm very sorry, but I'm afraid I cannot accept this functionality in its current form. The big problem I see is that you implement your own search method, rather than re-using the search module and the logic from the dialog. Specifically, you load all pages into memory and then search on that - this will break when you have a large notebook on a small computer. The search module in zim uses the cached index and greps through pages one by one (the speed of the grep is still to be improved - I'm planning for that).
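The difference between buffering every page and going through pages one by one can be sketched with a generator. This is not the code of zim's search module; `search_one_by_one` and `demo_pages` are hypothetical names, and the demo generator merely stands in for reading pages from disk.

```python
import heapq

def search_one_by_one(page_source, query, limit=10):
    """Scan pages lazily and keep only the 'limit' best (name, count) hits.

    'page_source' is any iterable yielding (name, text) pairs; with a
    generator, each page's text can be discarded before the next is read.
    """
    needle = query.lower()
    if not needle:
        return []
    hits = ((text.lower().count(needle), name) for name, text in page_source)
    best = heapq.nlargest(limit, (hit for hit in hits if hit[0] > 0))
    return [(name, count) for count, name in best]

def demo_pages():
    """Hypothetical stand-in for reading pages from disk one at a time."""
    yield "Home", "Welcome to the notebook"
    yield "Projects:Zim", "zim plugin ideas, zim todo"
    yield "Journal:2015", "tried the zim search dialog"

results = search_one_by_one(demo_pages(), "zim", limit=2)
```

Memory use is then bounded by the size of one page plus the `limit` retained hits, regardless of how large the notebook is.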
My suggestion would be that you make a new version where you re-use the widgets from the search dialog and embed these in the pane. This is trivial. Then you make any improvements for the interface on those widgets, such that both the dialog and the embedded pane benefit.
Regards,
Jaap