Zim

Merge lp:~koschmieder/zim/zim into lp:~jaap.karssenberg/zim/pyzim

Proposed by Lukas M. Koschmieder
Status: Needs review
Proposed branch: lp:~koschmieder/zim/zim
Merge into: lp:~jaap.karssenberg/zim/pyzim
Diff against target: 379 lines (+375/-0)
1 file modified
zim/plugins/quicksearch.py (+375/-0)
To merge this branch: bzr merge lp:~koschmieder/zim/zim
Reviewer: Jaap Karssenberg
Review status: Disapprove
Review via email: mp+257936@code.launchpad.net

Description of the change

My Zim plugin which is inspired by Tomboy/Gnote and provides a search pane which shows results as you type.

Jaap Karssenberg (jaap.karssenberg) wrote :

Hi Lukas,

I'm very sorry, but I'm afraid I cannot accept this functionality in its current form. The big problem I see is that you implement your own search method rather than re-using the search module and the logic from the dialog. Specifically, you load all pages into memory and then search on that; this will break when you have a large notebook on a small computer. The search module in zim uses the cached index and greps through pages one by one (the speed of the grep is still to be improved; I'm planning for that).
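
To illustrate the difference between searching a preloaded buffer and grepping pages one by one, here is a minimal standalone sketch. It does not use zim's search module; the `notebook` dict, the `load_text` callables and `iter_matches` are hypothetical stand-ins chosen for the example. Only one page's text is held at a time.

```python
def iter_matches(pages, query):
    """Yield (page name, match count), loading one page at a time.

    `pages` is any iterable of (name, load_text) pairs, where load_text
    is a callable returning the page text. Both are hypothetical
    stand-ins and not part of zim's API.
    """
    needle = query.lower()
    for name, load_text in pages:
        text = load_text()  # only this page's text is in memory now
        count = text.lower().count(needle)
        if count:
            yield name, count


# Hypothetical notebook used only for this illustration.
notebook = {
    "Home": "Welcome to my notebook",
    "Projects:Zim": "Zim plugin notes. Zim search ideas.",
    "Journal": "Nothing relevant here",
}

# A generator of lazy loaders, so no page text is read up front.
pages = ((name, (lambda n=name: notebook[n])) for name in sorted(notebook))
results = list(iter_matches(pages, "zim"))
print(results)
```

Because `iter_matches` is a generator, a caller could also stop consuming it as soon as enough results have arrived.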

My suggestion would be that you make a new version where you re-use the widgets from the search dialog and embed these in the pane. This is trivial. Then you make any improvements for the interface on those widgets, such that both the dialog and the embedded pane benefit.

Regards,

Jaap

review: Disapprove
Lukas M. Koschmieder (koschmieder) wrote :

Hi Jaap,

thank you for your quick response.

I would also prefer to re-use the search module but I fear that it doesn't meet my requirements.

These are my main concerns:

First, I rate/weight search results by certain criteria and don't see how I could fit my rating system into your search logic.

Second, the key feature of my plug-in is that it shows search results as you type. Judging by how long the default search dialog takes to process a query, I suspect that the underlying logic is too slow/bloated for my needs.

Please let me know what you think. I will try to re-write my plug-in, if you're positive that my concerns are unjustified.

Regards,

Lukas

Jaap Karssenberg (jaap.karssenberg) wrote :

Hi Lukas,

With respect to the rating system, I'm open to improvements. Just specify
in more detail what you think should be different in the rating and we can
have a look at it. My objective is putting most relevant results on the
top, but I didn't optimize it much.

Search-as-you-type for content search will indeed not work in the current
module. But loading the whole notebook in memory to solve that is not the
right way to go. You should either research using the SQLite database as a
search engine to achieve that, or think of optimizations like using
search-as-you-type for cached content like page names and search by word
for content. That could certainly work.
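
One way the "search-as-you-type on page names" idea might look with SQLite is sketched below. The `pages` table, its columns and the `search_names` helper are assumptions made up for this example; they are not zim's actual index schema. The sketch uses a plain `LIKE` query, which SQLite evaluates case-insensitively for ASCII characters by default.

```python
import sqlite3

# Hypothetical schema, not zim's real index layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (name TEXT PRIMARY KEY, basename TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [
        ("Home", "Home"),
        ("Projects:Zim", "Zim"),
        ("Projects:Zim:Plugins", "Plugins"),
        ("Journal:2015", "2015"),
    ],
)


def search_names(conn, fragment, limit=20):
    """Return page names whose basename contains the typed fragment."""
    # Escape LIKE wildcards so typed '%' or '_' are matched literally.
    escaped = (
        fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT name FROM pages WHERE basename LIKE ? ESCAPE '\\' "
        "ORDER BY name LIMIT ?",
        ("%" + escaped + "%", limit),
    )
    return [row[0] for row in rows]


print(search_names(conn, "zi"))
print(search_names(conn, "pl"))
```

A query like this could run on every keystroke, while the slower content search would only run once a complete word has been entered.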

Have you looked at the search language we use? If you use keywords for
searches, searching names will be really fast.

Regards,

Jaap

On Sat, May 2, 2015 at 12:05 AM, Lukas M. Koschmieder <email address hidden>
wrote:

> Hi Jaap,
>
> thank you for your quick response.
>
> I would also prefer to re-use the search module but I fear that it doesn't
> meet my requirements.
>
> These are my main concerns:
>
> First, I rate/weight search results by certain criteria and don't see how
> I could fit my rating system into your search logic.
>
> Second, the key feature of my plug-in is that it shows search results as
> you type. Judging by how long the default search dialog takes to process a
> query, I suspect that the underlying logic is too slow/bloated for my needs.
>
> Please let me know what you think. I will try to re-write my plug-in, if
> you're positive that my concerns are unjustified.
>
> Regards,
>
> Lukas

Lukas M. Koschmieder (koschmieder) wrote :

Hi Jaap,

If I understand you correctly, then your main issue is the fact that my plug-in loads the entire notebook in memory. Would you allow me to keep my own search function / rating system if I got rid of the problematic page buffer / results cache?

> With respect to the rating system, I'm open to improvements. Just specify
> in more detail what you think should be different in the rating and we can
> have a look at it. My objective is putting most relevant results on the
> top, but I didn't optimize it much.

My rating system is inspired by Tomboy/Gnote. The search function prioritizes "title matches" over "path matches" over "page matches". This concept may sound simple but it has proven to be useful in my daily routine. However, I'm not sure if this method is suited to improve our default results system.
With respect to improvements, it would be nice if ParseTree could also provide a case-sensitive count function.
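
The title-over-path-over-page priority described above can be sketched as a small standalone ranking function. This is only an illustration: `rank`, `ranked_search` and the tier constants are hypothetical names, and the sketch sorts on a `(tier, count)` tuple instead of the `sys.maxsize`-based scores used in the plug-in.

```python
# Hypothetical tier constants; a higher tier sorts first.
TITLE_MATCH, PATH_MATCH, TEXT_MATCH = 2, 1, 0


def rank(name, text, query):
    """Return a (tier, count) sort key, or None if the page does not match."""
    needle = query.lower()
    basename = name.rsplit(":", 1)[-1]  # last part of a "A:B:C" page name
    if needle in basename.lower():
        return (TITLE_MATCH, 0)
    if needle in name.lower():
        return (PATH_MATCH, 0)
    count = text.lower().count(needle)
    if count:
        return (TEXT_MATCH, count)
    return None


def ranked_search(pages, query):
    """Return matching page names, best match first."""
    scored = []
    for name, text in pages:
        key = rank(name, text, query)
        if key is not None:
            scored.append((key, name))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _key, name in scored]


pages = [
    ("Journal", "zim zim zim"),
    ("Zim:Plugins", "x"),
    ("Projects:Zim", "y"),
    ("Home", "about zim"),
]
print(ranked_search(pages, "zim"))
```

Within the text tier, pages are ordered by the number of occurrences, which mirrors how the plug-in uses the match count as its score.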

> Have you looked at the search language we use? If you use keywords for
> searches, searching names will be really fast.

My plug-in iterates over all pages in the notebook, for each page greps its parse tree and then uses the count function to get the number of matches. Is there a faster alternative?
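
For reference, the per-page counting step described above can be reproduced outside zim with the standard library. The sketch below walks a plain `xml.etree.ElementTree` tree rather than a zim ParseTree; the sample markup and the `count_matches` helper are assumptions made for the example, and it is not claimed to be faster than the plug-in's loop.

```python
import xml.etree.ElementTree as ET


def count_matches(root, query, case_sensitive=False):
    """Count occurrences of query in the text and tail of every element."""
    if not case_sensitive:
        query = query.lower()
    total = 0
    for element in root.iter():
        for chunk in (element.text, element.tail):
            if chunk:
                if not case_sensitive:
                    chunk = chunk.lower()
                total += chunk.count(query)
    return total


# Hypothetical page content, loosely shaped like a parse tree.
tree = ET.fromstring(
    "<zim-tree><h>Zim notes</h>"
    "<p>Search in <b>zim</b> is fast. ZIM!</p></zim-tree>"
)
print(count_matches(tree, "zim"))
print(count_matches(tree, "zim", case_sensitive=True))
```

Note that a match split across two elements (for example, half of a word inside a bold tag) is not counted by this approach.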

Unmerged revisions

758. By Lukas M. Koschmieder

Added plug-in

Preview Diff

=== added file 'zim/plugins/quicksearch.py'
--- zim/plugins/quicksearch.py 1970-01-01 00:00:00 +0000
+++ zim/plugins/quicksearch.py 2015-04-30 17:05:53 +0000
@@ -0,0 +1,375 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+# (c) 2015 Lukas M. Koschmieder (koschmieder@gmx.net)
+
+import gtk
+import logging
+import sys
+import gobject
+
+from zim.plugins import extends, PluginClass, WindowExtension
+from zim.notebook import Path
+from zim.gui.widgets import BOTTOM_PANE, PANE_POSITIONS, ScrolledWindow
+from zim.signals import ConnectorMixin
+
+logger = logging.getLogger('zim.plugins.quicksearch')
+
+class QuickSearchPlugin(PluginClass):
+    plugin_info = {
+        'name': _('Quick Search'),
+        'description': _('''This plugin provides a search side pane which shows search results as you type (similar to Tomboy or Google).'''),
+        'author': 'Lukas M. Koschmieder (koschmieder@gmx.net)',
+        #'help': 'Plugins:Quick Search',
+    }
+
+    plugin_preferences = (
+        ('pane', 'choice', _('Position in the window'), BOTTOM_PANE, PANE_POSITIONS),
+    )
+
+@extends('MainWindow')
+class QuickSearchWindowExtension(WindowExtension):
+
+    def __init__(self, plugin, window):
+        '''
+        Plug-in constructor.
+
+        @plugin QuickSearchPlugin object
+        @window zim.plugins.MainWindow object
+        '''
+
+        WindowExtension.__init__(self, plugin, window)
+
+        self.widget = QuickSearchWidget(self.window.ui.notebook, self.uistate, self.window.ui)
+
+        self.on_preferences_changed(plugin.preferences)
+        self.connectto(plugin.preferences, 'changed', self.on_preferences_changed)
+
+    def on_preferences_changed(self, preferences):
+        '''
+        This function is called when the user edits our plug-in preferences.
+        It re-adds our plug-in widget to Zim according to these new preferences.
+
+        @preferences zim.dicts.ConfigDict object
+        '''
+
+        if self.widget is None:
+            return
+
+        try:
+            self.window.remove(self.widget)
+        except ValueError:
+            pass
+
+        self.window.add_tab(_('Search'), self.widget, preferences['pane'])
+
+    def teardown(self):
+        '''
+        Plug-in destructor.
+        '''
+
+        self.window.remove(self.widget)
+        self.widget.disconnect_all()
+        self.widget = None
+
+class QuickSearchWidget(ConnectorMixin, gtk.VBox):
+    '''
+    This class is the main part of our plug-in.
+    It mainly comprises of the following GUI elements and the underlying logic:
+    - a gtk.Entry (search field)
+    - a gtk.TreeView (results list)
+    - a gtk.Label (status text)
+
+    @todo Toggle between "search result view" (if a keyword has been entered)
+    and "navigation view" (if the search field is empty).
+    '''
+
+    TITLE_MATCH = int(sys.maxsize)
+    PATH_MATCH = TITLE_MATCH - 1
+
+    '''Persistent UI state dictionary IDs'''
+    UISTATE_CASE_SENSITIVE = "case_sensitive"
+
+    def __init__(self, notebook, uistate, ui):
+        '''
+        Constructor.
+
+        @notebook zim.notebook object
+        @uistate zim.dicts.ConfigDict object
+        @ui zim.gui.GtkInterface object
+        '''
+
+        gtk.VBox.__init__(self)
+
+        self.notebook = notebook
+        self.ui = ui
+        self.uistate = uistate
+
+        '''States'''
+        self.is_searching = False
+        self.has_query_changed = False
+        self.has_index_changed = True
+
+        '''Page buffer for faster page iteration'''
+        self.buffered_pages = list()
+        '''Results buffer to quickly restore previous search results'''
+        self.buffered_results = dict()
+
+        '''Create GUI elements'''
+        self.query_entry = gtk.Entry()
+        self.options_hbox = gtk.HBox()
+        self.tomboy_mode = gtk.CheckButton("Show title/path matches")
+        self.case_sensitive_checkbox = gtk.CheckButton("Case-sensitive")
+        self.options_hbox.pack_start(self.tomboy_mode, expand=False)
+        self.options_hbox.pack_start(self.case_sensitive_checkbox, expand=False)
+        self.results_treeview = QuickSearchTreeView(self.ui)
+        self.results_scrolledwindow = ScrolledWindow(self.results_treeview)
+        self.status_label = gtk.Label("")
+
+        '''Add GUI elements to widget'''
+        self.pack_start(self.query_entry, expand=False)
+        self.pack_start(self.options_hbox, expand=False)
+        self.pack_start(self.results_scrolledwindow, expand=True)
+        self.pack_start(self.status_label, expand=False)
+
+        '''Edit GUI elements'''
+        self.results_scrolledwindow.set_sensitive(False)
+        self.tomboy_mode.set_active(True)
+        self.status_label.set_text("")
+
+        '''Callback connections'''
+        self.query_entry.connect_object('changed', self.__class__.on_query_changed, self)
+        self.case_sensitive_checkbox.connect_object('clicked', self.__class__.on_case_sensitive_changed, self)
+        self.tomboy_mode.connect_object('clicked', self.__class__.on_tomboy_mode_changed, self)
+        self.connectto_all(self.notebook.index, (
+            ('page-inserted', self.on_index_changed),
+            ('page-deleted', self.on_index_changed),
+        ))
+
+    def on_tomboy_mode_changed(self):
+        self.on_query_changed()
+
+    def on_case_sensitive_changed(self):
+        self.on_query_changed()
+
+    def on_query_changed(self):
+        '''
+        This function is called when the user modifies the search field.
+        It (re)starts the search function.
+        '''
+
+        logging.debug("Callback: Query changed!")
+
+        if self.is_searching:
+            self.has_query_changed = True
+        else:
+            gobject.idle_add(self.search)
+
+        return True
+
+    def on_index_changed(self, path, signal):
+        '''
+        This function is called when a page is added/removed to/from the notebook.
+        It tells our widget to update its page buffer.
+
+        @path zim.index.Index object
+        @signal zim.index.IndexPath object
+        '''
+
+        logging.debug("Callback: Index changed!")
+
+        self.has_index_changed = True
+
+        return True
+
+    def _update_buffered_pages(self):
+        '''
+        This function synchronizes our page buffer with the notebook.
+
+        @todo Optimize function to only add/remove new/deleted page
+        '''
+
+        self.has_index_changed = False
+
+        logging.debug("Updating page buffer")
+
+        self.buffered_pages = list()
+        for page in self.notebook.walk():
+            self.buffered_pages.append(page)
+
+    def search(self):
+        '''
+        This is our search function.
+        '''
+
+        self.is_searching = True
+
+        is_case_sensitive = self.case_sensitive_checkbox.get_active()
+        is_tomboy_mode = self.tomboy_mode.get_active()
+
+        if is_case_sensitive:
+            query = self.query_entry.get_text()
+        else:
+            query = self.query_entry.get_text().lower()
+
+        logging.debug("Starting new query: " + query)
+
+        '''Skip search if query is empty
+        @todo Enter "navigation tree mode"
+        '''
+        if(query == ""):
+            self.results_treeview.set_model(gtk.ListStore(str, str, int))
+            self.results_scrolledwindow.set_sensitive(False)
+            self.status_label.set_text("Total: " + str(len(self.buffered_pages)) + " pages")
+            self.is_searching = False
+            return False
+
+        '''Update page buffer and clear results buffer if page index has changed'''
+        if self.has_index_changed:
+            self.has_index_changed = False
+            self.buffered_results.clear()
+            self._update_buffered_pages()
+        buffered_pages = self.buffered_pages
+
+        '''Try to restore buffered results'''
+        liststore = self.buffered_results.get((is_case_sensitive, is_tomboy_mode, query))
+        if liststore != None:
+            self.results_treeview.set_model(liststore)
+            results_count = len(liststore)
+            if results_count == 0:
+                self.results_scrolledwindow.set_sensitive(False)
+            else:
+                self.results_scrolledwindow.set_sensitive(True)
+            self.status_label.set_text("Matches: " + str(results_count) + " pages")
+            logging.debug("Query results restored: " + query)
+            self.is_searching = False
+            return False
+
+        '''Start search'''
+        results = list()
+
+        for page in buffered_pages:
+
+            while gtk.events_pending():
+                gtk.main_iteration_do(False)
+
+            if self.has_query_changed:
+                self.has_query_changed = False
+                logging.debug("Stopping out-dated query: " + query)
+                return True
+
+            '''Title match'''
+            if is_tomboy_mode:
+
+                if is_case_sensitive:
+                    page_basename = page.basename
+                else:
+                    page_basename = page.basename.lower()
+
+                if query in page_basename:
+                    results.append([page.name, "Title match", self.TITLE_MATCH])
+                    continue
+
+            '''Path match'''
+            if is_tomboy_mode:
+
+                if is_case_sensitive:
+                    page_name = page.name
+                else:
+                    page_name = page.name.lower()
+
+                if query in page_name:
+                    results.append([page.name, "Path match", self.PATH_MATCH])
+                    continue
+
+            '''Text match'''
+            try:
+                parsetree = page.get_parsetree()
+                if parsetree is None:
+                    continue
+
+                query_count = 0
+
+                if is_case_sensitive:
+                    query_count = parsetree.count(query)
+                else:
+                    for element in parsetree._etree.getiterator(): # Sorry!
+                        if element.text:
+                            query_count += element.text.lower().count(query)
+                        if element.tail:
+                            query_count += element.tail.lower().count(query)
+
+                if query_count == 1:
+                    results.append([page.name, str(query_count) + " match", query_count])
+                    continue
+                if query_count > 0:
+                    results.append([page.name, str(query_count) + " matches", query_count])
+                    continue
+            except:
+                pass
+
+        '''Sort results by 3rd column ("query score")'''
+        results.sort(key=lambda x: int(x[2]), reverse=True)
+
+        liststore = gtk.ListStore(str, str, int)
+        for r in results:
+            liststore.append(r)
+        self.results_treeview.set_model(liststore)
+
+        '''Add result to results buffer'''
+        self.buffered_results[(is_case_sensitive, is_tomboy_mode, query)] = liststore
+
+        results_count = len(results)
+        if results_count == 0:
+            self.results_scrolledwindow.set_sensitive(False)
+        else:
+            self.results_scrolledwindow.set_sensitive(True)
+
+        self.status_label.set_text("Matches: " + str(results_count) + " pages")
+
+        logging.debug("Query has been processed: " + query)
+
+        self.is_searching = False
+
+        return False
+
+class QuickSearchTreeView(gtk.TreeView):
+    '''
+    This class represents our results list GUI element.
+    '''
+
+    def __init__(self, ui):
+        '''
+        Constructor.
+
+        @ui zim.gui.GtkInterface object: Used to open the corresponding page if the users clicks on a result
+        '''
+        gtk.TreeView.__init__(self, gtk.ListStore(str, str, int)) # page name, match description, match score
+
+        self.ui = ui
+
+        p_col = gtk.TreeViewColumn('Page', gtk.CellRendererText(), text=0) # page name
+        m_col = gtk.TreeViewColumn('Matches', gtk.CellRendererText(), text=1) # match description
+
+        p_col.set_sort_column_id(0) # page name
+        m_col.set_sort_column_id(2) # match score
+
+        p_col.set_expand(True)
+        m_col.set_expand(False)
+
+        self.append_column(p_col)
+        self.append_column(m_col)
+
+        self.set_search_column(0)
+        self.set_search_column(1)
+
+        self.connect('row-activated', self._do_open_page)
+
+    def _do_open_page(self, view, path, col):
+        '''
+        This callback function is called when the user clicks on a row in the results list.
+        It opens the corresponding page.
+        '''
+
+        page = Path(self.get_model()[path][0].decode('utf-8'))
+        self.ui.open_page(page)
+