Merge lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist/0.1
Status: Rejected
Rejected by: Seif Lotfy
Proposed branch: lp:~mhr3/zeitgeist/mimetypes
Merge into: lp:zeitgeist/0.1
Diff against target: 500 lines (+248/-172), 5 files modified
  _zeitgeist/loggers/datasources/recent.py (+2/-170)
  test/engine-test.py (+25/-1)
  test/test-mimetypes.py (+25/-0)
  zeitgeist/Makefile.am (+2/-1)
  zeitgeist/mimetypes.py (+194/-0)
To merge this branch: bzr merge lp:~mhr3/zeitgeist/mimetypes
Reviewer | Review Type | Date Requested | Status
---|---|---|---
Mikkel Kamstrup Erlandsen | visual review | | Needs Information
Review via email: mp+26233@code.launchpad.net
Commit message
Description of the change
Siegfried Gevatter (rainct) wrote : | # |
Mikkel Kamstrup Erlandsen (kamstrup) wrote : | # |
While we are at improving our mime handling situation we might as well go the extra mile - especially now that we want to make it public API.
We now have interpretations like Interpretation.
As an example, take a look at my mimetype mappings in libzeitgeist: http://
Another note: The public API should be added to our generated sphinx docs too...
Markus Korn (thekorn) wrote : | # |
I did not do a review yet, but I've one first comment: I don't like that we have the same logic in two places, in zeitgeist and in libzeitgeist. (and in the future also in C# and JS, ...)
Mikkel Kamstrup Erlandsen (kamstrup) wrote : | # |
Perhaps we can compile the mimetype-
However doing all of this probably requires more work than I think we should put in for 0.3.4 at this point. I think the best route would be to either:
1) Defer this task for post 0.3.4, or
2) Make sure the implementation is done in a way so that we can add the extra introspection API for post 0.3.4 without breaking the mimetypes API from 0.3.4
I am not entirely sure which route I prefer, but I would hate to delay 0.3.4 more than we already have...
Seif Lotfy (seif) wrote : | # |
Ok I am up for a mimetype-
I would prefer JSON for this issue...
what do u guys think?
Mikkel Kamstrup Erlandsen (kamstrup) wrote : | # |
Why JSON and not straight Python? JSON would only slow down startup time, and Python already supports a very JSON-like syntax for declaring maps...
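To make the comparison concrete, here is a minimal sketch (written for Python 3, not taken from the branch) of a mapping declared as a Python dict literal and the same data serialised as JSON. The name `MIMETYPE_MAP` and the flat mimetype-to-name layout are assumptions made for illustration; the two pairs are borrowed from the lists in the diff.

```python
import json

# Hypothetical layout: mimetype -> interpretation name. Only the two pairs
# come from the lists in the diff; the dict itself is an assumption.
MIMETYPE_MAP = {
    "application/pdf": "DOCUMENT",
    "text/x-python": "SOURCE_CODE",
}

# The same data as JSON text. A Python consumer can import the dict directly,
# while a non-Python consumer would have to parse this string instead.
as_json = json.dumps(MIMETYPE_MAP, sort_keys=True)
round_tripped = json.loads(as_json)

print(as_json)
```

The literal and the JSON text carry identical data; the difference is only who has to run a parser at startup.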
Seif Lotfy (seif) wrote : | # |
Because I think we want libzeitgeist and other languages to be able to parse it... Or do you want to parse Python syntax?
On Mon, Jul 19, 2010 at 10:07 AM, Mikkel Kamstrup Erlandsen
<email address hidden> wrote:
> Why JSON and not straight Python? JSON would only slow down startup time, and Python already supports a very JSON-like syntax for declaring maps...
Markus Korn (thekorn) wrote : | # |
This is not a reviewer comment, but I would like to add another possible way to solve this: Python applications that want to use this kind of mapper could depend on libzeitgeist and use ctypes to access the mapper functions:
from ctypes import CDLL, c_char_p
libzeitgeist = CDLL("libzeitge
interpretation_
interpretation_
interpretation_
print interpretation_
print interpretation_
manifestation_
manifestation_
manifestation_
print manifestation_
print manifestation_
Siegfried Gevatter (rainct) wrote : | # |
2010/7/19 Markus Korn <email address hidden>:
> This is not a reviewer comment, but I would like to add another possible way to solve this, python applications which would like to use this kind of mapper, could depend on libzeitgeist and use ctypes to access the mapper functions:
Eerk, making a real C->Python interface is not so difficult.
Michal Hruby (mhr3) wrote : | # |
I think the cross-dependency of zg->libzg->zg is not a good idea, +1 for making it like the ontologies.
Mikkel Kamstrup Erlandsen (kamstrup) wrote : | # |
Markus, RainCT: Depending on libzeitgeist would add ~1 MB to the Zeitgeist runtime, I guess, which I think is a very steep price to pay for this.
As I noted in my comment from 2010-05-28, I'd actually prefer a Python module for libzeitgeist, so I can just write a small tool for libzeitgeist that outputs the C code at build time. Very easy for me, and fast and light for ZG.
Alternatively go full monty and write the autofoo magic that compiles some JSON/XML schema into Python code like we do for the ontology. In the end giving me a Python module to use in the libzeitgeist build process
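The "compile a schema into Python code" idea can be sketched roughly as follows. This is an illustration in Python 3 only, not the ontology build step the project actually uses: the function name `generate_module`, the variable `MIMETYPE_MAP`, and the flat JSON layout are all assumptions.

```python
import json
import pprint


def generate_module(schema_text):
    """Return Python source that defines MIMETYPE_MAP from a JSON object.

    Hypothetical helper: a real build step would read the schema from a file
    and write the result next to the other generated modules.
    """
    mapping = json.loads(schema_text)
    # pprint.pformat renders a dict of strings as a valid Python literal.
    body = pprint.pformat(mapping)
    return "# Generated file, do not edit.\nMIMETYPE_MAP = " + body + "\n"


# Assumed schema content, for illustration only.
schema = '{"application/pdf": "DOCUMENT", "audio/*": "AUDIO"}'
source = generate_module(schema)

# Load the generated source the way an import would, to check it is valid.
namespace = {}
exec(source, namespace)
print(source)
```

With this shape the JSON parsing cost is paid once at build time, and both the daemon and a libzeitgeist code generator could consume the generated module.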
Seif Lotfy (seif) wrote : | # |
+1 for the alternative
On Mon, Jul 19, 2010 at 2:51 PM, Mikkel Kamstrup Erlandsen
<email address hidden> wrote:
> Markus, RainCT: Depending on libzeitgeist would add ~1mb to the Zeitgeist runtime I guess - which I think is a very steep price to pay for this.
>
> As I noted in comment from 2010-05-28 I'd actually prefer a Python module for libzeitgeist, so I just write a small tool for libzeitgeist that outputs the C code at build time. Very easy for me, and fast and light for ZG.
>
> Alternatively go full monty and write the autofoo magic that compiles some JSON/XML schema into Python code like we do for the ontology. In the end giving me a Python module to use in the libzeitgeist build process
Siegfried Gevatter (rainct) wrote : | # |
Let's go with Python code, there's no reason to have it as JSON.
Seif Lotfy (seif) wrote : | # |
Markus finished a module for that
Unmerged revisions
- 1460. By Michal Hruby
  Interpretation.MUSIC -> Interpretation.AUDIO
- 1459. By Michal Hruby
  Recent datasource: Use the zg.mimetypes module
- 1458. By Michal Hruby
  Added mimetype to interpretation getter
Preview Diff
=== modified file '_zeitgeist/loggers/datasources/recent.py'
--- _zeitgeist/loggers/datasources/recent.py	2010-04-29 11:33:01 +0000
+++ _zeitgeist/loggers/datasources/recent.py	2010-05-27 19:34:36 +0000
@@ -24,8 +24,6 @@

 from __future__ import with_statement
 import os
-import re
-import fnmatch
 import urllib
 import time
 import logging
@@ -34,6 +32,7 @@
 from zeitgeist import _config
 from zeitgeist.datamodel import Event, Subject, Interpretation, Manifestation, \
     DataSource, get_timestamp_for_now
+from zeitgeist.mimetypes import get_interpretation_for_mimetype
 from _zeitgeist.loggers.zeitgeist_base import DataProvider

 log = logging.getLogger("zeitgeist.logger.datasources.recent")
@@ -51,166 +50,9 @@
 else:
     enabled = True

-class SimpleMatch(object):
-    """ Wrapper around fnmatch.fnmatch which allows to define mimetype
-    patterns by using shell-style wildcards.
-    """
-
-    def __init__(self, pattern):
-        self.__pattern = pattern
-
-    def match(self, text):
-        return fnmatch.fnmatch(text, self.__pattern)
-
-    def __repr__(self):
-        return "%s(%r)" %(self.__class__.__name__, self.__pattern)
-
-DOCUMENT_MIMETYPES = [
-    # Covers:
-    #    vnd.corel-draw
-    #    vnd.ms-powerpoint
-    #    vnd.ms-excel
-    #    vnd.oasis.opendocument.*
-    #    vnd.stardivision.*
-    #    vnd.sun.xml.*
-    SimpleMatch(u"application/vnd.*"),
-    # Covers: x-applix-word, x-applix-spreadsheet, x-applix-presents
-    SimpleMatch(u"application/x-applix-*"),
-    # Covers: x-kword, x-kspread, x-kpresenter, x-killustrator
-    re.compile(u"application/x-k(word|spread|presenter|illustrator)"),
-    u"application/ms-powerpoint",
-    u"application/msword",
-    u"application/pdf",
-    u"application/postscript",
-    u"application/ps",
-    u"application/rtf",
-    u"application/x-abiword",
-    u"application/x-gnucash",
-    u"application/x-gnumeric",
-    SimpleMatch(u"application/x-java*"),
-    SimpleMatch(u"*/x-tex"),
-    SimpleMatch(u"*/x-latex"),
-    SimpleMatch(u"*/x-dvi"),
-    u"text/plain"
-]
-
-IMAGE_MIMETYPES = [
-    # Covers:
-    #    vnd.corel-draw
-    u"application/vnd.corel-draw",
-    # Covers: x-kword, x-kspread, x-kpresenter, x-killustrator
-    re.compile(u"application/x-k(word|spread|presenter|illustrator)"),
-    SimpleMatch(u"image/*"),
-]
-
-AUDIO_MIMETYPES = [
-    SimpleMatch(u"audio/*"),
-    u"application/ogg"
-]
-
-VIDEO_MIMETYPES = [
-    SimpleMatch(u"video/*"),
-    u"application/ogg"
-]
-
-DEVELOPMENT_MIMETYPES = [
-    u"application/ecmascript",
-    u"application/javascript",
-    u"application/x-csh",
-    u"application/x-designer",
-    u"application/x-desktop",
-    u"application/x-dia-diagram",
-    u"application/x-fluid",
-    u"application/x-glade",
-    u"application/xhtml+xml",
-    u"application/x-java-archive",
-    u"application/x-m4",
-    u"application/xml",
-    u"application/x-object",
-    u"application/x-perl",
-    u"application/x-php",
-    u"application/x-ruby",
-    u"application/x-shellscript",
-    u"application/x-sql",
-    u"text/css",
-    u"text/html",
-    u"text/x-c",
-    u"text/x-c++",
-    u"text/x-chdr",
-    u"text/x-copying",
-    u"text/x-credits",
-    u"text/x-csharp",
-    u"text/x-c++src",
-    u"text/x-csrc",
-    u"text/x-dsrc",
-    u"text/x-eiffel",
-    u"text/x-gettext-translation",
-    u"text/x-gettext-translation-template",
-    u"text/x-haskell",
-    u"text/x-idl",
-    u"text/x-java",
-    u"text/x-lisp",
-    u"text/x-lua",
-    u"text/x-makefile",
-    u"text/x-objcsrc",
-    u"text/x-ocaml",
-    u"text/x-pascal",
-    u"text/x-patch",
-    u"text/x-python",
-    u"text/x-sql",
-    u"text/x-tcl",
-    u"text/x-troff",
-    u"text/x-vala",
-    u"text/x-vhdl",
-]
-
-ALL_MIMETYPES = DOCUMENT_MIMETYPES + IMAGE_MIMETYPES + AUDIO_MIMETYPES + \
-    VIDEO_MIMETYPES + DEVELOPMENT_MIMETYPES
-
-class MimeTypeSet(set):
-    """ Set which allows to match against a string or an object with a
-    match() method.
-    """
-
-    def __init__(self, *items):
-        super(MimeTypeSet, self).__init__()
-        self.__pattern = set()
-        for item in items:
-            if isinstance(item, (str, unicode)):
-                self.add(item)
-            elif hasattr(item, "match"):
-                self.__pattern.add(item)
-            else:
-                raise ValueError("Bad mimetype '%s'" %item)
-
-    def __contains__(self, mimetype):
-        result = super(MimeTypeSet, self).__contains__(mimetype)
-        if not result:
-            for pattern in self.__pattern:
-                if pattern.match(mimetype):
-                    return True
-        return result
-
-    def __len__(self):
-        return super(MimeTypeSet, self).__len__() + len(self.__pattern)
-
-    def __repr__(self):
-        items = ", ".join(sorted(map(repr, self | self.__pattern)))
-        return "%s(%s)" %(self.__class__.__name__, items)
-

 class RecentlyUsedManagerGtk(DataProvider):

-    FILTERS = {
-        # dict of name as key and the matching mimetypes as value
-        # if the value is None this filter matches all mimetypes
-        "DOCUMENT": MimeTypeSet(*DOCUMENT_MIMETYPES),
-        "IMAGE": MimeTypeSet(*IMAGE_MIMETYPES),
-        "MUSIC": MimeTypeSet(*AUDIO_MIMETYPES),
-        "VIDEO": MimeTypeSet(*VIDEO_MIMETYPES),
-        "SOURCE_CODE": MimeTypeSet(*DEVELOPMENT_MIMETYPES),
-    }
-
     def __init__(self, client):
         DataProvider.__init__(self,
             unique_id="com.zeitgeist-project,datahub,recent",
@@ -269,16 +111,6 @@
                 pass # file may be a broken symlink (LP: #523761)
         return None

-    def _get_interpretation_for_mimetype(self, mimetype):
-        matching_filter = None
-        for filter_name, mimetypes in self.FILTERS.iteritems():
-            if mimetype and mimetype in mimetypes:
-                matching_filter = filter_name
-                break
-        if matching_filter:
-            return getattr(Interpretation, matching_filter).uri
-        return ""
-
     def _get_items(self):
         # We save the start timestamp to avoid race conditions
         last_seen = get_timestamp_for_now()
@@ -297,7 +129,7 @@

             subject = Subject.new_for_values(
                 uri = unicode(uri),
-                interpretation = self._get_interpretation_for_mimetype(
+                interpretation = get_interpretation_for_mimetype(
                     unicode(info.get_mime_type())),
                 manifestation = Manifestation.FILE_DATA_OBJECT.uri,
                 text = info.get_display_name(),

=== modified file 'test/engine-test.py'
--- test/engine-test.py	2010-05-15 13:15:44 +0000
+++ test/engine-test.py	2010-05-27 19:34:36 +0000
@@ -10,6 +10,7 @@
 from _zeitgeist.engine import constants
 from _zeitgeist.engine import get_engine
 from zeitgeist.datamodel import *
+from zeitgeist.mimetypes import *
 from testutils import import_events

 import unittest
@@ -366,7 +367,30 @@
         self.assertEquals(2, len(result))
         events = self.engine.get_events(result)

-
+    def testMimetypeToInterpretation(self):
+        mimetype = "text/plain"
+        interpretation = get_interpretation_for_mimetype(mimetype)
+
+        event = Event()
+        event.interpretation = Manifestation.USER_ACTIVITY
+        event.manifestation = Interpretation.CREATE_EVENT
+        event.actor = "/usr/share/applications/gnome-about.desktop"
+
+        subject = Subject()
+        subject.uri = "file:///tmp/file.txt"
+        subject.manifestation = Manifestation.FILE_DATA_OBJECT
+        subject.interpretation = interpretation
+        subject.origin = "test://"
+        subject.mimetype = "text/plain"
+        subject.text = "This subject has no text"
+
+        event.append_subject(subject)
+        ids = self.engine.insert_events([event,])
+
+        results = self.engine.get_events([1])
+        self.assertEquals(1, len(results))
+        self.assertEquals(
+            interpretation, results[0].subjects[0].interpretation)

     def testDontFindState(self):
         # searchin by storage state is currently not implemented

=== added file 'test/test-mimetypes.py'
--- test/test-mimetypes.py	1970-01-01 00:00:00 +0000
+++ test/test-mimetypes.py	2010-05-27 19:34:36 +0000
@@ -0,0 +1,25 @@
+#! /usr/bin/python
+
+# Update python path to use local zeitgeist module
+import sys
+import os
+
+import zeitgeist.mimetypes
+
+# Test the mimetype -> Interpretation conversion
+mimetype = "image/png"
+i = zeitgeist.mimetypes.get_interpretation_for_mimetype(mimetype)
+print "Interpretation for %s: %s" % (mimetype, i)
+
+mimetype = "text/plain"
+i = zeitgeist.mimetypes.get_interpretation_for_mimetype(mimetype)
+print "Interpretation for %s: %s" % (mimetype, i)
+
+mimetype = "application/ogg"
+i = zeitgeist.mimetypes.get_interpretation_for_mimetype(mimetype)
+print "Interpretation for %s: %s" % (mimetype, i)
+
+mimetype = "unknown/invalid"
+i = zeitgeist.mimetypes.get_interpretation_for_mimetype(mimetype)
+print "Interpretation for %s: %s" % (mimetype, i)
+

=== modified file 'zeitgeist/Makefile.am'
--- zeitgeist/Makefile.am	2009-11-27 20:32:54 +0000
+++ zeitgeist/Makefile.am	2010-05-27 19:34:36 +0000
@@ -3,7 +3,8 @@
 app_PYTHON = \
 	__init__.py \
 	datamodel.py \
-	client.py
+	client.py \
+	mimetypes.py

 nodist_app_PYTHON = _config.py


304 | --- zeitgeist/mimetypes.py 1970-01-01 00:00:00 +0000 | |||
305 | +++ zeitgeist/mimetypes.py 2010-05-27 19:34:36 +0000 | |||
306 | @@ -0,0 +1,194 @@ | |||
307 | 1 | # -.- coding: utf-8 -.- | ||
308 | 2 | |||
309 | 3 | # Zeitgeist | ||
310 | 4 | # | ||
311 | 5 | # Copyright © 2009-2010 Siegfried-Angel Gevatter Pujals <rainct@ubuntu.com> | ||
312 | 6 | # Copyright © 2009 Mikkel Kamstrup Erlandsen <mikkel.kamstrup@gmail.com> | ||
313 | 7 | # Copyright © 2009 Markus Korn <thekorn@gmx.de> | ||
314 | 8 | # Copyright © 2010 Michal Hruby <michal.mhr@gmail.com> | ||
315 | 9 | # | ||
316 | 10 | # This program is free software: you can redistribute it and/or modify | ||
317 | 11 | # it under the terms of the GNU Lesser General Public License as published by | ||
318 | 12 | # the Free Software Foundation, either version 3 of the License, or | ||
319 | 13 | # (at your option) any later version. | ||
320 | 14 | # | ||
321 | 15 | # This program is distributed in the hope that it will be useful, | ||
322 | 16 | # but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
323 | 17 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
324 | 18 | # GNU Lesser General Public License for more details. | ||
325 | 19 | # | ||
326 | 20 | # You should have received a copy of the GNU Lesser General Public License | ||
327 | 21 | # along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
328 | 22 | |||
329 | 23 | import re | ||
330 | 24 | import fnmatch | ||
331 | 25 | |||
332 | 26 | from zeitgeist.datamodel import Interpretation | ||
333 | 27 | |||
334 | 28 | class SimpleMatch(object): | ||
335 | 29 | """ Wrapper around fnmatch.fnmatch which allows to define mimetype | ||
336 | 30 | patterns by using shell-style wildcards. | ||
337 | 31 | """ | ||
338 | 32 | |||
339 | 33 | def __init__(self, pattern): | ||
340 | 34 | self.__pattern = pattern | ||
341 | 35 | |||
342 | 36 | def match(self, text): | ||
343 | 37 | return fnmatch.fnmatch(text, self.__pattern) | ||
344 | 38 | |||
345 | 39 | def __repr__(self): | ||
346 | 40 | return "%s(%r)" %(self.__class__.__name__, self.__pattern) | ||
347 | 41 | |||
348 | 42 | class MimeTypeSet(set): | ||
349 | 43 | """ Set which allows to match against a string or an object with a | ||
350 | 44 | match() method. | ||
351 | 45 | """ | ||
352 | 46 | |||
353 | 47 | def __init__(self, *items): | ||
354 | 48 | super(MimeTypeSet, self).__init__() | ||
355 | 49 | self.__pattern = set() | ||
356 | 50 | for item in items: | ||
357 | 51 | if isinstance(item, (str, unicode)): | ||
358 | 52 | self.add(item) | ||
359 | 53 | elif hasattr(item, "match"): | ||
360 | 54 | self.__pattern.add(item) | ||
361 | 55 | else: | ||
362 | 56 | raise ValueError("Bad mimetype '%s'" %item) | ||
363 | 57 | |||
364 | 58 | def __contains__(self, mimetype): | ||
365 | 59 | result = super(MimeTypeSet, self).__contains__(mimetype) | ||
366 | 60 | if not result: | ||
367 | 61 | for pattern in self.__pattern: | ||
368 | 62 | if pattern.match(mimetype): | ||
369 | 63 | return True | ||
370 | 64 | return result | ||
371 | 65 | |||
372 | 66 | def __len__(self): | ||
373 | 67 | return super(MimeTypeSet, self).__len__() + len(self.__pattern) | ||
374 | 68 | |||
375 | 69 | def __repr__(self): | ||
376 | 70 | items = ", ".join(sorted(map(repr, self | self.__pattern))) | ||
377 | 71 | return "%s(%s)" %(self.__class__.__name__, items) | ||
378 | 72 | |||
379 | 73 | DOCUMENT_MIMETYPES = [ | ||
380 | 74 | # Covers: | ||
381 | 75 | # vnd.corel-draw | ||
382 | 76 | # vnd.ms-powerpoint | ||
383 | 77 | # vnd.ms-excel | ||
384 | 78 | # vnd.oasis.opendocument.* | ||
385 | 79 | # vnd.stardivision.* | ||
386 | 80 | # vnd.sun.xml.* | ||
387 | 81 | SimpleMatch(u"application/vnd.*"), | ||
388 | 82 | # Covers: x-applix-word, x-applix-spreadsheet, x-applix-presents | ||
389 | 83 | SimpleMatch(u"application/x-applix-*"), | ||
390 | 84 | # Covers: x-kword, x-kspread, x-kpresenter, x-killustrator | ||
391 | 85 | re.compile(u"application/x-k(word|spread|presenter|illustrator)"), | ||
392 | 86 | u"application/ms-powerpoint", | ||
393 | 87 | u"application/msword", | ||
394 | 88 | u"application/pdf", | ||
395 | 89 | u"application/postscript", | ||
396 | 90 | u"application/ps", | ||
397 | 91 | u"application/rtf", | ||
398 | 92 | u"application/x-abiword", | ||
399 | 93 | u"application/x-gnucash", | ||
400 | 94 | u"application/x-gnumeric", | ||
401 | 95 | SimpleMatch(u"application/x-java*"), | ||
402 | 96 | SimpleMatch(u"*/x-tex"), | ||
403 | 97 | SimpleMatch(u"*/x-latex"), | ||
404 | 98 | SimpleMatch(u"*/x-dvi"), | ||
405 | 99 | u"text/plain" | ||
406 | 100 | ] | ||
407 | 101 | |||
408 | 102 | IMAGE_MIMETYPES = [ | ||
409 | 103 | # Covers: | ||
410 | 104 | # vnd.corel-draw | ||
411 | 105 | u"application/vnd.corel-draw", | ||
412 | 106 | # Covers: x-kword, x-kspread, x-kpresenter, x-killustrator | ||
413 | 107 | re.compile(u"application/x-k(word|spread|presenter|illustrator)"), | ||
414 | 108 | SimpleMatch(u"image/*"), | ||
415 | 109 | ] | ||
416 | 110 | |||
417 | 111 | AUDIO_MIMETYPES = [ | ||
418 | 112 | SimpleMatch(u"audio/*"), | ||
419 | 113 | u"application/ogg" | ||
420 | 114 | ] | ||
421 | 115 | |||
422 | 116 | VIDEO_MIMETYPES = [ | ||
423 | 117 | SimpleMatch(u"video/*"), | ||
424 | 118 | u"application/ogg" | ||
425 | 119 | ] | ||
426 | 120 | |||
427 | 121 | DEVELOPMENT_MIMETYPES = [ | ||
428 | 122 | u"application/ecmascript", | ||
429 | 123 | u"application/javascript", | ||
430 | 124 | u"application/x-csh", | ||
431 | 125 | u"application/x-designer", | ||
432 | 126 | u"application/x-desktop", | ||
433 | 127 | u"application/x-dia-diagram", | ||
434 | 128 | u"application/x-fluid", | ||
435 | 129 | u"application/x-glade", | ||
436 | 130 | u"application/xhtml+xml", | ||
437 | 131 | u"application/x-java-archive", | ||
438 | 132 | u"application/x-m4", | ||
439 | 133 | u"application/xml", | ||
440 | 134 | u"application/x-object", | ||
441 | 135 | u"application/x-perl", | ||
442 | 136 | u"application/x-php", | ||
443 | 137 | u"application/x-ruby", | ||
444 | 138 | u"application/x-shellscript", | ||
445 | 139 | u"application/x-sql", | ||
446 | 140 | u"text/css", | ||
447 | 141 | u"text/html", | ||
448 | 142 | u"text/x-c", | ||
449 | 143 | u"text/x-c++", | ||
450 | 144 | u"text/x-chdr", | ||
451 | 145 | u"text/x-copying", | ||
452 | 146 | u"text/x-credits", | ||
453 | 147 | u"text/x-csharp", | ||
454 | 148 | u"text/x-c++src", | ||
455 | 149 | u"text/x-csrc", | ||
456 | 150 | u"text/x-dsrc", | ||
457 | 151 | u"text/x-eiffel", | ||
458 | 152 | u"text/x-gettext-translation", | ||
459 | 153 | u"text/x-gettext-translation-template", | ||
460 | 154 | u"text/x-haskell", | ||
461 | 155 | u"text/x-idl", | ||
462 | 156 | u"text/x-java", | ||
463 | 157 | u"text/x-lisp", | ||
464 | 158 | u"text/x-lua", | ||
465 | 159 | u"text/x-makefile", | ||
466 | 160 | u"text/x-objcsrc", | ||
467 | 161 | u"text/x-ocaml", | ||
468 | 162 | u"text/x-pascal", | ||
469 | 163 | u"text/x-patch", | ||
470 | 164 | u"text/x-python", | ||
471 | 165 | u"text/x-sql", | ||
472 | 166 | u"text/x-tcl", | ||
473 | 167 | u"text/x-troff", | ||
474 | 168 | u"text/x-vala", | ||
475 | 169 | u"text/x-vhdl", | ||
476 | 170 | ] | ||
477 | 171 | |||
478 | 172 | ALL_MIMETYPES = DOCUMENT_MIMETYPES + IMAGE_MIMETYPES + AUDIO_MIMETYPES + \ | ||
479 | 173 | VIDEO_MIMETYPES + DEVELOPMENT_MIMETYPES | ||
480 | 174 | |||
481 | 175 | FILTERS = { | ||
482 | 176 | # dict of name as key and the matching mimetypes as value | ||
483 | 177 | # if the value is None this filter matches all mimetypes | ||
484 | 178 | "DOCUMENT": MimeTypeSet(*DOCUMENT_MIMETYPES), | ||
485 | 179 | "IMAGE": MimeTypeSet(*IMAGE_MIMETYPES), | ||
486 | 180 | "AUDIO": MimeTypeSet(*AUDIO_MIMETYPES), | ||
487 | 181 | "VIDEO": MimeTypeSet(*VIDEO_MIMETYPES), | ||
488 | 182 | "SOURCE_CODE": MimeTypeSet(*DEVELOPMENT_MIMETYPES), | ||
489 | 183 | } | ||
490 | 184 | |||
491 | 185 | def get_interpretation_for_mimetype(mimetype): | ||
492 | 186 | matching_filter = None | ||
493 | 187 | for filter_name, mimetypes in FILTERS.iteritems(): | ||
494 | 188 | if mimetype and mimetype in mimetypes: | ||
495 | 189 | matching_filter = filter_name | ||
496 | 190 | break | ||
497 | 191 | if matching_filter: | ||
498 | 192 | return getattr(Interpretation, matching_filter).uri | ||
499 | 193 | return "" | ||
500 | 194 |
Looks good on a quick glance over the diff. Nice work.
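One detail of the proposed lookup is worth illustrating: "application/ogg" is listed under both audio and video, and "application/vnd.corel-draw" matches both the document wildcard and the image list, so the result depends on the order in which the FILTERS dict is iterated. The sketch below (Python 3, not part of the branch) shows one possible way to make that order explicit with a list of rules. The names `RULES` and `interpretation_name_for_mimetype` are hypothetical, only a subset of the patterns is included, it returns the filter name rather than an Interpretation URI, and it uses `fnmatch.fnmatchcase` where the branch uses `fnmatch.fnmatch`.

```python
import fnmatch

# Ordered (name, patterns) pairs: the first matching rule wins. The order is
# an assumption chosen for illustration; the patterns are a subset of the
# lists in the diff.
RULES = [
    ("IMAGE", ["image/*", "application/vnd.corel-draw"]),
    ("AUDIO", ["audio/*", "application/ogg"]),
    ("VIDEO", ["video/*", "application/ogg"]),
    ("DOCUMENT", ["application/vnd.*", "application/pdf", "text/plain"]),
]


def interpretation_name_for_mimetype(mimetype):
    """Return the name of the first rule matching mimetype, or "" if none."""
    if not mimetype:
        return ""
    for name, patterns in RULES:
        # fnmatchcase does shell-style matching without case normalisation.
        if any(fnmatch.fnmatchcase(mimetype, pattern) for pattern in patterns):
            return name
    return ""


for example in ("image/png", "application/ogg", "unknown/invalid"):
    print(example, "->", repr(interpretation_name_for_mimetype(example)))
```

With an ordered list, overlapping entries resolve the same way on every run, which also makes the mapping easier to mirror in another language.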