The payload is sometimes mentioned as string and sometimes as array of bytes

Bug #691167 reported by Manish Sinha (मनीष सिन्हा)
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Zeitgeist Framework
Won't Fix
Wishlist
J.P. Lacerda

Bug Description

In the event serialization format, the third array is array of bytes or ay as dbus signature.

When an event enters an extension, it looks like
Event([dbus.Array([u'', u'1292500628312', u'', u'', u'application://foo.desktop'], signature=dbus.Signature('s')), [Subject([u'', u'', u'', u'', u'', u'', u''])], dbus.Array([], signature=dbus.Signature('y'))])

which implies that payload is array of bytes

Now look at _zeitgeist/engine/datamodel.py at line 58
where you get the line
>> popo.append(str(ev[2]))

Really so when you do str() on dbus.Array([], signature=dbus.Signature('y'))
you get "dbus.Array([], signature=dbus.Signature('y'))" instead of the contents of bytes converted to string

Now when you call Event.get_plain on

Event([dbus.Array([u'', u'1292500628312', u'', u'', u'application://foo.desktop'], signature=dbus.Signature('s')), [Subject([u'', u'', u'', u'', u'', u'', u''])], dbus.Array([], signature=dbus.Signature('y'))])

you get

[[u'', u'1292500628312', u'', u'', u'application://foo.desktop'], [[u'', u'', u'', u'', u'', u'', u'']], "dbus.Array([], signature=dbus.Signature('y'))"]

Not what you expect

Related branches

Revision history for this message
Markus Korn (thekorn) wrote :

Manish, is this (esp. the last few lines `"dbus.Array([], signature=dbus.Signature('y'))"`) something you get somewhere with real code (if so, please attach a minimal reproducer) or is it *just* loud thinking?

payload is in fact an array of bytes, as defined in the dbus signature. You should not care about the weirdness done in zeitgeist, it should just do the right thing. It does the right thing because a String in python is in first approximation just an array of bytes (I know this statement is not 100% true for all python versions, but it is good enough for our needs)

There are only two possible reasons where this assumtion would fail:
 * we want to support python 2.x and 3.x at the same time, which is a no-go
 * I'm missing something ;) - please prove me wrong with some code.

Revision history for this message
Seif Lotfy (seif) wrote : Re: [Bug 691167] Re: The payload is sometimes mentioned as string and sometimes as array of bytes

How do you intend to blacklist events with specific payloads then ?

On Thu, Dec 16, 2010 at 6:37 PM, Markus Korn <email address hidden> wrote:

> Manish, is this (esp. the last few lines `"dbus.Array([],
> signature=dbus.Signature('y'))"`) something you get somewhere with real
> code (if so, please attach a minimal reproducer) or is it *just* loud
> thinking?
>
> payload is in fact an array of bytes, as defined in the dbus signature.
> You should not care about the weirdness done in zeitgeist, it should
> just do the right thing. It does the right thing because a String in
> python is in first approximation just an array of bytes (I know this
> statement is not 100% true for all python versions, but it is good
> enough for our needs)
>
> There are only two possible reasons where this assumtion would fail:
> * we want to support python 2.x and 3.x at the same time, which is a no-go
> * I'm missing something ;) - please prove me wrong with some code.
>
> --
> You received this bug notification because you are subscribed to The
> Zeitgeist Project.
> https://bugs.launchpad.net/bugs/691167
>
> Title:
> The payload is sometimes mentioned as string and sometimes as array of
> bytes
>
> Status in Zeitgeist Framework:
> New
>
> Bug description:
> In the event serialization format, the third array is array of bytes or ay
> as dbus signature.
>
> When an event enters an extension, it looks like
> Event([dbus.Array([u'', u'1292500628312', u'', u'',
> u'application://foo.desktop'], signature=dbus.Signature('s')),
> [Subject([u'', u'', u'', u'', u'', u'', u''])], dbus.Array([],
> signature=dbus.Signature('y'))])
>
> which implies that payload is array of bytes
>
> Now look at _zeitgeist/engine/datamodel.py at line 58
> where you get the line
> >> popo.append(str(ev[2]))
>
> Really so when you do str() on dbus.Array([],
> signature=dbus.Signature('y'))
> you get "dbus.Array([], signature=dbus.Signature('y'))" instead of the
> contents of bytes converted to string
>
>
> Now when you call Event.get_plain on
>
> Event([dbus.Array([u'', u'1292500628312', u'', u'',
> u'application://foo.desktop'], signature=dbus.Signature('s')),
> [Subject([u'', u'', u'', u'', u'', u'', u''])], dbus.Array([],
> signature=dbus.Signature('y'))])
>
> you get
>
> [[u'', u'1292500628312', u'', u'', u'application://foo.desktop'], [[u'',
> u'', u'', u'', u'', u'', u'']], "dbus.Array([],
> signature=dbus.Signature('y'))"]
>
> Now what you expect
>
>
>

--
This is me doing some advertisement for my blog http://seilo.geekyogre.com

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Markus,
I got this when trying to work on the new blacklist API.

So you need to run the daemon from this branch.
lp:~manishsinha/zeitgeist/reproduce-691167

and then need to run the following commands on the python console
>>> sub = Subject.new_for_values()
>> ev = Event.new_for_values(actor="application://foo.desktop", subjects=[sub,])
>>> obj=dbus.SessionBus().get_object("org.gnome.zeitgeist.Engine","/org/gnome/zeitgeist/blacklist")
>>> iface=dbus.Interface(obj, "org.gnome.zeitgeist.Blacklist")
>>> blk= map(Event.new_for_struct,iface.GetBlacklist())
>>> blk.append(ev)
>>> iface.SetBlacklist(blk, "foo")

Then go and check the console. You would get the output like

Event([dbus.Array([u'', u'1292500628312', u'', u'', u'application://foo.desktop'], signature=dbus.Signature('s')), [Subject([u'', u'', u'', u'', u'', u'', u''])], dbus.Array([], signature=dbus.Signature('y'))])
{'foo': [[[u'', u'1292500628312', u'', u'', u'application://foo.desktop'], [[u'', u'', u'', u'', u'', u'', u'']], "dbus.Array([], signature=dbus.Signature('y'))"]]}
[2010-12-16 21:32:44,249] - DEBUG - zeitgeist.blacklist - Blacklist updated: {'foo':
 [[[u'', u'1292500628312', u'', u'', u'application://foo.desktop'], [[u'', u'', u'', u'', u'', u'', u'']], "dbus.Array([], signature=dbus.Signature('y'))"]]}

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Seif, I don't think we can ever hit a situation where we need to compare payloads. BTW just because we might never hit this situation doesn't make this bug void. A bug is a bug.

Revision history for this message
Markus Korn (thekorn) wrote :

Okidoki, thanks for the example - what we are doing in _zeitgeist.engine.datamodel.Event.get_plain() for payloads is indeed wrong.

Let me clearify the purpose of this method, get_plain() is designed to change Event(template) objects in a way that they are dumpable by the pickle module. And we are using pickle to store blacklists in the filesystem for later use.

But: do we really want to support blacklists based on payload? Please remember, payload can be random (binary)data, what's the point of filtering events based on a binary blob?

So rather than fixing the way this method handles the payload field, I propose we should not allow payload in blacklists at all.

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Markus,

Right now event matching doesn't take payload in to consideration. It matches only metadata and subjects. So till now it has never matched payload. So we can say that this bug does not affect blacklists. Blacklist implementation was where I noticed this bug

Anyway I got the patch and requested for merge request. Have a look at it

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

Ah. I mean requested for a merge. Sorry for the grammar mistake

Revision history for this message
Seif Lotfy (seif) wrote :

I agree with Markus that we should not allow blacklists to take payloads
into consideration
However I think the fix is necessary for us. +1 for me tbh

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

After talks with Markus and Seif, it is indeed a bug

Changed in zeitgeist:
assignee: nobody → Manish Sinha (manishsinha)
status: New → Confirmed
Seif Lotfy (seif)
Changed in zeitgeist:
milestone: none → 0.7.0
Changed in zeitgeist:
status: Confirmed → In Progress
Revision history for this message
Seif Lotfy (seif) wrote :

I beg to differ. But seems to me that they are not working at all. I wrote a test case in the below diff where I insert an event with a payload and try to retrieve it again. This does not work on trunk guys -.-
---

=== modified file 'test/remote-test.py'
--- test/remote-test.py 2010-09-22 18:44:16 +0000
+++ test/remote-test.py 2010-12-20 15:42:22 +0000
@@ -321,6 +321,19 @@
   registry = iface.get_extension("DataSourceRegistry", "data_source_registry")
   registry.GetDataSources()

+ def testFindEventsWithPayload(self):
+ mainloop = gobject.MainLoop()
+ payload = "Hello World"
+ def callback(ids):
+ def callback2(events):
+ mainloop.quit()
+ self.assertEquals(events[0].payload, payload)
+ self.client.get_events(ids, callback2)
+ events = [Event.new_for_values(actor=u"boo", timestamp=124, subject_uri="file://yomomma")]
+ events[0].payload = payload
+ self.client.insert_events(events, callback)
+ mainloop.run()
+
 class ZeitgeistRemoteInterfaceTest(unittest.TestCase):

  def setUp(self):

----

Run "./test/remote-test.py ZeitgeistRemoteAPITest.testFindEventsWithPayload"

I get:

Error from Zeitgeist engine: org.freedesktop.DBus.Python.TypeError: Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.6/dbus/service.py", line 745, in _message_cb
    _method_reply_return(connection, message, method_name, signature, *retval)
  File "/usr/lib/pymodules/python2.6/dbus/service.py", line 252, in _method_reply_return
    reply.append(signature=signature, *retval)
TypeError: an integer is required

Seif Lotfy (seif)
Changed in zeitgeist:
importance: Undecided → High
Revision history for this message
Seif Lotfy (seif) wrote :

OK I just summed up the situation.
A payload is an array of bytes.

An Array of bytes should not be translated into a string (encoding could be an issue, especially if a binary). Thus I prefer having a list of int

Revision history for this message
Manish Sinha (मनीष सिन्हा) (manishsinha) wrote :

MIkkel asked this question in the merge request
https://code.launchpad.net/~manishsinha/zeitgeist/fix-691167/+merge/43948/comments/97974/

and my reply is that as a array of bytes, it is easy to have binary information in it e..g tarball, image, db-dump, but I have no clue how string can be represented. AFAIK broadly, files can be opened in two ways - text and binary, so this makes conversion from text or binary and binary to text confusing.

Revision history for this message
Siegfried Gevatter (rainct) wrote :

What's the difference between this bug and bug #692645?

Seif Lotfy (seif)
Changed in zeitgeist:
importance: High → Medium
importance: Medium → Wishlist
Seif Lotfy (seif)
Changed in zeitgeist:
milestone: 0.7.0 → none
Seif Lotfy (seif)
Changed in zeitgeist:
milestone: none → 0.8.0
Revision history for this message
Siegfried Gevatter (rainct) wrote :

I think we can bump the Python dependency to Python 2.6 by now, and fix this the right way.

Revision history for this message
Seif Lotfy (seif) wrote :

+1 on that decision

Revision history for this message
Seif Lotfy (seif) wrote :

OK guys lets close this tiny useless bug. Lets put python 2.6 as minimum requirement

Changed in zeitgeist:
assignee: Manish Sinha (मनीष सिन्हा) (manishsinha) → nobody
Seif Lotfy (seif)
Changed in zeitgeist:
assignee: nobody → Seif Lotfy (seif)
Seif Lotfy (seif)
Changed in zeitgeist:
assignee: Seif Lotfy (seif) → J.P. Lacerda (jplacerda)
Revision history for this message
Seif Lotfy (seif) wrote :

We wont fix this until it is a real issue. If any1 want to fix it please do but we wont look into it now. Our current solution works pretty well :)

Changed in zeitgeist:
status: In Progress → Won't Fix
milestone: 0.8.0 → 0.8.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.