Add origin as a property of events

Bug #425258 reported by Siegfried Gevatter
This bug affects 1 person
Affects: Zeitgeist Framework
Status: Fix Released
Importance: Undecided
Assigned to: Siegfried Gevatter

Bug Description

Both http://live.gnome.org/GnomeZeitgeist/DatabaseDesign and our current implementation have "origin" as a property of items, but this makes no sense, as the origin can be different for every event and should as such be a property of it. (Further, to improve disk space usage we should save the actual value of it in the "uri" table).

I'm filing this bug so that we don't forget to fix it, but please let's ignore it until we get the basic event/item separation stuff merged - most important stuff first.


Changed in zeitgeist:
importance: Undecided → Medium
milestone: none → 0.3
status: New → Triaged
Revision history for this message
Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Note that events should be regarded as sub types of items, meaning that an event is comprised of both the relevant row in the item table and the corresponding row in the event table. This means that events do in fact have origins.

I do agree, however, that an origin is a URL, and as such should be stored in the uri table. This means that item.origin should be converted to an INTEGER and renamed to item.origin_id.
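
A minimal sketch of what the item.origin to item.origin_id change could look like, using Python's sqlite3 module. The table and column layout here is an assumption made for illustration (only the names "uri", "item" and "origin_id" come from this thread), and uri_id() and add_item() are hypothetical helpers, not part of Zeitgeist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: every URI string is stored exactly once.
    CREATE TABLE uri (
        id    INTEGER PRIMARY KEY,
        value TEXT UNIQUE NOT NULL
    );
    -- Assumed layout: items point at uri rows instead of repeating strings.
    CREATE TABLE item (
        id        INTEGER PRIMARY KEY REFERENCES uri(id),
        origin_id INTEGER REFERENCES uri(id)
    );
""")


def uri_id(conn, value):
    """Return the id of `value` in the uri table, inserting it if needed."""
    conn.execute("INSERT OR IGNORE INTO uri (value) VALUES (?)", (value,))
    row = conn.execute("SELECT id FROM uri WHERE value = ?", (value,)).fetchone()
    return row[0]


def add_item(conn, uri, origin):
    """Store an item whose URI and origin are both references into uri."""
    conn.execute(
        "INSERT INTO item (id, origin_id) VALUES (?, ?)",
        (uri_id(conn, uri), uri_id(conn, origin)),
    )


add_item(conn, "file:///home/user/a.txt", "file:///home/user")
add_item(conn, "file:///home/user/b.txt", "file:///home/user")

# The shared origin string is stored once, however many items use it.
origin_rows = conn.execute(
    "SELECT COUNT(*) FROM uri WHERE value = ?", ("file:///home/user",)
).fetchone()[0]
```

With this layout the disk space saving comes from the origin string appearing in a single uri row, while each item only carries an integer.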

Siegfried Gevatter (rainct) wrote :

> Note that events should be regarded as sub types of items
Not with the new design: now we will have a clear distinction between what is an event and what is an item.

> This means that events do in fact have origins.
Indeed, and each event can have a different origin, so my point is that the origin should be stored in the "event" table and not in "item".

However Seif has recently proposed that we drop "origin" entirely.

Seif Lotfy (seif) wrote : Re: [Bug 425258] Re: origin should be a property of events, not items

I was wondering if we should actually dump the whole origin part?

Seif Lotfy (seif) wrote :

Uhm, events and items can be separated conceptually, but that does not mean
separated in terms of class inheritance

Mikkel Kamstrup Erlandsen (kamstrup) wrote : Re: origin should be a property of events, not items

Would it be possible for someone to put up a draft of the new design (keeping a copy of the old database design spec around)?

Siegfried Gevatter (rainct) wrote : Re: [Bug 425258] Re: origin should be a property of events, not items

The DB is the same, only the API changes to make everything clearer
(and export more information). See the mail I've sent you.

Also, I don't understand what the problem is here - the problem
described in this bug is completely unrelated to the changes. I
believe there's just a misunderstanding again because of vocabulary;
eg. "does not mean separated in terms of class inheritance" - events
aren't classes in the implementation * so why do we have to talk this
way...?

* (well, in my branch there are Event and Item classes, but that's unrelated)

--
Siegfried-Angel Gevatter Pujals (RainCT)
Free Software Developer 363DEAE3

Seif Lotfy (seif) wrote : Re: origin should be a property of events, not items

Uhm sounds like we will stay with origin belonging to the subject :P

Mikkel Kamstrup Erlandsen (kamstrup) wrote :

Yeah. We had a discussion about this at the hackfest and everyone agreed that origin must be a property of the subject(s). So the meaning of origin is "where does the subject come from?". For files it is the parent folder. For websites it is the root URL (eg. http://youtube.com).

We do need to figure out, however, whether we want trailing slashes on origin or not. That's a separate issue, though.
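
As a rough illustration of the rule described above (parent folder for files, root URL for websites), here is a small Python sketch. subject_origin() and its trailing_slash parameter are hypothetical names introduced for this example. The trailing slash question is left open on purpose, so it is just a switch here.

```python
import posixpath
from urllib.parse import urlsplit


def subject_origin(uri, trailing_slash=False):
    """Guess the origin of a subject URI.

    file:// URIs  -> the parent folder
    anything else -> scheme and host only (the "root URL")
    """
    parts = urlsplit(uri)
    if parts.scheme == "file":
        # Drop a trailing slash first so that folders get their parent too.
        parent = posixpath.dirname(parts.path.rstrip("/"))
        origin = "file://" + parts.netloc + parent.rstrip("/")
    else:
        origin = parts.scheme + "://" + parts.netloc
    return origin + "/" if trailing_slash else origin


file_origin = subject_origin("file:///home/user/foo/bar.txt")
web_origin = subject_origin("http://youtube.com/watch?v=abc")
web_origin_slash = subject_origin("http://youtube.com/watch?v=abc", trailing_slash=True)
```

Whichever convention is picked, it should be applied consistently when inserting and when querying, otherwise "http://youtube.com" and "http://youtube.com/" end up as two different origins.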

Changed in zeitgeist:
status: Triaged → Invalid
Revision history for this message
Siegfried Gevatter (rainct) wrote : Re: [Bug 425258] Re: origin should be a property of events, not items

Not everyone: I disagree with that concept of origin. Anyway, I will
write more once I'm home.

Seif Lotfy (seif) wrote :

WISE ASS

Siegfried Gevatter (rainct) wrote : Re: origin should be a property of events, not items

Seif: Thanks for your kind words.

Everyone: Okay now that I'm home (wrote the previous message when checking mails from the phone, so that I don't forget to answer :P), here comes what I wanted to say.

As I understand it, the point of "origin" is to know where a particular event came from. So, eg. if you are on a website and you click on a link to download a PDF, the resulting DOWNLOAD_EVENT would have the page where the link was as origin (as opposed to the online location of the PDF, which is information that should go into Tracker). If I am on page1.com/links and there I click on page2.com/foo, the resulting VISIT_EVENT with subject page2.com/foo would have origin page1.com/links. If I read a mail and there I right click on an e-mail address and add it to my contacts list (in case we ever log that), that would give an event where the origin is the e-mail I was reading, etc. Used like this, I think origins make sense and are useful, even though I'm not sure they are worth the mess (which is why I wouldn't disagree with removing them entirely).

About just storing the domain name or the document, what's the point? It's just duplication of information. For filtering we should support full text searches (in URI and Title); using the origin for that is rather useless (and for this purpose it is also named wrongly, btw: it should be called "root" rather than "origin"). Why would someone want to look for items in /home? What I want is to look at items in /home/rainct/Documents/Foobar/ containing "lala" in the filename.
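
To make the kind of query meant here concrete, a small Python sketch follows. It is a plain prefix and substring match over a list of paths, not a real full text index, and find_in_folder() as well as the sample paths are made up for the example.

```python
import posixpath


def find_in_folder(uris, folder, needle):
    """Return the URIs located under `folder` whose filename contains `needle`."""
    prefix = folder.rstrip("/") + "/"
    return [
        uri for uri in uris
        if uri.startswith(prefix) and needle in posixpath.basename(uri)
    ]


uris = [
    "/home/rainct/Documents/Foobar/lala.txt",
    "/home/rainct/Documents/Foobar/sub/my-lala-notes.odt",
    "/home/rainct/Documents/Foobar/other.txt",
    "/home/rainct/Music/lala.ogg",
]
matches = find_in_folder(uris, "/home/rainct/Documents/Foobar/", "lala")
```

Note that this works on the full URI alone, which is the point being made: no separate origin field is needed for this particular filter.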

Cheers!

Seif Lotfy (seif) wrote : Re: [Bug 425258] Re: origin should be a property of events, not items

Uhm, the other way round: where the subject came from :)

Mikkel Kamstrup Erlandsen (kamstrup) wrote : Re: origin should be a property of events, not items

Siegfried, I think you make a good case. However, I don't see it conflicting with what we have now. Whether origin is something on the event or the subject(s) is purely a matter of how one looks at it, now that we have a 1 to 1 mapping between events and subjects in the db.

The tricky part is defining how we want to assign origins for each type of event. It would be most useful to have a written spec here. Consider your download example (which is a good one btw!): what if I downloaded the files via a download manager app? Then the event context is lost and we have to resort to guesswork to establish the origin (and likely use the host URL of the subject). Let's not go into these details here. We should have a spec!

For me, I don't think it makes a lot of sense to talk about the origin (or root) of an event; events are intangible things. All relations to the physical world are described in the subjects. Furthermore, it is technically impossible to use the event as the entity to store the origin, since an event can have several subjects, each with their own origin. Consider for instance if I select files ~/foo/bar.txt ~/other/one.txt and move them to the trash in Nautilus. This is one event, but each subject will have a different origin.
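
The trash example can be written down as a tiny data model. This is only a sketch: the Subject and Event classes, their field names and the "MOVE_TO_TRASH" label are assumptions for illustration, and ~ is expanded to /home/user.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subject:
    uri: str
    origin: str  # where this particular subject comes from


@dataclass
class Event:
    interpretation: str
    actor: str
    subjects: List[Subject] = field(default_factory=list)


# One event, two subjects, two different origins.
trash_event = Event(
    interpretation="MOVE_TO_TRASH",  # hypothetical event type name
    actor="nautilus",
    subjects=[
        Subject("file:///home/user/foo/bar.txt", "file:///home/user/foo"),
        Subject("file:///home/user/other/one.txt", "file:///home/user/other"),
    ],
)

origins = {s.origin for s in trash_event.subjects}
```

A single origin field on Event could not hold both values, which is why the origin sits on each subject in this model.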

Then there's the topic of full text searching. I think there are two sides to the matter you touch on in your comment: 1) in relation to origin, and 2) in the more general sense where we want full text search on any fields that make sense.

So for 1), one reason that the parent folder is useful for file events is that we can look for "hot origins". For example, we can detect that there has been a lot of activity in the folder ~/Projects/zeitgeist in the past two days. We could for instance introduce new ResultTypes for the FindEventIds() method, like ResultType.MostRecentOrigin and ResultType.MostPopularOrigin. Such things cannot be done easily with full text searching. It can be done, and it is called "clustering", but no desktop search engine provides that atm, and they likely won't in the near to mid-term future.
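
What "hot origins" could mean in practice is sketched below in plain Python. The two functions only mirror the proposed ResultType.MostPopularOrigin and ResultType.MostRecentOrigin ideas; they are not existing API, and the (timestamp, origin) tuples are invented sample data.

```python
from collections import Counter


def most_popular_origins(events, since, limit=3):
    """Origins ranked by number of events with timestamp >= since."""
    counts = Counter(origin for timestamp, origin in events if timestamp >= since)
    return [origin for origin, _ in counts.most_common(limit)]


def most_recent_origins(events, limit=3):
    """Distinct origins ordered by their latest event, newest first."""
    latest = {}
    for timestamp, origin in events:
        latest[origin] = max(timestamp, latest.get(origin, timestamp))
    return sorted(latest, key=latest.get, reverse=True)[:limit]


# (timestamp, origin) pairs, hypothetical sample data
events = [
    (100, "~/Projects/zeitgeist"),
    (110, "~/Documents"),
    (120, "~/Projects/zeitgeist"),
    (130, "~/Projects/zeitgeist"),
    (140, "~/Music"),
]
popular = most_popular_origins(events, since=105)
recent = most_recent_origins(events)
```

Both are simple exact-match aggregations over the origin value, which is the kind of strict query a database handles well.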

That said, I still think that a full text search could be useful, but I don't think we should intermix it with our FindEventIds() method. Full text searching is very different in nature from strict SELECTs on a DB. Maybe we can open a separate bug report to discuss full text searching of events?

Seif Lotfy (seif) wrote :

I reopened this bug because, after considering extending the event table with one more column, "current_uri", I see it as an opportunity to also create an event_origin column.
The reasoning behind that is that it will allow us to trace the origins of events, like "which website got me to this website".
This is a small use case, but I think it's worth it.
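
A sketch of the "which website got me to this website" use case, assuming each visit event carries its own origin. trace_back() is a hypothetical helper, the page names reuse the earlier page1.com/page2.com example, and page3.com/bar is made up.

```python
def trace_back(events, uri):
    """Follow event origins backwards from `uri` to the first page of the chain."""
    # Simplification: if a subject was visited several times, the last event wins.
    origin_of = {subject: origin for subject, origin in events}
    chain = [uri]
    # Stop when the origin is unknown or when we would loop back onto the chain.
    while chain[-1] in origin_of and origin_of[chain[-1]] not in chain:
        chain.append(origin_of[chain[-1]])
    return chain


# (subject, event origin) pairs for two visit events
visits = [
    ("page2.com/foo", "page1.com/links"),
    ("page3.com/bar", "page2.com/foo"),
]
path = trace_back(visits, "page3.com/bar")
```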

Changed in zeitgeist:
status: Invalid → New
Seif Lotfy (seif)
Changed in zeitgeist:
milestone: 0.3.0 → 0.8.0
importance: Medium → Undecided
Revision history for this message
Siegfried Gevatter (rainct) wrote :

So I think we decided to add origin to events in addition to what is there now?

summary: - origin should be a property of events, not items
+ Add origin as a property of events
Seif Lotfy (seif)
Changed in zeitgeist:
status: New → In Progress
Seif Lotfy (seif)
Changed in zeitgeist:
status: In Progress → Fix Committed
assignee: nobody → Siegfried Gevatter (rainct)
Seif Lotfy (seif)
Changed in zeitgeist:
status: Fix Committed → Fix Released