Ubuntu Accomplishments Daemon

Merge lp:~jtatum/ubuntu-accomplishments-daemon/lp-caching-part-1 into lp:ubuntu-accomplishments-daemon

Proposed by James Tatum on 2012-05-10

Status:	Rejected
Rejected by:	Jono Bacon on 2012-08-03
Proposed branch:	lp:~jtatum/ubuntu-accomplishments-daemon/lp-caching-part-1
Merge into:	lp:ubuntu-accomplishments-daemon
Diff against target:	126 lines (+117/-0), 2 files modified: accomplishments/lpdata.py (+35/-0), accomplishments/util/cacheddata.py (+82/-0)
To merge this branch:	bzr merge lp:~jtatum/ubuntu-accomplishments-daemon/lp-caching-part-1

Reviewer	Date Requested	Status
Jono Bacon	2012-05-10	Disapprove on 2012-08-03
Rafał Cieślak		Needs Information on 2012-06-08
Review via email: mp+105419@code.launchpad.net

Description of the change

This is part one of a set of changes to simplify Launchpad-related (and other) accomplishments and to cache Launchpad output so that we don't hit the servers so hard.

The basic premise here is to create an object that a) makes it easy to query team membership and other Launchpad data and b) stores the output locally rather than asking LP again from every script.

This enables code that executes pretty much instantly on cache hits and looks like:

>>> import accomplishments.lpdata
Daemon seems to be run not installed, branch base path used: /mnt/hgfs/jtatum/Projects/ubuntu-accomplishments-daemon
>>> me = accomplishments.lpdata.LPData.fetch('<email address hidden>')
>>> 'locoteams' in me.super_teams
True
>>> 'motu' in me.super_teams
False
>>> 'ubuntu-california' in me.direct_teams
True

lp:~jtatum/ubuntu-accomplishments-daemon/lp-caching-part-1 updated on 2012-05-10

41. By James Tatum on 2012-05-10: New editor was set to use tabs in some places. Hopefully fixed

James Tatum (jtatum) wrote on 2012-05-11:

I put this in the daemon code on the assumption that it would be useful to multiple consumers, but if we think that only ubuntu-community-accomplishments will need it then it probably belongs there.

Rafał Cieślak (rafalcieslak256) wrote on 2012-05-18:

Thanks for your work, James. I really like the idea of simplifying the scripts, and it seems you have done a pretty good job here.
However, as you have mentioned, it seems that these scripts will make sense only for the ubuntu-community collection. Could this code somehow be bundled with the accomplishment scripts within that collection, so that the daemon does not gain code that is useful only for one particular collection? If so, it would make much more sense to include it in that package instead of the daemon.

There is one more thing that worries me - the cache is not updated. Imagine this situation:
1. User A is not an Ubuntu Member.
2. The cache is created for him, and it remembers that he is not a Member.
3. He reads the instructions on how to become a Member, and soon accomplishes this trophy.
4. User A clicks 'check-accomplishment', or waits up to 15 minutes for the scripts to be run.
5. The daemon uses scripts to check if Ubuntu Member has been accomplished.
6. The scripts ask the cache if he is a member of 'ubuntu-members'
7. Obviously, the cache denies that, and so the user is unable to gain this trophy (and similarly a lot of others).

James Tatum (jtatum) wrote on 2012-05-18:

Hi Rafal,

Actually, the cache is only valid for one hour. So in about an hour, the trophy will be awarded. The timeout value can be changed in the library. I think once an hour is a good amount of time for beating up the launchpad servers to request data that doesn't change that often. Even a very active user might only change teams what.. 50 times a year? Actually, I worry that even one hour is too much if this becomes distributed with Ubuntu - the Launchpad servers could become inundated if they start getting a few hundred thousand of these requests per hour.
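
A minimal sketch of the expiry rule described here, assuming (as in the proposed cacheddata.py) that freshness is judged from the cache file's modification time against a one-hour lifespan. The helper names is_fresh and demo_expiry are illustrative and do not appear in the branch:

```python
import os
import tempfile
import time

# One hour, mirroring CACHE_LIFESPAN in the proposed branch.
CACHE_LIFESPAN = 60 * 60


def is_fresh(path, now=None, lifespan=CACHE_LIFESPAN):
    """Return True if path exists and was modified less than lifespan seconds ago."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing cache file is simply a cache miss.
        return False
    return abs(now - mtime) < lifespan


def demo_expiry():
    """Check a new file one second after writing, then as if 61 minutes had passed."""
    with tempfile.NamedTemporaryFile() as handle:
        written = os.path.getmtime(handle.name)
        return (is_fresh(handle.name, now=written + 1),
                is_fresh(handle.name, now=written + 61 * 60))
```

Under this rule, the Ubuntu Member scenario above resolves itself on the first check made more than an hour after the cache file was written, because that check falls through to a fresh Launchpad query.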

Also, I don't think this will be just for ubuntu-community-accomplishments. Any accomplishment set that uses Launchpad data should be using this cache. As an example, the Italian Loco's accomplishments mentioned at http://www.jonobacon.org/2012/04/04/quick-ubuntu-accomplishments-update/ should almost certainly be updated to use cached data.

But, that said, I don't know about putting it here. Where are the dbus client calls going for the new API?

Rafał Cieślak (rafalcieslak256) wrote on 2012-06-08:

James,
Sorry for not replying, Launchpad never notified me about the new comment :(

I must have missed the cache validity check. The cache is indeed valid for one hour, which makes sense. And in case we change our mind, this can be fine-tuned later.

I am still convinced that these enhancements should be shipped with the accomplishments collection rather than the core system. We aim to make the platform as little Ubuntu-oriented as possible, so that even a paragliding club or a ninja community could create collections that fully satisfy their needs. This means that such Launchpad-related calls should be avoided in the daemon itself. You are right that other accomplishments collections might want to use the very same cache, so I imagine ubuntu-it-community might want to include the same .py module(s) that ubuntu-community-accomplishments uses for caching, and both might work on the very same cache file.

Does that make sense?

review: Needs Information

James Tatum (jtatum) wrote on 2012-06-08:

If both are to use the same file, they really need to use the same source. The source is essentially implementing a schema in the cache, so if ubuntu-it-community and ubuntu-community-accomplishments are using different versions of the cache schema, they will definitely conflict in unpleasant ways.
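
A rough sketch of the conflict described here, assuming two collections that share one cache file but expect different schema versions. It uses JSON dictionaries rather than the pickled objects in the proposed branch, purely to keep the example self-contained; read_cache, write_cache and demo_conflict are hypothetical names:

```python
import json
import os
import tempfile


def read_cache(path, key, expected_version):
    """Return the cached record, or None (a cache miss) on any mismatch."""
    try:
        with open(path) as handle:
            record = json.load(handle)
    except (OSError, ValueError):
        return None
    if record.get("version") != expected_version or record.get("key") != key:
        return None
    return record


def write_cache(path, record):
    """Overwrite the cache file with a single record."""
    with open(path, "w") as handle:
        json.dump(record, handle)


def demo_conflict():
    key = "user@example.com"
    with tempfile.TemporaryDirectory() as cache_dir:
        shared = os.path.join(cache_dir, "lpdata")
        # Collection A stores a version 1 record.
        write_cache(shared, {"version": 1, "key": key, "super_teams": ["locoteams"]})
        # Collection B expects version 2, so it sees a miss and must refetch.
        b_first = read_cache(shared, key, expected_version=2)
        write_cache(shared, {"version": 2, "key": key, "super_teams": ["locoteams"],
                             "direct_teams": ["ubuntu-california"]})
        # B now hits, but A's next read misses and would overwrite the file again.
        b_hit = read_cache(shared, key, expected_version=2) is not None
        a_after = read_cache(shared, key, expected_version=1)
        return b_first, a_after, b_hit
```

In this simplified model the two collections never corrupt each other's data, but each invalidates the other's entry on every write, so neither gets a stable cache. With pickled objects, differing class definitions could fail in less graceful ways.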

If you are adamant about this not belonging here, then we need a new project that stores things common to any accomplishments that are launchpad based.

Rafał Cieślak (rafalcieslak256) wrote on 2012-06-11:

This is a valid point, and I have to agree that particular collections shouldn't work on the same cache file. I guess the most elegant solution then would be to simply use separate files for each collection. I don't really think we need another project for that; this really could be done on the scripts' side. The scripts are responsible for fetching the data, so ideally they would take care of their cache too. I imagine such a caching mechanism could even be merged into a single file (or a few) that would be shipped alongside our current scripts and imported by them, so that it would be easily reusable in other collections.

As a side note, since we now have a lot of AU accomplishments, they would make great use of their own cache, since they all download the very same file.

James Tatum (jtatum) wrote on 2012-06-21:

To the contrary - I think they all should use the same cache file! :) This code and code like it should be a common library. Why have n separate caches for the same data?

Rafał Cieślak (rafalcieslak256) wrote on 2012-07-11:

That would be possible only in an ideal situation, but as you've pointed out yourself, the cache schema may differ, so the safest solution is to keep them separate.

After all, this is not a big deal. I do not expect anyone will ever have more than 3 accomplishment collections installed that use the Launchpad API. Very few collections, and only Ubuntu-related ones, will make use of it.

I am still convinced that the simplest and easiest to manage solution is to implement the caching algorithm on the scripts' side. The daemon should at most point the scripts to their collection's cache directory, and should not contain any code that works in favor of Ubuntu-affiliated scripts. All of that can be done very simply as a single .py file shipped within an accomplishments collection. This may result in more than one cache for the same thing, but that will very rarely matter, and even then it will never cause problems.
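
A small sketch of what a per-collection cache directory could look like, assuming it follows the same XDG_CACHE_HOME lookup as the proposed cacheddata.py. The function name collection_cache_dir and the accomplishments/<collection> layout are hypothetical and are not taken from the daemon:

```python
import os


def collection_cache_dir(collection, environ=None):
    """Return a cache directory path that is private to one collection."""
    environ = os.environ if environ is None else environ
    # The XDG base directory spec falls back to ~/.cache when the
    # variable is unset or empty.
    base = environ.get("XDG_CACHE_HOME") or os.path.join("~", ".cache")
    return os.path.join(os.path.expanduser(base), "accomplishments", collection)
```

With a layout like this, ubuntu-community-accomplishments and ubuntu-it-community would each read and write under their own directory, at the cost of fetching the same Launchpad data once per collection.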

James Tatum (jtatum) wrote on 2012-07-11:

I guess I just don't understand the resistance to it. Don't repeat yourself :) If the code has a bug or needs an update, it has to be updated.. everywhere. The only mitigating argument is that this is already the case for the individual accomplishment scripts - and I think that should be changed too :)

I am with you about the daemon not being the right place for this type of code. So, let's spin up a new package that all Launchpad related accomplishments can depend on. Any suggestions on the name?

Jono Bacon (jonobacon) wrote on 2012-08-03:

Rafal has landed caching support in UCA, which I believe is based on your work, so I believe this MP is no longer needed. Thanks, James!

review: Disapprove

Jono Bacon (jonobacon) on 2012-08-03:

review: Disapprove

Unmerged revisions

41. By James Tatum on 2012-05-10: New editor was set to use tabs in some places. Hopefully fixed
40. By James Tatum on 2012-05-10: Removing some extra whitespace
39. By James Tatum on 2012-05-10: First part of LaunchPad data caching mechanism and accomplishment simplification
38. By Launchpad Translations on behalf of jonobacon on 2012-05-07: Launchpad automatic translations update.

Preview Diff

Subscribers

People subscribed via source and target branches

to all changes:

James Tatum

Jono Bacon

Matt Fischer

=== added file 'accomplishments/lpdata.py'
--- accomplishments/lpdata.py	1970-01-01 00:00:00 +0000
+++ accomplishments/lpdata.py	2012-05-10 23:04:19 +0000
@@ -0,0 +1,35 @@
+from launchpadlib.launchpad import Launchpad
+
+from util.cacheddata import CachedData
+
+
+class LPData(CachedData):
+    """Force launchpadlib to evaluate some of the lazy data it knows about a
+       user and store it in member variables.
+    """
+    # If you add any fields to the LPData object, increment this number to
+    # ensure the cache is invalidated on update
+    VERSION = 1
+
+    def __init__(self):
+        super(LPData, self).__init__()
+        # This is a list of all teams the user is a direct or indirect member
+        self.super_teams = []
+        # Store a list of teams user is a direct member of
+        self.direct_teams = []
+        self.name = None
+
+    @classmethod
+    def populate(cls, email):
+        """Return a new LPData object populated from the key, which is the
+        user's email address.
+        """
+        data = LPData()
+        lp = Launchpad.login_anonymously('ubuntu-community accomplishments',
+                                         'production')
+        user = lp.people.getByEmail(email=email)
+        data.name = user.name
+        data.key = email
+        data.super_teams = [i.name for i in user.super_teams]
+        data.direct_teams = [i.team.name for i in user.memberships_details]
+        return data
=== added file 'accomplishments/util/cacheddata.py'
--- accomplishments/util/cacheddata.py	1970-01-01 00:00:00 +0000
+++ accomplishments/util/cacheddata.py	2012-05-10 23:04:19 +0000
@@ -0,0 +1,82 @@
+try:
+    import cPickle as pickle
+except ImportError:
+    import pickle
+import os
+import os.path
+import time
+
+from launchpadlib.launchpad import Launchpad
+
+
+CACHE_LIFESPAN = 60*60
+
+
+class CachedData(object):
+    """CachedData is a parent object which allows inheriting objects to perform
+       some costly fetch operation and cache the output.
+
+       Child objects should create a populate(key) method. This method will
+       perform the costly retrieval and store the data in member variables.
+       Calling the fetch(key) classmethod will check the cache for this data
+       and return it if the following conditions are met:
+
+       1) The cache file exists
+       2) The cache is not stale (see the CACHE_LIFESPAN module global)
+       3) The key in the cache matches the requested key
+       4) The version of data in the cache is equal to the defined class
+          VERSION
+
+       If a condition is not met, the populate(key) method is called to fetch
+       the data and it is stored in the cache.
+    """
+    def __init__(self):
+        self.key = None
+        self.version = self.VERSION
+
+    def __repr__(self):
+        cls = self.__class__
+        return '<%s.%s object - %r>' % (cls.__module__,
+                                        cls.__name__,
+                                        self.name)
+
+    @classmethod
+    def fetch(cls, key):
+        """Fetch data from the cache or from a costly source.
+
+           This will call the populate(key) method on derived classes. That
+           method should return a populated object of the derived class.
+           This method will then store the object in the cache for later
+           retrieval.
+        """
+        # basedir spec says to check $XDG_CACHE_HOME for the location of
+        # the user cache dir first. Failing that, it's just ~/.cache.
+        try:
+            cache_dir = os.environ['XDG_CACHE_HOME']
+        except KeyError:
+            cache_dir = '~/.cache'
+
+        cache_dir = os.path.expanduser(cache_dir)
+        cache_dir = os.path.join(cache_dir, 'accomplishments')
+        if not os.path.exists(cache_dir):
+            os.makedirs(cache_dir)
+
+        cache_file = os.path.join(cache_dir, cls.__name__.lower())
+
+        if os.path.exists(cache_file):
+            mtime = os.path.getmtime(cache_file)
+            if abs(time.time() - mtime) < CACHE_LIFESPAN:
+                with open(cache_file, 'rb') as input:
+                    obj = pickle.load(input)
+                    if cls.VERSION == obj.version:
+                        if obj.key == key:
+                            # Cache hit. All conditions met.
+                            return obj
+
+        # Cache miss. Call populate()
+        obj = cls.populate(key)
+
+        with open(cache_file, 'wb') as output:
+            pickle.dump(obj, output)
+
+        return obj
