LAVA Core (deprecated)

Merge lp:~zyga/lava-core/history-and-logging into lp:lava-core

history-and-logging
Merge into trunk

Proposed by Zygmunt Krynicki on 2012-05-07

Status:	Merged
Merged at revision:	8
Proposed branch:	lp:~zyga/lava-core/history-and-logging
Merge into:	lp:lava-core
Diff against target:	956 lines (+929/-2), 4 files modified
	lava/core/history.py (+522/-0)
	lava/core/logging.py (+120/-0)
	lava/core/tests/test_history.py (+278/-0)
	setup.py (+9/-2)
To merge this branch:	bzr merge lp:~zyga/lava-core/history-and-logging

Reviewer	Review Type	Date Requested	Status
Linaro Validation Team		2012-05-07	Pending
Review via email: mp+104882@code.launchpad.net

Description of the change

This is a re-spin of an earlier merge proposal (https://code.launchpad.net/~zkrynicki/lava-core/logging-mixin/+merge/102339)

Again, to quote my extensive commit message:

Add the logging and history modules

  Both modules cooperate on a centralized logging interface and have largely
  identical APIs. The LoggingMixIn class adds self.history and self.logging
  properties to the class it is injected into.

  Both properties return proxy objects that ultimately call classic python
  logging and lava-core'esque history. History is much like logging but has two
  advantages over plain logging.

  The first advantage is that it was designed up front for being hierarchical.
  This naturally constructs a tree of activities. I believe that such structure
  can be visualized much better than a plain log file ever could. Any composite
  operation also computes and records the duration of all the activities it
  contains. This yields much more transparent duration awareness of entire
  processes that lava performs internally.

  The second advantage is that it can carry more data than regular python logging
  can. All arguments to the message template are passed to the string.format()
  function as keywords. This allows us to transparently provide key-value data
  alongside each history entry. This makes automatic analysis and data retrieval
  much easier. It also allows for new data logging features such as downloading a
  file and storing the speed history as an array in the activity meta-data. Such
  an object can be rendered with appropriate care by the front end, thus making
  the user interface far richer than we could previously attempt without extreme
  hacks and side-band, custom communication channels. Another feature that takes
  advantage of this rich meta-data is attachments. Any activity can have a
  collection of attachments that can be presented in a web front-end (attachments
  also carry explicit MIME type). LAVA Core currently uses this to ensure we
  capture full, uncorrupted output of any external programs we run.

  The history module offers utility functions that can render any activity (or
  the convenience lava.core.history.history.top activity which records all known
  history of the process) as JSON or plain text (in several formats). In addition
  both formats can be saved, along with all the attachments, in a history
  tarball. This functionality is exploited in the top-level lava command line
  program, which allows the user to record the history of any LAVA commands, for
  visualization, debugging, rich analysis or other, yet-unknown purpose.
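
The two advantages described above (a tree of activities, and messages rendered from keyword meta-data) can be illustrated with a minimal sketch. The class below is a hypothetical, heavily simplified stand-in written for this illustration; it is not the Activity class from the proposed lava/core/history.py, and the URL and meta-data keys are made up.

```python
import datetime


class Activity:
    """Hypothetical, simplified stand-in for an activity in the history tree."""

    def __init__(self, message, **meta):
        self.raw_message = message   # template with str.format() references
        self.meta = meta             # key-value data carried with the entry
        self.children = []           # nested sub-activities
        self.start = datetime.datetime.now(datetime.timezone.utc)
        self.end = self.start

    @property
    def message(self):
        # Meta-data is passed to str.format() as keywords; unused keys are fine.
        return self.raw_message.format(**self.meta)

    @property
    def duration(self):
        return self.end - self.start

    def render(self, depth=0):
        # Produce one indented line per activity, depth first.
        lines = ["  " * depth + self.message]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


top = Activity("Downloading {url}", url="http://example.org/image.img")
top.children.append(Activity("Connected to {host}", host="example.org"))
top.children.append(Activity("Received {size} bytes", size=1024))
# Extra meta-data that the template does not mention is still stored.
top.meta["speed_history"] = [100, 250, 300]
rendered = top.render()
```

Because the meta-data stays available as a dictionary, a front end could read speed_history directly instead of parsing it out of a log line.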

lp:~zyga/lava-core/history-and-logging updated on 2012-05-07

9. By Zygmunt Krynicki on 2012-05-07: Fix module docstrings not to refer to lava.core.utils anymore
10. By Zygmunt Krynicki on 2012-05-07: Update docstrings of misc logging functions

Revision history for this message

Michael Hudson-Doyle (mwhudson) wrote on 2012-05-08:

[And once more, to the merge proposal as well :(]

Hi. This overall seems reasonably nice. With things like this I always
wonder if we'll ever get around to building the nice UIs that are
enabled...

Some general concerns:

1) Incremental output on the scheduler job page is _massively_ useful
   and losing it even temporarily is not acceptable, IMHO. I don't
   quite see how this meshes with this code. I guess we could still
   define a logging handler that outputs log messages to stdout/stderr
   and that will display nicely in the scheduler view until we get
   around to doing something more sophisticated there.

2) I can see having two sets of debug/info/warning/error functions with
   slightly different APIs (% vs .format() interpolation) being trying.
   But maybe this will be a non-issue -- I guess usually the code
   will be saying self.history.info() or self.logger.info() and the
   logger vs history API will be clear.

3) I'm just a little worried by memory consumption -- it seems that
   there is the potential to end up with a _lot_ of objects hanging
   around, particularly if you hang on to tracebacks. It might be worth
   stringifying things a little earlier. I guess this won't affect the
   API and so can be done later...
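
For concerns 1 and 2, a minimal sketch of the kind of handler meant here, using only the standard library. The function name attach_incremental_handler, the logger name and the board name are assumptions made for illustration; nothing like this exists in the branch under review. The last lines contrast the two interpolation styles.

```python
import io
import logging
import sys


def attach_incremental_handler(logger, stream=None):
    """Emit each record on `logger` to `stream` (default: stdout) immediately."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return handler


# An in-memory stream stands in for stdout so the result can be inspected.
captured = io.StringIO()
logger = logging.getLogger("lava.sketch")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
attach_incremental_handler(logger, captured)

# Classic logging API: %-style template with positional arguments.
logger.info("booting %s", "panda01")
# History-style API: str.format() template with keyword arguments.
history_message = "booting {board}".format(board="panda01")
```

With the default stream, each message reaches stdout as soon as it is logged, which is what the scheduler view would need for incremental display.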

On Mon, 07 May 2012 09:49:18 -0000, Zygmunt Krynicki <zygmunt.krynicki@canonical.com> wrote:
> Zygmunt Krynicki has proposed merging lp:~zkrynicki/lava-core/history-and-logging into lp:lava-core.
> 
> Requested reviews:
>   Linaro Validation Team (linaro-validation)
> 
> For more details, see:
> https://code.launchpad.net/~zkrynicki/lava-core/history-and-logging/+merge/104882
> 
> This is a re-spin of earlier merge proposal (https://code.launchpad.net/~zkrynicki/lava-core/logging-mixin/+merge/102339)
> 
> Again, to quote my extensive commit message:
> 
>   Add the logging and history modules
>   
>   Both modules cooperate on a centralized logging interface and have largely
>   identical APIs. The LoggingMixIn class adds self.history and self.logging
>   properties to the class it is injected into.
>   
>   Both properties return proxy objects that ultimately call classic python
>   logging and lava-core'esque history. History is much like logging but has two
>   advantages over plain logging.
>   
>   The first advantage is that it was designed up front for being hierarchical.
>   This naturally constructs a tree of activities. I believe that such structure
>   can be visualized much better than a plain log file ever could. Any composite
>   operation also computes and records the duration of all the activities it
>   contains. This yields much more transparent duration awareness of entire
>   processes that lava performs internally.
>   
>   The second advantage is that it can carry more data than regular python logging
>   can. All arguments to the message template are passed to the string.format()
>   function as keywords. This allows us to transparently provide key-value data
>   alongside each history entry. This makes automatic analysis and data retrieval
>   much easier. It also allows for new data logging features such as downloading a
>   file and storing the speed history as an array in the activity meta-data. Such
>   an object can be rendered with appropriate care by the front end, thus making
>   the user interface far richer than we could previously attempt without extreme
>   hacks and side-band, custom communication channels. Another feature that takes
>   advantage of this rich meta-data is attachments. Any activity can have a
>   collection of attachments that can be presented in a web front-end (attachments
>   also carry explicit MIME type). LAVA Core currently uses this to ensure we
>   capture full, uncorrupted output of any external programs we run.
>   
>   The history module offers utility functions that can render any activity (or
>   the convenience lava.core.history.history.top activity which records all known
>   history of the process) as JSON or plain text (in several formats). In addition
>   both formats can be saved, along with all the attachments, in a history
>   tarball. This functionality is exploited in the top-level lava command line
>   program, which allows the user to record the history of any LAVA commands, for
>   visualization, debugging, rich analysis or other, yet-unknown purpose.
> 
> -- 
> https://code.launchpad.net/~zkrynicki/lava-core/history-and-logging/+merge/104882
> You are subscribed to branch lp:lava-core.
> === added file 'lava/core/history.py'
> --- lava/core/history.py	1970-01-01 00:00:00 +0000
> +++ lava/core/history.py	2012-05-07 09:48:17 +0000
> @@ -0,0 +1,504 @@
> +# Copyright (C) 2011-2012 Linaro Limited
> +# vim: set fileencoding=utf8 :
> +#
> +# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
> +#
> +# This file is part of lava-core
> +#
> +# lava-core is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU Lesser General Public License version 3
> +# as published by the Free Software Foundation
> +#
> +# lava-core is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public License
> +# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
> +
> +from __future__ import absolute_import, print_function
> +
> +"""
> +lava.core.utils.history
> +=======================
> +
> +Implementation of the hierarhical history
> +"""
> +
> +from logging import DEBUG, INFO, WARNING, ERROR, CRITICAL
> +import collections
> +import datetime
> +import os
> +import sys
> +import tarfile
> +import traceback
> +
> +from json_document.serializers import JSON
> +from json_schema_validator.extensions import (
> +    datetime_extension, timedelta_extension)
> +import simplejson
> +
> +
> +class DictProxy(object):

A short docstring would be nice.  I guess "I wish I was writing
Javascript" would not be helpful :-)

> +    def __init__(self, d):
> +        object.__setattr__(self, "_d", d)
> +
> +    def as_dict(self):
> +        return object.__getattribute__(self, "_d")

"return self._d" works here. __getattr__ is only called for attributes
that are not found by the usual mechanisms.  I guess we'd better hope
that the dictionary does not have '_d' as a key :-)
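
A minimal sketch of the simplification suggested here, written for current Python. Only the constructor still needs object.__setattr__, because the overridden __setattr__ would otherwise write "_d" into the wrapped dictionary; every read of self._d is served by the normal lookup and never reaches __getattr__.

```python
class DictProxy:
    """Attribute-style view over a dictionary (simplified sketch)."""

    def __init__(self, d):
        # Bypass our own __setattr__, which would store "_d" in the dict.
        object.__setattr__(self, "_d", d)

    def as_dict(self):
        # Normal attribute lookup finds _d on the instance.
        return self._d

    def __getattr__(self, name):
        # Only called when normal lookup fails, so _d never lands here.
        try:
            return self._d[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._d[name] = value


backing = {"speed": 10}
proxy = DictProxy(backing)
proxy.size = 3  # written through to the backing dictionary
```

A "_d" key in the wrapped dictionary would simply be unreachable through attribute access, since the instance attribute wins.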

> +    def __getattr__(self, name):
> +        try:
> +            return self.as_dict()[name]
> +        except KeyError:
> +            raise AttributeError(name)
> +
> +    def __setattr__(self, name, value):
> +        self.as_dict()[name] = value

Similarly "self._d[name] = value".

> +
> +
> +ExceptionInfo = collections.namedtuple(
> +    "ExceptionInfo", "filename lineno function text")

I don't think this helps.  See below.

> +
> +_name_and_back = {
> +    DEBUG: "DEBUG",
> +    INFO: "INFO",
> +    WARNING: "WARNING",
> +    ERROR: "ERROR",
> +    CRITICAL: "CRITICAL",
> +    "DEBUG": DEBUG,
> +    "INFO": INFO,
> +    "WARNING": WARNING,
> +    "ERROR": ERROR,
> +    "CRITICAL": CRITICAL
> +}

This duplicates logging._levelNames.  But I guess the underscore scares
one away from that...

You also only seem to use the DEBUG -> "DEBUG" side of it.
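
Since only the number-to-name direction is used, the public logging.getLevelName() function would cover it without touching the private table. A minimal sketch (the helper name level_name is an assumption):

```python
import logging


def level_name(level):
    """Return the standard name for a numeric logging level."""
    return logging.getLevelName(level)


debug_name = level_name(logging.DEBUG)
critical_name = level_name(logging.CRITICAL)
```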

> +class Activity(object):
> +
> +    def __init__(self, level, message, **meta):
> +        """
> +        Create a new activity with the specified message, level and
> +        initial meta-data
> +        """
> +        # Message that has format() references to meta-data
> +        self._level = level
> +        self._message = message
> +        self._meta = DictProxy(meta)
> +        # File attachments
> +        self._attachments = []
> +        # Children activities
> +        self._sub_activities = []
> +        # Start and end timestamps
> +        self._start = datetime.datetime.utcnow()
> +        self._end = self._start
> +        # Exception + traceback
> +        self._exc_value = None
> +        self._exc_type = None
> +        self._traceback = None
> +
> +    @property
> +    def raw_message(self):
> +        """
> +        the raw, unformatted message
> +        """
> +        return self._message
> +
> +    @property
> +    def message(self):
> +        """
> +        rendered message
> +
> +        Rendering substitutes str.format() style variables with data from
> +        self.meta.
> +        """
> +        return self._message.format(**self._meta.as_dict())
> +
> +    @property
> +    def level(self):
> +        """
> +        level number, 10 (DEBUG) - 50 (CRITICAL)
> +        """
> +        return self._level
> +
> +    @property
> +    def level_name(self):
> +        """
> +        name of the level "DEBUG" ... "CRITICAL"
> +        """
> +        return _name_and_back[self._level]
> +
> +    @property
> +    def meta(self):
> +        """
> +        collection of meta-data
> +
> +        Meta-data serves two purposes: to fill in data to the message
> +        template and to store arbitrary auxiliary data. This property
> +        returns a dictionary proxy that looks like an object (has normal
> +        attribute accessors) but still talks to the real dictionary behind
> +        the scenes. You can both read and write/set attributes at will.
> +        """
> +        return self._meta
> +
> +    def attach(self, stream, name=None,  mime_type='text/plain'):
> +        """
> +        Add an attachment.
> +
> +        Everything is extracted from the file-like object stream.

This is dishonest: it had better be a _file_, or save_as_tarball won't
work (stream.name has to name a path on disk for tarball.add() to work,
surely?).

> If you care
> +        about a particular name of the attachment it an be specified with name.
> +        MIME type can help the web front-end to render the content. It defaults
> +        to text/plain which is suitable for most log files.
> +        """
> +        if name is None:
> +            name = os.path.basename(stream.name)
> +        self._attachments.append(
> +            DictProxy({
> +                'name': name,
> +                'orig_name': stream.name,
> +                'mime_type': mime_type}))

I don't know that it's worth using DictProxy here as you only use this
once...

> +    @property
> +    def duration(self):
> +        """
> +        Duration of the activity.
> +
> +        If the activity has not started the duration is always zero.
> +        Activities that are in progress will observe the duration to grow.
> +        """
> +        if self._start is None:
> +            return datetime.timedelta()
> +        if self._end is None:
> +            return datetime.datetime.utcnow() - self._start
> +        return self._end - self._start
> +
> +    def as_json(self):
> +        """
> +        Convert this Activity to JSON

Meep.  This doesn't return JSON (JSON is the _string_).  Maybe
"prepare_for_json" ?

> +        """
> +        obj = simplejson.OrderedDict((
> +            ("level", self.level_name),
> +            ("message", self._message),
> +            ("meta", self.meta.as_dict()),

Something somewhere is going to break if meta contains things that can't
be JSON-ed, right?

> +            ("start", (datetime_extension.to_json(self._start)
> +                       if self._start is not None else None)),
> +            ("duration", timedelta_extension.to_json(self.duration)),
> +            ("end", (datetime_extension.to_json(self._end)
> +                     if self._end is not None else None)),
> +            ("exc_value", self._exc_value),
> +            ("exc_type", (self._exc_type.__name__
> +                          if self._exc_type else None)),
> +            ("traceback", ([ExceptionInfo(*ei) for ei in
> +                            traceback.extract_tb(self._traceback)]
> +                           if self._traceback else None)),

This seems a bit of a complicated way of saying "ei.filename ei.lineno
ei.function ei.text for ei in ..."
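
This also ties in with the memory concern above: a sketch of reducing the traceback to plain data as soon as the exception is seen, so that no traceback object (and none of the frames it references) is kept alive. The Recorder class is hypothetical and written for current Python, where traceback.extract_tb() yields entries with filename, lineno, name and line attributes.

```python
import traceback


class Recorder:
    """Hypothetical context manager that stores exception data as plain values."""

    def __init__(self):
        self.exc_type_name = None
        self.exc_text = None
        self.frames = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is not None:
            # Keep strings and numbers only; drop the live traceback object.
            self.exc_type_name = exc_type.__name__
            self.exc_text = str(exc_value)
            self.frames = [
                {"filename": entry.filename, "lineno": entry.lineno,
                 "function": entry.name, "text": entry.line}
                for entry in traceback.extract_tb(tb)]
        return False  # do not swallow the exception


recorder = Recorder()
try:
    with recorder:
        raise ValueError("boom")
except ValueError:
    pass
```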

> +            ("attachments", [
> +                attachment.as_dict() for attachment in self._attachments]),
> +            ("sub_activities", [
> +                activity.as_json() for activity in self._sub_activities])))
> +        return _build_json(obj)

It strikes me that this would be more proof against odd things in meta
and a bit simpler if _build_json (which also has a bad name) handled
datetimes and timedeltas.
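
A rough sketch of what that could look like, written for current Python. The name prepare_for_json follows the suggestion above; the ISO 8601 and seconds encodings are assumptions and are not necessarily what datetime_extension and timedelta_extension produce.

```python
import datetime
import json
from collections import OrderedDict


def prepare_for_json(obj):
    """Return a structure json.dumps() accepts, dropping None and empty values."""
    if isinstance(obj, dict):
        result = OrderedDict()
        for key, value in obj.items():
            if value is None or value == [] or value == {}:
                continue
            result[str(key)] = prepare_for_json(value)
        return result
    if isinstance(obj, (list, tuple)):
        # Tuples (including namedtuples) become plain lists.
        return [prepare_for_json(item) for item in obj]
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()       # assumed encoding
    if isinstance(obj, datetime.timedelta):
        return obj.total_seconds()   # assumed encoding
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    # Last resort for anything odd that ends up in meta.
    return str(obj)


data = prepare_for_json(OrderedDict((
    ("start", datetime.datetime(2012, 5, 7, 9, 48, 17)),
    ("duration", datetime.timedelta(seconds=2)),
    ("end", None),
    ("frames", [("a.py", 3)]),
)))
encoded = json.dumps(data)
```

With the conversions in one place, as_json() would not need per-field special cases, and datetimes that sneak into meta would be handled the same way.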

> +    def __str__(self):
> +        """"
> +        Returns self.message
> +        """
> +        return self.message
> +
> +    def save_as_tarball(self, scratch, tarball_stream, mode='w:gz'):
> +        """
> +        Save activity history as a tarball.
> +
> +        The tarball will have two files 'history.txt' and 'history.json' with
> +        appropriate encoding of this activity, as well as all sub-activities.
> +        Any attachments are added as well.
> +        """
> +        with tarfile.open(tarball_stream.name, mode=mode,
> +                          fileobj=tarball_stream) as tarball:
> +            with scratch.open('history.txt') as stream:
> +                self.print_(stream=stream)

Heh ordering snafu?  scratch isn't introduced yet.  Never mind though.

It's never occurred to me before, but it's pretty weird to see code of
the form:

with ${a}.open(${b}) as var:
    ... code that writes to var ...

This is the tempfile module's fault, not yours I guess :)

> +            tarball.add(stream.name, 'history.txt')
> +            with scratch.open('history.json') as stream:
> +                JSON.dump(stream, self.as_json())
> +            tarball.add(stream.name, 'history.json')
> +            self._save_tarball_attachments(tarball)
> +
> +    def _save_tarball_attachments(self, tarball):
> +        for attachment in self._attachments:
> +            if os.path.isfile(attachment.orig_name):
> +                tarball.add(attachment.orig_name, attachment.name)
> +        for activity in self._sub_activities:
> +            activity._save_tarball_attachments(tarball)

Err, you'd better hope that no two attachments have the same name or are
called 'history.json' or 'history.txt' here!
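
One possible way around the collisions, sketched under the assumption that renaming attachments inside the tarball is acceptable. The helper unique_archive_names and its numbering scheme are made up for illustration.

```python
import os


def unique_archive_names(names, reserved=("history.txt", "history.json")):
    """Return archive names that clash neither with each other nor with `reserved`."""
    used = set(reserved)
    result = []
    for name in names:
        candidate = name
        counter = 1
        while candidate in used:
            # Insert a counter before the extension until the name is free.
            root, ext = os.path.splitext(name)
            candidate = "{0}-{1}{2}".format(root, counter, ext)
            counter += 1
        used.add(candidate)
        result.append(candidate)
    return result


safe_names = unique_archive_names(["log.txt", "log.txt", "history.json"])
```

Putting all attachments under a dedicated directory inside the tarball would be another option for keeping them away from the two history files.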

I'm a bit uneasy about how this attachment thing is going to work out in
practice.  When a job has finished, it will create a tarball, that ends
up in the media files of lava-server and the scheduler job view will
fish the history.json out of this tarball to know how to render the
basic outline, include things that will do ajax to another view that
fishes other files out of the tarball to either be downloaded or shown
inline in the page?  But we'd like to have incremental output and I
don't really like the idea of a page for a completed job being
completely different to that for a job that's still running.

It just feels like the bits don't quite line up here.

> +    def __repr__(self):
> +        return "<Activity: %r>" % str(self)
> +
> +    def __enter__(self):
> +        """
> +        Begins a 'long' activity.
> +
> +        Long activities have both start and end, and thus, non-zero duration.
> +        This is called automatically by History.__enter__ so you won't usually
> +        have to use it explicitly.
> +
> +        Here we reset self.end to None (it was initialized to 'now' in
> +        __init__)
> +        """
> +        self._end = None
> +        return self
> +
> +    def __exit__(self, exc_type, exc_value, traceback):
> +        """
> +        Terminates a 'long' activity.
> +
> +        Here we set self.end to 'now' and record any exception information.
> +        See __enter__() for more information on 'long' activities.
> +        """
> +        self._end = datetime.datetime.utcnow()
> +        self._exc_value = exc_value
> +        self._exc_type = exc_type
> +        self._traceback = traceback
> +
> +    def print_(self, stream=None, print_meta=False, print_children=True,
> +               stack=None):
> +        """
> +        Print a nice Unicode tree of activities.
> +
> +        This is very useful for tracing and debugging.
> +        """
> +        if stack is None:
> +            stack = []
> +        if stream is None:
> +            stream = sys.stdout
> +        fragments = []
> +        if print_meta is False:

Please don't compare immutable values with 'is'.

In the code as submitted, there's no way for print_meta to be True (or
print_children to be False, come to that).  Will this change in a later
branch?
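
A small sketch of the truthiness test meant here. The function header_fragments is hypothetical and only mirrors the first fragment built by print_().

```python
def header_fragments(level_name, print_meta=False):
    """Build the leading fragment the way print_() does, using a truthiness test."""
    fragments = []
    # "not print_meta" also covers falsy values such as 0 or None,
    # which "print_meta is False" would treat as true.
    if not print_meta:
        fragments.append("{0:7}".format(level_name))
    return fragments


default_header = header_fragments("INFO")
zero_header = header_fragments("INFO", print_meta=0)
meta_header = header_fragments("INFO", print_meta=True)
```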

> +            fragments.append("{0:7}".format(self.level_name))
> +            fragments.append(" | ")
> +            fragments.append("{start}.{micro:06}".format(
> +                start=self._start.strftime("%Y-%m-%d %T"),
> +                micro=self._start.microsecond))
> +            fragments.append(" | ")
> +            if self.duration.total_seconds() > 0:
> +                fragments.append("{0:14}s".format(self.duration))
> +            else:
> +                fragments.append("{0:15}".format(""))
> +            fragments.append(" | ")
> +            fragments.append(''.join(self._tree_indent(stack)))
> +        # Process the message to make it safer
> +        message = self.message
> +        message = message.replace("\n", " ").replace("\r", " ")
> +        if isinstance(message, unicode):
> +            message = message.encode(stream.encoding or "UTF-8", 'replace')
> +        fragments.append(message)
> +        if self._exc_value:
> +            fragments.append(", ***crashed*** {0}".format(self._exc_value))
> +        print("".join(fragments), file=stream)
> +        if print_meta:
> +            print("\tstart: {0}".format(self._start))
> +            if self.duration.total_seconds() > 0:
> +                print("\tduration: {0}".format(self.duration))
> +            if self._end is not None and self._end != self._start:
> +                print("\tend: {0}".format(self._end))
> +            if self._exc_value is not None:
> +                print("\texc_value: {0}".format(self._exc_value))
> +            if self._exc_type is not None:
> +                print("\texc_type: {0}".format(self._exc_type))
> +            if self._traceback is not None:
> +                for index, exc_info in enumerate(self._traceback):
> +                    print("\ttraceback.{0}.filename: {1}".format(
> +                        index, self._traceback.filename))
> +                    print("\ttraceback.{0}.lineno: {1}".format(
> +                        index, self._traceback.lineno))
> +                    print("\ttraceback.{0}.function {1}".format(
> +                        index, self._traceback.function))
> +                    print("\ttraceback.{0}.text {1}".format(
> +                        index, self._traceback.text))
> +            for key, value in self.meta.as_dict().iteritems():
> +                # Skip one special key used by logging integration
> +                if key == "__prefmt_msg":
> +                    continue
> +                print("\t{key}: {value!r}".format(key=key, value=value))
> +        if print_children:
> +            pos = len(stack)
> +            stack.append(len(self._sub_activities))
> +            for sub in self._sub_activities:
> +                stack[pos] -= 1
> +                sub.print_(stream, print_meta, print_children, stack)
> +            stack.pop()
> +
> +    @staticmethod
> +    def _tree_indent(stack, use_unicode=False):
> +        """
> +        Compute the indentation part of the tree displayed by print_.
> +        """
> +        # The code here might be tricky. It's easier to understand if you keep
> +        # in mind that stack[n] tells you how many more entries are left in the
> +        # tree at that stack depth. The stack is constructed by print_()

I haven't really tried to understand the printing code, sorry.

> +        for items_left in stack[:-1]:
> +            # FIXME: this is broekn
> +            if items_left == 0:
> +                yield "    "
> +            elif items_left == 1:
> +                if use_unicode:
> +                    yield "└   "

Wow, this code confused emacs when reading the original email -- I guess
it didn't know it was utf-8.  As with print_meta I don't see a way for
use_unicode to be True, and would sort of prefer to avoid unicode here I
think.

> +                else:
> +                    yield "\   "
> +            elif items_left > 1:
> +                if use_unicode:
> +                    yield "│   "
> +                else:
> +                    yield "|   "
> +        for items_left in stack[-1:]:
> +            if items_left == 0:
> +                if use_unicode:
> +                    yield "└── "
> +                else:
> +                    yield "\-- "
> +            elif items_left > 0:
> +                if use_unicode:
> +                    yield "├── "
> +                else:
> +                    yield "|-- "
> +
> +
> +def _build_json(obj):
> +    """
> +    Helper to construct json that keeps ordering and discards None values

As above: this doesn't return JSON.

> +    """
> +    if isinstance(obj, (dict, simplejson.OrderedDict)):
> +        new_obj = simplejson.OrderedDict()
> +        for key, value in obj.iteritems():
> +            if value is None or value == [] or value == {}:
> +                continue
> +            else:
> +                new_obj[str(key)] = _build_json(value)
> +        return new_obj
> +    elif isinstance(obj, list):
> +        return [_build_json(item) for item in obj]
> +    elif (isinstance(obj, (int, float, str, unicode, list, dict, bool))
> +          or obj is None):
> +        return obj

er, list is in here twice.  I don't understand how ExceptionInfo (or
tuple come to that) passed through here.  I think you should delete
ExceptionInfo anyway, but you probably want to handle tuples?

> +    else:
> +        try:
> +            return str(obj)
> +        except Exception:
> +            raise ValueError("cannot convert {0!r} to JSON".format(obj))
> +
> +
> +class History(object):
> +    """
> +    History class, provides helper methods for hierarchical logging.
> +    Maintains a stack of activities that are nested with context
> +    management.
> +    """
> +
> +    def __init__(self, top):
> +        """
> +        Create a History that starts with the specified 'top' activity
> +
> +        Usually you don't want to call this. Instead, use the singleton
> +        instance 'history'.
> +        """
> +        self._nesting = [top]
> +        self._last = None
> +
> +    @property
> +    def top(self):
> +        """
> +        the top-level activity
> +
> +        Very useful to grab entire history and do something with it, for
> +        example history.top.as_json() or history.top.print_()
> +        """
> +        return self._nesting[0]
> +
> +    @property
> +    def bottom(self):
> +        """
> +        the bottommost activity
> +
> +        This is the most nested activity, as established by:
> +
> +            with history.xxx():
> +                ...
> +        """
> +        return self._nesting[-1]
> +
> +    @property
> +    def last(self):
> +        """
> +        the most recent activity
> +
> +        This is the most recently created activity in this history.
> +        """

The lack of implementation of this method suggests it's not actually
used anywhere :-)

> +    def log(self, level, message, **meta):
> +        """
> +        Create an activity and append it to the history
> +        """
> +        self._last = Activity(level, message, **meta)
> +        self.bottom._sub_activities.append(self._last)
> +        return self
> +
> +    def debug(self, message, **meta):
> +        """
> +        Append DEBUG activity to the bottommost activity
> +        """
> +        return self.log(DEBUG, message, **meta)
> +
> +    def info(self, message, **meta):
> +        """
> +        Append INFO activity to the bottommost activity
> +        """
> +        return self.log(INFO, message, **meta)
> +
> +    def warning(self, message, **meta):
> +        """
> +        Append WARNING activity to the bottommost activity
> +        """
> +        return self.log(WARNING, message, **meta)
> +
> +    def error(self, message, **meta):
> +        """
> +        Append ERROR activity to the bottommost activity
> +        """
> +        return self.log(ERROR, message, **meta)
> +
> +    def critical(self, message, **meta):
> +        """
> +        Append CRITICAL activity to the bottommost activity
> +        """
> +        return self.log(CRITICAL, message, **meta)
> +
> +    def __enter__(self):
> +        """
> +        Begins nested activity
> +
> +        Nesting is widely useful as it helps to create a structure in the
> +        history stream. Typically you will call this method indirectly
> +        with code that looks like this:
> +
> +            with history.info("Doing something complex"):
> +                history.debug("step 1")
> +                history.debug("step 2")
> +                history.debug("step 3")
> +
> +        Here we also call __enter__() on the last activity to convert it to a
> +        'long' activity (one that has both start, end and non-zero duration)
> +        """
> +        self._nesting.append(self._last)
> +        return self._last.__enter__()
> +
> +    def __exit__(self, exc_type, exc_value, traceback):
> +        """
> +        Terminates most recently nested activity
> +
> +        Here we also call __exit__() on the last activity to allow it to record
> +        any exceptions that may have been raised.
> +        """
> +        last = self._nesting.pop()
> +        last.__exit__(exc_type, exc_value, traceback)
> +
> +
> +# Global history
> +history = History(Activity(DEBUG, "History module loaded"))
> 
> === added file 'lava/core/logging.py'
> --- lava/core/logging.py	1970-01-01 00:00:00 +0000
> +++ lava/core/logging.py	2012-05-07 09:48:17 +0000
> @@ -0,0 +1,120 @@
> +# Copyright (C) 2011-2012 Linaro Limited
> +# vim: set fileencoding=utf8 :
> +#
> +# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
> +#
> +# This file is part of lava-core
> +#
> +# lava-core is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU Lesser General Public License version 3
> +# as published by the Free Software Foundation
> +#
> +# lava-core is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU Lesser General Public License
> +# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
> +
> +from __future__ import absolute_import
> +
> +"""
> +lava.core.utils.logging
> +=======================
> +
> +Logging utilities
> +"""
> +
> +import logging
> +
> +from lava.core.history import history as global_history
> +
> +
> +class LoggerToHistoryProxy(object):
> +    """
> +    Wrapper that looks like logging.Logger but also creates history records
> +    """
> +
> +    _history = global_history
> +
> +    def __init__(self, logger):
> +        self._logger = logger
> +
> +    def log(self, level, msg, *args, **kwargs):
> +        self._logger.log(level, msg, *args, **kwargs)
> +        self._history.log(level, "{__prefmt_msg}",
> +                     __prefmt_msg=msg % args, logger=self._logger.name)
> +
> +    def debug(self, msg, *args, **kwargs):
> +        return self.log(logging.DEBUG, msg, *args, **kwargs)
> +
> +    def info(self, msg, *args, **kwargs):
> +        return self.log(logging.INFO, msg, *args, **kwargs)
> +
> +    def warning(self, msg, *args, **kwargs):
> +        return self.log(logging.WARNING, msg, *args, **kwargs)
> +
> +    def error(self, msg, *args, **kwargs):
> +        return self.log(logging.ERROR, msg, *args, **kwargs)
> +
> +    def critical(self, msg, *args, **kwargs):
> +        return self.log(logging.CRITICAL, msg, *args, **kwargs)
> +
> +    def exception(self, msg, *args):
> +        self._logger.exception(msg, *args)
> +
> +
> +class HistoryToLoggerProxy(object):
> +    """
> +    Wrapper that looks like History but also logs to logger
> +    """
> +
> +    _history = global_history
> +
> +    def __init__(self, logger):
> +        self._logger = logger
> +
> +    def log(self, level, message, **meta):
> +        self._logger.log(level, message.format(**meta))
> +        return self._history.log(level, message, **meta)
> +
> +    def debug(self, message, **meta):
> +        return self.log(logging.DEBUG, message, **meta)
> +
> +    def info(self, message, **meta):
> +        return self.log(logging.INFO, message, **meta)
> +
> +    def warning(self, message, **meta):
> +        return self.log(logging.WARNING, message, **meta)
> +
> +    def error(self, message, **meta):
> +        return self.log(logging.ERROR, message, **meta)
> +
> +    def critical(self, message, **meta):
> +        return self.log(logging.CRITICAL, message, **meta)
> +
> +
> +class LoggingMixIn(object):
> +    """
> +    Mix-in that adds a self.logger and self.history instance attributes
> +    """

I think you should explain the difference between calling a method on
logger and on history.
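
For illustration, going by the two proxy classes in the diff above, the logger-style call takes positional arguments interpolated with %, while the history-style call takes keyword meta-data interpolated with str.format(). The sketch below uses simplified stand-ins rather than the lava-core classes themselves; render_logger_style and render_history_style are hypothetical names.

```python
# Simplified stand-ins for the two calling conventions; these helper
# names are hypothetical and are not part of lava-core.

def render_logger_style(msg, *args):
    # logging.Logger convention: positional arguments, %-interpolation
    return msg % args


def render_history_style(message, **meta):
    # History convention: keyword meta-data, str.format() interpolation.
    # The meta dict stays available as structured data next to the text.
    return message.format(**meta), meta


print(render_logger_style("downloaded %s in %d s", "image.img", 12))
text, meta = render_history_style(
    "downloaded {name} in {seconds} s", name="image.img", seconds=12)
print(text)
print(sorted(meta))
```

Both calls produce the same text; only the history-style call keeps the values addressable by name afterwards.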

> +    @property
> +    def _logger(self):
> +        return logging.getLogger(
> +            self.__class__.__module__ + "." + self.__class__.__name__)
> +
> +    @property
> +    def history(self):
> +        """
> +        magic lava.utils.logging.history.History-like instance
> +        """
> +        return HistoryToLoggerProxy(self._logger)
> +
> +    @property
> +    def logger(self):
> +        """
> +        magic logging.Logger-like instance
> +        """
> +        return LoggerToHistoryProxy(self._logger)
>

Cheers,
mwh


Zygmunt Krynicki (zyga) wrote on 2012-05-08:


> [And once more, to the merge proposal as well :(]
>
> Hi. This overall seems reasonably nice. With things like this I always
> wonder if we'll ever get around to building the nice UIs that are
> enabled...
>
> Some general concerns:
>
> 1) Incremental output on the scheduler job page is _massively_ useful
> and losing it even temporarily is not acceptable, IMHO. I don't
> quite see how this meshes with this code. I guess we could still
> define a logging handler that outputs log messages to stdout/stderr
> and that will display nicely in the scheduler view until we get
> around to doing something more sophisticated there.

While I was somewhat skeptical about this, anything involving fast models made me reconsider quickly. I did a small proof of concept that streams data in real time (anything invoked via the agent) while retaining stdout/stderr separation. So I guess that means we'll just keep the verbose output for the current scheduler view. In the future we may replace that with real-time status updates so that we'd get the full hierarchical structure, but as you rightly point out, that may take a while.
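
The stop-gap mentioned in point 1 (a logging handler that writes to stdout) could look roughly like the sketch below. It is only an assumption about how one might wire it up with the standard logging module; the format string and the logger name lava.core.example are made up for the example.

```python
import logging
import sys

# Assumption: a plain StreamHandler on the root logger is enough for a
# view that only needs line-oriented text on stdout.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.DEBUG)

# Hypothetical logger name, used only for this demonstration
logging.getLogger("lava.core.example").info("step %d done", 1)
```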

> 2) I can see having two sets of debug/info/warning/error functions with
> slightly different APIs (% vs .format() interpolation) being trying.
> But maybe this will be a non-issue -- I guess usually the code
> will be saying self.history.info() or self.logger.info() and the
> logger vs history API will be clear.

> 3) I'm just a little worried by memory consumption -- it seems that
> there is the potential to end up with a _lot_ of objects hanging
> around, particularly if you hang on to tracebacks. It might be worth
> stringifying things a little earlier. I guess this won't affect the
> API and so can be done later...

I thought about it too; by demo-2 I made sure we don't keep an Activity record for each stdout line we print. In theory this dispatcher is better as it never keeps an in-memory log file of any test. The number of 'activities' is pretty much constant across all test runs. Still, unlike logging, which never suffers from this problem, activities do add up. When it becomes a problem we can simply record them (in some machine format) as they are being generated. The only issue here is that activities that are in flight (__enter__()) will require seeking + updates to "close" and any exceptions we generate could need special treatment (as ...

> [And once more, to the merge proposal as well :(]
> 
> Hi.  This overall seems reasonably nice.  With things like this I always
> wonder if we'll ever get around to building the nice UIs that are
> enabled...
> 
> Some general concerns:
> 
> 1) Incremental output on the scheduler job page is _massively_ useful
>    and losing it even temporarily is not acceptable, IMHO.  I don't
>    quite see how this meshes with this code.  I guess we could still
>    define a logging handler that outputs log messages to stdout/stderr
>    and that will display nicely in the scheduler view until we get
>    around to doing something more sophisticated there.

> 2) I can see having two sets of debug/info/warning/error functions with
>    slightly different APIs (% vs .format() interpolation) being trying.
>    But maybe this will be a non-issue -- I guess usually the code
>    will be saying self.history.info() or self.logger.info() and the
>    logger vs history API will be clear.

We can still reconsider this, as it's reasonably easy and fast to replace given the moderate amount of usage it's seen so far. My motivations for format() were twofold: python deprecates % in favor of .format() and (perhaps I have over-valued this) I wanted to have key-value meta-data in each log message, believing that it could be used by some future smart UI to present the data better. If we were to go with % formatting then we don't really lose key-value formats (they just look different). Perhaps we could do a machine substitution so that we can get consistent logging + % API and keep the true data behind the scenes. In practical terms it would mean calling self.history.info("%(something)s", something="blarg"). What do you think?
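
As a minimal sketch of the %-based alternative floated here: a %-style template can be filled from a keyword mapping, so the key-value data survives. render_percent_style is a hypothetical helper for illustration, not a proposed lava-core API.

```python
# Hypothetical helper; it only demonstrates that %-style templates can be
# filled from keyword arguments, so the key-value data is not lost.

def render_percent_style(message, **meta):
    # "%(name)s" placeholders are looked up in the mapping on the right
    return message % meta, meta


text, meta = render_percent_style(
    "fetched %(url)s (%(size)d bytes)",
    url="http://example.org/img", size=1024)
print(text)
```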
 
> 3) I'm just a little worried by memory consumption -- it seems that
>    there is the potential to end up with a _lot_ of objects hanging
>    around, particularly if you hang on to tracebacks.  It might be worth
>    stringifying things a little earlier.  I guess this won't affect the
>    API and so can be done later...

As for stringifying: yeah, I have been doing this myself on the caller side (calling str early to avoid references to heavy objects). It's somewhat tricky to do it on the side of the callee (inside history) as we really need to call .format() with appropriate options and remember that. I'll see if there is something I can do without resorting to parsing the format message myself.
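
One possible way to stringify on the callee side without hand-written parsing is to let string.Formatter from the standard library split the template. The sketch below is only an assumption about how that might be done; stringify_fields is a hypothetical name, and nested format specs such as {x:{width}} are not handled.

```python
import string


def stringify_fields(message, **meta):
    """
    Hypothetical helper: render every template field to a string up front.

    Returns a dict mapping each field name used in the message to its
    rendered text, so heavy objects in meta need not be kept alive.
    """
    formatter = string.Formatter()
    rendered = {}
    for _, field_name, format_spec, conversion in formatter.parse(message):
        if field_name is None:
            continue  # trailing literal text, no field here
        # Resolve the field (including "obj.attr" or "obj[0]" lookups)
        value, _ = formatter.get_field(field_name, (), meta)
        # Apply "!r" / "!s" if present, then the format spec
        value = formatter.convert_field(value, conversion)
        rendered[field_name] = formatter.format_field(value, format_spec or "")
    return rendered


print(stringify_fields("{n:03} {who!r}", n=7, who="x"))
```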

> On Mon, 07 May 2012 09:49:18 -0000, Zygmunt Krynicki
> <zygmunt.krynicki@canonical.com> wrote:
> > Zygmunt Krynicki has proposed merging lp:~zkrynicki/lava-core/history-and-
> logging into lp:lava-core.
> >
> > Requested reviews:
> >   Linaro Validation Team (linaro-validation)
> >
> > For more details, see:
> > https://code.launchpad.net/~zkrynicki/lava-core/history-and-
> logging/+merge/104882
> >
> > This is a re-spin of earlier merge proposal
> (https://code.launchpad.net/~zkrynicki/lava-core/logging-mixin/+merge/102339)
> >
> > Again, to quote my extensive commit message:
> >
> >   Add the logging and history modules
> >
> >   Both modules cooperate on centralized logging interface and have largely
> >   identical APIs. LoggingMixIn class adds self.history and self.logging
> >   properties to the class it is injected to.
> >
> >   Both properties return proxy objects that ultimately call classic python
> >   logging and lava-core'esque history. History is much like logging but has
> two
> >   advantages over plain logging.
> >
> >   The first advantage is that it was designed up front for being
> hierarchical.
> >   This naturally constructs a tree of activities. I believe that such
> structure
> >   can be visualized much better than a plain log file ever could. Any
> composite
> >   operation also computes and records the duration of all the activities it
> >   contains. This yields much more transparent duration awareness of entire
> >   processes that lava performs internally.
> >
> >   The second advantage is that it can carry more data than regular python
> logging
> >   can. All arguments to the message template are passed to the
> string.format()
> >   function as keywords. This allows us to transparently provide key-value
> data in
> >   alongside each history entry. This makes automatic analysis and data
> retrieval
> >   much easier. It also allows for new data logging features such as
> downloading a
> >   file and storing the speed history as an array in the activity meta-data.
> Such
> >   object can be rendered with appropriate care by the front end thus making
> the
> >   user interface far more rich than we could previously attempt without
> extreme
> >   hacks and side-band, custom communication channels. Another feature that
> takes
> >   advantage of this rich meta-data is attachments. Any activity can have a
> >   collection of attachments that can be presented in a web front-end
> (attachments
> >   also carry explicit MIME type). LAVA Core currently uses this to ensure we
> >   capture full, uncorrupted output of any external programs we run.
> >
> >   The history module offers utility functions that can render any activity
> (or
> >   the convenience lava.core.history.history.top activity which records all
> known
> >   history of the process) as JSON or plain text (in several formats). In
> addition
> >   both formats can be saved, along with all the attachments, in a history
> >   tarball. This functionality is exploited in the top-level lava command
> line
> >   program, that allows the user to record the history of any LAVA commands,
> for
> >   visualization, debugging, rich analysis or other, yet-unknown purpose.
> >
> > --
> > https://code.launchpad.net/~zkrynicki/lava-core/history-and-
> logging/+merge/104882
> > You are subscribed to branch lp:lava-core.
> > === added file 'lava/core/history.py'
> > --- lava/core/history.py      1970-01-01 00:00:00 +0000
> > +++ lava/core/history.py      2012-05-07 09:48:17 +0000
> > @@ -0,0 +1,504 @@
> > +# Copyright (C) 2011-2012 Linaro Limited
> > +# vim: set fileencoding=utf8 :
> > +#
> > +# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
> > +#
> > +# This file is part of lava-core
> > +#
> > +# lava-core is free software: you can redistribute it and/or modify
> > +# it under the terms of the GNU Lesser General Public License version 3
> > +# as published by the Free Software Foundation
> > +#
> > +# lava-core is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU Lesser General Public License
> > +# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +from __future__ import absolute_import, print_function
> > +
> > +"""
> > +lava.core.utils.history
> > +=======================
> > +
> > +Implementation of the hierarhical history
> > +"""
> > +
> > +from logging import DEBUG, INFO, WARNING, ERROR, CRITICAL
> > +import collections
> > +import datetime
> > +import os
> > +import sys
> > +import tarfile
> > +import traceback
> > +
> > +from json_document.serializers import JSON
> > +from json_schema_validator.extensions import (
> > +    datetime_extension, timedelta_extension)
> > +import simplejson
> > +
> > +
> > +class DictProxy(object):
> 
> A short docstring would be nice.  I guess "I wish I was writing
> Javascript" would not be helpful :-)

Will do

> > +    def __init__(self, d):
> > +        object.__setattr__(self, "_d", d)
> > +
> > +    def as_dict(self):
> > +        return object.__getattribute__(self, "_d")
> 
> "return self._d" works here. __getattr__ is only called for attributes
> that are not found by the usual mechanisms.  I guess we'd better hope
> that the dictionary does not have '_d' as a key :-)

Have you checked this? (it could have been late but I remember it just did not work, despite all the knowledge of python descriptors telling me otherwise).

> > +    def __getattr__(self, name):
> > +        try:
> > +            return self.as_dict()[name]
> > +        except KeyError:
> > +            raise AttributeError(name)
> > +
> > +    def __setattr__(self, name, value):
> > +        self.as_dict()[name] = value
> 
> Similarly "self._d[name] = value".

Thinking about it now, I may have made a mistake early on by not using object.__setattr__ in __init__. That would cascade through all the getters and setters and eventually run out of stack space.
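
A minimal sketch of the point under discussion: as long as __init__ stores the dictionary with object.__setattr__, plain self._d works everywhere else, because __getattr__ is only consulted when normal lookup fails. This is a reduced copy of the class from the diff for demonstration, not the final lava-core code.

```python
class DictProxy(object):
    """Attribute-style access to a dictionary (reduced sketch)."""

    def __init__(self, d):
        # Bypass our own __setattr__, which would otherwise try to read
        # self._d before it exists and recurse through __getattr__.
        object.__setattr__(self, "_d", d)

    def as_dict(self):
        # Normal lookup finds _d in the instance __dict__, so __getattr__
        # is never consulted for it.
        return self._d

    def __getattr__(self, name):
        try:
            return self._d[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._d[name] = value


proxy = DictProxy({"a": 1})
proxy.b = 2
print(proxy.a, proxy.as_dict())
```

Replacing the first line of __init__ with self._d = d would reproduce the runaway recursion described in the reply.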

> > +
> > +
> > +ExceptionInfo = collections.namedtuple(
> > +    "ExceptionInfo", "filename lineno function text")
> 
> I don't think this helps.  See below.

Hmm, I only wanted to capture exceptions as objects, not lists.
 
> > +
> > +_name_and_back = {
> > +    DEBUG: "DEBUG",
> > +    INFO: "INFO",
> > +    WARNING: "WARNING",
> > +    ERROR: "ERROR",
> > +    CRITICAL: "CRITICAL",
> > +    "DEBUG": DEBUG,
> > +    "INFO": INFO,
> > +    "WARNING": WARNING,
> > +    "ERROR": ERROR,
> > +    "CRITICAL": CRITICAL
> > +}
> 
> This duplicates logging._levelNames.  But I guess the underscore scares
> one away from that...
> 
> You also only seem to use the DEBUG -> "DEBUG" side of it.
> 
> > +class Activity(object):
> > +
> > +    def __init__(self, level, message, **meta):
> > +        """
> > +        Create a new activity with the specified message, level and
> > +        initial meta-data
> > +        """
> > +        # Message that has format() references to meta-data
> > +        self._level = level
> > +        self._message = message
> > +        self._meta = DictProxy(meta)
> > +        # File attachments
> > +        self._attachments = []
> > +        # Children activities
> > +        self._sub_activities = []
> > +        # Start and end timestamps
> > +        self._start = datetime.datetime.utcnow()
> > +        self._end = self._start
> > +        # Exception + traceback
> > +        self._exc_value = None
> > +        self._exc_type = None
> > +        self._traceback = None
> > +
> > +    @property
> > +    def raw_message(self):
> > +        """
> > +        the raw, unformatted message
> > +        """
> > +        return self._message
> > +
> > +    @property
> > +    def message(self):
> > +        """
> > +        rendered message
> > +
> > +        Rendering substitutes str.format() style variables with data from
> > +        self.meta.
> > +        """
> > +        return self._message.format(**self._meta.as_dict())
> > +
> > +    @property
> > +    def level(self):
> > +        """
> > +        level number, 10 (DEBUG) - 50 (CRITICAL)
> > +        """
> > +        return self._level
> > +
> > +    @property
> > +    def level_name(self):
> > +        """
> > +        name of the level "DEBUG" ... "CRITICAL"
> > +        """
> > +        return _name_and_back[self._level]
> > +
> > +    @property
> > +    def meta(self):
> > +        """
> > +        collection of meta-data
> > +
> > +        Meta-data serves two purposes: to fill in data to the message
> > +        template and to store arbitrary auxiliary data. This property
> > +        returns a dictionary proxy that looks like an object (has normal
> > +        attribute accessors) but still talks to the real dictionary behind
> > +        the scenes. You can both read and write/set attributes at will.
> > +        """
> > +        return self._meta
> > +
> > +    def attach(self, stream, name=None,  mime_type='text/plain'):
> > +        """
> > +        Add an attachment.
> > +
> > +        Everything is extracted from the file-like object stream.
> 
> This is dishonest: it had better be a _file_, or save_as_tarball won't
> work (stream.name has to name a path on disk for tarball.add() to work,
> surely?).

Right, I'll correct the docstring. Originally I meant to say it does not have to be the value of open() because tempfile.TemporaryFile and NamedTemporaryFile return file-like objects that have a proper .name attribute that is created in the filesystem when needed (for example by fileno() or .name).

> 
> > If you care
> > +        about a particular name of the attachment it an be specified with
> name.
> > +        MIME type can help the web front-end to render the content. It
> defaults
> > +        to text/plain which is suitable for most log files.
> > +        """
> > +        if name is None:
> > +            name = os.path.basename(stream.name)
> > +        self._attachments.append(
> > +            DictProxy({
> > +                'name': name,
> > +                'orig_name': stream.name,
> > +                'mime_type': mime_type}))
> 
> I don't know that it's worth using DictProxy here as you only use this
> once...

Yeah, I have this aversion to foo['key']

> 
> > +    @property
> > +    def duration(self):
> > +        """
> > +        Duration of the activity.
> > +
> > +        If the activity has not started the duration is always zero.
> > +        Activities that are in progress will observe the duration to grow.
> > +        """
> > +        if self._start is None:
> > +            return datetime.timedelta()
> > +        if self._end is None:
> > +            return datetime.datetime.utcnow() - self._start
> > +        return self._end - self._start
> > +
> > +    def as_json(self):
> > +        """
> > +        Convert this Activity to JSON
> 
> Meep.  This doesn't return JSON (JSON is the _string_).  Maybe
> "prepare_for_json" ?

Well, JSON is serialized as strings, but you are correct: I freely call both the serialization format and the "object model" JSON, which is confusing.

> 
> > +        """
> > +        obj = simplejson.OrderedDict((
> > +            ("level", self.level_name),
> > +            ("message", self._message),
> > +            ("meta", self.meta.as_dict()),
> 
> Something somewhere is going to break if meta contains things that can't
> be JSON-ed, right?

No, _build_json() will fix that, see below.

> 
> > +            ("start", (datetime_extension.to_json(self._start)
> > +                       if self._start is not None else None)),
> > +            ("duration", timedelta_extension.to_json(self.duration)),
> > +            ("end", (datetime_extension.to_json(self._end)
> > +                     if self._end is not None else None)),
> > +            ("exc_value", self._exc_value),
> > +            ("exc_type", (self._exc_type.__name__
> > +                          if self._exc_type else None)),
> > +            ("traceback", ([ExceptionInfo(*ei) for ei in
> > +                            traceback.extract_tb(self._traceback)]
> > +                           if self._traceback else None)),
> 
> This seems a bit of a complicated way of saying "ei.filename ei.lineno
> ei.function ei.text for ei in ..."

The thing is that python's exception info returns a list, not a key-value mapping or named tuple. Unless you want a raw list in the JSON, you need this oddity to keep the names of each value.
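
To make the trade-off concrete, here is a small sketch of turning traceback.extract_tb() entries into named key-value records. describe_traceback is a hypothetical helper name; the ExceptionInfo namedtuple mirrors the one in the diff.

```python
import collections
import sys
import traceback

ExceptionInfo = collections.namedtuple(
    "ExceptionInfo", "filename lineno function text")


def describe_traceback(tb):
    # Each entry from extract_tb() unpacks as (filename, lineno, function,
    # text); wrapping it in a namedtuple lets us emit named keys.
    return [dict(ExceptionInfo(*entry)._asdict())
            for entry in traceback.extract_tb(tb)]


try:
    1 / 0
except ZeroDivisionError:
    frames = describe_traceback(sys.exc_info()[2])

print(sorted(frames[0]))
```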

> 
> > +            ("attachments", [
> > +                attachment.as_dict() for attachment in self._attachments]),
> > +            ("sub_activities", [
> > +                activity.as_json() for activity in self._sub_activities])))
> > +        return _build_json(obj)
> 
> It strikes me that this would be more proof against odd things in meta
> and a bit simpler if _build_json (which also has a bad name) handled
> datetimes and timedeltas.

I'm a victim of refactoring: initially _build_json was a small helper that did something else. I'll fix that.
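
A rough sketch of what a single conversion helper that also handles datetimes and timedeltas might look like. Everything here is an assumption: to_json_ready is a hypothetical name, ISO 8601 strings and seconds-as-float stand in for whatever datetime_extension and timedelta_extension actually emit, and a plain dict is used where the patch uses an ordered dict.

```python
import datetime
import json


def to_json_ready(obj):
    """
    Hypothetical sketch: recursively convert obj into plain dicts, lists,
    strings and numbers that json.dumps() accepts.
    """
    if isinstance(obj, dict):
        # Drop None and empty containers, as the helper in the patch does
        return dict((str(k), to_json_ready(v)) for k, v in obj.items()
                    if v is not None and v != [] and v != {})
    if isinstance(obj, (list, tuple)):
        return [to_json_ready(item) for item in obj]
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()  # assumed encoding
    if isinstance(obj, datetime.timedelta):
        return obj.total_seconds()  # assumed encoding
    if isinstance(obj, (bool, int, float, str)) or obj is None:
        return obj
    # Fallback for anything odd stored in meta
    return repr(obj)


sample = to_json_ready({
    "start": datetime.datetime(2012, 5, 7, 9, 48, 17),
    "duration": datetime.timedelta(seconds=90),
    "skip": None,
    "odd": set([1]),
})
print(json.dumps(sample, sort_keys=True))
```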

> 
> > +    def __str__(self):
> > +        """"
> > +        Returns self.message
> > +        """
> > +        return self.message
> > +
> > +    def save_as_tarball(self, scratch, tarball_stream, mode='w:gz'):
> > +        """
> > +        Save activity history as a tarball.
> > +
> > +        The tarball will have two files 'history.txt' and 'history.json'
> with
> > +        appropriate encoding of this activity, as well as all sub-
> activities.
> > +        Any attachments are added as well.
> > +        """
> > +        with tarfile.open(tarball_stream.name, mode=mode,
> > +                          fileobj=tarball_stream) as tarball:
> > +            with scratch.open('history.txt') as stream:
> > +                self.print_(stream=stream)
> 
> Heh ordering snafu?  scratch isn't introduced yet.  Never mind though.

I saw that later, you are correct ;-)
 
> It's never occurred to me before, but it's pretty weird to see code of
> the form:
> 
> with ${a}.open(${b}) as var:
>     ... code that writes to var ...
> 
> This is the tempfile module's fault, not yours I guess :)

On the other hand, if you just call .unique() it's somewhat readable and clean (as long as you don't add any arguments):

with scratch.unique() as stream:
   ....

> 
> > +            tarball.add(stream.name, 'history.txt')
> > +            with scratch.open('history.json') as stream:
> > +                JSON.dump(stream, self.as_json())
> > +            tarball.add(stream.name, 'history.json')
> > +            self._save_tarball_attachments(tarball)
> > +
> > +    def _save_tarball_attachments(self, tarball):
> > +        for attachment in self._attachments:
> > +            if os.path.isfile(attachment.orig_name):
> > +                tarball.add(attachment.orig_name, attachment.name)
> > +        for activity in self._sub_activities:
> > +            activity._save_tarball_attachments(tarball)
> 
> Err, you'd better hope that no two attachments have the same name or are
> called 'history.json' or 'history.txt' here!

My bad, python's TarFile class is hideously crappy. Originally all attachments were in a directory but it turns out you really cannot add a fake directory to a TarFile without reading the implementation and hardcoding magic tuples. Still, point taken, I'll add a TODO entry about this. This is also why I need a scratch instance to create history.json/txt as a real file, and not as data stream inside TarFile instance (which is silly and wastes resources).
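
For what it's worth, the standard library does appear to allow adding in-memory data under an arbitrary archive path through tarfile.TarInfo and TarFile.addfile(), which would sidestep both the scratch files and the name clashes. The sketch below is one possible approach, not what the branch does; add_bytes and the attachments/0/ layout are hypothetical.

```python
import io
import tarfile


def add_bytes(tarball, arcname, data):
    # TarInfo describes the member; addfile() reads info.size bytes
    # from the file-like object, so no file on disk is needed.
    info = tarfile.TarInfo(name=arcname)
    info.size = len(data)
    tarball.addfile(info, io.BytesIO(data))


buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tarball:
    add_bytes(tarball, "history.txt", b"INFO | doing something\n")
    # A per-attachment prefix keeps names from clashing with history.*
    add_bytes(tarball, "attachments/0/output.log", b"hello\n")

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tarball:
    print(tarball.getnames())
```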
 
> I'm a bit uneasy about how this attachment thing is going to work out in
> practice.  When a job has finished, it will create a tarball, that ends
> up in the media files of lava-server and the scheduler job view will
> fish the history.json out of this tarball to know how to render the
> basic outline, include things that will do ajax to another view that
> fishes other files out of the tarball to either be downloaded or shown
> inline in the page?  But we'd like to have incremental output and I
> don't really like the idea of a page for a completed job being
> completely different to that for a job that's still running.

Incremental output will be something we get in the classic method (as it works now). I think we'll find a way to manage history tarballs when we get to work on the GUI again. Also, read my comments at the very top of this reply, where I ack the need for incremental output in certain, *cough* slow models *cough* situations.

> It just feels like the bits don't quite line up here.
> 
> > +    def __repr__(self):
> > +        return "<Activity: %r>" % str(self)
> > +
> > +    def __enter__(self):
> > +        """
> > +        Begins a 'long' activity.
> > +
> > +        Long activities have both start and end, and thus, non-zero
> duration.
> > +        This is called automatically by History.__enter__ so you won't
> usually
> > +        have to use it explicitly.
> > +
> > +        Here we reset self.end to None (it was initialized to 'now' in
> > +        __init__)
> > +        """
> > +        self._end = None
> > +        return self
> > +
> > +    def __exit__(self, exc_type, exc_value, traceback):
> > +        """
> > +        Terminates a 'long' activity.
> > +
> > +        Here we set self.end to 'now' and record any exception information.
> > +        See __enter__() for more information on 'long' activities.
> > +        """
> > +        self._end = datetime.datetime.utcnow()
> > +        self._exc_value = exc_value
> > +        self._exc_type = exc_type
> > +        self._traceback = traceback
> > +
> > +    def print_(self, stream=None, print_meta=False, print_children=True,
> > +               stack=None):
> > +        """
> > +        Print a nice Unicode tree of activities.
> > +
> > +        This is very useful for tracing and debugging.
> > +        """
> > +        if stack is None:
> > +            stack = []
> > +        if stream is None:
> > +            stream = sys.stdout
> > +        fragments = []
> > +        if print_meta is False:
> 
> Please don't compare immutable values with 'is'.

False is not only immutable, it is a singleton ;-) I guess I over-do 'is None'
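
The practical difference between the two spellings shows up with falsy values other than False itself. A tiny sketch (the two function names are made up for the comparison):

```python
def uses_identity(flag):
    # True only for the False singleton itself
    return flag is False


def uses_truthiness(flag):
    # True for any falsy value: False, 0, "", None, empty containers
    return not flag


for value in (False, 0, "", None):
    print(repr(value), uses_identity(value), uses_truthiness(value))
```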

> In the code as submitted, there's no way for print_meta to be True (or
> print_children to be False, come to that).  Will this change in a later
> branch?

You can always call print_() directly but if you want I can remove it now.
> 
> > +            fragments.append("{0:7}".format(self.level_name))
> > +            fragments.append(" | ")
> > +            fragments.append("{start}.{micro:06}".format(
> > +                start=self._start.strftime("%Y-%m-%d %T"),
> > +                micro=self._start.microsecond))
> > +            fragments.append(" | ")
> > +            if self.duration.total_seconds() > 0:
> > +                fragments.append("{0:14}s".format(self.duration))
> > +            else:
> > +                fragments.append("{0:15}".format(""))
> > +            fragments.append(" | ")
> > +            fragments.append(''.join(self._tree_indent(stack)))
> > +        # Process the message to make it safer
> > +        message = self.message
> > +        message = message.replace("\n", " ").replace("\r", " ")
> > +        if isinstance(message, unicode):
> > +            message = message.encode(stream.encoding or "UTF-8", 'replace')
> > +        fragments.append(message)
> > +        if self._exc_value:
> > +            fragments.append(", ***crashed*** {0}".format(self._exc_value))
> > +        print("".join(fragments), file=stream)
> > +        if print_meta:
> > +            print("\tstart: {0}".format(self._start))
> > +            if self.duration.total_seconds() > 0:
> > +                print("\tduration: {0}".format(self.duration))
> > +            if self._end is not None and self._end != self._start:
> > +                print("\tend: {0}".format(self._end))
> > +            if self._exc_value is not None:
> > +                print("\texc_value: {0}".format(self._exc_value))
> > +            if self._exc_type is not None:
> > +                print("\texc_type: {0}".format(self._exc_type))
> > +            if self._traceback is not None:
> > +                for index, exc_info in enumerate(self._traceback):
> > +                    print("\ttraceback.{0}.filename: {1}".format(
> > +                        index, self._traceback.filename))
> > +                    print("\ttraceback.{0}.lineno: {1}".format(
> > +                        index, self._traceback.lineno))
> > +                    print("\ttraceback.{0}.function {1}".format(
> > +                        index, self._traceback.function))
> > +                    print("\ttraceback.{0}.text {1}".format(
> > +                        index, self._traceback.text))
> > +            for key, value in self.meta.as_dict().iteritems():
> > +                # Skip one special key used by logging integration
> > +                if key == "__prefmt_msg":
> > +                    continue
> > +                print("\t{key}: {value!r}".format(key=key, value=value))
> > +        if print_children:
> > +            pos = len(stack)
> > +            stack.append(len(self._sub_activities))
> > +            for sub in self._sub_activities:
> > +                stack[pos] -= 1
> > +                sub.print_(stream, print_meta, print_children, stack)
> > +            stack.pop()
> > +
> > +    @staticmethod
> > +    def _tree_indent(stack, use_unicode=False):
> > +        """
> > +        Compute the indentation part of the tree displayed by print_.
> > +        """
> > +        # The code here might be tricky. It's easier to understand if you
> keep
> > +        # in mind that stack[n] tells you how many more entries are left in
> the
> > +        # tree at that stack depth. The stack is constructed by print_()
> 
> I haven't really tried to understand the printing code, sorry.

That's fine, neither did I. The code has a bug, but I failed to identify it after an hour of looking at the algorithm. Luckily browsers implement <ul>...</ul> correctly.
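
As a point of comparison, here is a sketch of a different technique from the one in the patch: instead of a stack of remaining-item counts, each call passes down the prefix string its children should start with. The (label, children) tuples are a hypothetical stand-in for Activity objects and their sub-activities, and only the ASCII variant is shown.

```python
def render_tree(node, prefix="", is_last=True, is_root=True):
    """
    Yield one line per node. node is a hypothetical (label, children)
    pair standing in for an Activity and its sub-activities.
    """
    label, children = node
    if is_root:
        yield label
        child_prefix = ""
    else:
        yield prefix + ("\\-- " if is_last else "|-- ") + label
        # Below a last child the column stays blank, otherwise it
        # carries a vertical bar down to the next sibling.
        child_prefix = prefix + ("    " if is_last else "|   ")
    for index, child in enumerate(children):
        last = index == len(children) - 1
        for line in render_tree(child, child_prefix, last, False):
            yield line


tree = ("job", [("deploy", [("download", []), ("flash", [])]),
                ("test", [])])
print("\n".join(render_tree(tree)))
```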

> > +        for items_left in stack[:-1]:
> > +            # FIXME: this is broekn
> > +            if items_left == 0:
> > +                yield "    "
> > +            elif items_left == 1:
> > +                if use_unicode:
> > +                    yield "└   "
> 
> Wow, this code confused emacs when reading the original email -- I guess
> it didn't know it was utf-8.  As with print_meta I don't see a way for
> use_unicode to be True, and would sort of prefer to avoid unicode here I
> think.

I've added the python "OMG UTF-8" stanza at the top of the file. Unicode looks better if you see it (it could be just dumped in <pre> on a page without looking childish). If I fix the FIXME bug above I'll probably discard the non-unicode variant. Also, we have a unicode/logging bug somewhere (eh, python + redirected stdout gets ASCII insanity) so this was really a workaround for that. If you do .encode("UTF-8") then some of the formatting code above will break as it needs to measure the length of a string to align everything.
 
> > +                else:
> > +                    yield "\   "
> > +            elif items_left > 1:
> > +                if use_unicode:
> > +                    yield "│   "
> > +                else:
> > +                    yield "|   "
> > +        for items_left in stack[-1:]:
> > +            if items_left == 0:
> > +                if use_unicode:
> > +                    yield "└── "
> > +                else:
> > +                    yield "\-- "
> > +            elif items_left > 0:
> > +                if use_unicode:
> > +                    yield "├── "
> > +                else:
> > +                    yield "|-- "
> > +
> > +
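
For comparison with the counter-based stack above, here is a minimal sketch of another way to compute the tree prefix. It is not the code in this branch: `Node` and `render` are hypothetical names, and instead of counting remaining entries it records, for each level on the path from the root, whether that node was the last child of its parent.

```python
# Hypothetical sketch: Node and render() are illustrative, not lava-core APIs.

class Node(object):
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def render(node, path_last=()):
    """Yield one text line per node.

    path_last holds one flag per level below the root: True when the
    node on the path at that level is the last child of its parent.
    """
    if not path_last:
        yield node.label  # the root has no prefix
    else:
        # Finished ancestors leave a blank column, open ones a vertical bar
        prefix = "".join(
            "    " if last else "|   " for last in path_last[:-1])
        branch = "\\-- " if path_last[-1] else "|-- "
        yield prefix + branch + node.label
    for index, child in enumerate(node.children):
        is_last = index == len(node.children) - 1
        for line in render(child, path_last + (is_last,)):
            yield line


tree = Node("top", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
print("\n".join(render(tree)))
```

Because each flag is computed once by the parent and never mutated, there is no shared state to get out of sync between siblings.
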
> > +def _build_json(obj):
> > +    """
> > +    Helper to construct json that keeps ordering and discards None values
> 
> As above: this doesn't return JSON.

Ack

> > +    """
> > +    if isinstance(obj, (dict, simplejson.OrderedDict)):
> > +        new_obj = simplejson.OrderedDict()
> > +        for key, value in obj.iteritems():
> > +            if value is None or value == [] or value == {}:
> > +                continue
> > +            else:
> > +                new_obj[str(key)] = _build_json(value)
> > +        return new_obj
> > +    elif isinstance(obj, list):
> > +        return [_build_json(item) for item in obj]
> > +    elif (isinstance(obj, (int, float, str, unicode, list, dict, bool))
> > +          or obj is None):
> > +        return obj
> 
> er, list is in here twice.  I don't understand how ExceptionInfo (or
> tuple come to that) passed through here.  I think you should delete
> ExceptionInfo anyway, but you probably want to handle tuples?

Hmm, some good points, right. I'll merge list to (list, tuple) above and get rid of them from the elif clause.
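
A sketch of what that merged branch could look like, assuming the helper is also renamed as discussed earlier (`prepare_for_json` is the name suggested in this review, not something in the branch yet). It is written against Python 3 and the standard library only; on Python 2 the scalar check would also need `unicode` and `long`.

```python
import collections


def prepare_for_json(obj):
    """Return a JSON-serializable copy of obj, dropping empty values."""
    if isinstance(obj, dict):  # OrderedDict is a dict subclass
        result = collections.OrderedDict()
        for key, value in obj.items():
            if value is None or value == [] or value == {}:
                continue
            result[str(key)] = prepare_for_json(value)
        return result
    if isinstance(obj, (list, tuple)):
        # Tuples (including namedtuples) become plain lists
        return [prepare_for_json(item) for item in obj]
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    # Last resort, as in the original: fall back to the string form
    return str(obj)
```
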
 
> > +    else:
> > +        try:
> > +            return str(obj)
> > +        except Exception:
> > +            raise ValueError("cannot convert {0!r} to JSON".format(obj))
> > +
> > +
> > +class History(object):
> > +    """
> > +    History class, provides helper methods for hierarchical logging.
> > +    Maintains a stack of activities that are nested with context
> > +    management.
> > +    """
> > +
> > +    def __init__(self, top):
> > +        """
> > +        Create a History that starts with the specified 'top' activity
> > +
> > +        Usually you don't want to call this. Instead, use the singleton
> > +        instance 'history'.
> > +        """
> > +        self._nesting = [top]
> > +        self._last = None
> > +
> > +    @property
> > +    def top(self):
> > +        """
> > +        the top-level activity
> > +
> > +        Very useful to grab entire history and do something with it, for
> > +        example history.top.as_json() or history.top.print_()
> > +        """
> > +        return self._nesting[0]
> > +
> > +    @property
> > +    def bottom(self):
> > +        """
> > +        the bottommost activity
> > +
> > +        This is the most nested activity, as established by:
> > +
> > +            with history.xxx():
> > +                ...
> > +        """
> > +        return self._nesting[-1]
> > +
> > +    @property
> > +    def last(self):
> > +        """
> > +        the most recent activity
> > +
> > +        This is the most recently created activity in this history.
> > +        """
> 
> The lack of implementation of this method suggests it's not actually
> used anywhere :-)

But, hey, it's documented ;-)
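
Presumably the body was meant to return the `_last` attribute that `log()` already maintains. A minimal sketch under that assumption, trimmed to the two members involved:

```python
class History(object):
    def __init__(self, top):
        self._nesting = [top]
        self._last = None

    @property
    def last(self):
        """the most recent activity, or None if nothing was logged yet"""
        return self._last
```
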

> 
> > +    def log(self, level, message, **meta):
> > +        """
> > +        Create an activity and append it to the history
> > +        """
> > +        self._last = Activity(level, message, **meta)
> > +        self.bottom._sub_activities.append(self._last)
> > +        return self
> > +
> > +    def debug(self, message, **meta):
> > +        """
> > +        Append DEBUG activity to the bottommost activity
> > +        """
> > +        return self.log(DEBUG, message, **meta)
> > +
> > +    def info(self, message, **meta):
> > +        """
> > +        Append INFO activity to the bottommost activity
> > +        """
> > +        return self.log(INFO, message, **meta)
> > +
> > +    def warning(self, message, **meta):
> > +        """
> > +        Append WARNING activity to the bottommost activity
> > +        """
> > +        return self.log(WARNING, message, **meta)
> > +
> > +    def error(self, message, **meta):
> > +        """
> > +        Append ERROR activity to the bottommost activity
> > +        """
> > +        return self.log(ERROR, message, **meta)
> > +
> > +    def critical(self, message, **meta):
> > +        """
> > +        Append CRITICAL activity to the bottommost activity
> > +        """
> > +        return self.log(CRITICAL, message, **meta)
> > +
> > +    def __enter__(self):
> > +        """
> > +        Begins nested activity
> > +
> > +        Nesting is widely useful as it helps to create a structure in the
> > +        history stream. Typically you will call this method indirectly
> > +        with code that looks like this:
> > +
> > +            with history.info("Doing something complex"):
> > +                history.debug("step 1")
> > +                history.debug("step 2")
> > +                history.debug("step 3")
> > +
> > +        Here we also call __enter__() on the last activity to convert it to
> a
> > +        'long' activity (one that has both start, end and non-zero
> duration)
> > +        """
> > +        self._nesting.append(self._last)
> > +        return self._last.__enter__()
> > +
> > +    def __exit__(self, exc_type, exc_value, traceback):
> > +        """
> > +        Terminates most recently nested activity
> > +
> > +        Here we also call __exit__() on the last activity to allow it to
> record
> > +        any exceptions that may have been raised.
> > +        """
> > +        last = self._nesting.pop()
> > +        last.__exit__(exc_type, exc_value, traceback)
> > +
> > +
> > +# Global history
> > +history = History(Activity(DEBUG, "History module loaded"))
> >
> > === added file 'lava/core/logging.py'
> > --- lava/core/logging.py      1970-01-01 00:00:00 +0000
> > +++ lava/core/logging.py      2012-05-07 09:48:17 +0000
> > @@ -0,0 +1,120 @@
> > +# Copyright (C) 2011-2012 Linaro Limited
> > +# vim: set fileencoding=utf8 :
> > +#
> > +# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
> > +#
> > +# This file is part of lava-core
> > +#
> > +# lava-core is free software: you can redistribute it and/or modify
> > +# it under the terms of the GNU Lesser General Public License version 3
> > +# as published by the Free Software Foundation
> > +#
> > +# lava-core is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU Lesser General Public License
> > +# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
> > +
> > +from __future__ import absolute_import
> > +
> > +"""
> > +lava.core.utils.logging
> > +=======================
> > +
> > +Logging utilities
> > +"""
> > +
> > +import logging
> > +
> > +from lava.core.history import history as global_history
> > +
> > +
> > +class LoggerToHistoryProxy(object):
> > +    """
> > +    Wrapper that looks like logging.Logger but also creates history records
> > +    """
> > +
> > +    _history = global_history
> > +
> > +    def __init__(self, logger):
> > +        self._logger = logger
> > +
> > +    def log(self, level, msg, *args, **kwargs):
> > +        self._logger.log(level, msg, *args, **kwargs)
> > +        self._history.log(level, "{__prefmt_msg}",
> > +                     __prefmt_msg=msg % args, logger=self._logger.name)
> > +
> > +    def debug(self, msg, *args, **kwargs):
> > +        return self.log(logging.DEBUG, msg, *args, **kwargs)
> > +
> > +    def info(self, msg, *args, **kwargs):
> > +        return self.log(logging.INFO, msg, *args, **kwargs)
> > +
> > +    def warning(self, msg, *args, **kwargs):
> > +        return self.log(logging.WARNING, msg, *args, **kwargs)
> > +
> > +    def error(self, msg, *args, **kwargs):
> > +        return self.log(logging.ERROR, msg, *args, **kwargs)
> > +
> > +    def critical(self, msg, *args, **kwargs):
> > +        return self.log(logging.CRITICAL, msg, *args, **kwargs)
> > +
> > +    def exception(self, msg, *args):
> > +        self._logger.exception(msg, *args)
> > +
> > +
> > +class HistoryToLoggerProxy(object):
> > +    """
> > +    Wrapper that looks like History but also logs to logger
> > +    """
> > +
> > +    _history = global_history
> > +
> > +    def __init__(self, logger):
> > +        self._logger = logger
> > +
> > +    def log(self, level, message, **meta):
> > +        self._logger.log(level, message.format(**meta))
> > +        return self._history.log(level, message, **meta)
> > +
> > +    def debug(self, message, **meta):
> > +        return self.log(logging.DEBUG, message, **meta)
> > +
> > +    def info(self, message, **meta):
> > +        return self.log(logging.INFO, message, **meta)
> > +
> > +    def warning(self, message, **meta):
> > +        return self.log(logging.WARNING, message, **meta)
> > +
> > +    def error(self, message, **meta):
> > +        return self.log(logging.ERROR, message, **meta)
> > +
> > +    def critical(self, message, **meta):
> > +        return self.log(logging.CRITICAL, message, **meta)
> > +
> > +
> > +class LoggingMixIn(object):
> > +    """
> > +    Mix-in that adds a self.logger and self.history instance attributes
> > +    """
> 
> I think you should explain the difference between calling a method on
> logger and on history.
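
Going by the two proxy classes above, the visible difference is the interpolation style: `logger` methods take a %-style template with positional arguments, while `history` methods take a `str.format()` template with keyword arguments that are kept as meta-data. A small stand-alone illustration (the message and values are made up):

```python
# logger-style: printf-like template, positional arguments
logger_msg = "fetched %s in %d seconds"
logger_args = ("image.tar.gz", 12)
rendered_by_logger = logger_msg % logger_args

# history-style: str.format() template, keyword arguments kept as meta-data
history_msg = "fetched {name} in {seconds} seconds"
history_meta = {"name": "image.tar.gz", "seconds": 12}
rendered_by_history = history_msg.format(**history_meta)

# Both render the same text; only the history call keeps name/seconds around
print(rendered_by_logger == rendered_by_history)
```
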

> > +    @property
> > +    def _logger(self):
> > +        return logging.getLogger(
> > +            self.__class__.__module__ + "." + self.__class__.__name__)
> > +
> > +    @property
> > +    def history(self):
> > +        """
> > +        magic lava.utils.logging.history.History-like instance
> > +        """
> > +        return HistoryToLoggerProxy(self._logger)
> > +
> > +    @property
> > +    def logger(self):
> > +        """
> > +        magic logging.Logger-like instance
> > +        """
> > +        return LoggerToHistoryProxy(self._logger)
> >
> 
> Cheers,
> mwh

Michael Hudson-Doyle (mwhudson) wrote on 2012-05-09:

On Tue, 08 May 2012 10:11:19 -0000, Zygmunt Krynicki <zygmunt.krynicki@canonical.com> wrote:
> > [And once more, to the merge proposal as well :(]
> > 
> > Hi.  This overall seems reasonably nice.  With things like this I always
> > wonder if we'll ever get around to building the nice UIs that are
> > enabled...
> > 
> > Some general concerns:
> > 
> > 1) Incremental output on the scheduler job page is _massively_ useful
> >    and losing it even temporarily is not acceptable, IMHO.  I don't
> >    quite see how this meshes with this code.  I guess we could still
> >    define a logging handler that outputs log messages to stdout/stderr
> >    and that will display nicely in the scheduler view until we get
> >    around to doing something more sophisticated there.
> 
> While I was somewhat skeptical about this, anything involving fast
> models made me reconsider quickly.

I don't think fast models are fundamentally different here, but
whatever, I'm glad I don't have to convince you :-)

> I did a small proof of concept that streams data in real time
> (anything invoked via the agent) while retaining stdout/stderr
> separation. So I guess that means we'll just keep the verbose output
> for the current scheduler view. In the future we may replace that with
> real-time status updates so that we'd get the full hierarchical
> structure but as you rightfully point out, that may take a while.

Cool.
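
The interim logging handler from point 1 could be as small as the sketch below. It uses only the standard `logging` module; the function name and the "lava" logger name are assumptions, not something defined in this branch.

```python
import logging
import sys


def install_stream_handler(logger_name="lava", stream=None,
                           level=logging.DEBUG):
    """Attach a handler that writes each record out as soon as it is logged.

    The "lava" logger name is an assumption; stream defaults to stdout.
    """
    logger = logging.getLogger(logger_name)
    handler = logging.StreamHandler(stream if stream is not None
                                    else sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return handler
```

Since both proxies in lava/core/logging.py end up calling a regular `logging.Logger`, a handler like this would see messages from `self.logger` and `self.history` alike.
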

> > 2) I can see having two sets of debug/info/warning/error functions with
> >    slightly different APIs (% vs .format() interpolation) being trying.
> >    But maybe this will be a non-issue -- I guess usually the code
> >    will be saying self.history.info() or self.logger.info() and the
> >    logger vs history API will be clear.
> 
> We can still reconsider this as it's reasonably easy/fast to replace
> this with the moderate amount of usage it's seen so far. My
> motivations for format() were twofold: python deprecates % in favor of
> .format() and (perhaps I have over-valued this) I wanted to have
> key-value meta-data in each log message, believing that it could be
> used by some future smart UI to present the data better. If we were to
> go with % formatting then we don't really lose key-value formats
> (they just look different). Perhaps we could do a machine substitution
> so that we can get consistent logging + % API and keep the true data
> behind the scenes. In practical terms it would mean calling
> self.history.info("%(something)s", something="blarg"). What do you
> think?

I'm not sure that preserving the key/value pairs that are intended to be
part of the formatted log message will be much of a win really (because
it's intended to be formatted, you'll likely do things like pass a field
value rather than the whole object, for example, and there's not really
much value in having a simple value both inside and outside the log
message).  I can see the potential desire to attach other data to the
Activity objects though.

So, um, I don't know.  But also, I don't think it's a very big deal...
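
For reference, the %-with-keywords variant proposed above works with plain Python string formatting. A minimal sketch, where `render_percent` is a hypothetical helper rather than part of the branch:

```python
def render_percent(template, **meta):
    """Render a %-style template from keywords and keep the keywords."""
    # The % operator accepts a mapping when the template uses %(name)s
    return template % meta, meta


message, meta = render_percent("%(something)s happened", something="blarg")
print(message)
```

The key-value pairs survive exactly as they do with `.format()`; only the template syntax differs.
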

> > 3) I'm just a little worried by memory consumption -- it seems that
> >    there is the potential to end up with a _lot_ of objects hanging
> >    around, particularly if you hang on to tracebacks.  It might be worth
> >    stringifying things a little earlier.  I guess this won't affect the
> >    API and so can be done later...
> 
> I thought about it too, by demo-2 I made sure we don't keep an
> Activity record for each stdout line we print. In theory this
> dispatcher is better as it never keeps an in-memory log file of any
> test.

Yeah, true, it's not like the current code is good here.

> The amount of 'activities' is pretty much constant across all test
> runs. Still, unlike logging which never suffers from this problem,
> activities do add up. When it becomes a problem we can simply record
> them (in some machine format) as they are being generated. The only
> issue here is that activities that are in flight (__enter__()) will
> require seeking + updates to "close" and any exceptions we generate
> could need special treatment (as exception is varying in size).
>
> As for stringifying: yeah, I have been doing this myself on the caller
> side (calling str early to avoid references to heavy objects). It's
> somewhat tricky to do on the side of the callee (inside history) as
> we really need to call .format() with appropriate options and remember
> that. I'll see if there is something I can do without resorting to
> parsing the format message myself.

I think it's fine to note this as an area where work may be required.
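
One possible shape for the callee-side stringification, as a sketch only: render the message at log time and keep string copies of the meta-data, so no heavy objects stay referenced. `freeze` is a hypothetical helper, and the trade-off is that the template can no longer be re-rendered from the original objects later.

```python
def freeze(message, **meta):
    """Render the message now and keep only string copies of the meta-data."""
    # Format specs such as {n:03d} are applied while the real objects exist
    rendered = message.format(**meta)
    frozen_meta = dict((key, str(value)) for key, value in meta.items())
    return rendered, frozen_meta


rendered, frozen = freeze("build {n:03d} finished", n=7)
print(rendered)
```
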

> > On Mon, 07 May 2012 09:49:18 -0000, Zygmunt Krynicki
> > <zygmunt.krynicki@canonical.com> wrote:
> > > Zygmunt Krynicki has proposed merging lp:~zkrynicki/lava-core/history-and-logging into lp:lava-core.
> > >
> > > Requested reviews:
> > >   Linaro Validation Team (linaro-validation)
> > >
> > > For more details, see:
> > > https://code.launchpad.net/~zkrynicki/lava-core/history-and-logging/+merge/104882
> > >
> > > This is a re-spin of earlier merge proposal
> > > (https://code.launchpad.net/~zkrynicki/lava-core/logging-mixin/+merge/102339)
> > >
> > > Again, to quote my extensive commit message:
> > >
> > >   Add the logging and history modules
> > >
> > >   Both modules cooperate on centralized logging interface and have largely
> > >   identical APIs. LoggingMixIn class adds self.history and self.logging
> > >   properties to the class it is injected to.
> > >
> > >   Both properties return proxy objects that ultimately call classic python
> > >   logging and lava-core'esque history. History is much like logging but has two
> > >   advantages over plain logging.
> > >
> > >   The first advantage is that it was designed up front for being hierarchical.
> > >   This naturally constructs a tree of activities. I believe that such structure
> > >   can be visualized much better than a plain log file ever could. Any composite
> > >   operation also computes and records the duration of all the activities it
> > >   contains. This yields much more transparent duration awareness of entire
> > >   processes that lava performs internally.
> > >
> > >   The second advantage is that it can carry more data than regular python logging
> > >   can. All arguments to the message template are passed to the string.format()
> > >   function as keywords. This allows us to transparently provide key-value data in
> > >   alongside each history entry. This makes automatic analysis and data retrieval
> > >   much easier. It also allows for new data logging features such as downloading a
> > >   file and storing the speed history as an array in the activity meta-data. Such
> > >   object can be rendered with appropriate care by the front end thus making the
> > >   user interface far more rich than we could previously attempt without extreme
> > >   hacks and side-band, custom communication channels. Another feature that takes
> > >   advantage of this rich meta-data is attachments. Any activity can have a
> > >   collection of attachments that can be presented in a web front-end (attachments
> > >   also carry explicit MIME type). LAVA Core currently uses this to ensure we
> > >   capture full, uncorrupted output of any external programs we run.
> > >
> > >   The history module offers utility functions that can render any activity (or
> > >   the convenience lava.core.history.history.top activity which records all known
> > >   history of the process) as JSON or plain text (in several formats). In addition
> > >   both formats can be saved, along with all the attachments, in a history
> > >   tarball. This functionality is exploited in the top-level lava command line
> > >   program, that allows the user to record the history of any LAVA commands, for
> > >   visualization, debugging, rich analysis or other, yet-unknown purpose.
> > >
> > > --
> > > https://code.launchpad.net/~zkrynicki/lava-core/history-and-logging/+merge/104882
> > > You are subscribed to branch lp:lava-core.
> > > === added file 'lava/core/history.py'
> > > --- lava/core/history.py      1970-01-01 00:00:00 +0000
> > > +++ lava/core/history.py      2012-05-07 09:48:17 +0000
> > > @@ -0,0 +1,504 @@
> > > +# Copyright (C) 2011-2012 Linaro Limited
> > > +# vim: set fileencoding=utf8 :
> > > +#
> > > +# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
> > > +#
> > > +# This file is part of lava-core
> > > +#
> > > +# lava-core is free software: you can redistribute it and/or modify
> > > +# it under the terms of the GNU Lesser General Public License version 3
> > > +# as published by the Free Software Foundation
> > > +#
> > > +# lava-core is distributed in the hope that it will be useful,
> > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +# GNU General Public License for more details.
> > > +#
> > > +# You should have received a copy of the GNU Lesser General Public License
> > > +# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
> > > +
> > > +from __future__ import absolute_import, print_function
> > > +
> > > +"""
> > > +lava.core.utils.history
> > > +=======================
> > > +
> > > +Implementation of the hierarchical history
> > > +"""
> > > +
> > > +from logging import DEBUG, INFO, WARNING, ERROR, CRITICAL
> > > +import collections
> > > +import datetime
> > > +import os
> > > +import sys
> > > +import tarfile
> > > +import traceback
> > > +
> > > +from json_document.serializers import JSON
> > > +from json_schema_validator.extensions import (
> > > +    datetime_extension, timedelta_extension)
> > > +import simplejson
> > > +
> > > +
> > > +class DictProxy(object):
> > 
> > A short docstring would be nice.  I guess "I wish I was writing
> > Javascript" would not be helpful :-)
> 
> Will do
> 
> > > +    def __init__(self, d):
> > > +        object.__setattr__(self, "_d", d)
> > > +
> > > +    def as_dict(self):
> > > +        return object.__getattribute__(self, "_d")
> > 
> > "return self._d" works here. __getattr__ is only called for attributes
> > that are not found by the usual mechanisms.  I guess we'd better hope
> > that the dictionary does not have '_d' as a key :-)
> 
> Have you checked this?

Yes :)

> (it could have been late but I remember it just did not work, despite
> all the knowledge of python descriptors telling me otherwise).
> 
> > > +    def __getattr__(self, name):
> > > +        try:
> > > +            return self.as_dict()[name]
> > > +        except KeyError:
> > > +            raise AttributeError(name)
> > > +
> > > +    def __setattr__(self, name, value):
> > > +        self.as_dict()[name] = value
> > 
> > Similarly "self._d[name] = value".
> 
> Thinking about it now, I may have made a mistake early on, that I did
> not use object.__setattr__ in __init__. That would cascade all the
> getters and setters and finally run out of stack space.

Ah.  That would do it.
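
To make the point concrete, here is a stand-alone sketch of the class with the simplifications suggested above. Only `__init__` needs `object.__setattr__`; everywhere else `self._d` resolves through normal lookup, because `__getattr__` is consulted only when that lookup fails.

```python
class DictProxy(object):
    """Attribute-style view of a dictionary."""

    def __init__(self, d):
        # Bypass our own __setattr__, which would otherwise look up
        # self._d before it exists and recurse through __getattr__
        object.__setattr__(self, "_d", d)

    def as_dict(self):
        return self._d

    def __getattr__(self, name):
        # Only reached for names that normal attribute lookup cannot find
        try:
            return self._d[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._d[name] = value
```
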

> 
> > > +
> > > +
> > > +ExceptionInfo = collections.namedtuple(
> > > +    "ExceptionInfo", "filename lineno function text")
> > 
> > > I don't think this helps.  See below.
> 
> Hmm, I only wanted to capture exceptions as objects, not lists.

Well, the only access to exceptions in the current code is in the code
that serializes them, and as ExceptionInfo is a namedtuple, and so a
tuple, they serialize to a list in the JSON.

Perhaps it is worth keeping ExceptionInfo just for additional
clarity.

> > > +
> > > +_name_and_back = {
> > > +    DEBUG: "DEBUG",
> > > +    INFO: "INFO",
> > > +    WARNING: "WARNING",
> > > +    ERROR: "ERROR",
> > > +    CRITICAL: "CRITICAL",
> > > +    "DEBUG": DEBUG,
> > > +    "INFO": INFO,
> > > +    "WARNING": WARNING,
> > > +    "ERROR": ERROR,
> > > +    "CRITICAL": CRITICAL
> > > +}
> > 
> > This duplicates logging._levelNames.  But I guess the underscore scares
> > one away from that...
> > 
> > You also only seem to use the DEBUG -> "DEBUG" side of it.
> > 
> > > +class Activity(object):
> > > +
> > > +    def __init__(self, level, message, **meta):
> > > +        """
> > > +        Create a new activity with the specified message, level and
> > > +        initial meta-data
> > > +        """
> > > +        # Message that has format() references to meta-data
> > > +        self._level = level
> > > +        self._message = message
> > > +        self._meta = DictProxy(meta)
> > > +        # File attachments
> > > +        self._attachments = []
> > > +        # Children activities
> > > +        self._sub_activities = []
> > > +        # Start and end timestamps
> > > +        self._start = datetime.datetime.utcnow()
> > > +        self._end = self._start
> > > +        # Exception + traceback
> > > +        self._exc_value = None
> > > +        self._exc_type = None
> > > +        self._traceback = None
> > > +
> > > +    @property
> > > +    def raw_message(self):
> > > +        """
> > > +        the raw, unformatted message
> > > +        """
> > > +        return self._message
> > > +
> > > +    @property
> > > +    def message(self):
> > > +        """
> > > +        rendered message
> > > +
> > > +        Rendering substitutes str.format() style variables with data from
> > > +        self.meta.
> > > +        """
> > > +        return self._message.format(**self._meta.as_dict())
> > > +
> > > +    @property
> > > +    def level(self):
> > > +        """
> > > +        level number, 10 (DEBUG) - 50 (CRITICAL)
> > > +        """
> > > +        return self._level
> > > +
> > > +    @property
> > > +    def level_name(self):
> > > +        """
> > > +        name of the level "DEBUG" ... "CRITICAL"
> > > +        """
> > > +        return _name_and_back[self._level]
> > > +
> > > +    @property
> > > +    def meta(self):
> > > +        """
> > > +        collection of meta-data
> > > +
> > > +        Meta-data serves two purposes: to fill in data to the message
> > > +        template and to store arbitrary auxiliary data. This property
> > > +        returns a dictionary proxy that looks like an object (has normal
> > > +        attribute accessors) but still talks to the real dictionary behind
> > > +        the scenes. You can both read and write/set attributes at will.
> > > +        """
> > > +        return self._meta
> > > +
> > > +    def attach(self, stream, name=None,  mime_type='text/plain'):
> > > +        """
> > > +        Add an attachment.
> > > +
> > > +        Everything is extracted from the file-like object stream.
> > 
> > This is dishonest: it had better be a _file_, or save_as_tarball won't
> > work (stream.name has to name a path on disk for tarball.add() to work,
> > surely?).
> 
> Right, I'll correct the docstring. Originally I meant to say it does
> not have to be the value of open() because tempfile.TemporaryFile and
> NamedTemporaryFile return file-like objects that have proper .name
> attribute that is created in the filesystem when needed (for example
> by fileno() or .name)

Ah, true it doesn't have to be an actual isinstance(..., file) I guess, but
it does have to correspond to a file on disk.  When I see "file-like" I
think StringIO :)
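
The distinction can be checked up front. A sketch, assuming `attach()` wanted to reject streams that `save_as_tarball` could not add later; `has_backing_file` is a hypothetical helper, written for Python 3:

```python
import io
import os
import tempfile


def has_backing_file(stream):
    """True when stream.name is a path that exists on disk right now."""
    name = getattr(stream, "name", None)
    return isinstance(name, str) and os.path.exists(name)


# StringIO is "file-like" but has no name and nothing on disk
print(has_backing_file(io.StringIO("log text")))

# NamedTemporaryFile has a real path for as long as it stays open
with tempfile.NamedTemporaryFile() as stream:
    print(has_backing_file(stream))
```
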

> > > If you care
> > > +        about a particular name of the attachment it can be specified with name.
> > > +        MIME type can help the web front-end to render the content. It defaults
> > > +        to text/plain which is suitable for most log files.
> > > +        """
> > > +        if name is None:
> > > +            name = os.path.basename(stream.name)
> > > +        self._attachments.append(
> > > +            DictProxy({
> > > +                'name': name,
> > > +                'orig_name': stream.name,
> > > +                'mime_type': mime_type}))
> > 
> > I don't know that it's worth using DictProxy here as you only use this
> > once...
> 
> Yeah, I have this aversion to foo['key']
> 
> > 
> > > +    @property
> > > +    def duration(self):
> > > +        """
> > > +        Duration of the activity.
> > > +
> > > +        If the activity has not started the duration is always zero.
> > > +        Activities that are in progress will observe the duration to grow.
> > > +        """
> > > +        if self._start is None:
> > > +            return datetime.timedelta()
> > > +        if self._end is None:
> > > +            return datetime.datetime.utcnow() - self._start
> > > +        return self._end - self._start
> > > +
> > > +    def as_json(self):
> > > +        """
> > > +        Convert this Activity to JSON
> > 
> > Meep.  This doesn't return JSON (JSON is the _string_).  Maybe
> > "prepare_for_json" ?
> 
> well JSON is serialized as strings but you are correct, I freely call
> the serialization format and the "object model" json which is
> confusing.
> 
> > 
> > > +        """
> > > +        obj = simplejson.OrderedDict((
> > > +            ("level", self.level_name),
> > > +            ("message", self._message),
> > > +            ("meta", self.meta.as_dict()),
> > 
> > Something somewhere is going to break if meta contains things that can't
> > be JSON-ed, right?
> 
> No, _build_json() will fix that, see below.

Well, OK, _build_json will error.  I'm not sure that erroring at this
last stage is a good idea though -- you could have the bad data in a
history.log call 1 second in, but then it would be the log serialization
right at the end of the job that would fail.  Smells bad.
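
One way to move the failure to the call site, sketched with the standard `json` module rather than the serializers this branch uses: check every meta value when the activity is created. `check_meta` is a hypothetical helper and is stricter than `_build_json`, which falls back to `str()`.

```python
import json


def check_meta(**meta):
    """Raise TypeError immediately if any meta value cannot be serialized."""
    for key, value in meta.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError) as exc:
            raise TypeError(
                "meta[{0!r}] is not JSON-serializable: {1}".format(key, exc))
    return meta
```

The cost is one extra serialization per log call, which may or may not be acceptable on hot paths.
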

> > > +            ("start", (datetime_extension.to_json(self._start)
> > > +                       if self._start is not None else None)),
> > > +            ("duration", timedelta_extension.to_json(self.duration)),
> > > +            ("end", (datetime_extension.to_json(self._end)
> > > +                     if self._end is not None else None)),
> > > +            ("exc_value", self._exc_value),
> > > +            ("exc_type", (self._exc_type.__name__
> > > +                          if self._exc_type else None)),
> > > +            ("traceback", ([ExceptionInfo(*ei) for ei in
> > > +                            traceback.extract_tb(self._traceback)]
> > > +                           if self._traceback else None)),
> > 
> > This seems a bit of a complicated way of saying "ei.filename ei.lineno
> > ei.function ei.text for ei in ..."
> 
> The thing is that python's exception info returns a list, not a
> key-value or named tuple. Unless you want a raw list in the json then
> you want this oddity to keep the names of each value.

Um.

>>> import collections, json
>>> X = collections.namedtuple("X", 'x y')
>>> json.dumps(X(1,2))
'[1, 2]'
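
If the field names are meant to survive, a small sketch (standard json module assumed, names purely illustrative) is to convert the named tuple to a mapping first:

```python
import collections
import json

X = collections.namedtuple("X", "x y")
point = X(1, 2)

# A named tuple is still a tuple, so the names are lost.
as_list = json.dumps(point)

# _asdict() returns a mapping in field order, so the names are kept.
as_mapping = json.dumps(point._asdict())
```

Here as_list is '[1, 2]' while as_mapping is '{"x": 1, "y": 2}'.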

> > 
> > > +            ("attachments", [
> > > +                attachment.as_dict() for attachment in self._attachments]),
> > > +            ("sub_activities", [
> > > +                activity.as_json() for activity in self._sub_activities])))
> > > +        return _build_json(obj)
> > 
> > It strikes me that this would be more proof against odd things in meta
> > and a bit simpler if _build_json (which also has a bad name) handled
> > datetimes and timedeltas.
> 
> I'm a victim of refactoring: initially _build_json was a small helper that did something else. I'll fix that.

OK.  Yay for code review :)

> > > +    def __str__(self):
> > > +        """
> > > +        Returns self.message
> > > +        """
> > > +        return self.message
> > > +
> > > +    def save_as_tarball(self, scratch, tarball_stream, mode='w:gz'):
> > > +        """
> > > +        Save activity history as a tarball.
> > > +
> > > +        The tarball will have two files 'history.txt' and 'history.json' with
> > > +        appropriate encoding of this activity, as well as all sub-activities.
> > > +        Any attachments are added as well.
> > > +        """
> > > +        with tarfile.open(tarball_stream.name, mode=mode,
> > > +                          fileobj=tarball_stream) as tarball:
> > > +            with scratch.open('history.txt') as stream:
> > > +                self.print_(stream=stream)
> > 
> > Heh ordering snafu?  scratch isn't introduced yet.  Never mind though.
> 
> I saw that later, you are correct ;-)
>  
> > It's never occurred to me before, but it's pretty weird to see code of
> > the form:
> > 
> > with ${a}.open(${b}) as var:
> >     ... code that writes to var ...
> > 
> > This is the tempfile module's fault, not yours I guess :)
> 
> On the other hand if you just call .unique() it's somewhat readable and clean (as long as you don't add any arguments)
> 
> with scratch.unique() as stream:
>    ....

Actually, now I've read the scratch module I think scratch.open is a
slightly strange interface with respect to what happens when the file
already exists.  I think you'd be better off using scratch.unique()
here -- you override the stream name when adding it to the tarball
anyway, right?

> > > +            tarball.add(stream.name, 'history.txt')
> > > +            with scratch.open('history.json') as stream:
> > > +                JSON.dump(stream, self.as_json())
> > > +            tarball.add(stream.name, 'history.json')
> > > +            self._save_tarball_attachments(tarball)
> > > +
> > > +    def _save_tarball_attachments(self, tarball):
> > > +        for attachment in self._attachments:
> > > +            if os.path.isfile(attachment.orig_name):
> > > +                tarball.add(attachment.orig_name, attachment.name)
> > > +        for activity in self._sub_activities:
> > > +            activity._save_tarball_attachments(tarball)
> > 
> > Err, you'd better hope that no two attachments have the same name or are
> > called 'history.json' or 'history.txt' here!
> 
> My bad, python's TarFile class is hideously crappy. Originally all
> attachments were in a directory but it turns out you really cannot add
> a fake directory to a TarFile without reading the implementation and
> hardcoding magic tuples. Still, point taken, I'll add a TODO entry
> about this. This is also why I need a scratch instance to create
> history.json/txt as a real file, and not as data stream inside TarFile
> instance (which is silly and wastes resources).

Oh yuk.  Is the zipfile module any better?
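
For what it's worth, tarfile does expose TarInfo and TarFile.addfile, which accept an in-memory file object. The sketch below is only an illustration of that route (add_bytes is a hypothetical helper, and the byte strings are made up); it also shows how a path prefix in the archive name keeps attachments from clashing with the history files:

```python
import io
import tarfile


def add_bytes(tarball, arcname, data):
    """Add in-memory bytes to an open TarFile under the given archive name."""
    info = tarfile.TarInfo(name=arcname)
    info.size = len(data)
    tarball.addfile(info, io.BytesIO(data))


buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tarball:
    add_bytes(tarball, "history.json", b'{"level": "DEBUG"}')
    # A prefix keeps an attachment with the same base name separate.
    add_bytes(tarball, "attachments/history.json", b"unrelated attachment")

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tarball:
    names = tarball.getnames()
    history_bytes = tarball.extractfile("history.json").read()
```

Whether this is nicer than writing real files through scratch depends on how large the history gets; it is not what the branch does.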

> > I'm a bit uneasy about how this attachment thing is going to work out in
> > practice.  When a job has finished, it will create a tarball, that ends
> > up in the media files of lava-server and the scheduler job view will
> > fish the history.json out of this tarball to know how to render the
> > basic outline, including things that will do ajax to another view that
> > fishes other files out of the tarball to either be downloaded or shown
> > inline in the page?  But we'd like to have incremental output and I
> > don't really like the idea of a page for a completed job being
> > completely different to that for a job that's still running.
> 
> Incremental output will be something we get in the classic method (as
> it works now), I think we'll find a way to manage history tarballs
> when we get to work on GUI again. Also, read my comments at the very
> top of this reply, where I ack the need for incremental output in
> certain, *cough* slow models *cough* situations.

Yeah, I guess we can get something working...

> > It just feels like the bits don't quite line up here.
> > 
> > > +    def __repr__(self):
> > > +        return "<Activity: %r>" % str(self)
> > > +
> > > +    def __enter__(self):
> > > +        """
> > > +        Begins a 'long' activity.
> > > +
> > > +        Long activities have both start and end, and thus, non-zero duration.
> > > +        This is called automatically by History.__enter__ so you won't usually
> > > +        have to use it explicitly.
> > > +
> > > +        Here we reset self.end to None (it was initialized to 'now' in
> > > +        __init__)
> > > +        """
> > > +        self._end = None
> > > +        return self
> > > +
> > > +    def __exit__(self, exc_type, exc_value, traceback):
> > > +        """
> > > +        Terminates a 'long' activity.
> > > +
> > > +        Here we set self.end to 'now' and record any exception information.
> > > +        See __enter__() for more information on 'long' activities.
> > > +        """
> > > +        self._end = datetime.datetime.utcnow()
> > > +        self._exc_value = exc_value
> > > +        self._exc_type = exc_type
> > > +        self._traceback = traceback
> > > +
> > > +    def print_(self, stream=None, print_meta=False, print_children=True,
> > > +               stack=None):
> > > +        """
> > > +        Print a nice Unicode tree of activities.
> > > +
> > > +        This is very useful for tracing and debugging.
> > > +        """
> > > +        if stack is None:
> > > +            stack = []
> > > +        if stream is None:
> > > +            stream = sys.stdout
> > > +        fragments = []
> > > +        if print_meta is False:
> > 
> > Please don't compare immutable values with 'is'.
> 
> False is not only immutable, it is a singleton ;-) I guess I over-do 'is None'

Hm yeah.  Well this trips my wtf circuit -- I guess if not print_meta:
is better really.
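
The practical difference between the two spellings only shows up for falsy values other than False. A tiny sketch, with hypothetical function names standing in for the two forms of the test:

```python
def hides_meta_by_identity(print_meta):
    # Mirrors `if print_meta is False:` from the diff.
    return print_meta is False


def hides_meta_by_truthiness(print_meta):
    # Mirrors the suggested `if not print_meta:`.
    return not print_meta


falsy_values = [False, 0, None, ""]
by_identity = [hides_meta_by_identity(value) for value in falsy_values]
by_truthiness = [hides_meta_by_truthiness(value) for value in falsy_values]
```

Only False itself passes the identity test, while every falsy value passes the truthiness test.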

> > In the code as submitted, there's no way for print_meta to be True (or
> > print_children to be False, come to that).  Will this change in a later
> > branch?
> 
> You can always call print_() directly but if you want I can remove it now.

Hm, I don't know.  The code is written I guess.

> > > +            fragments.append("{0:7}".format(self.level_name))
> > > +            fragments.append(" | ")
> > > +            fragments.append("{start}.{micro:06}".format(
> > > +                start=self._start.strftime("%Y-%m-%d %T"),
> > > +                micro=self._start.microsecond))
> > > +            fragments.append(" | ")
> > > +            if self.duration.total_seconds() > 0:
> > > +                fragments.append("{0:14}s".format(self.duration))
> > > +            else:
> > > +                fragments.append("{0:15}".format(""))
> > > +            fragments.append(" | ")
> > > +            fragments.append(''.join(self._tree_indent(stack)))
> > > +        # Process the message to make it safer
> > > +        message = self.message
> > > +        message = message.replace("\n", " ").replace("\r", " ")
> > > +        if isinstance(message, unicode):
> > > +            message = message.encode(stream.encoding or "UTF-8", 'replace')
> > > +        fragments.append(message)
> > > +        if self._exc_value:
> > > +            fragments.append(", ***crashed*** {0}".format(self._exc_value))
> > > +        print("".join(fragments), file=stream)
> > > +        if print_meta:
> > > +            print("\tstart: {0}".format(self._start))
> > > +            if self.duration.total_seconds() > 0:
> > > +                print("\tduration: {0}".format(self.duration))
> > > +            if self._end is not None and self._end != self._start:
> > > +                print("\tend: {0}".format(self._end))
> > > +            if self._exc_value is not None:
> > > +                print("\texc_value: {0}".format(self._exc_value))
> > > +            if self._exc_type is not None:
> > > +                print("\texc_type: {0}".format(self._exc_type))
> > > +            if self._traceback is not None:
> > > +                for index, exc_info in enumerate(self._traceback):
> > > +                    print("\ttraceback.{0}.filename: {1}".format(
> > > +                        index, self._traceback.filename))
> > > +                    print("\ttraceback.{0}.lineno: {1}".format(
> > > +                        index, self._traceback.lineno))
> > > +                    print("\ttraceback.{0}.function {1}".format(
> > > +                        index, self._traceback.function))
> > > +                    print("\ttraceback.{0}.text {1}".format(
> > > +                        index, self._traceback.text))
> > > +            for key, value in self.meta.as_dict().iteritems():
> > > +                # Skip one special key used by logging integration
> > > +                if key == "__prefmt_msg":
> > > +                    continue
> > > +                print("\t{key}: {value!r}".format(key=key, value=value))
> > > +        if print_children:
> > > +            pos = len(stack)
> > > +            stack.append(len(self._sub_activities))
> > > +            for sub in self._sub_activities:
> > > +                stack[pos] -= 1
> > > +                sub.print_(stream, print_meta, print_children, stack)
> > > +            stack.pop()
> > > +
> > > +    @staticmethod
> > > +    def _tree_indent(stack, use_unicode=False):
> > > +        """
> > > +        Compute the indentation part of the tree displayed by print_.
> > > +        """
> > > +        # The code here might be tricky. It's easier to understand if you keep
> > > +        # in mind that stack[n] tells you how many more entries are left in the
> > > +        # tree at that stack depth. The stack is constructed by print_()
> > 
> > I haven't really tried to understand the printing code, sorry.
> 
> That's fine, neither did I, the code has a bug but I failed to
> identify it after an hour of looking at the algorithm. Luckily browsers
> implement <ul>...</ul> correctly.

Heh.

> > > +        for items_left in stack[:-1]:
> > > +            # FIXME: this is broken
> > > +            if items_left == 0:
> > > +                yield "    "
> > > +            elif items_left == 1:
> > > +                if use_unicode:
> > > +                    yield "└   "
> > 
> > Wow, this code confused emacs when reading the original email -- I guess
> > it didn't know it was utf-8.  As with print_meta I don't see a way for
> > use_unicode to be True, and would sort of prefer to avoid unicode here I
> > think.
> 
> I've added the python "OMG UTF-8" stanza at the top of the
> file. Unicode looks better if you see it (it could be just dumped in
> <pre> on a page without looking childish). If I fix the FIXME bug
> above I'll probably discard the non-unicode variant. Also, we have a
> unicode/logging bug somewhere (eh, python + redirected stdout gets
> ASCII insanity) so this was really a workaround for that. If you do
> .encode("UTF-8") then some of the formatting code above will break as
> it needs to measure the length of a string to align everything.

Heh I'm sure.  Let's leave this alone.
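
For reference, the indentation scheme can be sketched on its own. This is an independent ASCII-only sketch, not the code from the branch: it keeps the same convention (stack[n] is the number of siblings still to come at depth n) but treats every ancestor with siblings left as needing a vertical bar. That may or may not be the bug behind the FIXME; the special case for items_left == 1 is simply absent here. The (label, children) tuples and function names are made up for the example.

```python
def tree_indent(stack):
    """Yield indentation fragments; stack[n] = siblings still to come at depth n."""
    for items_left in stack[:-1]:
        # An ancestor with siblings still to come needs a vertical bar.
        yield "|   " if items_left > 0 else "    "
    for items_left in stack[-1:]:
        # The node itself: a corner if it is the last child, a tee otherwise.
        yield "\\-- " if items_left == 0 else "|-- "


def render(node, stack=None, lines=None):
    """Render a (label, children) tree into a list of text lines."""
    stack = [] if stack is None else stack
    lines = [] if lines is None else lines
    label, children = node
    lines.append("".join(tree_indent(stack)) + label)
    stack.append(len(children))
    for child in children:
        stack[-1] -= 1
        render(child, stack, lines)
    stack.pop()
    return lines


tree = ("top", [("a", [("a1", []), ("a2", [])]), ("b", [("b1", [])])])
lines = render(tree)
```

Joining lines with newlines gives a tree where "a" and its children sit under a continuous bar and "b", being the last child, gets a corner.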

> > > +                else:
> > > +                    yield "\   "
> > > +            elif items_left > 1:
> > > +                if use_unicode:
> > > +                    yield "│   "
> > > +                else:
> > > +                    yield "|   "
> > > +        for items_left in stack[-1:]:
> > > +            if items_left == 0:
> > > +                if use_unicode:
> > > +                    yield "└── "
> > > +                else:
> > > +                    yield "\-- "
> > > +            elif items_left > 0:
> > > +                if use_unicode:
> > > +                    yield "├── "
> > > +                else:
> > > +                    yield "|-- "
> > > +
> > > +
> > > +def _build_json(obj):
> > > +    """
> > > +    Helper to construct json that keeps ordering and discards None values
> > 
> > As above: this doesn't return JSON.
> 
> Ack
> 
> > > +    """
> > > +    if isinstance(obj, (dict, simplejson.OrderedDict)):
> > > +        new_obj = simplejson.OrderedDict()
> > > +        for key, value in obj.iteritems():
> > > +            if value is None or value == [] or value == {}:
> > > +                continue
> > > +            else:
> > > +                new_obj[str(key)] = _build_json(value)
> > > +        return new_obj
> > > +    elif isinstance(obj, list):
> > > +        return [_build_json(item) for item in obj]
> > > +    elif (isinstance(obj, (int, float, str, unicode, list, dict, bool))
> > > +          or obj is None):
> > > +        return obj
> > 
> > er, list is in here twice.  I don't understand how ExceptionInfo (or
> > tuple come to that) passed through here.  I think you should delete
> > ExceptionInfo anyway, but you probably want to handle tuples?
> 
> Hmm, some good points, right. I'll merge list to (list, tuple) above and get rid of them from the elif clause.

Thanks.

> > > +    else:
> > > +        try:
> > > +            return str(obj)
> > > +        except Exception:
> > > +            raise ValueError("cannot convert {0!r} to JSON".format(obj))
> > > +
> > > +
> > > +class History(object):
> > > +    """
> > > +    History class, provides helper methods for hierarchical logging.
> > > +    Maintains a stack of activities that are nested with context
> > > +    management.
> > > +    """
> > > +
> > > +    def __init__(self, top):
> > > +        """
> > > +        Create a History that starts with the specified 'top' activity
> > > +
> > > +        Usually you don't want to call this. Instead, use the singleton
> > > +        instance 'history'.
> > > +        """
> > > +        self._nesting = [top]
> > > +        self._last = None
> > > +
> > > +    @property
> > > +    def top(self):
> > > +        """
> > > +        the top-level activity
> > > +
> > > +        Very useful to grab entire history and do something with it, for
> > > +        example history.top.as_json() or history.top.print_()
> > > +        """
> > > +        return self._nesting[0]
> > > +
> > > +    @property
> > > +    def bottom(self):
> > > +        """
> > > +        the bottommost activity
> > > +
> > > +        This is the most nested activity, as established by:
> > > +
> > > +            with history.xxx():
> > > +                ...
> > > +        """
> > > +        return self._nesting[-1]
> > > +
> > > +    @property
> > > +    def last(self):
> > > +        """
> > > +        the most recent activity
> > > +
> > > +        This is the most recently created activity in this history.
> > > +        """
> > 
> > The lack of implementation of this method suggests it's not actually
> > used anywhere :-)
> 
> But, hey, it's documented ;-)
> 
> > 
> > > +    def log(self, level, message, **meta):
> > > +        """
> > > +        Create an activity and append it to the history
> > > +        """
> > > +        self._last = Activity(level, message, **meta)
> > > +        self.bottom._sub_activities.append(self._last)
> > > +        return self
> > > +
> > > +    def debug(self, message, **meta):
> > > +        """
> > > +        Append DEBUG activity to the bottommost activity
> > > +        """
> > > +        return self.log(DEBUG, message, **meta)
> > > +
> > > +    def info(self, message, **meta):
> > > +        """
> > > +        Append INFO activity to the bottommost activity
> > > +        """
> > > +        return self.log(INFO, message, **meta)
> > > +
> > > +    def warning(self, message, **meta):
> > > +        """
> > > +        Append WARNING activity to the bottommost activity
> > > +        """
> > > +        return self.log(WARNING, message, **meta)
> > > +
> > > +    def error(self, message, **meta):
> > > +        """
> > > +        Append ERROR activity to the bottommost activity
> > > +        """
> > > +        return self.log(ERROR, message, **meta)
> > > +
> > > +    def critical(self, message, **meta):
> > > +        """
> > > +        Append CRITICAL activity to the bottommost activity
> > > +        """
> > > +        return self.log(CRITICAL, message, **meta)
> > > +
> > > +    def __enter__(self):
> > > +        """
> > > +        Begins nested activity
> > > +
> > > +        Nesting is widely useful as it helps to create a structure in the
> > > +        history stream. Typically you will call this method indirectly
> > > +        with code that looks like this:
> > > +
> > > +            with history.info("Doing something complex"):
> > > +                history.debug("step 1")
> > > +                history.debug("step 2")
> > > +                history.debug("step 3")
> > > +
> > > +        Here we also call __enter__() on the last activity to convert it to a
> > > +        'long' activity (one that has both start, end and non-zero duration)
> > > +        """
> > > +        self._nesting.append(self._last)
> > > +        return self._last.__enter__()
> > > +
> > > +    def __exit__(self, exc_type, exc_value, traceback):
> > > +        """
> > > +        Terminates most recently nested activity
> > > +
> > > +        Here we also call __exit__() on the last activity to allow it to record
> > > +        any exceptions that may have been raised.
> > > +        """
> > > +        last = self._nesting.pop()
> > > +        last.__exit__(exc_type, exc_value, traceback)
> > > +
> > > +
> > > +# Global history
> > > +history = History(Activity(DEBUG, "History module loaded"))
> > >
> > > === added file 'lava/core/logging.py'
> > > --- lava/core/logging.py      1970-01-01 00:00:00 +0000
> > > +++ lava/core/logging.py      2012-05-07 09:48:17 +0000
> > > @@ -0,0 +1,120 @@
> > > +# Copyright (C) 2011-2012 Linaro Limited
> > > +# vim: set fileencoding=utf8 :
> > > +#
> > > +# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
> > > +#
> > > +# This file is part of lava-core
> > > +#
> > > +# lava-core is free software: you can redistribute it and/or modify
> > > +# it under the terms of the GNU Lesser General Public License version 3
> > > +# as published by the Free Software Foundation
> > > +#
> > > +# lava-core is distributed in the hope that it will be useful,
> > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +# GNU General Public License for more details.
> > > +#
> > > +# You should have received a copy of the GNU Lesser General Public License
> > > +# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
> > > +
> > > +from __future__ import absolute_import
> > > +
> > > +"""
> > > +lava.core.utils.logging
> > > +=======================
> > > +
> > > +Logging utilities
> > > +"""
> > > +
> > > +import logging
> > > +
> > > +from lava.core.history import history as global_history
> > > +
> > > +
> > > +class LoggerToHistoryProxy(object):
> > > +    """
> > > +    Wrapper that looks like logging.Logger but also creates history records
> > > +    """
> > > +
> > > +    _history = global_history
> > > +
> > > +    def __init__(self, logger):
> > > +        self._logger = logger
> > > +
> > > +    def log(self, level, msg, *args, **kwargs):
> > > +        self._logger.log(level, msg, *args, **kwargs)
> > > +        self._history.log(level, "{__prefmt_msg}",
> > > +                     __prefmt_msg=msg % args, logger=self._logger.name)
> > > +
> > > +    def debug(self, msg, *args, **kwargs):
> > > +        return self.log(logging.DEBUG, msg, *args, **kwargs)
> > > +
> > > +    def info(self, msg, *args, **kwargs):
> > > +        return self.log(logging.INFO, msg, *args, **kwargs)
> > > +
> > > +    def warning(self, msg, *args, **kwargs):
> > > +        return self.log(logging.WARNING, msg, *args, **kwargs)
> > > +
> > > +    def error(self, msg, *args, **kwargs):
> > > +        return self.log(logging.ERROR, msg, *args, **kwargs)
> > > +
> > > +    def critical(self, msg, *args, **kwargs):
> > > +        return self.log(logging.CRITICAL, msg, *args, **kwargs)
> > > +
> > > +    def exception(self, msg, *args):
> > > +        self._logger.exception(msg, *args)
> > > +
> > > +
> > > +class HistoryToLoggerProxy(object):
> > > +    """
> > > +    Wrapper that looks like History but also logs to logger
> > > +    """
> > > +
> > > +    _history = global_history
> > > +
> > > +    def __init__(self, logger):
> > > +        self._logger = logger
> > > +
> > > +    def log(self, level, message, **meta):
> > > +        self._logger.log(level, message.format(**meta))
> > > +        return self._history.log(level, message, **meta)
> > > +
> > > +    def debug(self, message, **meta):
> > > +        return self.log(logging.DEBUG, message, **meta)
> > > +
> > > +    def info(self, message, **meta):
> > > +        return self.log(logging.INFO, message, **meta)
> > > +
> > > +    def warning(self, message, **meta):
> > > +        return self.log(logging.WARNING, message, **meta)
> > > +
> > > +    def error(self, message, **meta):
> > > +        return self.log(logging.ERROR, message, **meta)
> > > +
> > > +    def critical(self, message, **meta):
> > > +        return self.log(logging.CRITICAL, message, **meta)
> > > +
> > > +
> > > +class LoggingMixIn(object):
> > > +    """
> > > +    Mix-in that adds a self.logger and self.history instance attributes
> > > +    """
> > 
> > I think you should explain the difference between calling a method on
> > logger and on history.
> 
> Ok
> 
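
As far as the two proxies above show, the visible difference for callers is the formatting convention: the logger side takes %-style positional arguments, while the history side takes str.format() keywords that are also kept as metadata. A minimal sketch of just that difference, with hypothetical helper names and made-up values:

```python
def render_logger_style(msg, *args):
    """%-style formatting, as logging.Logger does with positional arguments."""
    return msg % args


def render_history_style(message, **meta):
    """str.format() with keywords; the keywords survive as metadata."""
    return message.format(**meta), dict(meta)


logger_text = render_logger_style("flashed %s in %d s", "panda01", 42)
history_text, history_meta = render_history_style(
    "flashed {board} in {seconds} s", board="panda01", seconds=42)
```

Both calls render the same text, but only the history-style call leaves board and seconds available for later analysis.
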
> > > +    @property
> > > +    def _logger(self):
> > > +        return logging.getLogger(
> > > +            self.__class__.__module__ + "." + self.__class__.__name__)
> > > +
> > > +    @property
> > > +    def history(self):
> > > +        """
> > > +        magic lava.utils.logging.history.History-like instance
> > > +        """
> > > +        return HistoryToLoggerProxy(self._logger)
> > > +
> > > +    @property
> > > +    def logger(self):
> > > +        """
> > > +        magic logging.Logger-like instance
> > > +        """
> > > +        return LoggerToHistoryProxy(self._logger)
> > >
> > 
> > Cheers,
> > mwh
> -- 
> https://code.launchpad.net/~zkrynicki/lava-core/history-and-logging/+merge/104882
> You are subscribed to branch lp:lava-core.

Cheers,
mwh

lp:~zyga/lava-core/history-and-logging updated on 2012-05-16

11. By Zygmunt Krynicki on 2012-05-13

Fix History.last

12. By Zygmunt Krynicki on 2012-05-16

Simplify DictProxy

As pointed out by reviewers, DictProxy can be somewhat simpler with the correct
usage of __setattr__ in __init__ and usage of __getattr__ instead of
__getattribute__

I've also added __eq__ and __repr__ for convenience

13. By Zygmunt Krynicki on 2012-05-16

Allow Activity.attach() to work on plain file names

This change is backported from demo-3 branch. It allows one to
attach files to an activity simply by passing their pathname.

14. By Zygmunt Krynicki on 2012-05-16

Improve and fix _build_json

There are three separate changes here:

Early unicode correctness:
* all strings are converted to unicode with str.decode("UTF-8")
* all dictionary keys are converted to unicode with __unicode__()
* all non-native objects are converted to unicode with __unicode__()

Support for datetime and timedelta:
Previously it was a small burden on the caller, now timedelta and datetime
objects are consistently encoded to strings with a well-known representation.

Support for tuples:
Previously tuples would not work, now they are treated just like lists. In
addition, lists, tuples and dictionaries are removed from the (dead)
code path for 'native' objects as they are already handled earlier.
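
A rough sketch of a normalizer along these lines, written for Python 3 and therefore without the unicode handling. The name build_json_object is hypothetical, and the string forms chosen for datetime (isoformat) and timedelta (str) are assumptions; the branch uses the json_schema_validator extensions, whose exact output is not shown here.

```python
import collections
import datetime


def build_json_object(obj):
    """Recursively convert obj to JSON-friendly values, dropping empty entries."""
    if isinstance(obj, dict):
        result = collections.OrderedDict()
        for key, value in obj.items():
            if value is None or value == [] or value == {}:
                continue
            result[str(key)] = build_json_object(value)
        return result
    if isinstance(obj, (list, tuple)):
        # Tuples (including named tuples) are treated just like lists.
        return [build_json_object(item) for item in obj]
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()  # assumed representation
    if isinstance(obj, datetime.timedelta):
        return str(obj)  # assumed representation
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    # Anything else falls back to its string form.
    return str(obj)


example = build_json_object({
    "skipped": None,
    "pair": (1, 2),
    "when": datetime.datetime(2012, 5, 16, 11, 20, 24),
    "took": datetime.timedelta(seconds=90),
    "sub": [{"empty": [], "kept": 1}],
})
```

The result keeps insertion order, drops the None and empty-list entries, and contains only values the json module can serialize directly.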

15. By Zygmunt Krynicki on 2012-05-16

Simplify Activity.as_json()

This change takes advantage of the improved _build_json() that already does the
necessary conversions and 'if obj is None' checks

16. By Zygmunt Krynicki on 2012-05-16

Add dependency declaration.

This is a small backport from future branches to allow unit tests to work

17. By Zygmunt Krynicki on 2012-05-16

Add tests for lava.core.history

Most of the Activity class is tested. Part of the json serialization is tested.
DictProxy and History objects are not tested yet

18. By Zygmunt Krynicki on 2012-05-16

Fix a bug in History.last

History.last returned None just after constructing the History object. It
should have returned 'top' (root of the history) instead.
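
The revision itself is not shown in this part of the page, so the following is only a guess at the shape of the fix: a trimmed-down stand-in (MiniHistory is a hypothetical name, and plain strings stand in for Activity objects) that initializes the 'last' pointer to 'top' and keeps the nesting stack behaviour of History.

```python
class MiniHistory(object):
    """Trimmed-down stand-in for History: tracks only nesting and 'last'."""

    def __init__(self, top):
        self._nesting = [top]
        # Start with 'top' so that 'last' is never None.
        self._last = top

    @property
    def last(self):
        return self._last

    def log(self, message):
        self._last = message
        return self

    def __enter__(self):
        # The most recent activity becomes the new bottom of the stack.
        self._nesting.append(self._last)
        return self._last

    def __exit__(self, exc_type, exc_value, traceback):
        self._nesting.pop()


history = MiniHistory("top")
fresh_last = history.last
with history.log("outer"):
    depth_inside = len(history._nesting)
depth_after = len(history._nesting)
```

Right after construction fresh_last is "top", and the nesting depth returns to 1 once the with block exits.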

19. By Zygmunt Krynicki on 2012-05-16

Add unit tests for History class

Preview Diff

Subscribers

People subscribed via source and target branches

to all changes:

Linaro Validation Team

Michael Hudson-Doyle

Zygmunt Krynicki

 === added file 'lava/core/history.py'
 --- lava/core/history.py	1970-01-01 00:00:00 +0000
 +++ lava/core/history.py	2012-05-16 11:20:24 +0000
@@ -0,0 +1,522 @@
++# Copyright (C) 2011-2012 Linaro Limited
++# vim: set fileencoding=utf8 :
++#
++# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
++#
++# This file is part of lava-core
++#
++# lava-core is free software: you can redistribute it and/or modify
++# it under the terms of the GNU Lesser General Public License version 3
++# as published by the Free Software Foundation
++#
++# lava-core is distributed in the hope that it will be useful,
++# but WITHOUT ANY WARRANTY; without even the implied warranty of
++# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++# GNU General Public License for more details.
++#
++# You should have received a copy of the GNU Lesser General Public License
++# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
++
++from __future__ import absolute_import, print_function
++
++"""
++lava.core.history
++=================
++
++Implementation of hierarchical history
++"""
++
++from logging import DEBUG, INFO, WARNING, ERROR, CRITICAL
++import collections
++import datetime
++import os
++import sys
++import tarfile
++import traceback
++
++from json_document.serializers import JSON
++from json_schema_validator.extensions import (
++    datetime_extension, timedelta_extension)
++import simplejson
++
++
++class DictProxy(object):
++
++    def __init__(self, d):
++        object.__setattr__(self, "_DictProxy__d", d)
++
++    def as_dict(self):
++        return self.__d
++
++    def __repr__(self):
++        return "<DictProxy {0!r}>".format(self.__d)
++
++    def __setattr__(self, name, value):
++        self.__d[name] = value
++
++    def __getattr__(self, name):
++        try:
++            return self.__d[name]
++        except KeyError:
++            raise AttributeError(name)
++
++    def __eq__(self, other):
++        if isinstance(other, DictProxy):
++            return self.__d == other.__d
++        else:
++            return False
++
++
++ExceptionInfo = collections.namedtuple(
++    "ExceptionInfo", "filename lineno function text")
++
++
++_name_and_back = {
++    DEBUG: "DEBUG",
++    INFO: "INFO",
++    WARNING: "WARNING",
++    ERROR: "ERROR",
++    CRITICAL: "CRITICAL",
++    "DEBUG": DEBUG,
++    "INFO": INFO,
++    "WARNING": WARNING,
++    "ERROR": ERROR,
++    "CRITICAL": CRITICAL
++}
++
++
++class Activity(object):
++
++    def __init__(self, level, message, **meta):
++        """
++        Create a new activity with the specified message, level and
++        initial meta-data
++        """
++        # Message that has format() references to meta-data
++        self._level = level
++        self._message = message
++        self._meta = DictProxy(meta)
++        # File attachments
++        self._attachments = []
++        # Children activities
++        self._sub_activities = []
++        # Start and end timestamps
++        self._start = datetime.datetime.utcnow()
++        self._end = self._start
++        # Exception + traceback
++        self._exc_value = None
++        self._exc_type = None
++        self._traceback = None
++
++    @property
++    def raw_message(self):
++        """
++        the raw, unformatted message
++        """
++        return self._message
++
++    @property
++    def message(self):
++        """
++        rendered message
++
++        Rendering substitutes str.format() style variables with data from
++        self.meta.
++        """
++        return self._message.format(**self._meta.as_dict())
++
++    @property
++    def level(self):
++        """
++        level number, 10 (DEBUG) - 50 (CRITICAL)
++        """
++        return self._level
++
++    @property
++    def level_name(self):
++        """
++        name of the level "DEBUG" ... "CRITICAL"
++        """
++        return _name_and_back[self._level]
++
++    @property
++    def meta(self):
++        """
++        collection of meta-data
++
++        Meta-data serves two purposes: to fill in data to the message
++        template and to store arbitrary auxiliary data. This property
++        returns a dictionary proxy that looks like an object (has normal
++        attribute accessors) but still talks to the real dictionary behind
++        the scenes. You can both read and write/set attributes at will.
++        """
++        return self._meta
++
++    def attach(self, stream_or_pathname, name=None,  mime_type='text/plain'):
++        """
++        Add an attachment.
++
++        Everything is extracted from the file-like object stream.  If you care
++        about a particular name of the attachment it can be specified with name.
++        MIME type can help the web front-end to render the content. It defaults
++        to text/plain which is suitable for most log files.
++        """
++        if isinstance(stream_or_pathname, basestring):
++            orig_name = stream_or_pathname
++        else:
++            orig_name = stream_or_pathname.name
++        if name is None:
++            name = os.path.basename(orig_name)
++        self._attachments.append(
++            DictProxy({
++                'name': name,
++                'orig_name': orig_name,
++                'mime_type': mime_type}))
++
++    @property
++    def duration(self):
++        """
++        Duration of the activity.
++
++        If the activity has not started the duration is always zero.
++        Activities that are in progress will observe the duration to grow.
++        """
++        if self._start is None:
++            return datetime.timedelta()
++        if self._end is None:
++            return datetime.datetime.utcnow() - self._start
++        return self._end - self._start
++
++    def as_json(self):
++        """
++        Convert this Activity to JSON
++        """
++        obj = simplejson.OrderedDict((
++            ("level", self.level_name),
++            ("message", self._message),
++            ("meta", self.meta.as_dict()),
++            ("start", self._start),
++            ("duration", self.duration),
++            ("end", self._end),
++            ("exc_value", self._exc_value),
++            ("exc_type", (self._exc_type.__name__
++                          if self._exc_type else None)),
++            ("traceback", ([ExceptionInfo(*ei) for ei in
++                            traceback.extract_tb(self._traceback)]
++                           if self._traceback else None)),
++            ("attachments", [
++                attachment.as_dict() for attachment in self._attachments]),
++            ("sub_activities", [
++                activity.as_json() for activity in self._sub_activities])))
++        return _build_json(obj)
++
++    def __str__(self):
++        """
++        Returns self.message
++        """
++        return self.message
++
++    def save_as_tarball(self, scratch, tarball_stream, mode='w:gz'):
++        """
++        Save activity history as a tarball.
++
++        The tarball will have two files 'history.txt' and 'history.json' with
++        appropriate encoding of this activity, as well as all sub-activities.
++        Any attachments are added as well.
++        """
++        with tarfile.open(tarball_stream.name, mode=mode,
++                          fileobj=tarball_stream) as tarball:
++            with scratch.open('history.txt') as stream:
++                self.print_(stream=stream)
++            tarball.add(stream.name, 'history.txt')
++            with scratch.open('history.json') as stream:
++                JSON.dump(stream, self.as_json())
++            tarball.add(stream.name, 'history.json')
++            self._save_tarball_attachments(tarball)
++
++    def _save_tarball_attachments(self, tarball):
++        for attachment in self._attachments:
++            if os.path.isfile(attachment.orig_name):
++                tarball.add(attachment.orig_name, attachment.name)
++        for activity in self._sub_activities:
++            activity._save_tarball_attachments(tarball)
++
++    def __repr__(self):
++        return "<Activity: %r>" % str(self)
++
++    def __enter__(self):
++        """
++        Begins a 'long' activity.
++
++        Long activities have both start and end, and thus, non-zero duration.
++        This is called automatically by History.__enter__ so you won't usually
++        have to use it explicitly.
++
++        Here we reset self.end to None (it was initialized to 'now' in
++        __init__)
++        """
++        self._end = None
++        return self
++
++    def __exit__(self, exc_type, exc_value, traceback):
++        """
++        Terminates a 'long' activity.
++
++        Here we set self.end to 'now' and record any exception information.
++        See __enter__() for more information on 'long' activities.
++        """
++        self._end = datetime.datetime.utcnow()
++        self._exc_value = exc_value
++        self._exc_type = exc_type
++        self._traceback = traceback
++
++    def print_(self, stream=None, print_meta=False, print_children=True,
++               stack=None):
++        """
++        Print a nice Unicode tree of activities.
++
++        This is very useful for tracing and debugging.
++        """
++        if stack is None:
++            stack = []
++        if stream is None:
++            stream = sys.stdout
++        fragments = []
++        if print_meta is False:
++            fragments.append("{0:7}".format(self.level_name))
++            fragments.append(" | ")
++            fragments.append("{start}.{micro:06}".format(
++                start=self._start.strftime("%Y-%m-%d %T"),
++                micro=self._start.microsecond))
++            fragments.append(" | ")
++            if self.duration.total_seconds() > 0:
++                fragments.append("{0:14}s".format(self.duration))
++            else:
++                fragments.append("{0:15}".format(""))
++            fragments.append(" | ")
++            fragments.append(''.join(self._tree_indent(stack)))
++        # Process the message to make it safer
++        message = self.message
++        message = message.replace("\n", " ").replace("\r", " ")
++        if isinstance(message, unicode):
++            message = message.encode(stream.encoding or "UTF-8", 'replace')
++        fragments.append(message)
++        if self._exc_value:
++            fragments.append(", ***crashed*** {0}".format(self._exc_value))
++        print("".join(fragments), file=stream)
++        if print_meta:
++            print("\tstart: {0}".format(self._start), file=stream)
++            if self.duration.total_seconds() > 0:
++                print("\tduration: {0}".format(self.duration), file=stream)
++            if self._end is not None and self._end != self._start:
++                print("\tend: {0}".format(self._end), file=stream)
++            if self._exc_value is not None:
++                print("\texc_value: {0}".format(self._exc_value), file=stream)
++            if self._exc_type is not None:
++                print("\texc_type: {0}".format(self._exc_type), file=stream)
++            if self._traceback is not None:
++                for index, exc_info in enumerate(self._traceback):
++                    print("\ttraceback.{0}.filename: {1}".format(
++                        index, exc_info.filename), file=stream)
++                    print("\ttraceback.{0}.lineno: {1}".format(
++                        index, exc_info.lineno), file=stream)
++                    print("\ttraceback.{0}.function: {1}".format(
++                        index, exc_info.function), file=stream)
++                    print("\ttraceback.{0}.text: {1}".format(
++                        index, exc_info.text), file=stream)
++            for key, value in self.meta.as_dict().iteritems():
++                # Skip one special key used by logging integration
++                if key == "__prefmt_msg":
++                    continue
++                print("\t{key}: {value!r}".format(key=key, value=value),
++                      file=stream)
++        if print_children:
++            pos = len(stack)
++            stack.append(len(self._sub_activities))
++            for sub in self._sub_activities:
++                stack[pos] -= 1
++                sub.print_(stream, print_meta, print_children, stack)
++            stack.pop()
++
++    @staticmethod
++    def _tree_indent(stack, use_unicode=False):
++        """
++        Compute the indentation part of the tree displayed by print_.
++        """
++        # The code here might be tricky. It's easier to understand if you keep
++        # in mind that stack[n] tells you how many more entries are left in the
++        # tree at that stack depth. The stack is constructed by print_()
++        for items_left in stack[:-1]:
++            # FIXME: this is broken
++            if items_left == 0:
++                yield "    "
++            elif items_left == 1:
++                if use_unicode:
++                    yield "└   "
++                else:
++                    yield "\   "
++            elif items_left > 1:
++                if use_unicode:
++                    yield "│   "
++                else:
++                    yield "|   "
++        for items_left in stack[-1:]:
++            if items_left == 0:
++                if use_unicode:
++                    yield "└── "
++                else:
++                    yield "\-- "
++            elif items_left > 0:
++                if use_unicode:
++                    yield "├── "
++                else:
++                    yield "|-- "
++
++
++def _build_json(obj):
++    """
++    Helper to construct JSON that keeps key ordering and discards None
++    values as well as empty lists and dicts
++    """
++    if isinstance(obj, (dict, simplejson.OrderedDict)):
++        new_obj = simplejson.OrderedDict()
++        for key, value in obj.iteritems():
++            if value is None or value == [] or value == {}:
++                continue
++            else:
++                new_obj[unicode(key)] = _build_json(value)
++        return new_obj
++    elif isinstance(obj, (tuple, list)):
++        return [_build_json(item) for item in obj]
++    elif (isinstance(obj, (int, float, unicode, bool))
++          or obj is None):
++        return obj
++    elif isinstance(obj, str):
++        return obj.decode("UTF-8")
++    elif isinstance(obj, datetime.datetime):
++        return datetime_extension.to_json(obj)
++    elif isinstance(obj, datetime.timedelta):
++        return timedelta_extension.to_json(obj)
++    else:
++        try:
++            return unicode(obj)
++        except Exception:
++            raise ValueError("cannot convert {0!r} to JSON".format(obj))
++
++
++class History(object):
++    """
++    History class, provides helper methods for hierarchical logging.
++    Maintains a stack of activities that are nested with context
++    management.
++    """
++
++    def __init__(self, top):
++        """
++        Create a History that starts with the specified 'top' activity
++
++        Usually you don't want to call this. Instead, use the singleton
++        instance 'history'.
++        """
++        self._nesting = [top]
++        self._last = top
++
++    @property
++    def top(self):
++        """
++        the top-level activity
++
++        Very useful to grab entire history and do something with it, for
++        example history.top.as_json() or history.top.print_()
++        """
++        return self._nesting[0]
++
++    @property
++    def bottom(self):
++        """
++        the bottommost activity
++
++        This is the most nested activity, as established by:
++
++            with history.xxx():
++                ...
++        """
++        return self._nesting[-1]
++
++    @property
++    def last(self):
++        """
++        the most recent activity
++
++        This is the most recently created activity in this history.
++        """
++        return self._last
++
++    def log(self, level, message, **meta):
++        """
++        Create an activity and append it to the history
++        """
++        self._last = Activity(level, message, **meta)
++        self.bottom._sub_activities.append(self._last)
++        return self
++
++    def debug(self, message, **meta):
++        """
++        Append DEBUG activity to the bottommost activity
++        """
++        return self.log(DEBUG, message, **meta)
++
++    def info(self, message, **meta):
++        """
++        Append INFO activity to the bottommost activity
++        """
++        return self.log(INFO, message, **meta)
++
++    def warning(self, message, **meta):
++        """
++        Append WARNING activity to the bottommost activity
++        """
++        return self.log(WARNING, message, **meta)
++
++    def error(self, message, **meta):
++        """
++        Append ERROR activity to the bottommost activity
++        """
++        return self.log(ERROR, message, **meta)
++
++    def critical(self, message, **meta):
++        """
++        Append CRITICAL activity to the bottommost activity
++        """
++        return self.log(CRITICAL, message, **meta)
++
++    def __enter__(self):
++        """
++        Begins nested activity
++
++        Nesting is widely useful as it helps to create a structure in the
++        history stream. Typically you will call this method indirectly
++        with code that looks like this:
++
++            with history.info("Doing something complex"):
++                history.debug("step 1")
++                history.debug("step 2")
++                history.debug("step 3")
++
++        Here we also call __enter__() on the last activity to convert it to a
++        'long' activity (one that has a start, an end and a non-zero duration)
++        """
++        self._nesting.append(self._last)
++        return self._last.__enter__()
++
++    def __exit__(self, exc_type, exc_value, traceback):
++        """
++        Terminates most recently nested activity
++
++        Here we also call __exit__() on the last activity to allow it to record
++        any exceptions that may have been raised.
++        """
++        last = self._nesting.pop()
++        last.__exit__(exc_type, exc_value, traceback)
++
++
++# Global history
++history = History(Activity(DEBUG, "History module loaded"))
 === added file 'lava/core/logging.py'
 --- lava/core/logging.py	1970-01-01 00:00:00 +0000
 +++ lava/core/logging.py	2012-05-16 11:20:24 +0000
@@ -0,0 +1,120 @@
++# Copyright (C) 2011-2012 Linaro Limited
++# vim: set fileencoding=utf8 :
++#
++# Author: Zygmunt Krynicki <zygmunt.krynicki@linaro.org>
++#
++# This file is part of lava-core
++#
++# lava-core is free software: you can redistribute it and/or modify
++# it under the terms of the GNU Lesser General Public License version 3
++# as published by the Free Software Foundation
++#
++# lava-core is distributed in the hope that it will be useful,
++# but WITHOUT ANY WARRANTY; without even the implied warranty of
++# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++# GNU General Public License for more details.
++#
++# You should have received a copy of the GNU Lesser General Public License
++# along with lava-core.  If not, see <http://www.gnu.org/licenses/>.
++
++"""
++lava.core.logging
++=================
++
++Logging utilities
++"""
++
++from __future__ import absolute_import
++
++import logging
++
++from lava.core.history import history as global_history
++
++
++class LoggerToHistoryProxy(object):
++    """
++    Wrapper that looks like logging.Logger but also creates history records
++    """
++
++    _history = global_history
++
++    def __init__(self, logger):
++        self._logger = logger
++
++    def log(self, level, msg, *args, **kwargs):
++        self._logger.log(level, msg, *args, **kwargs)
++        # Pass the key via ** so it is not name-mangled inside the class body
++        self._history.log(level, "{__prefmt_msg}", logger=self._logger.name,
++                          **{"__prefmt_msg": msg % args})
++
++    def debug(self, msg, *args, **kwargs):
++        return self.log(logging.DEBUG, msg, *args, **kwargs)
++
++    def info(self, msg, *args, **kwargs):
++        return self.log(logging.INFO, msg, *args, **kwargs)
++
++    def warning(self, msg, *args, **kwargs):
++        return self.log(logging.WARNING, msg, *args, **kwargs)
++
++    def error(self, msg, *args, **kwargs):
++        return self.log(logging.ERROR, msg, *args, **kwargs)
++
++    def critical(self, msg, *args, **kwargs):
++        return self.log(logging.CRITICAL, msg, *args, **kwargs)
++
++    def exception(self, msg, *args):
++        self._logger.exception(msg, *args)
++
++
++class HistoryToLoggerProxy(object):
++    """
++    Wrapper that looks like History but also logs to logger
++    """
++
++    _history = global_history
++
++    def __init__(self, logger):
++        self._logger = logger
++
++    def log(self, level, message, **meta):
++        self._logger.log(level, message.format(**meta))
++        return self._history.log(level, message, **meta)
++
++    def debug(self, message, **meta):
++        return self.log(logging.DEBUG, message, **meta)
++
++    def info(self, message, **meta):
++        return self.log(logging.INFO, message, **meta)
++
++    def warning(self, message, **meta):
++        return self.log(logging.WARNING, message, **meta)
++
++    def error(self, message, **meta):
++        return self.log(logging.ERROR, message, **meta)
++
++    def critical(self, message, **meta):
++        return self.log(logging.CRITICAL, message, **meta)
++
++
++class LoggingMixIn(object):
++    """
++    Mix-in that adds self.logger and self.history properties
++    """
++
++    @property
++    def _logger(self):
++        return logging.getLogger(
++            self.__class__.__module__ + "." + self.__class__.__name__)
++
++    @property
++    def history(self):
++        """
++        magic lava.core.history bridge
++        """
++        return HistoryToLoggerProxy(self._logger)
++
++    @property
++    def logger(self):
++        """
++        magic classic python logging bridge
++        """
++        return LoggerToHistoryProxy(self._logger)
 === added directory 'lava/core/tests'
 === added file 'lava/core/tests/__init__.py'
 === added file 'lava/core/tests/test_history.py'
 --- lava/core/tests/test_history.py	1970-01-01 00:00:00 +0000
 +++ lava/core/tests/test_history.py	2012-05-16 11:20:24 +0000
@@ -0,0 +1,278 @@
++"""
++lava.core.tests.test_history
++============================
++
++Tests for lava.core.history
++"""
++import datetime
++
++from unittest2 import TestCase
++from mocker import Mocker
++
++from lava.core.history import Activity, DictProxy, ExceptionInfo, History
++from lava.core.history import DEBUG, INFO, WARNING, ERROR, CRITICAL
++from lava.core.history import _build_json
++
++
++class ActivityTestCase(TestCase):
++
++    def test_raw_message(self):
++        act = Activity(INFO, "some {keyword}", keyword='word')
++        self.assertEqual(act.raw_message, "some {keyword}")
++
++    def test_message(self):
++        act = Activity(INFO, "typical {keyword}", keyword='word')
++        self.assertEqual(act.message, "typical word")
++        act = Activity(INFO, "mistake {bad_keyword}")
++        with self.assertRaises(KeyError):
++            act.message
++
++    def test_level(self):
++        self.assertEqual(Activity(DEBUG, "").level, DEBUG)
++        self.assertEqual(Activity(INFO, "").level, INFO)
++        self.assertEqual(Activity(WARNING, "").level, WARNING)
++        self.assertEqual(Activity(ERROR, "").level, ERROR)
++        self.assertEqual(Activity(CRITICAL, "").level, CRITICAL)
++
++    def test_level_name(self):
++        self.assertEqual(Activity(DEBUG, "").level_name, "DEBUG")
++        self.assertEqual(Activity(INFO, "").level_name, "INFO")
++        self.assertEqual(Activity(WARNING, "").level_name, "WARNING")
++        self.assertEqual(Activity(ERROR, "").level_name, "ERROR")
++        self.assertEqual(Activity(CRITICAL, "").level_name, "CRITICAL")
++
++    def test_meta(self):
++        act = Activity(INFO, "", foo=1)
++        self.assertEqual(act.meta.foo, 1)
++
++    def test_attach(self):
++        act = Activity(INFO, "")
++        act.attach("foo")
++        self.assertEqual(act._attachments, [
++            DictProxy({'name': 'foo',
++                       'orig_name': 'foo',
++                       'mime_type': 'text/plain'})])
++
++    def test_attach_uses_basename(self):
++        act = Activity(INFO, "")
++        act.attach("/some/place/foo")
++        self.assertEqual(act._attachments, [
++            DictProxy({'name': 'foo',
++                       'orig_name': '/some/place/foo',
++                       'mime_type': 'text/plain'})])
++
++    def test_attach_uses_stream_name(self):
++        act = Activity(INFO, "")
++        mocker = Mocker()
++        stream = mocker.mock()
++        stream.name
++        mocker.result("/some/place/foo")
++        mocker.replay()
++        act.attach(stream)
++        self.assertEqual(act._attachments, [
++            DictProxy({'name': 'foo',
++                       'orig_name': '/some/place/foo',
++                       'mime_type': 'text/plain'})])
++        mocker.verify()
++
++    def test_attach_respects_mime_type(self):
++        act = Activity(INFO, "")
++        act.attach("foo", mime_type="text/html")
++        self.assertEqual(act._attachments, [
++            DictProxy({'name': 'foo',
++                       'orig_name': 'foo',
++                       'mime_type': 'text/html'})])
++
++    def test_attach_respects_name(self):
++        act = Activity(INFO, "")
++        act.attach("foo", name='blarg')
++        self.assertEqual(act._attachments, [
++            DictProxy({'name': 'blarg',
++                       'orig_name': 'foo',
++                       'mime_type': 'text/plain'})])
++
++    def test_duration_for_immediate_activities(self):
++        act = Activity(INFO, "")
++        self.assertEqual(act.duration, datetime.timedelta(0))
++
++    def test_duration_for_long_activities(self):
++        act = Activity(INFO, "")
++        with act:
++            pass
++        self.assertNotEqual(act.duration, datetime.timedelta(0))
++
++    # TODO: as_json
++
++    def test__str__(self):
++        act = Activity(INFO, "foo {bar}", bar='BAR')
++        self.assertEqual(str(act), "foo BAR")
++
++    # TODO: save_as_tarball
++
++    # TODO: _save_tarball_attachments
++
++    def test__repr__(self):
++        act = Activity(INFO, "foo {bar}", bar='BAR')
++        self.assertEqual(repr(act), "<Activity: 'foo BAR'>")
++
++    def test__enter__(self):
++        act = Activity(INFO, "")
++        retval = act.__enter__()
++        self.assertIs(retval, act)
++        self.assertEqual(act._end, None)
++
++    def test__exit__for_normal_termination(self):
++        act = Activity(INFO, "")
++        act.__enter__()
++        act.__exit__(None, None, None)
++        self.assertNotEqual(act._end, None)
++        self.assertEqual(act._exc_value, None)
++        self.assertEqual(act._exc_type, None)
++        self.assertEqual(act._traceback, None)
++
++    def test__exit__for_abnormal_termination(self):
++        act = Activity(INFO, "")
++        act.__enter__()
++        exc = Exception()
++        tb = [ExceptionInfo('file', 1, 'foo', 'blarg')]
++        act.__exit__(type(exc), exc, tb)
++        self.assertNotEqual(act._end, None)
++        self.assertEqual(act._exc_value, exc)
++        self.assertEqual(act._exc_type, type(exc))
++        self.assertEqual(act._traceback, tb)
++
++    # TODO: print_
++
++    # TODO: _tree_indent
++
++
++class test_build_json(TestCase):
++
++    def assertJSON(self, a, b):
++        self.assertEqual(_build_json(a), b)
++
++    def test_dict(self):
++        self.assertJSON({'foo': 'bar'}, {u'foo': u'bar'})
++
++    def test_dict_skips_empty_values(self):
++        self.assertJSON({'foo': None}, {})
++        self.assertJSON({'foo': {}}, {})
++        self.assertJSON({'foo': []}, {})
++
++    def test_list_or_tuple(self):
++        self.assertJSON(['foo'], [u'foo'])
++        self.assertJSON(('foo', ), [u'foo'])
++
++    def test_passthrough(self):
++        self.assertJSON(1, 1)
++        self.assertJSON(1.0, 1.0)
++        self.assertJSON(u"1", u"1")
++        self.assertJSON(True, True)
++        self.assertJSON(None, None)
++
++    def test_str(self):
++        self.assertJSON("1", u"1")
++        with self.assertRaises(UnicodeDecodeError):
++            self.assertJSON("\xff", u"\xff")
++
++    def test_datetime(self):
++        self.assertJSON(datetime.datetime(2012, 8, 19, 12, 34, 56),
++                        "2012-08-19T12:34:56Z")
++
++    def test_timedelta(self):
++        self.assertJSON(datetime.timedelta(1, 2, 3), "1d 2s 3us")
++
++    def test_other(self):
++        class Foo(object):
++            def __unicode__(self):
++                return u"blarg"
++        self.assertJSON(Foo(), u"blarg")
++
++
++class HistoryTest(TestCase):
++
++    def test_top(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        self.assertIs(hist.top, act)
++
++    def test_bottom(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        self.assertIs(hist.bottom, act)
++        with hist.info("") as act2:
++            self.assertIs(hist.bottom, act2)
++        self.assertIs(hist.bottom, act)
++
++    def test_last(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        self.assertIs(hist.last, act)
++        with hist.info("") as act2:
++            pass
++        self.assertIs(hist.last, act2)
++
++    def test_log_appends_to_bottommost_activity(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.log(INFO, "foo")
++        self.assertEqual(hist.bottom._sub_activities, [hist.last])
++
++    def test_debug(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.debug("foo")
++        self.assertEqual(hist.last.level, DEBUG)
++        self.assertEqual(hist.last.message, "foo")
++
++    def test_info(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.info("foo")
++        self.assertEqual(hist.last.level, INFO)
++        self.assertEqual(hist.last.message, "foo")
++
++    def test_warning(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.warning("foo")
++        self.assertEqual(hist.last.level, WARNING)
++        self.assertEqual(hist.last.message, "foo")
++
++    def test_error(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.error("foo")
++        self.assertEqual(hist.last.level, ERROR)
++        self.assertEqual(hist.last.message, "foo")
++
++    def test_critical(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.critical("foo")
++        self.assertEqual(hist.last.level, CRITICAL)
++        self.assertEqual(hist.last.message, "foo")
++
++    def test__enter__(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        ret = hist.__enter__()
++        # Enter will push the last activity on the nesting stack
++        self.assertIs(ret, act)
++        # NOTE: act is now a long activity
++        self.assertIs(act._end, None)
++        # NOTE: top of the history is now twice on the nesting stack
++        self.assertEqual(hist._nesting, [act, act])
++        hist.info("foo")
++        self.assertEqual(hist.top._sub_activities[0].message, "foo")
++
++    def test__exit__(self):
++        act = Activity(INFO, "")
++        hist = History(act)
++        hist.__enter__()
++        self.assertEqual(hist._nesting, [act, act])
++        hist.__exit__(None, None, None)
++        self.assertEqual(hist._nesting, [act])
++        # NOTE: act is now a long activity
++        self.assertIsNot(act._end, None)
++        self.assertNotEqual(act.duration, datetime.timedelta(0))
 === modified file 'setup.py'
 --- setup.py	2012-05-03 22:42:55 +0000
 +++ setup.py	2012-05-16 11:20:24 +0000
@@ -44,6 +44,13 @@
          "Topic :: Software Development :: Testing",
      ],
      extras_require={},
--    install_requires=[],
--    tests_require=['unittest2'],
++    install_requires=[
++        'json-document',
++        'json-schema-validator',
++        'simplejson',
++    ],
++    tests_require=[
++        'unittest2',
++        'mocker',
++    ],
      zip_safe=True)
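
The nesting behaviour of History in lava/core/history.py (log() appends to the bottommost activity, __enter__ pushes the last activity onto the stack, __exit__ pops it) can be illustrated with a small stand-alone sketch. This is not the code from the branch: MiniActivity, MiniHistory and dump are hypothetical names, levels and metadata are left out, time.monotonic() is swapped in for the UTC timestamps used in the diff, and the sketch targets Python 3.

```python
import time


class MiniActivity(object):
    """Hypothetical stand-in for Activity: a message, a time span, children."""

    def __init__(self, message):
        self.message = message
        self.children = []
        self.start = time.monotonic()
        self.end = self.start  # immediate activity: zero duration

    @property
    def duration(self):
        # An activity that is still open keeps growing until it is closed.
        end = time.monotonic() if self.end is None else self.end
        return end - self.start


class MiniHistory(object):
    """Hypothetical stand-in for History: a stack of open activities."""

    def __init__(self, top):
        self._nesting = [top]
        self._last = top

    def log(self, message):
        # New entries always attach to the innermost open activity.
        self._last = MiniActivity(message)
        self._nesting[-1].children.append(self._last)
        return self  # returning the history enables "with history.log(...)"

    def __enter__(self):
        # Promote the most recent entry to an open, 'long' activity.
        self._nesting.append(self._last)
        self._last.end = None
        return self._last

    def __exit__(self, exc_type, exc_value, tb):
        # Close the innermost open activity and record when it ended.
        self._nesting.pop().end = time.monotonic()


def dump(activity, depth=0):
    """Return the tree as a list of indented lines."""
    lines = ["    " * depth + activity.message]
    for child in activity.children:
        lines.extend(dump(child, depth + 1))
    return lines


history = MiniHistory(MiniActivity("top"))
with history.log("Doing something complex") as outer:
    history.log("step 1")
    history.log("step 2")
history.log("done")
tree = dump(history._nesting[0])
print("\n".join(tree))
```

The two steps end up as children of "Doing something complex", while "done" is attached to the top activity again because the with block has popped the stack.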

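LoggerToHistoryProxy and HistoryToLoggerProxy in lava/core/logging.py bridge two message conventions: classic logging uses %-style templates with positional arguments, while history uses str.format() templates with keyword metadata. The sketch below shows one way the same event can be recorded through either convention. It is a simplified illustration under stated assumptions, not the branch's implementation: ListHandler, history_log, log_printf_style and log_format_style are hypothetical names, the history is reduced to a plain list of tuples, and the code targets Python 3.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that collects rendered log messages."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Hypothetical stand-in for the history: (level, template, meta) tuples.
records = []


def history_log(level, template, **meta):
    records.append((level, template, meta))


logger = logging.getLogger("sketch.bridge")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def log_printf_style(level, msg, *args):
    # Logger-like entry point: the message is rendered up front and stored
    # in the history under a single key, next to the logger name.
    logger.log(level, msg, *args)
    history_log(level, "{prefmt_msg}", prefmt_msg=msg % args,
                logger=logger.name)


def log_format_style(level, message, **meta):
    # History-like entry point: the template and its key-value data are
    # kept separately, and the logger only receives the rendered text.
    logger.log(level, message.format(**meta))
    history_log(level, message, **meta)


log_printf_style(logging.INFO, "booted %s in %d s", "panda01", 42)
log_format_style(logging.INFO, "booted {board} in {seconds} s",
                 board="panda01", seconds=42)
rendered = [template.format(**meta) for (_, template, meta) in records]
print(rendered)
```

Both calls produce the same text in the log, but only the format-style call keeps board and seconds as separate values that later analysis can read without parsing the message.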