Merge lp:~mwhudson/lava-scheduler/email-on-health-check-failure into lp:lava-scheduler

Proposed by Michael Hudson-Doyle
Status: Merged
Merged at revision: 163
Proposed branch: lp:~mwhudson/lava-scheduler/email-on-health-check-failure
Merge into: lp:lava-scheduler
Diff against target: 201 lines (+117/-4)
3 files modified
lava_scheduler_app/models.py (+80/-2)
lava_scheduler_app/templates/lava_scheduler_app/job_summary_mail.txt (+27/-0)
lava_scheduler_daemon/dbjobsource.py (+10/-2)
To merge this branch: bzr merge lp:~mwhudson/lava-scheduler/email-on-health-check-failure
Reviewer                      Review Type  Date Requested  Status
Zygmunt Krynicki (community)                               Approve
Review via email: mp+100890@code.launchpad.net

Description of the change

Hi,

This branch allows the job submitter to supply email addresses to mail on job completion (or optionally, only when the job fails).
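
A rough sketch of the behaviour described above, assuming the two job-file fields are named `notify` and `notify_on_incomplete` as in this branch. The addresses, the device type value, the status constants and the helper name `pick_recipients` are made up for the example and are not taken from the scheduler code.

```python
import json

# Hypothetical status values; the real model defines its own constants.
COMPLETE = "Complete"
INCOMPLETE = "Incomplete"

# A job definition as the submitter might write it (values are examples only).
job_definition = json.dumps({
    "job_name": "notify-test",
    "device_type": "panda",
    "notify": ["always@example.com"],
    "notify_on_incomplete": ["on-failure@example.com"],
})


def pick_recipients(definition, status):
    """Return the addresses to mail for a job that finished with `status`."""
    job_data = json.loads(definition)
    # Copy the list so the parsed job data is left untouched.
    recipients = list(job_data.get("notify", []))
    if status != COMPLETE:
        recipients.extend(job_data.get("notify_on_incomplete", []))
    return recipients
```

With this shape, everyone in `notify` is mailed for every finished job, and the `notify_on_incomplete` addresses are added only when the job did not complete.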

I am open to criticism on many levels:

1) The general approach (although really you should have complained on the mailing list about this)
2) The content of the email.
3) The location of the code.
4) Fine details of the code.

I haven't tested this at all yet, but the overall logic is simple enough.

Cheers,
mwh

Zygmunt Krynicki (zyga) wrote :

45 + sha1 = self.results_link.split('/')[-2]

# XXX: this depends on the permalink format from the dashboard. I really, really feel we should mandate that the dispatcher, when invoked by the scheduler, does not do the submission itself, and that we submit internally from inside the app. Still, an "XXX: OMG-depends-on-dashboard-urls.py" note should be there.
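
To make the fragility concrete, here is a minimal sketch of pulling the last path segment out of a link. The URL layout and the helper name `sha1_from_results_link` are assumptions for illustration; the real permalink format is whatever the dashboard's urls.py produces.

```python
def sha1_from_results_link(results_link):
    """Return the last path segment of a bundle link, or None if it is empty."""
    if not results_link:
        return None
    # Stripping slashes first makes the result independent of a trailing '/'.
    return results_link.strip("/").split("/")[-1]


# Hypothetical link; indexing with [-2] only works when the trailing '/' is there.
link = "http://example.com/dashboard/permalink/bundle/abc123"
wrong_segment = link.split("/")[-2]
```

Either way the code still relies on the hash being the final path component, which is why the note is worth having.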

67 + if not isinstance(address, (str, unicode)):

basestring

68 + print (address, unicode, isinstance(address, unicode))

leftover print

69 + raise ValueError(msg)

I have not looked at this in full (and the Launchpad context does not show the function this is in). I hope this is done at submission time (so that the submission is rejected) and not any time later. If it is later, then perhaps this should be non-fatal.

I also somewhat feel bad about sending messages to unverified emails. I'd be much happier with a user list here and a way (even future proof, not-done-yet) to get the email out of a person account. Feels like a way to send spam in the long run.

83 + def _generate_summary_mail(self):
84 + bundle = self.results_bundle
85 + test_runs = []
86 + if bundle is not None:
87 + for tr in bundle.test_runs.all():
88 + results = tr.get_summary_results()
89 + test_runs.append({
90 + 'name':tr.test.test_id,
91 + 'passes': results.get('pass', 0),
92 + 'total': results.get('total', 0),
93 + })
94 + return render_to_string(
95 + 'lava_scheduler_app/job_summary_mail.txt',
96 + {
97 + 'job': self,
98 + 'test_runs': test_runs,
99 + })

If we cannot get the bundle then perhaps the message should be "optimized" for that. I guess all you need is to pass bundle to the context and let it run in the template. Then perhaps we don't even need any processing on the python side which would be even better.

+ send_mail(
110 + "LAVA job notification", mail, settings.SERVER_EMAIL, recipients)
111 +

It would be nice to have more informative subjects but I realize we cannot do it right now. Perhaps we could link that to testing efforts? Then a testing effort could define a notification subject and have an explicit list of people interested in stuff happening. Anyway, more of a brainstorm-away-from-ML so don't worry about this.

Overall mostly good, fix print() and this could go in

review: Approve
166. By Michael Hudson-Doyle

comments following review:
* add XXX in TestJob.results_bundle pointing out how horrible it is
* (str, unicode) -> basestring
* move some logic into the email template
* delete a leftover print

167. By Michael Hudson-Doyle

oops

168. By Michael Hudson-Doyle

include the job description in the subject

Michael Hudson-Doyle (mwhudson) wrote :

On Wed, 04 Apr 2012 22:35:21 -0000, Zygmunt Krynicki <email address hidden> wrote:
> Review: Approve
>
> 45 + sha1 = self.results_link.split('/')[-2]
>
> # XXX: depends on permalink from the other part, I really really
> really feel we should mandate that the dispatcher, when invoked by the
> scheduler, is not doing the submission and that we really submit
> internally from inside the app. Still, XXX:
> OMG-depends-on-dashboard-urls.py note should be there

I added an XXX.

> 67 + if not isinstance(address, (str, unicode)):
>
> basestring

Changed. Although I hate that this is necessary; the json variant
always gives you unicode strings, simplejson sometimes gives you
bytestrings, sometimes unicode strings.

> 68 + print (address, unicode, isinstance(address, unicode))
>
> leftover print

Oops, thanks.

> 69 + raise ValueError(msg)
>
> I have not looked at this in full (and launchpad context does not show
> the function this is in). I hope this is done on submission time (and
> this makes the submission rejected) and not anytime later. If this is
> later then perhaps this should be non-fatal.

It's done at submission time.

$ lava-tool submit-job http://mwhudson@127.0.0.1:8000/RPC2/ ~/dispatcher-jobs/notify-test
EXPERIMENTAL - SUBJECT TO CHANGE (See --experimental-notice for more info)
ERROR: <Fault 400: "' micahelgmail.com' is not a valid email address.">
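
For readers without the diff to hand, the submission-time check can be sketched roughly as follows. This is an approximation under stated assumptions: the regular expression is a deliberately loose stand-in for a real validator (the branch uses Django's `validate_email`), and the function name `check_notify_fields` is invented for the example.

```python
import re

# Loose stand-in for a real email validator; it only checks for the
# overall shape "something@something.something".
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_notify_fields(job_data):
    """Raise ValueError if a notify field is not a list of email addresses."""
    for field in ("notify", "notify_on_incomplete"):
        if field not in job_data:
            continue
        value = job_data[field]
        msg = "%r must be a list of email addresses if present" % field
        if not isinstance(value, list):
            raise ValueError(msg)
        for address in value:
            if not isinstance(address, str):
                raise ValueError(msg)
            if not _EMAIL_RE.match(address):
                raise ValueError(
                    "%r is not a valid email address." % address)
```

Because the check raises before the job is stored, a bad address rejects the submission rather than failing later when the mail is sent.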

> I also somewhat feel bad about sending messages to unverified
> emails. I'd be much happier with a user list here and a way (even
> future proof, not-done-yet) to get the email out of a person
> account. Feels like a way to send spam in the long run.

Yeah. Hm. We _could_ make it a list of usernames, we get the email
from Launchpad during the log in process (and we can make sure that the
robot users have sensible email addresses). We'd have to add a user per
mailing list we want to email (which probably would not be too bad).

> 83 + def _generate_summary_mail(self):
> 84 + bundle = self.results_bundle
> 85 + test_runs = []
> 86 + if bundle is not None:
> 87 + for tr in bundle.test_runs.all():
> 88 + results = tr.get_summary_results()
> 89 + test_runs.append({
> 90 + 'name':tr.test.test_id,
> 91 + 'passes': results.get('pass', 0),
> 92 + 'total': results.get('total', 0),
> 93 + })
> 94 + return render_to_string(
> 95 + 'lava_scheduler_app/job_summary_mail.txt',
> 96 + {
> 97 + 'job': self,
> 98 + 'test_runs': test_runs,
> 99 + })
>
> If we cannot get the bundle then perhaps the message should be
> "optimized" for that. I guess all you need is to pass bundle to the
> context and let it run in the template. Then perhaps we don't even
> need any processing on the python side which would be even better.

I don't understand what you mean here.

I changed the template to just take the job though, which works but
leads to a pretty ugly template.

> + send_mail(
> 110 + "LAVA job notification", mail, settings.SERVER_EMAIL, recipients)
> 111 +
>
> It would be nice to have more informative subjects but I realize we
> cannot do it right now.

I could put the description/job name in the subject I guess? Yeah, I'll
do that.

> Perhaps we could link that to testing efforts? Then ...


169. By Michael Hudson-Doyle

truncate long descriptions at the beginning because all the long descriptions start with "https://android-build.linaro.org/jenkins/job/linaro-android" and have the distinguishing bits at the end

170. By Michael Hudson-Doyle

tiny refactor

171. By Michael Hudson-Doyle

move to a model where notify & notify_on_incomplete list usernames, not email addresses

Zygmunt Krynicki (zyga) wrote :

> On Wed, 04 Apr 2012 22:35:21 -0000, Zygmunt Krynicki
> <email address hidden> wrote:
> > Review: Approve

> > 67 + if not isinstance(address, (str, unicode)):
> >
> > basestring
>
> Changed. Although I hate that this is necessary; the json variant
> always gives you unicode strings, simplejson sometimes gives you
> bytestrings, sometimes unicode strings.

Depending on python-dev being available, yes this sucks a lot.

> $ lava-tool submit-job http://mwhudson@127.0.0.1:8000/RPC2/ ~/dispatcher-jobs/notify-test
> EXPERIMENTAL - SUBJECT TO CHANGE (See --experimental-notice for more info)
> ERROR: <Fault 400: "' micahelgmail.com' is not a valid email address.">
>
> > I also somewhat feel bad about sending messages to unverified
> > emails. I'd be much happier with a user list here and a way (even
> > future proof, not-done-yet) to get the email out of a person
> > account. Feels like a way to send spam in the long run.
>
> Yeah. Hm. We _could_ make it a list of usernames, we get the email
> from Launchpad during the log in process (and we can make sure that the
> robot users have sensible email addresses). We'd have to add a user per
> mailing list we want to email (which probably would not be too bad).

Yeah, let's do that

>
> > 83 + def _generate_summary_mail(self):
> > 84 + bundle = self.results_bundle
> > 85 + test_runs = []
> > 86 + if bundle is not None:
> > 87 + for tr in bundle.test_runs.all():
> > 88 + results = tr.get_summary_results()
> > 89 + test_runs.append({
> > 90 + 'name':tr.test.test_id,
> > 91 + 'passes': results.get('pass', 0),
> > 92 + 'total': results.get('total', 0),
> > 93 + })
> > 94 + return render_to_string(
> > 95 + 'lava_scheduler_app/job_summary_mail.txt',
> > 96 + {
> > 97 + 'job': self,
> > 98 + 'test_runs': test_runs,
> > 99 + })
> >
> > If we cannot get the bundle then perhaps the message should be
> > "optimized" for that. I guess all you need is to pass bundle to the
> > context and let it run in the template. Then perhaps we don't even
> > need any processing on the python side which would be even better.
>
> I don't understand what you mean here.

Well I mean that bundle.test_runs == [] is a different use case from bundle is None
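
A small sketch of that distinction, assuming the summary is built from a list of per-run dicts. The function name `summarize`, the dict keys and the wording of the second message are invented for the example; only the first message matches the text used in the template.

```python
def summarize(test_runs):
    """Describe results; `test_runs` is None when no bundle was found."""
    if test_runs is None:
        # No bundle at all: nothing reached the dashboard.
        return "No results were reported to the dashboard for this run."
    if not test_runs:
        # A bundle exists but it is empty, which is a different situation.
        return "A results bundle was submitted, but it contains no test runs."
    lines = ["%s: %d/%d passed" % (run["name"], run["passes"], run["total"])
             for run in test_runs]
    return "\n".join(lines)
```

Collapsing both cases into an empty list, as the first version of `_generate_summary_mail` did, loses the information needed to tell them apart in the mail.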

> I changed the template to just take the job though, which works but
> leads to a pretty ugly template.

Waiting for the updated diff now

Zygmunt Krynicki (zyga) wrote :

149 + if len(description) > 80:
150 + description = '...' + description[-78:]

Why the last 78 characters? Note that will make the full line 81 + newline characters long
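
The arithmetic behind that remark: the three-character prefix plus the last 78 characters gives 81, one over the 80-character test. A minimal sketch of a tail-preserving truncation that stays within its limit follows; the helper name `truncate_keeping_tail` and its default limit are assumptions, not code from the branch.

```python
def truncate_keeping_tail(description, limit=80):
    """Shorten `description` to at most `limit` characters, keeping the end."""
    ellipsis = "..."
    if limit <= len(ellipsis):
        raise ValueError("limit must be larger than the ellipsis")
    if len(description) <= limit:
        return description
    # Reserve room for the ellipsis so the result is exactly `limit` long.
    keep = limit - len(ellipsis)
    return ellipsis + description[-keep:]
```

Keeping the tail rather than the head suits descriptions that share a long common prefix and differ only at the end.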

112 + 'lava_scheduler_app/job_summary_mail.txt',
113 + {
114 + 'job': self,
115 + })

Maybe you can fit that on one line?

178 +{% for run in job.results_bundle.test_runs.all %} | {{ run.test.test_id|ljust:20 }} | {{ run.get_summary_results.pass|default:0|rjust:6 }} | {{ run.get_summary_results.total|default:0|rjust:6 }} |

On the note that this makes the template uglier. Recent django has variables and other goodies in the templates and I'm sure you can rewrite that to be readable if you want.

review: Approve
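
For comparison, this is roughly what the same table looks like when built in Python instead of the template, using `str.ljust` and `str.rjust` with the column widths from the template line above. It is only a sketch: the function name `format_results_table` and the tuple layout of `rows` are assumptions.

```python
def format_results_table(rows):
    """Render (name, passes, total) tuples as a fixed-width text table."""
    rule = " +" + "-" * 22 + "+" + "-" * 8 + "+" + "-" * 8 + "+"
    header = " | %s | %s | %s |" % (
        "Test run".ljust(20), "Passes".ljust(6), "Total".ljust(6))
    lines = [rule, header, rule]
    for name, passes, total in rows:
        # ljust/rjust pad but never truncate, so a name longer than
        # 20 characters would push its row out of alignment.
        lines.append(" | %s | %s | %s |" % (
            name.ljust(20), str(passes).rjust(6), str(total).rjust(6)))
    lines.append(rule)
    return "\n".join(lines)
```

The trade-off is the one discussed in this thread: the Python version is easier to read, while the template version keeps presentation out of the model.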
Michael Hudson-Doyle (mwhudson) wrote :

On Thu, 05 Apr 2012 00:49:18 -0000, Zygmunt Krynicki <email address hidden> wrote:
> Review: Approve
>
> 149 + if len(description) > 80:
> 150 + description = '...' + description[-78:]
>
> Why the last 78 characters? Note that will make the full line 81 + newline characters long

Just a bit arbitrary really. Can bump up the limit to say 200, which
would allow every description in the database today, but avoid complete
silliness.

> 112 + 'lava_scheduler_app/job_summary_mail.txt',
> 113 + {
> 114 + 'job': self,
> 115 + })
>
> Maybe you can fit that on one line?

Yep.

> 178 +{% for run in job.results_bundle.test_runs.all %} | {{ run.test.test_id|ljust:20 }} | {{ run.get_summary_results.pass|default:0|rjust:6 }} | {{ run.get_summary_results.total|default:0|rjust:6 }} |
>
> On the note that this makes the template uglier. Recent django has
> variables and other goodies in the templates and I'm sure you can
> rewrite that to be readable if you want.

Ah, you mean {% with %} and so on? It helps a bit, but having to have
everything on the one line is a bit painful. I've improved this a bit,
am happy to take any more suggestions :-)

I also need to make sure we get full URLs in the emails, currently
they're just the /scheduler/job/16280 sort of thing.

Cheers,
mwh
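
A minimal sketch of the full-URL point, assuming the site's domain is available from somewhere (configuration, or Django's sites framework). The helper name `absolute_url` and the example domain are hypothetical.

```python
def absolute_url(domain, path, scheme="http"):
    """Join a site domain and an absolute path such as /scheduler/job/16280."""
    # Normalise the slashes so exactly one separates domain and path.
    return "%s://%s/%s" % (scheme, domain.rstrip("/"), path.lstrip("/"))
```

A mail client cannot resolve a bare `/scheduler/job/16280`, so the prefix has to be added before the link goes into the message body.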

172. By Michael Hudson-Doyle

small review tweaks

173. By Michael Hudson-Doyle

increase limit of how much description we put in the subject

174. By Michael Hudson-Doyle

generate full URLs in the template

175. By Michael Hudson-Doyle

small fixes, add some logging

Michael Hudson-Doyle (mwhudson) wrote :

On Thu, 05 Apr 2012 01:49:18 -0000, Michael Hudson-Doyle <email address hidden> wrote:
> On Thu, 05 Apr 2012 00:49:18 -0000, Zygmunt Krynicki <email address hidden> wrote:
> > Review: Approve
> >
> > 149 + if len(description) > 80:
> > 150 + description = '...' + description[-78:]
> >
> > Why the last 78 characters? Note that will make the full line 81 + newline characters long
>
> Just a bit arbitrary really. Can bump up the limit to say 200, which
> would allow every description in the database today, but avoid complete
> silliness.

I've done this.

> > 112 + 'lava_scheduler_app/job_summary_mail.txt',
> > 113 + {
> > 114 + 'job': self,
> > 115 + })
> >
> > Maybe you can fit that on one line?
>
> Yep.
>
> > 178 +{% for run in job.results_bundle.test_runs.all %} | {{ run.test.test_id|ljust:20 }} | {{ run.get_summary_results.pass|default:0|rjust:6 }} | {{ run.get_summary_results.total|default:0|rjust:6 }} |
> >
> > On the note that this makes the template uglier. Recent django has
> > variables and other goodies in the templates and I'm sure you can
> > rewrite that to be readable if you want.
>
> Ah, you mean {% with %} and so on? It helps a bit, but having to have
> everything on the one line is a bit painful. I've improved this a bit,
> am happy to take any more suggestions :-)
>
> I also need to make sure we get full URLs in the emails, currently
> they're just the /scheduler/job/16280 sort of thing.

And this.

And I've tested it. Can you take a look at how I construct the URLs,
and if that looks OK, please merge it?

See you on the 16th!

Cheers,
mwh

Paul Larson (pwlars) wrote :

> I also somewhat feel bad about sending messages to unverified emails. I'd be
> much happier with a user list here and a way (even future proof,
> not-done-yet) to get the email out of a person account. Feels like a way to
> send spam in the long run.
There are so many easier ways to send spam than with a LAVA test job, which only authorized users can submit. I think the mailing list thing is the concern here. If you want to create a job and send results to certain individuals and email lists, the process shouldn't require looking up all their LAVA account ids and filing a request to have an id created for each mailing list before you can submit your job.

Michael Hudson-Doyle (mwhudson) wrote :

On Thu, 05 Apr 2012 10:06:20 -0000, Paul Larson <email address hidden> wrote:
> > I also somewhat feel bad about sending messages to unverified emails. I'd be
> > much happier with a user list here and a way (even future proof,
> > not-done-yet) to get the email out of a person account. Feels like a way to
> > send spam in the long run.
>
> There are so many easier ways to send spam than with a lava test job -
> that only authorized users can submit. I think the mailing list
> thing is a concern here. If you want to create a job and send
> results to certain individuals and email lists, the process shouldn't
> require going through and looking up all their lava account ids and
> filing a request to have an id created for each mailing list before
> you can submit your job.

It's not really the spam issue that made me change this. My thinking
was more like this:

1) Putting who gets notified in the job files is a pretty crappy way of
   doing this because you have to coordinate with whoever is submitting
   the job to get on the list.

2) Anything more sophisticated is going to be something you do in the
   web app.

3) The obvious thing to do would be to be able to say "notify me of results
   matching <set of conditions>" somehow.

4) This means that notifications are tied to user accounts.

5) So we may as well start down that line now, to avoid confusion later.

Writing it down like this does make it clear that it doesn't have to be
like this: for example, in step 3) we could offer the option of "send
email to such and such an address" instead of just "email me". For the
reasons you give, I think this would reduce friction. Perhaps I'll go
back to email addresses in the job file then... Will think about this
overnight and merge something tomorrow :-)

Cheers,
mwh

176. By Michael Hudson-Doyle

go back to putting email addresses in the job file

177. By Michael Hudson-Doyle

email notify_on_incomplete people on CANCELED jobs too

178. By Michael Hudson-Doyle

changes

Preview Diff

1=== modified file 'lava_scheduler_app/models.py'
2--- lava_scheduler_app/models.py 2012-03-15 20:44:39 +0000
3+++ lava_scheduler_app/models.py 2012-04-20 03:25:20 +0000
4@@ -1,13 +1,19 @@
5+import logging
6 import simplejson
7
8+from django.conf import settings
9 from django.contrib.auth.models import User
10-from django.core.exceptions import ValidationError
11+from django.contrib.sites.models import Site
12+from django.core.exceptions import ImproperlyConfigured, ValidationError
13+from django.core.mail import send_mail
14+from django.core.validators import validate_email
15 from django.db import models
16+from django.template.loader import render_to_string
17 from django.utils.translation import ugettext as _
18
19 from django_restricted_resource.models import RestrictedResource
20
21-from dashboard_app.models import BundleStream
22+from dashboard_app.models import Bundle, BundleStream
23
24 from lava_dispatcher.job import validate_job_data
25
26@@ -237,20 +243,41 @@
27 blank = True,
28 editable = False
29 )
30+
31+ @property
32+ def duration(self):
33+ if self.end_time is None:
34+ return None
35+ return self.end_time - self.start_time
36+
37 status = models.IntegerField(
38 choices = STATUS_CHOICES,
39 default = SUBMITTED,
40 verbose_name = _(u"Status"),
41 )
42+
43 definition = models.TextField(
44 editable = False,
45 )
46+
47 log_file = models.FileField(
48 upload_to='lava-logs', default=None, null=True, blank=True)
49
50 results_link = models.CharField(
51 max_length=400, default=None, null=True, blank=True)
52
53+ @property
54+ def results_bundle(self):
55+ # XXX So this is clearly appalling (it depends on the format of bundle
56+ # links, for example). We should just have a fkey to Bundle.
57+ if not self.results_link:
58+ return None
59+ sha1 = self.results_link.strip('/').split('/')[-1]
60+ try:
61+ return Bundle.objects.get(content_sha1=sha1)
62+ except Bundle.DoesNotExist:
63+ return None
64+
65 def __unicode__(self):
66 r = "%s test job" % self.get_status_display()
67 if self.requested_device:
68@@ -274,6 +301,23 @@
69 else:
70 raise JSONDataError(
71 "Neither 'target' nor 'device_type' found in job data.")
72+
73+ for email_field in 'notify', 'notify_on_incomplete':
74+ if email_field in job_data:
75+ value = job_data[email_field]
76+ msg = ("%r must be a list of email addresses if present"
77+ % email_field)
78+ if not isinstance(value, list):
79+ raise ValueError(msg)
80+ for address in value:
81+ if not isinstance(address, basestring):
82+ raise ValueError(msg)
83+ try:
84+ validate_email(address)
85+ except ValidationError:
86+ raise ValueError(
87+ "%r is not a valid email address." % address)
88+
89 job_name = job_data.get('job_name', '')
90
91 is_check = job_data.get('health_check', False)
92@@ -324,6 +368,40 @@
93 self.status = TestJob.CANCELED
94 self.save()
95
96+ def _generate_summary_mail(self):
97+ domain = '???'
98+ try:
99+ site = Site.objects.get_current()
100+ except (Site.DoesNotExist, ImproperlyConfigured):
101+ pass
102+ else:
103+ domain = site.domain
104+ url_prefix = 'http://%s' % domain
105+ return render_to_string(
106+ 'lava_scheduler_app/job_summary_mail.txt',
107+ {'job': self, 'url_prefix': url_prefix})
108+
109+ def _get_notification_recipients(self):
110+ job_data = simplejson.loads(self.definition)
111+ recipients = job_data.get('notify', [])
112+ if self.status != self.COMPLETE:
113+ recipients.extend(job_data.get('notify_on_incomplete', []))
114+ return recipients
115+
116+ def send_summary_mails(self):
117+ recipients = self._get_notification_recipients()
118+ if not recipients:
119+ return
120+ mail = self._generate_summary_mail()
121+ description = self.description.splitlines()[0]
122+ if len(description) > 200:
123+ description = '...' + description[-197:]
124+ logger = logging.getLogger(self.__class__.__name__ + '.' + str(self.pk))
125+ logger.info("sending mail to %s", recipients)
126+ send_mail(
127+ "LAVA job notification: " + description, mail,
128+ settings.SERVER_EMAIL, recipients)
129+
130
131 class DeviceStateTransition(models.Model):
132 created_on = models.DateTimeField(auto_now_add=True)
133
134=== added file 'lava_scheduler_app/templates/lava_scheduler_app/job_summary_mail.txt'
135--- lava_scheduler_app/templates/lava_scheduler_app/job_summary_mail.txt 1970-01-01 00:00:00 +0000
136+++ lava_scheduler_app/templates/lava_scheduler_app/job_summary_mail.txt 2012-04-20 03:25:20 +0000
137@@ -0,0 +1,27 @@
138+Hi,
139+
140+The job with id {{ job.id }} has finished. It took {{ job.start_time|timesince:job.end_time }}.
141+
142+The final status was {{ job.get_status_display }}.
143+
144+You can see more details at:
145+
146+ {{ url_prefix }}{{ job.get_absolute_url }}
147+{% if job.results_bundle %}
148+The results can be summarized as:
149+
150+ +----------------------+--------+--------+
151+ | Test run | Passes | Total |
152+ +----------------------+--------+--------+
153+{% for run in job.results_bundle.test_runs.all %}{% with results=run.get_summary_results %} | {{ run.test.test_id|ljust:20 }} | {{ results.pass|default:0|rjust:6 }} | {{ results.total|default:0|rjust:6 }} |
154+{% endwith %}{% endfor %} +----------------------+--------+--------+
155+
156+For more details, please see:
157+
158+ {{ url_prefix }}{{ job.results_bundle.get_absolute_url }}
159+
160+{% else %}
161+No results were reported to the dashboard for this run.
162+
163+{% endif %}LAVA
164+Linaro Automated Validation
165
166=== modified file 'lava_scheduler_daemon/dbjobsource.py'
167--- lava_scheduler_daemon/dbjobsource.py 2012-03-09 01:27:51 +0000
168+++ lava_scheduler_daemon/dbjobsource.py 2012-04-20 03:25:20 +0000
169@@ -33,7 +33,8 @@
170
171 implements(IJobSource)
172
173- logger = logging.getLogger(__name__ + '.DatabaseJobSource')
174+ def __init__(self):
175+ self.logger = logging.getLogger(__name__ + '.DatabaseJobSource')
176
177 deferToThread = staticmethod(deferToThread)
178
179@@ -235,7 +236,7 @@
180 created_by=None, device=device, old_state=old_device_status,
181 new_state=device.status, message=None, job=job).save()
182
183- if job.health_check is True:
184+ if job.health_check:
185 device.last_health_report_job = job
186 if job.status == TestJob.INCOMPLETE:
187 device.health_status = Device.HEALTH_FAIL
188@@ -249,6 +250,13 @@
189 device.save()
190 job.save()
191 token.delete()
192+ try:
193+ job.send_summary_mails()
194+ except:
195+ # Better to catch all exceptions here and log it than have this
196+ # method fail.
197+ self.logger.exception(
198+ 'sending job summary mails for job %r failed', job.pk)
199 transaction.commit()
200
201 def jobCompleted(self, board_name, exit_code):
