Merge lp:~roadmr/canonical-identity-provider/charm-fix-talisker-logfile-migration into lp:~ubuntuone-pqm-team/canonical-identity-provider/charm

Proposed by Daniel Manrique
Status: Rejected
Rejected by: Daniel Manrique
Proposed branch: lp:~roadmr/canonical-identity-provider/charm-fix-talisker-logfile-migration
Merge into: lp:~ubuntuone-pqm-team/canonical-identity-provider/charm
Diff against target: 43 lines (+13/-3)
1 file modified
playbook.yaml (+13/-3)
To merge this branch: bzr merge lp:~roadmr/canonical-identity-provider/charm-fix-talisker-logfile-migration
Reviewer Review Type Date Requested Status
Daniel Manrique (community) Disapprove
Review via email: mp+384659@code.launchpad.net

Commit message

Fix a couple of ansible roles dealing with talisker log migration.

Apparently talisker will write to /srv/...../logs/sso.log, but the old log lives in /var/log/gunicorn.log. These tasks seem to deal with symlinking the old log which lives in /var/log so talisker writes to the same place gunicorn was writing (but then talisker doesn't seem to log anything there? I think it all goes to stdout and/or is handled by systemd)?.

The tasks/roles needed a couple of tweaks:

- they date from upstart times so the location of gunicorn.log has changed since they were written.
- islnk explodes if invoked on a nonexistent file, advice from Ansible community is to do an .exists check before .islnk.
- Mode of /var/log/gunicorn.log was not writable by the talisker user (ubunet typically), so I added a task to frob the mode to 660.

Description of the change

I thought the spec would need an update to fetch new versions of the charm but it does not, because it's not pinned in the spec; these updates should apply magically on staging, and on production we *might* need to ask for an update-deployment to be manually run, because on prod, the self-service rollout does do only update-asset I think.

Note unit sso-app/0 on staging was fixed by applying/testing these charm/playbook tweaks, once this lands unit sso-app/1 might self-fix, or it might need manual attention.

To post a comment you must log in.
96. By Daniel Manrique

oops

Revision history for this message
Maximiliano Bertacchini (maxiberta) wrote :

LGTM, thanks! But check comment below.

97. By Daniel Manrique

reoops

Revision history for this message
Daniel Manrique (roadmr) wrote :

Thanks for catching that! I've fixed it to "group" ownership.

Revision history for this message
Przemysław Suliga (suligap) wrote :

Thanks Daniel for the fixes!

My understanding which might be completely wrong is following:

1. `sso_logs` and `logs_dir` playbook vars end up configuring Django's LOGGING in SSO's settings.py.

2. gunicorn (so talisker too) logs themselves (accesslog, errorlog) are configured using gunicorn's charm options https://bazaar.launchpad.net/~ubuntuone-hackers/charms/xenial/gunicorn/trunk/view/head:/config.yaml.

The values for `wsgi_access_logfile` and `wsgi_error_logfile` gunicorn charm options are set here: https://bazaar.launchpad.net/~ubuntuone-pqm-team/canonical-identity-provider/charm/view/head:/roles/wsgi-app/defaults/main.yaml it looks like.

But it would make sense if gunicorn logged to stderr and systemd took care of the logs. But given the above I'm not sure it's the case.

In any case it looks like we don't need to change anything here (gunicorn logging config) for the move to talisker itself. So that's good

3. The Django log file from (1) (the sso_log var) seems to be the subject of the migration tasks here. But I'm not sure it needs to be migrated at all (maybe the fix is to remove all these tasks?). This is still just the file django logging goes to. I might be also just completely missing the rationale for the migration.

I'll ask Simon to look at this. Seems like he was involved in adding these tasks and support for talisker to the playbook.

Revision history for this message
Simon Davy (bloodearnest) wrote :

> Thanks Daniel for the fixes!
>
> My understanding which might be completely wrong is following:
>
> 1. `sso_logs` and `logs_dir` playbook vars end up configuring Django's LOGGING
> in SSO's settings.py.

So, I think we need to set LOGGING_CONFIG = None in settings.py, as per https://talisker.readthedocs.io/en/latest/django.html

That's the only thing that tries to write to files.

> 2. gunicorn (so talisker too) logs themselves (accesslog, errorlog) are
> configured using gunicorn's charm options https://bazaar.launchpad.net
> /~ubuntuone-hackers/charms/xenial/gunicorn/trunk/view/head:/config.yaml.

Yep, we want to disable these and use talisker's default of stderr

> The values for `wsgi_access_logfile` and `wsgi_error_logfile` gunicorn charm
> options are set here: https://bazaar.launchpad.net/~ubuntuone-pqm-team
> /canonical-identity-provider/charm/view/head:/roles/wsgi-
> app/defaults/main.yaml it looks like.
>
> But it would make sense if gunicorn logged to stderr and systemd took care of
> the logs. But given the above I'm not sure it's the case.
>
> In any case it looks like we don't need to change anything here (gunicorn
> logging config) for the move to talisker itself. So that's good

I think we want to disable access logs (talisker emits per-request logs that are much better than gunicorns access logs)

Error log should be unset, so it defaults to stderr.

> 3. The Django log file from (1) (the sso_log var) seems to be the subject of
> the migration tasks here. But I'm not sure it needs to be migrated at all
> (maybe the fix is to remove all these tasks?). This is still just the file
> django logging goes to. I might be also just completely missing the rationale
> for the migration.

So, the link was to keep access to the logs from the familiar location of /srv/sso/.../logs/sso.log.

>
> I'll ask Simon to look at this. Seems like he was involved in adding these
> tasks and support for talisker to the playbook.

Another wrinkle: this was written on trusty, where upstart defaults would mean that a service called 'gunicorn' would get it stdout/err written to /var/log/gunicorn.log

This doesn't happen with systemd, so we need charm work to add syslog config to do this. And the accompanying logrotation :(

Revision history for this message
Przemysław Suliga (suligap) wrote :

More context and discoveries and next steps from a call with Simon today:

1. We can remove the migration tasks now since they were added with different constraints in place for the planned move to talisker in the past. Since for now, we're not really need to touch Django logging, we can let it log to the same file it does now.

2. We should be able to just switch to talisker.gunicorn right now:

2a. gunicorn access logs configured to log to a file (via wsgi_access_logfile) should be still logged there in the same format.

2b. gunicorn's errorlog (wsgi_error_logfile) will be ignored by talisker. It will warn about this and will log all talisker/gunicorn logs to stderr (these include access logs in talisker's format).

2c. The above stderr logs will be retained by journald setup (but possibly not permanently) and next steps should be ideally to sort this out and move to relying on these talisker logs that should contain everything we need (and disable django logging https://talisker.readthedocs.io/en/latest/django.html#logging)

Revision history for this message
Przemysław Suliga (suligap) wrote :
Revision history for this message
Daniel Manrique (roadmr) wrote :

Rejecting per the above, deleted code is the best code :)

review: Disapprove

Unmerged revisions

97. By Daniel Manrique

reoops

96. By Daniel Manrique

oops

95. By Daniel Manrique

More fix for talisker log migration - change mode in old gunicorn log

94. By Daniel Manrique

Partial fix for talisker log file migration

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'playbook.yaml'
2--- playbook.yaml 2020-05-25 15:57:19 +0000
3+++ playbook.yaml 2020-05-27 18:48:17 +0000
4@@ -157,18 +157,28 @@
5 tags:
6 - config-changed
7 command: mv {{ sso_log }} {{ sso_log }}.bak
8- when: talisker.stat.exists and not log_file.stat.islnk
9+ when: talisker.stat.exists and (log_file.stat.exists and not log_file.stat.islnk)
10
11 - name: link log file
12 tags:
13 - config-changed
14 file:
15 path: "{{ logs_dir }}/sso.log"
16- src: /var/log/upstart/gunicorn.log
17+ src: /var/log/gunicorn.log
18 state: link
19 # temporary when until migrated to talisker
20 when: talisker.stat.exists
21
22+ # Temporary task for migration to talisker logging
23+ - name: ensure log file is writable by talisker user
24+ tags:
25+ - config-changed
26+ file:
27+ path: /var/log/gunicorn.log
28+ mode: "g=rw"
29+ group: "{{ user }}"
30+ state: file
31+
32 # the next 2 tasks should really only run when the payload has *changed*
33 - name: Install required sso packages
34 tags:
35@@ -182,7 +192,7 @@
36 - config-changed
37 - preload
38 shell: virtualenv-tools --update-path=auto {{ venv }}
39- when: build_label != "" and payload.stat.exists
40+ when: build_label != "" and payload.stat.exists
41
42 - name: Write charm config
43 tags:

Subscribers

People subscribed via source and target branches