Merge lp:~jtv/launchpad/bug-526462 into lp:launchpad

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: Jeroen T. Vermeulen
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~jtv/launchpad/bug-526462
Merge into: lp:launchpad
Diff against target: 347 lines (+338/-0)
2 files modified
lib/lp/soyuz/doc/sampledata-cleanup.txt (+22/-0)
utilities/soyuz-sampledata-cleanup.py (+316/-0)
To merge this branch: bzr merge lp:~jtv/launchpad/bug-526462
Reviewer | Review Type | Date Requested | Status
Michael Nelson (community) | code | | Approve
Review via email: mp+20052@code.launchpad.net

Commit message

Integrate make-ubuntu-sane.py sample-data cleanup script into Launchpad.

Jeroen T. Vermeulen (jtv) wrote :

= Bug 526462 =

To run Soyuz locally on a development machine, one needs to run a few scripts that are not being maintained, and are kept outside of the Launchpad source tree. See https://dev.launchpad.net/Soyuz/HowToUseSoyuzLocally

This branch takes the make-ubuntu-sane.py script from that page and integrates it into the Launchpad source tree as utilities/soyuz-sampledata-cleanup.py. The original was written by William Grant, but I discussed it with him on IRC today and he has no objection to this being included under Canonical's copyright with a "based on code by wgrant" notice.

As you'll see, the script replaces the existing Ubuntu test series in the playground sample data with a series mirroring real-world Ubuntu releases. This is meant for the playground sample data that you would run on your local machine while testing manually. We discussed updating the sample data as included in the source code, but that would involve uploading tarballs to the local librarian which would have complicated the job and bloated the branch.

Since the script is not meant to run on production databases, it attempts to check for that condition. Our procedures should prevent this, however, so it's a pair of suspenders in addition to the belt we already have. There is no guarantee that the check will prevent any attempt at disastrously wrong use, and I don't think it's worth the hassle of automated testing. If deviating from the "happy path" in this way should break the script, then the script is still dealing with the situation properly: die before it deletes any Ubuntu data.

A doctest now verifies that the script will execute properly. The test would have to force a dirty database in order to avoid test isolation problems, if it weren't for the fact that it uses the --dry-run option (not present in the original script), which avoids making any changes to the database.
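To illustrate the dry-run idea in isolation, here is a minimal sketch that is not part of the branch. It uses an in-memory sqlite3 database instead of the Launchpad transaction manager, and the `series` table and the names `make_database`, `run_cleanup` and `count_series` are made up for the example. The point is only that the destructive work runs inside a transaction, which a dry run rolls back instead of committing.

```python
import sqlite3


def make_database():
    """Create a throwaway database with one row (hypothetical schema)."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE series (name TEXT)")
    connection.execute("INSERT INTO series VALUES ('hoary')")
    connection.commit()
    return connection


def run_cleanup(connection, dry_run):
    """Do the destructive work, then commit or roll back."""
    # The DELETE implicitly opens a transaction.
    connection.execute("DELETE FROM series")
    if dry_run:
        # Undo everything done since the last commit.
        connection.rollback()
    else:
        connection.commit()


def count_series(connection):
    """Return the number of rows left in the example table."""
    return connection.execute("SELECT count(*) FROM series").fetchone()[0]
```

A dry run exercises the same code path as a real run, which is why the doctest can use it without leaving the test database dirty.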

No lint. To Q/A, just use it on your local dev system. To test:
{{{
./bin/test -vv -t sampledata-cleanup
}}}

Jeroen

Michael Nelson (michael.nelson) wrote :

Hi Jeroen,

This is great! Thanks for getting this in with a test to ensure it'll stay in sync with the code-base. I think it would also be worthwhile to add a test to ensure it won't run as expected in other envs (you never know who will come along and modify your code with unintended effects).

As mentioned on IRC, I realise this is the suspenders and not the belt, but I still think we should *only* allow this to run when LPCONFIG == 'development'. If someone wants to run it in another env. (or for some reason, hasn't set LPCONFIG), they'll know enough to modify the script for their needs. I mean, looking at lp-production-configs, there are *loads*, and so I'm not sure that blacklisting a few is that useful. This would also mean you could get rid of the 'guess' based on the ids?

Thoughts?

review: Needs Fixing (code)
Michael Nelson (michael.nelson) wrote :

As discussed, we have other damaging scripts intended for local development that do not even do any checks. I do think it would be a simpler (and sufficient) implementation to just check LPCONFIG == 'development', but I'll leave that up to you.
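For comparison, the simpler allow-list check suggested here might look roughly like the sketch below. It is an illustration only, not code from the branch: the function name `require_development_config` and the `environ` parameter are hypothetical, and the exception class is redefined so the example stands alone.

```python
import os


class DoNotRunOnProduction(Exception):
    """Raised when the environment is not a development setup."""


def require_development_config(environ=None):
    """Refuse to continue unless LPCONFIG is exactly 'development'.

    `environ` defaults to os.environ; passing a dict makes testing easy.
    """
    if environ is None:
        environ = os.environ
    current_config = environ.get('LPCONFIG')
    if current_config != 'development':
        raise DoNotRunOnProduction(
            "Refusing to run with LPCONFIG=%r." % (current_config,))
```

The tradeoff raised in the discussion below is visible here: an unset or customized LPCONFIG is rejected along with the production ones.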

16:42 < jtv> noodles775: thanks for the review!
16:43 < jtv> noodles775: I'm a bit worried though that being this strict would stop people from playing with customized configs.
16:43 < jtv> I realize it's always a tradeoff, but there's also the "do we have a really large production-like table" test.
16:44 < noodles775> jtv: well, if they're playing with customized configs, they're not going to have a problem spending 10 seconds to adjust the script if
                    they want to.
16:45 < jtv> noodles775: true, but this script already has another check to ensure it's not running on production; how many scripts do we have (without ever
             any trouble) that are just as dangerous without anyone ever adding checks like this?
16:45 < jtv> I mean, if I hadn't added the check, would the average reviewer have thought of adding it?
16:45 < noodles775> jtv: which scripts? make schema?
16:46 < jtv> noodles775: I guess, though I've no idea whether that has any checks.
16:46 < noodles775> jtv: yeah, I'm not sure. All the other scripts that I can think of *add* info, not delete.
16:46 < jtv> launchpad-database-setup
16:46 * noodles775 looks
16:47 < jtv> mock-code-import ("warning! run make schema first!")
16:48 < jtv> We also have a script now, apparently, to remove private data. There may be more.
16:49 < noodles775> Yep, you're right.
16:50 < jtv> So I don't want to spend my time guarding against the admin who accidentally goes through the rigmarole for running scripts against a
             production db, with the --force option added.
16:51 < noodles775> yeah, I agree. I was more worried about the situation where a person runs it against a smaller DB with a different config.
16:53 < jtv> noodles775: there's a good chance that the script might fail, and I'd estimate the risk of them running "make schema" by accident considerably
             larger.
16:53 < jtv> (from shell history, f'rinstance)
16:54 < noodles775> jtv: I just thought it would have been a simpler implementation, not more difficult (ie. LPCONFIG == 'development'), but yep, I'm
                    updating the MP now.

review: Approve (code)
Jeroen T. Vermeulen (jtv) wrote :

Thanks! I did start out thinking that one check should be enough. I ended up with one that might produce false positives, plus one that might produce false negatives. For that situation, a --force against false positives plus a final just-in-case safeguard seemed the right approach. In any case the check is easy; maybe someday we'll build reusable helpers for this.
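The two-check design described above can be sketched on its own as follows. This is a simplified stand-in, assuming the caller has already looked up the highest id and the config name; the parameter names are hypothetical, while the threshold and the pattern are taken from the diff below. The size heuristic can give false positives, so --force overrides it; the config-name check can give false negatives, so it acts as a last safeguard with no override.

```python
import re

# Dev databases are unlikely to have ids this high; production does.
PRODUCTION_LIKE_ID = 1000000

# re.match anchors at the start, so this catches names that *begin*
# with one of these words (e.g. 'lpnet1'), not names that merely
# contain them.
FORBIDDEN_CONFIGS = re.compile('(edge|lpnet|production)')


class DoNotRunOnProduction(Exception):
    """Do not run on production (-like) systems."""


def check_preconditions(max_id, config_name, force=False):
    """Raise DoNotRunOnProduction if it looks unsafe to continue."""
    looks_real = max_id is not None and max_id >= PRODUCTION_LIKE_ID
    if looks_real and not force:
        # Heuristic check: may be wrong, so --force can override it.
        raise DoNotRunOnProduction(
            "Refusing to delete Ubuntu data unless you --force me.")
    if FORBIDDEN_CONFIGS.match(config_name):
        # Final safeguard: no override.
        raise DoNotRunOnProduction(
            "I won't delete Ubuntu data on %s and you can't --force me."
            % config_name)
```

For instance, `check_preconditions(5, 'development')` returns quietly, while `check_preconditions(5, 'production', force=True)` still raises.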

Preview Diff

=== added file 'lib/lp/soyuz/doc/sampledata-cleanup.txt'
--- lib/lp/soyuz/doc/sampledata-cleanup.txt 1970-01-01 00:00:00 +0000
+++ lib/lp/soyuz/doc/sampledata-cleanup.txt 2010-02-24 12:36:31 +0000
@@ -0,0 +1,22 @@
= Sample data cleanup =

In order to run Soyuz locally on a development system, the sample data
must be cleaned up and customized a bit. This is done by the script
utilities/soyuz-sampledata-cleanup.py.

We only need this script for the playground sample data, so there's
little point in inspecting what it does to the test database in detail.

    >>> from canonical.launchpad.ftests.script import run_script

The --dry-run option makes the script roll back its changes.

    >>> return_code, output, error = run_script(
    ...     'utilities/soyuz-sampledata-cleanup.py', args=['--dry-run'])

    >>> print return_code
    0

    >>> print error
    INFO ...
    INFO Done.
=== added file 'utilities/soyuz-sampledata-cleanup.py'
--- utilities/soyuz-sampledata-cleanup.py 1970-01-01 00:00:00 +0000
+++ utilities/soyuz-sampledata-cleanup.py 2010-02-24 12:36:31 +0000
@@ -0,0 +1,316 @@
#!/usr/bin/python2.5
# pylint: disable-msg=W0403

# Copyright 2010 Canonical Ltd. This software is licensed under the
# GNU Affero General Public License version 3 (see the file LICENSE).
#
# This code is based on William Grant's make-ubuntu-sane.py script, but
# reorganized to fit Launchpad coding guidelines, and extended. The
# code is included under Canonical copyright with his permission
# (2010-02-24).

"""Clean up sample data so it will allow Soyuz to run locally.

DO NOT RUN ON PRODUCTION SYSTEMS. This script deletes lots of
Ubuntu-related data.
"""

__metaclass__ = type

import _pythonpath

from optparse import OptionParser
from os import getenv
import re
import sys

from zope.component import getUtility
from zope.event import notify
from zope.lifecycleevent import ObjectCreatedEvent
from zope.security.proxy import removeSecurityProxy

from storm.store import Store

from canonical.database.sqlbase import sqlvalues

from canonical.lp import initZopeless

from canonical.launchpad.interfaces.launchpad import (
    ILaunchpadCelebrities)
from canonical.launchpad.scripts import execute_zcml_for_scripts
from canonical.launchpad.scripts.logger import logger, logger_options
from canonical.launchpad.webapp.interfaces import (
    IStoreSelector, MAIN_STORE, SLAVE_FLAVOR)

from lp.registry.interfaces.series import SeriesStatus
from lp.soyuz.interfaces.component import IComponentSet
from lp.soyuz.interfaces.section import ISectionSet
from lp.soyuz.interfaces.sourcepackageformat import (
    ISourcePackageFormatSelectionSet, SourcePackageFormat)
from lp.soyuz.model.section import SectionSelection
from lp.soyuz.model.component import ComponentSelection


class DoNotRunOnProduction(Exception):
    """Error: do not run this script on production (-like) systems."""


def get_max_id(store, table_name):
    """Find highest assigned id in given table."""
    max_id = store.execute("SELECT max(id) FROM %s" % table_name).get_one()
    if max_id is None:
        return None
    else:
        return max_id[0]


def check_preconditions(options):
    """Try to ensure that it's safe to run.

    This script must not run on a production server, or anything
    remotely like it.
    """
    store = getUtility(IStoreSelector).get(MAIN_STORE, SLAVE_FLAVOR)

    # Just a guess, but dev systems aren't likely to have ids this high
    # in this table. Production data does.
    real_data = (get_max_id(store, "TranslationMessage") >= 1000000)
    if real_data and not options.force:
        raise DoNotRunOnProduction(
            "Refusing to delete Ubuntu data unless you --force me.")

    # For some configs it's just absolutely clear this script shouldn't
    # run. Don't even accept --force there.
    forbidden_configs = re.compile('(edge|lpnet|production)')
    current_config = getenv('LPCONFIG', 'an unknown config')
    if forbidden_configs.match(current_config):
        raise DoNotRunOnProduction(
            "I won't delete Ubuntu data on %s and you can't --force me."
            % current_config)


def parse_args(arguments):
    """Parse command-line arguments.

    :return: (options, args, logger)
    """
    parser = OptionParser(
        description="Delete existing Ubuntu releases and set up new ones.")
    parser.add_option('-f', '--force', action='store_true', dest='force',
        help="DANGEROUS: run even if the database looks production-like.")
    parser.add_option('-n', '--dry-run', action='store_true', dest='dry_run',
        help="Do not commit changes.")
    logger_options(parser)

    options, args = parser.parse_args(arguments)
    return options, args, logger(options)


def get_person(name):
    """Return the person with the given name, looked up via `IPersonSet`."""
    # Avoid circular import.
    from lp.registry.interfaces.person import IPersonSet
    return getUtility(IPersonSet).getByName(name)


def retire_series(distribution):
    """Mark all `DistroSeries` for `distribution` as obsolete."""
    for series in distribution.series:
        series.status = SeriesStatus.OBSOLETE


def retire_active_publishing_histories(histories, requester):
    """Retire all active publishing histories in the given collection."""
    # Avoid circular import.
    from lp.soyuz.interfaces.publishing import active_publishing_status
    for history in histories(status=active_publishing_status):
        history.requestDeletion(
            requester, "Cleaned up because of missing Librarian files.")


def retire_distro_archives(distribution, culprit):
    """Retire all items in `distribution`'s archives."""
    for archive in distribution.all_distro_archives:
        retire_active_publishing_histories(
            archive.getPublishedSources, culprit)
        retire_active_publishing_histories(
            archive.getAllPublishedBinaries, culprit)


def retire_ppas(distribution):
    """Disable all PPAs for `distribution`."""
    for ppa in distribution.getAllPPAs():
        removeSecurityProxy(ppa).publish = False


def set_lucille_config(distribution):
    """Set lucilleconfig on all series of `distribution`."""
    for series in distribution.series:
        removeSecurityProxy(series).lucilleconfig = '''[publishing]
components = main restricted universe multiverse'''


def create_sections(distroseries):
    """Set up some sections for `distroseries`."""
    section_names = (
        'admin', 'cli-mono', 'comm', 'database', 'devel', 'debug', 'doc',
        'editors', 'electronics', 'embedded', 'fonts', 'games', 'gnome',
        'graphics', 'gnu-r', 'gnustep', 'hamradio', 'haskell', 'httpd',
        'interpreters', 'java', 'kde', 'kernel', 'libs', 'libdevel', 'lisp',
        'localization', 'mail', 'math', 'misc', 'net', 'news', 'ocaml',
        'oldlibs', 'otherosfs', 'perl', 'php', 'python', 'ruby', 'science',
        'shells', 'sound', 'tex', 'text', 'utils', 'vcs', 'video', 'web',
        'x11', 'xfce', 'zope')
    store = Store.of(distroseries)
    for section_name in section_names:
        section = getUtility(ISectionSet).ensure(section_name)
        if section not in distroseries.sections:
            store.add(
                SectionSelection(distroseries=distroseries, section=section))


def create_components(distroseries, uploader):
    """Set up some components for `distroseries`."""
    component_names = ('main', 'restricted', 'universe', 'multiverse')
    store = Store.of(distroseries)
    main_archive = distroseries.distribution.main_archive
    for component_name in component_names:
        component = getUtility(IComponentSet).ensure(component_name)
        if component not in distroseries.components:
            store.add(
                ComponentSelection(
                    distroseries=distroseries, component=component))
        main_archive.newComponentUploader(uploader, component)
        main_archive.newQueueAdmin(uploader, component)


def create_series(parent, full_name, version, status):
    """Set up a `DistroSeries`."""
    distribution = parent.distribution
    owner = parent.owner
    name = full_name.split()[0].lower()
    title = "The " + full_name
    displayname = full_name.split()[0]
    new_series = distribution.newSeries(name=name, title=title,
        displayname=displayname, summary='Ubuntu %s is good.' % version,
        description='%s is awesome.' % version, version=version,
        parent_series=parent, owner=owner)
    new_series.status = status
    notify(ObjectCreatedEvent(new_series))

    # This bit copied from scripts/ftpmaster-tools/initialise-from-parent.py.
    assert new_series.architectures.count() == 0, (
        "Cannot copy distroarchseries from parent; this series already has "
        "distroarchseries.")

    store = Store.of(parent)
    store.execute("""
        INSERT INTO DistroArchSeries
            (distroseries, processorfamily, architecturetag, owner, official)
        SELECT %s, processorfamily, architecturetag, %s, official
        FROM DistroArchSeries WHERE distroseries = %s
        """ % sqlvalues(new_series, owner, parent))

    i386 = new_series.getDistroArchSeries('i386')
    i386.supports_virtualized = True
    new_series.nominatedarchindep = i386

    new_series.initialiseFromParent()
    return new_series


def create_sample_series(original_series, log):
    """Set up sample `DistroSeries`.

    :param original_series: The parent for the first new series to be
        created. The second new series will have the first as a parent,
        and so on.
    """
    series_descriptions = [
        ('Dapper Drake', SeriesStatus.SUPPORTED, '6.06'),
        ('Edgy Eft', SeriesStatus.OBSOLETE, '6.10'),
        ('Feisty Fawn', SeriesStatus.OBSOLETE, '7.04'),
        ('Gutsy Gibbon', SeriesStatus.OBSOLETE, '7.10'),
        ('Hardy Heron', SeriesStatus.SUPPORTED, '8.04'),
        ('Intrepid Ibex', SeriesStatus.SUPPORTED, '8.10'),
        ('Jaunty Jackalope', SeriesStatus.SUPPORTED, '9.04'),
        ('Karmic Koala', SeriesStatus.CURRENT, '9.10'),
        ('Lucid Lynx', SeriesStatus.DEVELOPMENT, '10.04'),
        ]

    parent = original_series
    for full_name, status, version in series_descriptions:
        log.info('Creating %s...' % full_name)
        parent = create_series(parent, full_name, version, status)


def clean_up(distribution, log):
    """Retire the existing sample data for `distribution`."""
    # First we eliminate all active publishings in the Ubuntu main archives.
    # None of the librarian files exist, so it kills the publisher.

    # Could use IPublishingSet.requestDeletion() on the published sources to
    # get rid of the binaries too, but I don't trust that there aren't
    # published binaries without corresponding sources.

    log.info("Deleting all items in official archives...")
    retire_distro_archives(distribution, get_person('name16'))

    # Disable publishing of all PPAs, as they probably have broken
    # publishings too.
    log.info("Disabling all PPAs...")
    retire_ppas(distribution)

    retire_series(distribution)


def set_source_package_format(distroseries):
    """Register a series' source package format selection."""
    utility = getUtility(ISourcePackageFormatSelectionSet)
    format = SourcePackageFormat.FORMAT_1_0
    if utility.getBySeriesAndFormat(distroseries, format) is None:
        utility.add(distroseries, format)


def populate(distribution, parent_series_name, uploader_name, log):
    """Set up sample data on `distribution`."""
    parent_series = distribution.getSeries(parent_series_name)

    # Set up lucilleconfig on all series. The sample data lacks this.
    log.info("Setting lucilleconfig...")
    set_lucille_config(distribution)

    log.info("Configuring sections...")
    create_sections(parent_series)

    log.info("Configuring components and permissions...")
    create_components(parent_series, get_person(uploader_name))

    set_source_package_format(parent_series)

    create_sample_series(parent_series, log)


def main(argv):
    options, args, log = parse_args(argv[1:])

    execute_zcml_for_scripts()
    txn = initZopeless(dbuser='launchpad')

    # Pass the whole options object: check_preconditions reads
    # options.force itself.
    check_preconditions(options)

    ubuntu = getUtility(ILaunchpadCelebrities).ubuntu
    clean_up(ubuntu, log)

    # Use Hoary as the root, as Breezy and Grumpy are broken.
    populate(ubuntu, 'hoary', 'ubuntu-team', log)

    if options.dry_run:
        txn.abort()
    else:
        txn.commit()

    log.info("Done.")


if __name__ == "__main__":
    main(sys.argv)