Merge lp:~danilo/linaro-license-protection/sanitized-tree into lp:~linaro-automation/linaro-license-protection/trunk

Proposed by Данило Шеган
Status: Merged
Merged at revision: 115
Proposed branch: lp:~danilo/linaro-license-protection/sanitized-tree
Merge into: lp:~linaro-automation/linaro-license-protection/trunk
Diff against target: 356 lines (+236/-86)
3 files modified
scripts/linaroscript.py (+63/-0)
scripts/make-sanitized-tree-copy.py (+96/-0)
scripts/update-deployment.py (+77/-86)
To merge this branch: bzr merge lp:~danilo/linaro-license-protection/sanitized-tree
Reviewer: Stevan Radaković
Review status: Approve
Review via email: mp+120889@code.launchpad.net

Description of the change

Use the functionality of SnapshotsPublisher to create a sanitized copy of
an entire tree. We rely on shutil.copy2 and shutil.copystat to ensure
appropriate flags are transferred as well.

We ignore errors and simply log them. They will then reach our inboxes via linaro-infrastructure-errors, so we can ensure they are not a problem (if they are, we'll need to update the script, but hey).
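
A minimal sketch of the per-file logic described above, written for Python 3 (the merged script targets Python 2). `is_accepted` and the inline sanitization are hypothetical stand-ins for `SnapshotsPublisher.is_accepted_for_staging` and `SnapshotsPublisher.sanitize_file`, whose implementations are not part of this diff. The order used here (write the placeholder contents first, then call `copystat`) is a choice of this sketch so that the source timestamps survive; it is not necessarily what the merged script does.

```python
import logging
import os
import shutil

logger = logging.getLogger("sanitized-copy-sketch")


def is_accepted(path):
    """Hypothetical stand-in for SnapshotsPublisher.is_accepted_for_staging."""
    return path.endswith(".txt")


def copy_one(source_file, target_file):
    """Copy one file, sanitizing it unless accepted.

    Errors are logged instead of raised, so one bad file does not
    abort the whole tree copy. Returns True on success.
    """
    try:
        if is_accepted(source_file):
            # copy2 copies the contents and then the metadata
            # (permission bits, timestamps).
            shutil.copy2(source_file, target_file)
        else:
            # Assumed sanitization: the contents become the base file name.
            with open(target_file, "w") as sanitized:
                sanitized.write(os.path.basename(source_file))
            # copystat copies permission bits and timestamps only,
            # never the contents.
            shutil.copystat(source_file, target_file)
        return True
    except OSError as why:
        logger.error("While copying '%s' to '%s' we hit:\n\t%s",
                     source_file, target_file, why)
        return False
```

Calling `copy_one` on a missing source returns `False` and leaves an error in the log rather than stopping the run, which is the behaviour the description relies on for the error emails.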

The script simply prints out the destination directory when done (all other output goes to stderr). The idea is to store this path in an environment variable and then pass it to rsync to copy from mombin (production) to kahaku (staging).
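
The stdout/stderr split is what makes the path easy to capture. A rough Python 3 sketch of a caller, assuming a stand-in child process instead of the real script; the `CHILD` snippet, the example path, and the `staging-host:/srv/target/` destination are all placeholders, not values from this branch:

```python
import subprocess
import sys

# Hypothetical stand-in for make-sanitized-tree-copy.py: diagnostics go to
# stderr and only the resulting path is printed on stdout.
CHILD = (
    "import sys\n"
    "sys.stderr.write('INFO: copying...\\n')\n"
    "print('/tmp/sanitized-example')\n"
)


def capture_destination(command):
    """Run `command` and return its stdout, stripped of the trailing newline.

    stderr is not captured, so log output still reaches the terminal or
    the cron mail.
    """
    output = subprocess.check_output(command)
    return output.decode().strip()


destination = capture_destination([sys.executable, "-c", CHILD])

# The trailing slash asks rsync to copy the directory's contents rather
# than the directory itself. The command is only built here, not run.
rsync_command = ["rsync", "-a", destination + "/", "staging-host:/srv/target/"]
```

Because the log lines never touch stdout, the captured value is exactly the directory path, whatever verbosity the script was run with.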

I am still not sure if it'd be better to leave the temp dir creation to the script caller as well. That would make for a slightly more complicated wrapper script, but probably not by much. What do you think?

This also extracts some of the common bits (such as logger configuration and support for multiple -v options to increase the verbosity of the logger output) into a separate class and shares it with the update-deployment script. The changes in update-deployment are very minor: mostly reindentation and additions of "self." where appropriate.
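
The verbosity handling can be sketched on its own as follows (Python 3). One detail differs from the shared class on purpose: this sketch passes `default=0` to argparse, because with `action='count'` and no default the attribute is `None` when -v is never given, and in Python 3 comparing `None` with `>=` raises `TypeError`. The function names here are illustrative, not the ones used in the branch.

```python
import argparse
import logging


def logging_level_from_verbosity(verbosity):
    """Map the number of -v flags to a logging level."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.ERROR


def build_parser():
    parser = argparse.ArgumentParser(description="Verbosity sketch.")
    # default=0 is an addition in this sketch: without it, argparse
    # leaves the attribute as None when -v is never given.
    parser.add_argument(
        "-v", "--verbose", action="count", default=0,
        help="Increase the output verbosity. Can be used multiple times")
    return parser
```

With this mapping, no flag means errors only, `-v` adds informational messages, and `-vv` (or more) turns on debug output.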

Stevan Radaković (stevanr) wrote :

This looks good. Temp directory creation should be left here, IMO.
It might make sense to change my validation script to extend the LinaroScript class as well.
Approve +1.

review: Approve

Preview Diff

=== added file 'scripts/linaroscript.py'
--- scripts/linaroscript.py 1970-01-01 00:00:00 +0000
+++ scripts/linaroscript.py 2012-08-22 23:43:25 +0000
@@ -0,0 +1,63 @@
+# Copyright 2012 Linaro.
+
+"""Helper class for creating new scripts.
+
+It pre-defines a logger using the script_name (passed through the constructor)
+and allows different verbosity levels.
+
+Overload the work() method to define the main work to be done by the script.
+
+You can use the logger by accessing instance.logger attribute.
+You can use the parser (eg. to add another argument) by accessing
+instance.argument_parser attribute.
+
+Parsed arguments are available to your work() method in instance.arguments.
+"""
+
+import argparse
+import logging
+
+
+class LinaroScript(object):
+    def __init__(self, script_name, description=None):
+        self.script_name = script_name
+        self.description = description
+        self.argument_parser = argparse.ArgumentParser(
+            description=self.description)
+        self.setup_parser()
+
+    def work(self):
+        """The main body of the script. Overload when subclassing."""
+        raise NotImplementedError
+
+    def run(self):
+        self.arguments = self.argument_parser.parse_args()
+        logging_level = self.get_logging_level_from_verbosity(
+            self.arguments.verbose)
+        self.logger = logging.getLogger(self.script_name)
+        self.logger.setLevel(logging_level)
+        formatter = logging.Formatter(
+            fmt='%(asctime)s %(levelname)s: %(message)s')
+        handler = logging.StreamHandler()
+        handler.setFormatter(formatter)
+        self.logger.addHandler(handler)
+
+        self.work()
+
+    def setup_parser(self):
+        self.argument_parser.add_argument(
+            "-v", "--verbose", action='count',
+            help=("Increase the output verbosity. "
+                  "Can be used multiple times"))
+
+    def get_logging_level_from_verbosity(self, verbosity):
+        """Return a logging level based on the number of -v arguments."""
+        if verbosity == 0:
+            logging_level = logging.ERROR
+        elif verbosity == 1:
+            logging_level = logging.INFO
+        elif verbosity >= 2:
+            logging_level = logging.DEBUG
+        else:
+            logging_level = logging.ERROR
+        return logging_level

=== added file 'scripts/make-sanitized-tree-copy.py'
--- scripts/make-sanitized-tree-copy.py 1970-01-01 00:00:00 +0000
+++ scripts/make-sanitized-tree-copy.py 2012-08-22 23:43:25 +0000
@@ -0,0 +1,96 @@
+#!/usr/bin/env python
+# Create a copy of a directory structure while preserving as much metadata
+# as possible and sanitizing any sensitive data.
+
+# Everything, unless whitelisted, is truncated and the contents are replaced
+# with the base file name itself.
+
+import os
+import shutil
+import tempfile
+
+from linaroscript import LinaroScript
+from publish_to_snapshots import SnapshotsPublisher
+
+
+class MakeSanitizedTreeCopyScript(LinaroScript):
+
+    def setup_parser(self):
+        super(MakeSanitizedTreeCopyScript, self).setup_parser()
+        self.argument_parser.add_argument(
+            'directory', metavar='DIR', type=str,
+            help="Directory to create a sanitized deep copy of.")
+
+    @staticmethod
+    def filter_accepted_files(directory, list_of_files):
+        accepted_files = []
+        for filename in list_of_files:
+            full_path = os.path.join(directory, filename)
+            if SnapshotsPublisher.is_accepted_for_staging(full_path):
+                accepted_files.append(filename)
+        return accepted_files
+
+    def copy_sanitized_tree(self, source, target):
+        """Copies the tree from `source` to `target` while sanitizing it.
+
+        Performs a recursive copy trying to preserve as many file
+        attributes as possible.
+        """
+        assert os.path.isdir(source) and os.path.isdir(target), (
+            "Both source (%s) and target (%s) must be directories." % (
+                source, target))
+        self.logger.debug("copy_sanitized_tree('%s', '%s')", source, target)
+        filenames = os.listdir(source)
+        for filename in filenames:
+            self.logger.debug("Copying '%s'...", filename)
+            source_file = os.path.join(source, filename)
+            target_file = os.path.join(target, filename)
+            try:
+                if os.path.isdir(source_file):
+                    self.logger.debug("Making directory '%s'" % target_file)
+                    os.makedirs(target_file)
+                    self.copy_sanitized_tree(source_file, target_file)
+                elif SnapshotsPublisher.is_accepted_for_staging(source_file):
+                    self.logger.debug(
+                        "Copying '%s' to '%s' with no sanitization...",
+                        source_file, target_file)
+                    shutil.copy2(source_file, target_file)
+                else:
+                    self.logger.debug(
+                        "Creating sanitized file '%s'", target_file)
+                    # This creates an target file.
+                    open(target_file, "w").close()
+                    shutil.copystat(source_file, target_file)
+                    SnapshotsPublisher.sanitize_file(target_file)
+            except (IOError, os.error) as why:
+                self.logger.error(
+                    "While copying '%s' to '%s' we hit:\n\t%s",
+                    source_file, target_file, str(why))
+
+        try:
+            shutil.copystat(source, target)
+        except OSError as why:
+            self.logger.error(
+                "While copying '%s' to '%s' we hit:\n\t%s",
+                source, target, str(why))
+
+    def work(self):
+        source_directory = self.arguments.directory
+        self.logger.info("Copying and sanitizing '%s'...", source_directory)
+        target_directory = tempfile.mkdtemp()
+        self.logger.info("Temporary directory: '%s'", target_directory)
+
+        self.copy_sanitized_tree(
+            self.arguments.directory, target_directory)
+
+        print target_directory
+
+if __name__ == '__main__':
+    script = MakeSanitizedTreeCopyScript(
+        'make-sanitized-tree-copy',
+        description=(
+            "Makes a copy of a directory tree in a temporary location "
+            "and sanitize file that can contain potentially restricted "
+            "content. "
+            "Returns the path of a newly created temporary directory."))
+    script.run()

=== modified file 'scripts/update-deployment.py'
--- scripts/update-deployment.py 2012-08-16 13:09:23 +0000
+++ scripts/update-deployment.py 2012-08-22 23:43:25 +0000
@@ -34,13 +34,13 @@
 
 """
 
-import argparse
 import bzrlib.branch
 import bzrlib.workingtree
-import logging
 import os
 import subprocess
 
+from linaroscript import LinaroScript
+
 code_base = '/srv/shared-branches'
 branch_name = 'linaro-license-protection'
 configs_branch_name = 'linaro-license-protection-config'
@@ -60,92 +60,83 @@
 configs_root = os.path.join(code_base, configs_branch_name)
 
 
-def refresh_branch(branch_dir):
-    """Refreshes a branch checked-out to a branch_dir."""
-
-    code_branch = bzrlib.branch.Branch.open(branch_dir)
-    parent_branch = bzrlib.branch.Branch.open(
-        code_branch.get_parent())
-    result = code_branch.pull(source=parent_branch)
-    if result.old_revno != result.new_revno:
-        logger.info("Updated %s from %d to %d.",
-                    branch_dir, result.old_revno, result.new_revno)
-    else:
-        logger.info("No changes to pull from %s.", code_branch.get_parent())
-    logger.debug("Updating working tree in %s.", branch_dir)
-    update_tree(branch_dir)
-    return code_branch
-
-
-def update_tree(working_tree_dir):
-    """Does a checkout update."""
-    code_tree = bzrlib.workingtree.WorkingTree.open(working_tree_dir)
-    code_tree.update()
-
-
-def update_installation(config, installation_root):
-    """Updates a single installation code and databases.
-
-    It expects code and config branches to be simple checkouts (working trees)
-    so it only does an "update" on them.
-
-    Afterwards, it runs "syncdb" and "collectstatic" steps.
-    """
-    refresh_branch(os.path.join(installation_root, branch_name))
-    refresh_branch(os.path.join(installation_root, "configs"))
-    os.environ["PYTHONPATH"] = (
-        ":".join(
-            [installation_root,
-             os.path.join(installation_root, branch_name),
-             os.path.join(installation_root, "configs", "django"),
-             os.environ.get("PYTHONPATH", "")]))
-
-    logger.info("Updating installation in %s with config %s...",
-                installation_root, config)
-    os.environ["DJANGO_SETTINGS_MODULE"] = config
-    logger.debug("DJANGO_SETTINGS_MODULE=%s",
-                 os.environ.get("DJANGO_SETTINGS_MODULE"))
-
-    logger.debug("Doing 'syncdb'...")
-    logger.debug(subprocess.check_output(
-        ["django-admin", "syncdb", "--noinput"], cwd=code_root))
-
-    logger.debug("Doing 'collectstatic'...")
-    logger.debug(subprocess.check_output(
-        ["django-admin", "collectstatic", "--noinput"],
-        cwd=code_root))
+class UpdateDeploymentScript(LinaroScript):
+
+    def refresh_branch(self, branch_dir):
+        """Refreshes a branch checked-out to a branch_dir."""
+
+        code_branch = bzrlib.branch.Branch.open(branch_dir)
+        parent_branch = bzrlib.branch.Branch.open(
+            code_branch.get_parent())
+        result = code_branch.pull(source=parent_branch)
+        if result.old_revno != result.new_revno:
+            self.logger.info("Updated %s from %d to %d.",
+                             branch_dir, result.old_revno, result.new_revno)
+        else:
+            self.logger.info(
+                "No changes to pull from %s.", code_branch.get_parent())
+        self.logger.debug("Updating working tree in %s.", branch_dir)
+        self.update_tree(branch_dir)
+        return code_branch
+
+    def update_tree(self, working_tree_dir):
+        """Does a checkout update."""
+        code_tree = bzrlib.workingtree.WorkingTree.open(working_tree_dir)
+        code_tree.update()
+
+    def update_installation(self, config, installation_root):
+        """Updates a single installation code and databases.
+
+        It expects code and config branches to be simple checkouts
+        (working trees) so it only does an "update" on them.
+
+        Afterwards, it runs "syncdb" and "collectstatic" steps.
+        """
+        self.refresh_branch(os.path.join(installation_root, branch_name))
+        self.refresh_branch(os.path.join(installation_root, "configs"))
+        os.environ["PYTHONPATH"] = (
+            ":".join(
+                [installation_root,
+                 os.path.join(installation_root, branch_name),
+                 os.path.join(installation_root, "configs", "django"),
+                 os.environ.get("PYTHONPATH", "")]))
+
+        self.logger.info("Updating installation in %s with config %s...",
+                         installation_root, config)
+        os.environ["DJANGO_SETTINGS_MODULE"] = config
+        self.logger.debug("DJANGO_SETTINGS_MODULE=%s",
+                          os.environ.get("DJANGO_SETTINGS_MODULE"))
+
+        self.logger.debug("Doing 'syncdb'...")
+        self.logger.debug(subprocess.check_output(
+            ["django-admin", "syncdb", "--noinput"], cwd=code_root))
+
+        self.logger.debug("Doing 'collectstatic'...")
+        self.logger.debug(subprocess.check_output(
+            ["django-admin", "collectstatic", "--noinput"],
+            cwd=code_root))
+
+    def setup_parser(self):
+        super(UpdateDeploymentScript, self).setup_parser()
+        self.argument_parser.add_argument(
+            'configs', metavar='CONFIG', nargs='+',
+            choices=configs_to_use.keys(),
+            help=("Django configuration module to use. One of " +
+                  ', '.join(configs_to_use.keys())))
+
+    def work(self):
+        # Refresh code in shared-branches.
+        self.refresh_branch(code_root)
+        self.refresh_branch(configs_root)
+
+        # We update installations for all the configs we've got.
+        for config in self.arguments.configs:
+            self.update_installation(config, configs_to_use[config])
 
 
 if __name__ == '__main__':
-    parser = argparse.ArgumentParser(
+    script = UpdateDeploymentScript(
+        'update-deployment',
         description=(
             "Update staging deployment of lp:linaro-license-protection."))
-    parser.add_argument(
-        'configs', metavar='CONFIG', nargs='+', choices=configs_to_use.keys(),
-        help=("Django configuration module to use. One of " +
-              ', '.join(configs_to_use.keys())))
-    parser.add_argument("-v", "--verbose", action='count',
-                        help=("Increase the output verbosity. "
-                              "Can be used multiple times"))
-    args = parser.parse_args()
-
-    logging_level = logging.ERROR
-    if args.verbose == 0:
-        logging_level = logging.ERROR
-    elif args.verbose == 1:
-        logging_level = logging.INFO
-    elif args.verbose >= 2:
-        logging_level = logging.DEBUG
-
-    logger = logging.getLogger('update-staging')
-    logging.basicConfig(
-        format='%(asctime)s %(levelname)s: %(message)s',
-        level=logging_level)
-
-    # Refresh code in shared-branches.
-    refresh_branch(code_root)
-    refresh_branch(configs_root)
-
-    # We update installations for all the configs we've got.
-    for config in args.configs:
-        update_installation(config, configs_to_use[config])
+    script.run()
