Merge lp:~ev/apport/reunpack_to_sandbox into lp:apport

Proposed by Evan on 2012-04-11
Status: Merged
Merged at revision: 2286
Proposed branch: lp:~ev/apport/reunpack_to_sandbox
Merge into: lp:apport
Diff against target: 154 lines (+92/-2)
2 files modified
backends/packaging-apt-dpkg.py (+72/-1)
test/test_backend_apt_dpkg.py (+20/-1)
To merge this branch: bzr merge lp:~ev/apport/reunpack_to_sandbox
Reviewer Review Type Date Requested Status
Martin Pitt 2012-04-11 Approve on 2012-04-18
Review via email: mp+101643@code.launchpad.net

Description of the Change

As discussed on IRC, we do not adequately handle the case of packages that Conflicts or Replaces with each other. This branch attempts to solve that by using Martin's suggested approach of deleting any cached debs (thus forcing them to be re-downloaded and unpacked) that conflict with the to-be-installed package.

It uses a dictionary of dictionaries (with None values) to maintain a mapping between a virtual package and those packages which provide it. It calculates this on the fly, as it writes packages into the cache, and so it also needs to persist the dictionary to disk to retain state between runs. It does this by using cPickle with an atexit() handler. I've chosen this approach for speed, but if there are concerns of fragility, lets discuss them :).

To post a comment you must log in.
lp:~ev/apport/reunpack_to_sandbox updated on 2012-04-11
2280. By Evan on 2012-04-11

Handle Replaces existing but Conflicts not existing.

Evan (ev) wrote :

Timings on a gedit crash against trunk (r2277) and this branch (r2280):

ubuntu@server-8149:~$ ./judge/judge ./previous_changes ./with_changes
v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^

    n mean sd min max cmd
   50 8757.4ms 315.5 8345.0 10521.3 ./previous_changes
   50 10346.5ms 269.1 9797.9 11263.7 ./with_changes
+1589.055ms +18.1% p=0.042
difference is significant at 95.0% confidence (p=0.042):
based on these samples, suggested sample size is n>=8 to have a 397.26ms confidence interval

lp:~ev/apport/reunpack_to_sandbox updated on 2012-04-12
2281. By Evan on 2012-04-12

typo

Martin Pitt (pitti) wrote :

Thanks for this!

I have some questions/remarks:

8 +import apt_pkg

I'm not sure whether this is the preferred method or using apt.apt_pkg.*. I just saw that _install_debug_kernel() imports apt_pkg within the method, this should be cleaned up along.

9 +from collections import defaultdict
36 + self._virtual_mapping_obj = defaultdict(dict)

Why do we need this, and the map of maps to none? Wouldn't it be easier to use a standard dict like 'virtualpkg' -> set(['realpkg1', ...])? That would look a lot easier IMHO. With that you can replace

+ virtual_mapping[p][pkg] = None

with

  virtual_mapping.setdefault(p, set()).add(pkg)

38 + atexit.register(self._save_virtual_mapping, configdir)

Why can't this be called after building the map and iterating through all packages in install_packages()? This would write it earlier, and also be much more reliable if you call install_packages() multiple times.

62 + if (permanent_rootdir and (candidate.record.has_key('Conflicts') or

Does the python-apt API require the usage of has_key() here, or can you just use the standard "in" syntax? (the PEP-8 check in current trunk should complain). Also, you can probably just drop that whole condition and move the code within the previous "if permanent_rootdir:" block.

Thanks!

review: Needs Fixing
Martin Pitt (pitti) wrote :

Oh and for the record, I think pickle is just fine for making the virtual dict persistent.

lp:~ev/apport/reunpack_to_sandbox updated on 2012-04-12
2282. By Evan on 2012-04-12

Use apt.apt_pkg instead of apt_pkg.

2283. By Evan on 2012-04-12

Use sets instead of dictionary keys as a list. Use 'in' instead of has_key.

2284. By Evan on 2012-04-12

Drop the atexit saving of the virtual package mapping pickle. Write it at the end of every install_packages call.

Evan (ev) wrote :

On Thu, Apr 12, 2012 at 9:15 AM, Martin Pitt <email address hidden> wrote:
> Review: Needs Fixing
>
> Thanks for this!
>
> I have some questions/remarks:
>
> 8       +import apt_pkg
>
> I'm not sure whether this is the preferred method or using apt.apt_pkg.*. I just saw that _install_debug_kernel() imports apt_pkg within the method, this should be cleaned up along.

Fixed

> 9       +from collections import defaultdict
> 36      + self._virtual_mapping_obj = defaultdict(dict)
>
> Why do we need this, and the map of maps to none? Wouldn't it be easier to use a standard dict like 'virtualpkg' -> set(['realpkg1', ...])? That would look a lot easier IMHO. With that you can replace
>
> + virtual_mapping[p][pkg] = None
>
> with
>
>  virtual_mapping.setdefault(p, set()).add(pkg)

Excellent point. It was getting late and I failed to consider that
sets stored unique values. Fixed.

> 38      + atexit.register(self._save_virtual_mapping, configdir)
>
> Why can't this be called after building the map and iterating through all packages in install_packages()? This would write it earlier, and also be much more reliable if you call install_packages() multiple times.

Fixed. I'm a bit concerned about performance here, but let's see how
the tests shake out.

> 62      + if (permanent_rootdir and (candidate.record.has_key('Conflicts') or
>
> Does the python-apt API require the usage of has_key() here, or can you just use the standard "in" syntax? (the PEP-8 check in current trunk should complain). Also, you can probably just drop that whole condition and move the code within the previous "if permanent_rootdir:" block.

Fixed

I've noticed that this breaks the previous
test_install_packages_permanent_sandbox test, as it fails to consider
the versions of the conflicting pages. In this case it removes tzdata,
even though it's only versions earlier than 2007k-1 that conflict.
Working on a fix for this now.

lp:~ev/apport/reunpack_to_sandbox updated on 2012-04-12
2285. By Evan on 2012-04-12

Correctly parse the Conflicts/Replaces packages and their versions out. Use Debfile to correctly obtain the package version from /var/cache/apt/archives. Finally, compare these two versions with the provided conditions using apt_pkg.check_dep.

2286. By Evan on 2012-04-12

Fix comment.

2287. By Evan on 2012-04-12

Put the pickle file under the release directory.

Evan (ev) wrote :

Okay, I think this is ready to review again.

I've fixed the issues you requested and fixed the problem of failing to consider the versions of conflicting packages (yay for unit tests catching something I had not thought about at all!). The function has grown quite long, but I would be concerned about the overhead in splitting it into extra functions that get called over and over again for each install_packages run.

There appears to be about a one second penalty for consideration of versions in conflicting packages, which is reasonable. This is just from eyeballing time(1) though. I'll follow up with a full timing report.

Evan (ev) wrote :

Difference between r2281 in this branch and r2287. This is only with a single package, so it's far from a comprehensive test. However, the overhead of running the general case does not seem to be a massive impact, which is good.

Suggestions welcome for how to generate a giant pile of varying Ubuntu crashes. Perhaps it would be worthwhile for me to write something that spawns and kill -SEGV's a number of native applications from an up to date system?

ubuntu@server-8149:~$ ./judge/judge ./2281 ./with_changes
v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^v^

    n mean sd min max cmd
   50 10316.2ms 355.7 9945.9 11638.6 ./2281
   50 9867.5ms 625.8 9385.6 13061.3 ./with_changes
 -448.636ms -4.3% p=0.000
difference is significant at 95.0% confidence (p=0.000):
based on these samples, suggested sample size is n>=283 to have a 112.16ms confidence interval

lp:~ev/apport/reunpack_to_sandbox updated on 2012-04-17
2288. By Evan on 2012-04-17

Put the pickle file under the cache_dir since it more closely relates to that then the configdir. This also means that if the cache_dir gets wiped out, the pickle rightly goes with it.

Martin Pitt (pitti) wrote :

Thanks for the fixes. I merged that with dropping the new python-debian dependency and calling "dpkg-deb -f <deb> Version" instead, and doing a small apt_pkg import fix.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'backends/packaging-apt-dpkg.py'
2--- backends/packaging-apt-dpkg.py 2012-04-16 14:09:57 +0000
3+++ backends/packaging-apt-dpkg.py 2012-04-17 18:44:24 +0000
4@@ -18,6 +18,8 @@
5 import warnings
6 warnings.filterwarnings('ignore', 'apt API not stable yet', FutureWarning)
7 import apt
8+from debian import debfile
9+import cPickle as pickle
10
11 import apport
12 from apport.packaging import PackageInfo
13@@ -32,6 +34,7 @@
14 self._sandbox_apt_cache = None
15 self._contents_dir = None
16 self._mirror = None
17+ self._virtual_mapping_obj = None
18
19 self.configuration = '/etc/default/apport'
20
21@@ -43,6 +46,25 @@
22 except AttributeError:
23 pass
24
25+ def _virtual_mapping(self, configdir):
26+ if self._virtual_mapping_obj is not None:
27+ return self._virtual_mapping_obj
28+
29+ mapping_file = os.path.join(configdir, 'virtual_mapping.pickle')
30+ if os.path.exists(mapping_file):
31+ with open(mapping_file, 'rb') as fp:
32+ self._virtual_mapping_obj = pickle.load(fp)
33+ else:
34+ self._virtual_mapping_obj = {}
35+
36+ return self._virtual_mapping_obj
37+
38+ def _save_virtual_mapping(self, configdir):
39+ mapping_file = os.path.join(configdir, 'virtual_mapping.pickle')
40+ if self._virtual_mapping_obj is not None:
41+ with open(mapping_file, 'wb') as fp:
42+ pickle.dump(self._virtual_mapping_obj, fp)
43+
44 def _cache(self):
45 '''Return apt.Cache() (initialized lazily).'''
46
47@@ -441,7 +463,7 @@
48 if debug_pkgname in c and c[debug_pkgname].isInstalled:
49 #print('kernel ddeb already installed')
50 return (installed, outdated)
51- target_dir = apt_pkg.Config.FindDir('Dir::Cache::archives') + '/partial'
52+ target_dir = apt.apt_pkg.Config.FindDir('Dir::Cache::archives') + '/partial'
53 deb = '%s_%s_%s.ddeb' % (debug_pkgname, ver, arch)
54 # FIXME: this package is currently not in Packages.gz
55 url = 'http://ddebs.ubuntu.com/pool/main/l/linux/%s' % deb
56@@ -549,6 +571,53 @@
57 obsolete += w + '\n'
58 real_pkgs.add(pkg)
59
60+ if permanent_rootdir:
61+ mapping_path = os.path.join(cache_dir, release)
62+ virtual_mapping = self._virtual_mapping(mapping_path)
63+ # Remember all the virtual packages that this package provides,
64+ # so that if we encounter that virtual package as a
65+ # Conflicts/Replaces later, we know to remove this package from
66+ # the cache.
67+ for p in candidate.provides:
68+ virtual_mapping.setdefault(p, set()).add(pkg)
69+ conflicts = []
70+ if 'Conflicts' in candidate.record:
71+ conflicts += apt.apt_pkg.parse_depends(candidate.record['Conflicts'])
72+ if 'Replaces' in candidate.record:
73+ conflicts += apt.apt_pkg.parse_depends(candidate.record['Replaces'])
74+ archives = apt.apt_pkg.Config.FindDir('Dir::Cache::archives')
75+ for conflict in conflicts:
76+ # apt_pkg.parse_depends needs to handle the or operator,
77+ # but as policy states it is invalid to use that in
78+ # Replaces/Depends, we can safely choose the first value
79+ # here.
80+ conflict = conflict[0]
81+ if c.is_virtual_package(conflict[0]):
82+ try:
83+ providers = virtual_mapping[conflict[0]]
84+ except KeyError:
85+ # We may not have seen the virtual package that
86+ # this conflicts with, so we can assume it's not
87+ # unpacked into the sandbox.
88+ continue
89+ for p in providers:
90+ debs = os.path.join(archives, '%s_*.deb' % p)
91+ for path in glob.glob(debs):
92+ ver = debfile.DebFile(path)
93+ ver = ver.control.debcontrol()['version']
94+ if apt.apt_pkg.check_dep(ver, conflict[2],
95+ conflict[1]):
96+ os.unlink(path)
97+ del providers
98+ else:
99+ debs = os.path.join(archives, '%s_*.deb' % conflict[0])
100+ for path in glob.glob(debs):
101+ ver = debfile.DebFile(path)
102+ ver = ver.control.debcontrol()['version']
103+ if apt.apt_pkg.check_dep(ver, conflict[2],
104+ conflict[1]):
105+ os.unlink(path)
106+
107 if candidate.architecture != 'all':
108 if pkg + '-dbg' in c:
109 real_pkgs.add(pkg + '-dbg')
110@@ -585,6 +654,8 @@
111 assert not real_pkgs, 'apt fetcher did not fetch these packages: ' \
112 + ' '.join(real_pkgs)
113
114+ if permanent_rootdir:
115+ self._save_virtual_mapping(mapping_path)
116 return obsolete
117
118 def package_name_glob(self, nameglob):
119
120=== modified file 'test/test_backend_apt_dpkg.py'
121--- test/test_backend_apt_dpkg.py 2012-04-11 16:09:08 +0000
122+++ test/test_backend_apt_dpkg.py 2012-04-17 18:44:24 +0000
123@@ -1,5 +1,5 @@
124 import unittest, gzip, imp, subprocess, tempfile, shutil, os, os.path, time
125-import apt_pkg
126+from apt import apt_pkg
127 import glob
128
129 if os.environ.get('APPORT_TEST_LOCAL'):
130@@ -543,6 +543,25 @@
131 permanent_rootdir=True)
132 apt_pkg.Config.set('Acquire::http::Proxy', '')
133
134+ @unittest.skipUnless(_has_default_route(), 'online test')
135+ def test_install_packages_permanent_sandbox_repack(self):
136+ self._setup_foonux_config()
137+ apache_bin_path = os.path.join(self.rootdir, 'usr/sbin/apache2')
138+ impl.install_packages(self.rootdir, self.configdir, 'Foonux 1.2',
139+ [('apache2-mpm-worker', None)], False, self.cachedir,
140+ permanent_rootdir=True)
141+ self.assertTrue(os.readlink(apache_bin_path).endswith('mpm-worker/apache2'))
142+ impl.install_packages(self.rootdir, self.configdir, 'Foonux 1.2',
143+ [('apache2-mpm-event', None)], False, self.cachedir,
144+ permanent_rootdir=True)
145+ self.assertTrue(os.readlink(apache_bin_path).endswith('mpm-event/apache2'),
146+ 'should have installed mpm-event, but have mpm-worker.')
147+ impl.install_packages(self.rootdir, self.configdir, 'Foonux 1.2',
148+ [('apache2-mpm-worker', None)], False, self.cachedir,
149+ permanent_rootdir=True)
150+ self.assertTrue(os.readlink(apache_bin_path).endswith('mpm-worker/apache2'),
151+ 'should have installed mpm-worker, but have mpm-event.')
152+
153 def _setup_foonux_config(self):
154 '''Set up directories and configuration for install_packages()'''
155

Subscribers

People subscribed via source and target branches