Merge lp:~ev/apport/reunpack_to_sandbox into lp:apport
| Status: | Merged |
|---|---|
| Merged at revision: | 2286 |
| Proposed branch: | lp:~ev/apport/reunpack_to_sandbox |
| Merge into: | lp:apport |
| Diff against target: | 154 lines (+92/-2), 2 files modified: backends/packaging-apt-dpkg.py (+72/-1), test/test_backend_apt_dpkg.py (+20/-1) |
| To merge this branch: | bzr merge lp:~ev/apport/reunpack_to_sandbox |
| Related bugs: | |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Martin Pitt | | 2012-04-11 | Approve on 2012-04-18 |
Description of the Change
As discussed on IRC, we do not adequately handle the case of packages that Conflict with or Replace each other. This branch attempts to solve that using Martin's suggested approach: delete any cached debs that conflict with the to-be-installed package, thus forcing them to be re-downloaded and unpacked.
It uses a dictionary of dictionaries (with None values) to maintain a mapping between each virtual package and the packages that provide it. It calculates this on the fly as it writes packages into the cache, so it also needs to persist the dictionary to disk to retain state between runs. It does this using cPickle with an atexit() handler. I've chosen this approach for speed, but if there are concerns about fragility, let's discuss them :).
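A rough sketch of the idea (names are illustrative, the mapping is shown with sets, which is where the review below ends up, and the real code in backends/packaging-apt-dpkg.py differs in detail):

```python
import os
import pickle  # the branch uses cPickle; plain pickle shows the same idea


def record_provides(virtual_mapping, candidate):
    # While a package is written into the sandbox cache, remember which real
    # packages Provide each virtual package name (candidate is an
    # apt.package.Version).
    for virtual_pkg in candidate.provides:
        virtual_mapping.setdefault(virtual_pkg, set()).add(candidate.package.name)


def evict_conflicting_debs(archive_dir, conflicting_names):
    # Delete any cached .deb belonging to a package that the to-be-installed
    # package Conflicts with or Replaces, forcing it to be re-downloaded and
    # unpacked on the next run.
    for deb in os.listdir(archive_dir):
        if not deb.endswith('.deb'):
            continue
        if deb.split('_', 1)[0] in conflicting_names:
            os.unlink(os.path.join(archive_dir, deb))
```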
Martin Pitt (pitti) wrote:
Thanks for this!
I have some questions/remarks:
8 +import apt_pkg
I'm not sure whether this is the preferred method or using apt.apt_pkg.*. I just saw that _install_
9 +from collections import defaultdict
36 + self._virtual_
Why do we need this, and the map of maps to None? Wouldn't it be easier to use a standard dict like 'virtualpkg' -> set(['realpkg1', ...])? That would look a lot easier IMHO. With that you can replace
+ virtual_
with
virtual_
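The suggested snippet is truncated above, but the set-based idiom being proposed is roughly this (names are illustrative):

```python
from collections import defaultdict

# 'virtualpkg' -> set of real packages that Provide it
virtual_mapping = defaultdict(set)
virtual_mapping['awk'].add('mawk')
virtual_mapping['awk'].add('gawk')

assert virtual_mapping['awk'] == set(['mawk', 'gawk'])
```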
38 + atexit.
Why can't this be called after building the map and iterating through all packages in install_packages()? This would write it earlier, and also be much more reliable if you call install_packages() multiple times.
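What that would look like, as a small sketch (the helper and file name here are illustrative, not the branch's actual identifiers):

```python
import os
import pickle


def save_virtual_mapping(virtual_mapping, aptroot):
    # Called at the end of each install_packages() run instead of from an
    # atexit handler, so the on-disk state stays current even when
    # install_packages() is invoked several times in one process.
    mapping_file = os.path.join(aptroot, 'virtual_mapping.pickle')
    with open(mapping_file, 'wb') as f:
        pickle.dump(virtual_mapping, f)
```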
62 + if (permanent_rootdir and (candidate.
Does the python-apt API require the usage of has_key() here, or can you just use the standard "in" syntax? (the PEP-8 check in current trunk should complain). Also, you can probably just drop that whole condition and move the code within the previous "if permanent_rootdir:" block.
Thanks!
Martin Pitt (pitti) wrote:
Oh and for the record, I think pickle is just fine for making the virtual dict persistent.
- 2282. By Evan on 2012-04-12: Use apt.apt_pkg instead of apt_pkg.
- 2283. By Evan on 2012-04-12: Use sets instead of dictionary keys as a list. Use 'in' instead of has_key.
- 2284. By Evan on 2012-04-12: Drop the atexit saving of the virtual package mapping pickle. Write it at the end of every install_packages call.
On Thu, Apr 12, 2012 at 9:15 AM, Martin Pitt <email address hidden> wrote:
> Review: Needs Fixing
>
> Thanks for this!
>
> I have some questions/remarks:
>
> 8 +import apt_pkg
>
> I'm not sure whether this is the preferred method or using apt.apt_pkg.*. I just saw that _install_
Fixed
> 9 +from collections import defaultdict
> 36 + self._virtual_
>
> Why do we need this, and the map of maps to none? Wouldn't it be easier to use a standard dict like 'virtualpkg' -> set(['realpkg1', ...])? That would look a lot easier IMHO. With that you can replace
>
> + virtual_
>
> with
>
> virtual_
Excellent point. It was getting late and I failed to consider that
sets store unique values. Fixed.
> 38 + atexit.
>
> Why can't this be called after building the map and iterating through all packages in install_packages()? This would write it earlier, and also be much more reliable if you call install_packages() multiple times.
Fixed. I'm a bit concerned about performance here, but let's see how
the tests shake out.
> 62 + if (permanent_rootdir and (candidate.
>
> Does the python-apt API require the usage of has_key() here, or can you just use the standard "in" syntax? (the PEP-8 check in current trunk should complain). Also, you can probably just drop that whole condition and move the code within the previous "if permanent_rootdir:" block.
Fixed
I've noticed that this breaks the previous test_install_ test: it fails to
take into account the versions of the conflicting packages. In this case it
removes tzdata, even though only versions earlier than 2007k-1 conflict.
Working on a fix for this now.
- 2285. By Evan on 2012-04-12: Correctly parse the Conflicts/Replaces packages and their versions out. Use Debfile to correctly obtain the package version from /var/cache/apt/archives. Finally, compare these two versions with the provided conditions using apt_pkg.check_dep.
- 2286. By Evan on 2012-04-12: Fix comment.
- 2287. By Evan on 2012-04-12: Put the pickle file under the release directory.
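Roughly what the version-aware check described in r2285 looks like with python-apt (a sketch with assumed helper names and inputs, not the exact code that was merged):

```python
import apt  # importing apt initialises apt_pkg's configuration and system
from apt import apt_pkg


def conflicting_cached_packages(relations, cached_versions):
    """Return names of cached packages hit by a Conflicts/Replaces field.

    relations: the raw field text, e.g. 'tzdata (<< 2007k-1), libc0.1'.
    cached_versions: package name -> version of the .deb sitting in the
    apt archive cache (obtained from the deb's control data).
    """
    hits = set()
    for or_group in apt_pkg.parse_depends(relations):
        for name, version, op in or_group:
            if name not in cached_versions:
                continue
            # An empty op means the relation is unversioned and always
            # applies; otherwise only count it when the cached version
            # satisfies the stated condition, so e.g. tzdata 2007k-2 is
            # not evicted for 'tzdata (<< 2007k-1)'.
            if not op or apt_pkg.check_dep(cached_versions[name], op, version):
                hits.add(name)
    return hits
```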
Okay, I think this is ready to review again.
I've fixed the issues you requested and fixed the problem of failing to consider the versions of conflicting packages (yay for unit tests catching something I had not thought about at all!). The function has grown quite long, but I would be concerned about the overhead in splitting it into extra functions that get called over and over again for each install_packages run.
There appears to be about a one second penalty for consideration of versions in conflicting packages, which is reasonable. This is just from eyeballing time(1) though. I'll follow up with a full timing report.
Below is the difference between r2281 of this branch and r2287. This is only with a single package, so it's far from a comprehensive test. However, the overhead of running the general case does not seem to have a massive impact, which is good.
Suggestions welcome for how to generate a large and varied pile of Ubuntu crashes. Perhaps it would be worthwhile for me to write something that spawns a number of native applications on an up-to-date system and then kill -SEGVs them? (A rough sketch of what I mean follows the numbers below.)
| n | mean | sd | min | max | cmd |
|---|---|---|---|---|---|
| 50 | 10316.2ms | 355.7 | 9945.9 | 11638.6 | ./2281 |
| 50 | 9867.5ms | 625.8 | 9385.6 | 13061.3 | ./with_changes |

-448.636ms (-4.3%), p=0.000
difference is significant at 95.0% confidence (p=0.000):
based on these samples, suggested sample size is n>=283 to have a 112.16ms confidence interval
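A rough sketch of the crash-generation idea mentioned above (gedit is just an example binary, a display is assumed, and apport must be enabled so the report lands in /var/crash):

```python
import os
import signal
import subprocess
import time

# Start a native application, give it a moment to come up, then SIGSEGV it
# so apport writes a crash report for it.
proc = subprocess.Popen(['gedit'])
time.sleep(5)
os.kill(proc.pid, signal.SIGSEGV)
proc.wait()
```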
Martin Pitt (pitti) wrote:
Thanks for the fixes. I merged this, dropping the new python-debian dependency in favour of calling "dpkg-deb -f <deb> Version" instead, and with a small apt_pkg import fix.
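For reference, that version lookup needs nothing beyond a subprocess call (a sketch, the exact merged code may differ):

```python
import subprocess


def deb_version(deb_path):
    # Ask dpkg-deb for the Version field of a cached .deb, avoiding the
    # python-debian dependency.
    return subprocess.check_output(
        ['dpkg-deb', '-f', deb_path, 'Version']).decode().strip()
```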
Timings on a gedit crash against trunk (r2277) and this branch (r2280):
ubuntu@server-8149:~$ ./judge/judge ./previous_changes ./with_changes

| n | mean | sd | min | max | cmd |
|---|---|---|---|---|---|
| 50 | 8757.4ms | 315.5 | 8345.0 | 10521.3 | ./previous_changes |
| 50 | 10346.5ms | 269.1 | 9797.9 | 11263.7 | ./with_changes |

+1589.055ms (+18.1%), p=0.042
difference is significant at 95.0% confidence (p=0.042):
based on these samples, suggested sample size is n>=8 to have a 397.26ms confidence interval