Merge lp:~brian-murray/apport/zgrep-fallback into lp:~apport-hackers/apport/trunk

Proposed by Brian Murray
Status: Merged
Approved by: Martin Pitt
Approved revision: 3107
Merged at revision: 3106
Proposed branch: lp:~brian-murray/apport/zgrep-fallback
Merge into: lp:~apport-hackers/apport/trunk
Diff against target: 36 lines (+16/-3)
1 file modified
backends/packaging-apt-dpkg.py (+16/-3)
To merge this branch: bzr merge lp:~brian-murray/apport/zgrep-fallback
Reviewer Review Type Date Requested Status
Martin Pitt (community) Approve
Review via email: mp+310218@code.launchpad.net

Description of the change

The production retracers for the Error Tracker were OOM'ing regularly when trying to use zgrep to search Contents.gz for files found in the crash report. While zgrep is faster than using gzip and reading the file line by line this still seems like a good fallback option and is better than having the retrace process crash, I've implemented the proposed change in the production version of the Error Tracker and have encountered no issues with it.

As mentioned this could likely be better:

+ try:
+ line = line.decode('UTF-8').rstrip('\n')
+ # 2016-11-01 this should be better
+ except UnicodeDecodeError:
+ continue

I added because of the following lines in Contents.gz for yakkety:

 $ zgrep -a "lenska.alias" /mnt/storage/archive-mirror/dists/yakkety/Contents-amd64.gz
usr/lib/aspell/�slenska.alias universe/text/aspell-is

Thanks!

To post a comment you must log in.
Revision history for this message
Martin Pitt (pitti) wrote :

I can't say that I like having two code paths for the same thing, but I accept that it's necessary. (Still, if these machines are that tight on RAM, can we increase that a bit?)

I would like this to be a bit faster and simpler though, see inline comment. Thanks!

3107. By Brian Murray

improve unicode handling after pitti's feedback

Revision history for this message
Brian Murray (brian-murray) wrote :

I've made the changes you've suggested, thanks!

Revision history for this message
Martin Pitt (pitti) wrote :

LGTM now, thanks! Please merge.

review: Approve
Revision history for this message
Brian Murray (brian-murray) wrote :

I don't have permission to commit to the apport project.

Revision history for this message
Martin Pitt (pitti) wrote :

Uh, what? Time to change that ☺ You are a member now.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'backends/packaging-apt-dpkg.py'
--- backends/packaging-apt-dpkg.py 2016-08-13 07:09:38 +0000
+++ backends/packaging-apt-dpkg.py 2016-11-08 19:25:48 +0000
@@ -13,6 +13,7 @@
13# the full text of the license.13# the full text of the license.
1414
15import subprocess, os, glob, stat, sys, tempfile, shutil, time15import subprocess, os, glob, stat, sys, tempfile, shutil, time
16import errno
16import hashlib17import hashlib
17import json18import json
1819
@@ -1221,9 +1222,21 @@
12211222
1222 # zgrep is magnitudes faster than a 'gzip.open/split() loop'1223 # zgrep is magnitudes faster than a 'gzip.open/split() loop'
1223 package = None1224 package = None
1224 zgrep = subprocess.Popen(['zgrep', '-m1', '^%s[[:space:]]' % file, map],1225 try:
1225 stdout=subprocess.PIPE, stderr=subprocess.PIPE)1226 zgrep = subprocess.Popen(['zgrep', '-m1', '^%s[[:space:]]' % file, map],
1226 out = zgrep.communicate()[0].decode('UTF-8')1227 stdout=subprocess.PIPE, stderr=subprocess.PIPE)
1228 out = zgrep.communicate()[0].decode('UTF-8')
1229 except OSError as e:
1230 if e.errno != errno.ENOMEM:
1231 raise
1232 file_b = file.encode()
1233 import gzip
1234 with gzip.open('%s' % map, 'rb') as contents:
1235 out = ''
1236 for line in contents:
1237 if line.startswith(file_b):
1238 out = line
1239 break
1227 # we do not check the return code, since zgrep -m1 often errors out1240 # we do not check the return code, since zgrep -m1 often errors out
1228 # with 'stdout: broken pipe'1241 # with 'stdout: broken pipe'
1229 if out:1242 if out:

Subscribers

People subscribed via source and target branches