Merge lp:~ev/apport/disable-core-removal into lp:~apport-hackers/apport/trunk

Proposed by Evan
Status: Rejected
Rejected by: Martin Pitt
Proposed branch: lp:~ev/apport/disable-core-removal
Merge into: lp:~apport-hackers/apport/trunk
Diff against target: 45 lines (+3/-17)
1 file modified
apport/crashdb_impl/launchpad.py (+3/-17)
To merge this branch: bzr merge lp:~ev/apport/disable-core-removal
Reviewer                  Review Type   Date Requested   Status
Martin Pitt (community)                                  Needs Information
Review via email: mp+81974@code.launchpad.net

Description of the change

Martin presumably needs a dataset of core files to verify his work on the crash signature. I need this same data to test the load on various parts of the crash database implementation.

The proposed branch disables removal of core files from bug reports.

Martin Pitt (pitti) wrote :

For the record, if we do this, we need to keep all bugs private, even the duplicates. Please note that I don't need the core dumps for implementing the new duplicate-detection algorithm, just the original Stacktrace.txt attachments. So we should not remove those; as far as I'm concerned, the core dumps can go. If you need them, we should let this run for just a few weeks, and I'd make the change locally in the data center. It's not something I actually want to commit to trunk.

Do you need the actual core dumps for anything?

Martin Pitt (pitti) :
review: Needs Information
Martin Pitt (pitti) wrote :

I applied this to the data center retracer:

--- apport/crashdb_impl/launchpad.py 2011-11-16 14:29:24 +0000
+++ apport/crashdb_impl/launchpad.py 2011-11-16 14:29:57 +0000
@@ -636,7 +636,7 @@
                     return

             for a in bug.attachments:
-                if a.title in ('CoreDump.gz', 'Stacktrace.txt',
+                if a.title in ('CoreDump.gz',
                     'ThreadStacktrace.txt', 'Dependencies.txt', 'ProcMaps.txt',
                     'ProcStatus.txt', 'Registers.txt', 'Disassembly.txt'):
                     try:

I.e., we'll retain the original Stacktrace.txt as it was produced on the client machines. This will give me the sample data for checking the "client side duplicate signature" idea we discussed. Is this also enough for you?
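
The effect of that local change can be sketched as follows. This is a minimal illustration, not the apport code itself: the names `REMOVED_TITLES` and `split_attachments` are hypothetical, and only the attachment titles are taken from the diff above.

```python
# Titles still removed by the modified data center retracer (taken from the
# diff above). 'Stacktrace.txt' is deliberately absent, so it is retained.
REMOVED_TITLES = (
    'CoreDump.gz', 'ThreadStacktrace.txt', 'Dependencies.txt',
    'ProcMaps.txt', 'ProcStatus.txt', 'Registers.txt', 'Disassembly.txt',
)


def split_attachments(titles, removed_titles=REMOVED_TITLES):
    """Return (kept, removed) lists of attachment titles, preserving order.

    Hypothetical helper: it mirrors only the title check, not the
    Launchpad calls that actually delete an attachment.
    """
    kept, removed = [], []
    for title in titles:
        if title in removed_titles:
            removed.append(title)
        else:
            kept.append(title)
    return kept, removed


kept, removed = split_attachments(
    ['CoreDump.gz', 'Stacktrace.txt', 'ThreadStacktrace.txt'])
print(kept)     # ['Stacktrace.txt']
print(removed)  # ['CoreDump.gz', 'ThreadStacktrace.txt']
```

So the core dump is still removed from the bug, while the client-side Stacktrace.txt stays attached.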

Evan (ev) wrote :

It would be good to have a large sample set of data to work with before we see how things pan out when this is in the archive. Ideally, this would be the raw .crash files, but I'm happy to reconstruct them from the constituent pieces.

I can see a few uses for this:

- Profiling the performance and overhead of the crash reporting daemon.
- Ensuring the retracing system in Cassandra and its message queues work.
- Having a set of real world data for testing the crash format parsing.

Equally, I think we can get by with just mocking this data up. So, if you're concerned about the disk space, security implications, or anything else: by all means, leave the data center retracer as you've modified it.

Cheers, and apologies for the delayed reply.

Martin Pitt (pitti) wrote :

Right, then we don't need this. We can get 833 reports which have a core dump:

  https://bugs.launchpad.net/ubuntu/+bugs?field.tag=apport-failed-retrace

Using python-apport, it's really easy to download them from Launchpad and save them as .crash files.
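
A rough sketch of that download step, under stated assumptions: `save_crash_files` is a hypothetical helper, and it only assumes that the crash database object has a `download(id)` method returning a report and that the report can `write()` itself to a binary file, as python-apport's crash database and report classes are understood to do.

```python
import os


def save_crash_files(crashdb, bug_ids, out_dir):
    """Download each report and write it to <out_dir>/<bug id>.crash.

    crashdb is assumed to offer download(id); the returned report is
    assumed to offer write(file). Returns the list of paths written.
    """
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    paths = []
    for bug_id in bug_ids:
        report = crashdb.download(bug_id)
        path = os.path.join(out_dir, '%s.crash' % bug_id)
        with open(path, 'wb') as f:
            report.write(f)
        paths.append(path)
    return paths
```

With python-apport installed, `crashdb` would presumably come from `apport.crashdb.get_crashdb()`, and `bug_ids` would be the IDs of the reports tagged apport-failed-retrace. How those IDs are collected is left out here.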

I'll just leave the Stacktrace.txt for a while then, to get some data for the duplication detection.

Unmerged revisions

2067. By Evan

Disable removing core files from bug reports to give us a data set to work with for generating a crash signature, as well as profiling the crash database.

Preview Diff

=== modified file 'apport/crashdb_impl/launchpad.py'
--- apport/crashdb_impl/launchpad.py 2011-11-01 19:55:36 +0000
+++ apport/crashdb_impl/launchpad.py 2011-11-11 12:19:31 +0000
@@ -410,14 +410,7 @@
 bug = self.launchpad.bugs[id]
 break

- # remove core dump if stack trace is usable
 if report.has_useful_stacktrace():
- for a in bug.attachments:
- if a.title == 'CoreDump.gz':
- try:
- a.removeFromBug()
- except HTTPError:
- pass # LP#249950 workaround
 try:
 task = self._get_distro_tasks(bug.bug_tasks).next()
 task.importance = 'Medium'
@@ -636,9 +629,9 @@
 return

 for a in bug.attachments:
- if a.title in ('CoreDump.gz', 'Stacktrace.txt',
- 'ThreadStacktrace.txt', 'Dependencies.txt', 'ProcMaps.txt',
- 'ProcStatus.txt', 'Registers.txt', 'Disassembly.txt'):
+ if a.title in ('Stacktrace.txt', 'ThreadStacktrace.txt',
+ 'Dependencies.txt', 'ProcMaps.txt', 'ProcStatus.txt',
+ 'Registers.txt', 'Disassembly.txt'):
 try:
 a.removeFromBug()
 except HTTPError:
@@ -721,13 +714,6 @@
 task.lp_save()
 bug.newMessage(content=invalid_msg,
 subject='Crash report cannot be processed')
-
- for a in bug.attachments:
- if a.title == 'CoreDump.gz':
- try:
- a.removeFromBug()
- except HTTPError:
- pass # LP#249950 workaround
 else:
 if 'apport-failed-retrace' not in bug.tags:
 bug.tags = bug.tags + ['apport-failed-retrace'] # LP#254901 workaround
