'make build' failed during the rollout when importing zope.security.management

Bug #575037 reported by Björn Tillenius
This bug affects 1 person
Affects           Status        Importance  Assigned to  Milestone
Launchpad itself  Fix Released  High        Gary Poster

Bug Description

During the last rollout (2010-05-04), the 'make build' step failed on some machines: importing zope.security.management raised an error saying it couldn't import interfaces (on the line 'from zope.security import interfaces').

Here are the log files from building on the importds:

  https://pastebin.canonical.com/31874/

This error also happened on forster. Doing a 'make clean; make build' made it work, but we need to find out why this happened, and prevent it from happening again.

Gary Poster (gary) wrote :

I tried to reproduce this by building in a way progressively closer to how the deployment does it, in consultation with mthaddon, but everything worked fine for me locally (last attempt: https://pastebin.canonical.com/32158/ ).

Then we hit on this (edited):

[12:20pm] mthaddon: gary: those two servers are x86, fwiw
[12:20pm] mthaddon: gary: as opposed to most which are amd64
[12:20pm] gary: mthaddon: ah-hah. So, all the 32 bit machines were unhappy, all the 64 bit were happy?
[12:21pm] mthaddon: gary: it does seem that way, yeah (forster is 32 bit as well)
[12:21pm] mars: gary, mthaddon, are the eggs you are building and moving around architecture independent?
[12:22pm] gary: mars, no, dependent, but eggs have identifiers for the architecture. The 32 bit machines build their own eggs
[12:22pm] gary: or are supposed to, and have done so in the past

Since 32 bit/64 bit seems implicated, I'll try seeing if I can dupe this locally with virtual machines.

Gary Poster (gary) wrote :

I created a 32 bit Lucid virtual machine. I made the eggs on a 64 bit machine and scp'd them over. I then used that egg directory to build Launchpad on the 32 bit machine.

It generated no errors.

While further local investigation is possible--I could try setting up a 64 bit Hardy virtual machine and a 32 bit Hardy virtual machine and get the precise list of installed debs on the two and try to duplicate again--I think it is time, if not past time, to be able to actually look at an instance of a failed build in production.

I will request that the LOSAs somehow give me an instance of one of these failed builds for me to investigate via a screen session.

Gary Poster (gary) wrote :

The LOSAs gave me a failed build to look at. The surface-level problem can actually be seen in the traceback:

File "/home/pqm/for_rollouts/production/eggs/zope.security-3.7.1-py2.5-linux-x86_64.egg/zope/security/management.py", line 22, in <module>
ImportError: cannot import name interfaces

Notice that egg directory: zope.security-3.7.1-py2.5-linux-x86_64.egg . The x86_64 suffix indicates that this egg was built for a 64 bit machine. However, it is being run on a 32 bit machine.
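
The check described above, reading the platform suffix off the egg name and comparing it with the machine doing the build, can be sketched in a few lines of Python. This is only an illustration, not the fix that was applied: the helper names `egg_platform` and `mismatched_eggs` are hypothetical, and the pattern assumes the usual setuptools naming of `<name>-<version>-py<X.Y>[-<platform>].egg`.

```python
import re

# Assumed egg naming convention (setuptools):
#   <name>-<version>-py<X.Y>[-<platform>].egg
# Pure-Python eggs carry no platform suffix.
EGG_RE = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)-py(?P<pyver>\d+\.\d+)"
    r"(?:-(?P<platform>.+))?\.egg$"
)


def egg_platform(egg_name):
    """Return the platform tag of an egg name, or None for pure-Python eggs."""
    match = EGG_RE.match(egg_name)
    if match is None:
        raise ValueError("not an egg name: %r" % egg_name)
    return match.group("platform")


def mismatched_eggs(egg_names, host_platform):
    """Return the eggs whose platform tag differs from host_platform."""
    return [
        name
        for name in egg_names
        if egg_platform(name) not in (None, host_platform)
    ]


# The egg from the traceback, checked against a 32 bit Linux host tag.
eggs = [
    "zope.security-3.7.1-py2.5-linux-x86_64.egg",
    "setuptools-0.6c11-py2.5.egg",
]
print(mismatched_eggs(eggs, "linux-i686"))
```

On a real machine the host tag could be taken from `sysconfig.get_platform()`, which on Linux returns a string of the form `linux-x86_64`; whether that string matches the egg suffix exactly on every platform is an assumption that would need checking.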

I did not determine the underlying piece of software that does not detect the problem and force a rebuild. Discovering that does not seem valuable at this time, since we want to change the way we generate these eggs anyway, soon.

Instead, the LOSAs and I agreed that I would create a different target for the build machine. Instead of running ``make bin/buildout && bin/buildout`` on that machine, I will make a "build_eggs" target, so they will run ``make build_eggs`` instead. Its implementation should look something like this: http://paste.ubuntu.com/439560/

Doing this adds about 5.671 seconds to the buildout run per machine, if my instance is any indication.

Gary Poster (gary) wrote :

This is a patch against rev 9346 of production-devel:

http://pastebin.ubuntu.com/439584/

Stefan Stasik (stefan-stasik) wrote :

Hello: I am posting the 2 patches that are needed on LP production to fix this issue. Gary is landing this patch into LP Devel. If a CP happens before the next rollout, gary will submit to LP prod-devel, and these changes would need to happen then.

Here are the two patch diffs:

https://pastebin.canonical.com/32708/

-- Stefan

Gary Poster (gary)
Changed in launchpad-foundations:
status: Triaged → Fix Released
status: Fix Released → Fix Committed
tags: added: qa-ok
Ursula Junque (ursinha) wrote : Bug fixed by a commit
Changed in launchpad-foundations:
milestone: 10.04 → 10.05
tags: added: qa-needstesting
removed: qa-ok
Tom Haddon (mthaddon)
tags: added: canonical-losa-lp
Gary Poster (gary)
tags: added: qa-ok
removed: qa-needstesting
Curtis Hovey (sinzui)
Changed in launchpad-foundations:
status: Fix Committed → Fix Released