Code review comment for lp:~martin-lp/hipl/n900-build-fix

David Martin (martin-lp) wrote :

Hi,
I did some more research on this and actually found something.

On Tue, Sep 27, 2011 at 9:51 AM, Diego Biurrun <email address hidden> wrote:
> On Mon, Sep 26, 2011 at 01:56:24PM +0000, David Martin wrote:
>>
>> On Thu, Sep 22, 2011 at 5:35 PM, Diego Biurrun <email address hidden> wrote:
>> > review reject
>> >
>> > I think you should really give this another try and dig deeper. Find out
>> > which function is failing with some printfs, debug some more.
>>
Like I said, it's RSA_generate_key() from the OpenSSL library that fails.
>
> As a next step, print out the values that we pass to that function, maybe
> some are invalid, NULL or whatever. If pointers are passed, dereference
> them and print their values, maybe they point at invalid values.

We don't pass any pointers (well, only NULL) and our values are fine. :)

On Mon, Sep 26, 2011 at 4:32 PM, Christof Mroz <email address hidden> wrote:
> Can you dump the rlimits inside the test? See getrlimit(2).

Any specific rlimit you had in mind? RLIMIT_CPU (CPU time) does not seem to be
set, and limiting it by hand (e.g. to one second, while choosing a stronger key)
results in a graceful exit with the respective signal (CPU time exceeded).
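
For reference, a minimal sketch of how all limits could be dumped in one go from
inside the test environment. It uses Python's resource module, a thin wrapper
around getrlimit(2), instead of C purely for brevity. The helper names
dump_rlimits and fmt are made up for illustration, and which RLIMIT_* constants
exist depends on the platform.

```python
import resource


def dump_rlimits():
    """Return {name: (soft, hard)} for every RLIMIT_* constant on this platform."""
    limits = {}
    for name in sorted(dir(resource)):
        if not name.startswith("RLIMIT_"):
            continue
        # getrlimit() returns a (soft, hard) tuple for the given resource.
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (soft, hard)
    return limits


def fmt(value):
    """Render RLIM_INFINITY as 'unlimited', everything else as a number."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for name, (soft, hard) in dump_rlimits().items():
        print(f"{name:18} soft={fmt(soft):>12} hard={fmt(hard):>12}")
```

If RLIMIT_CPU shows up as unlimited here as well, that would support the
observation that no CPU time limit is in effect.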

On Mon, Sep 26, 2011 at 4:32 PM, Christof Mroz <email address hidden> wrote:
> If all else fails, you may try building OpenSSL yourself with debug symbols
> so you can circumscribe the culprit even further (or learn ASM :) ). Also,
> did you try strace yet?

I did not try the newer OpenSSL yet, but strace is neat and helped to narrow
down the problem.

Both the N900 binary and the version compiled on passion set up SIGALRM handling
with rt_sigaction (SIGALRM is what I think is responsible for killing it).

In addition, the N900 binary arms a 3-second timer with setitimer. The timer
sends SIGALRM when it reaches zero, and this is what happens right before it
breaks.
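
To illustrate the mechanism, here is a small sketch of a one-shot ITIMER_REAL
timer delivering SIGALRM while some work is still running. It is written with
Python's signal module (which exposes setitimer) rather than C for brevity, the
function name wait_for_alarm is hypothetical, and the short timeouts are chosen
only to keep the demo quick. It is not taken from the traces.

```python
import signal
import time


def wait_for_alarm(timeout_s, work_s):
    """Arm ITIMER_REAL for timeout_s seconds, then 'work' for up to work_s seconds.

    Returns True if SIGALRM arrived before the work finished.
    """
    fired = []

    def on_alarm(signum, frame):
        fired.append(signum)

    old_handler = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout_s)  # one-shot timer
    try:
        deadline = time.monotonic() + work_s
        while not fired and time.monotonic() < deadline:
            time.sleep(0.01)  # stand-in for the long-running key generation
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        # Restore the previous disposition (fall back to the default action).
        signal.signal(
            signal.SIGALRM,
            old_handler if old_handler is not None else signal.SIG_DFL,
        )
    return bool(fired)


if __name__ == "__main__":
    print("timer shorter than work:", wait_for_alarm(0.2, 1.0))
    print("timer longer than work: ", wait_for_alarm(5.0, 0.2))
```

With a handler installed the process survives and merely records the signal.
With the default disposition, the same expiry terminates the process, which
would be consistent with a test dying once key generation takes longer than the
timer allows.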

I'm not sure yet who is responsible for setting the timer. Is this
platform-specific? I think Android does something similar, where a process gets
killed if it is unresponsive for more than a few seconds.

Should I upload the traces? I don't see a button for this in Launchpad (though
I'm pretty sure I have done that before).
