Hang on first collection-status
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Duplicity | Fix Released | High | Unassigned |
Bug Description
Duplicity: 0.6.01
Python: 2.6.2+
OS: Ubuntu 9.04
When running duplicity 0.6.01 with "collection-status" for the first time, it hangs after displaying the status of the collection. There is no CPU or network activity while it hangs. If interrupted, "collection-status" works normally the second time.
Cristian Klein (cristiklein) wrote : | #1 |
Kenneth Loafman (kenneth-loafman) wrote : | #2 |
Is there any way to reproduce this? Does it happen all the time, or just sometimes?
I'm having a hard time replicating the problem.
Cristian Klein (cristiklein) wrote : Re: [Bug 394629] Re: Hang on first collection-status | #3 |
Kenneth Loafman wrote:
> Is there any way to reproduce this? Does it happen all the time, or
> just sometimes?
>
> I'm having a hard time replicating the problem.
Try "rm -rf ~/.cache/duplicity", then run duplicity without
"--archive-dir" or "--name".
Kenneth Loafman (kenneth-loafman) wrote : | #4 |
No problem with that. It just recreates the dirs, does the resync, then
the collection status, no delay anywhere.
...Ken
Cristian Klein (cristiklein) wrote : | #5 |
Strange. I can reproduce it every time. At the end, I have three GnuPG
processes which are:
gpg --status-fd 4 --passphrase-fd 8 --logger-fd 3 --batch --no-tty
--no-secmem-warning --use-agent --decrypt
gpg --status-fd 4 --passphrase-fd 9 --logger-fd 3 --batch --no-tty
--no-secmem-warning --use-agent --decrypt
gpg --status-fd 4 --passphrase-fd 10 --logger-fd 3 --batch --no-tty
--no-secmem-warning --use-agent --decrypt
If I kill them by hand, duplicity resumes.
I also tried removing everything duplicity-related and installing it
from the PPA, but the problem persists.
Kenneth Loafman (kenneth-loafman) wrote : | #6 |
If you start off with a working archive, does it do the same thing? In
other words, if the first run fails and you rerun, does the second run
fail as well? Or, is it just the first run with an empty archive?
Cristian Klein (cristiklein) wrote : | #7 |
It's just the first run with an empty archive. The second time it runs
flawlessly.
Kenneth Loafman (kenneth-loafman) wrote : | #8 |
- duplicity-coll-status-hang.patch Edit (1.2 KiB, text/x-patch; name="duplicity-coll-status-hang.patch")
Here's a patch, no guarantees, but it will change the order of the
initial collection status and sync. It's possible you've encountered
some sort of race condition, but who knows.
Apply this to /usr/bin/duplicity, or wherever you have duplicity.
Cristian Klein (cristiklein) wrote : | #9 |
Sorry, it doesn't solve the problem. Same effect: 3 GPG processes which,
if killed, duplicity resumes.
Kenneth Loafman (kenneth-loafman) wrote : | #10 |
OK, I will chase it some more. The 3 GPG processes are normal and pretty
much stay around through the whole run. When there are many more than
that, it's a problem.
Do you have any idea what the '+' represents on the end of Python 2.6.2+
that you use? Is it normal Python?
Cristian Klein (cristiklein) wrote : | #11 |
I used a Python version from the Debian repository which corrected some
bugs in 2.6.2, so the version returned was "2.6.2+". Now I am using
Ubuntu-shipped 9.04 Python, with version "2.6.2", so it can't be
Python's fault.
Could it perhaps be due to some backup-host-specific properties? I run a
"restricted" SFTP server on it. However, all other duplicity operations
run flawlessly.
Peter Schuller (scode) wrote : | #12 |
I saw this earlier today too the first time I ran collection status after upgrading from 0.5.18 to 0.6.01 on Python 2.5 and FreeBSD.
For some reason I am unable to reproduce it again, even after nuking ~/.cache. Possibly because I have tacked a couple of 0.6.01-produced backup increments onto the backups.
Kenneth Loafman (kenneth-loafman) wrote : | #13 |
It looks like something is waiting: an I/O operation, a thread,
something. If we could reproduce it, we could find it.
In a cron environment, a hang like this could be a real problem. It
could hang for days without getting noticed.
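A simple guard against exactly this failure mode in cron is to cap the job's runtime. This is a minimal sketch using Python's standard subprocess module, not part of duplicity; the helper name and the timeout value are hypothetical:

```python
import subprocess

def run_with_timeout(cmd, timeout_secs):
    """Run cmd, killing it if it is still running after timeout_secs.

    Returns the exit code, or -1 if the command had to be killed.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_secs).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so a hung
        # backup cannot linger for days unnoticed; the cron mail then
        # shows a failure instead of silence.
        return -1
```

A cron wrapper could then call, say, `run_with_timeout(["duplicity", "collection-status", ...], 3600)` and alert on a -1 result.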
Peter Schuller (scode) wrote : | #14 |
For the record, I was able to trigger it again with my backup to rsync.net (the other one I tried was my S3 backup). So whatever the problem is, it definitely crosses backends.
As per the previous discussion, I too saw a GPG child process. Hitting ctrl-c yielded:
gpg: signal Interrupt caught ... exiting
Exception exceptions.
I'll see if I can indeed reproduce it reliably and hopefully debug it.
Peter Schuller (scode) wrote : | #15 |
After inserting a printout on launching gpg (the '!!!....') and one on the wait() in GPGFile.close(), and comparing with pstree output, it looks like it is the GPG process from the *FIRST* downloaded file (in my case) that hangs around; the BEGIN WAIT/END WAIT pair is never seen for that one:
Synchronizing remote metadata to local cache...
Copying duplicity-
Using temporary directory /tmp/duplicity-
Running 'sftp -oServerAliveIn
sftp command: 'get home_scode/
!!!!!!!
Copying duplicity-
Running 'sftp -oServerAliveIn
sftp command: 'get home_scode/
!!!!!!!
BEGIN WAIT
END WAIT
Deleting /tmp/duplicity-
Copying duplicity-
Running 'sftp -oServerAliveIn
sftp command: 'get home_scode/
!!!!!!!
BEGIN WAIT
END WAIT
.....
The first file is also never deleted from temp. This may be related. The close of the gpg process is in this case supposed to happen as a result of SrcIter's next() method in copy_to_local() in duplicity-bin. It gets closed when you reach EOF on the file being decrypted, right before StopIteration is raised.
The file is probably not being read to EOF, which might also explain why the temp file is not deleted. The would-be "Deleted ..." message comes from path.py and, when actually called, is reached via this path:
Traceback (most recent call last):
File "./duplicity-bin", line 1146, in <module>
with_
File "./duplicity-bin", line 1139, in with_tempdir
fn()
File "./duplicity-bin", line 1044, in main
sync_
File "./duplicity-bin", line 895, in sync_archive
copy_
File "./duplicity-bin", line 865, in copy_to_local
copy_
File "./duplicity-bin", line 811, in copy_raw
data = src_iter.
File "./duplicity-bin", line 843, in next
self.
File "/usr/home/
hook()
File "/usr/home/
path.
File "/usr/home/
raise AssertionError(
(Moving to debugging another bug for a while, so wanted to dump this in the mean time before I forget the details)
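The close-on-EOF pattern described in comment #15 can be sketched like this. It is a simplified stand-in for duplicity's actual SrcIter, with a plain in-memory file in place of the gpg pipe:

```python
import io

class SrcIter:
    """Toy version of the source iterator in copy_to_local():
    the underlying file is only closed once EOF has been reached."""
    BLOCKSIZE = 4

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        data = self.fileobj.read(self.BLOCKSIZE)
        if not data:
            # In the real code this is where GPGFile.close() runs
            # and wait()s on the gpg child.
            self.fileobj.close()
            self.closed = True
            raise StopIteration
        return data

drained = SrcIter(io.BytesIO(b"ciphertext"))
list(drained)              # read to EOF, so close() runs
partial = SrcIter(io.BytesIO(b"ciphertext"))
next(partial)              # stops early: close() never runs
```

If the caller never drives the iterator to EOF, the close (and hence the wait() on gpg) never happens, which matches the lingering gpg process for the first downloaded file.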
Kenneth Loafman (kenneth-loafman) wrote : | #16 |
Hangs also on cleanup, remove-older-than, essentially any first use of a new archive dir.
Changed in duplicity: | |
importance: | Undecided → High |
status: | New → Confirmed |
milestone: | none → 0.6.02 |
Kenneth Loafman (kenneth-loafman) wrote : | #17 |
I just pushed the fix to this bug to the trunk. Thanks Peter for the initial analysis. The bug was caused by the early out in GPG/GZIPWriteFile() that stops at a certain volume size. I had removed the size = sys.maxint args in to_remote() and to_final() after the change to block size. My mistake. GZipWriteFile() was taking an early out at 5MB and never finishing the iteration.
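The early-out described above can be illustrated with a toy writer. This is not the real GPG/GZipWriteFile code, just a sketch of the failure: the volume-size cutoff becomes limit_bytes, and the missing size = sys.maxint corresponds to passing an effectively unlimited limit:

```python
def write_file(src_iter, limit_bytes):
    """Copy blocks from src_iter until limit_bytes have been written.

    The limit exists so output can be split into volumes. Callers that
    want the whole stream must pass an effectively unlimited size;
    otherwise the source is never drained, and whatever is feeding
    src_iter (here, gpg) blocks forever.
    """
    written = 0
    out = []
    for block in src_iter:
        out.append(block)
        written += len(block)
        if written >= limit_bytes:
            break  # early out: the iteration is never finished
    return b"".join(out)

blocks = [b"x" * 10] * 5                        # a 50-byte "stream"
short = write_file(iter(blocks), 25)            # stops early, source not drained
whole = write_file(iter(blocks), float("inf"))  # drains all 50 bytes
```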
Changed in duplicity: | |
status: | Confirmed → Fix Committed |
Cristian Klein (cristiklein) wrote : | #18 |
Hello,
I'm sorry to tell you, but I still experience this bug even with the
latest duplicity from bzr #553. I made sure that all previous traces of
duplicity were erased and that the patch in dup_temp.py was correctly
applied.
Kenneth Loafman (kenneth-loafman) wrote : | #19 |
I just pushed another release with, I hope, the final fix. It turns out
that this one was not findable with a debugger; in fact, it would pass
completely when run under a debugger. Aargh!
If you could please test this out in your environment, I could get a
release out soon.
Cristian Klein (cristiklein) wrote : | #20 |
Good work! I downloaded the latest bzr and the hang does not occur any
more. Thank you for your patience!
Changed in duplicity: | |
status: | Fix Committed → Fix Released |