> > [1]
> >
> > + create_checkpoint_nonce()
> > + # Delete old nonces.
> > + checkpoint = find_checkpoint_nonce()
> > + if checkpoint is None:
> > + return 0
> > + return delete_old_nonces(checkpoint)
> >
> > Seems odd to create the new nonce before deleting; there is a very
> > slow race condition there.
>
> Where do you see the race condition exactly?
In cleanup_old_nonces():

+ create_checkpoint_nonce()
+ # Delete old nonces.
+ checkpoint = find_checkpoint_nonce()

If there's a delay between create_checkpoint_nonce() and
find_checkpoint_nonce() of more than timestamp_threshold - which I
grant is *extremely* unlikely - then it'll delete the nonce just
created.
However, it doesn't matter anyway: by then it'll be time to delete the
new nonce! So, it's fine to leave it as it is.
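To make that window concrete, here's a plain-Python sketch of the logic (no ORM; the in-memory store, the 300-second timestamp_threshold, and the `delay` parameter are all stand-ins I've made up to model the gap between creation and lookup, not code from the patch):

```python
import itertools

TIMESTAMP_THRESHOLD = 300  # seconds; placeholder for the real setting
_ids = itertools.count(1)

def create_checkpoint_nonce(store, now):
    store.append({"id": next(_ids), "timestamp": now})

def find_checkpoint_nonce(store, now):
    # Newest nonce already older than the threshold, if any.
    old = [n for n in store if now - n["timestamp"] > TIMESTAMP_THRESHOLD]
    return max(old, key=lambda n: n["id"]) if old else None

def delete_old_nonces(store, checkpoint):
    doomed = [n for n in store if n["id"] <= checkpoint["id"]]
    for n in doomed:
        store.remove(n)
    return len(doomed)

def cleanup_old_nonces(store, now, delay=0.0):
    # `delay` models time passing between creation and lookup.
    create_checkpoint_nonce(store, now)
    checkpoint = find_checkpoint_nonce(store, now + delay)
    if checkpoint is None:
        return 0
    return delete_old_nonces(store, checkpoint)

store = []
# Normal case: no delay, so the fresh checkpoint nonce survives.
cleanup_old_nonces(store, now=1000.0)
assert len(store) == 1
# Pathological case: a delay longer than the threshold deletes it -
# but by then the nonce is past the threshold and due for deletion anyway.
store = []
cleanup_old_nonces(store, now=1000.0, delay=TIMESTAMP_THRESHOLD + 1)
assert len(store) == 0
```

As the simulation shows, the only way the fresh nonce gets deleted is if it has already aged past the threshold, which is exactly the point above.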
>
> > [2]
> >
> > +def delete_old_nonces(checkpoint):
> > + """Delete nonces older than the given nonce."""
> > + nonce_to_delete = Nonce.objects.filter(id__lte=checkpoint.id)
> > + count = nonce_to_delete.count()
> > + nonce_to_delete.delete()
> > + return count
> >
> > Do we need the count?
>
> Well, we could do without, but it's pretty cheap, and I think it's pretty
> useful to give the caller some feedback about the actual clean-up (i.e. did it
> really happen or not); especially considering the two-phase strategy.
The count is not too bad as long as indexes are being used - see
http://wiki.postgresql.org/wiki/Slow_Counting - but note the caveat:
PostgreSQL will still need to read the resulting rows to verify that
they exist; other database systems may only need to reference the
index in this situation.
I guess it's going to be visiting those rows again very soon to delete
them, so they'll be hot and the delete will be quicker, but that might
be a false assumption. When regularly deleting nonces it probably
won't be a significant overhead, but the first run of this code might
be punishing.
PostgreSQL does return a count of deleted rows, eliminating the need
for a separate count, which might be a way to mitigate this. I wonder
if Django surfaces that in some way? See
http://www.postgresql.org/docs/9.1/interactive/sql-delete.html.
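For what it's worth, the database driver already reports the deleted-row count, so the separate count query could be dropped. A standalone sqlite3 sketch (a stand-in schema, not the real Nonce model) of the idea:

```python
import sqlite3

# The table and data here are invented for illustration; the point is
# that DELETE itself reports how many rows it removed, so no separate
# SELECT COUNT(*) round-trip is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nonce (id INTEGER PRIMARY KEY, key TEXT)")
conn.executemany("INSERT INTO nonce (key) VALUES (?)",
                 [("k%d" % i,) for i in range(10)])

checkpoint_id = 6
cur = conn.execute("DELETE FROM nonce WHERE id <= ?", (checkpoint_id,))
print(cur.rowcount)  # 6 - rows deleted, straight from the DELETE
```

On the Django side: if the version in use is 1.9 or later, QuerySet.delete() returns a (total, per-model-dict) tuple, which would answer the question directly - worth checking against the deployed version.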
Another thought: key__endswith probably creates a LIKE '%SUFFIX'
clause, which can be very inefficient. I suggest talking to Jeroen or
Stuart Bishop about this. Can this code be changed to use a prefix
match instead? Can you check that an index on Nonce.key exists, and
that it supports prefix (or suffix) matches?
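One common mitigation, if a prefix match isn't possible, is a reversed-key column: a suffix match on the key is a prefix match on the reversed key, and a prefix match is what an ordinary B-tree index can serve (in PostgreSQL, an expression index on reverse(key) would do). A minimal sketch of the equivalence, with an invented key format:

```python
def suffix_to_prefix_match(key, suffix):
    """A suffix match on `key` is equivalent to a prefix match on the
    reversed key - the form an ordinary B-tree index can accelerate."""
    return key[::-1].startswith(suffix[::-1])

# Hypothetical nonce key, purely for illustration.
key = "consumer/abc123"
assert key.endswith("abc123") == suffix_to_prefix_match(key, "abc123")
assert key.endswith("zzz") == suffix_to_prefix_match(key, "zzz")
```

In SQL terms this turns LIKE '%SUFFIX' into LIKE reverse('SUFFIX') || '%' against the reversed column, which the planner can satisfy from the index instead of a full scan.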