Object-replicator fails to replicate objects in update_deleted

Bug #1035274 reported by Kota Tsuyuzaki
This bug affects 4 people
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Constantine Peresypkin

Bug Description

Object-replicator fails to replicate objects in update_deleted() when
the destination node doesn't have an "objects" directory. In contrast,
update() succeeds even if the directory doesn't exist.

This is because update() sends a REPLICATE request, which creates
the "objects" directory before rsync runs, but update_deleted() does not.
Object-replicator replicates handoff objects via update_deleted() before
the other objects, so replication kept failing for
an hour in my environment.
Is this expected behavior?
(Swift version is 1.5.0.)
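
The difference described above can be modeled with a toy example. This is only a sketch of the reported behavior, not Swift's code: FakeNode and both functions are hypothetical stand-ins, and the real update() does more than is shown here.

```python
class FakeNode:
    """Hypothetical stand-in for a destination storage node."""

    def __init__(self):
        self.has_objects_dir = False

    def replicate(self):
        # Per the report, handling a REPLICATE request creates the
        # "objects" directory as a side effect.
        self.has_objects_dir = True

    def rsync(self):
        # rsync into a missing target directory fails.
        return self.has_objects_dir


def update(node):
    # REPLICATE first, then rsync: works on a fresh device.
    node.replicate()
    return node.rsync()


def update_deleted(node):
    # rsync only: fails until something else creates "objects".
    return node.rsync()
```

On a fresh FakeNode, update_deleted() returns False while update() returns True, which matches the symptom in the report.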

Hugo Kou (tonytkdk) wrote :

Well, I faced the same issue.
An easy way to prevent it: I create the accounts/containers/objects folders on every newly formatted device after mounting it.

Kota Tsuyuzaki (tsuyuzaki-kota) wrote :

Hugo Kou,
    Thanks for the suggestion. I confirmed that it prevents this issue.
    However, I wonder whether this is expected behavior or not,
    and I think Swift should prevent it without requiring user operations
    (e.g. if an object-replicator catches such an rsync error in update_deleted(),
     it sends a REPLICATE request to the destination node before executing the next rsync).

description: updated
Constantine Peresypkin (constantine-q) wrote :

I can see three different ways out of this:

1. Send an empty REPLICATE request before each rsync in update_deleted()
Pros: very easy to implement
Cons: excessive load on the object server

2. Parse the rsync output to detect the missing directory, then send a REPLICATE request and run rsync once more
Pros: no load on the object server
Cons: much harder to implement: needs a stdout/stderr parser, a second rsync run and some logic

3. Change the rsync arguments so that rsync creates the target directory by itself
Pros: easy to implement, no load on the object server
Cons: needs extensive testing (it must not break existing functionality)

I'm leaning towards option 3.
But will anybody test it?
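
To make the extra logic in option 2 concrete, here is a rough sketch of the retry flow. Everything in it is an assumption for illustration: run_rsync and send_replicate are hypothetical callables supplied by the caller, and the marker string is only a guess at what the rsync error output might contain, not a verified message.

```python
def sync_with_retry(run_rsync, send_replicate,
                    missing_dir_marker="No such file or directory"):
    """Run rsync; on a missing-directory failure, REPLICATE and retry once.

    run_rsync() is assumed to return (returncode, combined_output).
    send_replicate() is assumed to make the remote node create "objects".
    """
    returncode, output = run_rsync()
    if returncode == 0:
        return True
    if missing_dir_marker not in output:
        # Some other failure: do not mask it with a retry.
        return False
    send_replicate()
    returncode, _ = run_rsync()
    return returncode == 0
```

Matching on output text is fragile, which is part of why this option is listed as harder to implement.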

You Yamagata (y-yamagata) wrote :

I guess the 3rd way may cause trouble.
Even if the device on the target storage server is unmounted because of a failure or maintenance,
rsync will succeed.
I think that is not the desired behavior.

I guess the local object-replicator can check whether the 'objects' directory exists in collect_jobs() and create it if it does not.
Of course, a remote object-replicator's rsync will fail until the local one detects and fixes this, but it will certainly get fixed.

Constantine Peresypkin (constantine-q) wrote :

Hmm, I do see your point, and it actually looks good. But this is what collect_jobs() does right now when it encounters a non-existent object dir:

    if not os.path.exists(obj_path):
        continue

This happens after the mount check.
So you can actually create the directory right there:

    if not os.path.exists(obj_path):
        mkdirs(obj_path)
        continue
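
For readers without the Swift source at hand, the idea can be sketched as a standalone loop. This is not the actual collect_jobs() code: ensure_objects_dirs and its parameters are hypothetical, os.makedirs stands in for Swift's mkdirs helper, and the mount check is injectable only so the sketch can be exercised without real mounts.

```python
import os


def ensure_objects_dirs(devices_root, device_names, is_mounted=os.path.ismount):
    """Create a missing "objects" directory on each mounted device.

    Returns the object directories that already existed and can be
    scanned for partitions in this pass.
    """
    ready = []
    for name in device_names:
        dev_path = os.path.join(devices_root, name)
        # Mount check first: never create directories on an unmounted
        # device, otherwise data would land on the root filesystem.
        if not is_mounted(dev_path):
            continue
        obj_path = os.path.join(dev_path, "objects")
        if not os.path.exists(obj_path):
            # Nothing to replicate from here yet, but create the directory
            # so that incoming rsync transfers have a target.
            os.makedirs(obj_path, exist_ok=True)
            continue
        ready.append(obj_path)
    return ready
```

Because the directory is only created after the mount check passes, an unmounted device still causes incoming rsync runs to fail, which addresses the concern raised about option 3.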

You Yamagata (y-yamagata) wrote :

I had the same implementation in mind.

I guess it is a reasonable way to fix this problem (though I have not tested it).

Constantine Peresypkin (constantine-q) wrote :

I am trying to create a good test (probe test, I suppose) that will be helpful here.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/12232

Changed in swift:
assignee: nobody → Constantine Peresypkin (constantine-q)
status: New → In Progress
Constantine Peresypkin (constantine-q) wrote :

OK, I have made the change and written a probe test that reproduces the problem fairly easily.

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/12232
Committed: http://github.com/openstack/swift/commit/73846c2c38728fb32da2b7c8540cffb738b22c42
Submitter: Jenkins
Branch: master

commit 73846c2c38728fb32da2b7c8540cffb738b22c42
Author: Constantine Peresypkin <email address hidden>
Date: Tue Aug 28 04:15:28 2012 +0300

    fix update_deleted directory creation. bug 1035274

    Change-Id: Ie3423ce90d906948a1ce2db0efe3da184e60f6e0

Changed in swift:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in swift:
milestone: none → 1.7.5
status: Fix Committed → Fix Released