Code review comment for lp:~raharper/curtin/trunk.nvme_bcache

Revision history for this message
Ryan Harper (raharper) wrote :

On Tue, Apr 11, 2017 at 9:17 AM, Scott Moser <email address hidden> wrote:

> Some comments inline.
> The unit tests seem really tied to the implementation (mocking
> os.path.exists which is called by 2 different callers).
>
> I realize this is tricky, but maybe some re-work could make the tests more
> clear.
>

I'm open to reworking them. Not 100% happy with the current coverage.

>
>
> A test for get_bcache_sys_path would be good.
>

Pretty sure I have that.

>
>
> Diff comments:
>
> > === modified file 'curtin/block/clear_holders.py'
> > --- curtin/block/clear_holders.py 2017-01-31 22:03:02 +0000
> > +++ curtin/block/clear_holders.py 2017-04-10 18:04:42 +0000
> > @@ -56,32 +56,75 @@
> >
> > def get_bcache_using_dev(device):
> > """
> > - Get the /sys/fs/bcache/ path of the bcache volume using specified
> device
> > + Get the /sys/fs/bcache/ path of the bcache cache device bound to
> > + specified device
> > """
> > # FIXME: when block.bcache is written this should be moved there
> > sysfs_path = block.sys_block_path(device)
> > return os.path.realpath(os.path.join(sysfs_path, 'bcache',
> 'cache'))
> >
> >
> > +def get_bcache_sys_path(device):
> > + """
> > + Get the /sys/class/block/<device>/bcache path if present
> > + """
> > + sysfs_path = block.sys_block_path(device)
> > + return os.path.realpath(os.path.join(sysfs_path, 'bcache'))
>
> The comment says "if present", but its not clear what the return would be.
> If the *device* does not exist (get_bcache_sys_path("bogusdev1")) then
> you'll
> raise an OSError, but if the device exists and the 'bcache' entry does not
> you'll just return it either way.
>

> >>> os.path.realpath("/bogus/path/here")
> '/bogus/path/here'
>

Ack, the comment was really from the caller;
I'll rework as I'd like it validate the path before the realpath call as
well.

>
> > +
> > +
> > def shutdown_bcache(device):
> > """
> > Shut down bcache for specified bcache device
> > +
> > + 1. Stop the cacheset that `device` is connected to
> > + 2. Stop the 'device'
> > """
> > + if not device.startswith('/sys/class/block'):
>
> we use the string '/sys/class/block' other places, we really should
> SYS_CLASS_BLOCK = '/sys/block'
> not "your fault", but i think having that constant would be good.
>

ACK, though I'd like to punt on the general cleanup into a separate MP.

>
> > + raise ValueError('Invalid Device (%s): '
> > + 'Device path must start with
> /sys/class/block/',
> > + device)
> > +
> > bcache_shutdown_message = ('shutdown_bcache running on {} has
> determined '
> > 'that the device has already been shut
> down '
> > 'during handling of another bcache dev. '
> > 'skipping'.format(device))
> > - if not os.path.exists(device):
> > - LOG.info(bcache_shutdown_message)
> > - return
> > -
> > - bcache_sysfs = get_bcache_using_dev(device)
> > - if not os.path.exists(bcache_sysfs):
> > - LOG.info(bcache_shutdown_message)
> > - return
> > -
> > - LOG.debug('stopping bcache at: %s', bcache_sysfs)
> > - util.write_file(os.path.join(bcache_sysfs, 'stop'), '1', mode=None)
> > +
> > + if not os.path.exists(device):
> > + LOG.info(bcache_shutdown_message)
> > + return
> > +
> > + # stop cacheset if it exists
> > + bcache_cache_sysfs = get_bcache_using_dev(device)
> > + if not os.path.exists(bcache_cache_sysfs):
> > + LOG.info('bcache cacheset already removed: %s',
> > + os.path.basename(bcache_cache_sysfs))
> > + else:
> > + LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
> > + util.write_file(os.path.join(bcache_cache_sysfs, 'stop'),
> > + '1', mode=None)
> > + try:
> > + util.wait_for_removal(bcache_cache_sysfs)
> > + except OSError:
> > + LOG.info('Failed to stop bcache cacheset %s',
> bcache_cache_sysfs)
> > + raise
> > +
> > + # after stopping cache set, we may need to stop the device
> > + if not os.path.exists(device):
> > + LOG.info('bcache backing device already removed: %s', device)
> > + return
> > + else:
> > + bcache_block_sysfs = get_bcache_sys_path(device)
> > + LOG.info('stopping bcache backing device at: %s',
> bcache_block_sysfs)
> > + util.write_file(os.path.join(bcache_block_sysfs, 'stop'),
> > + '1', mode=None)
>
> mostly unrelated, but in cloud-init's write_file, i've wanted to have
> a option to use perms of the file if it already exists.
> just a comment here really.
>

IIUC, you want to set mode for create?

>
> > + try:
> > + util.wait_for_removal(device)
> > + except OSError:
> > + LOG.info('Failed to stop bcache backing device %s',
> > + bcache_block_sysfs)
> > + raise
> > +
> > + return
> >
> >
> > def shutdown_lvm(device):
> >
> > === modified file 'curtin/util.py'
> > --- curtin/util.py 2017-04-07 20:12:45 +0000
> > +++ curtin/util.py 2017-04-10 18:04:42 +0000
> > @@ -194,6 +194,27 @@
> > return _subp(*args, **kwargs)
> >
> >
> > +def wait_for_removal(path, retries=[1, 3, 5, 7]):
> > + if not path:
> > + raise ValueError('wait_for_removal: missing path parameter')
> > +
> > + # Retry with waits between checking for existence
> > + LOG.debug('waiting for %s to be removed', path)
> > + for num, wait in enumerate(retries):
> > + if not os.path.exists(path):
> > + LOG.debug('%s has been removed', path)
> > + return
> > + LOG.debug('sleeping %s', wait)
>
> the defaults here are pretty high (1 second, 3 seconds) for what we're
> using it for.
> Ie, waiting for removal probably happens on sub-second, but if it missed
> that first
> second block, we're waiting 3 more.
>

I mean; it's just a guess; machines could be slower could be faster. If
they're faster
then we're less likely to wait in the first place.

>
> maybe change callers?
>
> This 'wait and check' logic can also be generalized as seen
> http://stackoverflow.com/questions/21786382/pythonic-
> way-of-retry-running-a-function
> A generic function for it might have a signature like:
> wait_for(func, check=True, retries=[1,2,3,4], msg=None, )
>

I thought about that but I didn't see any existing implementation in
curtin/cloud-init and I
didn't want to introduce a whole new class;

>
> Theres probably some sane way to call it with
> wait_for(functools.partial(os.path.exists, dev), check=lambda d: not d,
> msg=dev + ...)
>
> Maybe I'm over thinking this.
> But the general "try and wait and then give up" is probably going to be
> repeated.
>

I agree, however, I don't think we've see enough variants to find a decent
generic
abstraction yet so I was punting on adding a generic when it's not yet
needed.

>
> > + time.sleep(wait)
> > +
> > + # final check
> > + if not os.path.exists(path):
> > + LOG.debug('%s has been removed', path)
> > + return
> > +
> > + raise OSError('Timeout exceeded for removal of %s', path)
> > +
> > +
> > def load_command_environment(env=os.environ, strict=False):
> >
> > mapping = {'scratch': 'WORKING_DIR', 'fstab': 'OUTPUT_FSTAB',
>
>
> --
> https://code.launchpad.net/~raharper/curtin/trunk.nvme_
> bcache/+merge/322307
> You are the owner of lp:~raharper/curtin/trunk.nvme_bcache.
>

« Back to merge proposal