On Tue, Apr 11, 2017 at 9:17 AM, Scott Moser <email address hidden> wrote:
> Some comments inline.
> The unit tests seem really tied to the implementation (mocking
> os.path.exists which is called by 2 different callers).
>
> I realize this is tricky, but maybe some re-work could make the tests more
> clear.
>
I'm open to reworking them. Not 100% happy with the current coverage.
>
>
> A test for get_bcache_sys_path would be good.
>
Pretty sure I have that.
>
>
> Diff comments:
>
> > === modified file 'curtin/block/clear_holders.py'
> > --- curtin/block/clear_holders.py 2017-01-31 22:03:02 +0000
> > +++ curtin/block/clear_holders.py 2017-04-10 18:04:42 +0000
> > @@ -56,32 +56,75 @@
> >
> > def get_bcache_using_dev(device):
> > """
> > - Get the /sys/fs/bcache/ path of the bcache volume using specified
> device
> > + Get the /sys/fs/bcache/ path of the bcache cache device bound to
> > + specified device
> > """
> > # FIXME: when block.bcache is written this should be moved there
> > sysfs_path = block.sys_block_path(device)
> > return os.path.realpath(os.path.join(sysfs_path, 'bcache',
> 'cache'))
> >
> >
> > +def get_bcache_sys_path(device):
> > + """
> > + Get the /sys/class/block/<device>/bcache path if present
> > + """
> > + sysfs_path = block.sys_block_path(device)
> > + return os.path.realpath(os.path.join(sysfs_path, 'bcache'))
>
> The comment says "if present", but its not clear what the return would be.
> If the *device* does not exist (get_bcache_sys_path("bogusdev1")) then
> you'll
> raise an OSError, but if the device exists and the 'bcache' entry does not
> you'll just return it either way.
>
Ack, the comment was really from the caller;
I'll rework as I'd like it validate the path before the realpath call as
well.
>
> > +
> > +
> > def shutdown_bcache(device):
> > """
> > Shut down bcache for specified bcache device
> > +
> > + 1. Stop the cacheset that `device` is connected to
> > + 2. Stop the 'device'
> > """
> > + if not device.startswith('/sys/class/block'):
>
> we use the string '/sys/class/block' other places, we really should
> SYS_CLASS_BLOCK = '/sys/block'
> not "your fault", but i think having that constant would be good.
>
ACK, though I'd like to punt on the general cleanup into a separate MP.
>
> > + raise ValueError('Invalid Device (%s): '
> > + 'Device path must start with
> /sys/class/block/',
> > + device)
> > +
> > bcache_shutdown_message = ('shutdown_bcache running on {} has
> determined '
> > 'that the device has already been shut
> down '
> > 'during handling of another bcache dev. '
> > 'skipping'.format(device))
> > - if not os.path.exists(device):
> > - LOG.info(bcache_shutdown_message)
> > - return
> > -
> > - bcache_sysfs = get_bcache_using_dev(device)
> > - if not os.path.exists(bcache_sysfs):
> > - LOG.info(bcache_shutdown_message)
> > - return
> > -
> > - LOG.debug('stopping bcache at: %s', bcache_sysfs)
> > - util.write_file(os.path.join(bcache_sysfs, 'stop'), '1', mode=None)
> > +
> > + if not os.path.exists(device):
> > + LOG.info(bcache_shutdown_message)
> > + return
> > +
> > + # stop cacheset if it exists
> > + bcache_cache_sysfs = get_bcache_using_dev(device)
> > + if not os.path.exists(bcache_cache_sysfs):
> > + LOG.info('bcache cacheset already removed: %s',
> > + os.path.basename(bcache_cache_sysfs))
> > + else:
> > + LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
> > + util.write_file(os.path.join(bcache_cache_sysfs, 'stop'),
> > + '1', mode=None)
> > + try:
> > + util.wait_for_removal(bcache_cache_sysfs)
> > + except OSError:
> > + LOG.info('Failed to stop bcache cacheset %s',
> bcache_cache_sysfs)
> > + raise
> > +
> > + # after stopping cache set, we may need to stop the device
> > + if not os.path.exists(device):
> > + LOG.info('bcache backing device already removed: %s', device)
> > + return
> > + else:
> > + bcache_block_sysfs = get_bcache_sys_path(device)
> > + LOG.info('stopping bcache backing device at: %s',
> bcache_block_sysfs)
> > + util.write_file(os.path.join(bcache_block_sysfs, 'stop'),
> > + '1', mode=None)
>
> mostly unrelated, but in cloud-init's write_file, i've wanted to have
> a option to use perms of the file if it already exists.
> just a comment here really.
>
IIUC, you want to set mode for create?
>
> > + try:
> > + util.wait_for_removal(device)
> > + except OSError:
> > + LOG.info('Failed to stop bcache backing device %s',
> > + bcache_block_sysfs)
> > + raise
> > +
> > + return
> >
> >
> > def shutdown_lvm(device):
> >
> > === modified file 'curtin/util.py'
> > --- curtin/util.py 2017-04-07 20:12:45 +0000
> > +++ curtin/util.py 2017-04-10 18:04:42 +0000
> > @@ -194,6 +194,27 @@
> > return _subp(*args, **kwargs)
> >
> >
> > +def wait_for_removal(path, retries=[1, 3, 5, 7]):
> > + if not path:
> > + raise ValueError('wait_for_removal: missing path parameter')
> > +
> > + # Retry with waits between checking for existence
> > + LOG.debug('waiting for %s to be removed', path)
> > + for num, wait in enumerate(retries):
> > + if not os.path.exists(path):
> > + LOG.debug('%s has been removed', path)
> > + return
> > + LOG.debug('sleeping %s', wait)
>
> the defaults here are pretty high (1 second, 3 seconds) for what we're
> using it for.
> Ie, waiting for removal probably happens on sub-second, but if it missed
> that first
> second block, we're waiting 3 more.
>
I mean; it's just a guess; machines could be slower could be faster. If
they're faster
then we're less likely to wait in the first place.
>
> maybe change callers?
>
> This 'wait and check' logic can also be generalized as seen
> http://stackoverflow.com/questions/21786382/pythonic-
> way-of-retry-running-a-function
> A generic function for it might have a signature like:
> wait_for(func, check=True, retries=[1,2,3,4], msg=None, )
>
I thought about that but I didn't see any existing implementation in
curtin/cloud-init and I
didn't want to introduce a whole new class;
>
> Theres probably some sane way to call it with
> wait_for(functools.partial(os.path.exists, dev), check=lambda d: not d,
> msg=dev + ...)
>
> Maybe I'm over thinking this.
> But the general "try and wait and then give up" is probably going to be
> repeated.
>
I agree, however, I don't think we've see enough variants to find a decent
generic
abstraction yet so I was punting on adding a generic when it's not yet
needed.
>
> > + time.sleep(wait)
> > +
> > + # final check
> > + if not os.path.exists(path):
> > + LOG.debug('%s has been removed', path)
> > + return
> > +
> > + raise OSError('Timeout exceeded for removal of %s', path)
> > +
> > +
> > def load_command_environment(env=os.environ, strict=False):
> >
> > mapping = {'scratch': 'WORKING_DIR', 'fstab': 'OUTPUT_FSTAB',
>
>
> --
> https://code.launchpad.net/~raharper/curtin/trunk.nvme_
> bcache/+merge/322307
> You are the owner of lp:~raharper/curtin/trunk.nvme_bcache.
>
On Tue, Apr 11, 2017 at 9:17 AM, Scott Moser <email address hidden> wrote:
> Some comments inline.
> The unit tests seem really tied to the implementation (mocking
> os.path.exists which is called by 2 different callers).
>
> I realize this is tricky, but maybe some re-work could make the tests more
> clear.
>
I'm open to reworking them. Not 100% happy with the current coverage.
>
>
> A test for get_bcache_sys_path would be good.
>
Pretty sure I have that.
> block/clear_ holders. py' block/clear_ holders. py 2017-01-31 22:03:02 +0000 block/clear_ holders. py 2017-04-10 18:04:42 +0000 using_dev( device) : block_path( device) realpath( os.path. join(sysfs_ path, 'bcache', sys_path( device) : block/< device> /bcache path if present block_path( device) realpath( os.path. join(sysfs_ path, 'bcache')) sys_path( "bogusdev1" )) then
>
> Diff comments:
>
> > === modified file 'curtin/
> > --- curtin/
> > +++ curtin/
> > @@ -56,32 +56,75 @@
> >
> > def get_bcache_
> > """
> > - Get the /sys/fs/bcache/ path of the bcache volume using specified
> device
> > + Get the /sys/fs/bcache/ path of the bcache cache device bound to
> > + specified device
> > """
> > # FIXME: when block.bcache is written this should be moved there
> > sysfs_path = block.sys_
> > return os.path.
> 'cache'))
> >
> >
> > +def get_bcache_
> > + """
> > + Get the /sys/class/
> > + """
> > + sysfs_path = block.sys_
> > + return os.path.
>
> The comment says "if present", but its not clear what the return would be.
> If the *device* does not exist (get_bcache_
> you'll
> raise an OSError, but if the device exists and the 'bcache' entry does not
> you'll just return it either way.
>
> >>> os.path. realpath( "/bogus/ path/here" )
> '/bogus/path/here'
>
Ack, the comment was really from the caller;
I'll rework as I'd like it validate the path before the realpath call as
well.
> bcache( device) : startswith( '/sys/class/ block') :
> > +
> > +
> > def shutdown_
> > """
> > Shut down bcache for specified bcache device
> > +
> > + 1. Stop the cacheset that `device` is connected to
> > + 2. Stop the 'device'
> > """
> > + if not device.
>
> we use the string '/sys/class/block' other places, we really should
> SYS_CLASS_BLOCK = '/sys/block'
> not "your fault", but i think having that constant would be good.
>
ACK, though I'd like to punt on the general cleanup into a separate MP.
> shutdown_ message = ('shutdown_bcache running on {} has .format( device) ) exists( device) : bcache_ shutdown_ message) using_dev( device) exists( bcache_ sysfs): bcache_ shutdown_ message) file(os. path.join( bcache_ sysfs, 'stop'), '1', mode=None) exists( device) : bcache_ shutdown_ message) using_dev( device) exists( bcache_ cache_sysfs) : basename( bcache_ cache_sysfs) ) file(os. path.join( bcache_ cache_sysfs, 'stop'), for_removal( bcache_ cache_sysfs) exists( device) : sys_path( device) file(os. path.join( bcache_ block_sysfs, 'stop'),
> > + raise ValueError('Invalid Device (%s): '
> > + 'Device path must start with
> /sys/class/block/',
> > + device)
> > +
> > bcache_
> determined '
> > 'that the device has already been shut
> down '
> > 'during handling of another bcache dev. '
> > 'skipping'
> > - if not os.path.
> > - LOG.info(
> > - return
> > -
> > - bcache_sysfs = get_bcache_
> > - if not os.path.
> > - LOG.info(
> > - return
> > -
> > - LOG.debug('stopping bcache at: %s', bcache_sysfs)
> > - util.write_
> > +
> > + if not os.path.
> > + LOG.info(
> > + return
> > +
> > + # stop cacheset if it exists
> > + bcache_cache_sysfs = get_bcache_
> > + if not os.path.
> > + LOG.info('bcache cacheset already removed: %s',
> > + os.path.
> > + else:
> > + LOG.info('stopping bcache cacheset at: %s', bcache_cache_sysfs)
> > + util.write_
> > + '1', mode=None)
> > + try:
> > + util.wait_
> > + except OSError:
> > + LOG.info('Failed to stop bcache cacheset %s',
> bcache_cache_sysfs)
> > + raise
> > +
> > + # after stopping cache set, we may need to stop the device
> > + if not os.path.
> > + LOG.info('bcache backing device already removed: %s', device)
> > + return
> > + else:
> > + bcache_block_sysfs = get_bcache_
> > + LOG.info('stopping bcache backing device at: %s',
> bcache_block_sysfs)
> > + util.write_
> > + '1', mode=None)
>
> mostly unrelated, but in cloud-init's write_file, i've wanted to have
> a option to use perms of the file if it already exists.
> just a comment here really.
>
IIUC, you want to set mode for create?
> for_removal( device) lvm(device) : removal( path, retries=[1, 3, 5, 7]): 'wait_for_ removal: missing path parameter') exists( path):
> > + try:
> > + util.wait_
> > + except OSError:
> > + LOG.info('Failed to stop bcache backing device %s',
> > + bcache_block_sysfs)
> > + raise
> > +
> > + return
> >
> >
> > def shutdown_
> >
> > === modified file 'curtin/util.py'
> > --- curtin/util.py 2017-04-07 20:12:45 +0000
> > +++ curtin/util.py 2017-04-10 18:04:42 +0000
> > @@ -194,6 +194,27 @@
> > return _subp(*args, **kwargs)
> >
> >
> > +def wait_for_
> > + if not path:
> > + raise ValueError(
> > +
> > + # Retry with waits between checking for existence
> > + LOG.debug('waiting for %s to be removed', path)
> > + for num, wait in enumerate(retries):
> > + if not os.path.
> > + LOG.debug('%s has been removed', path)
> > + return
> > + LOG.debug('sleeping %s', wait)
>
> the defaults here are pretty high (1 second, 3 seconds) for what we're
> using it for.
> Ie, waiting for removal probably happens on sub-second, but if it missed
> that first
> second block, we're waiting 3 more.
>
I mean; it's just a guess; machines could be slower could be faster. If
they're faster
then we're less likely to wait in the first place.
> stackoverflow. com/questions/ 21786382/ pythonic- retry-running- a-function
> maybe change callers?
>
> This 'wait and check' logic can also be generalized as seen
> http://
> way-of-
> A generic function for it might have a signature like:
> wait_for(func, check=True, retries=[1,2,3,4], msg=None, )
>
I thought about that but I didn't see any existing implementation in
curtin/cloud-init and I
didn't want to introduce a whole new class;
> functools. partial( os.path. exists, dev), check=lambda d: not d,
> Theres probably some sane way to call it with
> wait_for(
> msg=dev + ...)
>
> Maybe I'm over thinking this.
> But the general "try and wait and then give up" is probably going to be
> repeated.
>
I agree, however, I don't think we've see enough variants to find a decent
generic
abstraction yet so I was punting on adding a generic when it's not yet
needed.
> exists( path): environment( env=os. environ, strict=False): /code.launchpad .net/~raharper/ curtin/ trunk.nvme_ +merge/ 322307
> > + time.sleep(wait)
> > +
> > + # final check
> > + if not os.path.
> > + LOG.debug('%s has been removed', path)
> > + return
> > +
> > + raise OSError('Timeout exceeded for removal of %s', path)
> > +
> > +
> > def load_command_
> >
> > mapping = {'scratch': 'WORKING_DIR', 'fstab': 'OUTPUT_FSTAB',
>
>
> --
> https:/
> bcache/
> You are the owner of lp:~raharper/curtin/trunk.nvme_bcache.
>