Merge lp:~jderose/dmedia/fix-1247530 into lp:dmedia
| Status: | Merged |
|---|---|
| Merged at revision: | 760 |
| Proposed branch: | lp:~jderose/dmedia/fix-1247530 |
| Merge into: | lp:dmedia |
| Diff against target: | 1868 lines (+1306/-230), 8 files modified |
| To merge this branch: | bzr merge lp:~jderose/dmedia/fix-1247530 |

Files modified:

* dmedia/core.py (+220/-55)
* dmedia/local.py (+9/-9)
* dmedia/metastore.py (+219/-53)
* dmedia/tests/test_core.py (+144/-0)
* dmedia/tests/test_local.py (+37/-0)
* dmedia/tests/test_metastore.py (+490/-112)
* dmedia/tests/test_views.py (+149/-0)
* dmedia/views.py (+38/-1)
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| David Jordan | | | Approve |
Review via email: mp+196045@code.launchpad.net
Description of the change
For background, please see this bug:
https:/
Changes include:
* Replace core._vigilance_worker() with the new core.Vigilance class
* The copy-increasing behavior (Vigilance) now uses an IO cost accounting model and takes the least expensive route (IO-wise) to get all files at rank=1 up to rank=2, then all files at rank=2 up to rank=3, and so on; as a result, Dmedia now converges much more quickly; the biggest gain is that Vigilance will now simply verify a downgraded file, whereas before it would create a new copy (verifying is the cheapest option in the IO cost accounting model); see the sketches after this list
* When downloading from a local peer, Vigilance now uses the dmedia/machine docs to determine which FileStores are connected to each peer, and thereby which peers should have a particular file; this is much more efficient than the previous approach, which was to make a HEAD request to each peer until the file was found; this likewise means Dmedia now converges more quickly when you have multiple devices in your Dmedia library (especially with 3 or more peers)
* LocalStores.filter_by_avail() and LocalStores.find_dst_store() now take an explicit free-space threshold argument instead of relying on the (now removed) MIN_FREE_SPACE constant
* In the metastore module, add the MIN_BYTES_FREE constant (which replaces local.MIN_FREE_SPACE) and rename RECLAIM_BYTES to the clearer and more consistent MAX_BYTES_FREE (both are now base-10 GB rather than GiB); add unit tests for these constants, in particular to enforce that (MAX_BYTES_FREE >= 2 * MIN_BYTES_FREE)
* Vigilance now does preemptive copy-increasing up to MAX_BYTES_FREE: in other words, a 4th copy will be speculatively created, but only while the larger MAX_BYTES_FREE available-space threshold is still met; these copies are created in descending order of atime; in tandem with MetaStore.reclaim_all() (whose threshold is likewise MAX_BYTES_FREE), this keeps extra copies of recently used files around whenever space allows
* Replace MetaStore.iter_fragile() with MetaStore.iter_files_at_rank() and MetaStore.iter_fragile_files()
* Add get_rank() function, which MetaStore.iter_files_at_rank() uses to skip files whose rank has already increased by the time their doc is fetched
* Add MetaStore.wait_for_fragile_files(), which replaces the event-based monitoring half of the old iter_fragile() with a single longpoll request to the _changes feed
* Replace MetaStore.iter_actionable_fragile() with the set-based decision logic in Vigilance.up_rank()
* Add MetaStore.iter_preempt_files(), which yields the candidate files for preemptive copy-increasing
* Add get_copies() function, which MetaStore.iter_preempt_files() uses to skip files that no longer have exactly copies=3
* Add new file/preempt view to drive the preemptive copy-increasing
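To make the rank ordering concrete, here is a minimal sketch of the rank metric, mirroring the new get_rank() function in the diff below (minus the in-place schema coercion the real function performs on broken docs):

```python
def rank(doc):
    # rank = number of drives assumed to hold a copy, plus the sum of the
    # assumed durability of those copies, each term capped at 3; so rank
    # ranges from 0 (no copies at all) to 6 (three verified copies).
    stored = doc.get('stored', {})
    drives = min(3, len(stored))
    copies = min(3, sum(v['copies'] for v in stored.values()))
    return drives + copies

# Two physical copies, one downgraded (copies=0): rank = 2 + 1 = 3
doc = {'stored': {'3' * 24: {'copies': 1}, 'A' * 24: {'copies': 0}}}
assert rank(doc) == 3
```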
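Building on that, here is a rough sketch of how the IO cost accounting model plays out as a decision tree. The structure mirrors Vigilance.up_rank() in the diff; the `local` and `remote` arguments stand in for the sets of connected and peer-reachable store IDs that Vigilance builds in its constructor, and the free-space checks are omitted:

```python
def choose_action(doc, local, remote):
    # Pick the cheapest rank-increasing action. Verifying costs 1 IO unit
    # (1 read); copying and downloading both cost 2 (1 read, 1 write), but
    # downloading burns scarcer network IO, so it is only used when no
    # local copy exists.
    stored = set(doc['stored'])
    downgraded = {
        store_id for store_id in stored & local
        if doc['stored'][store_id]['copies'] == 0
        # (the real code also checks the copy lacks a 'verified' timestamp)
    }
    if stored & local:
        if downgraded:
            return 'verify'    # cost 1
        if local - stored:
            return 'copy'      # cost 2
    elif stored & remote:
        return 'download'      # cost 2, plus network IO
    return None                # no rank-increasing action this node can take

assert choose_action(
    {'stored': {'3' * 24: {'copies': 0}}}, {'3' * 24}, set()
) == 'verify'
```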
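Finally, the relationship between the two free-space thresholds, as the new unit tests enforce it (values taken directly from the diff below):

```python
GB = 10 ** 9                # base-10 gigabyte
MIN_BYTES_FREE = 16 * GB    # normal copy-increasing must leave this free
MAX_BYTES_FREE = 64 * GB    # preemptive copies and reclaim use this threshold

# Preemptive 4th copies are only created above the larger threshold, so
# there is always ample headroom before reclaim needs to kick in:
assert MAX_BYTES_FREE >= 2 * MIN_BYTES_FREE
```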
Preview Diff
1 | === modified file 'dmedia/core.py' |
2 | --- dmedia/core.py 2013-10-24 02:59:18 +0000 |
3 | +++ dmedia/core.py 2013-11-21 03:22:50 +0000 |
4 | @@ -53,7 +53,8 @@ |
5 | from dmedia import util, schema, views |
6 | from dmedia.client import Downloader, get_client, build_ssl_context |
7 | from dmedia.metastore import MetaStore, create_stored, get_dict |
8 | -from dmedia.local import LocalStores, FileNotLocal, MIN_FREE_SPACE |
9 | +from dmedia.metastore import MIN_BYTES_FREE, MAX_BYTES_FREE |
10 | +from dmedia.local import LocalStores, FileNotLocal |
11 | |
12 | |
13 | log = logging.getLogger() |
14 | @@ -230,65 +231,221 @@ |
15 | log.exception('Error updating project stats for %r', project_id) |
16 | |
17 | |
18 | -def _vigilance_worker(env, ssl_config): |
19 | - """ |
20 | - Run the event-based copy-increasing loop to maintain file durability. |
21 | - """ |
22 | - db = util.get_db(env) |
23 | - ms = MetaStore(db) |
24 | - |
25 | - local_stores = ms.get_local_stores() |
26 | - if len(local_stores) == 0: |
27 | - log.warning('No connected local stores, cannot increase copies') |
28 | - return |
29 | - connected = frozenset(local_stores.ids) |
30 | - log.info('Connected %r', connected) |
31 | - |
32 | - clients = [] |
33 | - peers = ms.get_local_peers() |
34 | - if peers: |
35 | - ssl_context = build_ssl_context(ssl_config) |
36 | - for (peer_id, info) in peers.items(): |
37 | - url = info['url'] |
38 | - log.info('Peer %s at %s', peer_id, url) |
39 | - clients.append(get_client(url, ssl_context)) |
40 | - else: |
41 | - log.info('No known peers on local network') |
42 | - |
43 | - for (doc, stored) in ms.iter_actionable_fragile(connected, True): |
44 | +def get_downgraded(doc): |
45 | + downgraded = [] |
46 | + for (key, value) in doc['stored'].items(): |
47 | + copies = value['copies'] |
48 | + verified = value.get('verified') |
49 | + if copies == 0 and not isinstance(verified, int): |
50 | + downgraded.append(key) |
51 | + return downgraded |
52 | + |
53 | + |
54 | +class Vigilance: |
55 | + def __init__(self, ms, ssl_config): |
56 | + self.ms = ms |
57 | + self.stores = ms.get_local_stores() |
58 | + for fs in self.stores: |
59 | + log.info('Vigilance: local store: %r', fs) |
60 | + self.local = frozenset(self.stores.ids) |
61 | + self.clients = {} |
62 | + self.store_to_peer = {} |
63 | + remote = [] |
64 | + peers = ms.get_local_peers() |
65 | + if peers: |
66 | + ssl_context = build_ssl_context(ssl_config) |
67 | + for (peer_id, info) in peers.items(): |
68 | + url = info['url'] |
69 | + log.info('Vigilance: peer %s at %s', peer_id, url) |
70 | + self.clients[peer_id] = get_client(url, ssl_context) |
71 | + for doc in ms.db.get_many(list(peers)): |
72 | + if doc is not None: |
73 | + for store_id in get_dict(doc, 'stores'): |
74 | + if is_store_id(store_id): |
75 | + remote.append(store_id) |
76 | + self.store_to_peer[store_id] = doc['_id'] |
77 | + assert doc['_id'] in peers |
78 | + self.remote = frozenset(remote) |
79 | + |
80 | + def run(self): |
81 | + last_seq = self.process_backlog() |
82 | + log.info('Vigilance: processed backlog as of %r', last_seq) |
83 | + self.process_preempt() |
84 | + self.run_event_loop(last_seq) |
85 | + |
86 | + def process_backlog(self): |
87 | + for doc in self.ms.iter_fragile_files(): |
88 | + self.wrap_up_rank(doc) |
89 | + return self.ms.db.get()['update_seq'] |
90 | + |
91 | + def process_preempt(self): |
92 | + for doc in self.ms.iter_preempt_files(): |
93 | + self.wrap_up_rank(doc, threshold=MAX_BYTES_FREE) |
94 | + |
95 | + def run_event_loop(self, last_seq): |
96 | + log.info('Vigilance: starting event loop at %d', last_seq) |
97 | + while True: |
98 | + result = self.ms.wait_for_fragile_files(last_seq) |
99 | + last_seq = result['last_seq'] |
100 | + for row in result['results']: |
101 | + self.wrap_up_rank(row['doc']) |
102 | + |
103 | + def wrap_up_rank(self, doc, threshold=MIN_BYTES_FREE): |
104 | + try: |
105 | + return self.up_rank(doc, threshold) |
106 | + except Exception: |
107 | + log.exception('Error calling Vigilance.up_rank() for %r', doc) |
108 | + |
109 | + def up_rank(self, doc, threshold): |
110 | + """ |
111 | + Implements the rank-increasing decision tree. |
112 | + |
113 | + There are 4 possible actions: |
114 | + |
115 | + 1) Verify a local copy currently in a downgraded state |
116 | + |
117 | + 2) Copy from a local FileStore to another local FileStore |
118 | + |
119 | + 3) Download from a remote peer to a local FileStore |
120 | + |
121 | + 4) Do nothing as no rank-increasing action is possible |
122 | + |
123 | + This is a high-level tree based on set operations. The action taken |
124 | + here may not actually be possible because this method doesn't consider |
125 | + whether there is a local FileStore with enough available space; when |
126 | + there isn't, actions (2) and (3) won't be possible. |
127 | + |
128 | + We use a simple IO cost accounting model: reading a copy costs one unit, |
129 | + and writing a copy likewise costs one unit. For example, consider these |
130 | + three operations: |
131 | + |
132 | + ==== ============================================== |
133 | + Cost Action |
134 | + ==== ============================================== |
135 | + 1 Verify a copy (1 read unit) |
136 | + 2 Create a copy (1 read unit, 1 write unit) |
137 | + 3 Create two copies (1 read unit, 2 write units) |
138 | + ==== ============================================== |
139 | + |
140 | + This method will take the least expensive route (in IO cost units) that |
141 | + will increase the file rank by at least 1. |
142 | + |
143 | + It's tempting to look for actions with a lower cost to benefit ratio, |
144 | + even when the cost is higher. For example, consider these actions: |
145 | + |
146 | + ==== ===== ===== ======================== |
147 | + Cost +Rank Ratio Action |
148 | + ==== ===== ===== ======================== |
149 | + 1 1 1.00 Verify a downgraded copy |
150 | + 2 2 1.00 Create a copy |
151 | + 3 4 0.75 Create two copies |
152 | + ==== ===== ===== ======================== |
153 | + |
154 | + In this sense, it's a better deal to create two new copies (which is the |
155 | + action Dmedia formerly would take). However, because greater IO |
156 | + resources are consumed, this means it will necessarily delay acting on |
157 | + other equally fragile files (other files at the current rank being |
158 | + processed). |
159 | + |
160 | + Dmedia will now take the cheapest route to getting all files at rank=1 |
161 | + up to at least rank=2, then getting all files at rank=2 up to at least |
162 | + rank=3, and so on. |
163 | + |
164 | + Another interesting "good deal" is creating new copies by reading from |
165 | + a local downgraded copy (because the source file is always verified as |
166 | + it's read): |
167 | + |
168 | + ==== ===== ===== ======================================== |
169 | + Cost +Rank Ratio Action |
170 | + ==== ===== ===== ======================================== |
171 | + 1 1 1.00 Verify a downgraded copy |
172 | + 2 2 1.00 Create a copy |
173 | + 2 3 0.66 Create a copy from a downgraded copy |
174 | + 3 4 0.75 Create two copies |
175 | + 3 5 0.60 Create two copies from a downgraded copy |
176 | + ==== ===== ===== ======================================== |
177 | + |
178 | + One place where this does make sense is when there is a locally |
179 | + available file at rank=1 (a single physical copy in a downgraded state), |
180 | + and a locally connected FileStore with enough free space to create a new |
181 | + copy. Because having only a single physical copy is so dangerous, |
182 | + it makes sense to bend the rules here. |
183 | + |
184 | + FIXME: Dmedia doesn't yet do this! Probably the best way to implement |
185 | + this is for the decision tree here to work as it does, but to add some |
186 | + special case handling in Vigilance.up_rank_by_verifying(). Assuming the |
187 | + needed space isn't available on another FileStore, we should still at |
188 | + least verify the copy. |
189 | + |
190 | + However, the same will not be done for a file at rank=3 (two physical |
191 | + copies, one in a downgraded state). In this case the downgraded copy |
192 | + will simply be verified, using 1 IO unit and increasing the rank to 4. |
193 | + |
194 | + Note that we use the same cost for a read whether reading from a local |
195 | + drive or downloading from a peer. Although downloading is cheaper when |
196 | + looked at only from the perspective of the local node, it has the same |
197 | + cost when considering the total Dmedia library. |
198 | + |
199 | + The other peers will likewise be doing their best to address any fragile |
200 | + files. And furthermore, local network IO is generally a more scarce |
201 | + resource (especially over WiFi), so we should only download when it's |
202 | + absolutely needed (i.e., when no local copy is available). |
203 | + """ |
204 | + assert isinstance(threshold, int) and threshold > 0 |
205 | + stored = set(doc['stored']) |
206 | + local = stored.intersection(self.local) |
207 | + downgraded = local.intersection(get_downgraded(doc)) |
208 | + free = self.local - stored |
209 | + remote = stored.intersection(self.remote) |
210 | + if local: |
211 | + if downgraded: |
212 | + return self.up_rank_by_verifying(doc, downgraded) |
213 | + elif free: |
214 | + return self.up_rank_by_copying(doc, free, threshold) |
215 | + elif remote: |
216 | + return self.up_rank_by_downloading(doc, remote, threshold) |
217 | + |
218 | + def up_rank_by_verifying(self, doc, downgraded): |
219 | + assert isinstance(downgraded, set) |
220 | + store_id = downgraded.pop() |
221 | + fs = self.stores.by_id(store_id) |
222 | + return self.ms.verify(fs, doc) |
223 | + |
224 | + def up_rank_by_copying(self, doc, free, threshold): |
225 | + dst = self.stores.filter_by_avail(free, doc['bytes'], 1, threshold) |
226 | + if dst: |
227 | + src = self.stores.choose_local_store(doc) |
228 | + return self.ms.copy(src, doc, *dst) |
229 | + |
230 | + def up_rank_by_downloading(self, doc, remote, threshold): |
231 | + fs = self.stores.find_dst_store(doc['bytes'], threshold) |
232 | + if fs is None: |
233 | + return |
234 | + peer_ids = frozenset( |
235 | + self.store_to_peer[store_id] for store_id in remote |
236 | + ) |
237 | + downloader = None |
238 | _id = doc['_id'] |
239 | - copies = sum(v['copies'] for v in doc['stored'].values()) |
240 | - if copies >= 3: |
241 | - log.warning('%s already has copies >= 3, skipping', _id) |
242 | - continue |
243 | - size = doc['bytes'] |
244 | - local = connected.intersection(stored) # Any local copies? |
245 | - if local: |
246 | - free = connected - stored |
247 | - src = local_stores.choose_local_store(doc) |
248 | - dst = local_stores.filter_by_avail(free, size, 3 - copies) |
249 | - if dst: |
250 | - ms.copy(src, doc, *dst) |
251 | - elif clients: |
252 | - fs = local_stores.find_dst_store(size) |
253 | - if fs is None: |
254 | - log.warning( |
255 | - 'No FileStore with avail space to download %s', _id |
256 | - ) |
257 | + for peer_id in peer_ids: |
258 | + client = self.clients[peer_id] |
259 | + if not client.has_file(_id): |
260 | continue |
261 | - for client in clients: |
262 | - if not client.has_file(_id): |
263 | - continue |
264 | - downloader = Downloader(doc, ms, fs) |
265 | - try: |
266 | - downloader.download_from(client) |
267 | - except Exception: |
268 | - log.exception('Error downloading %s from %s', _id, client) |
269 | + if downloader is None: |
270 | + downloader = Downloader(doc, self.ms, fs) |
271 | + try: |
272 | + downloader.download_from(client) |
273 | + except Exception: |
274 | + log.exception('Error downloading %s from %s', _id, client) |
275 | + if downloader.download_is_complete(): |
276 | + return downloader.doc |
277 | |
278 | |
279 | def vigilance_worker(env, ssl_config): |
280 | try: |
281 | - _vigilance_worker(env, ssl_config) |
282 | + db = util.get_db(env) |
283 | + ms = MetaStore(db) |
284 | + vigilance = Vigilance(ms, ssl_config) |
285 | + vigilance.run() |
286 | except Exception: |
287 | log.exception('Error in vigilance_worker():') |
288 | |
289 | @@ -328,7 +485,11 @@ |
290 | |
291 | |
292 | def is_file_id(_id): |
293 | - return isdb32(_id) and len(_id) == 48 |
294 | + return isinstance(_id, str) and len(_id) == 48 and isdb32(_id) |
295 | + |
296 | + |
297 | +def is_store_id(_id): |
298 | + return isinstance(_id, str) and len(_id) == 24 and isdb32(_id) |
299 | |
300 | |
301 | def clean_file_id(_id): |
302 | @@ -531,6 +692,10 @@ |
303 | self.task_manager.requeue_filestore_tasks(tuple(self.stores)) |
304 | |
305 | def restart_vigilance(self): |
306 | + # FIXME: Core should also restart Vigilance whenever the FileStores |
307 | + # connected to a peer change. We should do this by monitoring the |
308 | + # _changes feed for changes to any of the machine docs corresponding to |
309 | + # the currently visible local peers. |
310 | self.task_manager.restart_vigilance() |
311 | |
312 | def get_auto_format(self): |
313 | |
314 | === modified file 'dmedia/local.py' |
315 | --- dmedia/local.py 2013-10-24 02:21:03 +0000 |
316 | +++ dmedia/local.py 2013-11-21 03:22:50 +0000 |
317 | @@ -33,7 +33,6 @@ |
318 | |
319 | |
320 | log = logging.getLogger() |
321 | -MIN_FREE_SPACE = 16 * 1024**3 # 8 GiB min free space |
322 | |
323 | |
324 | class NoSuchFile(Exception): |
325 | @@ -170,24 +169,25 @@ |
326 | reverse=reverse, |
327 | ) |
328 | |
329 | - def filter_by_avail(self, free_set, size, copies_needed): |
330 | - assert isinstance(size, int) and size >= 1 |
331 | - assert isinstance(copies_needed, int) and 1 <= copies_needed <= 3 |
332 | + def filter_by_avail(self, free, size, copies, threshold): |
333 | + assert isinstance(size, int) and size > 0 |
334 | + assert isinstance(copies, int) and 1 <= copies <= 3 |
335 | + assert isinstance(threshold, int) and threshold > 0 |
336 | stores = [] |
337 | - required_avail = size + MIN_FREE_SPACE |
338 | + required_avail = size + threshold |
339 | for fs in self.sort_by_avail(): |
340 | - if fs.id in free_set and fs.statvfs().avail >= required_avail: |
341 | + if fs.id in free and fs.statvfs().avail >= required_avail: |
342 | stores.append(fs) |
343 | - if len(stores) >= copies_needed: |
344 | + if len(stores) >= copies: |
345 | break |
346 | return stores |
347 | |
348 | - def find_dst_store(self, size): |
349 | + def find_dst_store(self, size, threshold): |
350 | stores = self.sort_by_avail() |
351 | if not stores: |
352 | return |
353 | fs = stores[0] |
354 | - if fs.statvfs().avail >= size + MIN_FREE_SPACE: |
355 | + if fs.statvfs().avail >= size + threshold: |
356 | return fs |
357 | |
358 | def local_stores(self): |
359 | |
360 | === modified file 'dmedia/metastore.py' |
361 | --- dmedia/metastore.py 2013-10-30 03:26:10 +0000 |
362 | +++ dmedia/metastore.py 2013-11-21 03:22:50 +0000 |
363 | @@ -67,7 +67,7 @@ |
364 | from random import SystemRandom |
365 | from copy import deepcopy |
366 | |
367 | -from dbase32 import log_id |
368 | +from dbase32 import log_id, isdb32 |
369 | from filestore import FileStore, CorruptFile, FileNotFound, check_root_hash |
370 | from microfiber import NotFound, Conflict, BadRequest, BulkConflict |
371 | from microfiber import id_slice_iter, dumps |
372 | @@ -90,9 +90,9 @@ |
373 | VERIFY_BY_MTIME = DOWNGRADE_BY_MTIME // 8 |
374 | VERIFY_BY_VERIFIED = DOWNGRADE_BY_VERIFIED // 2 |
375 | |
376 | -GiB = 1024**3 |
377 | -RECLAIM_BYTES = 64 * GiB |
378 | - |
379 | +GB = 1000000000 |
380 | +MIN_BYTES_FREE = 16 * GB |
381 | +MAX_BYTES_FREE = 64 * GB |
382 | |
383 | |
384 | class TimeDelta: |
385 | @@ -140,6 +140,139 @@ |
386 | return d[key] |
387 | |
388 | |
389 | +def get_int(d, key): |
390 | + """ |
391 | + Force value for *key* in *d* to be an ``int`` >= 0. |
392 | + |
393 | + For example: |
394 | + |
395 | + >>> doc = {'foo': 'BAR'} |
396 | + >>> get_int(doc, 'foo') |
397 | + 0 |
398 | + >>> doc |
399 | + {'foo': 0} |
400 | + |
401 | + """ |
402 | + if not isinstance(d, dict): |
403 | + raise TypeError(TYPE_ERROR.format('d', dict, type(d), d)) |
404 | + if not isinstance(key, str): |
405 | + raise TypeError(TYPE_ERROR.format('key', str, type(key), key)) |
406 | + value = d.get(key) |
407 | + if isinstance(value, int) and value >= 0: |
408 | + return value |
409 | + d[key] = 0 |
410 | + return d[key] |
411 | + |
412 | + |
413 | +def get_rank(doc): |
414 | + """ |
415 | + Calculate the rank of the file represented by *doc*. |
416 | + |
417 | + The rank of a file is the number of physical drives it's assumed to be stored |
418 | + upon plus the sum of the assumed durability of those copies, basically:: |
419 | + |
420 | + rank = len(doc['stored']) + sum(v['copies'] for v in doc['stored'].values()) |
421 | + |
422 | + However, this function can cope with an arbitrarily broken *doc*, as long as |
423 | + *doc* is at least a ``dict`` instance. For example: |
424 | + |
425 | + >>> doc = { |
426 | + ... 'stored': { |
427 | + ... '333333333333333333333333': {'copies': 1}, |
428 | + ... '999999999999999999999999': {'copies': -6}, |
429 | + ... 'AAAAAAAAAAAAAAAAAAAAAAAA': 'junk', |
430 | + ... 'YYYYYYYYYYYYYYYY': 'store_id too short', |
431 | + ... 42: 'the ultimate key to the ultimate value', |
432 | + ... }, |
433 | + ... } |
434 | + >>> get_rank(doc) |
435 | + 4 |
436 | + |
437 | + Any needed schema coercion is done in-place: |
438 | + |
439 | + >>> doc == { |
440 | + ... 'stored': { |
441 | + ... '333333333333333333333333': {'copies': 1}, |
442 | + ... '999999999999999999999999': {'copies': 0}, |
443 | + ... 'AAAAAAAAAAAAAAAAAAAAAAAA': {'copies': 0}, |
444 | + ... }, |
445 | + ... } |
446 | + True |
447 | + |
448 | + It even works with an empty doc: |
449 | + |
450 | + >>> doc = {} |
451 | + >>> get_rank(doc) |
452 | + 0 |
453 | + >>> doc |
454 | + {'stored': {}} |
455 | + |
456 | + The rank of a file is used to order (prioritize) the copy-increasing |
457 | + behavior, which is done from lowest rank to highest rank (from most fragile |
458 | + to least fragile). |
459 | + |
460 | + Also see the "file/rank" CouchDB view function in `dmedia.views`. |
461 | + """ |
462 | + stored = get_dict(doc, 'stored') |
463 | + copies = 0 |
464 | + for key in tuple(stored): |
465 | + if isinstance(key, str) and len(key) == 24 and isdb32(key): |
466 | + value = get_dict(stored, key) |
467 | + copies += get_int(value, 'copies') |
468 | + else: |
469 | + del stored[key] |
470 | + return min(3, len(stored)) + min(3, copies) |
471 | + |
472 | + |
473 | +def get_copies(doc): |
474 | + """ |
475 | + Calculate the durability of the file represented by *doc*. |
476 | + |
477 | + For example: |
478 | + |
479 | + >>> doc = { |
480 | + ... 'stored': { |
481 | + ... '333333333333333333333333': {'copies': 1}, |
482 | + ... '999999999999999999999999': {'copies': -6}, |
483 | + ... 'AAAAAAAAAAAAAAAAAAAAAAAA': 'junk', |
484 | + ... 'YYYYYYYYYYYYYYYY': 'store_id too short', |
485 | + ... 42: 'the ultimate key to the ultimate value', |
486 | + ... }, |
487 | + ... } |
488 | + >>> get_copies(doc) |
489 | + 1 |
490 | + |
491 | + Any needed schema coercion is done in-place: |
492 | + |
493 | + >>> doc == { |
494 | + ... 'stored': { |
495 | + ... '333333333333333333333333': {'copies': 1}, |
496 | + ... '999999999999999999999999': {'copies': 0}, |
497 | + ... 'AAAAAAAAAAAAAAAAAAAAAAAA': {'copies': 0}, |
498 | + ... }, |
499 | + ... } |
500 | + True |
501 | + |
502 | + It even works with an empty doc: |
503 | + |
504 | + >>> doc = {} |
505 | + >>> get_copies(doc) |
506 | + 0 |
507 | + >>> doc |
508 | + {'stored': {}} |
509 | + |
510 | + """ |
511 | + stored = get_dict(doc, 'stored') |
512 | + copies = 0 |
513 | + for key in tuple(stored): |
514 | + if isinstance(key, str) and len(key) == 24 and isdb32(key): |
515 | + value = get_dict(stored, key) |
516 | + copies += get_int(value, 'copies') |
517 | + else: |
518 | + del stored[key] |
519 | + return copies |
520 | + |
521 | + |
522 | def get_mtime(fs, _id): |
523 | return int(fs.stat(_id).mtime) |
524 | |
525 | @@ -564,7 +697,7 @@ |
526 | count += len(docs) |
527 | try: |
528 | self.db.save_many(docs) |
529 | - except BulkConflict: |
530 | + except BulkConflict as e: |
531 | log.exception('Conflict purging %s', store_id) |
532 | count -= len(e.conflicts) |
533 | try: |
534 | @@ -598,21 +731,21 @@ |
535 | |
536 | A fundamental design tenet of Dmedia is that it doesn't particularly |
537 | trust its metadata, and instead does frequent reality checks. This |
538 | - allows Dmedia to work even though removable storage is constantly |
539 | - "offline". In other distributed file-systems, this is usually called |
540 | - being in a "network-partitioned" state. |
541 | + allows Dmedia to work even though removable storage is often offline, |
542 | + meaning the overall Dmedia library is often in a network-partitioned |
543 | + state even when all the peers in the library might be online. |
544 | |
545 | Dmedia deals with removable storage via a quickly decaying confidence |
546 | in its metadata. If a removable drive hasn't been connected longer |
547 | than some threshold, Dmedia will update all those copies to count for |
548 | zero durability. |
549 | |
550 | - And whenever a removable drive (on any drive for that matter) is |
551 | - connected, Dmedia immediately checks to see what files are actually on |
552 | - the drive, and whether they have good integrity. |
553 | + Whenever a removable drive (or any drive for that matter) is connected, |
554 | + Dmedia immediately checks to see what files are actually on the drive, |
555 | + and whether they have good integrity. |
556 | |
557 | `MetaStore.scan()` is the most important reality check that Dmedia does |
558 | - because it's fast and can therefor be done quite often. Thousands of |
559 | + because it's fast and can therefore be done frequently. Thousands of |
560 | files can be scanned in a few seconds. |
561 | |
562 | The scan insures that for every file expected in this file-store, the |
563 | @@ -625,12 +758,13 @@ |
564 | the file-store. Then the doc is updated accordingly marking the file as |
565 | being corrupt in this file-store, and the doc is saved. |
566 | |
567 | - If the file doesn't have the expected mtime is this file-store, this |
568 | + If the file doesn't have the expected mtime in this file-store, this |
569 | copy gets downgraded to zero copies worth of durability, and the last |
570 | verification timestamp is deleted, if present. This will put the file |
571 | first in line for full content-hash verification. If the verification |
572 | - passes, the durability is raised back to the appropriate number of |
573 | - copies. |
574 | + passes, the durability will be raised back to the appropriate number of |
575 | + copies (although note this is done by `MetaStore.verify_by_downgraded()`, |
576 | + not by this method). |
577 | |
578 | :param fs: a `FileStore` instance |
579 | """ |
580 | @@ -677,6 +811,7 @@ |
581 | except NotFound: |
582 | doc = deepcopy(fs.doc) |
583 | doc['atime'] = int(time.time()) |
584 | + doc['bytes_avail'] = fs.statvfs().avail |
585 | self.db.save(doc) |
586 | t.log('scan %r files in %r', count, fs) |
587 | return count |
588 | @@ -856,37 +991,60 @@ |
589 | new = create_stored(doc['_id'], fs) |
590 | return self.db.update(mark_added, doc, new) |
591 | |
592 | - def iter_fragile(self, monitor=False): |
593 | - """ |
594 | - Yield doc for each fragile file. |
595 | - """ |
596 | - for rank in range(6): |
597 | - result = self.db.view('file', 'rank', key=rank, update_seq=True) |
598 | - update_seq = result.get('update_seq') |
599 | - ids = [row['id'] for row in result['rows']] |
600 | - del result # result might be quite large, free some memory |
601 | + def iter_files_at_rank(self, rank): |
602 | + if not isinstance(rank, int): |
603 | + raise TypeError(TYPE_ERROR.format('rank', int, type(rank), rank)) |
604 | + if not (0 <= rank <= 5): |
605 | + raise ValueError('Need 0 <= rank <= 5; got {}'.format(rank)) |
606 | + LIMIT = 50 |
607 | + kw = { |
608 | + 'limit': LIMIT, |
609 | + 'key': rank, |
610 | + } |
611 | + while True: |
612 | + rows = self.db.view('file', 'rank', **kw)['rows'] |
613 | + if not rows: |
614 | + break |
615 | + ids = [r['id'] for r in rows] |
616 | + if ids[0] == kw.get('startkey_docid'): |
617 | + ids.pop(0) |
618 | + if not ids: |
619 | + break |
620 | + log.info('Considering %d files at rank=%d starting at %s', |
621 | + len(ids), rank, ids[0] |
622 | + ) |
623 | random.shuffle(ids) |
624 | - log.info('vigilance: %d files at rank=%d', len(ids), rank) |
625 | for _id in ids: |
626 | - yield self.db.get(_id) |
627 | - if not monitor: |
628 | - return |
629 | - |
630 | - # Now we enter an event-based loop using the _changes feed: |
631 | - if update_seq is None: |
632 | - update_seq = self.db.get()['update_seq'] |
633 | + try: |
634 | + doc = self.db.get(_id) |
635 | + doc_rank = get_rank(doc) |
636 | + if doc_rank <= rank: |
637 | + yield doc |
638 | + else: |
639 | + log.info('Now at rank %d > %d, skipping %s', |
640 | + doc_rank, rank, doc.get('_id') |
641 | + ) |
642 | + except NotFound: |
643 | + log.warning('doc NotFound for %s at rank=%d', _id, rank) |
644 | + if len(rows) < LIMIT: |
645 | + break |
646 | + kw['startkey_docid'] = rows[-1]['id'] |
647 | + |
648 | + def iter_fragile_files(self): |
649 | + for rank in range(6): |
650 | + for doc in self.iter_files_at_rank(rank): |
651 | + yield doc |
652 | + |
653 | + def wait_for_fragile_files(self, last_seq): |
654 | kw = { |
655 | 'feed': 'longpoll', |
656 | 'include_docs': True, |
657 | 'filter': 'file/fragile', |
658 | - 'since': update_seq, |
659 | + 'since': last_seq, |
660 | } |
661 | while True: |
662 | try: |
663 | - result = self.db.get('_changes', **kw) |
664 | - for row in result['results']: |
665 | - yield row['doc'] |
666 | - kw['since'] = result['last_seq'] |
667 | + return self.db.get('_changes', **kw) |
668 | # FIXME: Sometimes we get a 400 Bad Request from CouchDB, perhaps |
669 | # when `since` gets ahead of the `update_seq` as viewed by the |
670 | # changes feed? By excepting `BadRequest` here, we prevent the |
671 | @@ -900,21 +1058,29 @@ |
672 | except (ResponseNotReady, BadRequest): |
673 | pass |
674 | |
675 | - def iter_actionable_fragile(self, connected, monitor=False): |
676 | - """ |
677 | - Yield doc for each fragile file that this node might be able to fix. |
678 | - |
679 | - To be "actionable", this machine must have at least one currently |
680 | - connected FileStore (drive) that does *not* already contain a copy of |
681 | - the fragile file. |
682 | - """ |
683 | - assert isinstance(connected, frozenset) |
684 | - for doc in self.iter_fragile(monitor): |
685 | - stored = frozenset(get_dict(doc, 'stored')) |
686 | - if (connected - stored): |
687 | - yield (doc, stored) |
688 | - |
689 | - def reclaim(self, fs, threshold=RECLAIM_BYTES): |
690 | + def iter_preempt_files(self): |
691 | + kw = { |
692 | + 'limit': 100, |
693 | + 'descending': True, |
694 | + } |
695 | + rows = self.db.view('file', 'preempt', **kw)['rows'] |
696 | + if not rows: |
697 | + return |
698 | + ids = [r['id'] for r in rows] |
699 | + log.info('Considering %d files for preemptive copy increasing', len(ids)) |
700 | + random.shuffle(ids) |
701 | + for _id in ids: |
702 | + try: |
703 | + doc = self.db.get(_id) |
704 | + copies = get_copies(doc) |
705 | + if copies == 3: |
706 | + yield doc |
707 | + else: |
708 | + log.info('Now at copies=%d, skipping %s', copies, _id) |
709 | + except NotFound: |
710 | + log.warning('preempt doc NotFound for %s', _id) |
711 | + |
712 | + def reclaim(self, fs, threshold=MAX_BYTES_FREE): |
713 | count = 0 |
714 | size = 0 |
715 | t = TimeDelta() |
716 | @@ -936,7 +1102,7 @@ |
717 | t.log('reclaim %s in %r', count_and_size(count, size), fs) |
718 | return (count, size) |
719 | |
720 | - def reclaim_all(self, threshold=RECLAIM_BYTES): |
721 | + def reclaim_all(self, threshold=MAX_BYTES_FREE): |
722 | try: |
723 | count = 0 |
724 | size = 0 |
725 | |
726 | === modified file 'dmedia/tests/test_core.py' |
727 | --- dmedia/tests/test_core.py 2013-10-24 02:59:18 +0000 |
728 | +++ dmedia/tests/test_core.py 2013-11-21 03:22:50 +0000 |
729 | @@ -230,6 +230,150 @@ |
730 | }) |
731 | |
732 | |
733 | +class TestVigilanceMocked(TestCase): |
734 | + def test_up_rank(self): |
735 | + class Mocked(core.Vigilance): |
736 | + def __init__(self, local, remote): |
737 | + self.local = frozenset(local) |
738 | + self.remote = frozenset(remote) |
739 | + self._calls = [] |
740 | + |
741 | + def up_rank_by_verifying(self, doc, downgraded): |
742 | + self._calls.extend(('verify', doc, downgraded)) |
743 | + return doc |
744 | + |
745 | + def up_rank_by_copying(self, doc, free, threshold): |
746 | + self._calls.extend(('copy', doc, free, threshold)) |
747 | + return doc |
748 | + |
749 | + def up_rank_by_downloading(self, doc, remote, threshold): |
750 | + self._calls.extend(('download', doc, remote, threshold)) |
751 | + return doc |
752 | + |
753 | + local = tuple(random_id() for i in range(2)) |
754 | + remote = tuple(random_id() for i in range(2)) |
755 | + mocked = Mocked(local, remote) |
756 | + |
757 | + # Verify, one local: |
758 | + doc = { |
759 | + 'stored': { |
760 | + local[0]: {'copies': 0}, |
761 | + }, |
762 | + } |
763 | + self.assertIs(mocked.up_rank(doc, 17), doc) |
764 | + self.assertEqual(mocked._calls, |
765 | + ['verify', doc, {local[0]}] |
766 | + ) |
767 | + |
768 | + # Verify, two local: |
769 | + doc = { |
770 | + 'stored': { |
771 | + local[0]: {'copies': 0}, |
772 | + local[1]: {'copies': 1}, |
773 | + }, |
774 | + } |
775 | + mocked._calls.clear() |
776 | + self.assertIs(mocked.up_rank(doc, 17), doc) |
777 | + self.assertEqual(mocked._calls, |
778 | + ['verify', doc, {local[0]}] |
779 | + ) |
780 | + |
781 | + # Verify, one local, one remote: |
782 | + doc = { |
783 | + 'stored': { |
784 | + local[0]: {'copies': 0}, |
785 | + remote[0]: {'copies': 1}, |
786 | + }, |
787 | + } |
788 | + mocked._calls.clear() |
789 | + self.assertIs(mocked.up_rank(doc, 17), doc) |
790 | + self.assertEqual(mocked._calls, |
791 | + ['verify', doc, {local[0]}] |
792 | + ) |
793 | + |
794 | + # Copy, one local, one remote: |
795 | + doc = { |
796 | + 'stored': { |
797 | + local[0]: {'copies': 1}, |
798 | + remote[0]: {'copies': 1}, |
799 | + }, |
800 | + } |
801 | + mocked._calls.clear() |
802 | + self.assertIs(mocked.up_rank(doc, 17), doc) |
803 | + self.assertEqual(mocked._calls, |
804 | + ['copy', doc, {local[1]}, 17] |
805 | + ) |
806 | + |
807 | + # Copy, two local, one remote: |
808 | + doc = { |
809 | + 'stored': { |
810 | + local[0]: {'copies': 1}, |
811 | + local[1]: {'copies': 1}, |
812 | + remote[0]: {'copies': 1}, |
813 | + }, |
814 | + } |
815 | + mocked._calls.clear() |
816 | + self.assertIsNone(mocked.up_rank(doc, 17)) |
817 | + self.assertEqual(mocked._calls, []) |
818 | + |
819 | + # Download, one remote: |
820 | + doc = { |
821 | + 'stored': { |
822 | + remote[0]: {'copies': 0}, |
823 | + }, |
824 | + } |
825 | + mocked._calls.clear() |
826 | + self.assertIs(mocked.up_rank(doc, 17), doc) |
827 | + self.assertEqual(mocked._calls, |
828 | + ['download', doc, {remote[0]}, 17] |
829 | + ) |
830 | + |
831 | + # Download, two remote: |
832 | + doc = { |
833 | + 'stored': { |
834 | + remote[0]: {'copies': 0}, |
835 | + remote[1]: {'copies': 1}, |
836 | + }, |
837 | + } |
838 | + mocked._calls.clear() |
839 | + self.assertIs(mocked.up_rank(doc, 17), doc) |
840 | + self.assertEqual(mocked._calls, |
841 | + ['download', doc, set(remote), 17] |
842 | + ) |
843 | + |
844 | + # Available in neither local nor remote: |
845 | + doc = { |
846 | + 'stored': { |
847 | + random_id(): {'copies': 0}, |
848 | + random_id(): {'copies': 1}, |
849 | + }, |
850 | + } |
851 | + mocked._calls.clear() |
852 | + self.assertIsNone(mocked.up_rank(doc, 17)) |
853 | + self.assertEqual(mocked._calls, []) |
854 | + |
855 | + # Empty doc['stored']: |
856 | + doc = {'stored': {}} |
857 | + mocked._calls.clear() |
858 | + self.assertIsNone(mocked.up_rank(doc, 17)) |
859 | + self.assertEqual(mocked._calls, []) |
860 | + |
861 | + |
862 | +class TestVigilance(CouchCase): |
863 | + def test_init(self): |
864 | + db = util.get_db(self.env, True) |
865 | + ms = MetaStore(db) |
866 | + inst = core.Vigilance(ms, None) |
867 | + self.assertIs(inst.ms, ms) |
868 | + self.assertIsInstance(inst.stores, LocalStores) |
869 | + self.assertIsInstance(inst.local, frozenset) |
870 | + self.assertEqual(inst.local, frozenset()) |
871 | + self.assertIsInstance(inst.remote, frozenset) |
872 | + self.assertEqual(inst.remote, frozenset()) |
873 | + self.assertEqual(inst.clients, {}) |
874 | + self.assertEqual(inst.store_to_peer, {}) |
875 | + |
876 | + |
877 | class TestTaskQueue(TestCase): |
878 | def test_init(self): |
879 | tq = core.TaskQueue() |
880 | |
881 | === modified file 'dmedia/tests/test_local.py' |
882 | --- dmedia/tests/test_local.py 2013-08-25 20:30:25 +0000 |
883 | +++ dmedia/tests/test_local.py 2013-11-21 03:22:50 +0000 |
884 | @@ -181,6 +181,43 @@ |
885 | self.assertEqual(inst.fast, set()) |
886 | self.assertEqual(inst.slow, set()) |
887 | |
888 | + def test_filter_by_avail(self): |
889 | + # FIXME: Improve this once we can mock FileStore.statvfs() |
890 | + inst = local.LocalStores() |
891 | + |
892 | + # Empty |
893 | + self.assertEqual(inst.filter_by_avail({}, 1, 1, 1), []) |
894 | + free = {random_id(), random_id()} |
895 | + self.assertEqual(inst.filter_by_avail(free, 1, 1, 1), []) |
896 | + |
897 | + # One FileStore |
898 | + fs1 = TempFileStore() |
899 | + inst.add(fs1) |
900 | + self.assertEqual(inst.filter_by_avail({}, 1, 1, 1), []) |
901 | + free = {random_id(), random_id()} |
902 | + self.assertEqual(inst.filter_by_avail(free, 1, 1, 1), []) |
903 | + free = {fs1.id, random_id(), random_id()} |
904 | + self.assertEqual(inst.filter_by_avail(free, 1, 1, 1), [fs1]) |
905 | + |
906 | + # Two FileStore |
907 | + fs2 = TempFileStore() |
908 | + inst.add(fs2) |
909 | + self.assertEqual(inst.filter_by_avail({}, 1, 1, 1), []) |
910 | + free = {random_id(), random_id()} |
911 | + self.assertEqual(inst.filter_by_avail(free, 1, 1, 1), []) |
912 | + free = {fs1.id, random_id(), random_id()} |
913 | + self.assertEqual(inst.filter_by_avail(free, 1, 1, 1), [fs1]) |
914 | + free = {fs2.id, random_id(), random_id()} |
915 | + self.assertEqual(inst.filter_by_avail(free, 1, 1, 1), [fs2]) |
916 | + |
917 | + def test_find_dst_store(self): |
918 | + # FIXME: Improve this once we can mock FileStore.statvfs() |
919 | + inst = local.LocalStores() |
920 | + self.assertIsNone(inst.find_dst_store(1, 1)) |
921 | + fs = TempFileStore() |
922 | + inst.add(fs) |
923 | + self.assertIs(inst.find_dst_store(1, 1), fs) |
924 | + |
925 | def test_local_stores(self): |
926 | fs1 = TempFileStore(copies=1) |
927 | fs2 = TempFileStore(copies=0) |
928 | |
929 | === modified file 'dmedia/tests/test_metastore.py' |
930 | --- dmedia/tests/test_metastore.py 2013-10-30 03:26:10 +0000 |
931 | +++ dmedia/tests/test_metastore.py 2013-11-21 03:22:50 +0000 |
932 | @@ -44,11 +44,91 @@ |
933 | from dmedia import util, schema, metastore |
934 | from dmedia.metastore import create_stored, get_mtime |
935 | from dmedia.constants import TYPE_ERROR |
936 | +from dmedia.units import bytes10 |
937 | |
938 | |
939 | random = SystemRandom() |
940 | |
941 | |
942 | +def doc_id(doc): |
943 | + """ |
944 | + Used as key function for sorted by doc['_id']. |
945 | + """ |
946 | + return doc['_id'] |
947 | + |
948 | + |
949 | +def build_stored_at_rank(rank, store_ids): |
950 | + """ |
951 | + Build doc['stored'] for a specific rank. |
952 | + |
953 | + For example: |
954 | + |
955 | + >>> store_ids = ( |
956 | + ... '333333333333333333333333', |
957 | + ... 'AAAAAAAAAAAAAAAAAAAAAAAA', |
958 | + ... 'YYYYYYYYYYYYYYYYYYYYYYYY', |
959 | + ... ) |
960 | + >>> build_stored_at_rank(0, store_ids) |
961 | + {} |
962 | + >>> build_stored_at_rank(1, store_ids) |
963 | + {'333333333333333333333333': {'copies': 0}} |
964 | + >>> build_stored_at_rank(2, store_ids) |
965 | + {'333333333333333333333333': {'copies': 1}} |
966 | + |
967 | + """ |
968 | + assert isinstance(rank, int) |
969 | + assert 0 <= rank <= 6 |
970 | + assert isinstance(store_ids, tuple) |
971 | + assert len(store_ids) == 3 |
972 | + for _id in store_ids: |
973 | + assert isinstance(_id, str) and len(_id) == 24 and isdb32(_id) |
974 | + if rank == 0: |
975 | + return {} |
976 | + if rank == 1: |
977 | + return { |
978 | + store_ids[0]: {'copies': 0}, |
979 | + } |
980 | + if rank == 2: |
981 | + return { |
982 | + store_ids[0]: {'copies': 1}, |
983 | + } |
984 | + if rank == 3: |
985 | + return { |
986 | + store_ids[0]: {'copies': 1}, |
987 | + store_ids[1]: {'copies': 0}, |
988 | + } |
989 | + if rank == 4: |
990 | + return { |
991 | + store_ids[0]: {'copies': 1}, |
992 | + store_ids[1]: {'copies': 1}, |
993 | + } |
994 | + if rank == 5: |
995 | + return { |
996 | + store_ids[0]: {'copies': 1}, |
997 | + store_ids[1]: {'copies': 1}, |
998 | + store_ids[2]: {'copies': 0}, |
999 | + } |
1000 | + if rank == 6: |
1001 | + return { |
1002 | + store_ids[0]: {'copies': 1}, |
1003 | + store_ids[1]: {'copies': 1}, |
1004 | + store_ids[2]: {'copies': 1}, |
1005 | + } |
1006 | + raise Exception('should not have reached this point') |
1007 | + |
1008 | + |
1009 | +def build_file_at_rank(_id, rank, store_ids): |
1010 | + assert isinstance(_id, str) and len(_id) == 48 and isdb32(_id) |
1011 | + doc = { |
1012 | + '_id': _id, |
1013 | + 'type': 'dmedia/file', |
1014 | + 'origin': 'user', |
1015 | + 'stored': build_stored_at_rank(rank, store_ids), |
1016 | + } |
1017 | + assert metastore.get_rank(doc) == rank |
1018 | + return doc |
1019 | + |
1020 | + |
1021 | def create_random_file(fs, db): |
1022 | tmp_fp = fs.allocate_tmp() |
1023 | ch = write_random(tmp_fp) |
1024 | @@ -196,7 +276,28 @@ |
1025 | self.assertTrue( |
1026 | parent // 4 <= metastore.VERIFY_BY_VERIFIED <= parent // 2 |
1027 | ) |
1028 | - self.assertGreater(metastore.VERIFY_BY_VERIFIED, metastore.VERIFY_BY_MTIME) |
1029 | + self.assertGreater(metastore.VERIFY_BY_VERIFIED, metastore.VERIFY_BY_MTIME) |
1030 | + |
1031 | + def test_GB(self): |
1032 | + self.assertIsInstance(metastore.GB, int) |
1033 | + self.assertEqual(metastore.GB, 10 ** 9) |
1034 | + self.assertEqual(metastore.GB, 1000 ** 3) |
1035 | + self.assertEqual(bytes10(metastore.GB), '1 GB') |
1036 | + |
1037 | + def test_MIN_BYTES_FREE(self): |
1038 | + self.assertIsInstance(metastore.MIN_BYTES_FREE, int) |
1039 | + self.assertGreaterEqual(metastore.MIN_BYTES_FREE, metastore.GB) |
1040 | + self.assertEqual(metastore.MIN_BYTES_FREE % metastore.GB, 0) |
1041 | + self.assertEqual(bytes10(metastore.MIN_BYTES_FREE), '16 GB') |
1042 | + |
1043 | + def test_MAX_BYTES_FREE(self): |
1044 | + self.assertIsInstance(metastore.MAX_BYTES_FREE, int) |
1045 | + self.assertGreaterEqual(metastore.MAX_BYTES_FREE, metastore.GB) |
1046 | + self.assertEqual(metastore.MAX_BYTES_FREE % metastore.GB, 0) |
1047 | + self.assertGreaterEqual(metastore.MAX_BYTES_FREE, |
1048 | + 2 * metastore.MIN_BYTES_FREE |
1049 | + ) |
1050 | + self.assertEqual(bytes10(metastore.MAX_BYTES_FREE), '64 GB') |
1051 | |
1052 | |
1053 | class TestFunctions(TestCase): |
1054 | @@ -243,6 +344,123 @@ |
1055 | self.assertEqual(doc, {'foo': {'bar': 0, 'baz': 1}}) |
1056 | self.assertIs(doc['foo'], ret) |
1057 | |
1058 | + def test_get_int(self): |
1059 | + # Bad `d` type: |
1060 | + bad = [random_id(), random_id()] |
1061 | + with self.assertRaises(TypeError) as cm: |
1062 | + metastore.get_int(bad, random_id()) |
1063 | + self.assertEqual( |
1064 | + str(cm.exception), |
1065 | + TYPE_ERROR.format('d', dict, list, bad) |
1066 | + ) |
1067 | + |
1068 | + # Bad `key` type: |
1069 | + bad = random.randint(0, 1000) |
1070 | + with self.assertRaises(TypeError) as cm: |
1071 | + metastore.get_int({}, bad) |
1072 | + self.assertEqual( |
1073 | + str(cm.exception), |
1074 | + TYPE_ERROR.format('key', str, int, bad) |
1075 | + ) |
1076 | + |
1077 | + # Empty: |
1078 | + doc = {} |
1079 | + ret = metastore.get_int(doc, 'foo') |
1080 | + self.assertIsInstance(ret, int) |
1081 | + self.assertEqual(ret, 0) |
1082 | + self.assertEqual(doc, {'foo': 0}) |
1083 | + self.assertIs(doc['foo'], ret) |
1084 | + |
1085 | + # Wrong type: |
1086 | + doc = {'foo': '17'} |
1087 | + ret = metastore.get_int(doc, 'foo') |
1088 | + self.assertIsInstance(ret, int) |
1089 | + self.assertEqual(ret, 0) |
1090 | + self.assertEqual(doc, {'foo': 0}) |
1091 | + self.assertIs(doc['foo'], ret) |
1092 | + |
1093 | + # Trickier wrong type: |
1094 | + doc = {'foo': 17.0} |
1095 | + ret = metastore.get_int(doc, 'foo') |
1096 | + self.assertIsInstance(ret, int) |
1097 | + self.assertEqual(ret, 0) |
1098 | + self.assertEqual(doc, {'foo': 0}) |
1099 | + self.assertIs(doc['foo'], ret) |
1100 | + |
1101 | + # Bad Value: |
1102 | + doc = {'foo': -17} |
1103 | + ret = metastore.get_int(doc, 'foo') |
1104 | + self.assertIsInstance(ret, int) |
1105 | + self.assertEqual(ret, 0) |
1106 | + self.assertEqual(doc, {'foo': 0}) |
1107 | + self.assertIs(doc['foo'], ret) |
1108 | + |
1109 | + # Another bad Value: |
1110 | + doc = {'foo': -1} |
1111 | + ret = metastore.get_int(doc, 'foo') |
1112 | + self.assertIsInstance(ret, int) |
1113 | + self.assertEqual(ret, 0) |
1114 | + self.assertEqual(doc, {'foo': 0}) |
1115 | + self.assertIs(doc['foo'], ret) |
1116 | + |
1117 | + # All good: |
1118 | + value = 17 |
1119 | + doc = {'foo': value} |
1120 | + self.assertIs(metastore.get_int(doc, 'foo'), value) |
1121 | + self.assertEqual(doc, {'foo': 17}) |
1122 | + self.assertIs(doc['foo'], value) |
1123 | + |
1124 | + # Also all good: |
1125 | + value = 0 |
1126 | + doc = {'foo': value} |
1127 | + self.assertIs(metastore.get_int(doc, 'foo'), value) |
1128 | + self.assertEqual(doc, {'foo': 0}) |
1129 | + self.assertIs(doc['foo'], value) |
1130 | + |
1131 | + def test_get_rank(self): |
1132 | + file_id = random_file_id() |
1133 | + store_ids = tuple(random_id() for i in range(3)) |
1134 | + for rank in range(7): |
1135 | + doc = build_file_at_rank(file_id, rank, store_ids) |
1136 | + self.assertEqual(metastore.get_rank(doc), rank) |
1137 | + |
1138 | + # Empty doc |
1139 | + doc = {} |
1140 | + self.assertEqual(metastore.get_rank(doc), 0) |
1141 | + self.assertEqual(doc, {'stored': {}}) |
1142 | + |
1143 | + # Empty doc['stored'] |
1144 | + doc = {'stored': {}} |
1145 | + self.assertEqual(metastore.get_rank(doc), 0) |
1146 | + self.assertEqual(doc, {'stored': {}}) |
1147 | + |
1148 | + # All kinds of broken: |
1149 | + store_ids = tuple(random_id() for i in range(6)) |
1150 | + doc = { |
1151 | + 'stored': { |
1152 | + store_ids[0]: {'copies': 1}, |
1153 | + store_ids[1]: {'copies': '17'}, |
1154 | + store_ids[2]: {'copies': -18}, |
1155 | + store_ids[3]: {}, |
1156 | + store_ids[4]: 'hello', |
1157 | + store_ids[5]: 3, |
1158 | + random_id(10): {'copies': 1}, |
1159 | + ('a' * 24): {'copies': 1}, |
1160 | + 42: {'copies': 1}, |
1161 | + }, |
1162 | + } |
1163 | + self.assertEqual(metastore.get_rank(doc), 4) |
1164 | + self.assertEqual(doc, { |
1165 | + 'stored': { |
1166 | + store_ids[0]: {'copies': 1}, |
1167 | + store_ids[1]: {'copies': 0}, |
1168 | + store_ids[2]: {'copies': 0}, |
1169 | + store_ids[3]: {'copies': 0}, |
1170 | + store_ids[4]: {'copies': 0}, |
1171 | + store_ids[5]: {'copies': 0}, |
1172 | + }, |
1173 | + }) |
1174 | + |
1175 | def test_get_mtime(self): |
1176 | fs = TempFileStore() |
1177 | _id = random_file_id() |
1178 | @@ -2791,6 +3009,7 @@ |
1179 | |
1180 | doc = db.get(fs.id) |
1181 | self.assertTrue(doc['_rev'].startswith('2-')) |
1182 | + self.assertIn('bytes_avail', doc) |
1183 | atime = doc.get('atime') |
1184 | self.assertIsInstance(atime, int) |
1185 | self.assertLessEqual(atime, int(time.time())) |
1186 | @@ -3643,59 +3862,93 @@ |
1187 | }, |
1188 | }) |
1189 | |
1190 | - def test_iter_fragile(self): |
1191 | + def test_iter_files_at_rank(self): |
1192 | db = util.get_db(self.env, True) |
1193 | ms = metastore.MetaStore(db) |
1194 | |
1195 | + # Bad rank type: |
1196 | + with self.assertRaises(TypeError) as cm: |
1197 | + list(ms.iter_files_at_rank(1.0)) |
1198 | + self.assertEqual(str(cm.exception), |
1199 | + TYPE_ERROR.format('rank', int, float, 1.0) |
1200 | + ) |
1201 | + |
1202 | + # Bad rank value: |
1203 | + with self.assertRaises(ValueError) as cm: |
1204 | + list(ms.iter_files_at_rank(-1)) |
1205 | + self.assertEqual(str(cm.exception), 'Need 0 <= rank <= 5; got -1') |
1206 | + with self.assertRaises(ValueError) as cm: |
1207 | + list(ms.iter_files_at_rank(6)) |
1208 | + self.assertEqual(str(cm.exception), 'Need 0 <= rank <= 5; got 6') |
1209 | + |
1210 | # Test when no files are in the library: |
1211 | - self.assertEqual(list(ms.iter_fragile()), []) |
1212 | + self.assertEqual(list(ms.iter_files_at_rank(0)), []) |
1213 | + self.assertEqual(list(ms.iter_files_at_rank(1)), []) |
1214 | + self.assertEqual(list(ms.iter_files_at_rank(2)), []) |
1215 | + self.assertEqual(list(ms.iter_files_at_rank(3)), []) |
1216 | + self.assertEqual(list(ms.iter_files_at_rank(4)), []) |
1217 | + self.assertEqual(list(ms.iter_files_at_rank(5)), []) |
1218 | |
1219 | - # Create rank=(0 through 6) test data: |
1220 | - ids = tuple(random_file_id() for i in range(7)) |
1221 | + # Create rank=(0 through 5) test data: |
1222 | stores = tuple(random_id() for i in range(3)) |
1223 | - docs = [ |
1224 | + docs_0 = [ |
1225 | { |
1226 | - '_id': ids[0], |
1227 | + '_id': random_file_id(), |
1228 | 'type': 'dmedia/file', |
1229 | 'origin': 'user', |
1230 | 'stored': {}, |
1231 | - }, |
1232 | + } |
1233 | + for i in range(100) |
1234 | + ] |
1235 | + docs_1 = [ |
1236 | { |
1237 | - '_id': ids[1], |
1238 | + '_id': random_file_id(), |
1239 | 'type': 'dmedia/file', |
1240 | 'origin': 'user', |
1241 | 'stored': { |
1242 | stores[0]: {'copies': 0}, |
1243 | }, |
1244 | - }, |
1245 | + } |
1246 | + for i in range(101) |
1247 | + ] |
1248 | + docs_2 = [ |
1249 | { |
1250 | - '_id': ids[2], |
1251 | + '_id': random_file_id(), |
1252 | 'type': 'dmedia/file', |
1253 | 'origin': 'user', |
1254 | 'stored': { |
1255 | stores[0]: {'copies': 1}, |
1256 | }, |
1257 | - }, |
1258 | + } |
1259 | + for i in range(102) |
1260 | + ] |
1261 | + docs_3 = [ |
1262 | { |
1263 | - '_id': ids[3], |
1264 | + '_id': random_file_id(), |
1265 | 'type': 'dmedia/file', |
1266 | 'origin': 'user', |
1267 | 'stored': { |
1268 | stores[0]: {'copies': 1}, |
1269 | stores[1]: {'copies': 0}, |
1270 | }, |
1271 | - }, |
1272 | + } |
1273 | + for i in range(103) |
1274 | + ] |
1275 | + docs_4 = [ |
1276 | { |
1277 | - '_id': ids[4], |
1278 | + '_id': random_file_id(), |
1279 | 'type': 'dmedia/file', |
1280 | 'origin': 'user', |
1281 | 'stored': { |
1282 | stores[0]: {'copies': 1}, |
1283 | stores[1]: {'copies': 1}, |
1284 | }, |
1285 | - }, |
1286 | + } |
1287 | + for i in range(104) |
1288 | + ] |
1289 | + docs_5 = [ |
1290 | { |
1291 | - '_id': ids[5], |
1292 | + '_id': random_file_id(), |
1293 | 'type': 'dmedia/file', |
1294 | 'origin': 'user', |
1295 | 'stored': { |
1296 | @@ -3703,9 +3956,181 @@ |
1297 | stores[1]: {'copies': 1}, |
1298 | stores[2]: {'copies': 0}, |
1299 | }, |
1300 | - }, |
1301 | + } |
1302 | + for i in range(105) |
1303 | + ] |
1304 | + docs = [] |
1305 | + doc_groups = (docs_0, docs_1, docs_2, docs_3, docs_4, docs_5) |
1306 | + for docs_n in doc_groups: |
1307 | + docs.extend(docs_n) |
1308 | + docs_n.sort(key=doc_id) |
1309 | + self.assertEqual(len(docs), 615) |
1310 | + db.save_many(docs) |
1311 | + |
1312 | + # Test that for each rank, we get the expected docs and no duplicates: |
1313 | + for (n, docs_n) in enumerate(doc_groups): |
1314 | + result = list(ms.iter_files_at_rank(n)) |
1315 | + self.assertEqual(len(result), 100 + n) |
1316 | + self.assertNotEqual(result, docs_n) # Due to random.shuffle() |
1317 | + self.assertEqual(sorted(result, key=doc_id), docs_n) |
1318 | + |
1319 | + # Similar to above, except this time we're modifying the docs as they're |
1320 | + # yielded so they're bumped up to rank=6 in the file/rank view: |
1321 | + self.assertEqual(len(doc_groups), 6) |
1322 | + self.assertEqual(db.view('file', 'rank', key=6)['rows'], []) |
1323 | + for (n, docs_n) in enumerate(doc_groups): |
1324 | + result = [] |
1325 | + for doc in ms.iter_files_at_rank(n): |
1326 | + result.append(doc) |
1327 | + new = deepcopy(doc) |
1328 | + new['stored'] = { |
1329 | + stores[0]: {'copies': 1}, |
1330 | + stores[1]: {'copies': 1}, |
1331 | + stores[2]: {'copies': 1}, |
1332 | + } |
1333 | + db.save(new) |
1334 | + self.assertEqual(len(result), 100 + n) |
1335 | + self.assertNotEqual(result, docs_n) # Due to random.shuffle() |
1336 | + self.assertEqual(sorted(result, key=doc_id), docs_n) |
1337 | + self.assertEqual(list(ms.iter_files_at_rank(n)), []) |
1338 | + |
1339 | + # Double check that rank 0 through 5 are still returning no docs: |
1340 | + self.assertEqual(list(ms.iter_files_at_rank(0)), []) |
1341 | + self.assertEqual(list(ms.iter_files_at_rank(1)), []) |
1342 | + self.assertEqual(list(ms.iter_files_at_rank(2)), []) |
1343 | + self.assertEqual(list(ms.iter_files_at_rank(3)), []) |
1344 | + self.assertEqual(list(ms.iter_files_at_rank(4)), []) |
1345 | + self.assertEqual(list(ms.iter_files_at_rank(5)), []) |
1346 | + |
1347 | + # And check that all the docs are still at rank=6 and _rev=2: |
1348 | + ids = sorted(d['_id'] for d in docs) |
1349 | + rows = db.view('file', 'rank', key=6)['rows'] |
1350 | + self.assertEqual(len(rows), 615) |
1351 | + self.assertEqual([r['id'] for r in rows], ids) |
1352 | + for doc in db.get_many(ids): |
1353 | + self.assertEqual(doc['_rev'][:2], '2-') |
1354 | + |
1355 | + def test_iter_files_at_rank_2(self): |
1356 | + """ |
1357 | + Ensure that get_rank() is used to filter out greater than current rank. |
1358 | + """ |
1359 | + db = util.get_db(self.env, True) |
1360 | + ms = metastore.MetaStore(db) |
1361 | + for rank in range(6): |
1362 | + ids = tuple(random_file_id() for i in range(50)) |
1363 | + store_ids = tuple(random_id() for i in range(3)) |
1364 | + docs = [build_file_at_rank(_id, rank, store_ids) for _id in ids] |
1365 | + db.save_many(docs) |
1366 | + dmap = dict((d['_id'], d) for d in docs) |
1367 | + docs.sort(key=doc_id) |
1368 | + |
1369 | + # Test with no modification: |
1370 | + result = list(ms.iter_files_at_rank(rank)) |
1371 | + self.assertEqual(len(result), 50) |
1372 | + self.assertNotEqual(result, docs) # Due to random.shuffle() |
1373 | + self.assertEqual(sorted(result, key=doc_id), docs) |
1374 | + |
1375 | + # Adjust 17 files to rank+1 after the first doc is yielded: |
1376 | + include = None |
1377 | + result = [] |
1378 | + for doc in ms.iter_files_at_rank(rank): |
1379 | + result.append(doc) |
1380 | + if include is None: |
1381 | + include = {doc['_id']} |
1382 | + remaining = set(ids) - include |
1383 | + remove = random.sample(remaining, 17) |
1384 | + include.update(remaining - set(remove)) |
1385 | + rdocs = [dmap[_id] for _id in remove] |
1386 | + for rdoc in rdocs: |
1387 | + rdoc['stored'] = build_stored_at_rank(rank + 1, store_ids) |
1388 | + self.assertEqual(metastore.get_rank(rdoc), rank + 1) |
1389 | + db.save_many(rdocs) |
1390 | + expected = [dmap[_id] for _id in include] |
1391 | + expected.sort(key=doc_id) |
1392 | + self.assertEqual(len(expected), 33) |
1393 | + self.assertEqual(len(result), 33) |
1394 | + self.assertNotEqual(result, expected) # Due to random.shuffle() |
1395 | + self.assertEqual(sorted(result, key=doc_id), expected) |
1396 | + |
1397 | + # Now check rank+1, unless we're at rank=5: |
1398 | + if rank < 5: |
1399 | + rdocs.sort(key=doc_id) |
1400 | + result = list(ms.iter_files_at_rank(rank + 1)) |
1401 | + self.assertEqual(len(result), 17) |
1402 | + self.assertNotEqual(result, rdocs) # Due to random.shuffle() |
1403 | + self.assertEqual(sorted(result, key=doc_id), rdocs) |
1404 | + # Clean up for rank+1: |
1405 | + for rdoc in rdocs: |
1406 | + rdoc['_deleted'] = True |
1407 | + db.save_many(rdocs) |
1408 | + self.assertEqual(list(ms.iter_files_at_rank(rank + 1)), []) |
1409 | + |
1410 | + def test_iter_fragile_files(self): |
1411 | + db = util.get_db(self.env, True) |
1412 | + |
1413 | + # Test with a mocked MetaStore.iter_files_at_rank(): |
1414 | + class Mocked(metastore.MetaStore): |
1415 | + def __init__(self, db, log_db=None): |
1416 | + super().__init__(db, log_db) |
1417 | + self._calls = [] |
1418 | + self._ranks = tuple( |
1419 | + tuple(random_id() for i in range(25)) |
1420 | + for rank in range(6) |
1421 | + ) |
1422 | + |
1423 | + def iter_files_at_rank(self, rank): |
1424 | + assert isinstance(rank, int) |
1425 | + assert 0 <= rank <= 5 |
1426 | + self._calls.append(rank) |
1427 | + for _id in self._ranks[rank]: |
1428 | + yield _id |
1429 | + |
1430 | + mocked = Mocked(db) |
1431 | + expected = [] |
1432 | + for ids in mocked._ranks: |
1433 | + expected.extend(ids) |
1434 | + self.assertEqual(list(mocked.iter_fragile_files()), expected) |
1435 | + self.assertEqual(mocked._calls, [0, 1, 2, 3, 4, 5]) |
1436 | + |
1437 | + # Now do a live test: |
1438 | + ms = metastore.MetaStore(db) |
1439 | + self.assertEqual(list(ms.iter_fragile_files()), []) |
1440 | + store_ids = tuple(random_id() for i in range(3)) |
1441 | + docs = [ |
1442 | + build_file_at_rank(random_file_id(), rank, store_ids) |
1443 | + for rank in range(7) |
1444 | + ] |
1445 | + db.save_many(docs) |
1446 | + self.assertEqual(list(ms.iter_fragile_files()), docs[:-1]) |
1447 | + for doc in docs: |
1448 | + doc['_deleted'] = True |
1449 | + db.save_many(docs) |
1450 | + self.assertEqual(list(ms.iter_fragile_files()), []) |
1451 | + |
1452 | + # Test pushing up through ranks: |
1453 | + docs = [ |
1454 | + build_file_at_rank(random_file_id(), 0, store_ids) |
1455 | + for i in range(100) |
1456 | + ] |
1457 | + db.save_many(docs) |
1458 | + docs.sort(key=doc_id) |
1459 | + for rank in range(6): |
1460 | + result = list(ms.iter_fragile_files()) |
1461 | + self.assertEqual(len(result), 100) |
1462 | + self.assertNotEqual(result, docs) # Due to random.shuffle() |
1463 | + self.assertEqual(sorted(result, key=doc_id), docs) |
1464 | + for doc in docs: |
1465 | + doc['stored'] = build_stored_at_rank(rank + 1, store_ids) |
1466 | + db.save_many(docs) |
1467 | + |
1468 | + def test_wait_for_fragile_files(self): |
1469 | + db = util.get_db(self.env, True) |
1470 | + ms = metastore.MetaStore(db) |
1471 | + |
1472 | + stores = tuple(random_id() for i in range(3)) |
1473 | + docs = [ |
1474 | { |
1475 | - '_id': ids[6], |
1476 | + '_id': random_file_id(), |
1477 | 'type': 'dmedia/file', |
1478 | 'origin': 'user', |
1479 | 'stored': { |
1480 | @@ -3713,104 +4138,57 @@ |
1481 | stores[1]: {'copies': 1}, |
1482 | stores[2]: {'copies': 1}, |
1483 | }, |
1484 | - }, |
1485 | + } |
1486 | + for i in range(4) |
1487 | ] |
1488 | db.save_many(docs) |
1489 | - |
1490 | - # We should get docs[0:6]: |
1491 | - self.assertEqual(list(ms.iter_fragile()), docs[0:-1]) |
1492 | - |
1493 | - # Docs should not be changed: |
1494 | - self.assertEqual(db.get_many(ids), docs) |
1495 | - |
1496 | - def test_iter_actionable_fragile(self): |
1497 | + last_seq = db.get()['update_seq'] |
1498 | + for doc in docs: |
1499 | + del doc['stored'][stores[0]] |
1500 | + db.save(doc) |
1501 | + result = ms.wait_for_fragile_files(last_seq) |
1502 | + self.assertEqual(result, { |
1503 | + 'last_seq': last_seq + 1, |
1504 | + 'results': [ |
1505 | + { |
1506 | + 'changes': [{'rev': doc['_rev']}], |
1507 | + 'doc': doc, |
1508 | + 'id': doc['_id'], |
1509 | + 'seq': last_seq + 1, |
1510 | + } |
1511 | + ], |
1512 | + }) |
1513 | + last_seq = result['last_seq'] |
1514 | + |
1515 | + def test_iter_preempt_files(self): |
1516 | db = util.get_db(self.env, True) |
1517 | ms = metastore.MetaStore(db) |
1518 | |
1519 | - store_id1 = random_id() |
1520 | - store_id2 = random_id() |
1521 | - store_id3 = random_id() |
1522 | - store_id4 = random_id() |
1523 | - empty = frozenset() |
1524 | - one = frozenset([store_id1]) |
1525 | - two = frozenset([store_id1, store_id2]) |
1526 | - three = frozenset([store_id1, store_id2, store_id3]) |
1527 | - |
1528 | - id1 = random_file_id() |
1529 | - id2 = random_file_id() |
1530 | - id3 = random_file_id() |
1531 | - doc1 = { |
1532 | - '_id': id1, |
1533 | - 'type': 'dmedia/file', |
1534 | - 'origin': 'user', |
1535 | - 'stored': { |
1536 | - store_id1: {'copies': 0}, |
1537 | - }, |
1538 | - } |
1539 | - doc2 = { |
1540 | - '_id': id2, |
1541 | - 'type': 'dmedia/file', |
1542 | - 'origin': 'user', |
1543 | - 'stored': { |
1544 | - store_id1: {'copies': 0}, |
1545 | - store_id4: {'copies': 1}, |
1546 | - }, |
1547 | - } |
1548 | - doc3 = { |
1549 | - '_id': id3, |
1550 | - 'type': 'dmedia/file', |
1551 | - 'origin': 'user', |
1552 | - 'stored': { |
1553 | - store_id1: {'copies': 1}, |
1554 | - store_id2: {'copies': 1}, |
1555 | - }, |
1556 | - } |
1557 | - |
1558 | - # Test when no files are in the library: |
1559 | - self.assertEqual(list(ms.iter_actionable_fragile(empty)), []) |
1560 | - self.assertEqual(list(ms.iter_actionable_fragile(one)), []) |
1561 | - self.assertEqual(list(ms.iter_actionable_fragile(two)), []) |
1562 | - self.assertEqual(list(ms.iter_actionable_fragile(three)), []) |
1563 | - |
1564 | - # All 3 docs should be included: |
1565 | - db.save_many([doc1, doc2, doc3]) |
1566 | - self.assertEqual(list(ms.iter_actionable_fragile(three)), [ |
1567 | - (doc1, set([store_id1])), |
1568 | - (doc2, set([store_id1, store_id4])), |
1569 | - (doc3, set([store_id1, store_id2])), |
1570 | - ]) |
1571 | - |
1572 | - # If only store_id1, store_id2 are connected, doc3 shouldn't be |
1573 | - # actionable: |
1574 | - self.assertEqual(list(ms.iter_actionable_fragile(two)), [ |
1575 | - (doc1, set([store_id1])), |
1576 | - (doc2, set([store_id1, store_id4])), |
1577 | - ]) |
1578 | - |
1579 | - # All files have a copy in store_id1, so nothing should be returned: |
1580 | - self.assertEqual(list(ms.iter_actionable_fragile(one)), []) |
1581 | - |
1582 | -        # If doc1 was only stored on a non-connected store:
1583 | - doc1['stored'] = { |
1584 | - store_id4: {'copies': 1}, |
1585 | - } |
1586 | - db.save(doc1) |
1587 | - self.assertEqual(list(ms.iter_actionable_fragile(one)), [ |
1588 | - (doc1, set([store_id4])) |
1589 | - ]) |
1590 | - |
1591 | -        # If doc2 has sufficient durability, it shouldn't be included, even though
1592 | - # there is a free drive where a copy could be created: |
1593 | - doc2['stored'] = { |
1594 | - store_id1: {'copies': 1}, |
1595 | - store_id2: {'copies': 1}, |
1596 | - store_id4: {'copies': 1}, |
1597 | - } |
1598 | - db.save(doc2) |
1599 | - self.assertEqual(list(ms.iter_actionable_fragile(three)), [ |
1600 | - (doc1, set([store_id4])), |
1601 | - (doc3, set([store_id1, store_id2])), |
1602 | - ]) |
1603 | +            # When the library is empty:
1604 | + self.assertEqual(list(ms.iter_preempt_files()), []) |
1605 | + |
1606 | + # With live data: |
1607 | + store_ids = tuple(random_id() for i in range(3)) |
1608 | + docs = [ |
1609 | + build_file_at_rank(random_file_id(), 6, store_ids) |
1610 | + for i in range(107) |
1611 | + ] |
1612 | + base = int(time.time()) |
1613 | + for (i, doc) in enumerate(docs): |
1614 | + doc['atime'] = base - i |
1615 | + db.save_many(docs) |
1616 | + expected = docs[0:100] |
1617 | + result = list(ms.iter_preempt_files()) |
1618 | + self.assertEqual(len(result), 100) |
1619 | + self.assertNotEqual(result, expected) # Due to random.shuffle() |
1620 | + result.sort(key=lambda d: d['atime'], reverse=True) |
1621 | + self.assertEqual(result, expected) |
1622 | + |
1623 | + # Make sure files are excluded when durability isn't 3: |
1624 | + for doc in docs: |
1625 | + doc['stored'] = build_stored_at_rank(5, store_ids) |
1626 | + db.save_many(docs) |
1627 | + self.assertEqual(list(ms.iter_preempt_files()), []) |
1628 | |
1629 | def test_reclaim(self): |
1630 | # FIXME: Till we have a nice way of mocking FileStore.statvfs(), this is |
1631 | |
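The iter_files_at_rank() tests above pin down a page-resumption pattern over the file/rank view: within a single key, rows sort by doc _id, so iteration can resume strictly after the last _id processed even while docs are moving between ranks. A minimal sketch of that pattern, assuming the microfiber-style db.view() keyword arguments the tests use; page_size is a hypothetical parameter, and this is only a sketch of the query shape, not the branch's actual implementation:

    import random

    def iter_files_at_rank(db, rank, page_size=50):
        # Page through the file/rank view for one rank, resuming each
        # request strictly after the last doc _id already yielded.
        kw = {'key': rank, 'limit': page_size, 'include_docs': True}
        while True:
            rows = db.view('file', 'rank', **kw)['rows']
            if not rows:
                break
            docs = [row['doc'] for row in rows]
            random.shuffle(docs)  # the shuffle the tests above assert
            for doc in docs:
                yield doc
            kw['startkey_docid'] = rows[-1]['id']
            kw['skip'] = 1  # don't re-yield the row we resumed from

The skip=1 plus startkey_docid combination is the standard CouchDB idiom for resuming within a single key without re-reading the boundary row.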
1632 | === modified file 'dmedia/tests/test_views.py' |
1633 | --- dmedia/tests/test_views.py 2013-10-12 07:19:16 +0000 |
1634 | +++ dmedia/tests/test_views.py 2013-11-21 03:22:50 +0000 |
1635 | @@ -1253,6 +1253,155 @@ |
1636 | {'rows': [], 'offset': 0, 'total_rows': 0}, |
1637 | ) |
1638 | |
1639 | + ###################################################################### |
1640 | + # 3rd, test assumptions about how "startkey_docid" and "key" interact: |
1641 | + db = Database('baz', self.env) |
1642 | + db.put(None) |
1643 | + design = self.build_view('rank') |
1644 | + db.save(design) |
1645 | + |
1646 | + # Create rank=(0 through 6) test data: |
1647 | + ranks = tuple( |
1648 | + tuple(random_file_id() for j in range(17 + i)) |
1649 | + for i in range(7) |
1650 | + ) |
1651 | + sorted_ranks = tuple( |
1652 | + tuple(sorted(rank_n)) for rank_n in ranks |
1653 | + ) |
1654 | + flattened = [] |
1655 | + for (n, rank_n) in enumerate(sorted_ranks): |
1656 | + flattened.extend((n, _id) for _id in rank_n) |
1657 | + flattened = tuple(flattened) |
1658 | + self.assertEqual(len(flattened), 140) |
1659 | + |
1660 | + stores = tuple(random_id() for i in range(3)) |
1661 | + docs = [] |
1662 | + docs.extend( |
1663 | + { |
1664 | + '_id': _id, |
1665 | + 'type': 'dmedia/file', |
1666 | + 'origin': 'user', |
1667 | + 'stored': {}, |
1668 | + } |
1669 | + for _id in ranks[0] |
1670 | + ) |
1671 | + docs.extend( |
1672 | + { |
1673 | + '_id': _id, |
1674 | + 'type': 'dmedia/file', |
1675 | + 'origin': 'user', |
1676 | + 'stored': { |
1677 | + stores[0]: {'copies': 0}, |
1678 | + }, |
1679 | + } |
1680 | + for _id in ranks[1] |
1681 | + ) |
1682 | + docs.extend( |
1683 | + { |
1684 | + '_id': _id, |
1685 | + 'type': 'dmedia/file', |
1686 | + 'origin': 'user', |
1687 | + 'stored': { |
1688 | + stores[0]: {'copies': 1}, |
1689 | + }, |
1690 | + } |
1691 | + for _id in ranks[2] |
1692 | + ) |
1693 | + docs.extend( |
1694 | + { |
1695 | + '_id': _id, |
1696 | + 'type': 'dmedia/file', |
1697 | + 'origin': 'user', |
1698 | + 'stored': { |
1699 | + stores[0]: {'copies': 1}, |
1700 | + stores[1]: {'copies': 0}, |
1701 | + }, |
1702 | + } |
1703 | + for _id in ranks[3] |
1704 | + ) |
1705 | + docs.extend( |
1706 | + { |
1707 | + '_id': _id, |
1708 | + 'type': 'dmedia/file', |
1709 | + 'origin': 'user', |
1710 | + 'stored': { |
1711 | + stores[0]: {'copies': 1}, |
1712 | + stores[1]: {'copies': 1}, |
1713 | + }, |
1714 | + } |
1715 | + for _id in ranks[4] |
1716 | + ) |
1717 | + docs.extend( |
1718 | + { |
1719 | + '_id': _id, |
1720 | + 'type': 'dmedia/file', |
1721 | + 'origin': 'user', |
1722 | + 'stored': { |
1723 | + stores[0]: {'copies': 1}, |
1724 | + stores[1]: {'copies': 1}, |
1725 | + stores[2]: {'copies': 0}, |
1726 | + }, |
1727 | + } |
1728 | + for _id in ranks[5] |
1729 | + ) |
1730 | + docs.extend( |
1731 | + { |
1732 | + '_id': _id, |
1733 | + 'type': 'dmedia/file', |
1734 | + 'origin': 'user', |
1735 | + 'stored': { |
1736 | + stores[0]: {'copies': 1}, |
1737 | + stores[1]: {'copies': 1}, |
1738 | + stores[2]: {'copies': 1}, |
1739 | + }, |
1740 | + } |
1741 | + for _id in ranks[6] |
1742 | + ) |
1743 | + self.assertEqual(len(docs), 140) |
1744 | + db.save_many(docs) |
1745 | + |
1746 | + # Test that sorting is being done by (key, _id): |
1747 | + self.assertEqual( |
1748 | + db.view('file', 'rank'), |
1749 | + { |
1750 | + 'offset': 0, |
1751 | + 'total_rows': 140, |
1752 | + 'rows': [ |
1753 | + {'key': n, 'id': _id, 'value': None} |
1754 | + for (n, _id) in flattened |
1755 | + ], |
1756 | + }, |
1757 | + ) |
1758 | + |
1759 | + # Test that sorting is being done by _id within a single key, then test |
1760 | + # that we can use "startkey_docid" as expected: |
1761 | + offset = 0 |
1762 | + for (n, rank_n) in enumerate(sorted_ranks): |
1763 | + self.assertEqual( |
1764 | + db.view('file', 'rank', key=n), |
1765 | + { |
1766 | + 'offset': offset, |
1767 | + 'total_rows': 140, |
1768 | + 'rows': [ |
1769 | + {'key': n, 'id': _id, 'value': None} |
1770 | + for _id in rank_n |
1771 | + ], |
1772 | + }, |
1773 | + ) |
1774 | + for i in range(len(rank_n)): |
1775 | + self.assertEqual( |
1776 | + db.view('file', 'rank', key=n, startkey_docid=rank_n[i]), |
1777 | + { |
1778 | + 'offset': offset + i, |
1779 | + 'total_rows': 140, |
1780 | + 'rows': [ |
1781 | + {'key': n, 'id': _id, 'value': None} |
1782 | + for _id in rank_n[i:] |
1783 | + ], |
1784 | + }, |
1785 | + ) |
1786 | + offset += len(rank_n) |
1787 | + |
1788 | def test_fragile(self): |
1789 | db = Database('foo', self.env) |
1790 | db.put(None) |
1791 | |
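The seven 'stored' shapes above trace ranks 0 through 6 and encode the rank formula the file/rank view and metastore.get_rank() must agree on: one point for each store a file is on, plus one for each durable copy claimed. Read straight off these fixtures (an inference from the test data, not a quote of dmedia/metastore.py):

    def get_rank(doc):
        # Inferred from the rank 0..6 fixtures above: rank is the
        # number of stores holding the file plus its total claimed
        # copies, with each term capped at 3.
        stored = doc.get('stored')
        if not isinstance(stored, dict):
            return 0
        copies = 0
        for value in stored.values():
            n = value.get('copies') if isinstance(value, dict) else None
            if isinstance(n, int) and n > 0:
                copies += n
        return min(3, len(stored)) + min(3, copies)

Under this reading, rank=6 means on 3 stores with 3 good copies, which is exactly the population the file/preempt view below selects from.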
1792 | === modified file 'dmedia/views.py' |
1793 | --- dmedia/views.py 2013-10-12 07:19:16 +0000 |
1794 | +++ dmedia/views.py 2013-11-21 03:22:50 +0000 |
1795 | @@ -110,6 +110,10 @@ |
1796 | file_copies = """ |
1797 | function(doc) { |
1798 | if (doc.type == 'dmedia/file' && doc.origin == 'user') { |
1799 | + if (typeof doc.stored != 'object' || isArray(doc.stored)) { |
1800 | + emit(0, null); |
1801 | + return; |
1802 | + } |
1803 | var total = 0; |
1804 | var key, copies; |
1805 | for (key in doc.stored) { |
1806 | @@ -146,6 +150,27 @@ |
1807 | } |
1808 | """ |
1809 | |
1810 | +file_preempt = """ |
1811 | +function(doc) { |
1812 | + if (doc.type == 'dmedia/file' && doc.origin == 'user') { |
1813 | + if (typeof doc.stored != 'object' || isArray(doc.stored)) { |
1814 | + return; |
1815 | + } |
1816 | + var total = 0; |
1817 | + var key, copies; |
1818 | + for (key in doc.stored) { |
1819 | + copies = doc.stored[key].copies; |
1820 | + if (typeof copies == 'number' && copies > 0) { |
1821 | + total += copies; |
1822 | + } |
1823 | + } |
1824 | + if (total == 3) { |
1825 | + emit(doc.atime, null); |
1826 | + } |
1827 | + } |
1828 | +} |
1829 | +""" |
1830 | + |
1831 | file_fragile = """ |
1832 | function(doc) { |
1833 | if (doc.type == 'dmedia/file' && doc.origin == 'user') { |
1834 | @@ -302,7 +327,8 @@ |
1835 | 'stored': {'map': file_stored, 'reduce': _stats}, |
1836 | 'nonzero': {'map': file_nonzero}, |
1837 | 'copies': {'map': file_copies}, |
1838 | - 'rank': {'map': file_rank}, |
1839 | + 'rank': {'map': file_rank, 'reduce': _count}, |
1840 | + 'preempt': {'map': file_preempt}, |
1841 | 'fragile': {'map': file_fragile}, |
1842 | 'downgrade-by-mtime': {'map': file_downgrade_by_mtime}, |
1843 | 'downgrade-by-verified': {'map': file_downgrade_by_verified}, |
1844 | @@ -332,6 +358,16 @@ |
1845 | } |
1846 | """ |
1847 | |
1848 | +store_bytes_avail = """ |
1849 | +function(doc) { |
1850 | + if (doc.type == 'dmedia/store') { |
1851 | + if (typeof doc.bytes_avail == 'number') { |
1852 | + emit(doc.bytes_avail, null); |
1853 | + } |
1854 | + } |
1855 | +} |
1856 | +""" |
1857 | + |
1858 | store_drive_serial = """ |
1859 | function(doc) { |
1860 | if (doc.type == 'dmedia/store') { |
1861 | @@ -344,6 +380,7 @@ |
1862 | '_id': '_design/store', |
1863 | 'views': { |
1864 | 'atime': {'map': store_atime}, |
1865 | + 'bytes_avail': {'map': store_bytes_avail}, |
1866 | 'drive_serial': {'map': store_drive_serial}, |
1867 | }, |
1868 | } |
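Together with test_iter_preempt_files() above, the new file/preempt view gives the preemptive copy-increasing pass a cheap most-recently-used query over files already at durability 3: the map function emits doc.atime, so a descending read with a limit returns the freshest candidates first. A hedged sketch of that query shape (limit=100 matches the test; the branch's real MetaStore.iter_preempt_files() is only constrained by these tests up to the shuffle they assert):

    import random

    def iter_preempt_files(db, limit=100):
        # file/preempt emits doc.atime only when total copies == 3,
        # so descending order yields the most recently accessed files.
        rows = db.view('file', 'preempt',
                       descending=True, limit=limit, include_docs=True)['rows']
        docs = [row['doc'] for row in rows]
        random.shuffle(docs)  # the shuffle the tests above assert
        return docs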
Looks good, let's work on completely resolving https://bugs.launchpad.net/bugs/1247530 now!