couchdb:advance-dbseq-during-indexing-retry

Last commit made on 2021-03-03
Get this branch:
git clone -b advance-dbseq-during-indexing-retry https://git.launchpad.net/couchdb

Branch information

Name:
advance-dbseq-during-indexing-retry
Repository:
lp:couchdb

Recent commits

dbbab64... by Adam Kocoloski <email address hidden>

Require subscribers to wait until indexer finishes

This clause allowed a subscriber to start reading a view as soon as the
indexer made it past the sequence of interest. The trouble with that
approach is the resulting view is not directly related to any snapshot
of the underlying DB. Waiting until the indexer finishes allows us to
provide better semantics, where the view observes a consistent snapshot
of the database at some point in time >= the requested sequence.

In order to see this work through, the view reader should explicitly set
the read version of FDB to match the commit version introduced by the
indexer, to avoid seeing partial results from a follow-on indexing job.
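
The difference between the two readiness rules can be sketched in a few lines
of Python. This is only an illustration of the reasoning above, not the code
in couch_views; `JobState`, `can_read_old`, and `can_read_new` are
hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class JobState:
    """Hypothetical view of an indexing job as seen by a subscriber."""
    indexed_seq: int   # highest DB sequence the indexer has processed so far
    finished: bool     # True once the indexer has committed its final batch


def can_read_old(job: JobState, wanted_seq: int) -> bool:
    # Removed rule: read as soon as the indexer has passed the wanted
    # sequence, even if it is still in the middle of its run.
    return job.indexed_seq >= wanted_seq


def can_read_new(job: JobState, wanted_seq: int) -> bool:
    # New rule: additionally require that the indexer has finished, so the
    # view corresponds to a single snapshot at or after the wanted sequence.
    return job.finished and job.indexed_seq >= wanted_seq
```

For a job that has reached sequence 12 but is still running, a subscriber
waiting on sequence 10 would have been released by the old rule and is held
back by the new one.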

2e40063... by Adam Kocoloski <email address hidden>

Advance DB sequence when retrying index build

This patch moves the retrieval of the DB sequence number into the
transaction function. This sequence number is used to determine when the
indexer has completed its work. If the transaction is aborted and
retries, the sequence will advance to the latest committed sequence
observable by the current read version of FDB acquired by the indexer.

This change has significant implications that are not obvious at first
glance. We assert the following:

* A view should, by default, observe a consistent snapshot of the DB as
  it existed at some point in time

* The only way to observe a consistent snapshot across multiple
  transactions is for the final transaction to process all changes up to
  the latest DB sequence observable from the current read version of
  that transaction

Without this change, an indexer might only observe a subset of the
outstanding document updates during a retry. This subset does not
necessarily correspond to any historical version of the database, as it
will be missing entries for documents that were updated concurrently
with the previous attempt by the indexer to commit its work. So this is
first and foremost a correctness issue.

Moving the acquisition of the DB sequence inside the indexing
transaction does provide a small latency benefit by avoiding an extra
GRV call on ~idle DBs. Unfortunately, it increases tail latencies for
stale=false view queries on DBs with heavy write loads. If the indexer
cannot complete its work within the 5 second period of validity for its
read version, it will need to add all the additional updates committed
during that interval to the set of documents that it needs to process.
This is the price we pay for having well-defined isolation semantics for
the view. Long-lived read versions would help to alleviate this,
ensuring that the view can always be generated against the version of
the DB available at the beginning of the transaction.
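
The correctness argument can be reproduced with a toy model. The sketch below
is plain Python with an in-memory stand-in for the database; it does not use
erlfdb or any CouchDB code, and every name in it (`ToyDb`, `Conflict`,
`build_index`, `concurrent_write`, `demo`) is hypothetical. It assumes a
changes feed that lists each document once, at its latest sequence.

```python
class Conflict(Exception):
    """Models an aborted transaction in this toy; not an erlfdb error."""


class ToyDb:
    """Hypothetical in-memory DB that tracks the latest sequence per document."""

    def __init__(self):
        self.seq = 0
        self.latest = {}  # doc_id -> sequence of its most recent update

    def update(self, doc_id):
        self.seq += 1
        self.latest[doc_id] = self.seq

    def changes(self, since, upto):
        # Like a changes feed: one entry per doc, at its latest sequence.
        return sorted((s, d) for d, s in self.latest.items() if since < s <= upto)


def build_index(db, since, hooks, refresh_target):
    """Return (target_seq, indexed_changes) after retrying on Conflict.

    hooks[i] runs during attempt i and may raise Conflict to force a retry.
    """
    target = db.seq  # read once, outside the retry loop (old behaviour)
    attempt = 0
    while True:
        if refresh_target:
            target = db.seq  # re-read inside each attempt (new behaviour)
        batch = db.changes(since, target)
        try:
            if attempt < len(hooks):
                hooks[attempt](db)
            return target, batch
        except Conflict:
            attempt += 1


def concurrent_write(db):
    # A writer updates doc "a" while the indexer is mid-attempt.
    db.update("a")
    raise Conflict()


def demo(refresh_target):
    db = ToyDb()
    db.update("a")
    db.update("b")
    return build_index(db, 0, [concurrent_write], refresh_target)
```

With a fixed target, `demo(False)` returns `(2, [(2, 'b')])`: document "a" has
moved past the target and is dropped, so the result matches no historical
state of the DB. With the target re-read on retry, `demo(True)` returns
`(3, [(2, 'b'), (3, 'a')])`, which is the DB as of sequence 3.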

650ba28... by Adam Kocoloski <email address hidden>

Relax isolation level when indexer reads from DB (#3393)

* Relax isolation level when indexer reads from DB

This patch causes the indexing subsystem to use snapshot isolation when
reading from the database. This reduces commit conflicts and ensures
the index can make progress even in the case of frequently updated docs.

In the pathological case, a document updated in a fast loop can cause
the indexer to stall out entirely when using serializable reads. Each
successful update of the doc will cause the indexer to fail to commit.
The indexer will retry with a new GRV but the same target DbSeq. In the
meantime, our frequently updated document will have advanced beyond
DbSeq and so the indexer will finish without indexing it in that pass.
This process can be repeated ad infinitum and the document will never
actually show up in a view response.

Snapshot reads are safe for this use case precisely because we do have
the _changes feed, and we can always be assured that a concurrent doc
update will show up again later in the feed.
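
The effect of snapshot reads on commit conflicts can be illustrated with a
small model of optimistic concurrency control. This is a simplified sketch in
Python, not FoundationDB or erlfdb code; `ToyStore` and `ToyTx` are
hypothetical. The one property it relies on is that a serializable read
registers a read conflict on the key, while a snapshot read does not.

```python
class ToyStore:
    """Hypothetical versioned key-value store."""

    def __init__(self):
        self.version = 0
        self.data = {}
        self.last_write = {}  # key -> version of its most recent write

    def write(self, key, value):
        self.version += 1
        self.data[key] = value
        self.last_write[key] = self.version


class ToyTx:
    """Hypothetical transaction with a fixed read version."""

    def __init__(self, store):
        self.store = store
        self.read_version = store.version
        self.view = dict(store.data)  # contents as of the read version
        self.read_conflicts = set()

    def get(self, key, snapshot=False):
        # A serializable read records a conflict key; a snapshot read does not.
        if not snapshot:
            self.read_conflicts.add(key)
        return self.view.get(key)

    def can_commit(self):
        # Commit fails if any conflict key was written after our read version.
        return all(
            self.store.last_write.get(key, 0) <= self.read_version
            for key in self.read_conflicts
        )
```

If a document is rewritten after both transactions have read it, only the one
that used a serializable read is forced to retry. The snapshot reader commits
with the older value, and the newer update is picked up from the changes feed
on a later pass.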

* Bump erlfdb version

Needed to pull in fix for snapshot range reads.

fac13a3... by Nick Vatamaniuc <email address hidden>

Fix badmatch in couch_views_indexer

Previously, when an erlfdb error occurred and a recursive call to `update/3` was
made, the result of that call was always matched against `{Mrst, State}`.
However, when the call had finalized and returned the
`couch_eval:release_map_context/1` response, the result would be `ok`, which
would fail with a badmatch error against `{Mrst, State}`.

04086e6... by Bessenyei Balázs Donát <email address hidden>

Make session elixir test more robust

6822fe4... by Bessenyei Balázs Donát <email address hidden>

Set default nodes in dev/run to 1

a9e0ebe... by Robert Newson <email address hidden>

Merge pull request #3386 from apache/ebtree-lookup-opt

Optimize lookup/3

45d4039... by Robert Newson <email address hidden>

Optimize lookup/3

A tidier version of https://github.com/apache/couchdb/pull/3384 that
saves an unnecessary call to collate.

3d4a827... by Robert Newson <email address hidden>

Merge pull request #3384 from apache/ebtree-lookup-collate-eq

use collate in lookup

ec4b213... by Paul J. Davis

Fix ebtree:lookup_multi/3

If one of the provided lookup keys doesn't exist in the ebtree, it can
inadvertently prevent a second lookup key from being found if the
first key greater than the missing lookup key is equal to the second
lookup key.
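
The hazard is easiest to see on a flat sorted list. The sketch below is a
Python illustration of a multi-key lookup that handles a missing key
correctly; it is not the ebtree implementation, and the `lookup_multi`
function here is a hypothetical stand-in that works on a list of
`(key, value)` pairs rather than on tree nodes.

```python
def lookup_multi(sorted_items, lookup_keys):
    """Return the (key, value) pairs whose key appears in lookup_keys.

    sorted_items must be sorted by key. Missing lookup keys are skipped
    without consuming the entry that follows them.
    """
    found = []
    keys = sorted(lookup_keys)
    i = 0
    for key, value in sorted_items:
        # Discard lookup keys smaller than this entry: they are not present.
        # The entry itself is NOT skipped, so it can still match the next key.
        while i < len(keys) and keys[i] < key:
            i += 1
        if i == len(keys):
            break
        if keys[i] == key:
            found.append((key, value))
            i += 1
    return found
```

Looking up `[2, 3]` in entries keyed 1, 3, and 5 is the case described above:
key 2 is missing, and the first entry greater than it is 3, which is also the
second lookup key. The sketch still returns the entry for 3 because a missing
key only advances the lookup-key cursor, never the entry cursor.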