Adds an internal replication backlog metric. In the `_system` endpoint
this is called `internal_replication_jobs`, so I've preserved the name,
though it appears to represent the backlog of changes.
Adding a dependency on mem3 to `couch_prometheus` requires some changes
to the tests and dependency tree:
- `couchdb.app.src` no longer lists a dependency on `couch_prometheus`.
  I don't know why this was needed previously; it doesn't appear to be
  required.
- `couch_prometheus` now has dependencies on `couch` and `mem3`.
  This ensures that `couch_prometheus` doesn't crash if `mem3` isn't
  running, and it also resolves a race condition on startup in which the
  `_prometheus` endpoint returned incomplete stats.
- `couch_prometheus:system_stats_test/0` is moved to
`couch_prometheus_e2e_tests:t_starts_with_couchdb/0`. It is really
an integration test, since it depends on the `_prometheus` endpoint
  being able to collect data for all the metrics, and it tests only
  that the metric names begin with `couchdb_`.
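As a sketch, the application resource file for `couch_prometheus` would list
the new dependencies roughly as follows (illustrative fragment only; the other
fields of the real `.app.src` are elided):

```erlang
%% couch_prometheus.app.src (illustrative fragment)
{application, couch_prometheus, [
    {applications, [
        kernel,
        stdlib,
        couch,  %% new dependency
        mem3    %% new dependency: ensures mem3 is started before us
    ]}
]}.
```

Listing `mem3` under `applications` makes the OTP release handler start it
first, which is what closes the startup race described above.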
feat(prometheus): metrics for individual message queues
The `_prometheus` endpoint today includes size/min/max metrics
across all message queues. This adds a new metric,
`erlang_message_queue_size{queue_name="<name>"}`, which tracks the
size of individual message queues.
This could replace the previous metrics, since those can be derived from
the new metric by Prometheus, but I've left them in place for
compatibility.
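For illustration, the exposition output for the new metric might look like
the following (the queue names and values here are made up, not taken from a
real node):

```text
# TYPE erlang_message_queue_size gauge
erlang_message_queue_size{queue_name="couch_server"} 120
erlang_message_queue_size{queue_name="couch_index_server"} 3
```

With per-queue series like these, the old aggregate size/min/max values can be
recomputed in PromQL with `sum`, `min`, and `max` over `queue_name`.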
SpiderMonkey sometimes throws an `InternalError` when exceeding memory limits,
where normally we'd expect it to crash or exit with a non-zero exit code.
Because we trap exceptions and continue emitting rows, it is possible for
users' views to randomly miss indexed rows, depending on whether GC had run
or on other internal runtime state that may have been consuming more or less
memory at the time.
To prevent the view from continuing to process documents, and from randomly
dropping emitted rows depending on memory pressure in the JS runtime at the
time, we choose to treat `InternalError` as fatal.
After an `InternalError` is raised, we expect the process to exit just as it
would during an OOM.
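The intended behaviour can be sketched as follows. This is a hypothetical
illustration, not the actual couchjs source: `runMap`, `mapFun`, and `log`
are made-up names, and `InternalError` is matched by error name since it is a
SpiderMonkey-specific global.

```javascript
// Hypothetical sketch: run a view's map function so that SpiderMonkey's
// InternalError propagates as fatal, while ordinary exceptions are logged
// and the current document's rows are skipped.
function runMap(mapFun, doc, log) {
  try {
    mapFun(doc);
  } catch (e) {
    // InternalError has no cross-engine constructor, so match by name.
    if (e && e.name === "InternalError") {
      throw e; // fatal: let the process die, as it would on OOM
    }
    log("map function failed: " + e); // non-fatal: skip this doc
  }
}
```

Rethrowing rather than logging means a view can never silently index a
partial result set under memory pressure; the process dies and the indexer
retries instead.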