Created by Brian Aker on 2010-07-15 and last modified on 2018-05-25
Get this branch:
bzr branch lp:memcached

Branch information

Brian Aker
Memcached Mirror

Import details

Import Status: Failed

This branch is an import of the HEAD branch of the Git repository at git://github.com/memcached/memcached.

The import has been suspended because it failed 5 or more times in succession.

Last successful import was on 2018-07-07.

Import started on 2018-07-12 on izar and finished on 2018-07-12, taking 10 seconds
Import started on 2018-07-10 on alnitak and finished on 2018-07-10, taking 20 seconds
Import started on 2018-07-08 on izar and finished on 2018-07-09, taking 20 seconds
Import started on 2018-07-08 on alnitak and finished on 2018-07-08, taking 15 seconds

Recent revisions

1414. By dormando on 2018-05-25

fix sasl tests

apparently I broke them with the unix domain socket update, since you have to
add an environment variable for them to even run.

need to make it work with the unix socket at some point though.

1413. By dormando on 2018-05-25

extstore tests: further tuning of tests

enforce actually waiting for extstore to flush items before moving on, remove
some pacing that was making things worse. The compactor is rescuing the
canary items occasionally.

Detunes the compactor on start, then ramps it up before the
compactor-specific tests. This seems to fix the flakiness, at least enough
that it's been passing in a loop on fast and slow systems for me.

1412. By dormando on 2018-05-24

extstore tests: adjust pacing and sizing

On 32-bit systems the items take a bit less space in a few dimensions, so we
need to fill harder. These also tend to be slower systems, so pace out the
inserts with occasional 1s sleeps, rather than all at once at the end.

1411. By dormando on 2018-05-23

gate arm crc32 behind --enable-arm-crc32

users also need to add CFLAGS="-march=armv8-a+crc" if they have actual
aarch64 platforms.
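
A compile-time gate of this kind might look roughly like the sketch below. It
assumes the ACLE macro __ARM_FEATURE_CRC32 (defined by compilers when the CRC
extension is enabled, for example via -march=armv8-a+crc) and the __crc32cb
intrinsic from <arm_acle.h>. The names crc32c_update and crc32c, and the
bitwise software fallback, are illustrative and are not taken from memcached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* ACLE CRC32 intrinsics, only with +crc enabled */
#endif

/* Hypothetical helper: fold len bytes into a running CRC32C state. */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len) {
    const uint8_t *p = buf;
    assert(buf != NULL || len == 0);
#if defined(__ARM_FEATURE_CRC32)
    /* Hardware path: one CRC32CB instruction per input byte. */
    while (len--)
        crc = __crc32cb(crc, *p++);
#else
    /* Portable fallback: bitwise CRC32C, reflected polynomial 0x82F63B78. */
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
#endif
    return crc;
}

/* Standard CRC32C framing: all-ones initial value and final XOR. */
static uint32_t crc32c(const void *buf, size_t len) {
    return crc32c_update(0xFFFFFFFFu, buf, len) ^ 0xFFFFFFFFu;
}
```

Both paths compute the same checksum, so a build without the flag still works
and only loses the hardware speedup.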

1410. By dormando on 2018-05-23

extstore test: try harder to find missing values

1409. By dormando on 2018-05-23

alignment and 32bit fixes for extstore

Fix memory alignment when reading header data back.

A hardcoded "32" was left in a few places where it should have at least been
a define; it is now properly an offsetof. It is used to skip the dynamic
parts of the item when computing the crc32.
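
The two ideas in this message can be sketched as follows. The demo_item
layout, the STABLE_OFFSET name, and the FNV-1a placeholder checksum are all
assumptions made for illustration; memcached's real item header and its crc32
routine are different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical item header: fields before nkey change at runtime. */
struct demo_item {
    void    *next;      /* dynamic: hash chain pointer */
    uint32_t time;      /* dynamic: last access time */
    uint16_t refcount;  /* dynamic: reference count */
    uint8_t  nkey;      /* stable from here to the end */
    uint8_t  flags;
    char     data[16];
};

/* Derive the skip distance from the layout instead of hardcoding it. */
#define STABLE_OFFSET offsetof(struct demo_item, nkey)

/* Placeholder checksum (FNV-1a) standing in for crc32. */
static uint32_t checksum(const unsigned char *buf, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;
    }
    return h;
}

/* Checksum only the stable tail of the item, skipping dynamic fields. */
static uint32_t item_checksum(const struct demo_item *it) {
    const unsigned char *base = (const unsigned char *)it;
    return checksum(base + STABLE_OFFSET, sizeof(*it) - STABLE_OFFSET);
}

/* Copy a header out of a raw buffer that may be misaligned, rather than
 * casting the buffer pointer to struct demo_item *. */
static struct demo_item load_item(const unsigned char *raw) {
    struct demo_item it;
    memcpy(&it, raw, sizeof(it));
    return it;
}
```

Because the offset follows the struct definition, it stays correct when
pointer size or padding differs between 32-bit and 64-bit builds.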

1408. By dormando on 2018-05-23

crc32c for aarch64 support

Patch authored by @aparida1 on github

portability/compilation/warnings modifications by dormando

1407. By dormando on 2018-05-17

fix deadlock during hash table expansion

The slab rebalancer was paused after the LRU maintainer thread during hash
table expansion. This was mostly fine, except in one case.

If an item is in the COLD LRU when it gets hit, it gets assigned an ACTIVE
bitflag and queued for an asynchronous bump. If the item already has an
ACTIVE bit, it does not queue a second time. Items in other LRUs are bumped
when they hit the tail, but COLD items have to be bumped relatively soon to
avoid accidentally evicting recently active items while under memory pressure.

Items queued for async bumps have an extra refcount.
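
The queue-once behaviour and the extra reference can be modelled with a small
single-threaded sketch. The struct layouts, the ITEM_ACTIVE value, the fixed
queue capacity, and the choice to clear the flag when the bump is processed
are all assumptions of this model, not memcached's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ITEM_ACTIVE 0x1u   /* hypothetical flag value */
#define QUEUE_CAP   8      /* hypothetical queue size */

struct demo_item { unsigned flags; unsigned refcount; };

struct bump_queue {
    struct demo_item *slots[QUEUE_CAP];
    size_t len;
};

/* Hit on a COLD item: mark it ACTIVE and queue it at most once.
 * Returns true only if the item was newly queued. */
static bool cold_hit(struct bump_queue *q, struct demo_item *it) {
    if (it->flags & ITEM_ACTIVE)
        return false;              /* already queued, do not queue again */
    if (q->len == QUEUE_CAP)
        return false;              /* queue full: skip the bump */
    it->flags |= ITEM_ACTIVE;
    it->refcount++;                /* the queue holds its own reference */
    q->slots[q->len++] = it;
    return true;
}

/* LRU maintainer side: process queued bumps and drop the queue's refs.
 * Returns the number of items processed. */
static size_t drain_bumps(struct bump_queue *q) {
    size_t n = q->len;
    for (size_t i = 0; i < n; i++) {
        struct demo_item *it = q->slots[i];
        assert(it->refcount > 0);
        it->flags &= ~ITEM_ACTIVE; /* model choice: clear once bumped */
        it->refcount--;
    }
    q->len = 0;
    return n;
}
```

In this model the extra reference is released only by drain_bumps, so an item
stays pinned for as long as nothing drains the queue.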

The slab page mover cannot move a page until all memory within the page is
cleared and freed. If it runs into an item which is stuck, it will try to
delete it, then wait for all of its references to be removed.

However, if an item is in the bump queue when the LRU thread is stopped,
and the slab page mover is moving the page a queued item happens to be in,
the refcount will never drop to zero as the bump will never happen.

The page mover has no mechanism for "giving up" partway through, so it loops
forever. For most use cases, this degrades the instance but doesn't actually
break much:

- The COLD LRUs will eventually drain to zero.
- Allocations will all become "direct reclaims", which force items into COLD
  and then evict them.
- This will degrade the LRU algorithm and some performance, but if the
  instance usage isn't very high most users may never notice.
- Normally, page moving is rare enough and hash table expansion is rare
  enough that it's an unlikely race.

However, with extstore, page moves tend to happen a lot more frequently.
Also, items are only ever written to extstore from the COLD LRUs, so if they
empty, that process will stop. This makes the race more likely to happen and
its effects more obvious.

The fix is to simply pause the page mover before the LRU thread. The LRU
thread does call the page mover, but it is via trylock() since it normally
expects to progress even if the page mover is busy. The pause of the page
mover won't succeed until it fully completes the current page, so it should
then be clear to pause the LRU thread.
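
The effect of the pause order can be illustrated with a toy, single-threaded
model. It uses no real threads or locks, every name in it is hypothetical,
and the tick limit exists only so the stuck case returns instead of spinning
forever as the real page mover would.

```c
#include <assert.h>
#include <stdbool.h>

struct sim {
    bool lru_running;  /* LRU maintainer drains the bump queue while running */
    int  queued_refs;  /* extra refs the bump queue holds on items in the page */
    bool page_done;    /* page mover has finished its current page */
};

/* One scheduling step of the model. */
static void tick(struct sim *s) {
    if (s->lru_running)
        s->queued_refs = 0;        /* bumps processed, references dropped */
    if (s->queued_refs == 0)
        s->page_done = true;       /* all memory in the page can be freed */
}

/* Pausing the page mover waits for the current page to complete.
 * Returns false if no progress was made within max_ticks. */
static bool pause_page_mover(struct sim *s, int max_ticks) {
    for (int i = 0; i < max_ticks; i++) {
        tick(s);
        if (s->page_done)
            return true;
    }
    return false;
}

static void pause_lru(struct sim *s) { s->lru_running = false; }

/* Old order: LRU thread first, page mover second. */
static bool expand_old_order(struct sim *s) {
    pause_lru(s);
    return pause_page_mover(s, 1000);
}

/* Fixed order: page mover first, LRU thread second. */
static bool expand_fixed_order(struct sim *s) {
    bool ok = pause_page_mover(s, 1000);
    pause_lru(s);
    return ok;
}
```

With one queued reference outstanding, expand_old_order never sees the page
complete, while expand_fixed_order lets the LRU side drain the queue first.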

Further improvements could be made by allowing the page mover to bail after N
unsuccessful loops. I am also considering removing the async LRU bump
process, and letting the LRU crawler handle that instead.

Many thanks to Shashi and Sridhar from Netflix for all of their help in
tracking down this bug!

1406. By Stanislaw Pitucha on 2018-05-09

Add Dockerfile definitions

Add alpine (musl-based) and ubuntu (glibc-based) descriptions for test
images. To run the testapp in either of them, run:

docker-compose run alpine
docker-compose run ubuntu

1405. By Stanislaw Pitucha on 2018-04-12

Fix lru-crawler behaviour

Allow worker threads to poll

Branch metadata

Branch format:
Branch format 7
Repository format:
Bazaar repository format 2a (needs bzr 1.16 or later)
This branch contains Public information 
Everyone can see this information.