lp:~measurement-factory/squid/bag9

Created by Alex Rousskov.
Get this branch:
bzr branch lp:~measurement-factory/squid/bag9
Members of Measurement Factory can upload to this branch.

Branch information

Owner:
Measurement Factory
Project:
Squid
Status:
Development

Recent revisions

13329. By Alex Rousskov

Merged from trunk r13477
to get Collapsed Forwarding fixes.

13328. By Alex Rousskov

Merged from trunk r13474.

13327. By Alex Rousskov

Prep for merge from trunk: undo branch r13313, r13312, and r13311, which
temporarily undid trunk r13266, r13269, and r13270 (std::vector migration).

13326. By Alex Rousskov

Avoid store_client.cc "entry->swap_filen > -1 || entry->swappingOut()" asserts.

A client may hit on an incomplete shared memory cache entry. Such an entry is
fully backed by the shared memory cache, but the copy of its data in local RAM
may be trimmed. When that trimMemory() happens, StoreEntry::storeClientType()
assumes DISK_CLIENT due to the positive inmem_lo, and the store_client
constructor asserts upon discovering that there is no disk backing.
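
The failing interaction can be modeled with a short self-contained C++ sketch;
the types and the inmem_lo check below are simplifications assumed from this
description, not actual Squid code:

    // Minimal model of the assert failure described above.
    #include <cassert>

    enum StoreClientType { MEM_CLIENT, DISK_CLIENT };

    struct StoreEntry {
        long inmem_lo = 0;   // lowest object offset still in local RAM
        int swap_filen = -1; // -1 means no disk backing
        bool swappingOut() const { return false; } // never swapped to disk

        // Assumed simplification: a trimmed local copy (positive inmem_lo)
        // makes storeClientType() conclude the client must read from disk.
        StoreClientType storeClientType() const {
            return inmem_lo > 0 ? DISK_CLIENT : MEM_CLIENT;
        }
    };

    struct store_client {
        explicit store_client(const StoreEntry &entry) {
            if (entry.storeClientType() == DISK_CLIENT)
                // the assert quoted in this commit message
                assert(entry.swap_filen > -1 || entry.swappingOut());
        }
    };

    int main()
    {
        StoreEntry entry;       // backed only by the shared memory cache
        entry.inmem_lo = 4096;  // trimMemory() trimmed the local RAM copy
        store_client client(entry); // aborts: DISK_CLIENT, no disk backing
    }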

To improve shared cache effectiveness for "being cached" entries, we need to
prevent local memory trimming while the shared cache entry is being filled
(possibly by another worker, so this is far from trivial!) or, better, stop
using the local memory for entries feeding off the shared memory cache. The
latter would also require revising the DISK_CLIENT designation to include
entries backed by a shared memory cache.

13325. By Alex Rousskov

Polished to address squid-dev review comments.

13324. By Alex Rousskov

Fixed typo breaking accounting of [forgotten] entries during index build.

Forgotten entries would still contribute to the total entry count and, hence,
cause some entries to be evicted from the cache prematurely.

The typo was introduced in the "Stop wasting 96 RAM bytes per slot" revision
(branch r13321).

13323. By Alex Rousskov

Document counter-intuitive round-robin cache_dir selection bias; decrease it.

Many squid.conf files have at least two groups of cache_dir lines. For
example, rare "large" objects go to larger/slower HDDs while popular "small"
objects go to smaller/faster SSDs:

    # HDDs
    cache_dir rock /hdd1 ... min-size=large
    cache_dir rock /hdd2 ... min-size=large
    cache_dir rock /hdd3 ... min-size=large
    # SSDs
    cache_dir rock /ssd1 ... max-size=large-1
    cache_dir rock /ssd2 ... max-size=large-1
    cache_dir rock /ssd3 ... max-size=large-1
    # rock store does not support least-load yet
    store_dir_select_algorithm round-robin

Since round-robin selects the first suitable disk during a sequential scan,
the probability of /hdd1 (/ssd1) being selected is higher than that of the
other HDDs (SSDs). Consider a large object that needs an HDD: /hdd1 is
selected whenever the scan starts at /ssd1, /ssd2, /ssd3, or /hdd1, while
/hdd2 is selected only when the scan starts at /hdd2 itself! The documentation
now warns against the above cache_dir configuration approach and suggests
interleaving cache_dirs:

    cache_dir rock /hdd1 ... min-size=large
    cache_dir rock /ssd1 ... max-size=large-1
    cache_dir rock /hdd2 ... min-size=large
    cache_dir rock /ssd2 ... max-size=large-1
    cache_dir rock /hdd3 ... min-size=large
    cache_dir rock /ssd3 ... max-size=large-1
    store_dir_select_algorithm round-robin

Squid's historic implementation of round-robin made its natural bias even
worse because it used the last selected dir as the starting point of the
sequential scan. In the first configuration example above, that boosted the
probability that the scan would start at one of the SSDs because smaller
objects are more popular and, hence, their dirs are selected more often. With
the starting point usually at an SSD, even more _large_ objects were sent to
/hdd1 than to the other HDDs! The code change avoids this artificial boost
(but the cache_dir lines should still be interleaved to avoid the natural
round-robin bias discussed earlier, as the simulation below illustrates).
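
For concreteness, here is a small self-contained simulation of that scan (a
model built from the description above, not Squid code). With the grouped
configuration, /hdd1 receives 4/6 of the large objects; interleaving spreads
them evenly:

    // Models round-robin dir selection for "large" objects: the scan starts
    // at a rotating dir and picks the first dir that accepts large objects.
    #include <cstdio>
    #include <vector>

    static std::vector<int> simulate(const std::vector<bool> &takesLarge,
                                     const int trials)
    {
        const int n = static_cast<int>(takesLarge.size());
        std::vector<int> hits(n, 0);
        for (int t = 0; t < trials; ++t) {
            const int start = t % n; // each starting point equally likely
            for (int k = 0; k < n; ++k) {
                const int i = (start + k) % n;
                if (takesLarge[i]) { ++hits[i]; break; } // first suitable dir
            }
        }
        return hits;
    }

    int main()
    {
        const int trials = 6000;
        // grouped order: hdd1 hdd2 hdd3 ssd1 ssd2 ssd3
        const auto grouped =
            simulate({true, true, true, false, false, false}, trials);
        // interleaved order: hdd1 ssd1 hdd2 ssd2 hdd3 ssd3
        const auto inter =
            simulate({true, false, true, false, true, false}, trials);
        printf("grouped:     hdd1=%d hdd2=%d hdd3=%d\n",
               grouped[0], grouped[1], grouped[2]);
        printf("interleaved: hdd1=%d hdd2=%d hdd3=%d\n",
               inter[0], inter[2], inter[4]);
        // prints: grouped:     hdd1=4000 hdd2=1000 hdd3=1000
        //         interleaved: hdd1=2000 hdd2=2000 hdd3=2000
    }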

13322. By Alex Rousskov

Fix malloc corruption from use-after-free in peer_select.cc

same as trunk r13340

13321. By Alex Rousskov

Stop wasting 96 RAM bytes per slot for high-offset slots in large shared caches
with more than 16777216 slots.

Ipc::StoreMap was using the same structure for all db slots. However, slots at
offsets exceeding SwapFilenMax (16777215) could not contain store entry
anchors, so the anchor part of the structure wasted RAM for those slots. This
change splits the single array of StoreMapSlots into two arrays, one storing
StoreMapAnchors and one storing StoreMapSlices. The anchors array is shorter
for caches with more than 16777216 slots.

For example, a StoreMap for a 1TB shared cache with default 16KB slot sizes
(67108864 slots) occupied about 6.5GB of RAM. After this change, the same
information is stored in about 2.0GB because unused anchors are not stored.

32-bit environments were wasting 72 (instead of 96) bytes per high-offset slot.
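
The quoted figures can be reproduced with simple arithmetic. The sketch below
assumes a 96-byte anchor (as stated above) and infers an 8-byte slice from the
quoted totals; the real structure sizes may differ:

    #include <cstdio>
    #include <cstdint>

    int main()
    {
        const uint64_t slots = 67108864;      // 1TB cache / 16KB slot size
        const uint64_t maxAnchors = 16777216; // offsets up to SwapFilenMax
        const uint64_t anchorSize = 96;       // bytes per anchor (see above)
        const uint64_t sliceSize = 8;         // inferred, not from the commit

        // before: one combined StoreMapSlot (anchor + slice) per db slot
        const uint64_t before = slots * (anchorSize + sliceSize);
        // after: anchors only for slots that can anchor a store entry
        const uint64_t after = maxAnchors * anchorSize + slots * sliceSize;

        printf("before: %.1f GB, after: %.1f GB\n",
               before / (1024.0 * 1024 * 1024),
               after / (1024.0 * 1024 * 1024));
        // prints: before: 6.5 GB, after: 2.0 GB
    }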

Also simplified the Ipc::StoreMap API by removing its StoreMapWithExtras part.
The added complexity caused bugs and was not worth saving a few extra lines of
caller code. With the StoreMap storage array split in two, the extras may
belong to either part (although the current code only adds extras to slices),
further complicating the WithExtras part of the StoreMap API. These extras
are now stored in dedicated shared memory segments (*_ex.shm).

Added Ipc::Mem::Segment::Name() function to standardize segment name
formation. TODO: Attempt to convert shm_new/old API to use SBuf instead of
char* to simplify callers, most of which have to form Segment IDs by
concatenating strings.
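
A hypothetical sketch of such a name-forming helper (the actual signature and
formatting rules are defined by Squid's Ipc::Mem::Segment and may differ):

    #include <string>

    namespace Ipc { namespace Mem {
    class Segment {
    public:
        // Builds a segment ID from a base name and a suffix, e.g.
        // Name("squid-cache_mem", "ex") -> "squid-cache_mem_ex".
        // Both the signature and the separator are assumptions.
        static std::string Name(const std::string &base,
                                const std::string &suffix) {
            return base + "_" + suffix;
        }
    };
    }} // namespace Ipc::Mem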

13320. By Alex Rousskov

Allow HITs on entries backed by a shared memory cache only.

A typo in r12501.1.59 "Do not become a store_client for entries that are not
backed by Store" prevented such entries from being used for HITs and possibly
even purged them from the memory cache.

Branch metadata

Branch format:
Branch format 7
Repository format:
Bazaar repository format 2a (needs bzr 1.16 or later)
Stacked on:
lp:~squid/squid/trunk