lp:~measurement-factory/squid/peer-idle-pool

Created by Alex Rousskov and last modified
Get this branch:
bzr branch lp:~measurement-factory/squid/peer-idle-pool

Branch information

Owner:
Measurement Factory
Project:
Squid
Status:
Development

Recent revisions

12748. By Alex Rousskov

Added secure peer support to the standby connection pool feature
(cache_peer ... ssl standby=N).

Supply a fake HTTP OPTIONS request to getOutgoingAddress() and
GetMarkingsToServer() to make their ACLs happier. The request is also needed
for Ssl::PeerConnector's ErrorState generation code.

Polished/finalized Ssl::PeerConnector callback answer API.

Removed double-negotiateSsl() call from Ssl::PeerConnector. The old bug did
not manifest itself except under valgrind tests for some reason.

12747. By Alex Rousskov

Moved Comm*Params class pre-declarations to comm/forward.h
so that headers of Comm-using classes do not have to drag CommCalls.h in.

12746. By Alex Rousskov

Initial cache_peer standby=N implementation. No SSL peer support yet.

The feature focus is to instantly provide a ready-to-use connection to a
cooperating cache peer, virtually at all times. This is useful when connection
establishment is "too slow" and/or when infrequent peer use prevents Squid from
combating slow connection establishment with the regular idle connection pool.

The feature is similar to Squid2 idle=N feature, but there are key differences:

* Standby connections are available virtually at all times, while Squid2 unused
  "idle" connections are available only for a short time after a peer request.

* Not all N standby connections are opened at once, reducing the chance of
  the feature being mistaken for a DoS attack on a peer.

* More consistent support for peers with multiple IP addresses (peer IPs are
  cycled through, just like during regular Squid request forwarding).
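
The last two differences can be illustrated with a minimal sketch. It is not
the branch's code: the StandbyOpener class and its members are hypothetical
names, and real connection opening is replaced by simple bookkeeping. The
sketch opens at most one standby connection per step and rotates through the
peer's addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only: none of these names come from Squid.
class StandbyOpener {
public:
    StandbyOpener(std::vector<std::string> addresses, std::size_t limit)
        : addresses_(std::move(addresses)), limit_(limit) {}

    // Opens at most one standby connection per call and returns the
    // address used, or an empty string if the pool is already full
    // or the peer has no known addresses.
    std::string openOne() {
        if (opened_.size() >= limit_ || addresses_.empty())
            return std::string();
        const std::string addr = addresses_[next_];
        next_ = (next_ + 1) % addresses_.size(); // cycle through peer IPs
        opened_.push_back(addr);
        return addr;
    }

    std::size_t openCount() const { return opened_.size(); }

private:
    std::vector<std::string> addresses_;
    std::size_t limit_;              // N from standby=N
    std::size_t next_ = 0;           // index of the next address to try
    std::vector<std::string> opened_; // addresses of "open" connections
};
```

Calling openOne() from a periodic event, rather than in a loop, is one way to
spread the N connection attempts over time.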

Besides, "idle" is a poor choice of adjective for an unused connection pool
name because the same term already describes previously used persistent
connections, which have somewhat different properties, are stored in a
different pool, may need a distinct set of tuning options, etc. It is better to use a dedicated term for
the new feature.

The relationship between the max-conn limit and standby/idle connections is a
complex one. After several rewrites and tests, Squid now obeys max-conn limit
when opening new standby connections and accounts for standby connections when
checking whether to allow peer use. This often works OK, but leads to standby
guarantee violations when non-standby connections approach the limit. The
alternative design where standby code ignores max-conn works better, but is
really difficult to explain and advocate because an admin expects max-conn to
cover all connections and because of the idle connections accounting and
maintenance bugs. We may come back to this when the idle connections code is
fixed.
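
The accounting described above can be sketched as follows. This is a
simplified model, not the branch's code: the PeerCounters structure and both
functions are hypothetical, and treating max-conn of 0 as "unlimited" is an
assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical counters; the real peer state in Squid is organized differently.
struct PeerCounters {
    int maxConn;      // max-conn limit; 0 is assumed to mean "unlimited"
    int standbyLimit; // N from standby=N
    int standbyOpen;  // currently open standby connections
    int otherOpen;    // all other open connections to the peer
};

// How many more standby connections may be opened without exceeding max-conn.
int standbyToOpen(const PeerCounters &p) {
    const int wanted = std::max(0, p.standbyLimit - p.standbyOpen);
    if (p.maxConn <= 0)
        return wanted;
    const int room = std::max(0, p.maxConn - (p.standbyOpen + p.otherOpen));
    return std::min(wanted, room);
}

// Whether another connection to the peer fits under max-conn.
// Standby connections count toward the limit in this model.
bool peerHasRoom(const PeerCounters &p) {
    return p.maxConn <= 0 || p.standbyOpen + p.otherOpen < p.maxConn;
}
```

With max-conn=10, standby=4, one open standby connection, and eight other
connections, standbyToOpen() returns 1 rather than 3: the standby guarantee
is violated because non-standby connections are close to the limit.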

Fixed max-conn documentation and XXXed a peerHTTPOkay() bug (now in
peerHasConnAvailable()) that results in max-conn limit preventing the use of a
peer with idle persistent connections.

Decided to use standby connections for non-retriable requests. Avoiding
standby connections for POSTs and such would violate the main purpose of the
feature: providing an instant ready-to-use connection. A user does not care
whether they are waiting too long for a GET or POST request. Actually, a user may
care more when their POST requests are delayed (because canceling and
retrying them is often scary from the user's point of view). The idea behind
standby connections is that the admin is responsible for avoiding race
conditions by properly configuring the peering Squids. If such proper
configuration is not possible or the consequences of rare races (e.g., due to
peer shutdown) are more severe than the consequences of slow requests, the
admin should not use standby=N. This choice may become configurable in the
future.

A standby pool uses a full-blown PconnPool object for storage instead of
the smaller IdleConnList that the ICAP code uses. The primary reasons for
this design were:

* A peer may have multiple addresses and those addresses may change. PconnPool
has code to deal with multiple addresses while IdleConnList does not. I do not
think this difference is really used in this implementation, but I did not
want to face an unknown limitation. Note that ICAP does not support multiple
ICAP server addresses.

* PconnPool has reporting (and cache manager integration) code that we should
eventually improve and use to report standby-specific stats. When this
happens, PconnPool will probably become abstract and spawn two kids for pconn
and standby pools.

Seemingly unrelated changes triggered by standby=N addition:

* Removed PconnPool from fde.h. We used to create immortal PconnPool objects.
Now, standby pools are destroyed when their peer is destroyed. Sharing raw
pointers to such pools is too dangerous. We could use smart pointers, but
PconnPools do not really belong to such a low-level object like fde IMO.

* Added FwdState::closeServerConnection() to encapsulate server connection
closing code, including the new noteUses() maintenance. Also updated
FwdState::serverClosed() to do the same maintenance.

* Encapsulated commonly reused ToS and NfMark code into GetMarkingsToServer().
May need more work in FwdState where some similar-but-different code remains.

* Close all connections in IdleConnList upon deletion. The old code did
not care because we never deleted PconnPools (although I am not sure
there were no bugs related to ICAP service pools which use IdleConnList
directly and do get destroyed).

* Fixed PconnPool::dumpHash(). It was listing the first entry twice because
the code misused misnamed hash_next().

* Removed unnecessary hard-coded limit on the number of PconnPools. Use
std::set for their storage.

* Fixed very stale PconnPool::pop() documentation and polished its code.
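
The std::set storage for pools that can now be destroyed might look roughly
like the sketch below. It is an illustration under assumptions, not the
branch's code: the Pool class and its registry() helper are hypothetical
stand-ins for PconnPool and its bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for a pool class; not Squid's PconnPool.
class Pool {
public:
    Pool() { registry().insert(this); }  // register on creation
    ~Pool() { registry().erase(this); }  // unregister on destruction

    // Copying would register the same state twice, so forbid it.
    Pool(const Pool &) = delete;
    Pool &operator=(const Pool &) = delete;

    // Number of pools currently alive; no hard-coded upper bound.
    static std::size_t count() { return registry().size(); }

private:
    // Function-local static avoids initialization order problems.
    static std::set<Pool*> &registry() {
        static std::set<Pool*> pools;
        return pools;
    }
};
```

Because each pool removes itself from the set in its destructor, reporting
code that walks the set never sees a dangling pointer to a pool whose peer
was destroyed.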

12745. By Alex Rousskov

Added RegisteredRunner::sync() method to use during Squid reconfiguration.

The existing run() method and destructor are great for the initial
configuration and final shutdown, but do not work well for reconfiguration
when you do not want to completely destroy and then recreate the state.
The sync() method (called via SyncRegistered) can be used for that.

Eventually, the reconfiguration API should present the old "saved" config
and the new "current" config to RegisteredRunners so that they can update
their modules/features intelligently. For now, they just see the new config.
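
A minimal sketch of the idea follows. The Runner interface is a simplified,
hypothetical model of the description above and may differ from the actual
RegisteredRunner declaration; CounterModule and syncAll() are invented for
illustration.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical interface modeled on the description above.
class Runner {
public:
    virtual ~Runner() {}       // final shutdown
    virtual void run() = 0;    // initial configuration
    virtual void sync() {}     // reconfiguration; default: nothing to update
};

// Hypothetical module that keeps its accumulated state across reconfiguration.
class CounterModule : public Runner {
public:
    explicit CounterModule(const int *configuredLimit)
        : limitSource_(configuredLimit) {}

    void run() override { limit_ = *limitSource_; hits_ = 0; }
    void sync() override { limit_ = *limitSource_; } // hits_ is preserved

    void noteHit() { ++hits_; }
    int limit() const { return limit_; }
    int hits() const { return hits_; }

private:
    const int *limitSource_; // points at the "current" configuration value
    int limit_ = 0;
    int hits_ = 0;
};

// Hypothetical analogue of calling sync() on every registered runner.
void syncAll(const std::vector<Runner*> &runners) {
    for (Runner *r : runners)
        r->sync();
}
```

Here sync() only rereads the new limit, while a destroy-and-recreate cycle
would also have reset the hit counter.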

12744. By Alex Rousskov

Added SSL_OP_NO_TICKET configuration option to disable TLS session tickets
and, hence, allow the use of Squid's SMP-shared SSL session cache.

TLS session tickets do not always work as intended. For example, we suspect
that a TLS ticket generated by one SMP worker cannot be used by another worker
to resume an SSL session because the two workers may use different random
number generators to encrypt/decrypt the ticket. And if TLS tickets are sent
but not used, the session resumption using Squid's shared session cache does
not happen.

More work is needed to fully understand why session resumption using TLS
session tickets does not always work (and make it work if possible).

12743. By Alex Rousskov

Set cap_net_admin capability when Squid sets TOS/Diffserv packet values.

In capabilities-capable environments (e.g., Linux with libcap), CAP_NET_ADMIN
capability is required to honor clientside_tos and tcp_outgoing_tos
directives. The code was setting that capability when Netfilter marks or
tproxy was enabled, but missed the clientside_tos and tcp_outgoing_tos cases.
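
The corrected decision can be sketched as a single predicate. The QosConfig
structure and the function name are hypothetical; Squid stores these settings
differently, and the actual capability is set through libcap, which this
sketch does not call.

```cpp
#include <cassert>

// Hypothetical configuration flags; not Squid's real configuration layout.
struct QosConfig {
    bool tproxy;          // tproxy interception is enabled
    bool netfilterMarks;  // Netfilter marks are configured
    bool clientsideTos;   // clientside_tos is configured
    bool tcpOutgoingTos;  // tcp_outgoing_tos is configured
};

// Per the description above, only the first two conditions were checked
// before this change; the last two are the newly covered cases.
bool needsNetAdminCapability(const QosConfig &c) {
    return c.tproxy || c.netfilterMarks || c.clientsideTos || c.tcpOutgoingTos;
}
```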

12742. By Alex Rousskov

Replace blocking sleep(3) and close UDS socket on failures
as the first step towards improving kid registration.

Same as trunk r13116.

12741. By Alex Rousskov

Merged from shared-ssl-sessions r12733
to fix the shm segment name for the shared SSL session cache.

12740. By Alex Rousskov

Merged from collapsed-fwd (r12587) to get initial Collapsed Forwarding
support and Large Rock/Store fixes.

12739. By Alex Rousskov

Merged from trunk r12948.

Branch metadata

Branch format:
Branch format 7
Repository format:
Bazaar repository format 2a (needs bzr 1.16 or later)
Stacked on:
lp:~squid/squid/trunk