The directive configures the maximum number of requests that can be
served through one connection kept in the cache. Once a connection
reaches the configured number of requests, it is no longer saved to
the cache. The default is 100.
Much like keepalive_requests for client connections, this is mostly
a safeguard to make sure connections are closed periodically and the
memory allocated from the connection pool is freed.
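For illustration, a configuration along these lines limits each cached
connection to 100 requests (the upstream name, backend address, and
cache size are placeholders, not taken from the change itself):

    upstream backend {
        server 127.0.0.1:8080;

        keepalive 16;             # cache up to 16 idle connections per worker
        keepalive_requests 100;   # stop caching a connection after 100 requests
    }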
The directive configures the maximum time a connection can be kept in
the cache. By configuring a time smaller than the corresponding timeout
on the backend side, one can avoid the race between the backend closing
a connection and nginx trying to use the same connection to send a
request at the same time.
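As a sketch, with a backend whose own keepalive timeout is 75 seconds,
a smaller value on the nginx side avoids the race (the address and
numbers are illustrative assumptions):

    upstream backend {
        server 127.0.0.1:8080;

        keepalive 16;
        keepalive_timeout 60s;    # below the backend's 75s keepalive timeout
    }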
If a connection with the read delayed flag set was stored in the keepalive
cache, and a read timer was set on that connection after it was picked
from the cache, the timer was treated as a delay timer rather than the
expected socket read event timer. The latter timeout is usually much
longer than the former, which caused a significant delay in request
processing.
The issue manifested itself with proxy_limit_rate and upstream keepalive
both enabled, and has existed since 973ee2276300 (1.7.7), when
proxy_limit_rate was introduced.
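A configuration combining the two features, and so potentially exhibiting
the issue, might look like this (the address and rate are illustrative):

    upstream backend {
        server 127.0.0.1:8080;
        keepalive 16;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;          # required for upstream keepalive
            proxy_set_header Connection "";  # do not pass "Connection: close"
            proxy_limit_rate 4096;           # rate limiting sets the read delayed flag
        }
    }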
Iterating through all connections takes a lot of CPU time, especially
with a large number of worker connections configured. As a result,
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this, we now only do a full scan for idle connections when
the shutdown signal is received.
Transitions of connections to the idle state are now expected to be
avoided if the ngx_exiting flag is set. The upstream keepalive module
was modified to follow this.
Keeping the ready flag in this case might result in a missed notification
about a broken connection until nginx tried to use it again.
While there, a stale comment about stale events was removed, since this
function can also be called directly.
The c->sent counter is reset to 0 on each request by the server-side
http code, so do the same on the client side. This makes it possible
to count the number of bytes sent in a particular request.
The original idea was to optimize edge cases with interchangeable
backends, i.e., don't establish a new connection if we have any
cached one. This causes more harm than good though, as it screws up
the underlying balancer's idea about which backends are used and may
result in various unexpected problems.