The following config caused a segmentation fault because conf->file was not
properly set if "ssl on" was inherited from the http level:
    http {
        ssl on;

        server {
        }
    }
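A minimal sketch of the kind of fix involved, assuming the inheritance is
handled in the srv_conf merge handler of ngx_http_ssl_module (the actual
change may be placed elsewhere; only the conf->file/conf->line handling is
shown):

    static char *
    ngx_http_ssl_merge_srv_conf(ngx_conf_t *cf, void *parent, void *child)
    {
        ngx_http_ssl_srv_conf_t  *prev = parent;
        ngx_http_ssl_srv_conf_t  *conf = child;

        ngx_conf_merge_value(conf->enable, prev->enable, 0);

        if (conf->enable && conf->file == NULL) {
            /* "ssl on" was inherited from the http level: inherit the
             * file/line where it was set as well, so that later error
             * reporting does not dereference a NULL conf->file */
            conf->file = prev->file;
            conf->line = prev->line;
        }

        /* ... rest of the merge is unchanged ... */

        return NGX_CONF_OK;
    }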
If possible, we now just extend an already present file buffer in the p->out
chain instead of keeping an ngx_buf_t for each buffer we've flushed to disk.
This saves about 120 bytes of memory per buffer flushed to disk, and resolves
high CPU usage observed in edge cases (due to coalescing these buffers on
send).
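Roughly, in ngx_event_pipe_write_chain_to_temp_file(), after the data has been
written to the temp file (a sketch only; field names follow ngx_event_pipe_t
and ngx_buf_t, "size" stands for the number of bytes just written, and the
real code may differ in details):

    if (p->out) {
        /* find the last buffer already queued to be sent from the file */
        for (cl = p->out; cl->next; cl = cl->next) { /* void */ }

        b = cl->buf;

        if (b->file_last == p->temp_file->offset) {
            /* the new data directly follows this buffer in the temp file,
             * so extend it instead of allocating another ngx_buf_t */
            p->temp_file->offset += size;
            b->file_last = p->temp_file->offset;
        }
    }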
1. In ngx_event_pipe_write_chain_to_temp_file() make sure all shadow buffers
up to last_shadow are fully written. With this change recycled buffers can no
longer appear in p->out. This also fixes segmentation faults observed when
ngx_event_pipe_write_chain_to_temp_file() freed no raw buffers while still
returning NGX_OK.
2. In ngx_event_pipe_write_to_downstream() we now check the busy size as the
size of the buffers themselves, not the size of the data left in them (see
the sketch after this list). This fixes situations where all available
buffers became busy (including segmentation faults caused by this).
3. The ngx_event_pipe_free_shadow_raw_buf() function is dropped. It's
incorrect and not needed.
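For item 2, the accounting in ngx_event_pipe_write_to_downstream() now looks
roughly like the following sketch (simplified; shadow links that point at the
same underlying buffer are counted only once):

    ngx_chain_t  *cl;
    u_char       *prev;
    size_t        bsize;

    prev = NULL;
    bsize = 0;

    for (cl = p->busy; cl; cl = cl->next) {

        if (cl->buf->recycled) {
            if (prev == cl->buf->start) {
                /* a shadow of a buffer already counted */
                continue;
            }

            /* count the full size of the buffer,
             * not just the data still left in it */
            bsize += cl->buf->end - cl->buf->start;
            prev = cl->buf->start;
        }
    }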
Previously the result of the last iteration's writev() was returned. This
went unnoticed because the return value was only used if the chain contained
only one or two buffers.
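The corrected pattern, shown as a simplified standalone example rather than
the actual nginx function: accumulate what every writev() call wrote and
return the total, instead of returning only the last call's result.

    #include <limits.h>
    #include <sys/types.h>
    #include <sys/uio.h>

    /* write iovecs in groups of at most IOV_MAX, returning the total number
       of bytes written, or -1 on error (partial writes are not retried here
       to keep the sketch short) */
    ssize_t
    write_iovecs(int fd, struct iovec *iov, int count)
    {
        ssize_t  n, sent;
        int      i, chunk;

        sent = 0;

        for (i = 0; i < count; i += chunk) {
            chunk = count - i;
            if (chunk > IOV_MAX) {
                chunk = IOV_MAX;
            }

            n = writev(fd, &iov[i], chunk);
            if (n == -1) {
                return -1;
            }

            /* the bug was returning only the last n instead of the sum */
            sent += n;
        }

        return sent;
    }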
Previously nginx marked a backend as live again as soon as fail_timeout
(10s by default) had passed since the last failure. On the other hand,
detecting a dead backend takes up to 60s (proxy_connect_timeout) in the
typical situation where the backend is down and doesn't respond to any
packets. This resulted in suboptimal behaviour in the above situation (up to
23% of requests were directed to the dead backend with default settings).
A more detailed description of the problem may be found here (in Russian):
http://mailman.nginx.org/pipermail/nginx-ru/2011-August/042172.html
The fix is to allow only one request after fail_timeout passes, and to mark
the backend as "live" only if this request succeeds.
Note that with the new code a backend will not be marked "live" until the
"check" request completes, and this may take a while with some specific
workloads (e.g. streaming). This is believed to be acceptable.
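A sketch of the resulting selection logic in the round-robin balancer
(simplified; here peer->checked records when the last trial request was let
through and peer->accessed the time of the last failure):

    /* while selecting a peer: skip it if it has failed too often and
       fail_timeout has not yet passed since the last trial request */
    if (peer->max_fails
        && peer->fails >= peer->max_fails
        && now - peer->checked <= peer->fail_timeout)
    {
        continue;   /* try another peer */
    }

    /* fail_timeout has passed: let exactly one request through
       and remember when we did so */
    if (now - peer->checked > peer->fail_timeout) {
        peer->checked = now;
    }

    /* on successful completion of that request: if no failure was recorded
       after the trial was allowed, the backend is considered live again */
    if (peer->accessed < peer->checked) {
        peer->fails = 0;
    }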