If possible we now just extend already present file buffer in p->out chain
instead of keeping ngx_buf_t for each buffer we've flushed to disk. This
saves about 120 bytes of memory per buffer flushed to disk, and resolves
high CPU usage observed in edge cases (due to coalescing these buffers on
send).
1. In ngx_event_pipe_write_chain_to_temp_file() make sure to fully write
all shadow buffers up to last_shadow. With this change recycled buffers
cannot appear in p->out anymore. This also fixes segmentation faults
observed due to ngx_event_pipe_write_chain_to_temp() not freeing any
raw buffers while still returning NGX_OK.
2. In ngx_event_pipe_write_to_downstream() we now properly check for busy
size as a size of buffers, not a size of data in these buffers. This
fixes situations where all available buffers became busy (including
segmentation faults due to this).
3. The ngx_event_pipe_free_shadow_raw_buf() function is dropped. It's
incorrect and not needed.
Previously result of last iteration's writev() was returned. This was
unnoticed as return value was only used if chain contained only one or
two buffers.
Previously nginx used to mark backend again as live as soon as fail_timeout
passes (10s by default) since last failure. On the other hand, detecting
dead backend takes up to 60s (proxy_connect_timeout) in typical situation
"backend is down and doesn't respond to any packets". This resulted in
suboptimal behaviour in the above situation (up to 23% of requests were
directed to dead backend with default settings).
More detailed description of the problem may be found here (in Russian):
http://mailman.nginx.org/pipermail/nginx-ru/2011-August/042172.html
Fix is to only allow one request after fail_timeout passes, and
mark backend as "live" only if this request succeeds.
Note that with new code backend will not be marked "live" unless "check"
request is completed, and this may take a while in some specific workloads
(e.g. streaming). This is believed to be acceptable.
Second aio post happened when timer set by limit_rate expired while we have
aio request in flight, resulting in "second aio post" alert and socket leak.
The patch adds actual protection from aio calls with r->aio already set to
aio sendfile code in ngx_http_copy_filter(). This should fix other cases
as well, e.g. when sending buffered to disk upstream replies while still
talking to upstream.
The ngx_http_writer() is also fixed to handle the above case (though it's
mostly optimization now).
Reported by Oleksandr V. Typlyns'kyi.
For files with '?' in their names autoindex generated links with '?' not
escaped. This resulted in effectively truncated links as '?' indicates
query string start.
This is an updated version of the patch originally posted at [1]. It
introduces generic NGX_ESCAPE_URI_COMPONENT which escapes everything but
unreserved characters as per RFC 3986. This approach also renders unneeded
special colon processing (as colon is percent-encoded now), it's dropped
accordingly.
[1] http://nginx.org/pipermail/nginx-devel/2010-February/000112.html
Reported by Konstantin Leonov.
If cache was bypassed with proxy_cache_bypass, cache-controlling headers
(Cache-Control, Expires) wasn't considered and response was cached even
if it was actually non-cacheable.
Patch by John Ferlito.
Configuration with duplicate upstream blocks defined after first use, i.e.
like
server {
...
location / {
proxy_pass http://backend;
}
}
upstream backend { ... }
upstream backend { ... }
now correctly results in "duplicate upstream" error.
Additionally, upstream blocks defined after first use now handle various
server directive parameters ("weight", "max_fails", etc.). Previously
configuration like
server {
...
location / {
proxy_pass http://backend;
}
}
upstream backend {
server 127.0.0.1 max_fails=5;
}
incorrectly resulted in "invalid parameter "max_fails=5"" error.
For normal cached responses ngx_http_cache_send() sends last buffer and then
request finalized via ngx_http_finalize_request() call, i.e. everything is
ok.
But for stale responses (i.e. when upstream died, but we have something in
cache) the same ngx_http_cache_send() sends last buffer, but then in
ngx_http_upstream_finalize_request() another last buffer is send. This
causes duplicate final chunk to appear if chunked encoding is used (and
resulting problems with keepalive connections and so on).
Fix this by not sending in ngx_http_upstream_finalize_request()
another last buffer if we know response was from cache.
The special case in question leads to replies without body in
configuration like
location / { error_page 404 /zero; return 404; }
location /zero { return 204; }
while replies with empty body are expected per protocol specs.
Correct one will look like
if (status == NGX_HTTP_NO_CONTENT) {
rc = ngx_http_send_header(r);
if (rc == NGX_ERROR || r->header_only) {
return rc;
}
return ngx_http_send_special(r, NGX_HTTP_LAST);
}
though it looks like it's better to drop this special case at all.
Big POST (not fully preread) to a
location / {
return 202;
}
resulted in incorrect behaviour due to "return" code path not calling
ngx_http_discard_request_body(). The same applies to all "return" used
with 2xx/3xx codes except 201 and 204, and to all "return ... text" uses.
Fix is to add ngx_http_discard_request_body() call to ngx_http_send_response()
function where it looks appropriate. Discard body call from emtpy gif module
removed as it's now redundant.
Reported by Pyry Hakulinen, see
http://mailman.nginx.org/pipermail/nginx/2011-August/028503.html
Test case:
location / {
error_page 405 /nope;
return 405;
}
location /nope {
return 200;
}
This is expected to return 405 with empty body, but in 0.8.42+ will return
builtin 405 error page as well (though not counted in Content-Length, thus
breaking protocol).
Fix is to use status provided by rewrite script execution in case
it's less than NGX_HTTP_BAD_REQUEST even if r->error_status set. This
check is in line with one in ngx_http_script_return_code().
Note that this patch also changes behaviour for "return 302 ..." and
"rewrite ... redirect" used as error handler. E.g.
location / {
error_page 405 /redirect;
return 405;
}
location /redirect {
rewrite ^ http://example.com/;
}
will actually return redirect to "http://example.com/" instead of builtin 405
error page with meaningless Location header. This looks like correct change
and it's in line with what happens on e.g. directory redirects in error
handlers.
Replies with 201 code contain body, and we should clearly indicate it's
empty if it's empty. Before 0.8.32 chunked was explicitly disabled for
201 replies and as a result empty body was indicated by connection close
(not perfect, but worked). Since 0.8.32 chunked is enabled, and this
causes incorrect responses from dav module when HTTP/1.1 is used: with
"Transfer-Encoding: chunked" but no chunks at all.
Fix is to actually return empty body in special response handler instead
of abusing r->header_only flag.
See here for initial report:
http://mailman.nginx.org/pipermail/nginx-ru/2010-October/037535.html
This fixes crashes observed with some 3rd party balancer modules. Standard
balancer modules (round-robin and ip hash) explicitly set pc->connection
(aka u->peer.connection) to NULL and aren't affected.
If client closed connection in ngx_event_pipe_write_to_downstream(), buffers
in the "out" chain were lost. This caused cpu hog if all available buffers
were in the "out" chain. Fix is to call ngx_chain_update_chains() before
checking return code of output filter to avoid loosing buffers in the "out"
chain.
Note that this situation (all available buffers in the "out" chain) isn't
normal, it should be prevented by busy buffers limit. Though right now it
may happen with complex protocols like fastcgi. This should be addressed
separately.
The default value is 32 AIO simultaneous requests per worker. Previously
they were hardcoded to 1024, and it was too large, since Linux allocated
them early on io_setup(), but not on request itself. So with default value
of /proc/sys/fs/aio-max-nr equal to 65536 only 64 worker processes could
be run simultaneously. 32 AIO requests are enough for modern disks even if
server runs only 1 worker.
and "chunked_transfer_encoding" directives, to be in line with all
directives taking a boolean argument. Both flags will ensure that
a directive takes one argument.
syscall(2) uses usual libc convention, it returns -1 on error and
sets errno. Obsolete _syscall(2) returns negative value of error.
Thanks to Hagai Avrahami.
By default follow the old behaviour, i.e. FASTCGI_KEEP_CONN flag isn't set
in request and application is responsible for closing connection once request
is done. To keep connections alive fastcgi_keep_conn must be activated.
As long as ngx_event_pipe() has more data read from upstream than specified
in p->length it's passed to input filter even if buffer isn't yet full. This
allows to process data with known length without relying on connection close
to signal data end.
By default p->length is set to -1 in upstream module, i.e. end of data is
indicated by connection close. To set it from per-protocol handlers upstream
input_filter_init() now called in buffered mode (as well as in
unbuffered mode).
Previous use of size_t may cause wierd effects on 32bit platforms with certain
big responses transferred in unbuffered mode.
Nuke "if (size > u->length)" check as it's not usefull anyway (preread
body data isn't subject to this check) and now requires additional check
for u->length being positive.
We no longer use r->headers_out.content_length_n as a primary source of
backend's response length. Instead we parse response length to
u->headers_in.content_length_n and copy to r->headers_out.content_length_n
when needed.
Just doing another connect isn't safe as peer.get() may expect peer.tries
to be strictly positive (this is the case e.g. with round robin with multiple
upstream servers). Increment peer.tries to at least avoid cpu hog in
round robin balancer (with the patch alert will be seen instead).
This is not enough to fully address the problem though, hence TODO. We
should be able to inform balancer that the error wasn't considered fatal
and it may make sense to retry the same peer.
The ngx_chain_update_chains() needs pool to free chain links used for buffers
with non-matching tags. Providing one helps to reduce memory consumption
for long-lived requests.
There were 2 buffers allocated on each buffer chain sent through chunked
filter (one buffer for chunk size, another one for trailing CRLF, about
120 bytes in total on 32-bit platforms). This resulted in large memory
consumption with long-lived requests sending many buffer chains. Usual
example of problematic scenario is streaming though proxy with
proxy_buffering set to off.
Introduced buffers reuse reduces memory consumption in the above problematic
scenario.
See here for initial report:
http://mailman.nginx.org/pipermail/nginx/2010-April/019814.html
If file inode was not changed, cached file information was not updated
on retest. As a result stale information might be cached forever if file
attributes was changed and/or file was extended.
This fix also makes obsolete r4077 change of is_directio flag handling,
since this flag is updated together with other file information.
On file retest open_file_cache lost is_directio if file wasn't changed.
This caused unaligned operations under Linux to fail with EINVAL.
It wasn't noticeable with AIO though, as errors wasn't properly logged.
Read event should be blocked after reading body, else undefined behaviour
might occur on additional client activity. This fixes segmentation faults
observed with proxy_ignore_client_abort set.
Setting read->eof to 0 seems to be just a typo. It appeared in
nginx-0.0.1-2003-10-28-18:45:41 import (r164), while identical code in
ngx_recv.c introduced in the same import do actually set read->eof to 1.
Failure to set read->eof to 1 results in EOF not being generally detectable
from connection flags. On the other hand, kqueue won't report any read
events on such a connection since we use EV_CLEAR. This resulted in read
timeouts if such connection was cached and used for another request.
If connection has unsent alerts, SSL_shutdown() tries to send them even
if SSL_set_shutdown(SSL_RECEIVED_SHUTDOWN|SSL_SENT_SHUTDOWN) was used.
This can be prevented by SSL_set_quiet_shutdown(). SSL_set_shutdown()
is required nevertheless to preserve session.
"max_ranges 0" disables ranges support at all,
"max_ranges 1" allows the single range, etc.
By default number of ranges is unlimited, to be precise, 2^31-1.
then nginx disables ranges and returns just the source response.
This fix should not affect well-behaving applications but will defeat
DoS attempts exploiting malicious byte ranges.
SSL_set_SSL_CTX() doesn't touch values cached within ssl connection
structure, it only changes certificates (at least as of now, OpenSSL
1.0.0d and earlier).
As a result settings like ssl_verify_client, ssl_verify_depth,
ssl_prefer_server_ciphers are only configurable on per-socket basis while
with SNI it should be possible to specify them different for two servers
listening on the same socket.
Workaround is to explicitly re-apply settings we care about from context
to ssl connection in servername callback.
Note that SSL_clear_options() is only available in OpenSSL 0.9.8m+. I.e.
with older versions it is not possible to clear ssl_prefer_server_ciphers
option if it's set in default server for a socket.
Non-daemon mode is currently used by supervisord, daemontools and so on
or during debugging. The NOACCEPT signal is only used for online upgrade
which is not supported when nginx is run under supervisord, etc.,
so this change should not break existant setups.
now cache loader processes either as many files as specified by loader_files
or works no more than time specified by loader_threshold during each iteration.
loader_threshold was previously used to decrease loader_files or
to increase loader_timeout and this might eventually result in
downgrading loader_files to 1 and increasing loader_timeout to large values
causing loading cache for forever.
NetBSD 5.0+ has SO_ACCEPTFILTER support merged from FreeBSD, and having
accept filter check in FreeBSD-specific ngx_freebsd_config.h prevents it
from being used on NetBSD. Therefore move the check into configure (and
do the same for Linux-specific TCP_DEFER_ACCEPT, just to be in line).
Previously only first log level was required to be correct, while error_log
directive in fact accepts list of levels (e.g. one may specify "error_log ...
debug_core debug_http;"). This resulted in (avoidable) wierd behaviour on
missing semicolon after error_log directive, e.g.
error_log /path/to/log info
index index.php;
silently skipped index directive and it's arguments (trying to interpret
them as log levels without checking to be correct).
The following configuration causes nginx to hog cpu due to infinite loop
in ngx_http_upstream_get_peer():
upstream backend {
server 127.0.0.1:8080 down;
server 127.0.0.1:8080 down;
}
server {
...
location / {
proxy_pass http://backend;
}
}
Make sure we don't loop infinitely in ngx_http_upstream_get_peer() but stop
after resetting peer weights once.
Return 0 if we are stuck. This is guaranteed to work as peer 0 always exists,
and eventually ngx_http_upstream_get_round_robin_peer() will do the right
thing falling back to backup servers or returning NGX_BUSY.
Flush flag wasn't set in constructed buffer and this prevented any data
from being actually sent to upstream due to SSL buffering. Make sure
we always set flush in the last buffer we are going to sent.
See here for report:
http://nginx.org/pipermail/nginx-ru/2011-June/041552.html
Previously all available data was used as body, resulting in garbage after
real body e.g. in case of pipelined requests. Make sure to use only as many
bytes as request's Content-Length specifies.
enabled in any server. The previous r1033 does not help when unused zone
becomes used after reconfiguration, so it is backed out.
The initial thought was to make SSL modules independed from SSL implementation
and to keep OpenSSL code dependance as much as in separate files.
list and evaluating total cache size. Reading just directory is enough for
this purpose. Elimination of reading cache files saves at least one disk I/O
operation per file.
Preparation for elimination of reading cache files by cache loader:
removing dependencies on the reading:
*) cache node valid_sec and valid_msec are used only for caching errors;
*) upstream buffer size can be used instead of cache node body_start.
by ngx_http_upstream_create_round_robin_peer(), since the peer lives
only during request so the saved SSL session will never be used again
and just causes memory leak
patch by Maxim Dounin
and is shared among all hosts instead of pregenerating for every HTTPS host
on configuraiton phase. This decreases start time for configuration with
large number of HTTPS hosts.
*) use a real excess value instead of non-updated limit_req rbtree node field,
*) move inactivity queue handling inside ngx_http_limit_req_lookup()
since the node is not required outside the lookup function;
the bug has been introduced in r3184
merge phase: otherwise the first server without an listen directive
did not become the default server if there was no explicit default server;
the bug has been introduced in r3218
to delete old inactive entries: one of them removes a entry just locked by
other manager from the queue and the rbtree as long inactive entry,
causes the latter manager to segfault leaving cache mutex locked,
the bug has been introduced in r3727
*) now ngx_http_file_cache_cleanup() uses ngx_http_file_cache_free()
*) ngx_http_file_cache_free() interface has been changed to accept r->cache
ngx_http_file_cache_cleanup() must use r->cache, but not r, because
there can be several r->cache's during request processing, r->cache may
be NULL at request finalising, etc.
*) test if updating request does not complete correctly
error_page 502 504 /zero;
location = /zero { return 204; }
The bug has been introduced in r3634.
The fix also allow to use:
error_page 502 504 = /zero;
location = /zero { return 200; }
This case still does not work:
error_page 502 504 /zero;
location = /zero { return 200; }
Besides, now $uid_set is not cacheable and may have two values:
before and after header filter processing.
This allows to log case, when uid cookie is remarked.
Revert a feature introduced in r2028. The feature confuses mostly,
the only gain was not to test regex for a frequent request such as
"/" in "locaiton /".