Previously, when a buffer was processed by the sub filter, its final bytes
could be buffered by the filter even if they didn't match any pattern.
This happened because the Boyer-Moore algorithm, employed by the sub filter
since b9447fc457b4 (1.9.4), matches the last characters of patterns prior to
checking other characters. If the last character is out of scope, initial
bytes of a potential match are buffered until the last character is available.
Now, after receiving a flush or recycled buffer, the filter performs
additional checks to reduce the number of buffered bytes. The potential match
is checked against the initial parts of all patterns. Non-matching bytes are
not buffered. This improves processing of a chunked response from upstream
by sending the entire chunks without buffering unless a partial match is found
at the end of a chunk.
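As an illustration, a configuration along the following lines exercises the case described above: a chunked upstream response filtered against several patterns. The backend address and the replacement strings are placeholders and are not taken from the change itself.

```nginx
location / {
    # hypothetical backend that sends a chunked response
    proxy_pass http://127.0.0.1:8080;

    # several patterns are searched for in a single pass
    sub_filter 'http://old.example.com/' 'https://new.example.com/';
    sub_filter '<title>Old</title>'      '<title>New</title>';
    sub_filter_once off;
}
```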
This patch moves various OpenSSL-specific function calls into the
OpenSSL module and introduces ngx_ssl_ciphers() to make nginx more
crypto-library-agnostic.
When the stream is terminated, the HEADERS frame can still be waiting in the
output queue. This frame can't be removed and must be sent to the client anyway,
since HTTP/2 uses stateful compression for headers. So, in order to postpone
closing the stream and freeing its memory, a special close stream handler
is set on the write event. After the HEADERS frame is sent, the write event
handler is called and the stream is finally closed.
Some events, like receiving a RST_STREAM, can trigger the read handler of such
a stream in the closing state and cause unexpected processing that can result
in another attempt to finalize the request. To prevent this, the read handler
is now set to ngx_http_empty_handler.
Thanks to Amazon.
There is no reason to add the "Content-Length: 0" header to a proxied request
without a body if the header isn't present in the original request.
Thanks to Amazon.
According to RFC 7540, an endpoint should not send more than one RST_STREAM
frame for any stream.
Also, all DATA frames are now skipped during termination.
The ngx_http_v2_finalize_connection() function closes the current stream, but
that is an invalid operation while processing an unbuffered upload. This
results in access to already freed memory, since the upstream module sets
a cleanup handler that also finalizes the request.
Previously, the stream's window was kept zero in order to prevent a client
from sending the request body before it was requested (see 887cca40ba6a for
details). Until such initial window was acknowledged all requests with
data were rejected (see 0aa07850922f for details).
That approach revealed a number of problems:
1. Some clients (notably MS IE/Edge, Safari, iOS applications) show an error
or even crash if a stream is rejected;
2. This requires at least one RTT for every request with a body before the
client receives the window update and is able to send data.
To overcome these problems the new directive "http2_body_preread_size" is
introduced. It sets the initial window and configures a special per stream
preread buffer that is used to save all incoming data before the body is
requested and processed.
If the directive's value is lower than the default initial window (65535),
as previously, all streams with data will be rejected until the new window
is acknowledged. Otherwise, no special processing is used and all requests
with data are welcome right from the connection start.
The default value is chosen to be 64k, which is bigger than the default
initial window. Setting it to zero is fully compliant with the previous
behavior.
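A minimal sketch of how the directive might be set, assuming an HTTP/2 server over TLS; the certificate paths are placeholders:

```nginx
server {
    listen 443 ssl http2;

    # placeholder certificate paths
    ssl_certificate     /etc/nginx/example.crt;
    ssl_certificate_key /etc/nginx/example.key;

    # the default described above; not lower than the default initial
    # window (65535), so requests with data are accepted right from
    # the connection start
    http2_body_preread_size 64k;

    # zero corresponds to the previous behavior:
    # http2_body_preread_size 0;
}
```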
Now, the module extracts an optional port which may accompany an
IP address. This custom extension is introduced, among other
things, in order to facilitate logging of original client ports.
Addresses with ports are expected to be in the RFC 3986 format,
that is, with IPv6 addresses in square brackets. E.g.,
"X-Real-IP: [2001:0db8::1]:12345" sets client port ($remote_port)
to 12345.
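For illustration, a realip setup that logs the extracted port could look roughly like this (http context). The trusted proxy address, the log format name and the log path are assumptions made for the example:

```nginx
# hypothetical address of the trusted proxy that sets X-Real-IP
set_real_ip_from 192.0.2.10;
real_ip_header   X-Real-IP;

# log the client address together with the port
log_format with_port '$remote_addr:$remote_port "$request" $status';
access_log logs/access.log with_port;
```

With a request header such as "X-Real-IP: [2001:0db8::1]:12345" from the trusted proxy, the logged port would be 12345, as described above.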
Previously, when the client address was changed to the one from
the PROXY protocol header, the client port ($remote_port) was
reset to zero. Now the client port is also changed to the one
from the PROXY protocol header.
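A sketch of a configuration where this matters, assuming a load balancer at a placeholder address that sends the PROXY protocol header:

```nginx
server {
    listen 80 proxy_protocol;

    # hypothetical load balancer address
    set_real_ip_from 192.0.2.10;
    real_ip_header   proxy_protocol;

    location / {
        # both variables now reflect the PROXY protocol header
        return 200 "$remote_addr:$remote_port\n";
    }
}
```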
Since 4fbef397c753 nginx rejects with a 400 error any attempt to
request a different host over the same connection, if the relevant
virtual server requires verification of a client certificate.
While requesting hosts other than negotiated isn't something legal
in HTTP/1.x, the HTTP/2 specification explicitly permits such requests
for connection reuse and has introduced a special response code 421.
According to RFC 7540 Section 9.1.2 this code can be sent by a server
that is not configured to produce responses for the combination of
scheme and authority that are included in the request URI. And the
client may retry the request over a different connection.
Now this code is used for requests that aren't authorized in the current
connection. After receiving the 421 response, a client will be able
to open a new connection, provide the required certificate and retry
the request.
Unfortunately, not all clients currently are able to handle it well.
Notably Chrome just shows an error, while at least the latest version
of Firefox retries the request over a new connection.
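The situation can be pictured with two virtual servers on the same address, only one of which verifies client certificates. Server names and file paths below are placeholders; a request for the second name arriving over an HTTP/2 connection negotiated for the first one is the kind of request that would now get 421:

```nginx
server {
    listen 443 ssl http2;
    server_name public.example.com;

    ssl_certificate     /etc/nginx/example.crt;
    ssl_certificate_key /etc/nginx/example.key;
}

server {
    listen 443 ssl http2;
    server_name private.example.com;

    ssl_certificate     /etc/nginx/example.crt;
    ssl_certificate_key /etc/nginx/example.key;

    # this server requires a client certificate
    ssl_client_certificate /etc/nginx/clients-ca.crt;
    ssl_verify_client      on;
}
```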
OpenSSL 1.0.2+ allows configuring a curve list instead of a single curve
previously supported. This allows use of different curves depending on
what client supports (as available via the elliptic_curves extension),
and also allows use of different curves in an ECDHE key exchange and
in the ECDSA certificate.
The special value "auto" was introduced (now the default for ssl_ecdh_curve),
which means "use an internal list of curves as available in the OpenSSL
library used". For versions prior to OpenSSL 1.0.2 it maps to "prime256v1"
as previously used. The default in 1.0.2b+ prefers prime256v1 as well
(and X25519 in OpenSSL 1.1.0+).
As client vs. server preference of curves is controlled by the
same option as used for ciphers (SSL_OP_CIPHER_SERVER_PREFERENCE),
the ssl_prefer_server_ciphers directive now controls both.
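As a rough example, a curve list with server preference might be configured as follows. The particular curves and certificate paths are illustrative only, and a list requires OpenSSL 1.0.2 or newer:

```nginx
server {
    listen 443 ssl;

    # placeholder certificate paths
    ssl_certificate     /etc/nginx/example.crt;
    ssl_certificate_key /etc/nginx/example.key;

    # colon-separated curve list instead of a single curve
    ssl_ecdh_curve prime256v1:secp384r1;

    # now applies to curves as well as ciphers
    ssl_prefer_server_ciphers on;

    # the new default, using the library's internal list:
    # ssl_ecdh_curve auto;
}
```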
Both minor and major versions are now limited to 999 maximum. In case of
r->http_minor, this limit is already implied by the code. Major version,
r->http_major, in theory can be up to 65535 with current code, but such
values are very unlikely to become real (and, additionally, such values
are not allowed by RFC 7230), so the same test was used for r->http_major.
When it's known that the kernel supports EPOLLRDHUP, there is no need for an
additional recv() call to get EOF or error when the flag is absent in the
event generated by the kernel. A special runtime test is done at startup
to detect if EPOLLRDHUP is actually supported by the kernel, because
epoll_ctl() silently ignores unknown flags.
With this knowledge it's now possible to drop the "ready" flag for partial
read. Previously, the "ready" flag was kept until the recv() returned EOF
or error. In particular, this change allows the lingering close heuristics
(which relies on the "ready" flag state) to actually work on Linux, and not
wait for more data in most cases.
The "available" flag is now used in the read event with the semantics similar
to the corresponding counter in kqueue.
This parameter allows binding the proxy connection to a non-local address.
Upstream will see the connection as coming from that address.
When used with $remote_addr, upstream will see the connection as coming
from the real client address.
Example:
    proxy_bind $remote_addr transparent;
The WINDOW_UPDATE frame could be left in the output queue for an indefinite
period of time resulting in the request timeout.
This might happen if reading of the body was triggered by an event unrelated
to client connection, e.g. by the limit_req timer.
Particularly this prevents sending WINDOW_UPDATE with zero delta
which can result in PROTOCOL_ERROR.
Also removed surplus setting of no_flow_control to 0.
Refusing streams is known to be incorrectly handled at least by IE, Edge
and Safari. Make sure to provide appropriate logging to simplify fixing
this in the affected browsers.
After the 92464ebace8e change, it has been discovered that not all
clients follow the RFC and handle RST_STREAM with NO_ERROR properly.
Notably, Chrome currently interprets it as INTERNAL_ERROR and discards
the response.
As a workaround, instead of RST_STREAM the maximum stream window update
will be sent, which will let the client send up to 2 GB of request body
data before getting stuck on flow control. All the received data will
be silently discarded.
See for details:
http://mailman.nginx.org/pipermail/nginx-devel/2016-April/008143.html
https://bugs.chromium.org/p/chromium/issues/detail?id=603182
A client is allowed to send requests before receiving and acknowledging
the SETTINGS frame. Such a client, having a wrong idea about the stream's
window, could send a request body that nginx isn't ready to process.
The previous behavior was to send RST_STREAM with FLOW_CONTROL_ERROR in
such a case, but it didn't allow retrying requests that have been rejected.
This prevents forming empty records out of such buffers. Particularly it fixes
double end-of-stream records with chunked transfer encoding, or when HTTP/2 is
used and the END_STREAM flag has been sent without data. In both cases there
is an empty buffer at the end of the request body chain with the "last_buf"
flag set.
The canonical libfcgi, as well as php implementation, tolerates such records,
while the HHVM parser is more strict and drops the connection (ticket #950).
There are two improvements:
1. Support for request body filters;
2. Receiving of request body is started only after
the ngx_http_read_client_request_body() call.
The last one fixes the problem when the client_max_body_size value might not be
respected from the right location if the location was changed either during the
process of receiving body or after the whole body had been received.
RFC 7540 states that "A server can send a complete response prior to the client
sending an entire request if the response does not depend on any portion of the
request that has not been sent and received. When this is true, a server MAY
request that the client abort transmission of a request without error by sending
a RST_STREAM with an error code of NO_ERROR after sending a complete response
(i.e., a frame with the END_STREAM flag)."
This should prevent a client from blocking on the stream window, since it isn't
maintained for closed streams. Currently, quite big initial stream windows are
used, so such blocking is very unlikely, but that will be changed in subsequent
patches.
By default, requests with non-idempotent methods (POST, LOCK, PATCH)
are no longer retried in case of errors if a request was already sent
to a backend. Previous behaviour can be restored by using
"proxy_next_upstream ... non_idempotent".
Much like normal connections, cached connections are now tested against
u->conf->next_upstream, and u->state->status is now always set.
This allows disabling additional tries even with upstream keepalive
by using "proxy_next_upstream off".
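The two settings mentioned above could be used as in the following sketch. The upstream group name, server addresses, locations and keepalive size are placeholders:

```nginx
upstream backend {
    # hypothetical backend servers
    server 192.0.2.1:8080;
    server 192.0.2.2:8080;
    keepalive 16;
}

server {
    listen 80;

    location /legacy/ {
        proxy_pass http://backend;
        # restore retries of POST, LOCK and PATCH requests
        proxy_next_upstream error timeout non_idempotent;
    }

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        # no additional tries, including on cached connections
        proxy_next_upstream off;
    }
}
```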
When a keys_zone is full, each subsequent request to the cache is
penalized. That is, the cache has to evict older files synchronously
to get a slot from the keys_zone. The patch introduces new
behavior in this scenario. The manager will try to maintain available
free slots in the keys_zone by cleaning old files in the background.
The "aio_write" directive is introduced, which enables use of aio
for writing. Currently it is meaningful only with "aio threads".
Note that aio operations can be done by both event pipe and output
chain, so proper mapping between r->aio and p->aio is provided when
calling ngx_event_pipe() and in output filter.
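A minimal usage sketch, assuming nginx is built with thread pool support and a placeholder backend address:

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;

    # aio_write is currently meaningful only together with "aio threads"
    aio       threads;
    aio_write on;
}
```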
In collaboration with Valentin Bartenev.
This simplifies the interface of the ngx_thread_read() function.
Additionally, most of the thread operations now explicitly set
file->thread_task, file->thread_handler and file->thread_ctx,
to facilitate use of thread operations in other places.
(Potential problems remain with sendfile in threads though - it uses
file->thread_handler as set in ngx_output_chain(), and it should not
be overwritten to an incompatible one.)
In collaboration with Valentin Bartenev.
It can now be set to "off" conditionally, e.g. using the map
directive.
An empty value will disable the emission of the Server: header
and the signature in error messages generated by nginx.
Any other value is treated as "on", meaning that full nginx
version is emitted in the Server: header and error messages
generated by nginx.
If proxy_cache is enabled, and proxy_no_cache tests true, it was previously
possible for the client connection to be closed after a 304. The fix is to
recheck r->header_only after the final cacheability is determined, and end the
request if no longer cacheable.
Example configuration:
    proxy_cache foo;
    proxy_cache_bypass 1;
    proxy_no_cache 1;
If a client sends If-None-Match, and the upstream server returns 200 with a
matching ETag, no body should be returned to the client. At the start of
ngx_http_upstream_send_response proxy_no_cache is not yet tested, thus cacheable
is still 1 and downstream_error is set.
However, by the time the downstream_error check is done in process_request,
proxy_no_cache has been tested and cacheable is set to 0. The client connection
is then closed, regardless of keepalive.
If caching was used, "zero size buf in output" alerts might appear
in logs if a client prematurely closed connection. Alerts appeared
in the following situation:
- writing to client returned an error, so event pipe
  drained all busy buffers leaving body output filters
  in an invalid state;
- when upstream response was fully received,
  ngx_http_upstream_finalize_request() tried to flush
  all pending data.
The fix is to avoid flushing the body if p->downstream_error is set.
Sendfile handlers (aio preload and thread handler) are called within
ctx->output_filter() in ngx_output_chain(), and hence ctx->aio cannot
be set directly in ngx_output_chain(). Meanwhile, it must be set to
make sure loop within ngx_output_chain() will be properly terminated.
There are no known cases that trigger the problem, though in theory
something like aio + sub filter (something that needs body in memory,
and can also free some memory buffers) + sendfile can result in
"task already active" and "second aio post" alerts.
The fix is to set ctx->aio in ngx_http_copy_aio_sendfile_preload()
and ngx_http_copy_thread_handler().
For consistency, ctx->aio is no longer set explicitly in
ngx_output_chain_copy_buf(), as it's now done in
ngx_http_copy_thread_handler().
Previously, there were only three timeouts used globally for the whole HTTP/2
connection:
1. Idle timeout for inactivity when there are no streams in processing
(the "http2_idle_timeout" directive);
2. Receive timeout for incomplete frames when there are no streams in
processing (the "http2_recv_timeout" directive);
3. Send timeout when there are frames waiting in the output queue
(the "send_timeout" directive on a server level).
Reaching one of these timeouts leads to HTTP/2 connection close.
This left a number of scenarios when a connection can get stuck without any
processing and timeouts:
1. A client has sent the headers block partially so nginx starts processing
a new stream but cannot continue without the rest of HEADERS and/or
CONTINUATION frames;
2. When nginx waits for the request body;
3. All streams are stuck on exhausted connection or stream windows.
The first idea that was rejected was to detect when the whole connection
gets stuck because of these situations and set the global receive timeout.
The disadvantage of such approach would be inconsistent behaviour in some
typical use cases. For example, if a user never replies to the browser's
question about where to save the downloaded file, the stream will be
eventually closed by a timeout. On the other hand, this will not happen
if there's some activity in other concurrent streams.
Now almost all the request timeouts work like in HTTP/1.x connections, so
the "client_header_timeout", "client_body_timeout", and "send_timeout" are
respected. These timeouts close the request.
The global timeouts work as before.
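Put together, the timeouts discussed above map to directives roughly as follows. The values and certificate paths are illustrative:

```nginx
server {
    listen 443 ssl http2;

    # placeholder certificate paths
    ssl_certificate     /etc/nginx/example.crt;
    ssl_certificate_key /etc/nginx/example.key;

    # per-request timeouts, now respected for individual streams;
    # reaching one of them closes the request
    client_header_timeout 60s;
    client_body_timeout   60s;
    send_timeout          60s;

    # global timeouts, unchanged; reaching one of them
    # closes the whole HTTP/2 connection
    http2_idle_timeout 3m;
    http2_recv_timeout 30s;
}
```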
Previously, the c->write->delayed flag was abused to avoid setting timeouts on
stream events. Now, the "active" and "ready" flags are manipulated instead to
control the processing of individual streams.
This is required for implementing per request timeouts.
Previously, the temporary pool was used only during skipping of
headers and the request pool was used otherwise. That required
switching of pools if the request was closed while parsing.
It wasn't a problem since the request could be closed only after
the validation of the fully parsed header. With the per request
timeouts, the request can be closed at any moment, and switching
of pools in the middle of parsing header name or value becomes a
problem.
To overcome this, the temporary pool is now always created and
used. Special checks are added to keep it when either the stream
is being processed or until header block is fully parsed.
Since 667aaf61a778 (1.1.17) the ngx_http_parse_header_line() function can return
NGX_HTTP_PARSE_INVALID_HEADER when a header contains a NUL character. In this
case the r->header_end pointer isn't properly initialized, but the log message
in ngx_http_process_request_headers() hasn't been adjusted. It used the pointer
in size calculation, which might result in up to 2k buffer over-read.
Found with afl-fuzz.
When the "pending" value is zero, the "buf" will be right shifted
by the width of its type, which results in undefined behavior.
Found by Coverity (CID 1352150).
Due to greater priority of the unary plus operator over the ternary operator
the expression didn't work as expected. That might result in one byte less
allocation than needed for the HEADERS frame buffer.
With the main request buffered, it's possible that a slice subrequest will send
output before it. For example, while the main request is waiting for an aio read
to complete, a slice subrequest can start an aio operation as well. The order
in which aio callbacks are called is undetermined.
Skip SSL_CTX_set_tlsext_servername_callback in case of renegotiation.
Do nothing in the SNI callback, as in this case it is supplied with
a request in c->data, which isn't expected and doesn't work this way.
This was broken by b40af2fd1c16 (1.9.6) with OpenSSL master branch and LibreSSL.
Splits a request into subrequests, each providing a specific range of response.
The variable "$slice_range" must be used to set subrequest range and proper
cache key. The directive "slice" sets slice size.
The following example splits requests into 1-megabyte cacheable subrequests.
    server {
        listen 8000;
        location / {
            slice 1m;
            proxy_cache cache;
            proxy_cache_key $uri$is_args$args$slice_range;
            proxy_set_header Range $slice_range;
            proxy_cache_valid 200 206 1h;
            proxy_pass http://127.0.0.1:9000;
        }
    }
If an upstream with variables evaluated to an address without a port,
then instead of a "no port in upstream" error an attempt was made
to connect() which failed with EADDRNOTAVAIL.
The HEADERS frame is always represented by more than one buffer since
b930e598a199, but the handling code hasn't been adjusted.
Only the first buffer of HEADERS frame was checked and if it had been
sent while others had not, the rest of the frame was dropped, resulting
in broken connection.
Before b930e598a199, the problem could only be seen in case of HEADERS
frame with CONTINUATION.
The r->invalid_header flag wasn't reset once an invalid header appeared in a
request, so all subsequent headers in the request were also marked
as invalid.
The directive toggles conversion of HEAD to GET for cacheable proxy requests.
When disabled, $request_method must be added to cache key for consistency.
By default, HEAD is converted to GET as before.
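A sketch of the non-default setup, assuming the directive in question is proxy_cache_convert_head, and using a placeholder backend address and a hypothetical cache zone named "one":

```nginx
location / {
    proxy_pass  http://127.0.0.1:8080;
    proxy_cache one;

    # pass HEAD requests as is instead of converting them to GET
    proxy_cache_convert_head off;

    # include the method so that HEAD and GET responses
    # are stored under different keys
    proxy_cache_key $scheme$proxy_host$request_uri$request_method;
}
```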
OpenSSL doesn't check if the negotiated protocol has been announced.
As a result, the client might force using HTTP/2 even if it wasn't
enabled in configuration.
It caused inconsistency between setting "in_closed" flag and the moment when
the last DATA frame was actually read. As a result, the body buffer might not
be initialized properly in ngx_http_v2_init_request_body(), which led to a
segmentation fault in ngx_http_v2_state_read_data(). It might also cause
an incomplete body to be processed.
This issue could be triggered when the processing of a request was delayed,
e.g. in the limit_req or auth_request modules.
Now it limits only the maximum length of literal string (either raw or
compressed) in HPACK request header fields. It's easier to understand
and to describe in the documentation.
The previous code was based on the assumption that the header block can only be
split at the borders of individual headers. That wasn't the case and might
result in emitting frames bigger than the frame size limit.
The current approach is to split header blocks by the frame size limit.
Previously, nginx worker would crash because of a double free
if client disconnected or timed out before sending all headers.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Previously, streams that were indirectly reprioritized (either because of
a new exclusive dependency on their parent or because of removal of their
parent from the dependency tree), didn't have their pointer to the parent
node updated.
This broke detection of circular dependencies and, as a result, nginx
worker would crash due to stack overflow whenever such dependency was
introduced.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Per RFC7540, a stream cannot depend on itself.
Previously, this requirement was enforced on PRIORITY frames, but not on
HEADERS frames and due to the implementation details nginx worker would
crash (stack overflow) while opening self-dependent stream.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Since an output buffer can only be used for either reading or sending, small
amounts of data left from the previous operation (due to some limits) must be
sent before nginx will be able to read further into the buffer. Using only
one output buffer can result in suboptimal behavior that manifests itself in
forming and sending too small chunks of data. This is particularly painful
with SPDY (or HTTP/2) where each such chunk needs to be prefixed with some
header.
The default flow-control window in HTTP/2 is 64k minus one bytes. With one
32k output buffer this results in one byte left after exhausting the window.
With two 32k buffers the data will be read into the second free buffer before
sending, thus the minimum output is increased to 32k + 1 bytes which is much
better.
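Assuming the buffers discussed here are the ones configured by the output_buffers directive, the two-buffer setup corresponds to:

```nginx
# window: 64k - 1 = 65535 bytes
# one 32k buffer:  32768 + 32767 bytes are sent, 1 byte is left behind
# two 32k buffers: the leftover is sent together with a freshly
#                  filled buffer, 32k + 1 bytes at minimum
output_buffers 2 32k;
```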
A configuration with a named location inside a zero-length prefix
or regex location used to trigger a segmentation fault, as
ngx_http_core_location() failed to properly detect if a nested location
was created. Example configuration to reproduce the problem:
    location "" {
        location @foo {}
    }
The fix is to not rely on the parent location name length, but rather check
the type of the command currently being parsed.
An identical fix is also applied to ngx_http_rewrite_if(), which used to
incorrectly assume the "if" directive is on the server{} level in such
locations.
Reported by Markus Linnala.
Found with afl-fuzz.
This prevents a potential attack that discloses cached data if an attacker
is able to craft a hash collision between some cache key the attacker
is allowed to access and another cache key with protected data.
See http://mailman.nginx.org/pipermail/nginx-devel/2015-September/007288.html.
Thanks to Gena Makhomed and Sergey Brester.
The value of NGX_ERROR, returned from filter handlers, was treated as a generic
upstream error and changed to NGX_HTTP_INTERNAL_SERVER_ERROR before calling
ngx_http_finalize_request(). This resulted in "header already sent" alert
if header was already sent in filter handlers.
The problem appeared in 54e9b83d00f0 (1.7.5).
This overflow has become possible after the change in 06e850859a26,
since concurrent subrequests are not limited now and each of them is
counted in r->main->count.
Resolved warnings about declarations that hide previous local declarations.
Warnings about WSASocketA() being deprecated resolved by explicit use of
WSASocketW() instead of WSASocket(). When compiling without IPv6 support,
WinSock deprecated warnings are disabled to allow use of gethostbyname().
The following configuration with alias, nested location and try_files
resulted in wrong file being used. Request "/foo/test.gif" tried to
use "/tmp//foo/test.gif" instead of "/tmp/test.gif":
    location /foo/ {
        alias /tmp/;
        location ~ gif {
            try_files $uri =405;
        }
    }
Additionally, rev. c985d90a8d1f introduced a regression if
the "/tmp//foo/test.gif" file was found (ticket #768). Resulting URI
was set to "gif?/foo/test.gif", as the code used clcf->name of current
location ("location ~ gif") instead of parent one ("location /foo/").
Fix is to use r->uri instead of clcf->name in all cases in the
ngx_http_core_try_files_phase() function. It is expected to be
already matched and identical to the clcf->name of the right
location.
If alias was used in a location given by a regular expression,
nginx used to do the wrong thing in try_files if the location name (i.e.,
the regular expression) was an exact prefix of the URI. The following
configuration triggered a segmentation fault on a request to "/mail":
    location ~ /mail {
        alias /path/to/directory;
        try_files $uri =404;
    }
Reported by Per Hansson.
Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured. As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.
Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set. The upstream keepalive module
was modified to follow this.
The function is now called ngx_parse_http_time(), and can be used by
any code to parse HTTP-style date and time. In particular, it will be
used for OCSP stapling.
For compatibility, a macro to map ngx_http_parse_time() to the new name
is provided for a while.
When configured, an individual listen socket on a given address is
created for each worker process. This reduces in-kernel lock
contention on configurations with high accept rates, resulting in better
performance. As of now it works on Linux and DragonFly BSD.
Note that on Linux incoming connection requests are currently tied
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/. With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.
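A minimal sketch, assuming the listen parameter in question is "reuseport"; the number of worker processes is illustrative:

```nginx
worker_processes 4;

events {
}

http {
    server {
        # a separate listen socket is created for each worker process
        listen 80 reuseport;
    }
}
```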
Based on previous work by Sepherosa Ziehau and Yingqi Lu.
There is no need to set "i" to 0, as it's expected to be 0 assuming
the bindings are properly sorted, and we already rely on this when
explicitly setting hport->naddrs to 1. The remaining conditional code is
replaced with the identical "hport->naddrs = i + 1".
Identical modifications are done in the mail and stream modules,
in the ngx_mail_optimize_servers() and ngx_stream_optimize_servers()
functions, respectively.
No functional changes.
It's now enough to specify proxy_protocol option in one listen directive to
enable it in all servers listening on the same address/port. Previously,
the setting from the first directive was always used.
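For example, with two servers sharing an address and port, specifying the option once is now sufficient. The address and server names below are placeholders:

```nginx
server {
    listen 127.0.0.1:8080 proxy_protocol;
    server_name one.example.com;
}

server {
    # proxy_protocol is not repeated here; it is enabled for
    # the address/port by the listen directive above
    listen 127.0.0.1:8080;
    server_name two.example.com;
}
```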
If a peer was initially skipped due to max_fails, there's no reason
not to try it again if enough time has passed, and the next_upstream
logic is in action.
This also reduces diffs with NGINX Plus.
This helps to avoid suboptimal behavior when a client waits for a control
frame or more data to increase window size, but the frames have been delayed
in the socket buffer.
The delays can be caused by bad interaction between Nagle's algorithm on
nginx side and delayed ACK on the client side or by TCP_CORK/TCP_NOPUSH
if SPDY was working without SSL and sendfile() was used.
The pushing code is now very similar to ngx_http_set_keepalive().
If any preread body bytes were sent in the first chain, chunk size was
incorrectly added before the whole chain, including header, resulting in
an invalid request sent to upstream. Fixed to properly add chunk size
after the header.
The r->request_body_no_buffering flag was introduced. It instructs
client request body reading code to avoid reading the whole body, and
to call post_handler early instead. The caller should use the
ngx_http_read_unbuffered_request_body() function to read remaining
parts of the body.
Upstream module is now able to use this mode, if configured with
the proxy_request_buffering directive.
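From the configuration side, the unbuffered mode might be enabled like this (the location and backend address are placeholders):

```nginx
location /upload {
    proxy_pass http://127.0.0.1:8080;

    # pass the request body to the backend as it is received
    proxy_request_buffering off;

    # needed to pass chunked request bodies without buffering
    proxy_http_version 1.1;
}
```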
If the last header evaluation resulted in an empty header, the e.skip flag
was set and was not reset when we've switched to evaluation of body_values.
This incorrectly resulted in body values being skipped instead of producing
some correct body as set by proxy_set_body. Fix is to properly reset
the e.skip flag.
As the problem only appeared if the last potentially non-empty header
happened to be empty, it only manifested itself if proxy_set_body was used
with proxy_cache.
LibreSSL removed support for export ciphers and a call to
SSL_CTX_set_tmp_rsa_callback() results in an error left in the error
queue. This caused alerts "ignoring stale global SSL error (...called
a function you should not call) while SSL handshaking" on a first connection
in each worker process.
Keeping the ready flag in this case might result in a missing notification of
a broken connection until nginx tries to use it again.
While there, the stale comment about stale events was removed, since this
function can also be called directly.
In case of filter finalization, r->upstream might be changed during
the ngx_event_pipe() call. Added an argument to preserve it while
calling the ngx_http_upstream_process_request() function.
A request may be already finalized when ngx_http_upstream_finalize_request()
is called, due to filter finalization: after filter finalization upstream
can be finalized via ngx_http_upstream_cleanup(), either from
ngx_http_terminate_request(), or because a new request was initiated
to an upstream. Then the upstream code will see an error returned from
the filter chain and will call the ngx_http_upstream_finalize_request()
function again.
To prevent corruption of various upstream data in this situation, make sure
to do nothing but merely call ngx_http_finalize_request().
Prodded by Yichun Zhang, for details see the thread at
http://nginx.org/pipermail/nginx-devel/2015-February/006539.html.
Previously, connection hung after calling ngx_http_ssl_handshake() with
rev->ready set and no bytes in socket to read. It's possible in at least the
following cases:
- when processing a connection with expired TCP_DEFER_ACCEPT on Linux
- after parsing PROXY protocol header if it arrived in a separate TCP packet
Thanks to James Hamlin.
When replacing a stale cache entry, its last_modified and etag could be
inherited from the old entry if the response code is not 200 or 206. Moreover,
etag could be inherited with any response code if it's missing in the new
response. As a result, the cache entry is left with invalid last_modified or
etag which could lead to broken revalidation.
For example, when a file is deleted from backend, its last_modified is copied to
the new 404 cache entry and is used later for revalidation. Once the old file
appears again with its original timestamp, revalidation succeeds and the cached
404 response is sent to client instead of the file.
The problem appeared with etags in 44b9ab7752e3 (1.7.3) and affected
last_modified in 1573fc7875fa (1.7.9).
Repeatedly calling ngx_http_upstream_add_chash_point() to create
the points array in sorted order is O(n^2) in the total weight.
This can cause nginx startup and reconfigure to be substantially
delayed. For example, when total weight is 1000, startup takes
5s on a modern laptop.
Replace this with a linear insertion followed by QuickSort and
duplicates removal. Startup for total weight of 1000 reduces to 40ms.
Based on a patch by Wai Keen Woon.
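The "append, sort, remove duplicates" approach can be sketched as below.
The point type and function names are hypothetical, the standard qsort()
stands in for the sort used by nginx, and keeping the first entry for
a repeated hash is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical point: a hash value and the index of the owning server. */
typedef struct {
    uint32_t  hash;
    size_t    server;
} sketch_point_t;

int
sketch_point_cmp(const void *a, const void *b)
{
    const sketch_point_t  *x = a;
    const sketch_point_t  *y = b;

    if (x->hash != y->hash) {
        return (x->hash < y->hash) ? -1 : 1;
    }

    /* break ties by server index to keep the result deterministic */
    if (x->server != y->server) {
        return (x->server < y->server) ? -1 : 1;
    }

    return 0;
}

/* Sorts points appended in arbitrary order, then drops entries whose
 * hash repeats the previous one.  Returns the new number of points. */
size_t
sketch_points_finish(sketch_point_t *points, size_t n)
{
    size_t  i, j;

    if (n == 0) {
        return 0;
    }

    qsort(points, n, sizeof(sketch_point_t), sketch_point_cmp);

    for (i = 0, j = 1; j < n; j++) {
        if (points[i].hash != points[j].hash) {
            points[++i] = points[j];
        }
    }

    return i + 1;
}
```

Appending is O(1) per point, so the total cost is dominated by the sort.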
This reduces layering violation and simplifies the logic of AIO preread, since
it's now triggered by the send chain function itself without falling back to
the copy filter. The context of AIO operation is now stored per file buffer,
which makes it possible to properly handle cases when multiple buffers come
from different locations, each with its own configuration.
If fastcgi_pass (or any look-alike that doesn't imply a default
port) is specified as an IP literal (as opposed to a hostname),
port absence was not detected at configuration time and could
result in EADDRNOTAVAIL at run time.
Fixed this in such a way that configs like
    http {
        server {
            location / {
                fastcgi_pass 127.0.0.1;
            }
        }
        upstream 127.0.0.1 {
            server 10.0.0.1:12345;
        }
    }
still work. That is, port absence check is delayed until after
we make sure there's no explicit upstream with such a name.
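The ordering of the two checks can be illustrated with a small sketch.
All names here (sketch_resolve_pass, the SKETCH_* codes) are hypothetical;
the only point is that the lookup of an explicit upstream block happens
before the missing port is treated as an error.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char  *name;     /* name given in an "upstream" block */
} sketch_upstream_t;

enum {
    SKETCH_NO_PORT      = -1,   /* configuration error: no port known */
    SKETCH_OK           = 0,    /* address can be used directly */
    SKETCH_USE_UPSTREAM = 1     /* explicit upstream block matched */
};

/* port == 0 means "not specified"; default_port == 0 means the module
 * (like fastcgi) implies no default port. */
int
sketch_resolve_pass(const char *host, int port, int default_port,
    const sketch_upstream_t *ups, size_t nups)
{
    size_t  i;

    /* first: an explicit upstream with this name always wins */
    for (i = 0; i < nups; i++) {
        if (strcmp(ups[i].name, host) == 0) {
            return SKETCH_USE_UPSTREAM;
        }
    }

    /* only now is a missing port an error */
    if (port == 0 && default_port == 0) {
        return SKETCH_NO_PORT;
    }

    return SKETCH_OK;
}
```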
If use_temp_path is set to off, a subdirectory "temp" is created in the cache
directory. It's used instead of proxy_temp_path and friends for caching
upstream response.
Some parts of the code related to handling variants of a resource were moved
into a separate function that is called earlier. This allows using the cache
file name as a prefix for the temporary file in the following patch.
The configuration handling code has changed to look similar to the proxy_store
directive and friends. This simplifies adding variable support in the following
patch.
No functional changes.
Currently, storing and caching mechanisms cannot work together, and a
configuration error is thrown when the proxy_store and proxy_cache
directives (as well as their friends) are configured on the same level.
But configurations like in the example below were allowed and could result
in critical errors in the error log:
    proxy_store on;
    location / {
        proxy_cache one;
    }
Only proxy_store worked in this case.
For more predictable and errorless behavior these directives now prevent
each other from being inherited from the previous level.
This changes internal API related to handling of the "store"
flag in ngx_http_upstream_conf_t. Previously, a non-null value
of "store_lengths" was enough to enable store functionality with
custom path. Now, the "store" flag is also required to be set.
No functional changes.
The proxy_store, fastcgi_store, scgi_store and uwsgi_store directives were
inherited incorrectly if a directive with variables was defined, and then
redefined to the "on" value, i.e. in configurations like:
    proxy_store /data/www$upstream_http_x_store;
    location / {
        proxy_store on;
    }
In the following configuration, the request was sent to the backend without
the URI changed to '/' because of the if block:
    location /proxy-pass-uri {
        proxy_pass http://127.0.0.1:8080/;
        set $true 1;
        if ($true) {
            # nothing
        }
    }
Fix is to inherit conf->location from the location where proxy_pass was
configured, much like it's done with conf->vars.
The proxy_pass directive and other handlers are not expected to be inherited
into nested locations, but there is a special code to inherit upstream
handlers into limit_except blocks, as well as a configuration into if{}
blocks. This caused incorrect behaviour in configurations with nested
locations and limit_except blocks, like this:
    location / {
        proxy_pass http://u;
        location /inner/ {
            # no proxy_pass here
            limit_except GET {
                # nothing
            }
        }
    }
In such a configuration the limit_except block inside "location /inner/"
unexpectedly used proxy_pass defined in "location /", while it shouldn't.
Fix is to avoid inheritance of conf->upstream.upstream (and
conf->proxy_lengths) into locations which don't have noname flag.
Instead of independent inheritance of conf->upstream.upstream (proxy_pass
without variables) and conf->proxy_lengths (proxy_pass with variables)
we now test them both and inherit only if neither is set. Additionally,
SSL context is also inherited only in this case now.
Based on the patch by Alexey Radkov.
RFC 7232 says:
    The 304 (Not Modified) status code indicates that a conditional GET
    or HEAD request has been received and would have resulted in a 200
    (OK) response if it were not for the fact that the condition
    evaluated to false.
which means that there is no reason to send requests with "If-None-Match"
and/or "If-Modified-Since" headers for responses cached with other status
codes.
Also, sending conditional requests for responses cached with other status
codes could result in a strange behavior, e.g. upstream server returning
304 Not Modified for cached 404 Not Found responses, etc.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
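A rough sketch of the resulting decision follows. The entry type and the
function name are hypothetical, and treating 206 like 200 is an assumption
of this sketch rather than something stated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical view of what is stored with a cached response. */
typedef struct {
    int          status;          /* status code of the cached response */
    time_t       last_modified;   /* (time_t) -1 if no Last-Modified */
    const char  *etag;            /* NULL if no ETag */
} sketch_cache_entry_t;

/* Returns true if a conditional request makes sense for this entry. */
bool
sketch_may_revalidate(const sketch_cache_entry_t *c)
{
    /* assumption: only 200 and 206 responses are revalidated */
    if (c->status != 200 && c->status != 206) {
        return false;
    }

    /* at least one validator is needed to build the condition */
    return c->last_modified != (time_t) -1 || c->etag != NULL;
}
```

A cached 404 is therefore always refetched unconditionally, so upstream
cannot answer it with 304.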
In case of a cache lock timeout and in the aio handler we now call
r->write_event_handler() instead of a connection write handler,
to make sure to run the appropriate subrequest. Previous code failed to run
inactive subrequests and hence resulted in suboptimal behaviour, see
report by Yichun Zhang:
http://mailman.nginx.org/pipermail/nginx-devel/2013-October/004435.html
(Infinite hang claimed in the report seems impossible without 3rd party
modules, as subrequests will be eventually woken up by the postpone filter.)
To ensure proper logging make sure to set current_request in all event
handlers, including resolve, ssl handshake, cache lock wait timer and
aio read handlers. A macro ngx_http_set_log_request() is introduced to
simplify this.
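The idea behind such a macro can be sketched with stand-in types. The
structures and the sketch_set_log_request() name below are hypothetical and
only mirror the description above: the log carries a context pointer, and
the context records which request is currently being processed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the request, log context and log types. */
typedef struct {
    int  id;
} sketch_request_t;

typedef struct {
    sketch_request_t  *current_request;   /* request named in log lines */
} sketch_log_ctx_t;

typedef struct {
    void  *data;                          /* points to sketch_log_ctx_t */
} sketch_log_t;

/* Set the request that subsequent log messages are attributed to. */
#define sketch_set_log_request(log, r)                                    \
    (((sketch_log_ctx_t *) (log)->data)->current_request = (r))
```

An event handler would call the macro first, before running any code that
may log on behalf of a subrequest.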
The alert was introduced in 03ff14058272 (1.5.4), and was triggered on each
post_action invocation.
There is no real need to call header filters in case of post_action,
so return NGX_OK from ngx_http_send_header() if r->post_action is set.
This helps to avoid delays in sending the last chunk of data because
of bad interaction between Nagle's algorithm on nginx side and
delayed ACK on the client side.
Delays could also be caused by TCP_CORK/TCP_NOPUSH if SPDY was
working without SSL and sendfile() was used.
The upstream modules remove and alter a number of client headers
before sending the request to upstream. This set of headers is
smaller or even empty when cache is disabled.
It's still possible that a request in a cache-enabled location is
uncached, for example, if cache entry counter is below min_uses.
In this case it's better to alter a smaller set of headers and
pass more client headers to backend unchanged. One of the benefits
is enabling server-side byte ranges in such requests.
Once this age is reached, the cache lock is discarded and another
request can acquire the lock. Requests which failed to acquire
the lock are not allowed to cache the response.
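The age check can be sketched as follows, assuming that "this age" is
a configured lock age in seconds. The lock structure and the function name
are hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical cache lock state kept per cache entry. */
typedef struct {
    bool    locked;
    time_t  acquired;    /* when the current holder took the lock */
} sketch_cache_lock_t;

/* Returns true if the caller now holds the lock.  A caller that gets
 * false may still fetch the response, but must not cache it. */
bool
sketch_lock_acquire(sketch_cache_lock_t *l, time_t now, time_t lock_age)
{
    if (l->locked && now - l->acquired < lock_age) {
        return false;
    }

    /* free, or held for lock_age seconds or more: discard and retake */
    l->locked = true;
    l->acquired = now;

    return true;
}
```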
For further progress a new buffer must be at least two bytes larger than
the remaining unparsed data. One more byte is needed for null-termination
and another one for further progress. Otherwise inflate() fails with
Z_BUF_ERROR.
Previously, nginx would emit empty values in a header with multiple,
NULL-separated values.
This is forbidden by the SPDY specification, which requires headers to
have either a single (possibly empty) value or multiple, NULL-separated
non-empty values.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
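Skipping empty values while joining can be sketched like this. The function
name and signature are hypothetical and the sketch works on plain C strings
rather than on nginx header structures.

```c
#include <stddef.h>
#include <string.h>

/* Joins the non-empty values into dst, separated by NUL bytes.
 * Returns the number of bytes written, or (size_t) -1 if dst is too
 * small.  If every value is empty the result has length 0, which
 * corresponds to a single empty value. */
size_t
sketch_join_values(char *dst, size_t cap, const char *const *values,
    size_t n)
{
    size_t  i, len, used;

    used = 0;

    for (i = 0; i < n; i++) {
        len = strlen(values[i]);

        if (len == 0) {
            continue;               /* never emit an empty value */
        }

        if (used != 0) {
            if (used + 1 > cap) {
                return (size_t) -1;
            }

            dst[used++] = '\0';     /* separator between values */
        }

        if (len > cap - used) {
            return (size_t) -1;
        }

        memcpy(dst + used, values[i], len);
        used += len;
    }

    return used;
}
```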
When multiple upstream IP addresses were obtained through DNS resolution,
the limits on the number of upstream tries and on the maximum time spent
on these tries had no effect. This patch fixes that.
Spaces in Accept-Charset, Accept-Encoding, and Accept-Language headers
are now ignored. As per syntax of these headers spaces can only appear
in places where they are optional.
If a variant stored can't be used to respond to a request, the variant
hash is used as a secondary key.
Additionally, if we previously switched to a secondary key, while storing
a response to cache we check if the variant hash still applies. If not, we
switch back to the original key, to handle cases when Vary changes.
To cache responses with Vary, we now calculate hash of headers listed
in Vary, and return the response from cache only if new request headers
match.
As of now, only one variant of the same resource can be stored in cache.
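The variant check can be sketched as a hash over the request headers named
in Vary. This is only an illustration: FNV-1a is swapped in here for
brevity and is not claimed to be the hash nginx uses, and all type and
function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    const char  *name;
    const char  *value;
} sketch_header_t;

/* FNV-1a over len bytes, continuing from state h (stand-in hash). */
uint64_t
sketch_fnv1a(uint64_t h, const char *s, size_t len)
{
    size_t  i;

    for (i = 0; i < len; i++) {
        h ^= (unsigned char) s[i];
        h *= 1099511628211ULL;
    }

    return h;
}

/* Case-insensitive match of a header name against a Vary token. */
int
sketch_name_eq(const char *name, const char *tok, size_t len)
{
    size_t  i;

    if (strlen(name) != len) {
        return 0;
    }

    for (i = 0; i < len; i++) {
        if (tolower((unsigned char) name[i])
            != tolower((unsigned char) tok[i]))
        {
            return 0;
        }
    }

    return 1;
}

/* Hashes the values of all request headers listed in "vary". */
uint64_t
sketch_variant_hash(const char *vary, const sketch_header_t *h, size_t n)
{
    size_t       i;
    uint64_t     hash;
    const char  *p, *end;

    hash = 14695981039346656037ULL;
    p = vary;

    for ( ;; ) {
        while (*p == ' ' || *p == ',') {
            p++;
        }

        if (*p == '\0') {
            break;
        }

        end = p;
        while (*end != '\0' && *end != ',' && *end != ' ') {
            end++;
        }

        for (i = 0; i < n; i++) {
            if (sketch_name_eq(h[i].name, p, (size_t) (end - p))) {
                hash = sketch_fnv1a(hash, h[i].value, strlen(h[i].value));
            }
        }

        /* separator so that adjacent fields cannot run together */
        hash = sketch_fnv1a(hash, "\n", 1);
        p = end;
    }

    return hash;
}
```

A cached response would be served only if the hash stored with it equals
the hash computed for the new request.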
Previous code resulted in transfer stalls when client happened
to read all the data in buffers at once, while all gzip buffers
were exhausted (but ctx->nomem wasn't set). Make sure to call
next body filter at least once per call if there are busy buffers.
Additionally, handling of calls with NULL chain was changed to follow
the same logic, i.e., next body filter is only called with NULL chain
if there are busy buffers. This is expected to fix "output chain is empty"
alerts as reported by some users after c52a761a2029 (1.5.7).
Due to the u->headers_in.last_modified_time not being correctly initialized,
this variable was evaluated to "Thu, 01 Jan 1970 00:00:00 GMT" for responses
cached without the "Last-Modified" header which resulted in subsequent proxy
requests being sent with "If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT"
header.
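One way to avoid the bogus header is to use an explicit "unset" sentinel
and to skip the header when the sentinel is seen. The sketch below assumes
-1 as that sentinel; the function name and signature are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define SKETCH_TIME_UNSET  ((time_t) -1)   /* assumed "no value" marker */

/* Writes "If-Modified-Since: <date>" into buf.  Returns the number of
 * characters written, or 0 if no header should be sent. */
size_t
sketch_if_modified_since(char *buf, size_t len, time_t last_modified)
{
    int         n;
    char        date[64];
    struct tm  *tm;

    if (last_modified == SKETCH_TIME_UNSET) {
        return 0;                  /* cached without Last-Modified */
    }

    tm = gmtime(&last_modified);
    if (tm == NULL) {
        return 0;
    }

    if (strftime(date, sizeof(date), "%a, %d %b %Y %H:%M:%S GMT", tm) == 0) {
        return 0;
    }

    n = snprintf(buf, len, "If-Modified-Since: %s", date);
    if (n < 0 || (size_t) n >= len) {
        return 0;
    }

    return (size_t) n;
}
```

Note that a zero timestamp is still a valid date in this sketch; only the
sentinel suppresses the header.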
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The c->sent is reset to 0 on each request by server-side http code,
so do the same on client side. This allows counting the number of bytes
sent in a particular request.
One intentional side effect of this change is that key is allowed only
in the first position. Previously, it was possible to specify the key
variable at any position, but that was never documented, and is contrary
to nginx configuration practice for positional parameters.
Previously, a file buffer start position was reset to the file start.
Now it's reset to the previous file buffer end. This fixes
reinitialization of requests having multiple successive parts of a
single file. Such requests are generated by fastcgi module.
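The new reset logic can be sketched with a simplified buffer type. The
structure and function names are hypothetical, and the sketch assumes that
all buffers in the array refer to successive parts of the same file.

```c
#include <stddef.h>

/* Hypothetical file buffer: [file_pos, file_last) is the unsent part. */
typedef struct {
    long  file_pos;
    long  file_last;
} sketch_file_buf_t;

/* Rewinds buffers that were (partially) sent so the request can be
 * sent again from the beginning. */
void
sketch_reinit_file_bufs(sketch_file_buf_t *b, size_t n)
{
    long    pos;
    size_t  i;

    pos = 0;

    for (i = 0; i < n; i++) {
        /* start where the previous part ended, not at offset 0 */
        b[i].file_pos = pos;
        pos = b[i].file_last;
    }
}
```

Resetting every buffer to offset 0 instead would make the second and later
parts overlap the first one.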
The new directives {proxy,fastcgi,scgi,uwsgi,memcached}_next_upstream_tries
and {proxy,fastcgi,scgi,uwsgi,memcached}_next_upstream_timeout limit
the number of upstreams tried and the maximum time spent for these tries
when searching for a valid upstream.
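How the two limits might be combined is sketched below. The function name
is hypothetical, and treating a zero value as "no limit" is an assumption
of this sketch.

```c
#include <stdbool.h>
#include <time.h>

/* Returns true if one more upstream may be tried.
 * max_tries == 0 or timeout == 0 is assumed to disable that limit. */
bool
sketch_may_try_next(unsigned tries_done, unsigned max_tries,
    time_t start, time_t now, time_t timeout)
{
    if (max_tries != 0 && tries_done >= max_tries) {
        return false;              /* tried enough upstreams */
    }

    if (timeout != 0 && now - start >= timeout) {
        return false;              /* spent too long searching */
    }

    return true;
}
```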
When memory allocation failed in ngx_http_upstream_cache(), the connection
would be terminated directly in ngx_http_upstream_init_request().
Return an INTERNAL_SERVER_ERROR response instead.
The etag->hash must be set to 0 to avoid an empty ETag header being
returned with the 500 Internal Server Error page after the memory
allocation failure.
Reported by Markus Linnala.
The messages "ngx_slab_alloc() failed: no memory in cache keys zone"
from the file cache slab allocator are suppressed since the allocation
is likely to succeed after the forced expiration of cache nodes.
The second allocation failure is reported.
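The "quiet first attempt, forced expiration, logged second attempt" flow
can be sketched with a simulated zone. Everything here is hypothetical:
the zone is a pair of counters, not the real slab allocator.

```c
#include <stddef.h>

/* Hypothetical model of a shared memory zone for cache keys. */
typedef struct {
    size_t  free_slots;      /* slots available right now */
    size_t  expirable;       /* slots forced expiration can reclaim */
    int     errors_logged;   /* "no memory" messages written */
} sketch_zone_t;

int
sketch_try_alloc(sketch_zone_t *z)
{
    if (z->free_slots == 0) {
        return 0;
    }

    z->free_slots--;
    return 1;
}

void
sketch_force_expire(sketch_zone_t *z)
{
    z->free_slots += z->expirable;
    z->expirable = 0;
}

/* Returns 1 on success, 0 if the zone is really out of memory. */
int
sketch_node_alloc(sketch_zone_t *z)
{
    if (sketch_try_alloc(z)) {
        return 1;
    }

    /* first failure is not logged: expiration will likely help */
    sketch_force_expire(z);

    if (sketch_try_alloc(z)) {
        return 1;
    }

    z->errors_logged++;       /* second failure is reported */
    return 0;
}
```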
In theory, this can provide a bit better distribution of latencies.
Also it simplifies the code, since ngx_queue_t is now used instead
of custom implementation.
Previously, a configuration like
    location / {
        ssi on;
        ssi_types *;
        set $http_foo "bar";
        return 200 '<!--#echo var="http_foo" -->\n';
    }
resulted in NULL pointer dereference in ngx_http_get_variable() as
the variable was explicitly added to the variables hash, but its
get_handler wasn't properly set in the hash. Fix is to make sure
that get_handler is properly set by ngx_http_variables_init_vars().
The SPDY module doesn't expect that timers can be set on stream events for
reasons other than delaying output. But ngx_http_writer() could add a timer
on the write event if the delayed flag wasn't set and nginx was waiting for
AIO completion. That could cause delays in sending a response over SPDY when
file AIO was used.