If an upstream with variables evaluated to address without a port,
then instead of a "no port in upstream" error an attempt was made
to connect() which failed with EADDRNOTAVAIL.
The HEADERS frame is always represented by more than one buffer since
b930e598a199, but the handling code hasn't been adjusted.
Only the first buffer of HEADERS frame was checked and if it had been
sent while others had not, the rest of the frame was dropped, resulting
in broken connection.
Before b930e598a199, the problem could only be seen in case of HEADERS
frame with CONTINUATION.
The r->invalid_header flag wasn't reset once an invalid header appeared in a
request, resulting in all subsequent headers in the request were also marked
as invalid.
The directive toggles conversion of HEAD to GET for cacheable proxy requests.
When disabled, $request_method must be added to cache key for consistency.
By default, HEAD is converted to GET as before.
OpenSSL doesn't check if the negotiated protocol has been announced.
As a result, the client might force using HTTP/2 even if it wasn't
enabled in configuration.
It caused inconsistency between setting "in_closed" flag and the moment when
the last DATA frame was actually read. As a result, the body buffer might not
be initialized properly in ngx_http_v2_init_request_body(), which led to a
segmentation fault in ngx_http_v2_state_read_data(). Also it might cause
start processing of incomplete body.
This issue could be triggered when the processing of a request was delayed,
e.g. in the limit_req or auth_request modules.
Now it limits only the maximum length of literal string (either raw or
compressed) in HPACK request header fields. It's easier to understand
and to describe in the documentation.
Previous code has been based on assumption that the header block can only be
splitted at the borders of individual headers. That wasn't the case and might
result in emitting frames bigger than the frame size limit.
The current approach is to split header blocks by the frame size limit.
Previously, nginx worker would crash because of a double free
if client disconnected or timed out before sending all headers.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Previously, streams that were indirectly reprioritized (either because of
a new exclusive dependency on their parent or because of removal of their
parent from the dependency tree), didn't have their pointer to the parent
node updated.
This broke detection of circular dependencies and, as a result, nginx
worker would crash due to stack overflow whenever such dependency was
introduced.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Per RFC7540, a stream cannot depend on itself.
Previously, this requirement was enforced on PRIORITY frames, but not on
HEADERS frames and due to the implementation details nginx worker would
crash (stack overflow) while opening self-dependent stream.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Since an output buffer can only be used for either reading or sending, small
amounts of data left from the previous operation (due to some limits) must be
sent before nginx will be able to read further into the buffer. Using only
one output buffer can result in suboptimal behavior that manifests itself in
forming and sending too small chunks of data. This is particularly painful
with SPDY (or HTTP/2) where each such chunk needs to be prefixed with some
header.
The default flow-control window in HTTP/2 is 64k minus one bytes. With one
32k output buffer this results is one byte left after exhausting the window.
With two 32k buffers the data will be read into the second free buffer before
sending, thus the minimum output is increased to 32k + 1 bytes which is much
better.
A configuration with a named location inside a zero-length prefix
or regex location used to trigger a segmentation fault, as
ngx_http_core_location() failed to properly detect if a nested location
was created. Example configuration to reproduce the problem:
location "" {
location @foo {}
}
Fix is to not rely on a parent location name length, but rather check
command type we are currently parsing.
Identical fix is also applied to ngx_http_rewrite_if(), which used to
incorrectly assume the "if" directive is on server{} level in such
locations.
Reported by Markus Linnala.
Found with afl-fuzz.
This prevents a potential attack that discloses cached data if an attacker
will be able to craft a hash collision between some cache key the attacker
is allowed to access and another cache key with protected data.
See http://mailman.nginx.org/pipermail/nginx-devel/2015-September/007288.html.
Thanks to Gena Makhomed and Sergey Brester.
The value of NGX_ERROR, returned from filter handlers, was treated as a generic
upstream error and changed to NGX_HTTP_INTERNAL_SERVER_ERROR before calling
ngx_http_finalize_request(). This resulted in "header already sent" alert
if header was already sent in filter handlers.
The problem appeared in 54e9b83d00f0 (1.7.5).
This overflow has become possible after the change in 06e850859a26,
since concurrent subrequests are not limited now and each of them is
counted in r->main->count.
Resolved warnings about declarations that hide previous local declarations.
Warnings about WSASocketA() being deprecated resolved by explicit use of
WSASocketW() instead of WSASocket(). When compiling without IPv6 support,
WinSock deprecated warnings are disabled to allow use of gethostbyname().
The following configuration with alias, nested location and try_files
resulted in wrong file being used. Request "/foo/test.gif" tried to
use "/tmp//foo/test.gif" instead of "/tmp/test.gif":
location /foo/ {
alias /tmp/;
location ~ gif {
try_files $uri =405;
}
}
Additionally, rev. c985d90a8d1f introduced a regression if
the "/tmp//foo/test.gif" file was found (ticket #768). Resulting URI
was set to "gif?/foo/test.gif", as the code used clcf->name of current
location ("location ~ gif") instead of parent one ("location /foo/").
Fix is to use r->uri instead of clcf->name in all cases in the
ngx_http_core_try_files_phase() function. It is expected to be
already matched and identical to the clcf->name of the right
location.
If alias was used in a location given by a regular expression,
nginx used to do wrong thing in try_files if a location name (i.e.,
regular expression) was an exact prefix of URI. The following
configuration triggered a segmentation fault on a request to "/mail":
location ~ /mail {
alias /path/to/directory;
try_files $uri =404;
}
Reported by Per Hansson.
Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured. As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.
Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set. The upstream keepalive module
was modified to follow this.
The function is now called ngx_parse_http_time(), and can be used by
any code to parse HTTP-style date and time. In particular, it will be
used for OCSP stapling.
For compatibility, a macro to map ngx_http_parse_time() to the new name
provided for a while.
When configured, an individual listen socket on a given address is
created for each worker process. This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance. As of now it works on Linux and DragonFly BSD.
Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/. With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.
Based on previous work by Sepherosa Ziehau and Yingqi Lu.
There is no need to set "i" to 0, as it's expected to be 0 assuming
the bindings are properly sorted, and we already rely on this when
explicitly set hport->naddrs to 1. Remaining conditional code is
replaced with identical "hport->naddrs = i + 1".
Identical modifications are done in the mail and stream modules,
in the ngx_mail_optimize_servers() and ngx_stream_optimize_servers()
functions, respectively.
No functional changes.
It's now enough to specify proxy_protocol option in one listen directive to
enable it in all servers listening on the same address/port. Previously,
the setting from the first directive was always used.
If a peer was initially skipped due to max_fails, there's no reason
not to try it again if enough time has passed, and the next_upstream
logic is in action.
This also reduces diffs with NGINX Plus.
This helps to avoid suboptimal behavior when a client waits for a control
frame or more data to increase window size, but the frames have been delayed
in the socket buffer.
The delays can be caused by bad interaction between Nagle's algorithm on
nginx side and delayed ACK on the client side or by TCP_CORK/TCP_NOPUSH
if SPDY was working without SSL and sendfile() was used.
The pushing code is now very similar to ngx_http_set_keepalive().
If any preread body bytes were sent in the first chain, chunk size was
incorrectly added before the whole chain, including header, resulting in
an invalid request sent to upstream. Fixed to properly add chunk size
after the header.
The r->request_body_no_buffering flag was introduced. It instructs
client request body reading code to avoid reading the whole body, and
to call post_handler early instead. The caller should use the
ngx_http_read_unbuffered_request_body() function to read remaining
parts of the body.
Upstream module is now able to use this mode, if configured with
the proxy_request_buffering directive.
If the last header evaluation resulted in an empty header, the e.skip flag
was set and was not reset when we've switched to evaluation of body_values.
This incorrectly resulted in body values being skipped instead of producing
some correct body as set by proxy_set_body. Fix is to properly reset
the e.skip flag.
As the problem only appeared if the last potentially non-empty header
happened to be empty, it only manifested itself if proxy_set_body was used
with proxy_cache.
LibreSSL removed support for export ciphers and a call to
SSL_CTX_set_tmp_rsa_callback() results in an error left in the error
queue. This caused alerts "ignoring stale global SSL error (...called
a function you should not call) while SSL handshaking" on a first connection
in each worker process.
Keeping the ready flag in this case might results in missing notification of
broken connection until nginx tried to use it again.
While there, stale comment about stale event was removed since this function
is also can be called directly.
In case of filter finalization, r->upstream might be changed during
the ngx_event_pipe() call. Added an argument to preserve it while
calling the ngx_http_upstream_process_request() function.
A request may be already finalized when ngx_http_upstream_finalize_request()
is called, due to filter finalization: after filter finalization upstream
can be finalized via ngx_http_upstream_cleanup(), either from
ngx_http_terminate_request(), or because a new request was initiated
to an upstream. Then the upstream code will see an error returned from
the filter chain and will call the ngx_http_upstream_finalize_request()
function again.
To prevent corruption of various upstream data in this situation, make sure
to do nothing but merely call ngx_http_finalize_request().
Prodded by Yichun Zhang, for details see the thread at
http://nginx.org/pipermail/nginx-devel/2015-February/006539.html.
Previously, connection hung after calling ngx_http_ssl_handshake() with
rev->ready set and no bytes in socket to read. It's possible in at least the
following cases:
- when processing a connection with expired TCP_DEFER_ACCEPT on Linux
- after parsing PROXY protocol header if it arrived in a separate TCP packet
Thanks to James Hamlin.
When replacing a stale cache entry, its last_modified and etag could be
inherited from the old entry if the response code is not 200 or 206. Moreover,
etag could be inherited with any response code if it's missing in the new
response. As a result, the cache entry is left with invalid last_modified or
etag which could lead to broken revalidation.
For example, when a file is deleted from backend, its last_modified is copied to
the new 404 cache entry and is used later for revalidation. Once the old file
appears again with its original timestamp, revalidation succeeds and the cached
404 response is sent to client instead of the file.
The problem appeared with etags in 44b9ab7752e3 (1.7.3) and affected
last_modified in 1573fc7875fa (1.7.9).
Repeatedly calling ngx_http_upstream_add_chash_point() to create
the points array in sorted order, is O(n^2) to the total weight.
This can cause nginx startup and reconfigure to be substantially
delayed. For example, when total weight is 1000, startup takes
5s on a modern laptop.
Replace this with a linear insertion followed by QuickSort and
duplicates removal. Startup for total weight of 1000 reduces to 40ms.
Based on a patch by Wai Keen Woon.
This reduces layering violation and simplifies the logic of AIO preread, since
it's now triggered by the send chain function itself without falling back to
the copy filter. The context of AIO operation is now stored per file buffer,
which makes it possible to properly handle cases when multiple buffers come
from different locations, each with its own configuration.
If fastcgi_pass (or any look-alike that doesn't imply a default
port) is specified as an IP literal (as opposed to a hostname),
port absence was not detected at configuration time and could
result in EADDRNOTAVAIL at run time.
Fixed this in such a way that configs like
http {
server {
location / {
fastcgi_pass 127.0.0.1;
}
}
upstream 127.0.0.1 {
server 10.0.0.1:12345;
}
}
still work. That is, port absence check is delayed until after
we make sure there's no explicit upstream with such a name.
If use_temp_path is set to off, a subdirectory "temp" is created in the cache
directory. It's used instead of proxy_temp_path and friends for caching
upstream response.
Some parts of code related to handling variants of a resource moved into
a separate function that is called earlier. This allows to use cache file
name as a prefix for temporary file in the following patch.
The configuration handling code has changed to look similar to the proxy_store
directive and friends. This simplifies adding variable support in the following
patch.
No functional changes.
Currently, storing and caching mechanisms cannot work together, and a
configuration error is thrown when the proxy_store and proxy_cache
directives (as well as their friends) are configured on the same level.
But configurations like in the example below were allowed and could result
in critical errors in the error log:
proxy_store on;
location / {
proxy_cache one;
}
Only proxy_store worked in this case.
For more predictable and errorless behavior these directives now prevent
each other from being inherited from the previous level.
This changes internal API related to handling of the "store"
flag in ngx_http_upstream_conf_t. Previously, a non-null value
of "store_lengths" was enough to enable store functionality with
custom path. Now, the "store" flag is also required to be set.
No functional changes.
The proxy_store, fastcgi_store, scgi_store and uwsgi_store were inherited
incorrectly if a directive with variables was defined, and then redefined
to the "on" value, i.e. in configurations like:
proxy_store /data/www$upstream_http_x_store;
location / {
proxy_store on;
}
In the following configuration request was sent to a backend without
URI changed to '/' due to if:
location /proxy-pass-uri {
proxy_pass http://127.0.0.1:8080/;
set $true 1;
if ($true) {
# nothing
}
}
Fix is to inherit conf->location from the location where proxy_pass was
configured, much like it's done with conf->vars.
The proxy_pass directive and other handlers are not expected to be inherited
into nested locations, but there is a special code to inherit upstream
handlers into limit_except blocks, as well as a configuration into if{}
blocks. This caused incorrect behaviour in configurations with nested
locations and limit_except blocks, like this:
location / {
proxy_pass http://u;
location /inner/ {
# no proxy_pass here
limit_except GET {
# nothing
}
}
}
In such a configuration the limit_except block inside "location /inner/"
unexpectedly used proxy_pass defined in "location /", while it shouldn't.
Fix is to avoid inheritance of conf->upstream.upstream (and
conf->proxy_lengths) into locations which don't have noname flag.
Instead of independant inheritance of conf->upstream.upstream (proxy_pass
without variables) and conf->proxy_lengths (proxy_pass with variables)
we now test them both and inherit only if neither is set. Additionally,
SSL context is also inherited only in this case now.
Based on the patch by Alexey Radkov.
RFC7232 says:
The 304 (Not Modified) status code indicates that a conditional GET
or HEAD request has been received and would have resulted in a 200
(OK) response if it were not for the fact that the condition
evaluated to false.
which means that there is no reason to send requests with "If-None-Match"
and/or "If-Modified-Since" headers for responses cached with other status
codes.
Also, sending conditional requests for responses cached with other status
codes could result in a strange behavior, e.g. upstream server returning
304 Not Modified for cached 404 Not Found responses, etc.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
In case of a cache lock timeout and in the aio handler we now call
r->write_event_handler() instead of a connection write handler,
to make sure to run appropriate subrequest. Previous code failed to run
inactive subrequests and hence resulted in suboptimal behaviour, see
report by Yichun Zhang:
http://mailman.nginx.org/pipermail/nginx-devel/2013-October/004435.html
(Infinite hang claimed in the report seems impossible without 3rd party
modules, as subrequests will be eventually woken up by the postpone filter.)
To ensure proper logging make sure to set current_request in all event
handlers, including resolve, ssl handshake, cache lock wait timer and
aio read handlers. A macro ngx_http_set_log_request() introduced to
simplify this.
The alert was introduced in 03ff14058272 (1.5.4), and was triggered on each
post_action invocation.
There is no real need to call header filters in case of post_action,
so return NGX_OK from ngx_http_send_header() if r->post_action is set.
This helps to avoid delays in sending the last chunk of data because
of bad interaction between Nagle's algorithm on nginx side and
delayed ACK on the client side.
Delays could also be caused by TCP_CORK/TCP_NOPUSH if SPDY was
working without SSL and sendfile() was used.
The upstream modules remove and alter a number of client headers
before sending the request to upstream. This set of headers is
smaller or even empty when cache is disabled.
It's still possible that a request in a cache-enabled location is
uncached, for example, if cache entry counter is below min_uses.
In this case it's better to alter a smaller set of headers and
pass more client headers to backend unchanged. One of the benefits
is enabling server-side byte ranges in such requests.
Once this age is reached, the cache lock is discarded and another
request can acquire the lock. Requests which failed to acquire
the lock are not allowed to cache the response.