If enabled, workers are bound to available CPUs, each worker to once CPU
in order. If there are more workers than available CPUs, remaining are
bound in a loop, starting again from the first available CPU.
The optional mask parameter defines which CPUs are available for automatic
binding.
In collaboration with Vladimir Homutov.
With main request buffered, it's possible, that a slice subrequest will send
output before it. For example, while main request is waiting for aio read to
complete, a slice subrequest can start an aio operation as well. The order
in which aio callbacks are called is undetermined.
Skip SSL_CTX_set_tlsext_servername_callback in case of renegotiation.
Do nothing in SNI callback as in this case it will be supplied with
request in c->data which isn't expected and doesn't work this way.
This was broken by b40af2fd1c16 (1.9.6) with OpenSSL master branch and LibreSSL.
Splits a request into subrequests, each providing a specific range of response.
The variable "$slice_range" must be used to set subrequest range and proper
cache key. The directive "slice" sets slice size.
The following example splits requests into 1-megabyte cacheable subrequests.
server {
listen 8000;
location / {
slice 1m;
proxy_cache cache;
proxy_cache_key $uri$is_args$args$slice_range;
proxy_set_header Range $slice_range;
proxy_cache_valid 200 206 1h;
proxy_pass http://127.0.0.1:9000;
}
}
If an upstream with variables evaluated to address without a port,
then instead of a "no port in upstream" error an attempt was made
to connect() which failed with EADDRNOTAVAIL.
This fixes suboptimal behavior caused by surplus lseek() for sequential writes
on systems without pwrite(). A consecutive read after write might result in an
error on systems without pread() and pwrite().
Fortunately, at the moment there are no widely used systems without these
syscalls.
The HEADERS frame is always represented by more than one buffer since
b930e598a199, but the handling code hasn't been adjusted.
Only the first buffer of HEADERS frame was checked and if it had been
sent while others had not, the rest of the frame was dropped, resulting
in broken connection.
Before b930e598a199, the problem could only be seen in case of HEADERS
frame with CONTINUATION.
The r->invalid_header flag wasn't reset once an invalid header appeared in a
request, resulting in all subsequent headers in the request were also marked
as invalid.
The directive toggles conversion of HEAD to GET for cacheable proxy requests.
When disabled, $request_method must be added to cache key for consistency.
By default, HEAD is converted to GET as before.
OpenSSL doesn't check if the negotiated protocol has been announced.
As a result, the client might force using HTTP/2 even if it wasn't
enabled in configuration.
It caused inconsistency between setting "in_closed" flag and the moment when
the last DATA frame was actually read. As a result, the body buffer might not
be initialized properly in ngx_http_v2_init_request_body(), which led to a
segmentation fault in ngx_http_v2_state_read_data(). Also it might cause
start processing of incomplete body.
This issue could be triggered when the processing of a request was delayed,
e.g. in the limit_req or auth_request modules.
The code failed to ensure that "s" is within the buffer passed for
parsing when checking for "ms", and this resulted in unexpected errors when
parsing non-null-terminated strings with trailing "m". The bug manifested
itself when the expires directive was used with variables.
Found by Roman Arutyunyan.
Now it limits only the maximum length of literal string (either raw or
compressed) in HPACK request header fields. It's easier to understand
and to describe in the documentation.
Previous code has been based on assumption that the header block can only be
splitted at the borders of individual headers. That wasn't the case and might
result in emitting frames bigger than the frame size limit.
The current approach is to split header blocks by the frame size limit.
Previously, nginx worker would crash because of a double free
if client disconnected or timed out before sending all headers.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Previously, streams that were indirectly reprioritized (either because of
a new exclusive dependency on their parent or because of removal of their
parent from the dependency tree), didn't have their pointer to the parent
node updated.
This broke detection of circular dependencies and, as a result, nginx
worker would crash due to stack overflow whenever such dependency was
introduced.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Per RFC7540, a stream cannot depend on itself.
Previously, this requirement was enforced on PRIORITY frames, but not on
HEADERS frames and due to the implementation details nginx worker would
crash (stack overflow) while opening self-dependent stream.
Found with afl-fuzz.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
As setitimer() isn't available on Windows, time wasn't updated at all
if timer_resolution was used with the select event method. Fix is
to ignore timer_resolution in such cases.
This context is needed for shared sessions cache to work in configurations
with multiple virtual servers sharing the same port. Unfortunately, OpenSSL
does not provide an API to access the session context, thus storing it
separately.
In collaboration with Vladimir Homutov.
If no space left in buffer after adding formatting symbols, error message
could be left without terminating null. The fix is to output message using
actual length.
The code for displaying version info and configuration info seemed to be
cluttering up the main function. I was finding it hard to read main. This
extracts out all of the logic for displaying version and configuration info
into its own function, thus making main easier to read.
RAND_pseudo_bytes() is deprecated in the OpenSSL master branch, so the only
use was changed to RAND_bytes(). Access to internal structures is no longer
possible, so now we don't try to set SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS even
if it's defined.
Since an output buffer can only be used for either reading or sending, small
amounts of data left from the previous operation (due to some limits) must be
sent before nginx will be able to read further into the buffer. Using only
one output buffer can result in suboptimal behavior that manifests itself in
forming and sending too small chunks of data. This is particularly painful
with SPDY (or HTTP/2) where each such chunk needs to be prefixed with some
header.
The default flow-control window in HTTP/2 is 64k minus one bytes. With one
32k output buffer this results is one byte left after exhausting the window.
With two 32k buffers the data will be read into the second free buffer before
sending, thus the minimum output is increased to 32k + 1 bytes which is much
better.
A configuration like
server { server_name .foo^@; }
server { server_name .foo; }
resulted in a segmentation fault during construction of server names hash.
Reported by Markus Linnala.
Found with afl-fuzz.
A configuration with a named location inside a zero-length prefix
or regex location used to trigger a segmentation fault, as
ngx_http_core_location() failed to properly detect if a nested location
was created. Example configuration to reproduce the problem:
location "" {
location @foo {}
}
Fix is to not rely on a parent location name length, but rather check
command type we are currently parsing.
Identical fix is also applied to ngx_http_rewrite_if(), which used to
incorrectly assume the "if" directive is on server{} level in such
locations.
Reported by Markus Linnala.
Found with afl-fuzz.
This prevents a potential attack that discloses cached data if an attacker
will be able to craft a hash collision between some cache key the attacker
is allowed to access and another cache key with protected data.
See http://mailman.nginx.org/pipermail/nginx-devel/2015-September/007288.html.
Thanks to Gena Makhomed and Sergey Brester.
The value of NGX_ERROR, returned from filter handlers, was treated as a generic
upstream error and changed to NGX_HTTP_INTERNAL_SERVER_ERROR before calling
ngx_http_finalize_request(). This resulted in "header already sent" alert
if header was already sent in filter handlers.
The problem appeared in 54e9b83d00f0 (1.7.5).
This overflow has become possible after the change in 06e850859a26,
since concurrent subrequests are not limited now and each of them is
counted in r->main->count.
Resolved warnings about declarations that hide previous local declarations.
Warnings about WSASocketA() being deprecated resolved by explicit use of
WSASocketW() instead of WSASocket(). When compiling without IPv6 support,
WinSock deprecated warnings are disabled to allow use of gethostbyname().
The following configuration with alias, nested location and try_files
resulted in wrong file being used. Request "/foo/test.gif" tried to
use "/tmp//foo/test.gif" instead of "/tmp/test.gif":
location /foo/ {
alias /tmp/;
location ~ gif {
try_files $uri =405;
}
}
Additionally, rev. c985d90a8d1f introduced a regression if
the "/tmp//foo/test.gif" file was found (ticket #768). Resulting URI
was set to "gif?/foo/test.gif", as the code used clcf->name of current
location ("location ~ gif") instead of parent one ("location /foo/").
Fix is to use r->uri instead of clcf->name in all cases in the
ngx_http_core_try_files_phase() function. It is expected to be
already matched and identical to the clcf->name of the right
location.
If alias was used in a location given by a regular expression,
nginx used to do wrong thing in try_files if a location name (i.e.,
regular expression) was an exact prefix of URI. The following
configuration triggered a segmentation fault on a request to "/mail":
location ~ /mail {
alias /path/to/directory;
try_files $uri =404;
}
Reported by Per Hansson.
Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured. As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.
Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set. The upstream keepalive module
was modified to follow this.
If nginx was used under OpenVZ and a container with nginx was suspended
and resumed, configuration tests started to fail because of EADDRINUSE
returned from listen() instead of bind():
# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
nginx: configuration file /etc/nginx/nginx.conf test failed
With this change EADDRINUSE errors returned by listen() are handled
similarly to errors returned by bind(), and configuration tests work
fine in the same environment:
# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
More details about OpenVZ suspend/resume bug:
https://bugzilla.openvz.org/show_bug.cgi?id=2470
OCSP responses may contain no nextUpdate. As per RFC 6960, this means
that nextUpdate checks should be bypassed. Handle this gracefully by
using NGX_MAX_TIME_T_VALUE as "valid" in such a case.
The problem was introduced by 6893a1007a7c (1.9.2).
Reported by Matthew Baldwin.
Broken by 6893a1007a7c (1.9.2) during introduction of strict OCSP response
validity checks. As stapling file is expected to be returned unconditionally,
fix is to set its validity to the maximum supported time.
Reported by Faidon Liambotis.
Once upstream is connected, the upstream buffer is allocated. Previously, the
proxy module used the buffer allocation status to check if upstream is
connected. Now it's enough to check the flag.
If the -T option is passed, additionally to configuration test, configuration
files are output to stdout.
In the debug mode, configuration files are kept in memory and can be accessed
using a debugger.
The function is now called ngx_parse_http_time(), and can be used by
any code to parse HTTP-style date and time. In particular, it will be
used for OCSP stapling.
For compatibility, a macro to map ngx_http_parse_time() to the new name
provided for a while.
With this change it's no longer needed to pass -D_GNU_SOURCE manually,
and -D_FILE_OFFSET_BITS=64 is set to use 64-bit off_t.
Note that nginx currently fails to work properly with master process
enabled on GNU Hurd, as fcntl(F_SETOWN) returns EOPNOTSUPP for sockets
as of GNU Hurd 0.6. Additionally, our strerror() preloading doesn't
work well with GNU Hurd, as it uses large numbers for most errors.
When configured, an individual listen socket on a given address is
created for each worker process. This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance. As of now it works on Linux and DragonFly BSD.
Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/. With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.
Based on previous work by Sepherosa Ziehau and Yingqi Lu.
There is no need to set "i" to 0, as it's expected to be 0 assuming
the bindings are properly sorted, and we already rely on this when
explicitly set hport->naddrs to 1. Remaining conditional code is
replaced with identical "hport->naddrs = i + 1".
Identical modifications are done in the mail and stream modules,
in the ngx_mail_optimize_servers() and ngx_stream_optimize_servers()
functions, respectively.
No functional changes.
This may happen if eventfd() returns ENOSYS, notably seen on CentOS 5.4.
Such a failure will now just disable the notification mechanism and let
the callers cope with it, instead of failing to start worker processes.
If thread pools are not configured, this can safely be ignored.
Two mechanisms are implemented to make it possible to store pointers
in shared memory on Windows, in particular on Windows Vista and later
versions with ASLR:
- The ngx_shm_remap() function added to allow remapping of a shared memory
zone to the address originally used for it in the master process. While
important, it doesn't solve the problem by itself as in many cases it's
not possible to use the address because of conflicts with other
allocations.
- We now create mappings at the same address in all processes by starting
mappings at predefined addresses normally unused by newborn processes.
These two mechanisms combined allow to use shared memory on Windows
almost without problems, including reloads.
Based on the patch by Sergey Brester:
http://mailman.nginx.org/pipermail/nginx-devel/2015-April/006836.html
It's now enough to specify proxy_protocol option in one listen directive to
enable it in all servers listening on the same address/port. Previously,
the setting from the first directive was always used.
When client or upstream connection is closed, level-triggered read event
remained active until the end of the session leading to cpu hog. Now the flag
NGX_CLOSE_EVENT is used to unschedule the event.
If a peer was initially skipped due to max_fails, there's no reason
not to try it again if enough time has passed, and the next_upstream
logic is in action.
This also reduces diffs with NGINX Plus.
Similar to ngx_http_file_cache_set_slot(), the last component of file->name
with a fixed length of 10 bytes, as generated in ngx_create_temp_path(), is
used as a source for the names of intermediate subdirectories with each one
taking its own part. Ensure that the sum of specified levels with slashes
fits into the length (ticket #731).
Missing call to X509_STORE_CTX_free when X509_STORE_CTX_init fails.
Missing call to OCSP_CERTID_free when OCSP_request_add0_id fails.
Possible leaks in vary particular scenariis of memory shortage.