Previously, while shutting down gracefully, the HTTP/2 connections were
closed in transition to idle state after all active streams have been
processed. That might never happen if the client continued opening new
streams.
Now, nginx sends GOAWAY to all HTTP/2 connections and ignores further
attempts to open new streams. A worker process will quit as soon as
processing of already opened streams is finished.
BoringSSL changed SSL_set_tlsext_host_name() to be a real function
with a (const char *) argument, so it now triggers a warning due to
conversion from (u_char *). Added an explicit cast to silence the
warning.
Prodded by Piotr Sikora, Alessandro Ghedini.
This is needed to allow TLS client certificate auth to work. With
ssl_verify_client configured, the auth daemon can choose to allow the
connection to proceed based on the certificate data.
This has been tested with Thunderbird for IMAP only. I've not yet found a
client that will do client certificate auth for POP3 or SMTP, and the method is
not really documented anywhere that I can find. That said, its simple enough
that the way I've done is probably right.
When headers are set at the "http" level and not redefined in
a server block, we now preserve conf->headers into the "http"
section configuration to inherit it to all servers.
The same applies to conf->headers_cache, though it may not be effective
if no servers use cache at the "server" level as conf->headers_cache
is only initialized if cache is enabled on a given level.
Similar changes made in fastcgi/scgi/uwsgi to preserve conf->params
and conf->params_cache.
When headers to hide are set at the "http" level and not redefined in
a server block, we now preserve compiled headers hash into the "http"
section configuration to inherit this hash to all servers.
Dependency on cache settings existed prior to 2728c4e4a9ae (0.8.44)
as Set-Cookie header was automatically hidden from responses when
using cache. This is no longer the case, and hide_headers_hash can
be safely inherited regardless of cache settings.
With this change it is now possible to load modules compiled without
the "--with-http_ssl_module" configure option into nginx binary compiled
with it, and vice versa (if a module doesn't use ssl-specific functions),
assuming both use the "--with-compat" option.
With this change it is now possible to load modules compiled without
the "--with-file-aio" configure option into nginx binary compiled with it,
and vice versa, assuming both use the "--with-compat" option.
With this change it is now possible to load modules compiled without
the "--with-threads" configure option into nginx binary compiled with it,
and vice versa (if a module does not use thread-specific functions),
assuming both use the "--with-compat" option.
It is used at least by SOAP (M-POST method, defined by RFC 2774) and
by WebDAV versioning (VERSION-CONTROL and BASELINE-CONTROL methods,
defined by RFC 3253).
Previously, user access bits were always set to "rw" unconditionally,
even with "user:r" explicitly specified. With this change we only add
default user access bits (0600) if they weren't set explicitly.
Duplicate processing was possible if the address set by realip was
listed in set_realip_from, and there was an internal redirect so module
context was cleared. This resulted in exactly the same address being set,
so this wasn't a problem before the $realip_remote_addr variable was
introduced, though now results in incorrect $realip_remote_addr being
picked.
Fix is to use ngx_http_realip_get_module_ctx() to look up module context
even if it was cleared. Additionally, the order of checks was switched to
check the configuration first as it looks more effective.
The new parameters "manager_files", "manager_sleep"
and "manager_threshold" were added to proxy_cache_path
and friends.
Note that ngx_path_manager_pt was changed to return ngx_msec_t
instead of time_t (API change).
Explicit checks for OPENSSL_VERSION_NUMBER replaced with checks
for X509_CHECK_FLAG_ALWAYS_CHECK_SUBJECT, thus allowing X509_check_host()
to be used with other libraries. In particular, X509_check_host() was
introduced in LibreSSL 2.5.0.
When the last_buf flag is cleared for add_after_body to append more data from a
subrequest, other filters may still have buffered data, which should be flushed
at this point. For example, the sub_filter may have a partial match buffered,
which will only be flushed after the subrequest is done, ending up with
interleaved data in output.
Setting last_in_chain instead of last_buf flushes the data and fixes the order
of output buffers.
The last_buf flag should only be set in the last buffer of the main request.
Otherwise, several last_buf flags can appear in output. This can, for example,
break the chunked filter, which will include several final chunks in output.
The IPV6_V6ONLY macro is now checked only while parsing appropriate flag
and when using the macro.
The ipv6only field in listen structures is always initialized to 1,
even if not supported on a given platform. This is expected to prevent
a module compiled without IPV6_V6ONLY from accidentally creating dual
sockets if loaded into main binary with proper IPV6_V6ONLY support.
It is to be used as a bitmask with various bits set/reset when appropriate.
Any bit set means that the peer should not be used, that is, exactly what
current checks do, no additional changes required.
Previously flags passed by --with-ld-opt were not used when building perl
module, which meant hardening flags provided by package build systems were not
applied.
All the errors that prevent loading configuration must be printed on the "emerg"
log level. Previously, nginx might silently fail to load configuration in some
cases as the default log level is "error".
The ssl_preread module extracts information from the SSL Client Hello message
without terminating SSL. Currently, only $ssl_preread_server_name variable
is supported, which contains server name from the SNI extension.
In this phase, head of a stream is read and analysed before proceeding to the
content phase. Amount of data read is controlled by the module implementing
the phase, but not more than defined by the "preread_buffer_size" directive.
The time spent on processing preread is controlled by the "preread_timeout"
directive.
The typical preread phase module will parse the beginning of a stream and set
variable that may be used by the content phase, for example to make routing
decision.
Previously, it was not possible to use the stream context
inside ngx_stream_init_connection() handlers. Now, limit_conn,
access handlers, as well as those added later, can create
their own contexts.
Previously, it was possible that some system calls could be
invoked while holding the accept mutex. This is clearly
wrong as it prevents incoming connections from being accepted
as quickly as possible.
Keeps the full address of the upstream server. If several servers were
contacted during proxying, their addresses are separated by commas,
e.g. "192.168.1.1:80, 192.168.1.2:80".
The stream session status is one of the following:
200 - normal completion
403 - access forbidden
500 - internal server error
502 - bad gateway
503 - limit conn
This fixes a problem with aio threads and sendfile with aio_write switched
off, as observed with range requests after fc72784b1f52 (1.9.13). Potential
problems with sendfile in threads were previously described in 9fd738b85fad,
and this seems to be one of them.
The problem occurred as file's thread_handler was set to NULL by event pipe
code after a sendfile thread task was scheduled. As a result, no sendfile
completion code was executed, and the same buffer was additionally sent
using non-threaded sendfile. Fix is to avoid modifying file's thread_handler
if aio_write is switched off.
Note that with "aio_write on" it is still possible that sendfile will use
thread_handler as set by event pipe. This is believed to be safe though,
as handlers used are compatible.
When c->recv_chain() returns an error, it is possible that we already
have some data previously read, e.g., in preread buffer. And in some
cases it may be even a complete response. Changed c->recv_chain() error
handling to process the data, much like it is already done if kevent
reports about an error.
This change, in particular, fixes processing of small responses
when an upstream fails to properly close a connection with lingering and
therefore the connection is reset, but the response is already fully
obtained by nginx (see ticket #1037).
Previously, the realip module could be left with uninitialized context after an
error in the ngx_http_realip_set_addr() function. That context could be later
accessed by $realip_remote_addr and $realip_remote_port variable handlers.
This prevents theoretical resource leak, since those threads are never joined.
Found with ThreadSanitizer.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
If the range includes two or more /16 networks and does
not start at the /16 boundary, the last subrange was not
removed (see 91cff7f97a50 for details).
Return 1 in the SSL_CTX_set_tlsext_ticket_key_cb() callback function
to indicate that a new session ticket is created, as per documentation.
Until 1.1.0, OpenSSL didn't make a distinction between non-negative
return values.
See https://git.openssl.org/?p=openssl.git;a=commitdiff;h=5c753de for details.
The IP_BIND_ADDRESS_NO_PORT option is set on upstream sockets
if proxy_bind does not specify a port. The SO_REUSEADDR option
is set on UDP upstream sockets if proxy_bind specifies a port.
Due to checking of the wrong port, IP_BIND_ADDRESS_NO_PORT was
never set, and SO_REUSEPORT was always set.
Unlike $upstream_response_length that only counts the body size,
the new variable also counts the size of response header and data
received after switching protocols when proxying WebSockets.
The change in b91bcba29351 was not enough to fix random() seeding.
On Windows, the srand() seeds the PRNG only in the current thread,
and worse, is not inherited from the calling thread. Due to this,
worker threads were not properly seeded.
Reported by Marc Bevand.
If PCRE is disabled, captures were treated as normal variables in
ngx_http_script_compile(), while code calculating flushes array length in
ngx_http_compile_complex_value() did not account captures as variables.
This could lead to write outside of the array boundary when setting
last element to -1.
Found with AddressSanitizer.
It fixes potential connection leak if some unsent data was left in the SSL
buffer. Particularly, that could happen when a client canceled the stream
after the HEADERS frame has already been created. In this case no other
frames might be produced and the HEADERS frame alone didn't flush the buffer.
Checking for return value of c->send_chain() isn't sufficient since there
are data can be left in the SSL buffer. Now the wew->ready flag is used
instead.
In particular, this fixed a connection leak in cases when all streams were
closed, but there's still some data to be sent in the SSL buffer and the
client forgot about the connection.
Particularly this fixes alerts on OS X and NetBSD systems when HTTP/2 is
configured over plain TCP sockets.
On these systems calling writev() with no data leads to EINVAL errors
being logged as "writev() failed (22: Invalid argument) while processing
HTTP/2 connection".
Previously, if the worker process exited, GOAWAY was sent to connections in
idle state, but connections with active streams were closed without GOAWAY.
This flag appeared in Linux 4.5 and is useful for avoiding thundering herd
problem.
The current Linux kernel implementation walks the list of exclusive waiters,
and queues an event to each epfd, until it finds the first waiter that has
threads blocked on it via epoll_wait().
Now it is believed that the accept mutex brings more harm than benefits.
Especially in various benchmarks it often results in situation where only
one worker grabs all connections.
On non-aligned platforms, properly cast argument before left-shifting it in
ngx_http_v2_parse_uint32 that is used with u_char. Otherwise it propagates
to int to hold the value and can step over the sign bit. Usually, on known
compilers, this results in negation. Furthermore, a subsequent store into a
wider type, that is ngx_uint_t on 64-bit platforms, results in sign-extension.
In practice, this can be observed in debug log as a very large exclusive bit
value, when client sent PRIORITY frame with exclusive bit set:
: *14 http2 PRIORITY frame sid:5 on 1 excl:8589934591 weight:17
Found with UndefinedBehaviorSanitizer.
Previously, when a buffer was processed by the sub filter, its final bytes
could be buffered by the filter even if they don't match any pattern.
This happened because the Boyer-Moore algorithm, employed by the sub filter
since b9447fc457b4 (1.9.4), matches the last characters of patterns prior to
checking other characters. If the last character is out of scope, initial
bytes of a potential match are buffered until the last character is available.
Now, after receiving a flush or recycled buffer, the filter performs
additional checks to reduce the number of buffered bytes. The potential match
is checked against the initial parts of all patterns. Non-matching bytes are
not buffered. This improves processing of a chunked response from upstream
by sending the entire chunks without buffering unless a partial match is found
at the end of a chunk.
This reduces the number of moving parts in ABI compatibility checks.
Additionally, it also allows to use OpenSSL in FIPS mode while still
using md5 for non-security tasks.
The option is only set if the socket is bound to a specific port to allow
several such sockets coexist at the same time. This is required, for example,
when nginx acts as a transparent proxy and receives two datagrams from the same
client in a short time.
The feature is only implemented for Linux.
The following two types of bind addresses are supported in addition to
$remote_addr and address literals:
- $remote_addr:$remote_port
- [$remote_addr]:$remote_port
In both cases client remote address with port is used in upstream socket bind.
This patch moves various OpenSSL-specific function calls into the
OpenSSL module and introduces ngx_ssl_ciphers() to make nginx more
crypto-library-agnostic.
When the stream is terminated the HEADERS frame can still wait in the output
queue. This frame can't be removed and must be sent to the client anyway,
since HTTP/2 uses stateful compression for headers. So in order to postpone
closing and freeing memory of such stream the special close stream handler
is set to the write event. After the HEADERS frame is sent the write event
is called and the stream will be finally closed.
Some events like receiving a RST_STREAM can trigger the read handler of such
stream in closing state and cause unexpected processing that can result in
another attempt to finalize the request. To prevent it the read handler is
now set to ngx_http_empty_handler.
Thanks to Amazon.
There is no reason to add the "Content-Length: 0" header to a proxied request
without body if the header isn't presented in the original request.
Thanks to Amazon.
According to RFC 7540, an endpoint should not send more than one RST_STREAM
frame for any stream.
Also, now all the data frames will be skipped while termination.
The ngx_http_v2_finalize_connection() closes current stream, but that is an
invalid operation while processing unbuffered upload. This results in access
to already freed memory, since the upstream module sets a cleanup handler that
also finalizes the request.
A special last buffer with cl->buf->pos set to NULL can be present in
a chain when writing request body if chunked encoding was used. This
resulted in a NULL pointer dereference if it happened to be the only
buffer left after a do...while loop iteration in ngx_write_chain_to_file().
The problem originally appeared in nginx 1.3.9 with chunked encoding
support. Additionally, rev. 3832b608dc8d (nginx 1.9.13) changed the
minimum number of buffers to trigger this from IOV_MAX (typically 1024)
to NGX_IOVS_PREALLOCATE (typically 64).
Fix is to skip such buffers in ngx_chain_to_iovec(), much like it is
done in other places.
Previously, the stream's window was kept zero in order to prevent a client
from sending the request body before it was requested (see 887cca40ba6a for
details). Until such initial window was acknowledged all requests with
data were rejected (see 0aa07850922f for details).
That approach revealed a number of problems:
1. Some clients (notably MS IE/Edge, Safari, iOS applications) show an error
or even crash if a stream is rejected;
2. This requires at least one RTT for every request with body before the
client receives window update and able to send data.
To overcome these problems the new directive "http2_body_preread_size" is
introduced. It sets the initial window and configures a special per stream
preread buffer that is used to save all incoming data before the body is
requested and processed.
If the directive's value is lower than the default initial window (65535),
as previously, all streams with data will be rejected until the new window
is acknowledged. Otherwise, no special processing is used and all requests
with data are welcome right from the connection start.
The default value is chosen to be 64k, which is bigger than the default
initial window. Setting it to zero is fully complaint to the previous
behavior.
Now, the module extracts optional port which may accompany an
IP address. This custom extension is introduced, among other
things, in order to facilitate logging of original client ports.
Addresses with ports are expected to be in the RFC 3986 format,
that is, with IPv6 addresses in square brackets. E.g.,
"X-Real-IP: [2001:0db8::1]:12345" sets client port ($remote_port)
to 12345.
Previously, when the client address was changed to the one from
the PROXY protocol header, the client port ($remote_port) was
reset to zero. Now the client port is also changed to the one
from the PROXY protocol header.
The 6f8254ae61b8 change inadvertently fixed the duplicate port
detection similar to how it was fixed for mail in b2920b517490.
It also revealed another issue: the socket type (tcp vs. udp)
wasn't taken into account.
Since 4fbef397c753 nginx rejects with the 400 error any attempts of
requesting different host over the same connection, if the relevant
virtual server requires verification of a client certificate.
While requesting hosts other than negotiated isn't something legal
in HTTP/1.x, the HTTP/2 specification explicitly permits such requests
for connection reuse and has introduced a special response code 421.
According to RFC 7540 Section 9.1.2 this code can be sent by a server
that is not configured to produce responses for the combination of
scheme and authority that are included in the request URI. And the
client may retry the request over a different connection.
Now this code is used for requests that aren't authorized in current
connection. After receiving the 421 response a client will be able
to open a new connection, provide the required certificate and retry
the request.
Unfortunately, not all clients currently are able to handle it well.
Notably Chrome just shows an error, while at least the latest version
of Firefox retries the request over a new connection.
Using the same DH parameters on multiple servers is believed to be subject
to precomputation attacks, see http://weakdh.org/. Additionally, 1024 bits
are not enough in the modern world as well. Let users provide their own
DH parameters with the ssl_dhparam directive if they want to use EDH ciphers.
Note that SSL_CTX_set_dh_auto() as provided by OpenSSL 1.1.0 uses fixed
DH parameters from RFC 5114 and RFC 3526, and therefore subject to the same
precomputation attacks. We avoid using it as well.
This change also fixes compilation with OpenSSL 1.1.0-pre5 (aka Beta 2),
as OpenSSL developers changed their policy after releasing Beta 1 and
broke API once again by making the DH struct opaque (see ticket #860).
OpenSSL 1.0.2+ allows configuring a curve list instead of a single curve
previously supported. This allows use of different curves depending on
what client supports (as available via the elliptic_curves extension),
and also allows use of different curves in an ECDHE key exchange and
in the ECDSA certificate.
The special value "auto" was introduced (now the default for ssl_ecdh_curve),
which means "use an internal list of curves as available in the OpenSSL
library used". For versions prior to OpenSSL 1.0.2 it maps to "prime256v1"
as previously used. The default in 1.0.2b+ prefers prime256v1 as well
(and X25519 in OpenSSL 1.1.0+).
As client vs. server preference of curves is controlled by the
same option as used for ciphers (SSL_OP_CIPHER_SERVER_PREFERENCE),
the ssl_prefer_server_ciphers directive now controls both.
The SSL_CTX_add0_chain_cert() function as introduced in OpenSSL 1.0.2 now
used instead of SSL_CTX_add_extra_chain_cert().
SSL_CTX_add_extra_chain_cert() adds extra certs for all certificates
in the context, while SSL_CTX_add0_chain_cert() only to a particular
certificate. There is no difference unless multiple certificates are used,
though it is important when using multiple certificates.
Additionally, SSL_CTX_select_current_cert() is now called before using
a chain to make sure correct chain will be returned.
A pointer to a previously configured certificate now stored in a certificate.
This makes it possible to iterate though all certificates configured in
the SSL context. This is now used to configure OCSP stapling for all
certificates, and in ngx_ssl_session_id_context().
As SSL_CTX_use_certificate() frees previously loaded certificate of the same
type, and we have no way to find out if it's the case, X509_free() calls
are now posponed till ngx_ssl_cleanup_ctx().
Note that in OpenSSL 1.0.2+ this can be done without storing things in exdata
using the SSL_CTX_set_current_cert() and SSL_CTX_get0_certificate() functions.
These are not yet available in all supported versions though, so it's easier
to continue to use exdata for now.
This makes it possible to properly return OCSP staple with multiple
certificates configured.
Note that it only works properly in OpenSSL 1.0.1d+, 1.0.0k, 0.9.8y+.
In older versions SSL_get_certificate() fails to return correct certificate
when the certificate status callback is called.
Both minor and major versions are now limited to 999 maximum. In case of
r->http_minor, this limit is already implied by the code. Major version,
r->http_major, in theory can be up to 65535 with current code, but such
values are very unlikely to become real (and, additionally, such values
are not allowed by RFC 7230), so the same test was used for r->http_major.
When it's known that the kernel supports EPOLLRDHUP, there is no need in
additional recv() call to get EOF or error when the flag is absent in the
event generated by the kernel. A special runtime test is done at startup
to detect if EPOLLRDHUP is actually supported by the kernel because
epoll_ctl() silently ignores unknown flags.
With this knowledge it's now possible to drop the "ready" flag for partial
read. Previously, the "ready" flag was kept until the recv() returned EOF
or error. In particular, this change allows the lingering close heuristics
(which relies on the "ready" flag state) to actually work on Linux, and not
wait for more data in most cases.
The "available" flag is now used in the read event with the semantics similar
to the corresponding counter in kqueue.
This parameter lets binding the proxy connection to a non-local address.
Upstream will see the connection as coming from that address.
When used with $remote_addr, upstream will accept the connection from real
client address.
Example:
proxy_bind $remote_addr transparent;
The WINDOW_UPDATE frame could be left in the output queue for an indefinite
period of time resulting in the request timeout.
This might happen if reading of the body was triggered by an event unrelated
to client connection, e.g. by the limit_req timer.
Particularly this prevents sending WINDOW_UPDATE with zero delta
which can result in PROTOCOL_ERROR.
Also removed surplus setting of no_flow_control to 0.
The ngx_thread_pool_done object isn't volatile, and at least some
compilers assume that it is permitted to reorder modifications of
volatile and non-volatile objects. Added appropriate ngx_memory_barrier()
calls to make sure all modifications will happen before the lock is released.
Reported by Mindaugas Rasiukevicius,
http://mailman.nginx.org/pipermail/nginx-devel/2016-April/008160.html.