This fixes a problem with aio threads and sendfile with aio_write switched
off, as observed with range requests after fc72784b1f52 (1.9.13). Potential
problems with sendfile in threads were previously described in 9fd738b85fad,
and this seems to be one of them.
The problem occurred as file's thread_handler was set to NULL by event pipe
code after a sendfile thread task was scheduled. As a result, no sendfile
completion code was executed, and the same buffer was additionally sent
using non-threaded sendfile. Fix is to avoid modifying file's thread_handler
if aio_write is switched off.
Note that with "aio_write on" it is still possible that sendfile will use
thread_handler as set by event pipe. This is believed to be safe though,
as handlers used are compatible.
When c->recv_chain() returns an error, it is possible that we already
have some data previously read, e.g., in preread buffer. And in some
cases it may be even a complete response. Changed c->recv_chain() error
handling to process the data, much like it is already done if kevent
reports about an error.
This change, in particular, fixes processing of small responses
when an upstream fails to properly close a connection with lingering and
therefore the connection is reset, but the response is already fully
obtained by nginx (see ticket #1037).
Return 1 in the SSL_CTX_set_tlsext_ticket_key_cb() callback function
to indicate that a new session ticket is created, as per documentation.
Until 1.1.0, OpenSSL didn't make a distinction between non-negative
return values.
See https://git.openssl.org/?p=openssl.git;a=commitdiff;h=5c753de for details.
The IP_BIND_ADDRESS_NO_PORT option is set on upstream sockets
if proxy_bind does not specify a port. The SO_REUSEADDR option
is set on UDP upstream sockets if proxy_bind specifies a port.
Due to checking of the wrong port, IP_BIND_ADDRESS_NO_PORT was
never set, and SO_REUSEPORT was always set.
This flag appeared in Linux 4.5 and is useful for avoiding thundering herd
problem.
The current Linux kernel implementation walks the list of exclusive waiters,
and queues an event to each epfd, until it finds the first waiter that has
threads blocked on it via epoll_wait().
Now it is believed that the accept mutex brings more harm than benefits.
Especially in various benchmarks it often results in situation where only
one worker grabs all connections.
The option is only set if the socket is bound to a specific port to allow
several such sockets coexist at the same time. This is required, for example,
when nginx acts as a transparent proxy and receives two datagrams from the same
client in a short time.
The feature is only implemented for Linux.
This patch moves various OpenSSL-specific function calls into the
OpenSSL module and introduces ngx_ssl_ciphers() to make nginx more
crypto-library-agnostic.
Using the same DH parameters on multiple servers is believed to be subject
to precomputation attacks, see http://weakdh.org/. Additionally, 1024 bits
are not enough in the modern world as well. Let users provide their own
DH parameters with the ssl_dhparam directive if they want to use EDH ciphers.
Note that SSL_CTX_set_dh_auto() as provided by OpenSSL 1.1.0 uses fixed
DH parameters from RFC 5114 and RFC 3526, and therefore subject to the same
precomputation attacks. We avoid using it as well.
This change also fixes compilation with OpenSSL 1.1.0-pre5 (aka Beta 2),
as OpenSSL developers changed their policy after releasing Beta 1 and
broke API once again by making the DH struct opaque (see ticket #860).
OpenSSL 1.0.2+ allows configuring a curve list instead of a single curve
previously supported. This allows use of different curves depending on
what client supports (as available via the elliptic_curves extension),
and also allows use of different curves in an ECDHE key exchange and
in the ECDSA certificate.
The special value "auto" was introduced (now the default for ssl_ecdh_curve),
which means "use an internal list of curves as available in the OpenSSL
library used". For versions prior to OpenSSL 1.0.2 it maps to "prime256v1"
as previously used. The default in 1.0.2b+ prefers prime256v1 as well
(and X25519 in OpenSSL 1.1.0+).
As client vs. server preference of curves is controlled by the
same option as used for ciphers (SSL_OP_CIPHER_SERVER_PREFERENCE),
the ssl_prefer_server_ciphers directive now controls both.
The SSL_CTX_add0_chain_cert() function as introduced in OpenSSL 1.0.2 now
used instead of SSL_CTX_add_extra_chain_cert().
SSL_CTX_add_extra_chain_cert() adds extra certs for all certificates
in the context, while SSL_CTX_add0_chain_cert() only to a particular
certificate. There is no difference unless multiple certificates are used,
though it is important when using multiple certificates.
Additionally, SSL_CTX_select_current_cert() is now called before using
a chain to make sure correct chain will be returned.
A pointer to a previously configured certificate now stored in a certificate.
This makes it possible to iterate though all certificates configured in
the SSL context. This is now used to configure OCSP stapling for all
certificates, and in ngx_ssl_session_id_context().
As SSL_CTX_use_certificate() frees previously loaded certificate of the same
type, and we have no way to find out if it's the case, X509_free() calls
are now posponed till ngx_ssl_cleanup_ctx().
Note that in OpenSSL 1.0.2+ this can be done without storing things in exdata
using the SSL_CTX_set_current_cert() and SSL_CTX_get0_certificate() functions.
These are not yet available in all supported versions though, so it's easier
to continue to use exdata for now.
This makes it possible to properly return OCSP staple with multiple
certificates configured.
Note that it only works properly in OpenSSL 1.0.1d+, 1.0.0k, 0.9.8y+.
In older versions SSL_get_certificate() fails to return correct certificate
when the certificate status callback is called.
When it's known that the kernel supports EPOLLRDHUP, there is no need in
additional recv() call to get EOF or error when the flag is absent in the
event generated by the kernel. A special runtime test is done at startup
to detect if EPOLLRDHUP is actually supported by the kernel because
epoll_ctl() silently ignores unknown flags.
With this knowledge it's now possible to drop the "ready" flag for partial
read. Previously, the "ready" flag was kept until the recv() returned EOF
or error. In particular, this change allows the lingering close heuristics
(which relies on the "ready" flag state) to actually work on Linux, and not
wait for more data in most cases.
The "available" flag is now used in the read event with the semantics similar
to the corresponding counter in kqueue.
This parameter lets binding the proxy connection to a non-local address.
Upstream will see the connection as coming from that address.
When used with $remote_addr, upstream will accept the connection from real
client address.
Example:
proxy_bind $remote_addr transparent;
SSLeay_version() and SSLeay() are no longer available if OPENSSL_API_COMPAT
is set to 0x10100000L. Switched to using OpenSSL_version() instead.
Additionally, we now compare version strings instead of version numbers,
and this correctly works for LibreSSL as well.
OPENSSL_config() deprecated in OpenSSL 1.1.0. Additionally,
SSL_library_init(), SSL_load_error_strings() and OpenSSL_add_all_algorithms()
are no longer available if OPENSSL_API_COMPAT is set to 0x10100000L.
The OPENSSL_init_ssl() function is now used instead with appropriate
arguments to trigger the same behaviour. The configure test changed to
use SSL_CTX_set_options().
Deinitialization now happens automatically in OPENSSL_cleanup() called
via atexit(3), so we no longer call EVP_cleanup() and ENGINE_cleanup()
directly.
LibreSSL defines OPENSSL_VERSION_NUMBER to 0x20000000L, but uses an old
API derived from OpenSSL at the time LibreSSL forked. As a result, every
version check we use to test for new API elements in newer OpenSSL versions
requires an explicit check for LibreSSL.
To reduce clutter, redefine OPENSSL_VERSION_NUMBER to 0x1000107fL if
LibreSSL is used. The same is done by FreeBSD port of LibreSSL.
Fixes various aspects of --test-build-devpoll, --test-build-eventport, and
--test-build-epoll.
In particular, if --test-build-devpoll was used on Linux, then "devpoll"
event method would be preferred over "epoll". Also, wrong definitions of
event macros were chosen.
The "aio_write" directive is introduced, which enables use of aio
for writing. Currently it is meaningful only with "aio threads".
Note that aio operations can be done by both event pipe and output
chain, so proper mapping between r->aio and p->aio is provided when
calling ngx_event_pipe() and in output filter.
In collaboration with Valentin Bartenev.
If a write event happens after sendfile() but before we've got the
sendfile results in the main thread, this write event will be ignored.
And if no more events will happen, the connection will hang.
Removing the events works in the simple cases, but not always, as
in some cases events are added back by an unrelated code. E.g.,
the upstream module adds write event in the ngx_http_upstream_init()
to track client aborts.
Fix is to use wev->complete instead. It is now set to 0 before
a sendfile() task is posted, and it is set to 1 once a write event
happens. If on completion of the sendfile() task wev->complete is 1,
we know that an event happened while we were executing sendfile(), and
the socket is still ready for writing even if sendfile() did not sent
all the data or returned EAGAIN.
This fixes "called a function you should not call" and
"shutdown while in init" errors as observed with OpenSSL 1.0.2f
due to changes in how OpenSSL handles SSL_shutdown() during
SSL handshakes.
As setitimer() isn't available on Windows, time wasn't updated at all
if timer_resolution was used with the select event method. Fix is
to ignore timer_resolution in such cases.
This context is needed for shared sessions cache to work in configurations
with multiple virtual servers sharing the same port. Unfortunately, OpenSSL
does not provide an API to access the session context, thus storing it
separately.
In collaboration with Vladimir Homutov.
If no space left in buffer after adding formatting symbols, error message
could be left without terminating null. The fix is to output message using
actual length.
RAND_pseudo_bytes() is deprecated in the OpenSSL master branch, so the only
use was changed to RAND_bytes(). Access to internal structures is no longer
possible, so now we don't try to set SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS even
if it's defined.
OCSP responses may contain no nextUpdate. As per RFC 6960, this means
that nextUpdate checks should be bypassed. Handle this gracefully by
using NGX_MAX_TIME_T_VALUE as "valid" in such a case.
The problem was introduced by 6893a1007a7c (1.9.2).
Reported by Matthew Baldwin.
Broken by 6893a1007a7c (1.9.2) during introduction of strict OCSP response
validity checks. As stapling file is expected to be returned unconditionally,
fix is to set its validity to the maximum supported time.
Reported by Faidon Liambotis.
When configured, an individual listen socket on a given address is
created for each worker process. This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance. As of now it works on Linux and DragonFly BSD.
Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/. With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.
Based on previous work by Sepherosa Ziehau and Yingqi Lu.
This may happen if eventfd() returns ENOSYS, notably seen on CentOS 5.4.
Such a failure will now just disable the notification mechanism and let
the callers cope with it, instead of failing to start worker processes.
If thread pools are not configured, this can safely be ignored.
Missing call to X509_STORE_CTX_free when X509_STORE_CTX_init fails.
Missing call to OCSP_CERTID_free when OCSP_request_add0_id fails.
Possible leaks in vary particular scenariis of memory shortage.
The SSL_MODE_NO_AUTO_CHAIN mode prevents OpenSSL from automatically
building a certificate chain on the fly if there is no certificate chain
explicitly provided. Before this change, certificates provided via the
ssl_client_certificate and ssl_trusted_certificate directives were
used by OpenSSL to automatically build certificate chains, resulting
in unexpected (and in some cases unneeded) chains being sent to clients.
LibreSSL 2.1.1+ started to set SSL_OP_NO_SSLv3 option by default on
new contexts. This makes sure to clear it to make it possible to use SSLv3
with LibreSSL if enabled in nginx config.
Prodded by Kuramoto Eiji.
This reduces layering violation and simplifies the logic of AIO preread, since
it's now triggered by the send chain function itself without falling back to
the copy filter. The context of AIO operation is now stored per file buffer,
which makes it possible to properly handle cases when multiple buffers come
from different locations, each with its own configuration.