Binary upgrades are not supported without master process, but it is,
however, possible, that nginx running with master process is asked
to upgrade binary, and the configuration file as available on disk
at this time includes "master_process off;".
If this happens, listening sockets inherited from the previous binary
will have ls[i].previous set. But the old cycle on initial process
startup, including startup after binary upgrade, is destroyed by
ngx_init_cycle() once configuration parsing is complete. As a result,
an attempt to dereference ls[i].previous in ngx_event_process_init()
accesses already freed memory.
Fix is to avoid looking into ls[i].previous if the old cycle is already
freed.
With this change it is also no longer needed to clear ls[i].previous in
worker processes, so the relevant code was removed.
Cloning of listening sockets for each worker process does not make sense
when working without master process, and causes some of the connections
not to be accepted if worker_processes is set to more than one and there
are listening sockets configured with the reuseport flag. Fix is to
disable cloning when master process is disabled.
Due to the glibc bug[1], getaddrinfo("localhost") with AI_ADDRCONFIG
on a typical host with glibc and without IPv6 returns two 127.0.0.1
addresses, and therefore "listen localhost:80;" used to result in
"duplicate ... address and port pair" after 4f9b72a229c1.
Fix is to explicitly filter out duplicate addresses returned during
resolution of a name.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=14969
Previously, if an event was posted by a read event handler, called by
ngx_close_idle_connections(), that event was not processed until the next
event loop iteration, which could happen after a timeout.
As the SSI parser always uses the context from the main request for storing
variables and blocks, that context should always exist for subrequests using
SSI, even though the main request does not necessarily have SSI enabled.
However, `ngx_http_get_module_ctx(r->main, ...)` is getting NULL in such cases,
resulting in the worker crashing SIGSEGV when accessing its attributes.
This patch links the first initialized context to the main request, and
upgrades it only when main context is initialized.
The check is not expected to fail unless there is a bug in the calling
code. But given the check is here, it should log an alert if it fails
instead of silently closing the connection.
Maximum size for reading the PROXY protocol header is increased to 4096 to
accommodate a bigger number of TLVs, which are supported since cca4c8a715de.
Maximum size for writing the PROXY protocol header is not changed since only
version 1 is currently supported.
Most atoms should not appear more than once in a container. Previously,
this was not enforced by the module, which could result in worker process
crash, memory corruption and disclosure.
Now it properly detects invalid shared zone configuration with omitted size.
Previously it used to read outside of the buffer boundary.
Found with AddressSanitizer.
OpenSSL with TLSv1.3 updates the session creation time on session
resumption and keeps the session timeout unmodified, making it possible
to maintain the session forever, bypassing client certificate expiration
and revocation. To make sure session timeouts are actually used, we
now update the session creation time and reduce the session timeout
accordingly.
BoringSSL with TLSv1.3 ignores configured session timeouts and uses a
hardcoded timeout instead, 7 days. So we update session timeout to
the configured value as soon as a session is created.
Instead of syncing keys with shared memory on each ticket operation,
the code now does this only when the worker is going to change expiration
of the current key, or going to switch to a new key: that is, usually
at most once per second.
To do so without races, the code maintains 3 keys: current, previous,
and next. If a worker will switch to the next key earlier, other workers
will still be able to decrypt new tickets, since they will be encrypted
with the next key.
As long as ssl_session_cache in shared memory is configured, session ticket
keys are now automatically generated in shared memory, and rotated
periodically. This can be beneficial from forward secrecy point of view,
and also avoids increased CPU usage after configuration reloads.
This also helps BoringSSL to properly resume sessions in configurations
with multiple worker processes and no ssl_session_ticket_key directives,
as BoringSSL tries to automatically rotate session ticket keys and does
this independently in different worker processes, thus breaking session
resumption between worker processes.
Given the present typical SSL session sizes, on 32-bit platforms it is
now beneficial to store all data in a single allocation, since rbtree
node + session id + ASN1 representation of a session takes 256 bytes of
shared memory (36 + 32 + 150 = about 218 bytes plus SNI server name).
Storing all data in a single allocation is beneficial for SNI names up to
about 40 characters long and makes it possible to store about 4000 sessions
in one megabyte (instead of about 3000 sessions now). This also slightly
simplifies the code.
Session ids are not expected to be longer than 32 bytes, but this is
theoretically possible with TLSv1.3, where session ids are essentially
arbitrary and sent as session tickets. Since on 64-bit platforms we
use fixed 32-byte buffer for session ids, added an explicit length check
to make sure the buffer is large enough.
Session cache allocations might fail as long as the new session is different
in size from the one least recently used (and freed when the first allocation
fails). In particular, it might not be possible to allocate space for
sessions with client certificates, since they are noticeably bigger than
normal sessions.
To ensure such allocation failures won't clutter logs, logging level changed
to "warn", and logging is now limited to at most one warning per second.
OpenSSL tries to save TLSv1.3 sessions into session cache even when using
tickets for stateless session resumption, "because some applications just
want to know about the creation of a session". To avoid trashing session
cache with useless data, we do not save such sessions now.
The variables have prefix $proxy_protocol_tlv_ and are accessible by name
and by type. Examples are: $proxy_protocol_tlv_0x01, $proxy_protocol_tlv_alpn.
Previously, all received user input was logged. If a multi-line text was
received from client and logged, it could reduce log readability and also make
it harder to parse nginx log by scripts. The change brings to PROXY protocol
the same behavior that exists for HTTP request line in
ngx_http_log_error_handler().
SSL_sendfile() expects integer file descriptor as an argument, but nginx
uses OS file handles (HANDLE) to work with files on Windows, and passing
HANDLE instead of an integer correctly results in build failure. Since
SSL_sendfile() is not expected to work on Windows anyway, the code is now
disabled on Windows with appropriate compile-time checks.
Multiple C4306 warnings (conversion from 'type1' to 'type2' of greater size)
appear during 64-bit compilation with MSVC 2010 (and older) due to extensively
used constructs like "(void *) -1", so they were disabled.
In newer MSVC versions C4306 warnings were replaced with C4312 ones, and
these are not generated for such trivial type casts.
In 2014ed60f17f, "#if SSL_CTRL_SET_ECDH_AUTO" test was incorrectly used
instead of "#ifdef SSL_CTRL_SET_ECDH_AUTO". There is no practical
difference, since SSL_CTRL_SET_ECDH_AUTO evaluates to a non-zero numeric
value when defined, but anyway it's better to correctly test if the value
is defined.
The SSL_R_BAD_RECORD_TYPE ("bad record type") errors are reported by
OpenSSL 1.1.1 or newer when using TLSv1.3 if the client sends a record
with unknown or unexpected type. These errors are now logged at the
"info" level.
When reading exactly rev->available bytes, rev->available might become 0
after FIONREAD usage introduction in efd71d49bde0. On the next call of
ngx_readv_chain() on systems with EPOLLRDHUP this resulted in return without
any actions, that is, with rev->ready set, and this in turn resulted in no
timers set in event pipe, leading to socket leaks.
Fix is to reset rev->ready in ngx_readv_chain() when returning due to
rev->available being 0 with EPOLLRDHUP, much like it is already done in
ngx_unix_recv(). This ensures that if rev->available will become 0, on
systems with EPOLLRDHUP support appropriate EPOLLRDHUP-specific handling
will happen on the next ngx_readv_chain() call.
While here, also synced ngx_readv_chain() to match ngx_unix_recv() and
reset rev->ready when returning due to rev->available being 0 with kqueue.
This is mostly cosmetic change, as rev->ready is anyway reset when
rev->available is set to 0.
Some servers might emit Content-Range header on 200 responses, and this
does not seem to contradict RFC 9110: as per RFC 9110, the Content-Range
header has no meaning for status codes other than 206 and 416. Previously
this resulted in duplicate Content-Range headers in nginx responses handled
by the range filter. Fix is to clear pre-existing headers.
Starting with OpenSSL 1.1.1, various additional errors can be reported
by OpenSSL in case of client-related issues, most notably during TLSv1.3
handshakes. In particular, SSL_R_BAD_KEY_SHARE ("bad key share"),
SSL_R_BAD_EXTENSION ("bad extension"), SSL_R_BAD_CIPHER ("bad cipher"),
SSL_R_BAD_ECPOINT ("bad ecpoint"). These are now logged at the "info"
level.
To ensure optimal use of memory, SSL contexts for proxying are now
inherited from previous levels as long as relevant proxy_ssl_* directives
are not redefined.
Further, when no proxy_ssl_* directives are redefined in a server block,
we now preserve plcf->upstream.ssl in the "http" section configuration
to inherit it to all servers.
Similar changes made in uwsgi, grpc, and stream proxy.