Commit Graph

8417 Commits

Author SHA1 Message Date
Roman Arutyunyan
6bf13e9d57 QUIC: do not shrink congestion window after losing an MTU probe.
As per RFC 9000, Section 14.4:

    Loss of a QUIC packet that is carried in a PMTU probe is therefore
    not a reliable indication of congestion and SHOULD NOT trigger a
    congestion control reaction.
2025-04-15 19:01:36 +04:00
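The rule quoted in 6bf13e9d57 can be sketched as follows. This is a minimal illustration, not the code from the commit: the `cc_state_t` and `lost_packet_t` types, the `is_mtu_probe` flag, and the simple window halving are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical, simplified congestion state; not nginx's actual structures. */
typedef struct {
    size_t  window;      /* congestion window, bytes */
    size_t  ssthresh;    /* slow start threshold, bytes */
    size_t  in_flight;   /* bytes currently in flight */
    size_t  min_window;  /* lower bound for the window, bytes */
} cc_state_t;

typedef struct {
    size_t  size;          /* bytes accounted in in_flight */
    int     is_mtu_probe;  /* nonzero if the packet carried a PMTU probe */
} lost_packet_t;

/* Account for a lost packet; shrink the window unless it was an MTU probe. */
static void
cc_on_packet_lost(cc_state_t *cc, const lost_packet_t *pkt)
{
    /* the lost bytes are no longer in flight either way */
    cc->in_flight -= (pkt->size < cc->in_flight) ? pkt->size : cc->in_flight;

    if (pkt->is_mtu_probe) {
        /* RFC 9000, Section 14.4: not a reliable indication of congestion */
        return;
    }

    /* assumed NewReno-style reaction: halve the window, respect the minimum */
    cc->window /= 2;
    if (cc->window < cc->min_window) {
        cc->window = cc->min_window;
    }
    cc->ssthresh = cc->window;
}
```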
Roman Arutyunyan
cd5e4fa144 QUIC: do not increase underutilized congestion window.
As per RFC 9002, Section 7.8, congestion window should not be increased
when it's underutilized.
2025-04-15 19:01:36 +04:00
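A rough sketch of the behaviour described in cd5e4fa144, assuming the window counts as underutilized when the bytes in flight before the acknowledgment were below it. The `cwnd_t` type and the `in_flight_before_ack` parameter are hypothetical; how nginx actually detects underutilization is not shown in this log.

```c
#include <stddef.h>

/* Hypothetical, simplified congestion state; not nginx's actual structures. */
typedef struct {
    size_t  window;             /* congestion window, bytes */
    size_t  ssthresh;           /* slow start threshold, bytes */
    size_t  max_datagram_size;  /* bytes */
} cwnd_t;

/* Grow the window on an acknowledgment, unless it was underutilized. */
static void
cc_on_ack(cwnd_t *cc, size_t acked, size_t in_flight_before_ack)
{
    if (in_flight_before_ack < cc->window) {
        /* underutilized: RFC 9002, Section 7.8 says not to increase */
        return;
    }

    if (cc->window < cc->ssthresh) {
        /* slow start: grow by the number of acknowledged bytes */
        cc->window += acked;

    } else {
        /* congestion avoidance: about one datagram per window of acks */
        cc->window += cc->max_datagram_size * acked / cc->window;
    }
}
```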
Roman Arutyunyan
04c65ccd9a QUIC: all-levels commit and revert functions.
Previously, these functions operated on a per-level basis.  This, however,
resulted in excessive logging of in_flight and would also lead to extra
work detecting an underutilized congestion window in the follow-up patches.
2025-04-15 19:01:36 +04:00
Roman Arutyunyan
1e883a40db QUIC: ngx_msec_t overflow protection.
On some systems the value of ngx_current_msec is derived from monotonic
clock, for which the following is defined by POSIX:

   For this clock, the value returned by clock_gettime() represents
   the amount of time (in seconds and nanoseconds) since an unspecified
   point in the past.

As a result, overflow protection is needed when comparing two ngx_msec_t values.
The change adds such protection to the ngx_quic_detect_lost() function.
2025-04-15 19:01:36 +04:00
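One common way to compare wrapping millisecond counters, as 1e883a40db calls for, is to look at the sign of their difference. The sketch below uses fixed 32-bit stand-in types (`msec_t`, `msec_int_t`) and a hypothetical `msec_before()` helper; it illustrates the technique and is not the code added to ngx_quic_detect_lost().

```c
#include <stdint.h>

typedef uint32_t  msec_t;      /* stand-in for ngx_msec_t on a 32-bit system */
typedef int32_t   msec_int_t;  /* signed counterpart of the same width */

/*
 * Returns nonzero if a is earlier than b.  The result is meaningful only
 * while the two values are less than half the counter range apart.
 */
static int
msec_before(msec_t a, msec_t b)
{
    /* unsigned subtraction wraps; the sign of the result gives the order */
    return (msec_int_t) (msec_t) (a - b) < 0;
}
```

A plain `a < b` would report 0xFFFFFFF0 as later than 0x10, although the second value was taken just after the counter wrapped.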
Roman Arutyunyan
38236bf74f QUIC: prevent spurious congestion control recovery mode.
Since the recovery_start field was initialized with ngx_current_msec, all
congestion events that happened within the same millisecond or cycle
iteration were treated as in recovery mode.

Also, when handling persistent congestion, initializing recovery_start
with ngx_current_msec resulted in treating all sent packets as in recovery
mode, which violates RFC 9002, see example in Appendix B.8.

While here, also fixed recovery_start wrap protection.  Previously it used
2 * max_idle_timeout time frame for all sent frames, which is not a
reliable protection since max_idle_timeout is unrelated to congestion
control.  Now recovery_start <= now condition is enforced.  Note that
recovery_start wrap is highly unlikely and can only occur on a
32-bit system if there are no congestion events for 24 days.
2025-04-15 19:01:36 +04:00
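The 24-day figure in 38236bf74f appears to correspond to half the range of a 32-bit millisecond counter, which is the limit of a signed-difference comparison. That reading is an assumption, not something the commit states. The hypothetical helper below just does the arithmetic.

```c
#include <stdint.h>

#define MSEC_PER_DAY  86400000.0

/*
 * Days covered by half the range of a millisecond counter that is
 * `bits` wide (1 <= bits <= 64).
 */
static double
msec_half_range_days(unsigned bits)
{
    return (double) ((uint64_t) 1 << (bits - 1)) / MSEC_PER_DAY;
}
```

For `bits = 32` this is 2147483648 ms, or a little under 25 days.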
Roman Arutyunyan
53e7e9eb54 QUIC: use path MTU in congestion window computations.
As per RFC 9002, Section B.2, max_datagram_size used in congestion window
computations should be based on path MTU.
2025-04-15 19:01:36 +04:00
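For reference, the window constants that RFC 9002 derives from max_datagram_size can be written as below. The function names are hypothetical, and the sketch only restates the RFC formulas; it does not show how 53e7e9eb54 obtains the path MTU.

```c
#include <stddef.h>

/*
 * Initial window per RFC 9002, Section 7.2:
 * min(10 * max_datagram_size, max(2 * max_datagram_size, 14720)).
 */
static size_t
cc_initial_window(size_t max_datagram_size)
{
    size_t  lo, hi;

    lo = 2 * max_datagram_size;
    if (lo < 14720) {
        lo = 14720;
    }

    hi = 10 * max_datagram_size;

    return (hi < lo) ? hi : lo;
}

/* Minimum window per RFC 9002, Section 7.2: 2 * max_datagram_size. */
static size_t
cc_minimum_window(size_t max_datagram_size)
{
    return 2 * max_datagram_size;
}
```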
Roman Arutyunyan
3a97111adf HTTP/3: graceful shutdown on keepalive timeout expiration.
Previously, the expiration caused QUIC connection finalization even if
there were application-terminated streams that had not finished sending
data.  Such finalization terminated these streams.

An easy way to trigger this is to request a large file from HTTP/3 over
a small MTU.  In this case keepalive timeout expiration may abruptly
terminate the request stream.
2025-04-15 19:01:36 +04:00
Roman Arutyunyan
2b8b70068a QUIC: graph-friendly congestion control logging.
Improved logging for simpler data extraction for plotting congestion
window graphs.  In particular, added current milliseconds number from
ngx_current_msec.

While here, simplified logging text and removed irrelevant data.
2025-04-15 19:01:36 +04:00
Sergey Kandaurov
b6e7eb0f57 SSL: external groups support in $ssl_curve and $ssl_curves.
Starting with OpenSSL 3.0, groups may be added externally with pluggable
KEM providers.  SSL_get_negotiated_group() looks groups up in a static
table of known groups, so it cannot list such groups by name and leaves
them in hex.  Adding X25519MLKEM768 to the default group list in
OpenSSL 3.5 made this problem more visible.  SSL_get0_group_name() and,
apparently, SSL_group_to_name() can resolve such provider-implemented
groups, which is also "generally preferred" over SSL_get_negotiated_group()
as documented in OpenSSL git commit 93d4f6133f.

This change lists external groups by name using SSL_group_to_name(), which
is available since OpenSSL 3.0.  To preserve "prime256v1" naming for the group
0x0017, and to avoid breaking BoringSSL and older OpenSSL versions support,
it is used only as a supplement, for a group that appears to be unknown.

See https://github.com/openssl/openssl/issues/27137 for related discussion.
2025-04-10 18:51:10 +04:00
Sergey Kandaurov
6c3a9d5612 Upstream: fixed passwords support for dynamic certificates.
Passwords were not preserved in optimized SSL contexts; the bug appeared
in d791b4aab (1.23.1).  It can be seen in the following configuration:

    server {
        proxy_ssl_password_file password;
        proxy_ssl_certificate $ssl_server_name.crt;
        proxy_ssl_certificate_key $ssl_server_name.key;

        location /original/ {
            proxy_pass https://u1/;
        }

        location /optimized/ {
            proxy_pass https://u2/;
        }
    }

The fix is to always preserve passwords, by copying to the configuration
pool, if dynamic certificates are used.  This is done as part of merging
"ssl_passwords" configuration.

To minimize the number of copies, a preserved version is then used for
inheritance.  A notable exception is inheritance of preserved empty
passwords to the context with statically configured certificates:

    server {
        proxy_ssl_certificate $ssl_server_name.crt;
        proxy_ssl_certificate_key $ssl_server_name.key;

        location / {
            proxy_pass ...;

            proxy_ssl_certificate example.com.crt;
            proxy_ssl_certificate_key example.com.key;
        }
    }

In this case, an unmodified version (NULL) of empty passwords is set,
to allow reading them from the password prompt on nginx startup.

As an additional optimization, a preserved instance of inherited
configured passwords is set to the previous level, to inherit it
to other contexts:

    server {
        proxy_ssl_password_file password;

        location /1/ {
            proxy_pass https://u1/;
            proxy_ssl_certificate $ssl_server_name.crt;
            proxy_ssl_certificate_key $ssl_server_name.key;
        }

        location /2/ {
            proxy_pass https://u2/;
            proxy_ssl_certificate $ssl_server_name.crt;
            proxy_ssl_certificate_key $ssl_server_name.key;
        }
    }
2025-04-10 17:27:45 +04:00
Sergey Kandaurov
a813c63921 Charset filter: improved validation of charset_map with utf-8.
It was possible to write outside of the buffer used to keep UTF-8
decoded values when parsing conversion table configuration.

Since this happened before UTF-8 decoding, the fix is to check in
advance whether a character code is longer than a 3-byte sequence.  Note
that this is already enforced by a later check for ngx_utf8_decode()
decoded values for 0xffff, which corresponds to the maximum value
encoded as a valid 3-byte sequence, so the fix does not affect the
valid values.

Found with AddressSanitizer.
Fixes GitHub issue #529.
2025-04-09 19:37:51 +04:00
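The idea behind a813c63921, checking the length before anything is written, might look like the sketch below. It assumes the utf-8 side of a charset_map entry is written as the hex bytes of the UTF-8 sequence (for example E282AC) and stored in a 3-byte buffer. `UTF8_MAX_LEN`, `hex_digit()` and `charset_parse_utf8_hex()` are hypothetical names, not nginx functions.

```c
#include <stddef.h>

#define UTF8_MAX_LEN  3   /* assumed per-entry buffer size, in bytes */

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode a hex string such as "E282AC" into dst.  Returns the number of
 * bytes written, or 0 if the input is malformed or would not fit.
 */
static size_t
charset_parse_utf8_hex(const char *hex, size_t len,
    unsigned char dst[UTF8_MAX_LEN])
{
    int     hi, lo;
    size_t  i;

    /* the length is checked up front, before any write to dst */
    if (len == 0 || len % 2 != 0 || len / 2 > UTF8_MAX_LEN) {
        return 0;
    }

    for (i = 0; i < len; i += 2) {
        hi = hex_digit(hex[i]);
        lo = hex_digit(hex[i + 1]);

        if (hi < 0 || lo < 0) {
            return 0;
        }

        dst[i / 2] = (unsigned char) (hi * 16 + lo);
    }

    return len / 2;
}
```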
Demi Marie Obenour
6a78618113 HTTP: Use common header code for v2 and v3
This makes the behavior of HTTP/2 and HTTP/3 much more similar.  In
particular, the HTTP/3 :authority pseudoheader is used to set the Host
header, instead of the virtual server.  This is arguably less correct,
but it is consistent with the existing HTTP/2 behavior and unbreaks
users of PHP-FPM and other FastCGI applications.  In the future, NGINX
could have a config option that caused :authority and Host to be treated
separately in both HTTP/2 and HTTP/3.
2025-03-25 15:07:33 -04:00
Demi Marie Obenour
c346375a8b HTTP/3: Do not allow invalid pseudo-header fields
RFC9114 requires invalid pseudo-header fields to be rejected, and this
is consistent with HTTP/2.
2025-03-25 15:07:33 -04:00
Demi Marie Obenour
2fbc33e2de HTTP: reject invalid header names
HTTP header names must be RFC9110 tokens, so only a subset of characters
are permitted.  RFC9113 and RFC9114 require rejecting invalid header
characters in HTTP/2 and HTTP/3 respectively, so reject them in HTTP/1.0
and HTTP/1.1 for consistency.  This also requires removing the ignore
hack for (presumably ancient) versions of IIS.
2025-03-25 15:07:33 -04:00
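The token rule referenced in 2fbc33e2de comes down to a per-character check against the tchar set of RFC 9110, Section 5.6.2. The sketch below is a stand-alone illustration with hypothetical function names; it is not the validation code from the commit.

```c
#include <stddef.h>
#include <string.h>

/* tchar per RFC 9110, Section 5.6.2: DIGIT / ALPHA / a fixed punctuation set */
static int
is_tchar(unsigned char c)
{
    if ((c >= '0' && c <= '9')
        || (c >= 'a' && c <= 'z')
        || (c >= 'A' && c <= 'Z'))
    {
        return 1;
    }

    /* strchr() would match the terminating NUL, so exclude it explicitly */
    return c != '\0' && strchr("!#$%&'*+-.^_`|~", c) != NULL;
}

/* Returns nonzero if name[0..len) is a non-empty token. */
static int
header_name_is_token(const char *name, size_t len)
{
    size_t  i;

    if (len == 0) {
        return 0;
    }

    for (i = 0; i < len; i++) {
        if (!is_tchar((unsigned char) name[i])) {
            return 0;
        }
    }

    return 1;
}
```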
Demi Marie Obenour
bccf2b1f3b HTTP: Reject hop-by-hop headers in HTTP/2 and HTTP/3 requests
RFC9113 and RFC9114 both require requests with connection-specific
headers to be treated as malformed, with the exception of "te: trailers".
Reject requests containing them.
2025-03-25 15:07:33 -04:00
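A sketch of the check described in bccf2b1f3b, using the connection-specific field names listed in RFC 9113, Section 8.2.2. It assumes names arrive already lowercased, as HTTP/2 and HTTP/3 require, and compares the "te" value exactly rather than parsing it as a list. The function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns nonzero if an HTTP/2 or HTTP/3 request containing this field
 * should be treated as malformed.
 */
static int
h2_field_is_connection_specific(const char *name, const char *value)
{
    static const char  *forbidden[] = {
        "connection", "keep-alive", "proxy-connection",
        "transfer-encoding", "upgrade"
    };

    size_t  i;

    for (i = 0; i < sizeof(forbidden) / sizeof(forbidden[0]); i++) {
        if (strcmp(name, forbidden[i]) == 0) {
            return 1;
        }
    }

    /* "te" is allowed only with the value "trailers" */
    if (strcmp(name, "te") == 0) {
        return strcmp(value, "trailers") != 0;
    }

    return 0;
}
```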
Demi Marie Obenour
c68c7adcdb HTTP: Allow rejecting leading and trailing whitespace in HTTP2+ fields
All versions of HTTP forbid field (header and trailer) values from
having leading or trailing horizontal whitespace (0x20 and 0x09).  In
HTTP/1.0 and HTTP/1.1, leading and trailing whitespace must be stripped
from the field value before further processing.  In HTTP/2 and HTTP/3,
leading and trailing whitespace must cause the entire message to be
considered malformed.

Willy Tarreau (lead developer of HAProxy) has indicated that there are
clients that actually do send leading and/or trailing whitespace in
HTTP/2 and/or HTTP/3 cookie headers, which is why HAProxy accepts them.
Therefore, the fix is disabled by default and must be enabled with the
reject_leading_trailing_whitespace directive.  Stripping leading and/or
trailing whitespace would require either allocating a new buffer or
changing the pointers in the existing buffer, and I am not familiar
enough with NGINX to know if subsequent code expects a buffer that was
allocated in a particular way.  If header values were ever passed to
ngx_pfree(), munging them to skip leading whitespace would mean that a
request with leading whitespace would cause ngx_pfree() to be called
with an invalid pointer, which would be a security vulnerability.
Rejecting the request doesn't introduce any new error paths that clients
cannot already trigger, and it doesn't risk violating any invariants
that existing code might assume.  Also, Varnish Cache rejects HTTP/2
requests with leading and/or trailing whitespace in field values, so
there is precedent for doing so.
2025-03-25 15:07:32 -04:00
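The rejecting variant chosen in c68c7adcdb needs only a look at the first and last byte of the value, which is why it adds no allocation or pointer changes. A minimal sketch with hypothetical names:

```c
#include <stddef.h>

/* Horizontal whitespace in a field value: space (0x20) or tab (0x09). */
static int
is_ows(char c)
{
    return c == ' ' || c == '\t';
}

/* Returns nonzero if the value starts or ends with horizontal whitespace. */
static int
field_value_has_edge_ws(const char *value, size_t len)
{
    if (len == 0) {
        return 0;
    }

    return is_ows(value[0]) || is_ows(value[len - 1]);
}
```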
Demi Marie Obenour
3a45410074 HTTP: Use common header validation function for HTTP/2 and HTTP/3
The header validation required by HTTP/2 and HTTP/3 is identical, so use
a common function for both.  This will make it easier to add additional
validation in the future.  Move the function to ngx_http_parse.c so that
it can share code with the HTTP/1.x parser.
2025-03-25 15:03:34 -04:00
Demi Marie Obenour
6bd9e8ce72 HTTP: Do not log headers with unsanitized values
These could contain control characters (including newlines!) and mess up
the logs.
2025-03-25 14:59:40 -04:00
Demi Marie Obenour
1ba504d634 HTTP: Consider tab as whitespace in field value
HTTP considers 0x09 (horizontal tab) to be valid horizontal whitespace
in a field value, and there are badly-behaved clients in the wild that
rely on this behavior and cannot be fixed.  This also ensures that NGINX
is not itself such a badly-behaved client and that, for HTTP/1.x
requests, the values of the $http_* variables agree with what upstream
servers will see.

Fixes: #187
2025-03-25 13:32:24 -04:00
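For HTTP/1.x, treating tab as whitespace (1ba504d634) means stripping both 0x20 and 0x09 from the ends of a field value. The sketch below trims by adjusting a pointer and a length without touching the buffer. `field_trim_ows()` is a hypothetical helper, not the parser code from the commit.

```c
#include <stddef.h>

/* Horizontal whitespace in a field value: space (0x20) or tab (0x09). */
static int
is_ows(char c)
{
    return c == ' ' || c == '\t';
}

/*
 * Skip leading and trailing whitespace.  Moves *start forward and returns
 * the trimmed length; the underlying buffer is not modified.
 */
static size_t
field_trim_ows(const char **start, size_t len)
{
    const char  *p = *start;

    while (len > 0 && is_ows(p[0])) {
        p++;
        len--;
    }

    while (len > 0 && is_ows(p[len - 1])) {
        len--;
    }

    *start = p;

    return len;
}
```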
Sergey Kandaurov
d313056537 Slice filter: improved memory allocation error handling.
As uncovered by recent addition in slice.t, a partially initialized
context, coupled with HTTP 206 response from stub backend, might be
accessed in the next slice subrequest.

Found by bad memory allocator simulation.
2025-03-10 19:32:07 +03:00
Sergey Kandaurov
d16251969b SSL: removed stale comments.
It appears to be a relic from prototype locking removed in b0b7b5a35.
2025-02-26 17:40:03 +04:00
Sergey Kandaurov
311c390377 SSL: improved logging of saving sessions from upstream servers.
This makes it easier to understand why sessions may not be saved
in shared memory due to size.
2025-02-26 17:40:03 +04:00
Sergey Kandaurov
9124592202 SSL: raised limit for sessions stored in shared memory.
Upstream SSL sessions may be of a noticeably larger size with tickets
in TLSv1.2 and older versions, or with "stateless" tickets in TLSv1.3,
if a client certificate is saved into the session.  Further, certain
stateless session resumption implementations may store additional data.

One such implementation is the JDK, known to also include server certificates
in session ticket data, which roughly doubles a decoded session size to
slightly beyond the previous limit.  While it's believed to be an issue on the
JDK side, this change allows saving such sessions.

Another, innocent case is using RSA certificates with an 8192-bit key size.
2025-02-26 17:40:03 +04:00
Sergey Kandaurov
3d7304b527 SSL: using static storage for NGX_SSL_MAX_SESSION_SIZE buffers.
All such transient buffers are converted to the single storage in BSS.

In preparation to raise the limit.
2025-02-26 17:40:03 +04:00
Sergey Kandaurov
b11ae4cfc9 SSL: style. 2025-02-26 17:40:03 +04:00
Sergey Kandaurov
d25139db01 Improved ngx_http_subrequest() error handling.
Previously, a request might be left in an inconsistent state in case of error,
which manifested in "http request count is zero" alerts when used by SSI
filter.

The fix is to reshuffle initialization order to postpone committing state
changes until after any potentially failing parts.

Found by bad memory allocator simulation.
2025-02-21 00:04:12 +04:00
Orgad Shaneh
f51e2de6fe Add gitignore file.
2025-02-20 14:42:53 +03:00
Thierry Bastian
3327353ec0 Configure: MSVC compatibility with PCRE2 10.45.
2025-02-18 19:07:11 +04:00
Piotr Sikora
9a4090f02a Core: fix build without libcrypt.
libcrypt is no longer part of glibc, so it might not be available.

Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
2025-02-18 16:18:10 +03:00
Sergey Kandaurov
f274b3f72f Version bump. 2025-02-18 15:49:18 +04:00
Sergey Kandaurov
ecb809305e nginx-1.27.4-RELEASE
2025-02-05 20:13:42 +04:00
Sergey Kandaurov
46b9f5d389 SNI: added restriction for TLSv1.3 cross-SNI session resumption.
In OpenSSL, session resumption always happens in the default SSL context,
prior to invoking the SNI callback.  Further, unlike in TLSv1.2 and older
protocols, SSL_get_servername() returns values received in the resumption
handshake, which may be different from the value in the initial handshake.
Notably, this makes the restriction added in b720f650b insufficient for
sessions resumed with a different SNI server name.

Considering the example from b720f650b, previously, a client was able to
request example.org by presenting a certificate for example.org, then to
resume and request example.com.

The fix is to reject handshakes resumed with a different server name, if
verification of client certificates is enabled in a corresponding server
configuration.
2025-02-05 20:11:42 +04:00
Roman Arutyunyan
22a2a225ba Added "keepalive_min_timeout" directive.
The directive sets a timeout during which a keepalive connection will
not be closed by nginx for connection reuse or graceful shutdown.

The change allows clients that send multiple requests over the same
connection without delay or with a small delay between them, to avoid
receiving a TCP RST in response to one of them.  This excludes network
issues and non-graceful shutdown.  As a side-effect, it also addresses
the TCP reset problem described in RFC 9112, Section 9.6, when the last
sent HTTP response could be damaged by a followup TCP RST.  This is important
for non-idempotent requests, which cannot be retried by the client.

It is not recommended to set keepalive_min_timeout to large values as
this can introduce an additional delay during graceful shutdown and may
restrict nginx from effective connection reuse.
2025-02-05 13:08:01 +03:00
Sergey Kandaurov
04914cfbcb Misc: moved documentation in generated ZIP archive.
The recently added GitHub files now reside in the docs directory.
2025-01-30 18:21:43 +04:00
Sergey Kandaurov
e715202220 Configure: fixed --with-libatomic=DIR with recent libatomic_ops.
The build location of the resulting libatomic_ops.a was changed in v7.4.0
after converting libatomic_ops to use libtool.  The fix is to use the library
from the install path, which allows building with both old and new versions.

Initially reported here:
https://mailman.nginx.org/pipermail/nginx/2018-April/056054.html
2025-01-30 17:16:10 +04:00
Aleksei Bavshin
64d0795ac4 QUIC: added missing casts in iov_base assignments.
This is consistent with the rest of the code and fixes build on systems
with non-standard definition of struct iovec (Solaris, Illumos).
2025-01-28 08:00:42 -08:00
Pavel Pautov
5ab4f32e9d Upstream: fixed --with-compat build without SSL, broken by 454ad0e.
2025-01-23 10:50:13 -08:00
Sergey Kandaurov
5d5d9adccf SSL: avoid using mismatched certificate/key cached pairs.
This can happen with certificates and certificate keys specified
with variables due to partial cache update in various scenarios:
- cache expiration with only one element of pair evicted
- on-disk update with non-cacheable encrypted keys
- non-atomic on-disk update

The fix is to retry with fresh data on X509_R_KEY_VALUES_MISMATCH.
2025-01-17 04:37:46 +04:00
Sergey Kandaurov
454ad0ef33 Upstream: caching certificates and certificate keys with variables.
Caching is enabled with proxy_ssl_certificate_cache and friends.

Co-authored-by: Aleksei Bavshin <a.bavshin@nginx.com>
2025-01-17 04:37:46 +04:00
Sergey Kandaurov
4b96ad14f3 SSL: cache revalidation of file based dynamic certificates.
Revalidation is based on file modification time and unique file index,
and happens after the cache object validity time is expired.
2025-01-17 04:37:46 +04:00
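The revalidation test from 4b96ad14f3 can be pictured as comparing a remembered (modification time, file index) pair with a fresh stat() result. The sketch assumes a POSIX system where the unique file index is the inode number. `cache_stamp_t` and `cache_entry_is_fresh()` are hypothetical names.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* What a cache entry remembers about the file it was loaded from. */
typedef struct {
    time_t  mtime;  /* modification time at load */
    ino_t   uniq;   /* unique file index (inode number) at load */
} cache_stamp_t;

/* Returns nonzero if the file described by st still matches the entry. */
static int
cache_entry_is_fresh(const cache_stamp_t *cached, const struct stat *st)
{
    return cached->mtime == st->st_mtime && cached->uniq == st->st_ino;
}
```

Checking the index as well as the time catches a file replaced by rename, where the new file can carry the same modification time.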
Sergey Kandaurov
0e756d67aa SSL: caching certificates and certificate keys with variables.
A new directive "ssl_certificate_cache max=N [valid=time] [inactive=time]"
enables caching of SSL certificate chain and secret key objects specified
by "ssl_certificate" and "ssl_certificate_key" directives with variables.

Co-authored-by: Aleksei Bavshin <a.bavshin@nginx.com>
2025-01-17 04:37:46 +04:00
Sergey Kandaurov
7677d5646a SSL: encrypted certificate keys are exempt from object cache.
SSL object cache, as previously introduced in 1.27.2, did not take
into account encrypted certificate keys that might be unexpectedly
fetched from the cache regardless of the matching passphrase.  To
avoid this, caching of encrypted certificate keys is now disabled
based on the passphrase callback invocation.

A notable exception is encrypted certificate keys configured without
ssl_password_file.  They are loaded once resulting in the passphrase
prompt on startup and reused in other contexts as applicable.
2025-01-17 04:37:46 +04:00
Sergey Kandaurov
8311e14ae6 SSL: object cache inheritance from the old configuration cycle.
Memory based objects are always inherited, engine based objects are
never inherited to adhere to the volatile nature of engines, file based
objects are inherited subject to modification time and file index.

The previous behaviour to bypass cache from the old configuration cycle
is preserved with a new directive "ssl_object_cache_inheritable off;".
2025-01-17 04:37:46 +04:00
Daniel Vasquez Lopez
47f862ffad Slice filter: log the expected range in case of range error.
2025-01-16 21:09:59 +04:00
Sergey Kandaurov
57d54fd922 Gzip: compatibility with recent zlib-ng 2.2.x versions.
It now uses 5/4 times more memory for the pending buffer.

Further, a single allocation is now used, which takes additional 56 bytes
for deflate_allocs in 64-bit mode aligned to 16, to store sub-allocation
pointers, and the total allocation size is now padded up to 128 bytes, which
takes theoretically 200 additional bytes in total.  This fits though into
"4 * (64 + sizeof(void*))" additional space for ZALLOC used in zlib-ng
2.1.x versions.  The comment was updated to reflect this.
2025-01-09 17:19:24 +04:00
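The "fits" claim in 57d54fd922 can be checked with a line of arithmetic: with 8-byte pointers, the quoted expression gives 4 * (64 + 8) = 288 bytes, which covers the theoretical 200 additional bytes. The helper below is hypothetical and only evaluates that expression for a given pointer size.

```c
#include <stddef.h>

/* Extra space reserved by the "4 * (64 + sizeof(void*))" term. */
static size_t
zalloc_extra_space(size_t pointer_size)
{
    return 4 * (64 + pointer_size);
}
```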
Roman Arutyunyan
febe6e728f Year 2025. 2025-01-09 17:08:02 +04:00
Roman Arutyunyan
e3a9b6ad08 QUIC: fixed accessing a released stream.
While trying to close a stream in ngx_quic_close_streams() by calling its
read event handler, the next stream saved prior to that could be destroyed
recursively.  This caused a segfault while trying to access the next stream.

The way the next stream could be destroyed in HTTP/3 is the following.
A request stream read event handler ngx_http_request_handler() could
end up calling ngx_http_v3_send_cancel_stream() to report a cancelled
request stream in the decoder stream.  If sending stream cancellation
decoder instruction fails for any reason, and the decoder stream is the
next in order after the request stream, the issue is triggered.

The fix is to postpone calling read event handlers for all streams being
closed to avoid closing a released stream.
2024-12-27 16:14:14 +04:00
Roman Arutyunyan
a52ba8ba0e QUIC: ignore version negotiation packets.
Previously, such packets were treated as long header packets with unknown
version 0, and a version negotiation packet was sent in response.  This
could be used to set up an infinite traffic reflection loop with another nginx
instance.

Now version negotiation packets are ignored.  As per RFC 9000, Section 6.1:

  An endpoint MUST NOT send a Version Negotiation packet in response to
  receiving a Version Negotiation packet.
2024-12-26 18:58:05 +04:00
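Recognizing the packets that a52ba8ba0e ignores takes two checks on the first five bytes: the long header bit and a zero Version field. The sketch below works on a raw datagram buffer; the function name is hypothetical, and nginx's own packet parsing is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns nonzero if the datagram starts with a Version Negotiation packet:
 * a long header (high bit of the first byte set) with Version 0.
 */
static int
quic_is_version_negotiation(const uint8_t *pkt, size_t len)
{
    if (len < 5) {
        return 0;   /* too short to hold the flags byte and the version */
    }

    if (!(pkt[0] & 0x80)) {
        return 0;   /* short header packets carry no version */
    }

    return pkt[1] == 0 && pkt[2] == 0 && pkt[3] == 0 && pkt[4] == 0;
}
```

A receiver that drops such packets before any reply is generated cannot be drawn into the reflection loop described above.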
Jordan Zebor
c73fb273ac Updated security policy to clarify experimental features.
The original security policy language did not capture the scope
as intended for experimental features and availability.
2024-12-23 20:36:15 +04:00
nandsky
930caed3bf QUIC: fixed client request timeout in 0-RTT scenarios.
Since 0-RTT and 1-RTT data exist in the same packet number space,
ngx_quic_discard_ctx incorrectly discards 1-RTT packets when
0-RTT keys are discarded.

The issue was introduced by 58b92177e7.
2024-12-10 17:17:20 +04:00