The patch adds proper transitions between multiple networking addresses that
can be used by a single quic connection. New networking paths are validated
using PATH_CHALLENGE/PATH_RESPONSE frames.
In FreeBSD 13, eventfd(2) was added, and this breaks build
with --test-build-epoll and without --with-file-aio. Fix is
to move eventfd(2) detection to auto/os/linux, as it is used
only on Linux as a notification mechanism for epoll().
The strerrordesc_np() function, introduced in glibc 2.32, provides an
async-signal-safe way to obtain error messages. This makes it possible
to avoid copying error messages.
Previously, systems without sys_nerr (or _sys_nerr) were handled with an
assumption that errors start at 0 and continuous. This is, however, not
something POSIX requires, and not true on some platforms.
Notably, on Linux, where sys_nerr is no longer available for newly linked
binaries starting with glibc 2.32, there are gaps in error list, which
used to stop us from properly detecting maximum errno. Further, on
GNU/Hurd errors start at 0x40000001.
With this change, maximum errno detection is moved to the runtime code,
now able to ignore gaps, and also detects the first error if needed.
This fixes observed "Unknown error" messages as seen on Linux with
glibc 2.32 and on GNU/Hurd.
The quic kernel bpf helper inspects packet payload for DCID, extracts key
and routes the packet into socket matching the key.
Due to reuseport feature, each worker owns a personal socket, which is
identified by the same key, used to create DCID.
BPF objects are locked in RAM and are subject to RLIMIT_MEMLOCK.
The "ulimit -l" command may be used to setup proper limits, if maps
cannot be created with EPERM or updated with ETOOLONG.
The filter is responsible for creating HTTP/3 response header and body.
The change removes differences to the default branch for
ngx_http_chunked_filter_module and ngx_http_header_filter_module.
When installing or running from a non-root user it is sometimes required to
override default, compiled in error log path. There was no way to do this
without rebuilding the binary (ticket #147).
This patch introduced "-e" command line option which allows one to override
compiled in error log path.
Addon modules, both dynamic and static, can now use shared source files.
Shared sources result in only one make rule even if specified several
times in different modules.
All code dealing with serializing/deserializing
is moved int srv/event/ngx_event_quic_transport.c/h file.
All macros for dealing with data are internal to source file.
The header file exposes frame types and error codes.
The exported functions are currently packet header parsers and writers
and frames parser/writer.
The ngx_quic_header_t structure is updated with 'log' member. This avoids
passing extra argument to parsing functions that need to report errors.
New files:
src/event/ngx_event_quic_protection.h
src/event/ngx_event_quic_protection.c
The protection.h header provides interface to the crypto part of the QUIC:
2 functions to initialize corresponding secrets:
ngx_quic_set_initial_secret()
ngx_quic_set_encryption_secret()
and 2 functions to deal with packet processing:
ngx_quic_encrypt()
ngx_quic_decrypt()
Also, structures representing secrets are defined there.
All functions require SSL connection and a pool, only crypto operations
inside, no access to nginx connections or events.
Currently pool->log is used for the logging (instead of original c->log).
This makes it possible to avoid looping for a long time while working
with a fast enough peer when data are added to the socket buffer faster
than we are able to read and process them (ticket #1431). This is
basically what we already do on FreeBSD with kqueue, where information
about the number of bytes in the socket buffer is returned by
the kevent() call.
With other event methods rev->available is now set to -1 when the socket
is ready for reading. Later in ngx_recv() and ngx_recv_chain(), if
full buffer is received, real number of bytes in the socket buffer is
retrieved using ioctl(FIONREAD). Reading more than this number of bytes
ensures that even with edge-triggered event methods the event will be
triggered again, so it is safe to stop processing of the socket and
switch to other connections.
Using ioctl(FIONREAD) only after reading a full buffer is an optimization.
With this approach we only call ioctl(FIONREAD) when there are at least
two recv()/readv() calls.
Postpone filter is an essential part of subrequest functionality. In absence
of it a subrequest response body is sent to the client out of order with
respect to the main request header and body, as well as other subrequests.
For in-memory subrequests the response is also sent to the client instead of
being stored in memory.
Currently the postpone filter is automatically enabled if one of the following
standard modules which are known to create subrequests is enabled: ssi, slice,
addition. However a third-party module that creates subrequests can still be
built without the postpone filter or be dynamically loaded in nginx built
without it.
WSAPoll() is only available with Windows Vista and newer (and only
available during compilation if _WIN32_WINNT >= 0x0600). To make
sure the code works with Windows XP, we do not redefine _WIN32_WINNT,
but instead load WSAPoll() dynamically if it is not available during
compilation.
Also, sockets are not guaranteed to be small integers on Windows.
So an index array is used instead of NGX_USE_FD_EVENT to map
events to connections.
Previously, select was compiled in by default, but the NGX_HAVE_SELECT
macro was not set, resulting in iocp being used by default unless
the "--with-select_module" configure option was explicitly specified.
Since the iocp module is not finished and does not work properly, this
effectively meant that the "--with-select_module" option was mandatory.
With the change NGX_HAVE_SELECT is properly set, making "--with-select_module"
optional. Accordingly, it is removed from misc/GNUmakefile win32 target.
The module implements random load-balancing algorithm with optional second
choice. In the latter case, the best of two servers is chosen, accounting
number of connections and server weight.
Example:
upstream u {
random [two [least_conn]];
server 127.0.0.1:8080;
server 127.0.0.1:8081;
server 127.0.0.1:8082;
server 127.0.0.1:8083;
}
While 325b3042edd6 fixed it on MINIX, it broke it on systems
that output the word "version" on several lines with "cc -v".
The fix is to only consider "clang version" or "LLVM version"
as clang version, but this time only using sed(1).
This was previously used, but was incorrectly removed in 83d54192e97b
while removing old threads remnants. Instead of using it conditionally
when threads are not used, we now set in unconditionally, as even with
thread pools enabled we never call OpenSSL functions in threads.
This fixes resulting binary when using --with-openssl with OpenSSL 1.1.0+
and without -lpthread linked (notably on FreeBSD without PCRE).
OpenSSL now uses pthread_atfork(), and this requires -lpthread on Linux
to compile. Introduced NGX_LIBPTHREAD to add it as appropriate, similar
to existing NGX_LIBDL.
The module allows passing requests to upstream gRPC servers.
The module is built by default as long as HTTP/2 support is compiled in.
Example configuration:
grpc_pass 127.0.0.1:9000;
Alternatively, the "grpc://" scheme can be used:
grpc_pass grpc://127.0.0.1:9000;
Keepalive support is available via the upstream keepalive module. Note
that keepalive connections won't currently work with grpc-go as it fails
to handle SETTINGS_HEADER_TABLE_SIZE.
To use with SSL:
grpc_pass grpcs://127.0.0.1:9000;
SSL connections use ALPN "h2" when available. At least grpc-go works fine
without ALPN, so if ALPN is not available we just establish a connection
without it.
Tested with grpc-c++ and grpc-go.
When clock_gettime(CLOCK_MONOTONIC) (or faster variants, _FAST on FreeBSD,
and _COARSE on Linux) is available, we now use it for ngx_current_msec.
This should improve handling of timers if system time changes (ticket #189).
Previously, capset(2) was called with the 64-bit capabilities version
_LINUX_CAPABILITY_VERSION_3. With this version Linux kernel expected two
copies of struct __user_cap_data_struct, while only one was submitted. As a
result, random stack memory was accessed and random capabilities were requested
by the worker. This sometimes caused capset() errors. Now the 32-bit version
_LINUX_CAPABILITY_VERSION_1 is used instead. This is OK since CAP_NET_RAW is
a 32-bit capability (CAP_NET_RAW = 13).
Previously included file sys/capability.h mentioned in capset(2) man page,
belongs to the libcap-dev package, which may not be installed on some Linux
systems when compiling nginx. This prevented the capabilities feature from
being detected and compiled on that systems.
Now linux/capability.h system header is included instead. Since capset()
declaration is located in sys/capability.h, now capset() syscall is defined
explicitly in code using the SYS_capset constant, similarly to other
Linux-specific features in nginx.
The capability is retained automatically in unprivileged worker processes after
changing UID if transparent proxying is enabled at least once in nginx
configuration.
The feature is only available in Linux.
In 2c7b488a61fb, IP_BIND_ADDRESS_NO_PORT test was accidentally placed
between SO_BINDANY, IP_TRANSPARENT, and IP_BINDANY tests. Moved it after
these tests.
As per POSIX, basic regular expressions have no alternations, and the
interpretation of the "\|" construct is undefined. At least on MINIX
and Solaris grep interprets "\|" as literal "|", and not as an alternation
as GNU grep does. Removed such constructs introduced in f1daa0356a1d.
This fixes clang detection on MINIX.
The phase is added instead of the try_files phase. Unlike the old phase, the
new one supports registering multiple handlers. The try_files implementation is
moved to a separate ngx_http_try_files_module, which now registers a precontent
phase handler.
The http_rewrite module cannot be selected when http is disabled.
Fixed the PCRE check condition to avoid irrelevant check failure.
This is a regression from 4d874b4d82ed.
Signed-off-by: Samuel Martin <s.martin49@gmail.com>
On Cygwin and NetBSD 7.0+ struct in_pktinfo has no ipi_spec_dst field, which
caused nginx compilation error. Now presence of this field is ensured by the
IP_PKTINFO feature test.
The problem was introduced by dbb0c854e308 (1.13.0).
Oracle Developer Studio 12.5 introduced GCC-compatible __sync builtins.
Unfortunately, these builtins are neither GCC-compatible (they generate
warnings when used with volatile), nor working (unexpectedly fail on
unpredictable combinations of code layout and compiler flags). As such,
the gcc builtin atomic operations configure test explicitly disabled when
compiling with Sun C.
Previously, the source IP address of a response UDP datagram could differ from
the original datagram destination address. This could happen if the server UDP
socket is bound to a wildcard address and the network interface chosen to output
the response packet has a different default address than the destination address
of the original packet. For example, if two addresses from the same network are
configured on an interface.
Now source address is set explicitly if a response is sent for a server UDP
socket bound to a wildcard address.
Some combinations of options might cause the builds with the
--with-stream option to break due to invalid value of the
STREAM_INCS make variable, e.g.
auto/configure \
--with-stream \
--with-http_perl_module=dynamic \
--without-http_memcached_module \
--without-http_empty_gif_module \
--without-http_browser_module \
--without-http_upstream_hash_module \
--without-http_upstream_ip_hash_module \
--without-http_upstream_least_conn_module \
--without-http_upstream_keepalive_module \
--without-http_upstream_zone_module \
Explicit initialization of ngx_module_libs and ngx_module_link
matches what we already do when processing mail modules, and
is also required after the next change.
OpenSSL 1.1.0 now uses normal "nmake; nmake install" instead of using
custom "ms\do_ms.bat" script and "ms\nt.mak" makefile. And Configure
now requires --prefix to be absolute, and no longer derives --openssldir
from prefix (so it's specified explicitly). Generated libraries are now
called "libcrypto.lib" and "libssl.lib" instead of "libeay32.lib"
and "ssleay32.lib". Appropriate tests added to support both old and new
variants.
Additionally, openssl/lhash.h now triggers warning C4090 ('function' :
different 'const' qualifiers), so the warning was disabled.
In Perl 5.8.6 the default was switched to use putenv() when used as
embedded library unless "PL_use_safe_putenv = 0" is explicitly used
in the code. Therefore, for modern versions of Perl it is no longer
necessary to restore previous environment when calling perl_destruct().
Dependencies of dynamic modules are added to NGX_ADDON_DEPS (and
it is now used for dynamic modules) to be in line with what happens
in case of static compilation.
To avoid duplication, MAIL_DEPS and STREAM_DEPS are no longer passed
to auto/module when these modules are compiled as dynamic ones. Mail
and stream dependencies are handled explicitly via corresponding
variables.
IPv6 now compiled-in automatically if support is found. If there is a need
to disable it for some reason, --with-cc-opt="-DNGX_HAVE_INET6=0" can be used
for this.
Previously flags passed by --with-ld-opt were not used when building perl
module, which meant hardening flags provided by package build systems were not
applied.
The ssl_preread module extracts information from the SSL Client Hello message
without terminating SSL. Currently, only $ssl_preread_server_name variable
is supported, which contains server name from the SNI extension.
This flag appeared in Linux 4.5 and is useful for avoiding thundering herd
problem.
The current Linux kernel implementation walks the list of exclusive waiters,
and queues an event to each epfd, until it finds the first waiter that has
threads blocked on it via epoll_wait().