Commit Graph

153 Commits

Author SHA1 Message Date
Roman Arutyunyan
eeb8a9f56f QUIC: path MTU discovery.
MTU selection starts by doubling the initial MTU until the first failure.
Then binary search is used to find the path MTU.
2023-08-14 09:21:27 +04:00
Roman Arutyunyan
0a3c796145 Common tree insert function for QUIC and UDP connections.
Previously, ngx_udp_rbtree_insert_value() was used for plain UDP and
ngx_quic_rbtree_insert_value() was used for QUIC.  Because of this it was
impossible to initialize connection tree in ngx_create_listening() since
this function is not aware what kind of listening it creates.

Now ngx_udp_rbtree_insert_value() is used for both QUIC and UDP.  To make
is possible, a generic key field is added to ngx_udp_connection_t.  It keeps
client address for UDP and connection ID for QUIC.
2023-05-14 12:30:11 +04:00
Roman Arutyunyan
1465a34067 QUIC: disabled datagram fragmentation.
As per RFC 9000, Section 14:

  UDP datagrams MUST NOT be fragmented at the IP layer.
2023-05-06 16:23:27 +04:00
Sergey Kandaurov
f5aa66bd30 Merged with the default branch. 2023-01-02 17:10:22 +04:00
Maxim Dounin
9c7a2c7ce4 Updated link to OpenVZ suspend/resume bug. 2022-12-21 14:53:27 +03:00
Roman Arutyunyan
9d81ef744c QUIC: separate UDP framework for QUIC.
Previously, QUIC used the existing UDP framework, which was created for UDP in
Stream.  However the way QUIC connections are created and looked up is different
from the way UDP connections in Stream are created and looked up.  Now these
two implementations are decoupled.
2022-04-20 16:01:17 +04:00
Roman Arutyunyan
465362e066 QUIC: store QUIC connection fd in stream fake connection.
Previously it had -1 as fd.  This fixes proxying, which relies on downstream
connection having a real fd.  Also, this reduces diff to the default branch for
ngx_close_connection().
2021-09-06 16:59:00 +03:00
Sergey Kandaurov
161759443c Merged with the default branch. 2021-07-15 16:28:21 +03:00
Maxim Dounin
52cde89586 Core: disabled SO_REUSEADDR on UDP sockets while testing config.
On Linux, SO_REUSEADDR allows completely duplicate UDP sockets, so using
SO_REUSEADDR when testing configuration results in packets being dropped
if there is an existing traffic on the sockets being tested (ticket #2187).
While dropped packets are expected with UDP, it is better to avoid this
when possible.

With this change, SO_REUSEADDR is no longer set on datagram sockets when
testing configuration.
2021-05-31 16:36:51 +03:00
Sergey Kandaurov
8ca2f73073 Merged with the default branch. 2021-02-17 14:48:35 +03:00
Maxim Dounin
9a3ec20232 Additional connections reuse.
If ngx_drain_connections() fails to immediately reuse any connections
and there are no free connections, it now additionally tries to reuse
a connection again.  This helps to provide at least one free connection
in case of HTTP/2 with lingering close, where merely trying to reuse
a connection once does not free it, but makes it reusable again,
waiting for lingering close.
2021-02-11 21:52:11 +03:00
Roman Arutyunyan
fd6df645eb Merged with the default branch. 2020-08-18 16:22:00 +03:00
Maxim Dounin
348bc94086 Core: reusing connections in advance.
Reworked connections reuse, so closing connections is attempted in
advance, as long as number of free connections is less than 1/16 of
worker connections configured.  This ensures that new connections can
be handled even if closing a reusable connection requires some time,
for example, for a lingering close (ticket #2017).

The 1/16 ratio is selected to be smaller than 1/8 used for disabling
accept when working with accept mutex, so nginx will try to balance
new connections to different workers first, and will start reusing
connections only if this won't help.
2020-08-10 18:53:07 +03:00
Maxim Dounin
e240d88d44 Core: added a warning about reusing connections.
Previously, reusing connections happened silently and was only
visible in monitoring systems.  This was shown to be not very user-friendly,
and administrators often didn't realize there were too few connections
available to withstand the load, and configured timeouts (keepalive_timeout
and http2_idle_timeout) were effectively reduced to keep things running.

To provide at least some information about this, a warning is now logged
(at most once per second, to avoid flooding the logs).
2020-08-10 18:52:59 +03:00
Roman Arutyunyan
b813b9ec35 QUIC: added "quic" listen parameter.
The parameter allows processing HTTP/0.9-2 over QUIC.

Also, introduced ngx_http_quic_module and moved QUIC settings there
2020-07-21 23:09:22 +03:00
Sergey Kandaurov
2346ee29e1 Merged with the default branch. 2020-07-13 15:34:22 +03:00
Sergey Kandaurov
9d46500914 Do not close QUIC sockets in ngx_close_listening_sockets().
This breaks graceful shutdown of QUIC connections in terms of quic-transport.
2020-06-23 11:57:00 +03:00
Ruslan Ermilov
da370de990 Fixed removing of listening UNIX sockets when "changing binary".
When changing binary, sending a SIGTERM to the new binary's master process
should not remove inherited UNIX sockets unless the old binary's master
process has exited.
2020-06-01 20:19:27 +03:00
Roman Arutyunyan
11dfc1c943 Fixed sanitizer errors. 2020-03-13 20:44:32 +03:00
Maxim Dounin
751bdd3bb2 Events: moved sockets cloning to ngx_event_init_conf().
Previously, listenings sockets were not cloned if the worker_processes
directive was specified after "listen ... reuseport".

This also simplifies upcoming configuration check on the number
of worker connections, as it needs to know the number of listening
sockets before cloning.
2018-07-12 19:50:02 +03:00
Ruslan Ermilov
468e37734c Added FreeBSD support for "listen ... reuseport". 2018-07-02 13:54:33 +03:00
Roman Arutyunyan
96b6f215b8 Stream: udp streams.
Previously, only one client packet could be processed in a udp stream session
even though multiple response packets were supported.  Now multiple packets
coming from the same client address and port are delivered to the same stream
session.

If it's required to maintain a single stream of data, nginx should be
configured in a way that all packets from a client are delivered to the same
worker.  On Linux and DragonFly BSD the "reuseport" parameter should be
specified for this.  Other systems do not currently provide appropriate
mechanisms.  For these systems a single stream of udp packets is only
guaranteed in single-worker configurations.

The proxy_response directive now specifies how many packets are expected in
response to a single client packet.
2018-06-04 19:50:00 +03:00
Maxim Dounin
4f9d83d6d7 Core: silenced getsockopt(TCP_FASTOPEN) messages on FreeBSD.
FreeBSD returns EINVAL when getsockopt(TCP_FASTOPEN) is called on a unix
domain socket, resulting in "getsockopt(TCP_FASTOPEN) ... failed" messages
during binary upgrade when unix domain listen sockets are present in
the configuration.  Added EINVAL to the list of ignored error codes.
2018-05-21 23:11:27 +03:00
Maxim Dounin
2e1e65a5c0 Fixed buffer overread with unix sockets after accept().
Some OSes (notably macOS, NetBSD, and Solaris) allow unix socket addresses
larger than struct sockaddr_un.  Moreover, some of them (macOS, Solaris)
return socklen of the socket address before it was truncated to fit the
buffer provided.  As such, on these systems socklen must not be used without
additional check that it is within the buffer provided.

Appropriate checks added to ngx_event_accept() (after accept()),
ngx_event_recvmsg() (after recvmsg()), and ngx_set_inherited_sockets()
(after getsockname()).

We also obtain socket addresses via getsockname() in
ngx_connection_local_sockaddr(), but it does not need any checks as
it is only used for INET and INET6 sockets (as there can be no
wildcard unix sockets).
2017-10-04 21:19:33 +03:00
Maxim Dounin
bedd9c5645 Core: fixed error message on setsockopt(SO_REUSEPORT) failure.
The error is fatal when configuring a new socket, so the ", ignored" part
is not appropriate and was removed.
2017-07-11 20:06:52 +03:00
Maxim Dounin
da165aae88 Core: disabled SO_REUSEPORT when testing config (ticket #1300).
When closing a socket with SO_REUSEPORT, Linux drops all connections waiting
in this socket's listen queue.  Previously, it was believed to only result
in connection resets when reconfiguring nginx to use smaller number of worker
processes.  It also results in connection resets during configuration
testing though.

Workaround is to avoid using SO_REUSEPORT when testing configuration.  It
should prevent listening sockets from being created if a conflicting socket
already exists, while still preserving detection of other possible errors.
It should also cover UDP sockets.

The only downside of this approach seems to be that a configuration testing
won't be able to properly report the case when nginx was compiled with
SO_REUSEPORT, but the kernel is not able to set it.  Such errors will be
reported on a real start instead.
2017-07-11 19:59:56 +03:00
Ruslan Ermilov
b66c18d2d5 Introduced ngx_tcp_nodelay(). 2017-05-26 22:52:48 +03:00
Maxim Dounin
c3ad24da01 Improved connection draining with small number of connections.
Closing up to 32 connections might be too aggressive if worker_connections
is set to a comparable number (and/or there are only a small number of
reusable connections).  If an occasional connection shorage happens in
such a configuration, it leads to closing all reusable connections instead
of gradually reducing keepalive timeout to a smaller value.  To improve
granularity in such configurations we now close no more than 1/8 of all
reusable connections at once.

Suggested by Joel Cunningham.
2017-01-20 14:03:20 +03:00
Maxim Dounin
660e1a5340 Added cycle parameter to ngx_drain_connections().
No functional changes, mostly style.
2017-01-20 14:03:19 +03:00
Ruslan Ermilov
f9430de485 Core: use c->log while closing connection.
c->pool is not destroyed here since c52408583801.
2016-10-05 13:57:43 +03:00
Ruslan Ermilov
fd064d3b88 Introduced the ngx_sockaddr_t type.
It's properly aligned and can hold any supported sockaddr.
2016-05-23 16:37:20 +03:00
Ruslan Ermilov
61fbcd1cad Belatedly changed the ngx_create_listening() prototype.
The function is called only with "struct sockaddr *" since 0.7.58.
2016-05-20 17:02:04 +03:00
Ruslan Ermilov
7ad57da598 Style. 2016-03-30 11:52:16 +03:00
Roman Arutyunyan
030a1f959c Fixed socket inheritance on reload and binary upgrade.
On nginx reload or binary upgrade, an attempt is made to inherit listen sockets
from the previous configuration.  Previously, no check for socket type was made
and the inherited socket could have the wrong type.  On binary upgrade, socket
type was not detected at all.  Wrong socket type could lead to errors on that
socket due to different logic and unsupported syscalls.  For example, a UDP
socket, inherited as TCP, lead to the following error after arrival of a
datagram: "accept() failed (102: Operation not supported on socket)".
2016-03-25 14:10:38 +03:00
Roman Arutyunyan
2ce791f2cd Stream: UDP proxy. 2016-01-20 19:52:12 +03:00
Kouhei Sutou
67614b3aa3 Win32: fixed build with MinGW and MinGW-w64 gcc.
This change fixes the "comparison between signed and unsigned integer
expressions" warning, introduced in 5e6142609e48 (1.9.4).
2015-10-17 21:41:02 +09:00
Valentin Bartenev
50ff8b3c3a Core: idle connections now closed only once on exiting.
Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured.  As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.

Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set.  The upstream keepalive module
was modified to follow this.
2015-08-11 16:28:55 +03:00
Gena Makhomed
97741382b6 Workaround for "configuration file test failed" under OpenVZ.
If nginx was used under OpenVZ and a container with nginx was suspended
and resumed, configuration tests started to fail because of EADDRINUSE
returned from listen() instead of bind():

# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
nginx: configuration file /etc/nginx/nginx.conf test failed

With this change EADDRINUSE errors returned by listen() are handled
similarly to errors returned by bind(), and configuration tests work
fine in the same environment:

# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

More details about OpenVZ suspend/resume bug:
https://bugzilla.openvz.org/show_bug.cgi?id=2470
2015-07-23 14:00:03 -04:00
Maxim Dounin
f7f1607bf2 The "reuseport" option of the "listen" directive.
When configured, an individual listen socket on a given address is
created for each worker process.  This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance.  As of now it works on Linux and DragonFly BSD.

Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/.  With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.

Based on previous work by Sepherosa Ziehau and Yingqi Lu.
2015-05-20 15:51:56 +03:00
Ruslan Ermilov
33b8e5bc06 Removed the obsolete rtsig module. 2015-04-23 14:17:40 +03:00
Ruslan Ermilov
c1882d9f3f Removed the obsolete aio module. 2015-04-22 18:57:32 +03:00
Ruslan Ermilov
07de3f538b Removed stub implementation of win32 mutexes. 2015-03-23 13:52:47 +03:00
Ruslan Ermilov
f8d10849ad Removed ngx_connection_t.lock. 2015-03-20 06:43:19 +03:00
Ruslan Ermilov
83ba5ed2ec Renamed NGX_THREADS to NGX_OLD_THREADS because of deprecation.
It's mostly dead code and the original idea of worker threads has been rejected.
2015-03-04 18:26:25 +03:00
Roman Arutyunyan
6337c1d185 Core: make ngx_connection_local_sockaddr() always assign address.
Previously, this function checked for connection local address existence
and returned error if it was missing.  Now a new address is assigned in this
case making it possible to call this function not only for accepted connections.
2015-02-17 14:26:44 +03:00
Valentin Bartenev
37d24e7e3b Events: processing of posted events changed from LIFO to FIFO.
In theory, this can provide a bit better distribution of latencies.

Also it simplifies the code, since ngx_queue_t is now used instead
of custom implementation.
2014-09-01 18:20:18 +04:00
Valentin Bartenev
2a81e05566 Events: removed broken thread support from posted events.
It's mostly dead code.  And the idea of thread support for this task has
been deprecated.
2014-09-01 18:20:03 +04:00
Maxim Dounin
1a5cdafa82 Core: plugged socket leak during configuration test.
This isn't really important as configuration testing shortly ends with
a process termination which will free all sockets, though Coverity
complains.

Prodded by Coverity (CID 400872).
2014-06-26 03:34:05 +04:00
Ruslan Ermilov
8aa8365121 Core: allocate enough memory to hold IPv6 text address plus port. 2014-02-22 12:08:31 +04:00
Piotr Sikora
ab3c0f9250 Use ngx_socket_errno where appropriate.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
2014-02-03 14:17:17 -08:00