Commit Graph

134 Commits

Author SHA1 Message Date
Maxim Dounin
751bdd3bb2 Events: moved sockets cloning to ngx_event_init_conf().
Previously, listenings sockets were not cloned if the worker_processes
directive was specified after "listen ... reuseport".

This also simplifies upcoming configuration check on the number
of worker connections, as it needs to know the number of listening
sockets before cloning.
2018-07-12 19:50:02 +03:00
Ruslan Ermilov
468e37734c Added FreeBSD support for "listen ... reuseport". 2018-07-02 13:54:33 +03:00
Roman Arutyunyan
96b6f215b8 Stream: udp streams.
Previously, only one client packet could be processed in a udp stream session
even though multiple response packets were supported.  Now multiple packets
coming from the same client address and port are delivered to the same stream
session.

If it's required to maintain a single stream of data, nginx should be
configured in a way that all packets from a client are delivered to the same
worker.  On Linux and DragonFly BSD the "reuseport" parameter should be
specified for this.  Other systems do not currently provide appropriate
mechanisms.  For these systems a single stream of udp packets is only
guaranteed in single-worker configurations.

The proxy_response directive now specifies how many packets are expected in
response to a single client packet.
2018-06-04 19:50:00 +03:00
Maxim Dounin
4f9d83d6d7 Core: silenced getsockopt(TCP_FASTOPEN) messages on FreeBSD.
FreeBSD returns EINVAL when getsockopt(TCP_FASTOPEN) is called on a unix
domain socket, resulting in "getsockopt(TCP_FASTOPEN) ... failed" messages
during binary upgrade when unix domain listen sockets are present in
the configuration.  Added EINVAL to the list of ignored error codes.
2018-05-21 23:11:27 +03:00
Maxim Dounin
2e1e65a5c0 Fixed buffer overread with unix sockets after accept().
Some OSes (notably macOS, NetBSD, and Solaris) allow unix socket addresses
larger than struct sockaddr_un.  Moreover, some of them (macOS, Solaris)
return socklen of the socket address before it was truncated to fit the
buffer provided.  As such, on these systems socklen must not be used without
additional check that it is within the buffer provided.

Appropriate checks added to ngx_event_accept() (after accept()),
ngx_event_recvmsg() (after recvmsg()), and ngx_set_inherited_sockets()
(after getsockname()).

We also obtain socket addresses via getsockname() in
ngx_connection_local_sockaddr(), but it does not need any checks as
it is only used for INET and INET6 sockets (as there can be no
wildcard unix sockets).
2017-10-04 21:19:33 +03:00
Maxim Dounin
bedd9c5645 Core: fixed error message on setsockopt(SO_REUSEPORT) failure.
The error is fatal when configuring a new socket, so the ", ignored" part
is not appropriate and was removed.
2017-07-11 20:06:52 +03:00
Maxim Dounin
da165aae88 Core: disabled SO_REUSEPORT when testing config (ticket #1300).
When closing a socket with SO_REUSEPORT, Linux drops all connections waiting
in this socket's listen queue.  Previously, it was believed to only result
in connection resets when reconfiguring nginx to use smaller number of worker
processes.  It also results in connection resets during configuration
testing though.

Workaround is to avoid using SO_REUSEPORT when testing configuration.  It
should prevent listening sockets from being created if a conflicting socket
already exists, while still preserving detection of other possible errors.
It should also cover UDP sockets.

The only downside of this approach seems to be that a configuration testing
won't be able to properly report the case when nginx was compiled with
SO_REUSEPORT, but the kernel is not able to set it.  Such errors will be
reported on a real start instead.
2017-07-11 19:59:56 +03:00
Ruslan Ermilov
b66c18d2d5 Introduced ngx_tcp_nodelay(). 2017-05-26 22:52:48 +03:00
Maxim Dounin
c3ad24da01 Improved connection draining with small number of connections.
Closing up to 32 connections might be too aggressive if worker_connections
is set to a comparable number (and/or there are only a small number of
reusable connections).  If an occasional connection shorage happens in
such a configuration, it leads to closing all reusable connections instead
of gradually reducing keepalive timeout to a smaller value.  To improve
granularity in such configurations we now close no more than 1/8 of all
reusable connections at once.

Suggested by Joel Cunningham.
2017-01-20 14:03:20 +03:00
Maxim Dounin
660e1a5340 Added cycle parameter to ngx_drain_connections().
No functional changes, mostly style.
2017-01-20 14:03:19 +03:00
Ruslan Ermilov
f9430de485 Core: use c->log while closing connection.
c->pool is not destroyed here since c52408583801.
2016-10-05 13:57:43 +03:00
Ruslan Ermilov
fd064d3b88 Introduced the ngx_sockaddr_t type.
It's properly aligned and can hold any supported sockaddr.
2016-05-23 16:37:20 +03:00
Ruslan Ermilov
61fbcd1cad Belatedly changed the ngx_create_listening() prototype.
The function is called only with "struct sockaddr *" since 0.7.58.
2016-05-20 17:02:04 +03:00
Ruslan Ermilov
7ad57da598 Style. 2016-03-30 11:52:16 +03:00
Roman Arutyunyan
030a1f959c Fixed socket inheritance on reload and binary upgrade.
On nginx reload or binary upgrade, an attempt is made to inherit listen sockets
from the previous configuration.  Previously, no check for socket type was made
and the inherited socket could have the wrong type.  On binary upgrade, socket
type was not detected at all.  Wrong socket type could lead to errors on that
socket due to different logic and unsupported syscalls.  For example, a UDP
socket, inherited as TCP, lead to the following error after arrival of a
datagram: "accept() failed (102: Operation not supported on socket)".
2016-03-25 14:10:38 +03:00
Roman Arutyunyan
2ce791f2cd Stream: UDP proxy. 2016-01-20 19:52:12 +03:00
Kouhei Sutou
67614b3aa3 Win32: fixed build with MinGW and MinGW-w64 gcc.
This change fixes the "comparison between signed and unsigned integer
expressions" warning, introduced in 5e6142609e48 (1.9.4).
2015-10-17 21:41:02 +09:00
Valentin Bartenev
50ff8b3c3a Core: idle connections now closed only once on exiting.
Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured.  As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.

Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set.  The upstream keepalive module
was modified to follow this.
2015-08-11 16:28:55 +03:00
Gena Makhomed
97741382b6 Workaround for "configuration file test failed" under OpenVZ.
If nginx was used under OpenVZ and a container with nginx was suspended
and resumed, configuration tests started to fail because of EADDRINUSE
returned from listen() instead of bind():

# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
nginx: configuration file /etc/nginx/nginx.conf test failed

With this change EADDRINUSE errors returned by listen() are handled
similarly to errors returned by bind(), and configuration tests work
fine in the same environment:

# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

More details about OpenVZ suspend/resume bug:
https://bugzilla.openvz.org/show_bug.cgi?id=2470
2015-07-23 14:00:03 -04:00
Maxim Dounin
f7f1607bf2 The "reuseport" option of the "listen" directive.
When configured, an individual listen socket on a given address is
created for each worker process.  This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance.  As of now it works on Linux and DragonFly BSD.

Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/.  With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.

Based on previous work by Sepherosa Ziehau and Yingqi Lu.
2015-05-20 15:51:56 +03:00
Ruslan Ermilov
33b8e5bc06 Removed the obsolete rtsig module. 2015-04-23 14:17:40 +03:00
Ruslan Ermilov
c1882d9f3f Removed the obsolete aio module. 2015-04-22 18:57:32 +03:00
Ruslan Ermilov
07de3f538b Removed stub implementation of win32 mutexes. 2015-03-23 13:52:47 +03:00
Ruslan Ermilov
f8d10849ad Removed ngx_connection_t.lock. 2015-03-20 06:43:19 +03:00
Ruslan Ermilov
83ba5ed2ec Renamed NGX_THREADS to NGX_OLD_THREADS because of deprecation.
It's mostly dead code and the original idea of worker threads has been rejected.
2015-03-04 18:26:25 +03:00
Roman Arutyunyan
6337c1d185 Core: make ngx_connection_local_sockaddr() always assign address.
Previously, this function checked for connection local address existence
and returned error if it was missing.  Now a new address is assigned in this
case making it possible to call this function not only for accepted connections.
2015-02-17 14:26:44 +03:00
Valentin Bartenev
37d24e7e3b Events: processing of posted events changed from LIFO to FIFO.
In theory, this can provide a bit better distribution of latencies.

Also it simplifies the code, since ngx_queue_t is now used instead
of custom implementation.
2014-09-01 18:20:18 +04:00
Valentin Bartenev
2a81e05566 Events: removed broken thread support from posted events.
It's mostly dead code.  And the idea of thread support for this task has
been deprecated.
2014-09-01 18:20:03 +04:00
Maxim Dounin
1a5cdafa82 Core: plugged socket leak during configuration test.
This isn't really important as configuration testing shortly ends with
a process termination which will free all sockets, though Coverity
complains.

Prodded by Coverity (CID 400872).
2014-06-26 03:34:05 +04:00
Ruslan Ermilov
8aa8365121 Core: allocate enough memory to hold IPv6 text address plus port. 2014-02-22 12:08:31 +04:00
Piotr Sikora
ab3c0f9250 Use ngx_socket_errno where appropriate.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
2014-02-03 14:17:17 -08:00
Piotr Sikora
2e57e0609b Core: handle getsockopt(TCP_FASTOPEN) failures.
Linux returns EOPNOTSUPP for non-TCP sockets and ENOPROTOOPT for TCP
sockets, because getsockopt(TCP_FASTOPEN) is not implemented so far.

While there, lower the log level from ALERT to NOTICE to match other
getsockopt() failures.

Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
2014-01-30 14:58:21 -08:00
Maxim Dounin
c94c24b177 Fixed TCP_DEFER_ACCEPT handling (ticket #353).
Backed out 05a56ebb084a, as it turns out that kernel can return connections
without any delay if syncookies are used.  This basically means we can't
assume anything about connections returned with deferred accept set.

To solve original problem the 05a56ebb084a tried to solve, i.e. to don't
wait longer than needed if a connection was accepted after deferred accept
timeout, this patch changes a timeout set with setsockopt(TCP_DEFER_ACCEPT)
to 1 second, unconditionally.  This is believed to be enough for speed
improvements, and doesn't imply major changes to timeouts used.

Note that before 2.6.32 connections were dropped after a timeout.  Though
it is believed that 1s is still appropriate for kernels before 2.6.32,
as previously tcp_synack_retries controlled the actual timeout and 1s results
in more than 1 minute actual timeout by default.
2014-01-28 15:40:46 +04:00
Ruslan Ermilov
bf800822df Fixed the first argument to getsockopt().
While here, always initialize the last argument.
2013-12-19 13:43:18 +04:00
Ruslan Ermilov
fa512fdb76 Fixed handling of UNIX-domain sockets.
When evaluating $local_port, $server_port, and $server_addr,
UNIX-domain sockets were mistakenly interpreted as IPv4 sockets.
2013-12-09 10:16:44 +04:00
Ruslan Ermilov
675e73e3bd Core: keep the length of the local sockaddr. 2013-12-09 10:14:51 +04:00
Mathew Rodley
84f5c2136e Added support for TCP_FASTOPEN supported in Linux >= 3.7.1.
---
 auto/unix                       | 12 ++++++++++++
 src/core/ngx_connection.c       | 32 ++++++++++++++++++++++++++++++++
 src/core/ngx_connection.h       |  4 ++++
 src/http/ngx_http.c             |  4 ++++
 src/http/ngx_http_core_module.c | 21 +++++++++++++++++++++
 src/http/ngx_http_core_module.h |  3 +++
 6 files changed, 76 insertions(+)
2013-12-03 22:07:03 +04:00
Maxim Dounin
0eee3b0bc5 Core: handling of getsockopt(TCP_DEFER_ACCEPT) failures.
Recent Linux versions started to return EOPNOTSUPP to getsockopt() calls
on unix sockets, resulting in log pollution on binary upgrade.  Such errors
are silently ignored now.
2013-10-31 04:00:37 +04:00
Maxim Dounin
48d96ced6f Win32: MinGW GCC compatibility.
Several warnings silenced, notably (ngx_socket_t) -1 is now checked
on socket operations instead of -1, as ngx_socket_t is unsigned on win32
and gcc complains on comparison.

With this patch, it's now possible to compile nginx using mingw gcc,
with options we normally compile on win32.
2013-09-04 20:48:28 +04:00
Ruslan Ermilov
02a077b827 On DragonFlyBSD, TCP_KEEPIDLE and TCP_KEEPINTVL are in msecs.
Based on a patch by Sepherosa Ziehau.
2013-07-25 12:46:03 +04:00
Ruslan Ermilov
690e2b33aa Style: reuse one int variable in ngx_configure_listening_sockets().
No functional changes.
2013-07-25 12:46:02 +04:00
Vladimir Homutov
d79c8abcaa Core: fixed possible use of an uninitialized variable.
The call to ngx_sock_ntop() in ngx_connection_local_sockaddr() might be
performed with the uninitialized "len" variable.  The fix is to initialize
variable to the size of corresponding socket address type.

The issue was introduced in commit 05ba5bce31e0.
2013-07-11 19:50:19 +04:00
Vladimir Homutov
af18946d76 Core: extended ngx_sock_ntop() with socklen parameter.
On Linux, sockaddr length is required to process unix socket addresses properly
due to unnamed sockets (which don't have sun_path set at all) and abstract
namespace sockets.
2013-07-11 16:07:25 +04:00
Valentin Bartenev
604e18fb2c Use NGX_FILE_ERROR for handling file operations errors.
On Win32 platforms 0 is used to indicate errors in file operations, so
comparing against -1 is not portable.

This was not much of an issue in patched code, since only ngx_fd_info() test
is actually reachable on Win32 and in worst case it might result in bogus
error log entry.

Patch by Piotr Sikora.
2013-03-25 15:49:11 +00:00
Valentin Bartenev
bac0cb3bbd Status: introduced the "ngx_stat_waiting" counter.
And corresponding variable $connections_waiting was added.

Previously, waiting connections were counted as the difference between
active connections and the sum of reading and writing connections.
That made it impossible to count more than one request in one connection
as reading or writing (as is the case for SPDY).

Also, we no longer count connections in handshake state as waiting.
2013-03-15 20:00:49 +00:00
Valentin Bartenev
a32d3f8b6b Removed c->single_connection flag.
The c->single_connection was intended to be used as lock mechanism
to serialize modifications of request object from several threads
working with client and upstream connections.  The flag is redundant
since threads in nginx have never been used that way.
2013-03-07 18:07:16 +00:00
Igor Sysoev
da130acfbe Fixed failure to start cache manager and cache loader processes
if there were more than 512 listening sockets in configuration.
2012-11-20 13:37:55 +00:00
Ruslan Ermilov
deaf22d220 Core: ipv6only is now on by default.
There is a general consensus that this change results in better
consistency between different operating systems and differently
tuned operating systems.

Note: this changes the width and meaning of the ipv6only field
of the ngx_listening_t structure.  3rd party modules that create
their own listening sockets might need fixing.
2012-07-30 12:27:06 +00:00
Ruslan Ermilov
47a04aaa27 Fixed spelling in multiline C comments. 2012-04-03 07:37:31 +00:00
Maxim Dounin
ee187436af Whitespace fixes. 2012-03-05 18:09:06 +00:00