With gRPC it is possible that a request sending is blocked due to flow
control. Moreover, further sending might be only allowed once the
backend sees all the data we've already sent. With such a backend
it is required to clear the TCP_NOPUSH socket option to make sure all
the data we've sent are actually delivered to the backend.
As such, we now clear TCP_NOPUSH in ngx_http_upstream_send_request()
also on NGX_AGAIN if c->write->ready is set. This fixes a test (which
waits for all the 64k bytes as per initial window before allowing more
bytes) with sendfile enabled when the body was written to a file
in a different context.
Now tcp_nopush on peer connections is disabled if it is disabled on
the client connection, similar to how we handle c->sendfile. Previously,
tcp_nopush was always used on upstream connections, regardless of
the "tcp_nopush" directive.
We copy input buffers to our buffers, so various flags might be
unexpectedly set in buffers returned by ngx_chain_get_free_buf().
In particular, the b->in_file flag might be set when the body was
written to a file in a different context. With sendfile enabled this
in turn might result in protocol corruption if such a buffer was reused
for a control frame.
Make sure to clear buffers and set only fields we really need to be set.
The module implements random load-balancing algorithm with optional second
choice. In the latter case, the best of two servers is chosen, accounting
number of connections and server weight.
Example:
upstream u {
random [two [least_conn]];
server 127.0.0.1:8080;
server 127.0.0.1:8081;
server 127.0.0.1:8082;
server 127.0.0.1:8083;
}
Before 4a8c9139e579, ngx_resolver_create() didn't use configuration
pool, and allocations were done using malloc().
In 016352c19049, when resolver gained support of several servers,
new allocations were done from the pool.
With u->conf->preserve_output set the request body file might be used
after the response header is sent, so avoid cleaning it. (Normally
this is not a problem as u->conf->preserve_output is only set with
r->request_body_no_buffering, but the request body might be already
written to a file in a different context.)
Previously, only one client packet could be processed in a udp stream session
even though multiple response packets were supported. Now multiple packets
coming from the same client address and port are delivered to the same stream
session.
If it's required to maintain a single stream of data, nginx should be
configured in a way that all packets from a client are delivered to the same
worker. On Linux and DragonFly BSD the "reuseport" parameter should be
specified for this. Other systems do not currently provide appropriate
mechanisms. For these systems a single stream of udp packets is only
guaranteed in single-worker configurations.
The proxy_response directive now specifies how many packets are expected in
response to a single client packet.
Previously, ngx_event_recvmsg() got remote socket addresses after creating
the connection object. In preparation to handling multiple UDP packets in a
single session, this code was moved up.
On Linux recvmsg() syscall may return a zero-length client address when
receiving a datagram from an unbound unix datagram socket. It is usually
assumed that socket address has at least the sa_family member. Zero-length
socket address caused buffer over-read in functions which receive socket
address, for example ngx_sock_ntop(). Typically the over-read resulted in
unexpected socket family followed by session close. Now a fake socket address
is allocated instead of a zero-length client address.
Negative times can appear since workers only update time on an event
loop iteration start. If a worker was blocked for a long time during
an event loop iteration, it is possible that another worker already
updated the time stored in the node. As such, time since last update
of the node (ms) will be negative.
Previous code used ngx_abs(ms) in the calculations. That is, negative
times were effectively treated as positive ones. As a result, it was
not possible to maintain high request rates, where the same node can be
updated multiple times from during an event loop iteration.
In particular, this affected setups with many SSL handshakes, see
http://mailman.nginx.org/pipermail/nginx/2018-May/056291.html.
Fix is to only update the last update time stored in the node if the
new time is larger than previously stored one. If a future time is
stored in the node, we preserve this time as is.
To prevent breaking things on platforms without monotonic time available
if system time is updated backwards, a safety limit of 60 seconds is
used. If the time stored in the node is more than 60 seconds in the future,
we assume that the time was changed backwards and update lr->last
to the current time.
The bug in question was fixed in glibc 2.3.2 and is no longer expected
to manifest itself on real servers. On the other hand, the workaround
causes compilation problems on various systems. Previously, we've
already fixed the code to compile with musl libc (fd6fd02f6a4d), and
now it is broken on Fedora 28 where glibc's crypt library was replaced
by libxcrypt. So the workaround was removed.
FreeBSD returns EINVAL when getsockopt(TCP_FASTOPEN) is called on a unix
domain socket, resulting in "getsockopt(TCP_FASTOPEN) ... failed" messages
during binary upgrade when unix domain listen sockets are present in
the configuration. Added EINVAL to the list of ignored error codes.
While 325b3042edd6 fixed it on MINIX, it broke it on systems
that output the word "version" on several lines with "cc -v".
The fix is to only consider "clang version" or "LLVM version"
as clang version, but this time only using sed(1).
Previously, only unix domain sockets were reopened to tolerate cases when
local syslog server was restarted. It makes sense to treat other cases
(for example, local IP address changes) similarly.
Cast to intermediate "void *" to lose compiler knowledge about the original
type and pass the warning. This is not a real fix but rather a workaround.
Found by gcc8.
In mail and stream modules, no certificate provided is a fatal condition,
much like with the "ssl" and "starttls" directives.
In http, "listen ... ssl" can be used in a non-default server without
certificates as long as there is a certificate in the default one, so
missing certificate is only fatal for default servers.
In 51e1f047d15d, the "ssl" directive name was incorrectly hardcoded
in the error message shown when there are some SSL keys defined, but
not for all certificates. Right approach is to use the "mode" variable,
which can be either "ssl" or "starttls".
Previously, result of ngx_atoi() was assigned to an ngx_uint_t variable,
and errors reported by ngx_atoi() became positive, so the following check
in "status < 100" failed to catch them. This resulted in the configurations
like "proxy_cache_valid 2xx 30s" being accepted as correct, while they
in fact do nothing. Changing type to ngx_int_t fixes this, and such
configurations are now properly rejected.
Previously, ngx_http_upstream_process_header() might be called after
we've finished reading response headers and switched to a different read
event handler, leading to errors with gRPC proxying. Additionally,
the u->conf->read_timeout timer might be re-armed during reading response
headers (while this is expected to be a single timeout on reading
the whole response header).