The ngx_event_pipe() function wasn't called on write events with
wev->delayed set. As a result, threaded writing results weren't
properly collected in ngx_event_pipe_write_to_downstream() when a
write event was triggered for a completed write.
Further, this wasn't detected, as p->aio was reset by a thread completion
handler, and results were later collected in ngx_event_pipe_read_upstream()
instead of scheduling a new write of additional data. If this happened
on the last reading from an upstream, last part of the response was never
written to the cache file.
Similar problems might also happen in case of timeouts when writing to
client, as this also results in ngx_event_pipe() not being called on write
events. In this scenario socket leaks were observed.
Fix is to check if p->writing is set in ngx_event_pipe_read_upstream(), and
therefore collect results of previous write operations in case of read events
as well, similar to how we do so in ngx_event_pipe_write_downstream().
This is enough to fix the wev->delayed case. Additionally, we now call
ngx_event_pipe() from ngx_http_upstream_process_request() if there are
uncollected write operations (p->writing and !p->aio). This also fixes
the wev->timedout case.
The ngx_chain_coalesce_file() function may produce more bytes to send then
requested in the limit passed, as it aligns the last file position
to send to memory page boundary. As a result, (limit - send) may become
negative. This resulted in big positive number when converted to size_t
while calling ngx_output_chain_to_iovec().
Another part of the problem is in ngx_chain_coalesce_file(): it changes cl
to the next chain link even if the current buffer is only partially sent
due to limit.
Therefore, if a file buffer was not expected to be fully sent due to limit,
and was followed by a memory buffer, nginx called sendfile() with a part
of the file buffer, and the memory buffer in trailer. If there were enough
room in the socket buffer, this resulted in a part of the file buffer being
skipped, and corresponding part of the memory buffer sent instead.
The bug was introduced in 8e903522c17a (1.7.8). Configurations affected
are ones using limits, that is, limit_rate and/or sendfile_max_chunk, and
memory buffers after file ones (may happen when using subrequests or
with proxying with disk buffering).
Fix is to explicitly check if (send < limit) before constructing trailer
with ngx_output_chain_to_iovec(). Additionally, ngx_chain_coalesce_file()
was modified to preserve unfinished file buffers in cl.
Closing up to 32 connections might be too aggressive if worker_connections
is set to a comparable number (and/or there are only a small number of
reusable connections). If an occasional connection shorage happens in
such a configuration, it leads to closing all reusable connections instead
of gradually reducing keepalive timeout to a smaller value. To improve
granularity in such configurations we now close no more than 1/8 of all
reusable connections at once.
Suggested by Joel Cunningham.
A missing check could cause ngx_stream_ssl_handler() to be applied
to a non-ssl session, which resulted in a null pointer dereference
if ssl_verify_client is enabled.
The bug had appeared in 1.11.8 (41cb1b64561d).
Previously, an unavailable peer was considered recovered after a successful
proxy session to this peer. Until then, only a single client connection per
fail_timeout was allowed to be proxied to the peer.
Since stream sessions can be long, it may take indefinite time for a peer to
recover, limiting the ability of the peer to receive new connections.
Now, a peer is considered recovered after a successful TCP connection is
established to it. Balancers are notified of this event via the notify()
callback.
OpenSSL 1.1.0 now uses normal "nmake; nmake install" instead of using
custom "ms\do_ms.bat" script and "ms\nt.mak" makefile. And Configure
now requires --prefix to be absolute, and no longer derives --openssldir
from prefix (so it's specified explicitly). Generated libraries are now
called "libcrypto.lib" and "libssl.lib" instead of "libeay32.lib"
and "ssleay32.lib". Appropriate tests added to support both old and new
variants.
Additionally, openssl/lhash.h now triggers warning C4090 ('function' :
different 'const' qualifiers), so the warning was disabled.
There are lots of C4244 warnings (conversion from 'type1' to 'type2',
possible loss of data), so they were disabled.
The same applies to C4267 warnings (conversion from 'size_t' to 'type',
possible loss of data), most notably - conversion from ngx_str_t.len to
ngx_variable_value_t.len (which is unsigned:28). Additionally, there
is at least one case when it is not possible to fix the warning properly
without introducing win32-specific code: recv() on win32 uses "int len",
while POSIX defines "size_t len".
The ssize_t type now properly defined for 64-bit compilation with MSVC.
Caught by warning C4305 (truncation from '__int64' to 'ssize_t'), on
"cutoff = NGX_MAX_SIZE_T_VALUE / 10" in ngx_atosz()).
Several C4334 warnings (result of 32-bit shift implicitly converted to 64 bits)
were fixed by adding explicit conversions.
Several C4214 warnings (nonstandard extension used: bit field types other
than int) in ngx_http_script.h fixed by changing bit field types from
uintptr_t to unsigned.
Most notably, warning W8012 (comparing signed and unsigned values) reported
in multiple places where an unsigned value of small type (e.g., u_short) is
promoted to an int and compared to an unsigned value.
Warning W8072 (suspicious pointer arithmetic) disabled, it is reported
when we increment base pointer in ngx_shm_alloc().
These types are available with MSVC (at least since 2003, in stddef.h),
all variants of GCC (in stdint.h) and Watcom C. We need to define them
only for Borland C.
This allows to set a different one from command line when needed.
For example, to configure nginx with gcc as a compiler one could
use "make -f misc/GNUmakefile win32 CC=gcc".
This implies ticket key size of 80 bytes instead of previously used 48,
as both HMAC and AES keys are 32 bytes now. When an old 48-byte ticket key
is provided, we fall back to using backward-compatible AES128 encryption.
OpenSSL switched to using AES256 in 1.1.0, and we are providing equivalent
security. While here, order of HMAC and AES keys was reverted to make
the implementation compatible with keys used by OpenSSL with
SSL_CTX_set_tlsext_ticket_keys().
Prodded by Christian Klinger.
The current version of HTTP/1.1 standard allows relative references in
redirects (https://tools.ietf.org/html/rfc7231#section-7.1.2).
Allow this form for redirects generated by nginx by introducing the new
directive absolute_redirect.
SSL version 3.0 can be specified by the client at the record level for
compatibility reasons. Previously, ssl_preread module rejected such
connections, presuming they don't have SNI. Now SSL 3.0 is allowed at
the record level.
The resolver handles SRV requests in two stages. In the first
stage it gets all SRV RRs, and in the second stage it resolves
the names from SRV RRs into addresses.
Previously, if a response to an SRV request was cached, the
queries to resolve names were not limited by a timeout. If a
response to any of these queries was not received, the SRV
request could never complete.
If a response to an SRV request was not cached, and some of the
queries to resolve names timed out, NGX_RESOLVE_TIMEDOUT was
returned instead of successfully resolved addresses.
To fix both issues, resolving of names is now always limited by
a timeout.
Changeset e7cb5deb951d breaks build on CentOS 5 with "dereferencing
type-punned pointer will break strict-aliasing rules" warning. It is
backed out.
Instead, to keep builds with BoringSSL happy, type of the "value"
variable changed to "char *", and an explicit cast added before calling
ngx_parse_http_time().
A bug was introduced by 82efcedb310b that could lead to timing out of
responses or segmentation fault, when accept_mutex was enabled.
The output queue in HTTP/2 can contain frames from different streams.
When the queue is sent, all related write handlers need to be called.
In order to do so, the streams were added to the h2c->posted queue
after handling sent frames. Then this queue was processed in
ngx_http_v2_write_handler().
If accept_mutex is enabled, the event's "ready" flag is set but its
handler is not called immediately. Instead, the event is added to
the ngx_posted_events queue. At the same time in this queue can be
events from upstream connections. Such events can result in sending
output queue before ngx_http_v2_write_handler() is triggered. And
at the time ngx_http_v2_write_handler() is called, the output queue
can be already empty with some streams added to h2c->posted.
But after 82efcedb310b, these streams weren't processed if all frames
have already been sent and the output queue was empty. This might lead
to a situation when a number of streams were get stuck in h2c->posted
queue for a long time. Eventually these streams might get closed by
the send timeout.
In the worst case this might also lead to a segmentation fault, if
already freed stream was left in the h2c->posted queue. This could
happen if one of the streams was terminated but wasn't closed, due to
the HEADERS frame or a partially sent DATA frame left in the output
queue. If this happened the ngx_http_v2_filter_cleanup() handler
removed the stream from the h2c->waiting or h2c->posted queue on
termination stage, before the frame has been sent, and the stream
was again added to the h2c->posted queue after the frame was sent.
In order to fix all these problems and simplify the code, write
events of fake stream connections are now added to ngx_posted_events
instead of using a custom h2c->posted queue.
By default, "map" creates cacheable variables [1]. With this
parameter it creates a non-cacheable variable.
An original idea was to deduce the cacheability of the "map"
variable by checking the cacheability of variables specified
in source and resulting values, but it turned to be too hard.
For example, a cacheable variable can be overridden with the
"set" directive or with the SSI "set" command. Also, keeping
"map" variables cacheable by default is good for performance
reasons. This required adding a new parameter.
[1] Before db699978a33f (1.11.0), the cacheability of the
"map" variable could vary depending on the cacheability of
variables specified in resulting values (ticket #1090).
This is believed to be a bug rather than a feature.