On Windows, connect() errors are only reported via exceptfds descriptor set
from select(). Previously exceptfds was set to NULL, and connect() errors
were not detected at all, so connects to closed ports were waiting till
a timeout occurred.
Since ongoing connect() means that there will be a write event active,
except descriptor set is copied from the write one. While it is possible
to construct except descriptor set as a concatenation of both read and write
descriptor sets, this looks unneeded.
With this change, connect() errors are properly detected now when using
select(). Note well that it is not possible to detect connect() errors with
WSAPoll() (see https://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/).
WSAPoll() is only available with Windows Vista and newer (and only
available during compilation if _WIN32_WINNT >= 0x0600). To make
sure the code works with Windows XP, we do not redefine _WIN32_WINNT,
but instead load WSAPoll() dynamically if it is not available during
compilation.
Also, sockets are not guaranteed to be small integers on Windows.
So an index array is used instead of NGX_USE_FD_EVENT to map
events to connections.
Previously, select was compiled in by default, but the NGX_HAVE_SELECT
macro was not set, resulting in iocp being used by default unless
the "--with-select_module" configure option was explicitly specified.
Since the iocp module is not finished and does not work properly, this
effectively meant that the "--with-select_module" option was mandatory.
With the change NGX_HAVE_SELECT is properly set, making "--with-select_module"
optional. Accordingly, it is removed from misc/GNUmakefile win32 target.
Previously, the code incorrectly assumed "ngx_event_t *" elements
instead of "struct pollfd".
This is mostly cosmetic change, as this code is never called now.
Previously, when using proxy_upload_rate and proxy_download_rate, the buffer
size for reading from a socket could be reduced as a result of rate limiting.
For connection-oriented protocols this behavior is normal since unread data will
normally be read at the next iteration. But for datagram-oriented protocols
this is not the case, and unread part of the datagram is lost.
Now buffer size is not limited for datagrams. Rate limiting still works in this
case by delaying the next reading event.
A shared connection does not own its file descriptor, which means that
ngx_handle_read_event/ngx_handle_write_event calls should do nothing for it.
Currently the c->shared flag is checked in several places in the stream proxy
module prior to calling these functions. However it was not done everywhere.
Missing checks could lead to calling
ngx_handle_read_event/ngx_handle_write_event on shared connections.
The problem manifested itself when using proxy_upload_rate and resulted in
either duplicate file descriptor error (e.g. with epoll) or incorrect further
udp packet processing (e.g. with kqueue).
The fix is to set and reset the event active flag in a way that prevents
ngx_handle_read_event/ngx_handle_write_event from scheduling socket events.
Previous interface of ngx_open_dir() assumed that passed directory name
has a room for NGX_DIR_MASK at the end (NGX_DIR_MASK_LEN bytes). While all
direct users of ngx_dir_open() followed this interface, this also implied
similar requirements for indirect uses - in particular, via ngx_walk_tree().
Currently none of ngx_walk_tree() uses provides appropriate space, and
fixing this does not look like a right way to go. Instead, ngx_dir_open()
interface was changed to not require any additional space and use
appropriate allocations instead.
If SSL_write_early_data() returned SSL_ERROR_WANT_WRITE, stop further reading
using a newly introduced c->ssl->write_blocked flag, as otherwise this would
result in SSL error "ssl3_write_bytes:bad length". Eventually, normal reading
will be restored by read event posted from successful SSL_write_early_data().
While here, place "SSL_write_early_data: want write" debug on the path.
Previously, if an SRV record was successfully resolved, but all of its A
records failed to resolve, NXDOMAIN was returned to the caller, which is
considered a successful resolve rather than an error. This could result in
losing the result of a previous successful resolve by the caller.
Now NXDOMAIN is only returned if at least one A resolve completed with this
code. Otherwise the error state of the first A resolve is returned.
Previously, unnamed regex captures matched in the parent request, were not
available in a cloned subrequest. Now 3 fields related to unnamed captures
are copied to a cloned subrequest: r->ncaptures, r->captures and
r->captures_data. Since r->captures cannot be changed by either request after
creating a clone, a new flag r->realloc_captures is introduced to force
reallocation of r->captures.
The issue was reported as a proxy_cache_background_update misbehavior in
http://mailman.nginx.org/pipermail/nginx/2018-December/057251.html.
In the past, there were several security issues which resulted in
worker process memory disclosure due to buffers with negative size.
It looks reasonable to check for such buffers in various places,
much like we already check for zero size buffers.
While here, removed "#if 1 / #endif" around zero size buffer checks.
It looks highly unlikely that we'll disable these checks anytime soon.
On 32-bit platforms mp4->buffer_pos might overflow when a large
enough (close to 4 gigabytes) atom is being skipped, resulting in
incorrect memory addesses being read further in the code. In most
cases this results in harmless errors being logged, though may also
result in a segmentation fault if hitting unmapped pages.
To address this, ngx_mp4_atom_next() now only increments mp4->buffer_pos
up to mp4->buffer_end. This ensures that overflow cannot happen.
Variables now do not depend on presence of the HTTP status code in response.
If the corresponding event occurred, variables contain time between request
creation and the event, and "-" otherwise.
Previously, intermediate value of the $upstream_response_time variable held
unix timestamp.
The directive allows to drop binding between a client and existing UDP stream
session after receiving a specified number of packets. First packet from the
same client address and port will start a new session. Old session continues
to exist and will terminate at moment defined by configuration: either after
receiving the expected number of responses, or after timeout, as specified by
the "proxy_responses" and/or "proxy_timeout" directives.
By default, proxy_requests is zero (disabled).
An attack that continuously switches HTTP/2 connection between
idle and active states can result in excessive CPU usage.
This is because when a connection switches to the idle state,
all of its memory pool caches are freed.
This change limits the maximum allowed number of idle state
switches to 10 * http2_max_requests (i.e., 10000 by default).
This limits possible CPU usage in one connection, and also
imposes a limit on the maximum lifetime of a connection.
Initially reported by Gal Goldshtein from F5 Networks.
Fixed uncontrolled memory growth in case peer is flooding us with
some frames (e.g., SETTINGS and PING) and doesn't read data. Fix
is to limit the number of allocated control frames.
Previously there was no validation for the size of a 64-bit atom
in an mp4 file. This could lead to a CPU hog when the size is 0,
or various other problems due to integer underflow when calculating
atom data size, including segmentation fault or worker process
memory disclosure.
Size of a shared memory zones must be at least two pages - one page
for slab allocator internal data, and another page for actual allocations.
Using 8192 instead is wrong, as there are systems with page sizes other
than 4096.
Note well that two pages is usually too low as well. In particular, cache
is likely to use two allocations of different sizes for global structures,
and at least four pages will be needed to properly allocate cache nodes.
Except in a few very special cases, with keys zone of just two pages nginx
won't be able to start. Other uses of shared memory impose a limit
of 8 pages, which provides some room for global allocations. This patch
doesn't try to address this though.
Inspired by ticket #1665.