Since introduction of offset handling in ngx_http_upstream_copy_header_line()
in revision 573:58475592100c, the ngx_http_upstream_copy_content_encoding()
function is no longer needed, as its behaviour is exactly equivalent to
ngx_http_upstream_copy_header_line() with appropriate offset. As such,
the ngx_http_upstream_copy_content_encoding() function was removed.
Further, the u->headers_in.content_encoding field is not used anywhere,
so it was removed as well.
Further, Content-Encoding handling no longer depends on NGX_HTTP_GZIP,
as it can be used even without any gzip handling compiled in (for example,
in the charset filter).
Multi headers are now using linked lists instead of arrays. Notably,
the following fields were changed: r->headers_in.cookies (renamed
to r->headers_in.cookie), r->headers_in.x_forwarded_for,
r->headers_out.cache_control, r->headers_out.link,
u->headers_in.cache_control, and u->headers_in.cookies (renamed to
u->headers_in.set_cookie).
The r->headers_in.cookies and u->headers_in.cookies fields were renamed
to r->headers_in.cookie and u->headers_in.set_cookie to match header names.
The ngx_http_parse_multi_header_lines() and ngx_http_parse_set_cookie_lines()
functions were changed accordingly.
With this change, multi headers are now essentially equivalent to normal
headers, and following changes will further make them equivalent.
Previously, $http_*, $sent_http_*, $sent_trailer_*, $upstream_http_*,
and $upstream_trailer_* variables returned only the first header (with
a few specially handled exceptions: $http_cookie, $http_x_forwarded_for,
$sent_http_cache_control, $sent_http_link).
With this change, all headers are returned, combined together. For
example, $http_foo variable will be "a, b" if there are "Foo: a" and
"Foo: b" headers in the request.
Note that $upstream_http_set_cookie will also return all "Set-Cookie"
headers (ticket #1843), though this might not be what one wants, since
the "Set-Cookie" header does not follow the list syntax (see RFC 7230,
section 3.2.2).
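The combining of repeated headers described above can be sketched as
follows. This is a minimal standalone model, not the nginx code: the
header_t type and the join_header_values() function are hypothetical
names, and the only assumption taken from the text is that headers with
the same name form a linked list and are joined with ", ".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a header entry; only the fields needed here. */
typedef struct header_s  header_t;

struct header_s {
    const char  *value;
    header_t    *next;    /* next header with the same name, or NULL */
};

/*
 * Join all values in the list with ", " into dst (capacity cap, including
 * the terminating NUL).  Returns the resulting length, or (size_t) -1 if
 * dst is too small.  Space for a separator is reserved for every element,
 * so the size check is slightly conservative.
 */
static size_t
join_header_values(const header_t *h, char *dst, size_t cap)
{
    size_t  len = 0, n;

    if (cap == 0) {
        return (size_t) -1;
    }

    for ( /* void */ ; h != NULL; h = h->next) {
        n = strlen(h->value);

        if (len + n + 2 >= cap) {
            return (size_t) -1;
        }

        memcpy(dst + len, h->value, n);
        len += n;

        if (h->next != NULL) {
            dst[len++] = ',';
            dst[len++] = ' ';
        }
    }

    dst[len] = '\0';

    return len;
}
```

With "Foo: a" and "Foo: b" linked in this order, the function produces
"a, b", matching the $http_foo example above.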
With sendfile() in threads ("aio threads; sendfile on;"), client connection
can block on writing, waiting for sendfile() to complete. In HTTP/2 this
might result in a request hang, since an attempt to continue processing
in thread event handler will call request's write event handler, which
is usually stopped by ngx_http_v2_send_chain(): it does nothing if there
are no additional data and stream->queued is set. Further, HTTP/2 resets
stream's c->write->ready to 0 if writing blocks, so just fixing
ngx_http_v2_send_chain() is not enough.
Can be reproduced with test suite on Linux with:
TEST_NGINX_GLOBALS_HTTP="aio threads; sendfile on;" prove h2*.t
The following tests currently fail: h2_keepalive.t, h2_priority.t,
h2_proxy_max_temp_file_size.t, h2.t, h2_trailers.t.
Similarly, sendfile() with AIO preloading on FreeBSD can block as well,
with similar results. This is, however, harder to reproduce, especially
on modern FreeBSD systems, since sendfile() usually does not return EBUSY.
Fix is to modify ngx_http_v2_send_chain() so it actually tries to send
data to the main connection when called, and to make sure that
c->write->ready is set by the relevant event handlers.
With sendfile in threads, "task already active" alerts might appear in logs
if a write event happens on the main HTTP/2 connection, triggering a sendfile
in threads while another thread operation is already running. Observed
with "aio threads; aio_write on; sendfile on;" and with thread event handlers
modified to post a write event to the main HTTP/2 connection (though it can
happen without any modifications).
Similarly, sendfile() with AIO preloading on FreeBSD can trigger duplicate
aio operation, resulting in "second aio post" alerts. This is, however,
harder to reproduce, especially on modern FreeBSD systems, since sendfile()
usually does not return EBUSY.
Fix is to avoid starting a sendfile operation if other thread operation
is active by checking r->aio in the thread handler (and, similarly, in
aio preload handler). The added check also makes duplicate calls protection
redundant, so it is removed.
Previously, connections to upstream servers used sendfile() if it was
enabled, but never honored sendfile_max_chunk. This might result
in worker monopolization for a long time if large request bodies
are allowed.
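The effect of honoring a per-call limit can be sketched as follows. This
is a simplified model with hypothetical function names, not the nginx
code. It assumes that a limit of 0 means "no limit" and that the worker
returns to the event loop between send calls.

```c
#include <stddef.h>

/* Bytes to pass to one send call; max_chunk == 0 is treated as no limit. */
static size_t
limit_chunk(size_t remaining, size_t max_chunk)
{
    if (max_chunk != 0 && remaining > max_chunk) {
        return max_chunk;
    }

    return remaining;
}

/* Number of send calls needed to send a body of the given size. */
static size_t
count_send_calls(size_t body_size, size_t max_chunk)
{
    size_t  calls = 0;

    while (body_size > 0) {
        body_size -= limit_chunk(body_size, max_chunk);
        calls++;
    }

    return calls;
}
```

Without a limit, a large request body is handed over in a single call.
With a limit, the same body is split into several calls, so other events
in the worker can be handled in between.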
Requires OpenSSL 3.0 compiled with "enable-ktls" option. Further, KTLS
needs to be enabled in kernel, and in OpenSSL, either via OpenSSL
configuration file or with "ssl_conf_command Options KTLS;" in nginx
configuration.
On FreeBSD, kernel TLS is available starting with FreeBSD 13.0, and
can be enabled with "sysctl kern.ipc.tls.enable=1" and "kldload ktls_ocf"
to load a software backend, see man ktls(4) for details.
On Linux, kernel TLS is available starting with kernel 4.13 (at least 5.2
is recommended), and needs kernel compiled with CONFIG_TLS=y (with
CONFIG_TLS=m, which is used at least on Ubuntu 21.04 by default,
the tls module needs to be loaded with "modprobe tls").
With SSL it is possible that an established connection is ready for
reading after the handshake. Further, events might be already disabled
in case of level-triggered event methods. If this happens and
ngx_http_upstream_send_request() blocks waiting for some data from
the upstream, such as flow control in case of gRPC, the connection
will time out due to no read events on the upstream connection.
Fix is to explicitly check the c->read->ready flag if sending request
blocks and post a read event if it is set.
Note that while it is possible to modify ngx_ssl_handshake() to keep
read events active, this won't completely resolve the issue, since
there can be data already received during the SSL handshake
(see 573bd30e46b4).
In limit_req, auth_delay, and upstream code to check for broken
connections, tests for possible connection close by the client
did not work if the connection was already closed when relevant
event handler was set. This happened because there were no additional
events in case of edge-triggered event methods, and read events
were disabled in case of level-triggered ones.
Fix is to explicitly post a read event if the c->read->ready flag
is set.
For new data to be reported with eventport on Solaris,
ngx_handle_read_event() needs to be called after reading response
headers. To do so, ngx_http_upstream_process_non_buffered_upstream()
is now called unconditionally if there are no prepread data. This
won't cause any read() syscalls as long as upstream connection
is not ready for reading (c->read->ready is not set), but will result
in proper handling of all events.
After 7675:9afa45068b8f and 7678:bffcc5af1d72 (1.19.1), during non-buffered
simple proxying, responses with extra data might result in zero size buffers
being generated and "zero size buf" alerts in writer. This bug is similar
to the one with FastCGI proxying fixed in 7689:da8d758aabeb.
In non-buffered mode, normally the filter function is not called if
u->length is already 0, since u->length is checked after each call of
the filter function. There is a case when this can happen though: if
the response length is 0, and there are pre-read response body data left
after reading response headers. As such, a check for u->length is needed
at the start of non-buffered filter functions, similar to the one
for p->length present in buffered filter functions.
Appropriate checks added to the existing non-buffered copy filters
in the upstream (used by scgi and uwsgi proxying) and proxy modules.
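The check described above can be sketched as follows. This is a minimal
standalone model, not the actual filter code: upstream_model_t and
non_buffered_filter() are hypothetical names, and the model assumes that
a length of -1 means the response length is unknown.

```c
#include <stddef.h>

/* Hypothetical model of the relevant upstream state. */
typedef struct {
    long long  length;     /* body bytes still expected; -1 if unknown */
    size_t     passed;     /* bytes forwarded to the client */
    size_t     dropped;    /* extra bytes discarded */
} upstream_model_t;

/*
 * Process "bytes" bytes of newly available body data.  Returns the number
 * of bytes actually passed on; anything beyond the expected length is
 * dropped, so no zero size buffers are ever generated.
 */
static size_t
non_buffered_filter(upstream_model_t *u, size_t bytes)
{
    /* the check at the start: nothing more is expected, drop everything */
    if (u->length == 0) {
        u->dropped += bytes;
        return 0;
    }

    if (u->length > 0 && bytes > (size_t) u->length) {
        u->dropped += bytes - (size_t) u->length;
        bytes = (size_t) u->length;
    }

    u->passed += bytes;

    if (u->length > 0) {
        u->length -= (long long) bytes;
    }

    return bytes;
}
```

The first branch covers the case from the text: the response length is 0,
but pre-read body data are left after reading the response headers.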
Previously the stale-if-error extension of the Cache-Control upstream header
triggered the return of a stale response for all error conditions that can be
specified in the proxy_cache_use_stale directive. The list of these errors
includes network/timeout/format errors as well as some HTTP codes like
503, 504, 403, 429 etc. The latter prevented a cache entry from being updated
by a response with any of these HTTP codes during the stale-if-error period.
Now stale-if-error only works for network/timeout/format errors and ignores
the upstream HTTP code. The return of a stale response for certain HTTP codes
is still possible using the proxy_cache_use_stale directive.
This change also applies to the stale-while-revalidate extension of the
Cache-Control header, which triggers stale-if-error if it is missing.
Reported at
http://mailman.nginx.org/pipermail/nginx/2020-July/059723.html.
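The new decision can be sketched as follows. The flag names and the
may_use_stale() function are hypothetical and only model the behaviour
described above: conditions listed in proxy_cache_use_stale always allow
a stale response, while stale-if-error alone covers only
network/timeout/format errors.

```c
/* Hypothetical condition flags; not the actual nginx constants. */
enum {
    COND_ERROR          = 1 << 0,    /* network error */
    COND_TIMEOUT        = 1 << 1,
    COND_INVALID_HEADER = 1 << 2,    /* format error */
    COND_HTTP_503       = 1 << 3,
    COND_HTTP_504       = 1 << 4
};

#define COND_TRANSPORT  (COND_ERROR | COND_TIMEOUT | COND_INVALID_HEADER)

/*
 * configured:     conditions enabled via proxy_cache_use_stale
 * condition:      the condition that has just occurred
 * stale_if_error: non-zero while the stale-if-error period is active
 */
static int
may_use_stale(unsigned configured, unsigned condition, int stale_if_error)
{
    if (configured & condition) {
        return 1;
    }

    if (stale_if_error && (condition & COND_TRANSPORT)) {
        return 1;
    }

    return 0;
}
```

In this model a 503 from the upstream during the stale-if-error period
no longer yields a stale response unless http_503 is configured
explicitly, so the cache entry can be updated.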
Previous behaviour was to pass everything to the client, but this
seems to be suboptimal and causes issues (ticket #1695). Fix is to
drop extra data instead, as it naturally happens in most clients.
Additionally, we now also issue a warning if the response is too
short, and make sure the fact it is truncated is propagated to the
client. The u->error flag is introduced to make it possible to
propagate the error to the client in case of unbuffered proxying.
For responses to HEAD requests there is an exception: we do allow
both responses without body and responses with body matching the
Content-Length header.
Previous behaviour was to pass everything to the client, but this
seems to be suboptimal and causes issues (ticket #1695). Fix is to
drop extra data instead, as it naturally happens in most clients.
This change covers generic buffered and unbuffered filters as used
in the scgi and uwsgi modules. Appropriate input filter init
handlers are provided by the scgi and uwsgi modules to set corresponding
lengths.
Note that for responses to HEAD requests there is an exception:
we do allow any response length. This is because responses to HEAD
requests might be actual full responses, and it is up to nginx
to remove the response body. If caching is enabled, only full
responses matching the Content-Length header will be cached
(see b779728b180c).
With level-triggered event methods it is important to specify
the NGX_CLOSE_EVENT flag to ngx_handle_read_event(), otherwise
the event won't be removed, resulting in CPU hog.
Reported by Patrick Wollgast.
In case of filter finalization, essential request fields like r->uri,
r->args etc could be changed, which affected the cache update subrequest.
Also, after filter finalization r->cache could be set to NULL, leading to
null pointer dereference in ngx_http_upstream_cache_background_update().
The fix is to create background cache update subrequest before sending the
cached response.
Since initial introduction in 1aeaae6e9446 (1.11.10) background cache update
subrequest was created after sending the cached response because otherwise it
blocked the parent request output. In 9552758a786e (1.13.1) background
subrequests were introduced to eliminate the delay before sending the final
part of the cached response. This also made it possible to create the
background cache update subrequest before sending the response.
Note that creating the subrequest earlier does not change the fact that in case
of filter finalization the background cache update subrequest will likely not
have enough time to successfully update the cache entry. Filter finalization
leads to the main request termination as soon as the current iteration of
request processing is complete.
Variables now do not depend on presence of the HTTP status code in response.
If the corresponding event occurred, variables contain time between request
creation and the event, and "-" otherwise.
Previously, intermediate value of the $upstream_response_time variable held
unix timestamp.
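The resulting variable value can be sketched as follows. This is an
illustration only: format_elapsed() is a hypothetical helper, times are
in milliseconds, a negative event time is used to model "the event did
not occur", and the "seconds.milliseconds" output format is an
assumption.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the time between request creation (start_ms) and the event
 * (event_ms) into dst, or "-" if the event did not occur (event_ms < 0).
 * Returns the value returned by snprintf().
 */
static int
format_elapsed(char *dst, size_t cap, long long start_ms, long long event_ms)
{
    long long  ms;

    if (event_ms < 0) {
        return snprintf(dst, cap, "-");
    }

    ms = event_ms - start_ms;

    if (ms < 0) {
        ms = 0;
    }

    return snprintf(dst, cap, "%lld.%03lld", ms / 1000, ms % 1000);
}
```

Note that the value is always a duration. An absolute timestamp is never
exposed, unlike the intermediate value mentioned above.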
The problem does not manifest itself currently, because in case of
non-buffered reading, chain link created by u->create_request method
consists of a single element.
Found by PVS-Studio.
The directive configures maximum number of requests allowed on
a connection kept in the cache. Once a connection reaches the number
of requests configured, it is no longer saved to the cache.
The default is 100.
Much like keepalive_requests for client connections, this is mostly
a safeguard to make sure connections are closed periodically and the
memory allocated from the connection pool is freed.
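The check can be sketched as follows. The conn_model_t type and the
may_cache_connection() function are hypothetical names, and the sketch
assumes the request counter is incremented before the check is made.

```c
/* Hypothetical model of an upstream connection. */
typedef struct {
    unsigned  requests;    /* requests already served on this connection */
} conn_model_t;

/*
 * Returns 1 if the connection may be saved to the keepalive cache,
 * 0 if it has reached the configured number of requests and must be
 * closed instead.
 */
static int
may_cache_connection(const conn_model_t *c, unsigned max_requests)
{
    return c->requests < max_requests;
}
```

With the default of 100, a connection is saved after its 99th request
and closed after its 100th.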
In TLSv1.3, NewSessionTicket messages arrive after the handshake and
can come at any time. Therefore we use a callback to save the session
when we know about it. This approach works for < TLSv1.3 as well.
The callback function is set once per location on merge phase.
Since SSL_get_session() in BoringSSL returns an unresumable session for
TLSv1.3, peer save_session() methods have been updated as well to use a
session supplied within the callback. To preserve API, the session is
cached in c->ssl->session. It is preferably accessed in save_session()
methods by ngx_ssl_get_session() and ngx_ssl_get0_session() wrappers.
With gRPC it is possible that a request sending is blocked due to flow
control. Moreover, further sending might be only allowed once the
backend sees all the data we've already sent. With such a backend
it is required to clear the TCP_NOPUSH socket option to make sure all
the data we've sent are actually delivered to the backend.
As such, we now clear TCP_NOPUSH in ngx_http_upstream_send_request()
also on NGX_AGAIN if c->write->ready is set. This fixes a test (which
waits for all the 64k bytes as per initial window before allowing more
bytes) with sendfile enabled when the body was written to a file
in a different context.
Now tcp_nopush on peer connections is disabled if it is disabled on
the client connection, similar to how we handle c->sendfile. Previously,
tcp_nopush was always used on upstream connections, regardless of
the "tcp_nopush" directive.
With u->conf->preserve_output set the request body file might be used
after the response header is sent, so avoid cleaning it. (Normally
this is not a problem as u->conf->preserve_output is only set with
r->request_body_no_buffering, but the request body might be already
written to a file in a different context.)
Previously, ngx_http_upstream_process_header() might be called after
we've finished reading response headers and switched to a different read
event handler, leading to errors with gRPC proxying. Additionally,
the u->conf->read_timeout timer might be re-armed during reading response
headers (while this is expected to be a single timeout on reading
the whole response header).
Previously, ngx_http_upstream_test_next() used an outdated condition on
whether it will be possible to switch to a different server or not. It
did not take into account restrictions on non-idempotent requests, requests
with non-buffered request body, and the next upstream timeout.
For such requests, switching to the next upstream server was rejected
later in ngx_http_upstream_next(), resulting in nginx own error page
being returned instead of the original upstream response.
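The restrictions listed above can be sketched as a single predicate.
This is a simplified model, not the actual condition used in the code:
next_model_t, its fields, and can_try_next() are hypothetical names, and
the exact order and form of the checks are assumptions.

```c
/* Hypothetical model of the state relevant to switching servers. */
typedef struct {
    unsigned   tries_left;            /* remaining upstream servers to try */
    int        request_sent;          /* request was already sent */
    int        idempotent;            /* request method is idempotent */
    int        allow_non_idempotent;  /* retries of such requests allowed */
    int        body_no_buffering;     /* non-buffered request body */
    long long  timeout_ms;            /* next upstream timeout, 0: none */
    long long  elapsed_ms;            /* time spent on the request so far */
} next_model_t;

/* Returns 1 if switching to the next server is possible, 0 otherwise. */
static int
can_try_next(const next_model_t *s)
{
    if (s->tries_left == 0) {
        return 0;
    }

    if (s->request_sent && !s->idempotent && !s->allow_non_idempotent) {
        return 0;
    }

    if (s->request_sent && s->body_no_buffering) {
        return 0;
    }

    if (s->timeout_ms != 0 && s->elapsed_ms >= s->timeout_ms) {
        return 0;
    }

    return 1;
}
```

Using the same predicate both when testing the response and when actually
switching servers avoids the mismatch described above: if switching is
not possible, the original upstream response is returned.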
The flag can be used to continue sending request body even after we've
got a response from the backend. In particular, this is needed for gRPC
proxying of bidirectional streaming RPCs, and also to send control frames
in other forms of RPCs.
The flag indicates whether last ngx_output_chain() returned NGX_AGAIN
or not. If the flag is set, we arm the u->conf->send_timeout timer.
The flag complements the c->write->ready test, and makes it possible to
stop sending the request body in an output filter due to protocol-specific
flow control.
Basic trailer headers support allows one to access response trailers
via the $upstream_trailer_* variables.
Additionally, the u->conf->pass_trailers flag was introduced. When the
flag is set, trailer headers from the upstream response are passed to
the client. Like normal headers, trailer headers will be hidden
if present in u->conf->hide_headers_hash.
Previously, only the upstream response body could be accessed with the
NGX_HTTP_SUBREQUEST_IN_MEMORY feature. Now any response body from a subrequest
can be saved in a memory buffer. It is available as a single buffer in r->out
and the buffer size is configured by the subrequest_output_buffer_size
directive.
Upstream, proxy and fastcgi code used to handle the old-style feature is
removed.
Following ad3f342f14ba046c (1.9.13), it is possible that a request where
header was already sent will be finalized with NGX_HTTP_BAD_GATEWAY,
triggering an attempt to return additional error response and the
"header already sent" alert as a result.
In particular, it is trivial to reproduce the problem with a HEAD request
and caching enabled. With caching enabled nginx will change HEAD to GET
and will set u->pipe->downstream_error to suppress sending the response
body to the client. When a backend-related error occurs (for example,
proxy_read_timeout expires), ngx_http_finalize_upstream_request() will
be called with NGX_HTTP_BAD_GATEWAY. After ad3f342f14ba046c this will
result in ngx_http_finalize_request(NGX_HTTP_BAD_GATEWAY).
Fix is to move u->pipe->downstream_error handling to a later point,
where all special response codes are changed to NGX_ERROR.
Reported by Jan Prachar,
http://mailman.nginx.org/pipermail/nginx-devel/2018-January/010737.html.
The capability is retained automatically in unprivileged worker processes after
changing UID if transparent proxying is enabled at least once in nginx
configuration.
The feature is only available in Linux.
If the data to write is bigger than what the socket can send, and the
remainder is smaller than NGX_SSL_BUFSIZE, then SSL_write() fails with
SSL_ERROR_WANT_WRITE. The remainder of the payload, however, is successfully
copied to the low-level buffer and all the output chain buffers are
flushed. This means that retry logic doesn't work because
ngx_http_upstream_process_non_buffered_request() checks only if there's
anything in the output chain buffers and ignores the fact that something
may be buffered in low-level parts of the stack.
Signed-off-by: Patryk Lesiewicz <patryk@google.com>
Upgrading an upstream connection is usually followed by reading from the
client, which a subrequest is not allowed to do. Moreover, accessing the
header_in request field while processing an upgraded connection ends up with
a null pointer dereference, since the header_in buffer is only created for
the main request.
If proxy_next_upstream includes http_503/http_504, and upstream
returns 503/504, $upstream_status converted this to 502 for any
values except the last one.
The NGX_DONE value returned from ngx_http_upstream_cache_send() indicates
that upstream was already finalized in ngx_http_upstream_process_headers().
It was treated as a generic error which resulted in duplicate finalization.
Handled NGX_HTTP_UPSTREAM_INVALID_HEADER from ngx_http_upstream_cache_send().
Previously, it could return within ngx_http_upstream_finalize_request(), and
since it's below NGX_HTTP_SPECIAL_RESPONSE, a client connection could get
stuck.
When parsing of headers in a cache file fails, already parsed headers
need to be cleared, and protocol state needs to be reinitialized. To do
so, u->request_sent is now set to ensure ngx_http_upstream_reinit() will
be called.
This change complements improvements in 46ddff109e72.
When caching intercepted errors, previous behaviour was to use
proxy_cache_valid times specified, regardless of various cache control
headers present in the response. Fix is to check u->cacheable and
use u->cache->valid_sec as set by various cache control response headers,
similar to how we do this in the normal caching code path.
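The selection of the validity time can be sketched as follows. This is
a simplified model with a hypothetical function name. It assumes that
the header-derived value is an absolute expiry time with 0 meaning
"not set", and that proxy_cache_valid is used as a fallback in that
case, as in the normal caching code path.

```c
/*
 * cacheable:        non-zero if the response may be cached at all
 * header_valid_sec: absolute expiry time from cache control headers,
 *                   0 if the headers did not set one
 * conf_valid:       proxy_cache_valid time for this status, 0 if none
 * now:              current time in seconds
 *
 * Returns the absolute expiry time, or 0 if the response is not cached.
 */
static long long
intercepted_expiry(int cacheable, long long header_valid_sec,
    long long conf_valid, long long now)
{
    if (!cacheable) {
        return 0;
    }

    if (header_valid_sec != 0) {
        return header_valid_sec;
    }

    if (conf_valid != 0) {
        return now + conf_valid;
    }

    return 0;
}
```

In this model, a response marked as not cacheable by its headers is not
cached even if proxy_cache_valid matches the status code.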
If cache file is truncated, it is possible that u->process_header()
will return NGX_AGAIN. Added appropriate handling of this case by
changing the error to NGX_HTTP_UPSTREAM_INVALID_HEADER.
Also, added appropriate logging of this and NGX_HTTP_UPSTREAM_INVALID_HEADER
cases at the "crit" level. Note that this will result in duplicate logging
in case of NGX_HTTP_UPSTREAM_INVALID_HEADER. While this is something better
to avoid, it is considered overkill to implement cache-specific
error logging in u->process_header().
Additionally, u->buffer.start is now reset to be able to receive a new
response, and u->cache_status set to MISS to provide the value in the
$upstream_cache_status variable, much like it happens on other cache file
errors detected by ngx_http_file_cache_read(), instead of HIT, which is
believed to be misleading.