Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured. As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.
Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set. The upstream keepalive module
was modified to follow this.
If nginx was used under OpenVZ and a container with nginx was suspended
and resumed, configuration tests started to fail because of EADDRINUSE
returned from listen() instead of bind():
# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
nginx: configuration file /etc/nginx/nginx.conf test failed
With this change EADDRINUSE errors returned by listen() are handled
similarly to errors returned by bind(), and configuration tests work
fine in the same environment:
# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
More details about OpenVZ suspend/resume bug:
https://bugzilla.openvz.org/show_bug.cgi?id=2470
OCSP responses may contain no nextUpdate. As per RFC 6960, this means
that nextUpdate checks should be bypassed. Handle this gracefully by
using NGX_MAX_TIME_T_VALUE as "valid" in such a case.
The problem was introduced by 6893a1007a7c (1.9.2).
Reported by Matthew Baldwin.
Broken by 6893a1007a7c (1.9.2) during introduction of strict OCSP response
validity checks. As stapling file is expected to be returned unconditionally,
fix is to set its validity to the maximum supported time.
Reported by Faidon Liambotis.
Once upstream is connected, the upstream buffer is allocated. Previously, the
proxy module used the buffer allocation status to check if upstream is
connected. Now it's enough to check the flag.
If the -T option is passed, additionally to configuration test, configuration
files are output to stdout.
In the debug mode, configuration files are kept in memory and can be accessed
using a debugger.
The function is now called ngx_parse_http_time(), and can be used by
any code to parse HTTP-style date and time. In particular, it will be
used for OCSP stapling.
For compatibility, a macro to map ngx_http_parse_time() to the new name
provided for a while.
With this change it's no longer needed to pass -D_GNU_SOURCE manually,
and -D_FILE_OFFSET_BITS=64 is set to use 64-bit off_t.
Note that nginx currently fails to work properly with master process
enabled on GNU Hurd, as fcntl(F_SETOWN) returns EOPNOTSUPP for sockets
as of GNU Hurd 0.6. Additionally, our strerror() preloading doesn't
work well with GNU Hurd, as it uses large numbers for most errors.
When configured, an individual listen socket on a given address is
created for each worker process. This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance. As of now it works on Linux and DragonFly BSD.
Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/. With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.
Based on previous work by Sepherosa Ziehau and Yingqi Lu.
There is no need to set "i" to 0, as it's expected to be 0 assuming
the bindings are properly sorted, and we already rely on this when
explicitly set hport->naddrs to 1. Remaining conditional code is
replaced with identical "hport->naddrs = i + 1".
Identical modifications are done in the mail and stream modules,
in the ngx_mail_optimize_servers() and ngx_stream_optimize_servers()
functions, respectively.
No functional changes.
This may happen if eventfd() returns ENOSYS, notably seen on CentOS 5.4.
Such a failure will now just disable the notification mechanism and let
the callers cope with it, instead of failing to start worker processes.
If thread pools are not configured, this can safely be ignored.
Two mechanisms are implemented to make it possible to store pointers
in shared memory on Windows, in particular on Windows Vista and later
versions with ASLR:
- The ngx_shm_remap() function added to allow remapping of a shared memory
zone to the address originally used for it in the master process. While
important, it doesn't solve the problem by itself as in many cases it's
not possible to use the address because of conflicts with other
allocations.
- We now create mappings at the same address in all processes by starting
mappings at predefined addresses normally unused by newborn processes.
These two mechanisms combined allow to use shared memory on Windows
almost without problems, including reloads.
Based on the patch by Sergey Brester:
http://mailman.nginx.org/pipermail/nginx-devel/2015-April/006836.html
It's now enough to specify proxy_protocol option in one listen directive to
enable it in all servers listening on the same address/port. Previously,
the setting from the first directive was always used.
When client or upstream connection is closed, level-triggered read event
remained active until the end of the session leading to cpu hog. Now the flag
NGX_CLOSE_EVENT is used to unschedule the event.
If a peer was initially skipped due to max_fails, there's no reason
not to try it again if enough time has passed, and the next_upstream
logic is in action.
This also reduces diffs with NGINX Plus.
Similar to ngx_http_file_cache_set_slot(), the last component of file->name
with a fixed length of 10 bytes, as generated in ngx_create_temp_path(), is
used as a source for the names of intermediate subdirectories with each one
taking its own part. Ensure that the sum of specified levels with slashes
fits into the length (ticket #731).
Missing call to X509_STORE_CTX_free when X509_STORE_CTX_init fails.
Missing call to OCSP_CERTID_free when OCSP_request_add0_id fails.
Possible leaks in vary particular scenariis of memory shortage.
This helps to avoid suboptimal behavior when a client waits for a control
frame or more data to increase window size, but the frames have been delayed
in the socket buffer.
The delays can be caused by bad interaction between Nagle's algorithm on
nginx side and delayed ACK on the client side or by TCP_CORK/TCP_NOPUSH
if SPDY was working without SSL and sendfile() was used.
The pushing code is now very similar to ngx_http_set_keepalive().
If any preread body bytes were sent in the first chain, chunk size was
incorrectly added before the whole chain, including header, resulting in
an invalid request sent to upstream. Fixed to properly add chunk size
after the header.
The r->request_body_no_buffering flag was introduced. It instructs
client request body reading code to avoid reading the whole body, and
to call post_handler early instead. The caller should use the
ngx_http_read_unbuffered_request_body() function to read remaining
parts of the body.
Upstream module is now able to use this mode, if configured with
the proxy_request_buffering directive.
If the last header evaluation resulted in an empty header, the e.skip flag
was set and was not reset when we've switched to evaluation of body_values.
This incorrectly resulted in body values being skipped instead of producing
some correct body as set by proxy_set_body. Fix is to properly reset
the e.skip flag.
As the problem only appeared if the last potentially non-empty header
happened to be empty, it only manifested itself if proxy_set_body was used
with proxy_cache.
The SSL_MODE_NO_AUTO_CHAIN mode prevents OpenSSL from automatically
building a certificate chain on the fly if there is no certificate chain
explicitly provided. Before this change, certificates provided via the
ssl_client_certificate and ssl_trusted_certificate directives were
used by OpenSSL to automatically build certificate chains, resulting
in unexpected (and in some cases unneeded) chains being sent to clients.
LibreSSL removed support for export ciphers and a call to
SSL_CTX_set_tmp_rsa_callback() results in an error left in the error
queue. This caused alerts "ignoring stale global SSL error (...called
a function you should not call) while SSL handshaking" on a first connection
in each worker process.
LibreSSL 2.1.1+ started to set SSL_OP_NO_SSLv3 option by default on
new contexts. This makes sure to clear it to make it possible to use SSLv3
with LibreSSL if enabled in nginx config.
Prodded by Kuramoto Eiji.
Example of usage:
error_log memory:16m debug;
This allows to configure debug logging with minimum impact on performance.
It's especially useful when rare crashes are experienced under high load.
The log can be extracted from a coredump using the following gdb script:
set $log = ngx_cycle->log
while $log->writer != ngx_log_memory_writer
set $log = $log->next
end
set $buf = (ngx_log_memory_buf_t *) $log->wdata
dump binary memory debug_log.txt $buf->start $buf->end
Keeping the ready flag in this case might results in missing notification of
broken connection until nginx tried to use it again.
While there, stale comment about stale event was removed since this function
is also can be called directly.
The code that calls sendfile() was cut into a separate function.
This simplifies EINTR processing, yet is needed for the following
changes that add threads support.
In case of filter finalization, r->upstream might be changed during
the ngx_event_pipe() call. Added an argument to preserve it while
calling the ngx_http_upstream_process_request() function.
A request may be already finalized when ngx_http_upstream_finalize_request()
is called, due to filter finalization: after filter finalization upstream
can be finalized via ngx_http_upstream_cleanup(), either from
ngx_http_terminate_request(), or because a new request was initiated
to an upstream. Then the upstream code will see an error returned from
the filter chain and will call the ngx_http_upstream_finalize_request()
function again.
To prevent corruption of various upstream data in this situation, make sure
to do nothing but merely call ngx_http_finalize_request().
Prodded by Yichun Zhang, for details see the thread at
http://nginx.org/pipermail/nginx-devel/2015-February/006539.html.
Previously, connection hung after calling ngx_http_ssl_handshake() with
rev->ready set and no bytes in socket to read. It's possible in at least the
following cases:
- when processing a connection with expired TCP_DEFER_ACCEPT on Linux
- after parsing PROXY protocol header if it arrived in a separate TCP packet
Thanks to James Hamlin.
When replacing a stale cache entry, its last_modified and etag could be
inherited from the old entry if the response code is not 200 or 206. Moreover,
etag could be inherited with any response code if it's missing in the new
response. As a result, the cache entry is left with invalid last_modified or
etag which could lead to broken revalidation.
For example, when a file is deleted from backend, its last_modified is copied to
the new 404 cache entry and is used later for revalidation. Once the old file
appears again with its original timestamp, revalidation succeeds and the cached
404 response is sent to client instead of the file.
The problem appeared with etags in 44b9ab7752e3 (1.7.3) and affected
last_modified in 1573fc7875fa (1.7.9).
Repeatedly calling ngx_http_upstream_add_chash_point() to create
the points array in sorted order, is O(n^2) to the total weight.
This can cause nginx startup and reconfigure to be substantially
delayed. For example, when total weight is 1000, startup takes
5s on a modern laptop.
Replace this with a linear insertion followed by QuickSort and
duplicates removal. Startup for total weight of 1000 reduces to 40ms.
Based on a patch by Wai Keen Woon.
Previously, the Auth-SSL-Verify header with the "NONE" value was always passed
to the auth_http script if verification of client certificates is disabled.
The "ssl_verify_client", "ssl_verify_depth", "ssl_client_certificate",
"ssl_trusted_certificate", and "ssl_crl" directives introduced to control
SSL client certificate verification in mail proxy module.
If there is a certificate, detail of the certificate are passed to
the auth_http script configured via Auth-SSL-Verify, Auth-SSL-Subject,
Auth-SSL-Issuer, Auth-SSL-Serial, Auth-SSL-Fingerprint headers. If
the auth_http_pass_client_cert directive is set, client certificate
in PEM format will be passed in the Auth-SSL-Cert header (urlencoded).
If there is no required certificate provided during an SSL handshake
or certificate verification fails then a protocol-specific error is
returned after the SSL handshake and the connection is closed.
Based on previous work by Sven Peter, Franck Levionnois and Filipe Da Silva.
Initial size as calculated from the number of elements may be bigger
than max_size. If this happens, make sure to set size to max_size.
Reported by Chris West.
Previously, this function checked for connection local address existence
and returned error if it was missing. Now a new address is assigned in this
case making it possible to call this function not only for accepted connections.
It appeared that the NGX_HAVE_AIO_SENDFILE macro was defined regardless of
the "--with-file-aio" configure option and the NGX_HAVE_FILE_AIO macro.
Now they are related.
Additionally, fixed one macro.
This reduces layering violation and simplifies the logic of AIO preread, since
it's now triggered by the send chain function itself without falling back to
the copy filter. The context of AIO operation is now stored per file buffer,
which makes it possible to properly handle cases when multiple buffers come
from different locations, each with its own configuration.
If fastcgi_pass (or any look-alike that doesn't imply a default
port) is specified as an IP literal (as opposed to a hostname),
port absence was not detected at configuration time and could
result in EADDRNOTAVAIL at run time.
Fixed this in such a way that configs like
http {
server {
location / {
fastcgi_pass 127.0.0.1;
}
}
upstream 127.0.0.1 {
server 10.0.0.1:12345;
}
}
still work. That is, port absence check is delayed until after
we make sure there's no explicit upstream with such a name.
The mtx->wait counter was not decremented if we were able to obtain the lock
right after incrementing it. This resulted in unneeded sem_post() calls,
eventually leading to EOVERFLOW errors being logged, "sem_post() failed
while wake shmtx (75: Value too large for defined data type)".
To close the race, mtx->wait is now decremented if we obtain the lock right
after incrementing it in ngx_shmtx_lock(). The result can become -1 if a
concurrent ngx_shmtx_unlock() decrements mtx->wait before the added code does.
However, that only leads to one extra iteration in the next call of
ngx_shmtx_lock().
The use_temp_path http cache feature is now implemented using a separate temp
hierarchy in cache directory. Prefix-based temp files are no longer needed.
If use_temp_path is set to off, a subdirectory "temp" is created in the cache
directory. It's used instead of proxy_temp_path and friends for caching
upstream response.
The trailer.count variable was not initialized if there was a header,
resulting in "sendfile() failed (22: Invalid argument)" alerts on OS X
if the "sendfile" directive was used. The bug was introduced
in 8e903522c17a (1.7.8).
Some parts of code related to handling variants of a resource moved into
a separate function that is called earlier. This allows to use cache file
name as a prefix for temporary file in the following patch.
The original check for NGX_AGAIN was surplus, since the function returns
only NGX_OK or NGX_ERROR. Now it looks similar to other places.
No functional changes.
The configuration handling code has changed to look similar to the proxy_store
directive and friends. This simplifies adding variable support in the following
patch.
No functional changes.
Currently, storing and caching mechanisms cannot work together, and a
configuration error is thrown when the proxy_store and proxy_cache
directives (as well as their friends) are configured on the same level.
But configurations like in the example below were allowed and could result
in critical errors in the error log:
proxy_store on;
location / {
proxy_cache one;
}
Only proxy_store worked in this case.
For more predictable and errorless behavior these directives now prevent
each other from being inherited from the previous level.
This changes internal API related to handling of the "store"
flag in ngx_http_upstream_conf_t. Previously, a non-null value
of "store_lengths" was enough to enable store functionality with
custom path. Now, the "store" flag is also required to be set.
No functional changes.
The proxy_store, fastcgi_store, scgi_store and uwsgi_store were inherited
incorrectly if a directive with variables was defined, and then redefined
to the "on" value, i.e. in configurations like:
proxy_store /data/www$upstream_http_x_store;
location / {
proxy_store on;
}
In the following configuration request was sent to a backend without
URI changed to '/' due to if:
location /proxy-pass-uri {
proxy_pass http://127.0.0.1:8080/;
set $true 1;
if ($true) {
# nothing
}
}
Fix is to inherit conf->location from the location where proxy_pass was
configured, much like it's done with conf->vars.
The proxy_pass directive and other handlers are not expected to be inherited
into nested locations, but there is a special code to inherit upstream
handlers into limit_except blocks, as well as a configuration into if{}
blocks. This caused incorrect behaviour in configurations with nested
locations and limit_except blocks, like this:
location / {
proxy_pass http://u;
location /inner/ {
# no proxy_pass here
limit_except GET {
# nothing
}
}
}
In such a configuration the limit_except block inside "location /inner/"
unexpectedly used proxy_pass defined in "location /", while it shouldn't.
Fix is to avoid inheritance of conf->upstream.upstream (and
conf->proxy_lengths) into locations which don't have noname flag.
Instead of independant inheritance of conf->upstream.upstream (proxy_pass
without variables) and conf->proxy_lengths (proxy_pass with variables)
we now test them both and inherit only if neither is set. Additionally,
SSL context is also inherited only in this case now.
Based on the patch by Alexey Radkov.
RFC7232 says:
The 304 (Not Modified) status code indicates that a conditional GET
or HEAD request has been received and would have resulted in a 200
(OK) response if it were not for the fact that the condition
evaluated to false.
which means that there is no reason to send requests with "If-None-Match"
and/or "If-Modified-Since" headers for responses cached with other status
codes.
Also, sending conditional requests for responses cached with other status
codes could result in a strange behavior, e.g. upstream server returning
304 Not Modified for cached 404 Not Found responses, etc.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
In case of a cache lock timeout and in the aio handler we now call
r->write_event_handler() instead of a connection write handler,
to make sure to run appropriate subrequest. Previous code failed to run
inactive subrequests and hence resulted in suboptimal behaviour, see
report by Yichun Zhang:
http://mailman.nginx.org/pipermail/nginx-devel/2013-October/004435.html
(Infinite hang claimed in the report seems impossible without 3rd party
modules, as subrequests will be eventually woken up by the postpone filter.)
To ensure proper logging make sure to set current_request in all event
handlers, including resolve, ssl handshake, cache lock wait timer and
aio read handlers. A macro ngx_http_set_log_request() introduced to
simplify this.
The alert was introduced in 03ff14058272 (1.5.4), and was triggered on each
post_action invocation.
There is no real need to call header filters in case of post_action,
so return NGX_OK from ngx_http_send_header() if r->post_action is set.
This helps to avoid delays in sending the last chunk of data because
of bad interaction between Nagle's algorithm on nginx side and
delayed ACK on the client side.
Delays could also be caused by TCP_CORK/TCP_NOPUSH if SPDY was
working without SSL and sendfile() was used.
In 954867a2f0a6, we switched to using resolver node as the timer event data.
This broke debug event logging.
Replaced now unused ngx_resolver_ctx_t.ident with ngx_resolver_node_t.ident
so that ngx_event_ident() extracts something sensible when accessing
ngx_resolver_node_t as ngx_connection_t.
In 954867a2f0a6, we switched to using resolver node as the
timer event data, so make sure we do not free resolver node
memory until the corresponding timer is deleted.
There was no real problem since the amount of bytes can be sent is limited by
NGX_SENDFILE_MAXSIZE to less than 2G. But that can be changed in the future
Though ngx_solaris_sendfilev_chain() shouldn't suffer from the problem mentioned
in d1bde5c3c5d2 since currently IOV_MAX on Solaris is 16, but this follows the
change from 3d5717550371 in order to make the code look similar to other systems
and potentially eliminates the problem in the future.
The upstream modules remove and alter a number of client headers
before sending the request to upstream. This set of headers is
smaller or even empty when cache is disabled.
It's still possible that a request in a cache-enabled location is
uncached, for example, if cache entry counter is below min_uses.
In this case it's better to alter a smaller set of headers and
pass more client headers to backend unchanged. One of the benefits
is enabling server-side byte ranges in such requests.
Once this age is reached, the cache lock is discarded and another
request can acquire the lock. Requests which failed to acquire
the lock are not allowed to cache the response.
For further progress a new buffer must be at least two bytes larger than
the remaining unparsed data. One more byte is needed for null-termination
and another one for further progress. Otherwise inflate() fails with
Z_BUF_ERROR.
Instead of collecting a number of the possible SSL_CTX_use_PrivateKey_file()
error codes that becomes more and more difficult with the rising variety of
OpenSSL versions and its derivatives, just continue with the next password.
Multiple passwords in a single ssl_password_file feature was broken after
recent OpenSSL changes (commit 4aac102f75b517bdb56b1bcfd0a856052d559f6e).
Affected OpenSSL releases: 0.9.8zc, 1.0.0o, 1.0.1j and 1.0.2-beta3.
Reported by Piotr Sikora.
Previously, nginx would emit empty values in a header with multiple,
NULL-separated values.
This is forbidden by the SPDY specification, which requires headers to
have either a single (possibly empty) value or multiple, NULL-separated
non-empty values.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
When got multiple upstream IP addresses using DNS resolving, the number of
upstreams tries and the maxinum time spent for these tries were not affected.
This patch fixed it.
Spaces in Accept-Charset, Accept-Encoding, and Accept-Language headers
are now ignored. As per syntax of these headers spaces can only appear
in places where they are optional.
If a variant stored can't be used to respond to a request, the variant
hash is used as a secondary key.
Additionally, if we previously switched to a secondary key, while storing
a response to cache we check if the variant hash still apply. If not, we
switch back to the original key, to handle cases when Vary changes.
To cache responses with Vary, we now calculate hash of headers listed
in Vary, and return the response from cache only if new request headers
match.
As of now, only one variant of the same resource can be stored in cache.
Previous code resulted in transfer stalls when client happened
to read all the data in buffers at once, while all gzip buffers
were exhausted (but ctx->nomem wasn't set). Make sure to call
next body filter at least once per call if there are busy buffers.
Additionally, handling of calls with NULL chain was changed to follow
the same logic, i.e., next body filter is only called with NULL chain
if there are busy buffers. This is expected to fix "output chain is empty"
alerts as reported by some users after c52a761a2029 (1.5.7).
GetQueuedCompletionStatus() document on MSDN says the
following signature:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa364986.aspx
BOOL WINAPI GetQueuedCompletionStatus(
_In_ HANDLE CompletionPort,
_Out_ LPDWORD lpNumberOfBytes,
_Out_ PULONG_PTR lpCompletionKey,
_Out_ LPOVERLAPPED *lpOverlapped,
_In_ DWORD dwMilliseconds
);
In the latest specification, the type of the third argument
(lpCompletionKey) is PULONG_PTR not LPDWORD.
Due to the u->headers_in.last_modified_time not being correctly initialized,
this variable was evaluated to "Thu, 01 Jan 1970 00:00:00 GMT" for responses
cached without the "Last-Modified" header which resulted in subsequent proxy
requests being sent with "If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT"
header.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
Previously, a value of the "send" variable wasn't properly adjusted
in a rare case when syscall was interrupted by a signal. As a result,
these functions could send less data than the limit allows.
The c->sent is reset to 0 on each request by server-side http code,
so do the same on client side. This allows to count number of bytes
sent in a particular request.
One intentional side effect of this change is that key is allowed only
in the first position. Previously, it was possible to specify the key
variable at any position, but that was never documented, and is contrary
with nginx configuration practice for positional parameters.
One intentional side effect of this change is that key is allowed only
in the first position. Previously, it was possible to specify the key
variable at any position, but that was never documented, and is contrary
to nginx configuration practice for positional parameters.
If a syslog daemon is restarted and the unix socket is used, further logging
might stop to work. In case of send error, socket is closed, forcing
a reconnection at the next logging attempt.
The ngx_cycle->log is used when sending the message. This allows to log syslog
send errors in another log.
Logging to syslog after its cleanup handler has been executed was prohibited.
Previously, this was possible from ngx_destroy_pool(), which resulted in error
messages caused by attempts to write into the closed socket.
The "processing" flag is renamed to "busy" to better match its semantics.
Previously, a file buffer start position was reset to the file start.
Now it's reset to the previous file buffer end. This fixes
reinitialization of requests having multiple successive parts of a
single file. Such requests are generated by fastcgi module.
This prevents inappropriate session reuse in unrelated server{}
blocks, while preserving ability to restore sessions on other servers
when using TLS Session Tickets.
Additionally, session context is now set even if there is no session cache
configured. This is needed as it's also used for TLS Session Tickets.
Thanks to Antoine Delignat-Lavaud and Piotr Sikora.
The new directives {proxy,fastcgi,scgi,uwsgi,memcached}_next_upstream_tries
and {proxy,fastcgi,scgi,uwsgi,memcached}_next_upstream_timeout limit
the number of upstreams tried and the maximum time spent for these tries
when searching for a valid upstream.
When memory allocation failed in ngx_http_upstream_cache(), the connection
would be terminated directly in ngx_http_upstream_init_request().
Return a INTERNAL_SERVER_ERROR response instead.
The ngx_init_setproctitle() function, as used on systems without
setproctitle(3), may fail due to memory allocation errors, and
therefore its return code needs to be checked.
Reported by Markus Linnala.
The etag->hash must be set to 0 to avoid an empty ETag header being
returned with the 500 Internal Server Error page after the memory
allocation failure.
Reported by Markus Linnala.
Some of the OpenSSL forks (read: BoringSSL) started removing unused,
no longer necessary and/or not really working bug workarounds along
with the SSL options and defines for them.
Instead of fixing nginx build after each removal, be proactive
and guard use of all SSL options for bug workarounds.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The messages "ngx_slab_alloc() failed: no memory in cache keys zone"
from the file cache slab allocator are suppressed since the allocation
is likely to succeed after the forced expiration of cache nodes.
The second allocation failure is reported.
In theory, this can provide a bit better distribution of latencies.
Also it simplifies the code, since ngx_queue_t is now used instead
of custom implementation.
Previously, a configuration like
location / {
ssi on;
ssi_types *;
set $http_foo "bar";
return 200 '<!--#echo var="http_foo" -->\n';
}
resulted in NULL pointer dereference in ngx_http_get_variable() as
the variable was explicitly added to the variables hash, but its
get_handler wasn't properly set in the hash. Fix is to make sure
that get_handler is properly set by ngx_http_variables_init_vars().
The SPDY module doesn't expect timers can be set on stream events for reasons
other than delaying output. But ngx_http_writer() could add timer on write
event if the delayed flag wasn't set and nginx is waiting for AIO completion.
That could cause delays in sending response over SPDY when file AIO was used.
perl_parse() function expects argv/argc-style argument list,
which according to the C standard must be NULL-terminated,
that is: argv[argc] == NULL.
This change fixes a crash (SIGSEGV) that could happen because
of the buffer overrun during perl module initialization.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
LibreSSL developers decided that LibreSSL is OpenSSL-2.0.0, so tests
for OpenSSL-1.0.2+ are now passing, even though the library doesn't
provide functions that are expected from that version of OpenSSL.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
This change adds support for using BoringSSL as a drop-in replacement
for OpenSSL without adding support for any of the BoringSSL-specific
features.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
This is really just a prerequisite for building against BoringSSL,
which doesn't provide either of those features.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
Timeout may not be set on an upstream connection when we call
ngx_ssl_handshake() in ngx_http_upstream_ssl_init_connection(),
so make sure to arm it if it's not set.
Based on a patch by Yichun Zhang.
The ngx_http_geoip_city_float_variable and
ngx_http_geoip_city_int_variable functions did not always initialize
all variable fields like "not_found", which could lead to empty values
for those corresponding nginx variables randomly.
RFC3986 says that, for consistency, URI producers and normalizers
should use uppercase hexadecimal digits for all percent-encodings.
This is also what modern web browsers and other tools use.
Using lowercase hexadecimal digits makes it harder to interact with
those tools in case when use of the percent-encoded URI is required,
for example when $request_uri is part of the cache key.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
Previously, ngx_http_map_uri_to_path() errors were not checked in
ngx_http_upstream_store(). Moreover, in case of errors temporary
files were not deleted, as u->store was set to 0, preventing cleanup
code in ngx_http_upstream_finalize_request() from removing them. With
this patch, u->store is set to 0 only if there were no errors.
Reported by Feng Gu.
This ensures that debug logging and the $uri variable (if used in
400 Bad Request processing) will not try to access uninitialized
memory.
Found by Sergey Bobrov.
Split SPDY header with multiple, NULL-separated values:
cookie: foo\0bar
into two separate HTTP headers with the same name:
cookie: foo
cookie: bar
Even though the logic for this behavior already existed
in the source code, it doesn't look that it ever worked
and SPDY streams with such headers were simply rejected.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
SSL_SESSION struct is internal part of the OpenSSL library and it's fields
should be accessed via API (when exposed), not directly.
The unfortunate side-effect of this change is that we're losing reference
count that used to be printed at the debug log level, but this seems to be
an acceptable trade-off.
Almost fixes build with -DOPENSSL_NO_SSL_INTERN.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The RSA_generate_key() is marked as deprecated and causes build to
fail. On the other hand, replacement function, RSA_generate_key_ex(),
requires much more code. Since RSA_generate_key() is only needed
for barely usable EXP ciphers, the #ifdef was added instead.
Prodded by Piotr Sikora.
This change is mostly cosmetic, because in practice this callback
is used only for 512-bit RSA keys.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
Previously, <bn.h>, <dh.h>, <rand.h> and <rsa.h> were pulled in
by <engine.h> using OpenSSL's deprecated interface, which meant
that nginx couldn't have been built with -DOPENSSL_NO_DEPRECATED.
Both <x509.h> and <x509v3.h> are pulled in by <ocsp.h>, but we're
calling X509 functions directly, so let's include those as well.
<crypto.h> is pulled in by virtually everything, but we're calling
CRYPTO_add() directly, so let's include it as well.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The ngx_open_dir() function changed to restore name passed to it. This
fixes removing destination directory in dav module, as caught by dav.t.
The ngx_close_dir() function introduced to properly convert errors, as
FindClose() returns 0 on error.
Previously, the NGX_LOG_INFO level was used unconditionally. This is
correct for client SSL connections, but too low for connections to
upstream servers. To resolve this, ngx_connection_error() now used
to log this error, it will select logging level appropriately.
With this change, if an upstream connection is closed during SSL
handshake, it is now properly logged at "error" level.
Previously, nginx closed client connection in cases when a response body
from upstream was needed to be cached or stored but shouldn't be sent to
the client. While this is normal for HTTP, it is unacceptable for SPDY.
Fix is to use instead the p->downstream_error flag to prevent nginx from
sending anything downstream. To make this work, the event pipe code was
modified to properly cache empty responses with the flag set.
The ngx_http_upstream_dummy_handler() must be set regardless of
the read event state. This prevents possible additional call of
ngx_http_upstream_send_request_handler().
The check became meaningless after refactoring in 2a92804f4109.
With the loop currently in place, "current" can't be NULL, hence
the check can be dropped.
Additionally, the local variable "current" was removed to
simplify code, and pool->current now used directly instead.
Found by Coverity (CID 714236).
This isn't really important as configuration testing shortly ends with
a process termination which will free all sockets, though Coverity
complains.
Prodded by Coverity (CID 400872).
Previously, last_modified_time was tested against -1 to check if the
not modified filter should be skipped. Notably, this prevented nginx
from additional If-Modified-Since (et al.) checks on proxied responses.
Such behaviour is suboptimal in some cases though, as checks are always
skipped on responses from a cache with ETag only (without Last-Modified),
resulting in If-None-Match being ignored in such cases. Additionally,
it was not possible to return 412 from the If-Unmodified-Since if last
modification time was not known for some reason.
This change introduces explicit r->disable_not_modified flag instead,
which is set by ngx_http_upstream_process_headers().
Previous code in ngx_http_upstream_send_response() used last modified time
from r->headers_out.last_modified_time after the header filter chain was
already called. At this point, last_modified_time may be already cleared,
e.g., with SSI, resulting in incorrect last modified time stored in a
cache file. Fix is to introduce u->headers_in.last_modified_time instead.
Clearing of the r->headers_out.last_modified_time field if a response
isn't cacheable in ngx_http_upstream_send_response() was introduced
in 3b6afa999c2f, the commit to enable not modified filter for cacheable
responses. It doesn't make sense though, as at this point header was
already sent, and not modified filter was already executed. Therefore,
the line was removed to simplify code.
log->filter ("if" parameter) was uninitialized when the default value
was being used, which would lead to a crash (SIGSEGV) when access_log
directive wasn't specified in the configuration.
Zero-fill the whole structure instead of zeroing fields one-by-one
in order to prevent similar issues in the future.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
Large allocations from a slab pool result in free page blocks being fragmented,
eventually leading to a situation when no further allocation larger than a page
size are possible from the pool. While this isn't a problem for nginx itself,
it is known to be bad for various 3rd party modules. Fix is to merge adjacent
blocks of free pages in the ngx_slab_free_pages() function.
Prodded by Wandenberg Peixoto and Yichun Zhang.
Since the type cast has precedence higher than the bit shift operator,
all values were truncated to 8 bits.
These macros are used to construct header block for SYN_REPLY frame on
platforms with strict alignment requirements. As a result, any response
that contains a header with name or value longer than 255 bytes was
corrupted on such platforms.
Do not taste the last parameter against directory, as otherwise it would
result in the trailing slash being cut from the parameter value.
Notably, this prevents an internal redirect to an empty URI
if the parameter is set to the literal slash:
location / { try_files $uri /; }
Previous workaround to avoid warnings on OS X due to deprecated system
OpenSSL library (introduced in a3870ea96ccd) no longer works, as
the MAC_OS_X_VERSION_MIN_REQUIRED macro is ignored on OS X 10.9
if a compiler used supports __attribute__(availability).
In particular, properly output partial match at the end of a subrequest
response (much like we do at the end of a response), and reset/set the
last_in_chain flag as appropriate.
Reported by KAWAHARA Masashi.
This fixes --with-file-aio support on systems that lack eventfd()
syscall, notably aarch64 Linux.
The syscall(SYS_eventfd) may still be necessary on systems that
have eventfd() syscall in the kernel but lack it in glibc, e.g.
as seen in the current CentOS 5 release.
Missed during introduction of the SMTP pipelining support (04e43d03e153,
1.5.6). Previously, the check wasn't needed as s->buffer was used directly
and the number of arguments didn't matter.
Reported by Svyatoslav Nikolsky.
If response is gzipped we can't recode response, but in case it's not
needed we still can add charset to Content-Type.
The r->ignore_content_encoding is dropped accordingly, charset with gzip_static
now properly works without any special flags.
The ngx_http_map_uri_to_path() function used clcf->regex to detect if
it's working within a location given by a regular expression and have
to replace full URI with alias (instead of a part matching the location
prefix). This is incorrect due to clcf->regex being false in implicit
locations created by if and limit_except.
Fix is to preserve relevant information in clcf->alias instead, by setting
it to NGX_MAX_SIZE_T_VALUE if an alias was specified in a regex location.
Handling of PROXY protocol for SPDY connection is currently implemented as
a SPDY state. And while nginx waiting for PROXY protocol data it continues
to process SPDY connection: initializes zlib context, sends control frames.
- Specification-friendly handling of invalid header block or special headers.
Such errors are not fatal for session and shouldn't lead to connection close;
- Avoid mix of NGX_HTTP_PARSE_INVALID_REQUEST/NGX_HTTP_PARSE_INVALID_HEADER.
The function just calls ngx_http_spdy_state_headers_skip() most of the time.
There was also an attempt of optimization to stop parsing if the client already
closed connection, but it looks strange and unfinished anyway.
Now ngx_http_spdy_state_protocol_error() is able to close stream,
so there is no need in a separate call for this.
Also fixed zero status code in logs for some cases.
The 7022564a9e0e changeset made ineffective workaround from 2464ccebdb52
to avoid NULL pointer dereference with "if". It is now restored by
moving the u->ssl_name initialization after the check.
Found by Coverity (CID 1210408).
Previous code failed to properly restore cf->conf_file in case of
ngx_close_file() errors, potentially resulting in double free of
cf->conf_file->buffer->start.
Found by Coverity (CID 1087507).
While managing big caches it is possible that expiring old cache items
in ngx_http_file_cache_expire() will take a while. Added a check for
ngx_quit / ngx_terminate to make sure cache manager can be terminated
while in ngx_http_file_cache_expire().
The ngx_http_proxy_rewrite_cookie() function expects the value of the
"Set-Cookie" header to be null-terminated, and for headers obtained
from proxied server it is usually true.
Now the ngx_http_proxy_rewrite() function preserves the null character
while rewriting headers.
This fixes accessing memory outside of rewritten value if both the
"proxy_cookie_path" and "proxy_cookie_domain" directives are used in
the same location.
There's a race condition between closing a stream by one endpoint
and sending a WINDOW_UPDATE frame by another. So it would be better
to just skip such frames for unknown streams, like is already done
for the DATA frames.
These directives allow to switch on Server Name Indication (SNI) while
connecting to upstream servers.
By default, proxy_ssl_server_name is currently off (that is, no SNI) and
proxy_ssl_name is set to a host used in the proxy_pass directive.
The SSL_CTX_set_cipher_list() may fail if there are no valid ciphers
specified in proxy_ssl_ciphers / uwsgi_ssl_ciphers, resulting in
SSL context leak.
In theory, ngx_pool_cleanup_add() may fail too, but this case is
intentionally left out for now as it's almost impossible and proper fix
will require changes to http ssl and mail ssl code as well.
This should prevent attempts of using pointer before it was checked, since
all modern compilers are able to spot access to uninitialized variable.
No functional changes.
Previously, an empty frame object was created for an output chain that contains
only sync or flush empty buffers. But since 39d7eef2e332 every DATA frame has
the flush flag set on its last buffer, so there's no need any more in additional
flush buffers in the output queue and they can be skipped.
Note that such flush frames caused an incorrect $body_bytes_sent value.
The SPDY draft 2 specification requires that if an endpoint receives a
control frame for a type it does not recognize, it must ignore the frame.
But the 3 and 3.1 drafts don't seem to declare any behavior for such case.
Then sticking with the previous draft in this matter looks to be right.
But previously, only 8 least significant bits of the type field were
parsed while the rest of 16 bits of the field were checked against zero.
Though there are no known frame types bigger than 255, this resulted in
inconsistency in handling of such frames: they were not recognized as
valid frames at all, and the connection was closed.
If start time is within the track but end time is out of it, error
"end time is out mp4 stts samples" is generated. However it's
better to ignore the error and output the track until its end.
The ngx_cacheline_size may be too low on some platforms, resulting
in unexpected hash build problems (as no collisions are tolerated due
to low bucket_size, and max_size isn't big enough to build a hash without
collisions). These problems aren't fatal anymore but nevertheless
need to be addressed.
The flag allows to suppress "ngx_slab_alloc() failed: no memory" messages
from a slab allocator, e.g., if an LRU expiration is used by a consumer
and allocation failures aren't fatal.
The flag is now used in the SSL session cache code, and in the limit_req
module.
The "ngx_quit" may be reset by the worker thread before it's seen
by a ngx_cache_manager_thread(), resulting in an infinite loop. Make
sure to test ngx_exiting as well.
This brings Cygwin compilation in line with other case-insensitive
systems (notably win32 and OS X) where one can force case sensitivity
using -DNGX_HAVE_CASELESS_FILESYSTEM=0.
Despite introducing start and end crop operations existing log
messages still mostly refer only to start. Logging is improved
to match both cases.
New debug logging is added to track entry count in atoms after
cropping.
Two format type mismatches are fixed as well.
When "start" value is equal to a track duration the request
fails with "time is out mp4 stts" like it did before track
duration check was added. Now such tracks are considered
short and skipped.
The SPDY/3.1 specification requires that the server must respond with
a 400 "Bad request" error if the sum of the data frame payload lengths
does not equal the size of the Content-Length header.
This also fixes "zero size buf in output" alert, that might be triggered
if client sends a greater than zero Content-Length header and closes
stream using the FIN flag with an empty request body.
There are a few cases in ngx_http_spdy_state_read_data() related to error
handling when ngx_http_spdy_state_skip() might be called with an inconsistent
state between *pos and sc->length, that leads to violation of frame layout
parsing and resuted in corruption of spdy connection.
Based on a patch by Xiaochen Wang.
Previously, only one case was checked: if there's more data to parse
in a r->header_in buffer, but the buffer can be filled to the end by
the last parsed entry, so we also need to check that there's no more
data to inflate.
If set, it means that response body is going to be in more than one buffer,
hence only range requests with a single range should be honored.
The flag is now used by mp4 and cacheable upstream responses, thus allowing
range requests of mp4 files with start/end, as well as range processing
on a first request to a not-yet-cached files with proxy_cache.
Notably this makes it possible to play mp4 files (with proxy_cache, or with
mp4 module) on iOS devices, as byte-range support is required by Apple.
Client address specified in the PROXY protocol header is now
saved in the $proxy_protocol_addr variable and can be used in
the realip module.
This is currently not implemented for mail.
Additionally, make sure to check for errors from ngx_http_parse_header_line()
call after joining saved parts. There shouldn't be any errors, though
check may help to catch bugs like missing f->split_parts reset.
Reported by Lucas Molas.
Previously r->header_size was used to store length for a part of
value that represents an individual already parsed HTTP header,
while r->header_end pointed to the end of the whole value.
Instead of storing length of a following name or value as pointer
to a potential end address (r->header_name_end and r->header_end)
that might be overflowed, now r->lowercase_index counter is used
to store remaining length of a following unparsed field.
It also fixes incorrect $body_bytes_sent value if a request is
closed while parsing of the request header. Since r->header_size
is intended for counting header size, thus abusing it for header
parsing purpose was certainly a bad idea.
If something like "error_page 400 @name" is used in a configuration,
a request could be passed to a named location without URI set, and this
in turn might result in segmentation faults or other bad effects
as most of the code assumes URI is set.
With this change nginx will catch such configuration problems in
ngx_http_named_location() and will stop request processing if URI
is not set, returning 500.
If a request is finalized in the first call to the
ngx_http_upstream_process_upgraded() function, e.g., because upstream
server closed the connection for some reason, in the second call
the u->peer.connection pointer will be null, resulting in segmentation
fault.
Fix is to avoid second direct call, and post event instead. This ensures
that ngx_http_upstream_process_upgraded() won't be called again if
a request is finalized.
The fix removes useless stsc entry in result mp4.
If start_sample == n then current stsc entry should be skipped
and the result stsc should start with the next entry.
The reason for that is start_sample starts from 0, not 1.
Previously, upstream's status code was overwritten with
cached response's status code when STALE or REVALIDATED
response was sent to the client.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The size of the priority field is increased by one bit in spdy/3,
and now it's a 3-bit field followed by 5 bits of unused space.
But a shift of these bits hasn't been adjusted in 39d7eef2e332
accordingly.
Linux returns EOPNOTSUPP for non-TCP sockets and ENOPROTOOPT for TCP
sockets, because getsockopt(TCP_FASTOPEN) is not implemented so far.
While there, lower the log level from ALERT to NOTICE to match other
getsockopt() failures.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
If "proxy_pass" is specified with variables, the resulting
hostname is looked up in the list of upstreams defined in
configuration. The search was case-sensitive, as opposed
to the case of "proxy_pass" specified without variables.
If seek position is within the last track chunk
and that chunk is standalone (stsc entry describes only
this chunk) such seek generates stsc seek error. The
problem is that chunk numbers start with 1, not with 0.
Mp4 module does not check movie and track durations when reading
file. Instead it generates errors when track metadata is shorter
than seek position. Now such tracks are skipped and movie duration
check is performed at file read stage.
Mp4 module does not allow seeks after the last key frame. Since
stss atom only contains key frames it's usually shorter than
other track atoms. That leads to stss seek error when seek
position is close to the end of file. The fix outputs empty
stss frame instead of generating error.
Backed out 05a56ebb084a, as it turns out that kernel can return connections
without any delay if syncookies are used. This basically means we can't
assume anything about connections returned with deferred accept set.
To solve original problem the 05a56ebb084a tried to solve, i.e. to don't
wait longer than needed if a connection was accepted after deferred accept
timeout, this patch changes a timeout set with setsockopt(TCP_DEFER_ACCEPT)
to 1 second, unconditionally. This is believed to be enough for speed
improvements, and doesn't imply major changes to timeouts used.
Note that before 2.6.32 connections were dropped after a timeout. Though
it is believed that 1s is still appropriate for kernels before 2.6.32,
as previously tcp_synack_retries controlled the actual timeout and 1s results
in more than 1 minute actual timeout by default.
If there is no SSI context in a given request at a given time,
the $date_local and $date_gmt variables used "%s" format, instead
of "%A, %d-%b-%Y %H:%M:%S %Z" documented as the default and used
if there is SSI module context and timefmt wasn't modified using
the "config" SSI command.
While use of these variables outside of the SSI evaluation isn't strictly
valid, previous behaviour is certainly inconsistent, hence the fix.
Even during execution of a request it is possible that there will be
no session available, notably in case of renegotiation. As a result
logging of $ssl_session_id in some cases caused NULL pointer dereference
after revision 97e3769637a7 (1.5.9). The check added returns an empty
string if there is no session available.
Read event on a client connection might have been disabled during
previous processing, and we at least need to handle events. Calling
ngx_http_upstream_process_upgraded() is a simpliest way to do it.
Notably this change is needed for select, poll and /dev/poll event
methods.
Previous version of this patch was posted here:
http://mailman.nginx.org/pipermail/nginx/2014-January/041839.html
Previously, it used to contain full session serialized instead of just
a session id, making it almost impossible to use the variable in a safe
way.
Thanks to Ivan Ristić.
It simplifies the code and allows easy reuse the same queue pointer to store
streams in various queues with different requirements. Future implementation
of SPDY/3.1 will take advantage of this quality.
The "length" value better corresponds with the specification and reduces
confusion about whether frame's header is included in "size" or not.
Also this change simplifies some parts of code, since in more cases the
length of frame is more useful than its actual size, especially considering
that the size of frame header is constant.
There is no need in separate "free" pointer and like it is for ngx_chain_t
the "next" pointer can be used. But after this change successfully handled
frame should not be accessed, so the frame handling cycle was improved to
store pointer to the next frame before processing.
Also worth noting that initializing "free" pointer to NULL in the original
code was surplus.
That code was based on misunderstanding of spdy specification about
configuration applicability in the SETTINGS frames. The original
interpretation was that configuration is assigned for the whole
SPDY connection, while it is only for the endpoint.
Moreover, the strange thing is that specification forbids multiple
entries in the SETTINGS frame with the same ID even if flags are
different. As a result, Chrome sends two SETTINGS frames: one with
its own configuration, and another one with configuration stored
for a server (when the FLAG_SETTINGS_PERSIST_VALUE flags were used
by the server).
To simplify implementation we refuse to use the persistent settings
feature and thereby avoid all the complexity related with its proper
support.
The "headers" is not a good term, since it is used not only to count
name/value pairs in the HEADERS block but to count SETTINGS entries too.
Moreover, one name/value pair in HEADERS can contain multiple http headers
with the same name.
No functional changes.
While processing a DATA frame, the link to related stream is stored in spdy
connection object as part of connection state. But this stream can be closed
between receiving parts of the frame.
Previously pool->current wasn't moved back to pool, resulting in blocks
not used for further allocations if pool->current was already moved at the
time of ngx_reset_pool(). Additionally, to preserve logic of moving
pool->current, the p->d.failed counters are now properly cleared. While
here, pool->chain is also cleared.
This change is essentially a nop with current code, but generally improves
things.
During the processing of input some control frames can be added to the queue.
And if there were no writing streams at the moment, these control frames might
be left unsent for a long time (or even forever).
This long delay is especially critical for PING replies since a client can
consider connection as broken and then resend exactly the same request over
a new connection, which is not safe in case of non-idempotent HTTP methods.
There is no reason to allocate it from connection pool that more like just
a bug especially since ngx_http_spdy_settings_frame_handler() already uses
sc->pool to free a chain.
The frame->stream pointer should always be initialized for control frames since
the check against it can be performed in ngx_http_spdy_filter_cleanup().
Parameters of ngx_http_spdy_filter_get_shadow() are changed from size_t to off_t
since the last call of the function may get size and offset from the rest of a
file buffer. This fixes possible data loss rightfully complained by MSVC on 32
bits systems where off_t is 8 bytes long while size_t is only 4 bytes.
The other two type casts are needed just to suppress warnings about possible
data loss also complained by MSVC but false positive in these cases.
False positive warning about the "cl" variable may be uninitialized in
the ngx_http_spdy_filter_get_data_frame() call was suppressed.
It is always initialized either in the "while" cycle or in the following
"if" condition since frame_size cannot be zero.
The "delayed" flag always should be set if there are unsent frames,
but this might not be the case if ngx_http_spdy_body_filter() was
called with NULL chain.
As a result, the "send_timeout" timer could be set on a stream in
ngx_http_writer(). And if the timeout occurred before all the stream
data has been sent, then the request was finalized with the "client
timed out" error.
It was used to prevent destroying of request object when there are unsent
frames in queue for the stream. Since it was incremented for each frame
and is only 8 bits long, so it was not very hard to overflow the counter.
Now the stream->queued counter is checked instead.