The previous code only parsed the first answer, without checking its
type, and required a compressed RR name.
The new code checks the RR type, supports responses with multiple
answers, and doesn't require the RR name to be compressed.
This has a side effect in limited support of CNAME. If a response
includes both CNAME and PTR RRs, like when recursion is enabled on
the server, PTR RR is handled.
Full CNAME support in PTR response is not implemented in this change.
Previously, a global server balancer was used to assign the next DNS server to
send a query to. That could lead to a non-uniform distribution of servers per
request. A request could be assigned to the same dead server several times in a
row and wait longer for a valid server or even time out without being processed.
Now each query is sent to all servers sequentially in a circle until a
response is received or timeout expires. Initial server for each request is
still globally balanced.
When several requests were waiting for a response, then after getting
a CNAME response only the last request's context had the name updated.
Contexts of other requests had the wrong name. This name was used by
ngx_resolve_name_done() to find the node to remove the request context
from. When the name was wrong, the request could not be properly
cancelled, its context was freed but stayed linked to the node's waiting
list. This happened e.g. when the first request was aborted or timed
out before the resolving completed. When it completed, this triggered
a use-after-free memory access by calling ctx->handler of already freed
request context. The bug manifests itself by
"could not cancel <name> resolving" alerts in error_log.
When a request was responded with a CNAME, the request context kept
the pointer to the original node's rn->u.cname. If the original node
expired before the resolving timed out or completed with an error,
this would trigger a use-after-free memory access via ctx->name in
ctx->handler().
The fix is to keep ctx->name unmodified. The name from context
is no longer used by ngx_resolve_name_done(). Instead, we now keep
the pointer to resolver node to which this request is linked.
Keeping the original name intact also improves logging.
When several requests were waiting for a response, then after getting
a CNAME response only the last request was properly processed, while
others were left waiting.
If one or more requests were waiting for a response, then after
getting a CNAME response, the timeout event on the first request
remained active, pointing to the wrong node with an empty
rn->waiting list, and that could cause either null pointer
dereference or use-after-free memory access if this timeout
expired.
If several requests were waiting for a response, and the first
request terminated (e.g., due to client closing a connection),
other requests were left without a timeout and could potentially
wait indefinitely.
This is fixed by introducing per-request independent timeouts.
This change also reverts 954867a2f0a6 and 5004210e8c78.
If enabled, workers are bound to available CPUs, each worker to once CPU
in order. If there are more workers than available CPUs, remaining are
bound in a loop, starting again from the first available CPU.
The optional mask parameter defines which CPUs are available for automatic
binding.
In collaboration with Vladimir Homutov.
The code failed to ensure that "s" is within the buffer passed for
parsing when checking for "ms", and this resulted in unexpected errors when
parsing non-null-terminated strings with trailing "m". The bug manifested
itself when the expires directive was used with variables.
Found by Roman Arutyunyan.
The code for displaying version info and configuration info seemed to be
cluttering up the main function. I was finding it hard to read main. This
extracts out all of the logic for displaying version and configuration info
into its own function, thus making main easier to read.
A configuration like
server { server_name .foo^@; }
server { server_name .foo; }
resulted in a segmentation fault during construction of server names hash.
Reported by Markus Linnala.
Found with afl-fuzz.
Iterating through all connections takes a lot of CPU time, especially
with large number of worker connections configured. As a result
nginx processes used to consume CPU time during graceful shutdown.
To mitigate this we now only do a full scan for idle connections when
shutdown signal is received.
Transitions of connections to idle ones are now expected to be
avoided if the ngx_exiting flag is set. The upstream keepalive module
was modified to follow this.
If nginx was used under OpenVZ and a container with nginx was suspended
and resumed, configuration tests started to fail because of EADDRINUSE
returned from listen() instead of bind():
# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] listen() to 0.0.0.0:80, backlog 511 failed (98: Address already in use)
nginx: configuration file /etc/nginx/nginx.conf test failed
With this change EADDRINUSE errors returned by listen() are handled
similarly to errors returned by bind(), and configuration tests work
fine in the same environment:
# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
More details about OpenVZ suspend/resume bug:
https://bugzilla.openvz.org/show_bug.cgi?id=2470
If the -T option is passed, additionally to configuration test, configuration
files are output to stdout.
In the debug mode, configuration files are kept in memory and can be accessed
using a debugger.
The function is now called ngx_parse_http_time(), and can be used by
any code to parse HTTP-style date and time. In particular, it will be
used for OCSP stapling.
For compatibility, a macro to map ngx_http_parse_time() to the new name
provided for a while.
When configured, an individual listen socket on a given address is
created for each worker process. This allows to reduce in-kernel lock
contention on configurations with high accept rates, resulting in better
performance. As of now it works on Linux and DragonFly BSD.
Note that on Linux incoming connection requests are currently tied up
to a specific listen socket, and if some sockets are closed, connection
requests will be reset, see https://lwn.net/Articles/542629/. With
nginx, this may happen if the number of worker processes is reduced.
There is no such problem on DragonFly BSD.
Based on previous work by Sepherosa Ziehau and Yingqi Lu.
Two mechanisms are implemented to make it possible to store pointers
in shared memory on Windows, in particular on Windows Vista and later
versions with ASLR:
- The ngx_shm_remap() function added to allow remapping of a shared memory
zone to the address originally used for it in the master process. While
important, it doesn't solve the problem by itself as in many cases it's
not possible to use the address because of conflicts with other
allocations.
- We now create mappings at the same address in all processes by starting
mappings at predefined addresses normally unused by newborn processes.
These two mechanisms combined allow to use shared memory on Windows
almost without problems, including reloads.
Based on the patch by Sergey Brester:
http://mailman.nginx.org/pipermail/nginx-devel/2015-April/006836.html
Similar to ngx_http_file_cache_set_slot(), the last component of file->name
with a fixed length of 10 bytes, as generated in ngx_create_temp_path(), is
used as a source for the names of intermediate subdirectories with each one
taking its own part. Ensure that the sum of specified levels with slashes
fits into the length (ticket #731).
Example of usage:
error_log memory:16m debug;
This allows to configure debug logging with minimum impact on performance.
It's especially useful when rare crashes are experienced under high load.
The log can be extracted from a coredump using the following gdb script:
set $log = ngx_cycle->log
while $log->writer != ngx_log_memory_writer
set $log = $log->next
end
set $buf = (ngx_log_memory_buf_t *) $log->wdata
dump binary memory debug_log.txt $buf->start $buf->end
Initial size as calculated from the number of elements may be bigger
than max_size. If this happens, make sure to set size to max_size.
Reported by Chris West.
Previously, this function checked for connection local address existence
and returned error if it was missing. Now a new address is assigned in this
case making it possible to call this function not only for accepted connections.