Recent Linux versions started to return EOPNOTSUPP to getsockopt() calls
on unix sockets, resulting in log pollution on binary upgrade. Such errors
are silently ignored now.
The accept_filter and deferred options were not applied to sockets
that were added to configuration during binary upgrade cycle.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The 403 (Forbidden) should not overwrite 401 (Unauthorized) as the
latter should be returned with the WWW-Authenticate header to request
authentication by a client.
The problem could be triggered with 3rd party modules and the "deny"
directive, or with auth_basic and auth_request which returns 403
(in 1.5.4+).
Patch by Jan Marc Hoffmann.
Much like with other headers, "add_header Cache-Control $value;" no longer
results in anything added to response headers if $value evaluates to an
empty string.
In order to support key rollover, ssl_session_ticket_key can be defined
multiple times. The first key will be used to issue and resume Session
Tickets, while the rest will be used only to resume them.
ssl_session_ticket_key session_tickets/current.key;
ssl_session_ticket_key session_tickets/prev-1h.key;
ssl_session_ticket_key session_tickets/prev-2h.key;
Please note that nginx supports Session Tickets even without explicit
configuration of the keys and this feature should be only used in setups
where SSL traffic is distributed across multiple nginx servers.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The timeout set is used by OpenSSL as a hint for clients in TLS Session
Tickets. Previous code resulted in a default timeout (5m) used for TLS
Sessions Tickets if there was no session cache configured.
Prodded by Piotr Sikora.
With this change all such frames will be added in front of the output queue, and
will be sent first. It prevents HOL blocking when response with higher priority
is blocked by response with lower priority in the middle of the queue because
the order of their SYN_REPLY frames cannot be changed.
Proposed by Yury Kirpichev.
If an error occurs in a SPDY connection, the c->error flag is set on every fake
request connection, and its read or write event handler is called, in order to
finalize it. But while waiting for request headers, it was a no-op since the
read event handler had been set to ngx_http_empty_handler().
If an error occurs in a SPDY connection, the c->error flag is set on every fake
request connection, and its read or write event handler is called, in order to
finalize it. But while waiting for a request body, it was a no-op since the
read event handler ngx_http_request_handler() calls r->read_event_handler that
had been set to ngx_http_block_reading().
Basically, this does the following two changes (and corresponding
modifications of related code):
1. Does not reset session buffer unless it's reached it's end, and always
wait for LF to terminate command (even if we detected invalid command).
2. Record command name to make it available for handlers (since now we
can't assume that command starts from s->buffer->start).
A server MUST send greeting before other replies, while before this
change in case of smtp_greeting_delay violation the 220 greeting was
sent after several 503 replies to commands received before greeting,
resulting in protocol synchronization loss. Moreover, further commands
were accepted after the greeting.
While closing a connection isn't strictly RFC compliant (RFC 5321
requires servers to wait for a QUIT before closing a connection), it's
probably good enough for practial uses.
With previous code only part of u->buffer might be emptied in case
of special responses, resulting in partial responses seen by SSI set
in case of simple protocols, or spurious errors like "upstream sent
invalid chunked response" in case of complex ones.
This patch fixes incorrect handling of auto redirect in configurations
like:
location /0 { }
location /a- { }
location /a/ { proxy_pass ... }
With previously used sorting, this resulted in the following locations
tree (as "-" is less than "/"):
"/a-"
"/0" "/a/"
and a request to "/a" didn't match "/a/" with auto_redirect, as it
didn't traverse relevant tree node during lookup (it tested "/a-",
then "/0", and then falled back to null location).
To preserve locale use for non-ASCII characters on case-insensetive
systems, libc's tolower() used.
Location tree was always constructed using case-sensitive comparison, even
on case-insensitive systems. This resulted in incorrect operation if
uppercase letters were used in location directives. Notably, the
following config:
location /a { ... }
location /B { ... }
failed to properly map requests to "/B" into "location /B".
Found by using auth_basic.t from mdounin nginx-tests under valgrind.
==10470== Invalid write of size 1
==10470== at 0x43603D: ngx_crypt_to64 (ngx_crypt.c:168)
==10470== by 0x43648E: ngx_crypt (ngx_crypt.c:153)
==10470== by 0x489D8B: ngx_http_auth_basic_crypt_handler (ngx_http_auth_basic_module.c:297)
==10470== by 0x48A24A: ngx_http_auth_basic_handler (ngx_http_auth_basic_module.c:240)
==10470== by 0x44EAB9: ngx_http_core_access_phase (ngx_http_core_module.c:1121)
==10470== by 0x44A822: ngx_http_core_run_phases (ngx_http_core_module.c:895)
==10470== by 0x44A932: ngx_http_handler (ngx_http_core_module.c:878)
==10470== by 0x455EEF: ngx_http_process_request (ngx_http_request.c:1852)
==10470== by 0x456527: ngx_http_process_request_headers (ngx_http_request.c:1283)
==10470== by 0x456A91: ngx_http_process_request_line (ngx_http_request.c:964)
==10470== by 0x457097: ngx_http_wait_request_handler (ngx_http_request.c:486)
==10470== by 0x4411EE: ngx_epoll_process_events (ngx_epoll_module.c:691)
==10470== Address 0x5866fab is 0 bytes after a block of size 27 alloc'd
==10470== at 0x4A074CD: malloc (vg_replace_malloc.c:236)
==10470== by 0x43B251: ngx_alloc (ngx_alloc.c:22)
==10470== by 0x421B0D: ngx_malloc (ngx_palloc.c:119)
==10470== by 0x421B65: ngx_pnalloc (ngx_palloc.c:147)
==10470== by 0x436368: ngx_crypt (ngx_crypt.c:140)
==10470== by 0x489D8B: ngx_http_auth_basic_crypt_handler (ngx_http_auth_basic_module.c:297)
==10470== by 0x48A24A: ngx_http_auth_basic_handler (ngx_http_auth_basic_module.c:240)
==10470== by 0x44EAB9: ngx_http_core_access_phase (ngx_http_core_module.c:1121)
==10470== by 0x44A822: ngx_http_core_run_phases (ngx_http_core_module.c:895)
==10470== by 0x44A932: ngx_http_handler (ngx_http_core_module.c:878)
==10470== by 0x455EEF: ngx_http_process_request (ngx_http_request.c:1852)
==10470== by 0x456527: ngx_http_process_request_headers (ngx_http_request.c:1283)
==10470==
The same path names with different "data" context should not be allowed.
In particular it rejects configurations like this:
proxy_cache_path /var/cache/ keys_zone=one:10m max_size=1g inactive=5m;
proxy_cache_path /var/cache/ keys_zone=two:20m max_size=4m inactive=30s;
The SSL_CTX_load_verify_locations() may leave errors in the error queue
while returning success (e.g. if there are duplicate certificates in the file
specified), resulting in "ignoring stale global SSL error" alerts later
at runtime.
Casts between pointers and integers produce warnings on size mismatch. To
silence them, cast to (u)intptr_t should be used. Prevoiusly, casts to
ngx_(u)int_t were used in some cases, and several ngx_int_t expressions had
no casts.
As of now it's mostly style as ngx_int_t is defined as intptr_t.
On win32, time_t is 64 bits wide by default, and passing an ngx_msec_int_t
argument for %T format specifier doesn't work. This doesn't manifest itself
on other platforms as time_t and ngx_msec_int_t are usually of the same size.
Several warnings silenced, notably (ngx_socket_t) -1 is now checked
on socket operations instead of -1, as ngx_socket_t is unsigned on win32
and gcc complains on comparison.
With this patch, it's now possible to compile nginx using mingw gcc,
with options we normally compile on win32.
Several false positive warnings silenced, notably W8012 "Comparing
signed and unsigned" (due to u_short values promoted to int), and
W8072 "Suspicious pointer arithmetic" (due to large type values added
to pointers).
With this patch, it's now again possible to compile nginx using bcc32,
with options we normally compile on win32 minus ipv6 and ssl.
Precompiled headers are disabled as they lead to internal compiler errors
with long configure lines. Couple of false positive warnings silenced.
Various win32 typedefs are adjusted to work with Open Watcom C 1.9 headers.
With this patch, it's now again possible to compile nginx using owc386,
with options we normally compile on win32 minus ipv6 and ssl.
It was introduced in Linux 2.6.39, glibc 2.14 and allows to obtain
file descriptors without actually opening files. Thus made it possible
to traverse path with openat() syscalls without the need to have read
permissions for path components. It is effectively emulates O_SEARCH
which is missing on Linux.
O_PATH is used in combination with O_RDONLY. The last one is ignored
if O_PATH is used, but it allows nginx to not fail when it was built on
modern system (i.e. glibc 2.14+) and run with a kernel older than 2.6.39.
Then O_PATH is unknown to the kernel and ignored, while O_RDONLY is used.
Sadly, fstat() is not working with O_PATH descriptors till Linux 3.6.
As a workaround we fallback to fstatat() with the AT_EMPTY_PATH flag
that was introduced at the same time as O_PATH.
It is believed to be better than fallback to HTTP/0.9, because most of
the clients at present time support HTTP/1.0. It allows nginx to return
error response code for them in cases when it fail to parse request line,
and therefore fail to detect client protocol version.
Even if the client does not support HTTP/1.0, this assumption should not
cause any harm, since from the HTTP/0.9 point of view it still a valid
response.
Without u->header_sent set a special response might be generated following
an upgraded connection. The problem appeared in 1ccdda1f37f3 (1.5.3).
Catched by "header already sent" alerts in 1.5.4 after upstream timeouts.
This allows to approach "server_name" values specified below the
"valid_referers" directive when used within the "server_names" parameter, e.g.:
server_name example.org;
valid_referers server_names;
server_name example.com;
As a bonus, this fixes bogus error with "server_names" specified several times.
The server_name regexes are normally compiled for case-sensitive matching.
This violates case-insensitive obligations in the referer module. To fix
this, the host string is converted to lower case before matching.
Previously server_name regex was executed against the whole referer string
after dropping the scheme part. This could led to an improper matching, e.g.:
server_name ~^localhost$;
valid_referers server_names;
Referer: http://localhost/index.html
It was changed to look only at the hostname part.
The server_name regexes are separated into another array to not clash with
regular regexes.
If Content-Length header is not set, and the image size is larger than the
buffer size, client will hang until a timeout occurs.
Now NGX_HTTP_UNSUPPORTED_MEDIA_TYPE is returned immediately.
diff -r d1403de41631 -r 4fae04f332b4
src/http/modules/ngx_http_image_filter_module.c
Missing call to ngx_http_run_posted_request() resulted in a main request hang
if subrequest's ssl handshake with an upstream server failed for some reason.
Reported by Aviram Cohen.
While ngx_get_full_name() might have a bit more descriptive arguments,
the ngx_conf_full_name() is generally easier to use when parsing
configuration and limits exposure of cycle->prefix / cycle->conf_prefix
details.
They refer to the same socket descriptor as our real connection, and
deleting them will stop processing of the connection.
Events of fake connections must not be activated, and if it happened there
is nothing we can do. The whole processing should be terminated as soon as
possible, but it is not obvious how to do this safely.
A quote from SPDY draft 2 specification: "The length of each name and
value must be greater than zero. A receiver of a zero-length name or
value must send a RST_STREAM with code PROTOCOL error."
But it appears that Chrome browser allows sending requests over SPDY/2
connection using JavaScript that contain headers with empty values.
For better compatibility across SPDY clients and to be compliant with
HTTP, such headers are no longer rejected.
Also, it is worth noting that in SPDY draft 3 the statement has been
changed so that it permits empty values for headers.
It was broken in 8e446a2daf48 when the NGX_SENDFILE_LIMIT constant was added
to ngx_linux_sendfile_chain.c having the same name as already defined one in
ngx_linux_config.h.
The newer is needed to overcome a bug in old Linux kernels by limiting the
number of bytes to send per sendfile() syscall. The older is used with
sendfile() on ancient kernels that works with 32-bit offsets only.
One of these renamed to NGX_SENDFILE_MAXSIZE.
When matching a compiled regex against value in the "Referer" header field,
the length was calculated incorrectly for strings that start from "https://".
This might cause matching to fail for regexes with end-of-line anchors.
Patch by Liangbin Li.
In ngx_*_sendfile_chain() when calculating pointer to a first
non-zero sized buf, use "in" as iterator. This fixes processing
of zero sized buf(s) after EINTR. Otherwise function can return
zero sized buf to caller, and later ngx_http_write_filter()
logs warning.
If a relative path is set by variables, then the ngx_conf_full_name()
function was called while processing requests, which causes allocations
from the cycle pool.
A new function that takes pool as an argument was introduced.
This is done by passing AI_ADDRCONFIG to getaddrinfo().
On Linux, setting net.ipv6.conf.all.disable_ipv6 to 1 will now be
respected.
On FreeBSD, AI_ADDRCONFIG filtering is currently implemented by
attempting to create a datagram socket for the corresponding family,
which succeeds even if the system doesn't in fact have any addresses
of that family configured. That is, if the system with IPv6 support
in the kernel doesn't have IPv6 addresses configured, AI_ADDRCONFIG
will filter out IPv6 only inside a jail without IPv6 addresses or
with IPv6 disabled.
Though there are several MIME types commonly used for JavaScript nowadays,
the most common being "text/javascript", "application/javascript", and
currently used by nginx "application/x-javascript", RFC 4329 prefers
"application/javascript".
The "charset_types" directive's default value was adjusted accordingly.
As per perlxs, C preprocessor directives should be at the first
non-whitespace of a line to avoid interpreting them as comments.
#if and #endif are moved so that there are no blank lines before them
to retain them as part of the function body.
Previously, after sending a header we always sent a last buffer and
finalized a request with code 0, even in case of errors. In some cases
this resulted in a loss of ability to detect the response wasn't complete
(e.g. if Content-Length was removed from a response by gzip filter).
This change tries to propogate to a client information that a response
isn't complete in such cases. In particular, with this change we no longer
pretend a returned response is complete if we wasn't able to create
a temporary file.
If an error code suggests the error wasn't fatal, we flush buffered data
and disable keepalive, then finalize request normally. This allows to to
propogate information about a problem to a client, while still sending all
the data we've got from an upstream.
No semantic changes expected, though some checks are done differently.
In particular, the r->cached flag is no longer explicitly checked. Instead,
we relay on u->header_sent not being set if a response is sent from
a cache.
The NGX_HTTP_CLIENT_CLOSED_REQUEST code is allowed to happen after we
started sending a response (much like NGX_HTTP_REQUEST_TIME_OUT), so there
is no need to reset response code to 0 in this case.
Checks were added to both buffered and unbuffered code paths to detect
and complain if a response is incomplete. Appropriate error codes are
now passed to ngx_http_upstream_finalize_request().
With this change in unbuffered mode we now use u->length set to -1 as an
indicator that EOF is allowed per protocol and used to indicate response
end (much like its with p->length in buffered mode). Proxy module was
changed to set u->length to 1 (instead of previously used -1) in case of
chunked transfer encoding used to comply with the above.
That is, by default we assume that response end is signalled by
a connection close. This seems to be better default, and in line
with u->pipe->length behaviour.
Memcached module was modified accordingly.
In case of upstream eof, only responses with u->pipe->length == -1
are now cached/stored. This ensures that unfinished chunked responses
are not cached.
Note well - previously used checks for u->headers_in.content_length_n are
preserved. This provides an additional level of protection if protol data
disagree with Content-Length header provided (e.g., a FastCGI response
is sent with wrong Content-Length, or an incomple SCGI or uwsgi response),
as well as protects from storing of responses to HEAD requests. This should
be reconsidered if we'll consider caching of responses to HEAD requests.
There is no real difference from previously used 0 as NGX_HTTP_* will
become 0 in ngx_http_upstream_finalize_request(), but the change
preserves information about a timeout a bit longer. Previous use of
ETIMEDOUT in one place was just wrong.
Note well that with cacheable responses there will be a difference
(code in ngx_http_upstream_finalize_request() will store the error
in cache), though this change doesn't touch cacheable case.
Previously, ngx_http_upstream_finalize_request(0) was used in most
cases after errors. While with current code there is no difference,
use of NGX_ERROR allows to pass a bit more information into
ngx_http_upstream_finalize_request().
In all cases ngx_http_upstream_finalize_request() with NGX_ERROR now used.
Previously used NGX_HTTP_INTERNAL_SERVER_ERROR in the subrequest in memory
case don't cause any harm, but inconsistent with other uses.
With previous code the p->temp_file->offset wasn't adjusted if a temp
file was written by the code in ngx_event_pipe_write_to_downstream()
after an EOF, resulting in cache not being used with empty scgi and uwsgi
responses with Content-Length set to 0.
Fix it to call ngx_event_pipe_write_chain_to_temp_file() there instead
of calling ngx_write_chain_to_temp_file() directly.
Previous code called ngx_http_finalize_request() with rc = 0. This is
ok if a response status was already set, but resulted in "000" being
logged if it wasn't. In particular this happened with limit_req
if a connection was prematurely closed during limit_req delay.
After a failed partial match we now check if there is another partial
match in previously matched substring to fix cases like "aab" in "aaab".
The ctx->saved string is now always sent if it's present on return
from the ngx_http_sub_parse() function (and reset accordingly). This
allows to release parts of previously matched data.
If a pattern was partially matched at a response end, partially matched
string wasn't send. E.g., a response "fo" was truncated to an empty response
if partially mathed by a pattern "foo".
It is possible to send FLAG_FIN in additional empty data frame, even if it is
known from the content-length header that request body is empty. And Firefox
actually behaves like this (see ticket #357).
To simplify code we sacrificed our microoptimization that did not work right
due to missing check in the ngx_http_spdy_state_data() function for rb->buf
set to NULL.
The exsltRegisterAll() needs to be called before XSLT stylesheets
are compiled, else stylesheet compilation hooks will not work. This
change fixes EXSLT Functions extension.
Currently this flag is needed for epoll and rtsig, and though these methods
usually present on different platforms than kqueue, nginx can be compiled to
support all of them.
The call to ngx_sock_ntop() in ngx_connection_local_sockaddr() might be
performed with the uninitialized "len" variable. The fix is to initialize
variable to the size of corresponding socket address type.
The issue was introduced in commit 05ba5bce31e0.
On Linux, sockaddr length is required to process unix socket addresses properly
due to unnamed sockets (which don't have sun_path set at all) and abstract
namespace sockets.
Minimal data length we expect for further calls was calculated incorrectly
if parsing stopped right after parsing chunk size. This might in theory
affect clients and/or backends using LF instead of CRLF.
Patch by Dmitry Popov.
On Linux x32 inclusion of sys/sysctl.h produces an error. As sysctl() is
only used by rtsig event method code, which is legacy and not compiled
in by default on modern linuxes, the sys/sysctl.h file now only included
if rtsig support is enabled.
Based on patch by Serguei I. Ivantsov.
When several "error_log" directives are specified in the same configuration
block, logs are written to all files with a matching log level.
All logs are stored in the singly-linked list that is sorted by log level in
the descending order.
Specific debug levels (NGX_LOG_DEBUG_HTTP,EVENT, etc.) are not supported
if several "error_log" directives are specified. In this case all logs
will use debug level that has largest absolute value.
The cycle->new_log->log_level should only be initialized by ngx_init_cycle()
if no error logs were found in the configuration. This move allows to get rid
of extra initialization in ngx_error_log().
Valgrind complains if we pass uninitialized memory to a syscall:
==36492== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==36492== at 0x6B5E6A: sendmsg (in /usr/lib/system/libsystem_kernel.dylib)
==36492== by 0x10004288E: ngx_signal_worker_processes (ngx_process_cycle.c:527)
==36492== by 0x1000417A7: ngx_master_process_cycle (ngx_process_cycle.c:203)
==36492== by 0x100001F10: main (nginx.c:410)
==36492== Address 0x7fff5fbff71c is on thread 1's stack
Even initialization of all members of the structure passed isn't enough, as
there is padding which still remains uninitialized and results in Valgrind
complaint. Note there is no real problem here as data from uninitialized
memory isn't used.
Valgrind intercepts SIGUSR2 in some cases, and nginx might not be able to
start due to sigaction() failure. If compiled with NGX_VALGRIND defined,
we now ignore the failure of sigaction().
If "stderr" was specified in one of the "error_log" directives,
stderr is not redirected to the first error_log on startup,
configuration reload, and reopening log files.
On win32 stderr was not redirected into a file specified by "error_log"
while reopening files. Fix is to use platform-independent functions to
work with stderr, as already used by ngx_init_cycle() and main() since
rev. d8316f307b6a.
Use of accept mutex on win32 may result in a deadlock if there are multiple
worker_processes configured and the mutex is grabbed by a process which
can't accept connections.
The parameter is mostly identical to http_404, and is expected to
be used in similar situations. The 403 code might be returned by
a backend instead of 404 on initial sync of new directories with rsync.
See here for feature request and additional details:
http://mailman.nginx.org/pipermail/nginx-ru/2013-April/050920.html
It is now a syntax error if tokens passed to a custom configuration
handler are terminated by "{".
The following incorrect configuration is now properly rejected:
map $v $v2 {
a b {
c d {
e f {
}
An invalid memcached reply that started with '\n' could cause
segmentation fault.
An invalid memcached reply "VALUE / 0 2\r?ok\r\nEND\r\n" was
considered as a valid response.
In addition, if memcached reports that the key was not found,
set u->headers_in.content_length_n to 0. This ensures that
ngx_http_memcached_filter() will not be called while previous
code relied on always intercepting 404. Initialization of
ctx->rest was moved to where it belongs.
Due to peer->checked always set since rev. c90801720a0c (1.3.0)
by round-robin and least_conn balancers (ip_hash not affected),
the code in ngx_http_upstream_free_round_robin_peer() function
incorrectly reset peer->fails too often.
Reported by Dmitry Popov,
http://mailman.nginx.org/pipermail/nginx-devel/2013-May/003720.html
Due to a bad argument list, nginx worker would crash (SIGSEGV) while
trying to log the fact that it received OCSP response with "revoked"
or "unknown" certificate status.
While there, fix similar (but non-crashing) error a few lines above.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
There are two significant changes in this patch:
1) The <= 0 comparison is done with a signed type. This fixes the case
of ngx_time() being larger than r->lingering_time.
2) Calculation of r->lingering_time - ngx_time() is now always done
in the ngx_msec_t type. This ensures the calculation is correct
even if time_t is unsigned and differs in size from ngx_msec_t.
Thanks to Lanshun Zhou.
Previously, input pattern was kept only for regular expressions
with named captures, which resulted in error log entries without
input pattern for PCRE errors that occured while processing
regular expressions without them.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
The $proxy_internal_body_length value might change during request lifetime,
notably if proxy_set_body used, and use of a cached value might result in
incorrect upstream requests.
Patch by Lanshun Zhou.
As of now, it allows to better control bandwidth limiting from additional
modules. It is also expected to be used to add variables support to
the limit_rate_after directive.
If nginx was compiled without --with-http_ssl_module, but with some
other module which uses OpenSSL (e.g. --with-mail_ssl_module), insufficient
preprocessor check resulted in build failure. The problem was introduced
by e0a3714a36f8 (1.3.14).
Reported by Roman Arutyunyan.
This is to avoid setting the TCP_NODELAY flag on SPDY socket in
ngx_http_upstream_send_response(). The latter works per request,
but in SPDY case it might affect other streams in connection.
As of 1.3.9, chunked request body may be available with
r->headers_in.content_length_n <= 0. Additionally, request body
may be in multiple buffers even if r->request_body_in_single_buf
was requested.
Dependancy tracking introduced in r5169 were not handled absolute path
names properly. Absolute names might appear in CORE_DEPS if --with-openssl
or --with-pcre configure arguments are used to build OpenSSL/PCRE
libraries.
Additionally, revert part of r5169 to set NGX_INCS from Makefile
variables. Makefile variables have $ngx_include_opt in them, which
might result in wrong include paths being used. As a side effect,
this also restores build with --with-http_perl_module and --without-http
at the same time.
Before 1.3.9 an attempt to read body in a subrequest only caused problems
if body wasn't already read or discarded in a main request. Starting with
1.3.9 it might also cause problems if body was discarded by a main request
before subrequest start.
Fix is to just ignore attempts to read request body in a subrequest, which
looks like right thing to do anyway.
We generate both read and write events if an error event was returned by
port_getn() without POLLIN/POLLOUT, but we should not try to handle inactive
events, they may even have no handler.
Stale write event may happen if read and write events was reported both,
and processing of the read event closed descriptor.
In practice this might result in "sendfilev() failed (134: ..." or
"writev() failed (134: ..." errors when switching to next upstream server.
See report here:
http://mailman.nginx.org/pipermail/nginx/2013-April/038421.html
To avoid further breaks it's now done properly, all the dependencies
are now passed to Makefile.PL. While here, fixed include list passed to
Makefile.PL to use Makefile variables rather than a list expanded during
configure.
Problems with setsockopt(TCP_NODELAY) and setsockopt(TCP_NOPUSH), as well
as sendfile() syscall on Solaris, are specific to UNIX-domain sockets.
Other address families, i.e. AF_INET and AF_INET6, are fine.
On Win32 platforms 0 is used to indicate errors in file operations, so
comparing against -1 is not portable.
This was not much of an issue in patched code, since only ngx_fd_info() test
is actually reachable on Win32 and in worst case it might result in bogus
error log entry.
Patch by Piotr Sikora.
Sorting of upstream servers by their weights is not required by
current balancing algorithms.
This will likely change mapping to backends served by ip_hash
weighted upstreams.
And corresponding variable $connections_waiting was added.
Previously, waiting connections were counted as the difference between
active connections and the sum of reading and writing connections.
That made it impossible to count more than one request in one connection
as reading or writing (as is the case for SPDY).
Also, we no longer count connections in handshake state as waiting.
This should improve behavior under deficiency of connections.
Since SSL handshake usually takes significant amount of time,
we exclude connections from reusable queue during this period
to avoid premature flush of them.
If proxy_pass to a host with dynamic resolution was used to handle
a subrequest, and host resolution failed, the main request wasn't run
till something else happened on the connection. E.g. request to "/zzz"
with the following configuration hanged:
addition_types *;
resolver 8.8.8.8;
location /test {
set $ihost xxx;
proxy_pass http://$ihost;
}
location /zzz {
add_after_body /test;
return 200 "test";
}
Report and original version of the patch by Lanshun Zhou,
http://mailman.nginx.org/pipermail/nginx-devel/2013-March/003476.html.
Code to reuse of r->request_body->buf in upstream module assumes it's
dedicated buffer, hence after 1.3.9 (r4931) it might reuse r->header_in
if client_body_in_file_only was set, resulting in original request
corruption. It is considered to be safer to always create a dedicated
buffer for rb->bufs to avoid such problems.
After introduction of chunked request body handling in 1.3.9 (r4931),
r->request_body->bufs buffers have b->start pointing to original buffer
start (and b->pos pointing to real data of this particular buffer).
While this is ok as per se, it caused bad things (usually original request
headers included before the request body) after reinit of the request
chain in ngx_http_upstream_reinit() while sending the request to a next
upstream server (which used to do b->pos = b->start for each buffer
in the request chain).
Patch by Piotr Sikora.
If c->recv() returns 0 there is no sense in using ngx_socket_errno for
logging, its value meaningless. (The code in question was copied from
ngx_http_keepalive_handler(), but ngx_socket_errno makes sense there as it's
used as a part of ECONNRESET handling, and the c->recv() call is preceeded
by the ngx_set_socket_errno(0) call.)
In r2411 setting of NGX_HTTP_GZIP_BUFFERED in c->buffered was moved from
ngx_http_gzip_filter_deflate_start() to ngx_http_gzip_filter_buffer() since
it was always called first. But in r2543 the "postpone_gzipping" directive
was introduced, and if postponed gzipping is disabled (the default setting),
ngx_http_gzip_filter_buffer() is not called at all.
We must always set NGX_HTTP_GZIP_BUFFERED after the start of compression
since there is always a trailer that is buffered.
There are no known cases when it leads to any problem with current code.
But we already had troubles in upcoming SPDY implementation.
Not only this is useful for the upcoming SPDY support, but it can
also help to improve HTTPS performance by enabling TLS False Start
in Chrome/Chromium browsers [1]. So, we always enable NPN for HTTPS
if it is supported by OpenSSL.
[1] http://www.imperialviolet.org/2012/04/11/falsestart.html
The c->single_connection was intended to be used as lock mechanism
to serialize modifications of request object from several threads
working with client and upstream connections. The flag is redundant
since threads in nginx have never been used that way.
In Linux 2.6.32, TCP_DEFER_ACCEPT was changed to accept connections
after the deferring period is finished without any data available.
(Reading from the socket returns EAGAIN in this case.)
Since in nginx TCP_DEFER_ACCEPT is set to "post_accept_timeout", we
do not need to wait longer if deferred accept returns with no data.
Previously, only the first request in a connection used timeout
value from the "client_header_timeout" directive while reading
header. All subsequent requests used "keepalive_timeout" for
that.
It happened because timeout of the read event was set to the
value of "keepalive_timeout" in ngx_http_set_keepalive(), but
was not removed when the next request arrived.
Previously, we always created an object and logged 400 (Bad Request)
in access log if a client closed connection without sending any data.
Such a connection was counted as "reading".
Since it's common for modern browsers to behave like this, it's no
longer considered an error if a client closes connection without
sending any data, and such a connection will be counted as "waiting".
Now, we do not log 400 (Bad Request) and keep memory footprint as
small as possible.
Previously, it was allocated from a connection pool and
was selectively freed for an idle keepalive connection.
The goal is to put coupled things in one chunk of memory,
and to simplify handling of request objects.
According to RFC 6066, client is not supposed to request a different server
name at the application layer. Server implementations that rely upon these
names being equal must validate that a client did not send a different name
in HTTP request. Current versions of Apache HTTP server always return 400
"Bad Request" in such cases.
There exist implementations however (e.g., SPDY) that rely on being able to
request different host names in one connection. Given this, we only reject
requests with differing host names if verification of client certificates
is enabled in a corresponding server configuration.
An example of configuration that might not work as expected:
server {
listen 433 ssl default;
return 404;
}
server {
listen 433 ssl;
server_name example.org;
ssl_client_certificate org.cert;
ssl_verify_client on;
}
server {
listen 433 ssl;
server_name example.com;
ssl_client_certificate com.cert;
ssl_verify_client on;
}
Previously, a client was able to request example.com by presenting
a certificate for example.org, and vice versa.
Not only this is consistent with a case without SNI, but this also
prevents abusing configurations that assume that the $host variable
is limited to one of the configured names for a server.
An example of potentially unsafe configuration:
server {
listen 443 ssl default_server;
...
}
server {
listen 443;
server_name example.com;
location / {
proxy_pass http://$host;
}
}
Note: it is possible to negotiate "example.com" by SNI, and to request
arbitrary host name that does not exist in the configuration above.
Previously, this was done only after the whole request header
was parsed, and if an error occurred earlier then the request
was processed in the default server (or server chosen by SNI),
while r->headers_in.server might be set to the value from the
Host: header or host from request line.
r->headers_in.server is in turn used for $host variable and
in HTTP redirects if "server_name_in_redirect" is disabled.
Without the change, configurations that rely on this during
error handling are potentially unsafe if SNI is used.
This change also allows to use server specific settings of
"underscores_in_headers", "ignore_invalid_headers", and
"large_client_header_buffers" directives for HTTP requests
and HTTPS requests without SNI.
The request object will not be created until SSL handshake is complete.
This simplifies adding another connection handler that does not need
request object right after handshake (e.g., SPDY).
There are also a few more intentional effects:
- the "client_header_buffer_size" directive will be taken from the
server configuration that was negotiated by SNI;
- SSL handshake errors and timeouts are not logged into access log
as bad requests;
- ngx_ssl_create_connection() is not called until the first byte of
ClientHello message was received. This also decreases memory
consumption if plain HTTP request is sent to SSL socket.
Previously, only the first request in a connection was assigned the
configuration selected by SNI. All subsequent requests initially
used the default server's configuration, ignoring SNI, which was
wrong.
Now all subsequent requests in a connection will initially use the
configuration selected by SNI. This is done by storing a pointer
to configuration in http connection object. It points to default
server's configuration initially, but changed upon receipt of SNI.
(The request's configuration can be further refined when parsing
the request line and Host: header.)
This change was not made specific to SNI as it also allows slightly
faster access to configuration without the request object.
This change helps to decouple ngx_http_ssl_servername() from the request
object.
Note: now we close connection in case of error during server name lookup
for request. Previously, we did so only for HTTP/0.9 requests.
In case multiple "Cache-Control" headers are sent to a client,
multiple values in $sent_http_cache_control were incorrectly
split by a semicolon. Now they are split by a comma.
In case of error in the read event handling we close a connection
by calling ngx_http_close_connection(), that also destroys connection
pool. Thereafter, an attempt to free a buffer (added in r4892) that
was allocated from the pool could cause SIGSEGV and is meaningless
as well (the buffer already freed with the pool).
In case of fully populated SSL session cache with no memory left for
new allocations, ngx_ssl_new_session() will try to expire the oldest
non-expired session and retry, but only in case when slab allocation
fails for "cached_sess", not when slab allocation fails for either
"sess_id" or "id", which can happen for number of reasons and results
in new session not being cached.
Patch fixes this by adding retry logic to "sess_id" & "id" allocations.
Patch by Piotr Sikora.
It was added in r2717 and no longer needed since r2721,
where the termination was added to ngx_shm_alloc() and
ngx_init_zone_pool(). So then it only corrupts error
messages about ivalid zones.
This allows to proxy WebSockets by using configuration like this:
location /chat/ {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
Connection upgrade is allowed as long as it was requested by a client
via the Upgrade request header.
Note: use of {SHA} passwords is discouraged as {SHA} password scheme is
vulnerable to attacks using rainbow tables. Use of {SSHA}, $apr1$ or
crypt() algorithms as supported by OS is recommended instead.
The {SHA} password scheme support is added to avoid the need of changing
the scheme recorded in password files from {SHA} to {SSHA} because such
a change hides security problem with {SHA} passwords.
Patch by Louis Opter, with minor changes.
If fastcgi end request record was split between several network packets,
with fastcgi_keep_conn it was possible that connection was saved in incorrect
state (e.g. with padding bytes not yet read).
Checks for f->padding before state transitions make code hard to follow,
remove them and make sure we always do another loop iteration after
f->state is set to ngx_http_fastcgi_st_padding.
With fastcgi_keep_conn it was possible that connection was closed after
FCGI_STDERR record with zero padding and without any further data read yet.
This happended as f->state was set to ngx_http_fastcgi_st_padding and then
"break" happened, resulting in p->length being set to f->padding, i.e. 0
(which in turn resulted in connection close).
Fix is to make sure we continue the loop after f->state is set.
After introduction of chunked request body reading support in 1.3.9 (r4931),
the rb->bufs wasn't set if request body was fully preread while calling the
ngx_http_read_client_request_body() function.
Reported by Yichun Zhang (agentzh).
Missing calls to ngx_handle_write_event() and ngx_handle_read_event()
resulted in a CPU hog during SSL handshake if an level-triggered event
method (e.g. select) was used.
According to documentation, calling SSL_write() with num=0 bytes to be sent
results in undefined behavior.
We don't currently call ngx_ssl_send_chain() with empty chain and buffer.
This check handles the case of a chain with total data size that is
a multiple of NGX_SSL_BUFSIZE, and with the special buffer at the end.
In practice such cases resulted in premature connection close and critical
error "SSL_write() failed (SSL:)" in the error log.
The "secure_link_secret" directive was always inherited from the outer
configuration level even when "secure_link" and "secure_link_md5" were
specified on the inner level.
A POLLERR signalled by poll() without POLLIN/POLLOUT, as seen on
Linux, would generate both read and write events, but there's no
write event handler for resolver events. A fix is to only call
event handler of an active event.
Before the patch if proxy_method was specified at http{} level the code
to add trailing space wasn't executed, resulting in incorrect requests
to upstream.
The "proxy_bind", "fastcgi_bind", "uwsgi_bind", "scgi_bind" and
"memcached_bind" directives are now inherited; inherited value
can be reset by the "off" parameter. Duplicate directives are
now detected. Parameter value can now contain variables.
Upstreams created by "proxy_pass" with IP address and no port were
broken in 1.3.10, by not initializing port in u->sockaddr.
API change: ngx_parse_url() was modified to always initialize port
(in u->sockaddr and in u->port), even for the u->no_resolve case;
ngx_http_upstream() and ngx_http_upstream_add() were adopted.
The patch saves one EC_KEY_generate_key() call per server{} block by
informing OpenSSL about SSL_OP_SINGLE_ECDH_USE we are going to use before
the SSL_CTX_set_tmp_ecdh() call.
For a configuration file with 10k simple server{} blocks with SSL enabled
this change reduces startup time from 18s to 5s on a slow test box here.
Uninitialized pointer may result in arbitrary segfaults if access_log is used
without buffer and without variables in file path.
Patch by Tatsuhiko Kubo (ticket #268).
Previously, "default" was equivalent to specifying 0.0.0.0/0, now
it's equivalent to specifying both 0.0.0.0/0 and ::/0 (if support
for IPv6 is enabled) with the same value.
The code refactored in a way to call custom handler that can do appropriate
cleanup work (if any), like flushing buffers, finishing compress streams,
finalizing connections to log daemon, etc..
Previously a new buffer was allocated for every "access_log" directive with the
same file path and "buffer=" parameters, while only one buffer per file is used.
The crypt_r() function returns NULL on errors, check it explicitly instead
of assuming errno will remain 0 if there are no errors (per POSIX, the
setting of errno after a successful call to a function is unspecified
unless the description of that function specifies that errno shall not
be modified).
Additionally, dropped unneeded ngx_set_errno(0) and fixed error handling
of memory allocation after normal crypt(), which was inapropriate and
resulted in null pointer dereference on allocation failures.
Configurations like
location /i/ {
image_filter resize 200 200;
image_filter rotate 180;
location /i/foo/ {
image_filter resize 200 200;
}
}
resulted in rotation incorrectly applied in the location /i/foo, without
any way to clear it. Fix is to handle conf->angle/conf->acv consistently
with other filter variables and do not try to inherit them if there are
transformations defined for current location.
The image_filter_jpeg_quality, image_filter_sharpen and "image_filter rotate"
were inherited incorrectly if a directive with variables was defined, and
then redefined to a literal value, i.e. in configurations like
image_filter_jpeg_quality $arg_q;
location / {
image_filter_jpeg_quality 50;
}
Patch by Ian Babrou, with minor changes.
This includes "debug_connection", upstreams, "proxy_pass", etc.
(ticket #92)
To preserve compatibility, "listen" specified with a domain name
selects the first IPv4 address, if available. If not available,
the first IPv6 address will be used (ticket #186).
The URL parsing code is not expected to initialize port from default port
when in "no_resolve" mode. This got broken in r4671 for the case of IPv6
literals.
The ngx_write_fd() and ngx_read_fd() functions return -1 in case of error,
so the incorrect comparison with NGX_FILE_ERROR (which is 0 on windows
platforms) might result in inaccurate error message in the error log.
Also the ngx_errno global variable is being set only if the returned value
is -1.
An incorrect memLevel (lower than 1) might be passed to deflateInit2() if the
"gzip_hash" directive is set to a value less than the value of "gzip_window"
directive. This resulted in "deflateInit2() failed: -2" alert and an empty
reply.
Configuration like
location / {
set $true 1;
if ($true) {
proxy_pass http://backend;
}
if ($true) {
# nothing
}
}
resulted in segmentation fault due to NULL pointer dereference as the
upstream configuration wasn't initialized in an implicit location created
by the last if(), but the r->content_handler was set due to first if().
Instead of committing a suicide by dereferencing a NULL pointer, return
500 (Internal Server Error) in such cases, i.e. if uscf is NULL. Better
fix would be to avoid such cases by fixing the "if" directive handling,
but it's out of scope of this patch.
Prodded by Piotr Sikora.
This will result in alphabetical sorting of included files if
the "include" directive with wildcards is used.
Note that the behaviour is now different from that on Windows, where
alphabetical sorting is not guaranteed for FindFirsFile()/FindNextFile()
(used to be alphabetical on NTFS, but not on FAT).
Approved by Igor Sysoev, prodded by many.
Catched by dav_chunked.t on Solaris. In released versions this might
potentially result in corruption of complex protocol responses if they
were written to disk and there were more distinct buffers than IOV_MAX
in a single write.
If write events are not blocked, an extra write event might happen for
various reasons (e.g. as a result of a http pipelining), resulting in
incorrect body being passed to a post handler.
The problem manifested itself with the dav module only, as this is
the only module which reads the body from a content phase handler (in
contrast to exclusive content handlers like proxy). Additionally, dav
module used to dump core in such situations due to ticket #238.
See reports here:
http://mailman.nginx.org/pipermail/nginx-devel/2012-November/002981.htmlhttp://serverfault.com/questions/449195/nginx-webdav-server-with-auth-request
While discarding chunked request body in some cases after detecting
request body corruption no error was returned, while it was possible
to correctly return 400 Bad Request. If error is detected too late,
make sure to properly close connection.
Additionally, in ngx_http_special_response_handler() don't return body
of 500 Internal Server Error to a client if ngx_http_discard_request_body()
fails, but disable keepalive and continue.
Even if there is no preread data, make sure to always call
ngx_http_discard_request_body_filter() in case of chunked request
body to initialize r->headers_in.content_length_n for later use.
nginx doesn't allow the same shared memory zone to be used for different
purposes, but failed to check this on reconfiguration. If a shared memory
zone was used for another purpose in the new configuration, nginx attempted
to reuse it and crashed.
An attempt to call ngx_handle_read_event() before actually reading
data from a socket might result in read event being disabled, which is
wrong. Catched by body.t test on Solaris.
The r->main->count reference counter was always incremented in
ngx_http_read_client_request_body(), while it is only needs to be
incremented on positive returns.
The $request_body variable was assuming there can't be more than two
buffers. While this is currently true due to request body reading
implementation details, this is not a good thing to depend on and may
change in the future.
It is not about "Method" but a generic message, and is expected to be used
e.g. if specified Transfer-Encoding is not supported. Fixed message to
match RFC 2616.
Additionally, disable keepalive on such errors as we won't be able to read
request body correctly if we don't understand Transfer-Encoding used.
If request body reading happens with different options it's possible
that there will be no r->request_body->temp_file available (or even
no r->request_body available if body was discarded). Return internal
server error in this case instead of committing suicide by dereferencing
a null pointer.
Pending EOF might be reported on both read and write events, whichever
comes first, so check both of them.
Patch by Yichun Zhang (agentzh), slightly modified.
If an upstream block was defined with the only server marked as
"down", e.g.
upstream u {
server 127.0.0.1:8080 down;
}
an attempt was made to contact the server despite the "down" flag.
It is believed that immediate 502 response is better in such a
case, and it's also consistent with what is currently done in case
of multiple servers all marked as "down".
Input filter might free a buffer if there is no data in it, and in case
of first buffer (used for cache header and request header, aka p->buf_to_file)
this resulted in cache corruption. Buffer memory was reused to read upstream
response before headers were written to disk.
Fix is to avoid moving pointers in ngx_event_pipe_add_free_buf() to a buffer
start if we were asked to free a buffer used by p->buf_to_file.
This fixes occasional cache file corruption, usually resulted
in "cache file ... has md5 collision" alerts.
Reported by Anatoli Marinov.
idle connections.
This behaviour is consistent with the ngx_http_set_keepalive() function and it
should decrease memory usage in some cases (especially if epoll/rtsig is used).
This parameter allows to don't require certificate to be signed by
a trusted CA, e.g. if CA certificate isn't known in advance, like in
WebID protocol.
Note that it doesn't add any security unless the certificate is actually
checked to be trusted by some external means (e.g. by a backend).
Patch by Mike Kazantsev, Eric O'Connor.
With the "ssl_stapling_verify" commit build with old OpenSSL libraries
was broken due to incorrect prototype of the ngx_ssl_stapling() function.
One incorrect use of ngx_log_debug() instead of ngx_log_debug2() slipped in
and broke win32 build.
OCSP response verification is now switched off by default to simplify
configuration, and the ssl_stapling_verify allows to switch it on.
Note that for stapling OCSP response verification isn't something required
as it will be done by a client anyway. But doing verification on a server
allows to mitigate some attack vectors, most notably stop an attacker from
presenting some specially crafted data to all site clients.
This is expected to simplify configuration in a common case when OCSP
response is signed by a certificate already present in ssl_certificate
chain. This case won't need any extra trusted certificates.
This will result in better error message in case of incorrect response
from OCSP responder:
... OCSP responder sent invalid "Content-Type" header: "text/plain"
while requesting certificate status, responder: ...
vs.
... d2i_OCSP_RESPONSE() failed (SSL:
error:0D07209B:asn1 encoding routines:ASN1_get_object:too long
error:0D068066:asn1 encoding routines:ASN1_CHECK_TLEN:bad object header
error:0D07803A:asn1 encoding routines:ASN1_ITEM_EX_D2I:nested asn1 error)
while requesting certificate status, responder: ...
This includes the ssl_stapling_responder directive (defaults to OCSP
responder set in certificate's AIA extension).
OCSP response for a given certificate is requested once we get at least
one connection with certificate_status extension in ClientHello, and
certificate status won't be sent in the connection in question. This due
to limitations in the OpenSSL API (certificate status callback is blocking).
Note: SSL_CTX_use_certificate_chain_file() was reimplemented as it doesn't
allow to access the certificate loaded via SSL_CTX.
Very basic version without any OCSP responder query code, assuming valid
DER-encoded OCSP response is present in a ssl_stapling_file configured.
Such file might be produced with openssl like this:
openssl ocsp -issuer root.crt -cert domain.crt -respout domain.staple \
-url http://ocsp.example.com
The directive allows to specify additional trusted Certificate Authority
certificates to be used during certificate verification. In contrast to
ssl_client_certificate DNs of these cerificates aren't sent to a client
during handshake.
Trusted certificates are loaded regardless of the fact whether client
certificates verification is enabled as the same certificates will be
used for OCSP stapling, during construction of an OCSP request and for
verification of an OCSP response.
The same applies to a CRL (which is now always loaded).
The SSL_COMP_get_compression_methods() is only available as an API
function in OpenSSL 0.9.8+, require it explicitly to unbreak build
with OpenSSL 0.9.7.
Previous code used sk_SSL_COMP_delete(ssl_comp_methods, i) while iterating
stack from 0 to n, resulting in removal of only even compression methods.
In real life this change is a nop, as there is only one compression method
which is enabled by default in OpenSSL.
This fixes unwanted/incorrect cpu_affinity use on dead worker processes
respawn. While this is not ideal, it's expected to be better when previous
situation where multiple processes were spawn with identical CPU affinity
set.
Reported by Charles Chen.
With "always" gzip static returns gzipped content in all cases, without
checking if client supports it. It is useful if there are no uncompressed
files on disk anyway.
This directive allows to test desired flag as returned by memcached and
sets Content-Encoding to gzip if one found.
This is reimplementation of patch by Tomash Brechko as available on
http://openhack.ru/. It should be a bit more correct though (at least
I think so). In particular, it doesn't try to detect if we are able to
gunzip data, but instead just sets correct Content-Encoding.
The rbtree used in ngx_http_limit_req_module has two level of keys, the top is
hash, and the next is the value string itself. However, when inserting a new
node, only hash has been set, while the value string has been left empty.
The bug was introduced in r4419 (1.1.14).
Found by Charles Chen.
The "include" directive should be able to include multiple files if
given a filename mask. Fixed this to work for "include" directives
inside the "map" or "types" blocks. The "include" directive inside
the "geo" block is still not fixed.
The preallocation size was calculated incorrectly and was always 8 due to
sizeof(ngx_radix_tree_t) accidentally used instead of sizeof(ngx_radix_node_t).
Previous code incorrectly used ctx->var_values as an array of pointers to
ngx_http_variable_value_t, but the array contains structures, not pointers.
Additionally, ctx->var_values inspection failed to properly set var on
match.
We don't have strong reason to inform about any errors
reported by close() call here, and there are no other things
to do with its return value.
Prodded by Coverity.
The only thing we could potentially do here in case of error
returned is to complain to error log, but we don't have log
structure available here due to interface limitations.
Prodded by Coverity.
If ngx_time_sigsafe_update() updated only ngx_cached_err_log_time, and
then clock was adjusted backwards, the cached_time[slot].sec might
accidentally match current seconds on next ngx_time_update() call,
resulting in various cached times not being updated.
Fix is to clear the cached_time[slot].sec to explicitly mark cached times
are stale and need updating.
There is a general consensus that this change results in better
consistency between different operating systems and differently
tuned operating systems.
Note: this changes the width and meaning of the ipv6only field
of the ngx_listening_t structure. 3rd party modules that create
their own listening sockets might need fixing.
Hide headers and pass headers arrays might not be inherited correctly
into a nested location, e.g. in configuration like
server {
proxy_hide_header X-Foo;
location / {
location /nested/ {
proxy_pass_header X-Pad;
}
}
}
the X-Foo header wasn't hidden in the location /nested/.
Reported by Konstantin Svist,
http://mailman.nginx.org/pipermail/nginx-ru/2012-July/047555.html
If ngx_spawn_process() failed while starting a process, the process
handle was closed but left non-NULL in the ngx_processes[] array.
The handle later was used in WaitForMultipleObjects() (if there
were multiple worker processes configured and at least one worker
process was started successfully), resulting in infinite loop.
Reported by Ricardo V G:
http://mailman.nginx.org/pipermail/nginx-devel/2012-July/002494.html
the end (closes#187). Failure to do so could result in several listen
sockets to be created instead of only one listening on wildcard address.
Reported by Roman Odaisky.
It allows to disable generation of nginx's own entity tags, while
still handling ETags in cache properly. This may be useful e.g.
if one want to serve static files from servers with different ETag
generation algorithms.
This includes handling of ETag headers (if present in a response) with
basic support for If-Match, If-None-Match conditionals in not modified
filter.
Note that the "r->headers_out.last_modified_time == -1" check in the not
modified filter is left as is intentionally. It's to prevent handling
of If-* headers in case of proxy without cache (much like currently
done with If-Modified-Since).
This makes code more extendable. The only functional change is when
If-Modified-Since and If-Unmodified-Since are specified together, the
case which is explicitly left undefined by RFC 2616. The new behaviour
is to respect them both, which seems better.
If modification time isn't known, skip range processing and return full
entity body instead of just ignoring If-Range. Ignoring If-Range isn't
safe as client will assume entity wasn't changed since time specified.
The original idea was to optimize edge cases in case of interchangeable
backends, i.e. don't establish a new connection if we have any one
cached. This causes more harm than good though, as it screws up
underlying balancer's idea about backends used and may result in
various unexpected problems.
HP-UX needs _HPUX_ALT_XOPEN_SOCKET_API to be defined to be able to
use various POSIX versions of networking functions. Notably sendmsg()
resulted in "sendmsg() failed (9: Bad file number)" alerts without it.
See xopen_networking(7) for more details.
Poll event method needs ngx_cycle->files to work, and use of ngx_exit_cycle
without files set caused null pointer dereference in resolver's cleanup
on udp socket close.
With previous code wildcard names were added to hash even if conflict
was detected. This resulted in identical names in hash and segfault
later in ngx_hash_wildcard_init().
Number of entries in stsc atom was wrong if we've added an entry to
split a chunk.
Additionally, there is no need to add an entry if we are going to split
last chunk in an entry, it's enough to update the entry we already have.
Previously new entry was added and old one was left as is, resulting in
incorrect entry with zero chunks which might confuse some software.
Contains response status code as a 3-digit integer
(with leading zeroes if necessary), or one of the following values:
000 - response status code has not yet been assigned
009 - HTTP/0.9 request is being processed
If sending a DNS request fails with an error (e.g., when mistakenly trying
to send it to a local IP broadcast), such a request is not deleted if there
are clients waiting on it. However, it was still erroneously removed from
the queue. Later ngx_resolver_cleanup_tree() attempted to remove it from
the queue again that resulted in a NULL pointer dereference.
There are too many problems with special NTFS streams, notably "::$data",
"::$index_allocation" and ":$i30:$index_allocation".
For now we don't reject all URIs with ":" like Apache does as there are no
good reasons seen yet, and there are multiple programs using it in URLs
(e.g. MediaWiki).
Windows treats "/directory./" identical to "/directory/". Do the same
when working on Windows. Note that the behaviour is different from one
with last path component (where multiple spaces and dots are ignored by
Windows).
This includes trailings dots and spaces, NTFS streams (and short names, as
previously checked). The checks are now also done in ngx_file_info(), thus
allowing to use the "try_files" directive to protect external scripts.
Removed duplicate call of ngx_http_upstream_init_round_robin_peer()
overlooked during code changes. Rewritten "return lcp->free_rr_peer(...)"
as MSVC doesn't like it.
If the "proxy_cookie_domain" or "proxy_cookie_path" directive is used and there
are no matches in Set-Cookie header then ngx_http_proxy_rewrite_cookie() returns
NGX_DECLINED to indicate that the header was not rewritten. Returning this value
further from the upstream headers copy handler resulted in 500 error response.
See here for report:
http://mailman.nginx.org/pipermail/nginx/2012-May/033858.html
If variable was indexed in previous configuration but not in current
one, the NGX_HTTP_VAR_INDEXED flag was left set and confused
ngx_http_get_variable().
Patch by Yichun Zhang (agentzh), slightly modified.
Example configuration to reproduce:
location /image/ {
error_page 415 = /zero;
image_filter crop 100 100;
proxy_pass http://127.0.0.1:8080;
proxy_store on;
}
location /zero {
return 204;
}
The problem appeared if upstream returned (big enough) non-image file,
causing 415 to be generated by image filter.
The module now supports recursive search of client address through the
chain of trusted proxies (closes#100), in the same scope as the geo
module. Proxies are listed by the "geoip_proxy" directive, recursive
search is enabled by the "geoip_proxy_recursive" directive. IPv6 is
partially supported: proxies may be specified with IPv6 addresses.
Example:
geoip_country .../GeoIP.dat;
geoip_proxy 127.0.0.1;
geoip_proxy ::1;
geoip_proxy 10.0.0.0/8;
geoip_proxy_recursive on;
The module now supports recursive search of client address through
the chain of trusted proxies, controlled by the "proxy_recursive"
directive in the "geo" block. It also gets partial IPv6 support:
now proxies may be specified with IPv6 addresses.
Example:
geo $test {
...
proxy 127.0.0.1;
proxy ::1;
proxy_recursive;
}
There's also a slight change in behavior. When original client
address (as specified by the "geo" directive) is one of the
trusted proxies, and the value of the X-Forwarded-For request
header cannot not be parsed as a valid address, an original client
address will be used for lookup. Previously, 255.255.255.255 was
used in this case.
The module now supports recursive search of client address through
the chain of trusted proxies, controlled by the "real_ip_recursive"
directive (closes#2). It also gets full IPv6 support (closes#44)
and canonical value of the $client_addr variable on address change.
Example:
real_ip_header X-Forwarded-For;
set_real_ip_from 127.0.0.0/8;
set_real_ip_from ::1;
set_real_ip_from unix:;
real_ip_recursive on;
On input it takes an original address, string in the X-Forwarded-For format
and its length, list of trusted proxies, and a flag indicating to perform
the recursive search. On output it returns NGX_OK and the "deepest" valid
address in a chain, or NGX_DECLINED. It supports AF_INET and AF_INET6.
Additionally, original address and/or proxy may be specified as AF_UNIX.
Due to weight being set to 0 for down peers, order of peers after sorting
wasn't the same as without the "down" flag (with down peers at the end),
resulting in client rebalancing for clients on other servers. The only
rebalancing which should happen after adding "down" to a server is one
for clients on the server.
The problem was introduced in r1377 (which fixed endless loop by setting
weight to 0 for down servers). The loop is no longer possible with new
smooth algorithm, so preserving original weight is safe.
For edge case weights like { 5, 1, 1 } we now produce { a, a, b, a, c, a, a }
sequence instead of { c, b, a, a, a, a, a } produced previously.
Algorithm is as follows: on each peer selection we increase current_weight
of each eligible peer by its weight, select peer with greatest current_weight
and reduce its current_weight by total number of weight points distributed
among peers.
In case of { 5, 1, 1 } weights this gives the following sequence of
current_weight's:
a b c
0 0 0 (initial state)
5 1 1 (a selected)
-2 1 1
3 2 2 (a selected)
-4 2 2
1 3 3 (b selected)
1 -4 3
6 -3 4 (a selected)
-1 -3 4
4 -2 5 (c selected)
4 -2 -2
9 -1 -1 (a selected)
2 -1 -1
7 0 0 (a selected)
0 0 0
To preserve weight reduction in case of failures the effective_weight
variable was introduced, which usually matches peer's weight, but is
reduced temporarily on peer failures.
This change also fixes loop with backup servers and proxy_next_upstream
http_404 (ticket #47), and skipping alive upstreams in some cases if there
are multiple dead ones (ticket #64).
With r->filter_finalize set the ngx_http_finalize_connection() wasn't
called from ngx_http_finalize_request() called with NGX_OK, resulting in
r->main->count not being decremented, thus causing request hang in some
rare situations.
See here for more details:
http://mailman.nginx.org/pipermail/nginx-devel/2012-May/002190.html
Patch by Yichun Zhang (agentzh).
If we already had CNAME in resolver node (i.e. rn->cnlen and rn->u.cname
set), and got additional response with A record, it resulted in rn->cnlen
set and rn->u.cname overwritten by rn->u.addr (or rn->u.addrs), causing
segmentation fault later in ngx_resolver_free_node() on an attempt to free
overwritten rn->u.cname. The opposite (i.e. CNAME got after A) might cause
similar problems as well.
In case of EMFILE/ENFILE returned from accept() we disable accept events,
and (in case of no accept mutex used) arm timer to re-enable them later.
With accept mutex we just drop it, and rely on normal accept mutex handling
to re-enable accept events once it's acquired again.
As we now handle errors in question, logging level was changed to "crit"
(instead of "alert" used for unknown errors).
Note: the code might call ngx_enable_accept_events() multiple times if
there are many listen sockets. The ngx_enable_accept_events() function was
modified to check if connection is already active (via c->read->active) and
skip it then, thus making multiple calls safe.
The following code resulted in incorrect escaping of uri and possible
segfault:
location / {
rewrite ^(.*) $1?c=$1;
return 200 "$uri";
}
If there were arguments in a rewrite's replacement string, and length was
actually calculated (due to duplicate captures as in the example above,
or variables present), the is_args flag was set and incorrectly copied
after length calculation. This resulted in escaping applied to the uri part
of the replacement, resulting in incorrect escaping. Additionally, buffer
was allocated without escaping expected, thus this also resulted in buffer
overrun and possible segfault.
Padding was incorrectly ignored on end request, empty stdout and stderr
fastcgi records. This resulted in protocol desynchronization if fastcgi
application used these records with padding for some reason.
Reported by Ilia Vinokurov.
Failing to do so results in problems if 400 or 414 requests are
redirected to fastcgi/scgi/uwsgi upstream, as well as after invalid
headers got from upstream. This was already fixed for proxy in r3478,
but fastcgi (the only affected protocol at that time) was missed.
Reported by Matthieu Tourne.
On internal redirects this happens via ngx_http_handler() call, which is
not called on named location redirect. As a result incorrect write handler
remained (if previously set) and this might cause incorrect behaviour (likely
request hang).
Patch by Yichun Zhang (agentzh).
If name passed for resolution was { 0, NULL } (e.g. as a result
of name server returning CNAME pointing to ".") pointer wrapped
to (void *) -1 resulting in segmentation fault on an attempt to
dereference it.
Reported by Lanshun Zhou.
The proxy module context may be NULL in case of filter finalization
(e.g. by image_filter) followed by an internal redirect. This needs
some better handling, but for now just check if ctx is still here.
The problem occured if first uri in try_files was shorter than request uri,
resulting in reserve being 0 and hence allocation skipped. The bug was
introduced in r4584 (1.1.19).
Instead of checking if there is events{} section present in configuration
in init_module handler we now do the same in init_conf handler. This
allows master process to detect incorrect configuration early and
reject it.
We now stop on IOV_MAX iovec entries only if we are going to add new one,
i.e. next buffer can't be coalesced into last iovec.
This also fixes incorrect checks for trailer creation on FreeBSD and
Mac OS X, header.nelts was checked instead of trailer.nelts.
The "complete" flag wasn't cleared on loop iteration start, resulting in
broken behaviour if there were more than IOV_MAX buffers and first
iteration was fully completed (and hence the "complete" flag was set
to 1).
Previous (incorrect) behaviour was to inherit ipv6 rules separately from
ipv4 ones. Now all rules are either inherited (if there are no rules
defined at current level) or not (if there are any rules defined).
Integer overflow is undefined behaviour in C and this indeed caused
problems on Solaris/SPARC (at least in some cases). Fix is to
subtract unsigned integers instead, and then cast result to a signed
one, which is implementation-defined behaviour and used to work.
Strictly speaking, we should compare (unsigned) result with the maximum
value of the corresponding signed integer type instead, this will be
defined behaviour. This will require much more changes though, and
considered to be overkill for now.
Such upstreams cause CPU hog later in the code as number of peers isn't
expected to be 0. Currently this may happen either if there are only backup
servers defined in an upstream block, or if server with ipv6 address used
in an upstream block.
Note that "ctxt->loadsubset = 1" previously used isn't really correct as
ctxt->loadsubset is a bitfield now. The use of xmlCtxtUseOptions() with
XML_PARSE_DTDLOAD is believed to be a better way to do the same thing.
Patch by Laurence Rowe.
POSIX doesn't require it to be defined, and Debian GNU/Hurd doesn't define
it. Note that if there is no MAX_PATH defined we have to use realpath()
with NULL argument and free() the result.
Most of the systems have it included due to namespace pollution, but
relying on this is a bad idea. Explicit include is required for at least
Debian GNU/Hurd.
The problem was introduced in 0.7.44 (r2589) during conversion to complex
values. Previously string.len included space for terminating NUL, but
with complex values it doesn't.
The bug in question is likely already fixed (though unfortunately we have
no information available as Apple's bugtracker isn't open), and the
workaround seems to be too pessimistic for modern versions of Safari
as well as other webkit-based browsers pretending to be Safari.
- Removed "hash" element from ngx_http_header_val_t which was always 1.
- Replaced NGX_HTTP_EXPIRES_* with ngx_http_expires_t enum type.
- Added prototype for ngx_http_add_header()
- Simplified ngx_http_set_last_modified().
This resulted in a disclosure of previously freed memory if upstream
server returned specially crafted response, potentially exposing
sensitive information.
Reported by Matthew Daley.
Embedded perl module assumes there is a space for terminating NUL character,
make sure to provide it in all situations by allocating one extra byte for
value buffer. Default ssi_value_length is reduced accordingly to
preserve 256 byte allocations.
While here, fixed another one byte value buffer overrun possible in
ssi_quoted_symbol_state.
Reported by Matthew Daley.
It wasn't enforced for a long time, and there are reports that people
use up to 100 simultaneous subrequests now. As this is a safety limit
to prevent loops, it's raised accordingly.
ZFS reports incorrect st_blocks until file settles on disk, and this
may take a while (i.e. just after creation of a file the st_blocks value
is incorrect). As a workaround we now use st_blocks only if
st_blocks * 512 > st_size, this should fix ZFS problems while still
preserving accuracy for other filesystems.
The problem had appeared in r3900 (1.0.1).
Previous code incorrectly assumed that nodes with identical keys are linked
together. This might not be true after tree rebalance.
Patch by Lanshun Zhou.
The cycle->new_log.file may not be set before config parsing finished if
there are no error_log directive defined at global level. Fix is to
copy it after config parsing.
Patch by Roman Arutyunyan.
With previous code raw buffer might be lost if p->input_filter() was called
on a buffer without any data and used ngx_event_pipe_add_free_buf() to
return it to the free list. This eventually might cause "all buffers busy"
problem, resulting in segmentation fault due to null pointer dereference in
ngx_event_pipe_write_chain_to_temp_file().
In ngx_event_pipe_add_free_buf() the buffer was added to the list start
due to pos == last, and then "p->free_raw_bufs = cl->next" in
ngx_event_pipe_read_upstream() dropped both chain links to the buffer
from the p->free_raw_bufs list.
Fix is to move "p->free_raw_bufs = cl->next" before calling the
p->input_filter().
the last path component if "if_not_owner" parameter is used.
To prevent race condition we have to open a file before checking its owner and
there's no way to change access flags for already opened file descriptor, so
we disable symlinks for the last path component at all if flags allow creating
or truncating the file.
Solaris has AT_FDCWD defined to unsigned value, and comparison of a file
descriptor with it causes warnings in modern versions of gcc. Explicitly
cast AT_FDCWD to ngx_fd_t to resolve these warnings.
To completely disable symlinks (disable_symlinks on)
we use openat(O_NOFOLLOW) for each path component
to avoid races.
To allow symlinks with the same owner (disable_symlinks if_not_owner),
use openat() (followed by fstat()) and fstatat(AT_SYMLINK_NOFOLLOW),
and then compare uids between fstat() and fstatat().
As there is a race between openat() and fstatat() we don't
know if openat() in fact opened symlink or not. Therefore,
we have to compare uids even if fstatat() reports the opened
component isn't a symlink (as we don't know whether it was
symlink during openat() or not).
Default value is off, i.e. symlinks are allowed.
Nuke NGX_PARSE_LARGE_TIME, it's not used since 0.6.30. The only error
ngx_parse_time() can currently return is NGX_ERROR, check it explicitly
and make sure to cast it to appropriate type (either time_t or ngx_msec_t)
to avoid signedness warnings on platforms with unsigned time_t (notably QNX).
Now redirects to named locations are counted against normal uri changes
limit, and post_action respects this limit as well. As a result at least
the following (bad) configurations no longer trigger infinite cycles:
1. Post action which recursively triggers post action:
location / {
post_action /index.html;
}
2. Post action pointing to nonexistent named location:
location / {
post_action @nonexistent;
}
3. Recursive error page for 500 (Internal Server Error) pointing to
a nonexistent named location:
location / {
recursive_error_pages on;
error_page 500 @nonexistent;
return 500;
}
Without the protection, subrequest loop results in r->count overflow and
SIGSEGV. Protection was broken in 0.7.25.
Note that this also limits number of parallel subrequests. This
wasn't exactly the case before 0.7.25 as local subrequests were
completed directly.
See here for details:
http://nginx.org/pipermail/nginx-ru/2010-February/032184.html
Variables with the "not_found" flag set follow the same rules as ones with
the "valid" flag set. Make sure ngx_http_get_flushed_variable() will flush
non-cacheable variables with the "not_found" flag set.
This fixes at least one known problem with $args not available in a subrequest
(with args) when there were no args in the main request and $args variable was
queried in the main request (reported by Laurence Rowe aka elro on irc).
Also this eliminates unneeded call to ngx_http_get_indexed_variable() in
cacheable case (as it will return cached value anyway).
Temporary files might not be removed if the "proxy_store" or "fastcgi_store"
directives were used for subrequests (e.g. ssi includes) and client closed
connection prematurely.
Non-active subrequests are finalized out of the control of the upstream
module when client closes a connection. As a result, the code to remove
unfinished temporary files in ngx_http_upstream_process_request() wasn't
executed.
Fix is to move relevant code into ngx_http_upstream_finalize_request() which
is called in all cases, either directly or via the cleanup handler.
Empty flush buffers are legitimate and may happen e.g. due to $r->flush()
calls in embedded perl. If there are no data buffered in zlib, deflate()
will return Z_BUF_ERROR (i.e. no progress possible) without adding anything
to output. Don't treat Z_BUF_ERROR as fatal and correctly send empty flush
buffer if we have no data in output at all.
See this thread for details:
http://mailman.nginx.org/pipermail/nginx/2010-November/023693.html
If header filter postponed processing of a header by returning NGX_AGAIN
and not moved u->buffer->pos, previous check incorrectly assumed there
is additional space and did another recv() with zero-size buffer. This
resulted in "upstream prematurely closed connection" error instead
of correct "upstream sent too big header" one.
Patch by Feibo Li.
the way.
It was unintentionally changed in r4272, so that it could only limit the first
location where the processing of the request has reached PREACCESS phase.
The PCRE JIT compiler uses mmap to allocate memory for its executable codes, so
we have to explicitly call the pcre_free_study() function to free this memory.
Example configuration to reproduce:
server {
proxy_redirect off;
location / {
proxy_pass http://localhost:8000;
proxy_redirect http://localhost:8000/ /;
location ~ \.php$ {
proxy_pass http://localhost:8000;
# proxy_redirect must be inherited from the level above,
# but instead it was switched off here
}
}
}
Previously if ngx_add_event() failed a connection was freed two times (once
in the ngx_event_connect_peer(), and again by a caller) as pc->connection was
left set. Fix is to always use ngx_close_connection() to close connection
properly and set pc->connection to NULL on errors.
Patch by Piotr Sikora.
Doing a cleanup before every lookup seems to be too aggressive. It can lead to
premature removal of the nodes still usable, which increases the amount of work
under a mutex lock and therefore decreases performance.
In order to improve cleanup behavior, cleanup function call has been moved right
before the allocation of a new node.
"limit_req_zone" directive; minimum size of zone is increased.
Previously an unsigned variable was used to keep the return value of
ngx_parse_size() function, which led to an incorrect zone size if NGX_ERROR
was returned.
The new code has been taken from the "limit_conn_zone" directive.
The aio_return() must be called regardless of the error returned by
aio_error(). Not calling it resulted in various problems up to segmentation
faults (as AIO events are level-triggered and were reported again and again).
Additionally, in "aio sendfile" case r->blocked was incremented in case of
error returned from ngx_file_aio_read(), thus causing request hangs.
It's already called by OPENSSL_config(). Calling it again causes some
openssl engines (notably GOST) to corrupt memory, as they don't expect
to be created more than once.
The ngx_hash_init() function did not expect call with zero elements count,
which caused FPE error on configs with an empty "types" block in http context
and "types_hash_max_size" > 10000.
Example configuration to reproduce:
events { }
http {
types_hash_max_size 10001;
types {}
server {}
}
Second argument (cpusetsize) is size in bytes, not in bits. Previously
used constant 32 resulted in reading of uninitialized memory and caused
EINVAL to be returned on some Linux kernels.
Support for TLSv1.1 and TLSv1.2 protocols was introduced in OpenSSL 1.0.1
(-beta1 was recently released). This change makes it possible to disable
these protocols and/or enable them without other protocols.
The problem was localized in ngx_http_proxy_rewrite_redirect_regex() handler
function which did not take into account prefix when overwriting header value.
New directives: proxy_cache_lock on/off, proxy_cache_lock_timeout. With
proxy_cache_lock set to on, only one request will be allowed to go to
upstream for a particular cache item. Others will wait for a response
to appear in cache (or cache lock released) up to proxy_cache_lock_timeout.
Waiting requests will recheck if they have cached response ready (or are
allowed to run) every 500ms.
Note: we intentionally don't intercept NGX_DECLINED possibly returned by
ngx_http_file_cache_read(). This needs more work (possibly safe, but needs
further investigation). Anyway, it's exceptional situation.
Note: probably there should be a way to disable caching of responses
if there is already one request fetching resource to cache (without waiting
at all). Two possible ways include another cache lock option ("no_cache")
or using proxy_no_cache with some supplied variable.
Note: probably there should be a way to lock updating requests as well. For
now "proxy_cache_use_stale updating" is available.
It's possible that configured limit_rate will permit more bytes per
single operation than sendfile_max_chunk. To protect disk from takeover
by a single client it is necessary to apply sendfile_max_chunk as a limit
regardless of configured limit_rate.
See here for report (in Russian):
http://mailman.nginx.org/pipermail/nginx-ru/2010-March/032806.html
If proxy_pass was used with variables and there was no URI component,
nginx always used unparsed URI. This isn't consistent with "no variables"
case, where e.g. rewrites are applied even if there is no URI component.
Fix is to use the same logic in both cases, i.e. only use unparsed URI if
it's valid and request is the main one.
This resolves issue with try_files (see ticket #70), configuration like
location / { try_files $uri /index.php; }
location /index.php { proxy_pass http://backend; }
caused nginx to use original request uri in a request to a backend.
Historically, not clearing of the r->valid_unparsed_uri on internal redirect
was a feature: it allowed to pass the same request to (another) upstream
server via error_page redirection. Since then named locations appeared
though, and it's time to start resetting r->valid_unparsed_uri on internal
redirects. Configurations still using this feature should be converted
to use named locations instead.
Patch by Lanshun Zhou.
The SCGI specification doesn't specify format of the response, and assuming
CGI specs should be used there is no reason to complain. RFC 3875
explicitly states that "A Status header field is optional, and status
200 'OK' is assumed if it is omitted".
The r->http_version is a version of client's request, and modules must
not set it unless they are really willing to downgrade protocol version
used for a response (i.e. to HTTP/0.9 if no response headers are available).
In neither case r->http_version may be upgraded.
The former code downgraded response from HTTP/1.1 to HTTP/1.0 for no reason,
causing various problems (see ticket #66). It was also possible that
HTTP/0.9 requests were upgraded to HTTP/1.0.
There have been multiple reports of cases where a real locked entry was
removed, resulting in a segmentation fault later in a worker which locked
the entry. It looks like default inactive timeout isn't enough in real
life.
For now just ignore such locked entries, and move them to the top of the
inactive queue to allow processing of other entries.
There are two possible situations which can lead to this: response was
cached with bigger proxy_buffer_size value (and nginx was restared since
then, i.e. shared memory zone content was lost), or due to the race in
the cache update code (see [1]) we've end up with fcn->body_start from
a different response stored in shared memory zone.
[1] http://mailman.nginx.org/pipermail/nginx-devel/2011-September/001287.html
The ngx_http_cache() and ngx_http_no_cache_set_slot() functions were replaced
by ngx_http_test_predicates() and ngx_http_set_predicate_slot() in 0.8.46 and
no longer used since then.
FreeBSD kernel checks headers/trailers pointer against NULL, not
corresponding count. Passing NULL if there are no headers/trailers
helps to avoid unneeded work in kernel, as well as unexpected 0 bytes
GIO in traces.
haven't been sent to a client.
The ngx_http_variable_headers() and ngx_http_variable_unknown_header() functions
did not ignore response header entries with zero "hash" field.
Thanks to Yichun Zhang (agentzh).
It was working for nginx's own 206 replies as they are seen as 200 in the
headers filter module (range filter goes later in the headers filter chain),
but not for proxied replies.
Additional parsing logic added to correctly handle RFC 3986 compliant IPv6 and
IPvFuture characters enclosed in square brackets.
The host validation was completely rewritten. The behavior for non IP literals
was changed in a more proper and safer way:
- Host part is now delimited either by the first colon or by the end of string
if there's no colon. Previously the last colon was used as delimiter which
allowed substitution of a port number in the $host variable.
(e.g. Host: 127.0.0.1:9000:80)
- Fixed stripping of the ending dot in the Host header when the host was also
followed by a port number.
(e.g. Host: nginx.com.:80)
- Fixed upper case characters detection. Previously it was broken which led to
wasting memory and CPU.
If process exited abnormally while holding lock on some shared memory zone -
unlock it. It may be not safe thing to do (as crash with lock held may
result in corrupted shared memory structure, and other processes will
subsequently crash while trying to access shared data), therefore complain
loudly if unlock succeeds.
It is currently used from master process on abnormal worker termination to
unlock accept mutex (unlocking of accept mutex was broken in 1.0.2). It is
expected to be used in the future to unlock other mutexes as well.
Shared mutex code was rewritten to make this possible in a safe way, i.e.
with a check if lock was actually held by the exited process. We again use
pid to lock mutex, and use separate atomic variable for a count of processes
waiting in sem_wait().
Stale write event may happen if epoll_wait() reported both read and write
events, and processing of the read event closed descriptor.
Patch by Yichun Zhang (agentzh).
Check if received data length match Content-Length header (if present),
don't cache response if no match found. This prevents caching of corrupted
response in case of premature connection close by upstream.
Used "\x5" in 5th byte to claim presence of both audio and video. Used
previous tag size 0 in the beginning of the flv body (bytes 10 .. 13) as
required by specification (see http://www.adobe.com/devnet/f4v.html).
Patch by Piotr Sikora.
Previously it used a hardcoded value of 300 seconds. Also added the
"valid=" parameter to the "resolver" directive that can be used to
override the cache validity time.
Patch by Kirill A. Korinskiy with minor changes.
The following problems were fixed:
1. Directive fastcgi_cache affected headers sent to backends in unrelated
servers / locations (see ticket #45).
2. If-Unmodified-Since, If-Match and If-Range headers were sent to backends
if fastcgi_cache was used.
3. Cache-related headers were sent to backends if there were no fastcgi_param
directives and fastcgi_cache was used at server level.
Headers cleared with cache enabled (If-Modified-Since etc.) might be cleared
in unrelated servers/locations without proxy_cache enabled if proxy_cache was
used in some server/location.
Example config which triggered the problem:
proxy_set_header X-Test "test";
server { location /1 { proxy_cache name; proxy_pass ... } }
server { location /2 { proxy_pass ... } }
Another one:
server {
proxy_cache name;
location /1 { proxy_pass ... }
location /2 { proxy_cache off; proxy_pass ... }
}
In both cases If-Modified-Since header wasn't sent to backend in location /2.
Fix is to not modify conf->headers_source, but instead merge user-supplied
headers from conf->headers_source and default headers (either cache or not)
into separate headers_merged array.
The following config caused segmentation fault due to conf->file not
being properly set if "ssl on" was inherited from the http level:
http {
ssl on;
server {
}
}
If possible we now just extend already present file buffer in p->out chain
instead of keeping ngx_buf_t for each buffer we've flushed to disk. This
saves about 120 bytes of memory per buffer flushed to disk, and resolves
high CPU usage observed in edge cases (due to coalescing these buffers on
send).
1. In ngx_event_pipe_write_chain_to_temp_file() make sure to fully write
all shadow buffers up to last_shadow. With this change recycled buffers
cannot appear in p->out anymore. This also fixes segmentation faults
observed due to ngx_event_pipe_write_chain_to_temp() not freeing any
raw buffers while still returning NGX_OK.
2. In ngx_event_pipe_write_to_downstream() we now properly check for busy
size as a size of buffers, not a size of data in these buffers. This
fixes situations where all available buffers became busy (including
segmentation faults due to this).
3. The ngx_event_pipe_free_shadow_raw_buf() function is dropped. It's
incorrect and not needed.
Previously result of last iteration's writev() was returned. This was
unnoticed as return value was only used if chain contained only one or
two buffers.
Previously nginx used to mark backend again as live as soon as fail_timeout
passes (10s by default) since last failure. On the other hand, detecting
dead backend takes up to 60s (proxy_connect_timeout) in typical situation
"backend is down and doesn't respond to any packets". This resulted in
suboptimal behaviour in the above situation (up to 23% of requests were
directed to dead backend with default settings).
More detailed description of the problem may be found here (in Russian):
http://mailman.nginx.org/pipermail/nginx-ru/2011-August/042172.html
Fix is to only allow one request after fail_timeout passes, and
mark backend as "live" only if this request succeeds.
Note that with new code backend will not be marked "live" unless "check"
request is completed, and this may take a while in some specific workloads
(e.g. streaming). This is believed to be acceptable.
Second aio post happened when timer set by limit_rate expired while we have
aio request in flight, resulting in "second aio post" alert and socket leak.
The patch adds actual protection from aio calls with r->aio already set to
aio sendfile code in ngx_http_copy_filter(). This should fix other cases
as well, e.g. when sending buffered to disk upstream replies while still
talking to upstream.
The ngx_http_writer() is also fixed to handle the above case (though it's
mostly optimization now).
Reported by Oleksandr V. Typlyns'kyi.
For files with '?' in their names autoindex generated links with '?' not
escaped. This resulted in effectively truncated links as '?' indicates
query string start.
This is an updated version of the patch originally posted at [1]. It
introduces generic NGX_ESCAPE_URI_COMPONENT which escapes everything but
unreserved characters as per RFC 3986. This approach also renders unneeded
special colon processing (as colon is percent-encoded now), it's dropped
accordingly.
[1] http://nginx.org/pipermail/nginx-devel/2010-February/000112.html
Reported by Konstantin Leonov.
If cache was bypassed with proxy_cache_bypass, cache-controlling headers
(Cache-Control, Expires) wasn't considered and response was cached even
if it was actually non-cacheable.
Patch by John Ferlito.
Configuration with duplicate upstream blocks defined after first use, i.e.
like
server {
...
location / {
proxy_pass http://backend;
}
}
upstream backend { ... }
upstream backend { ... }
now correctly results in "duplicate upstream" error.
Additionally, upstream blocks defined after first use now handle various
server directive parameters ("weight", "max_fails", etc.). Previously
configuration like
server {
...
location / {
proxy_pass http://backend;
}
}
upstream backend {
server 127.0.0.1 max_fails=5;
}
incorrectly resulted in "invalid parameter "max_fails=5"" error.
For normal cached responses ngx_http_cache_send() sends last buffer and then
request finalized via ngx_http_finalize_request() call, i.e. everything is
ok.
But for stale responses (i.e. when upstream died, but we have something in
cache) the same ngx_http_cache_send() sends last buffer, but then in
ngx_http_upstream_finalize_request() another last buffer is send. This
causes duplicate final chunk to appear if chunked encoding is used (and
resulting problems with keepalive connections and so on).
Fix this by not sending in ngx_http_upstream_finalize_request()
another last buffer if we know response was from cache.
The special case in question leads to replies without body in
configuration like
location / { error_page 404 /zero; return 404; }
location /zero { return 204; }
while replies with empty body are expected per protocol specs.
Correct one will look like
if (status == NGX_HTTP_NO_CONTENT) {
rc = ngx_http_send_header(r);
if (rc == NGX_ERROR || r->header_only) {
return rc;
}
return ngx_http_send_special(r, NGX_HTTP_LAST);
}
though it looks like it's better to drop this special case at all.
Big POST (not fully preread) to a
location / {
return 202;
}
resulted in incorrect behaviour due to "return" code path not calling
ngx_http_discard_request_body(). The same applies to all "return" used
with 2xx/3xx codes except 201 and 204, and to all "return ... text" uses.
Fix is to add ngx_http_discard_request_body() call to ngx_http_send_response()
function where it looks appropriate. Discard body call from emtpy gif module
removed as it's now redundant.
Reported by Pyry Hakulinen, see
http://mailman.nginx.org/pipermail/nginx/2011-August/028503.html
Test case:
location / {
error_page 405 /nope;
return 405;
}
location /nope {
return 200;
}
This is expected to return 405 with empty body, but in 0.8.42+ will return
builtin 405 error page as well (though not counted in Content-Length, thus
breaking protocol).
Fix is to use status provided by rewrite script execution in case
it's less than NGX_HTTP_BAD_REQUEST even if r->error_status set. This
check is in line with one in ngx_http_script_return_code().
Note that this patch also changes behaviour for "return 302 ..." and
"rewrite ... redirect" used as error handler. E.g.
location / {
error_page 405 /redirect;
return 405;
}
location /redirect {
rewrite ^ http://example.com/;
}
will actually return redirect to "http://example.com/" instead of builtin 405
error page with meaningless Location header. This looks like correct change
and it's in line with what happens on e.g. directory redirects in error
handlers.
Replies with 201 code contain body, and we should clearly indicate it's
empty if it's empty. Before 0.8.32 chunked was explicitly disabled for
201 replies and as a result empty body was indicated by connection close
(not perfect, but worked). Since 0.8.32 chunked is enabled, and this
causes incorrect responses from dav module when HTTP/1.1 is used: with
"Transfer-Encoding: chunked" but no chunks at all.
Fix is to actually return empty body in special response handler instead
of abusing r->header_only flag.
See here for initial report:
http://mailman.nginx.org/pipermail/nginx-ru/2010-October/037535.html
This fixes crashes observed with some 3rd party balancer modules. Standard
balancer modules (round-robin and ip hash) explicitly set pc->connection
(aka u->peer.connection) to NULL and aren't affected.
If client closed connection in ngx_event_pipe_write_to_downstream(), buffers
in the "out" chain were lost. This caused cpu hog if all available buffers
were in the "out" chain. Fix is to call ngx_chain_update_chains() before
checking return code of output filter to avoid loosing buffers in the "out"
chain.
Note that this situation (all available buffers in the "out" chain) isn't
normal, it should be prevented by busy buffers limit. Though right now it
may happen with complex protocols like fastcgi. This should be addressed
separately.
The default value is 32 AIO simultaneous requests per worker. Previously
they were hardcoded to 1024, and it was too large, since Linux allocated
them early on io_setup(), but not on request itself. So with default value
of /proc/sys/fs/aio-max-nr equal to 65536 only 64 worker processes could
be run simultaneously. 32 AIO requests are enough for modern disks even if
server runs only 1 worker.
and "chunked_transfer_encoding" directives, to be in line with all
directives taking a boolean argument. Both flags will ensure that
a directive takes one argument.
syscall(2) uses usual libc convention, it returns -1 on error and
sets errno. Obsolete _syscall(2) returns negative value of error.
Thanks to Hagai Avrahami.
By default follow the old behaviour, i.e. FASTCGI_KEEP_CONN flag isn't set
in request and application is responsible for closing connection once request
is done. To keep connections alive fastcgi_keep_conn must be activated.
As long as ngx_event_pipe() has more data read from upstream than specified
in p->length it's passed to input filter even if buffer isn't yet full. This
allows to process data with known length without relying on connection close
to signal data end.
By default p->length is set to -1 in upstream module, i.e. end of data is
indicated by connection close. To set it from per-protocol handlers upstream
input_filter_init() now called in buffered mode (as well as in
unbuffered mode).
Previous use of size_t may cause wierd effects on 32bit platforms with certain
big responses transferred in unbuffered mode.
Nuke "if (size > u->length)" check as it's not usefull anyway (preread
body data isn't subject to this check) and now requires additional check
for u->length being positive.
We no longer use r->headers_out.content_length_n as a primary source of
backend's response length. Instead we parse response length to
u->headers_in.content_length_n and copy to r->headers_out.content_length_n
when needed.
Just doing another connect isn't safe as peer.get() may expect peer.tries
to be strictly positive (this is the case e.g. with round robin with multiple
upstream servers). Increment peer.tries to at least avoid cpu hog in
round robin balancer (with the patch alert will be seen instead).
This is not enough to fully address the problem though, hence TODO. We
should be able to inform balancer that the error wasn't considered fatal
and it may make sense to retry the same peer.
The ngx_chain_update_chains() needs pool to free chain links used for buffers
with non-matching tags. Providing one helps to reduce memory consumption
for long-lived requests.
There were 2 buffers allocated on each buffer chain sent through chunked
filter (one buffer for chunk size, another one for trailing CRLF, about
120 bytes in total on 32-bit platforms). This resulted in large memory
consumption with long-lived requests sending many buffer chains. Usual
example of problematic scenario is streaming though proxy with
proxy_buffering set to off.
Introduced buffers reuse reduces memory consumption in the above problematic
scenario.
See here for initial report:
http://mailman.nginx.org/pipermail/nginx/2010-April/019814.html
If file inode was not changed, cached file information was not updated
on retest. As a result stale information might be cached forever if file
attributes was changed and/or file was extended.
This fix also makes obsolete r4077 change of is_directio flag handling,
since this flag is updated together with other file information.
On file retest open_file_cache lost is_directio if file wasn't changed.
This caused unaligned operations under Linux to fail with EINVAL.
It wasn't noticeable with AIO though, as errors wasn't properly logged.
Read event should be blocked after reading body, else undefined behaviour
might occur on additional client activity. This fixes segmentation faults
observed with proxy_ignore_client_abort set.
Setting read->eof to 0 seems to be just a typo. It appeared in
nginx-0.0.1-2003-10-28-18:45:41 import (r164), while identical code in
ngx_recv.c introduced in the same import do actually set read->eof to 1.
Failure to set read->eof to 1 results in EOF not being generally detectable
from connection flags. On the other hand, kqueue won't report any read
events on such a connection since we use EV_CLEAR. This resulted in read
timeouts if such connection was cached and used for another request.
If connection has unsent alerts, SSL_shutdown() tries to send them even
if SSL_set_shutdown(SSL_RECEIVED_SHUTDOWN|SSL_SENT_SHUTDOWN) was used.
This can be prevented by SSL_set_quiet_shutdown(). SSL_set_shutdown()
is required nevertheless to preserve session.
"max_ranges 0" disables ranges support at all,
"max_ranges 1" allows the single range, etc.
By default number of ranges is unlimited, to be precise, 2^31-1.
then nginx disables ranges and returns just the source response.
This fix should not affect well-behaving applications but will defeat
DoS attempts exploiting malicious byte ranges.
SSL_set_SSL_CTX() doesn't touch values cached within ssl connection
structure, it only changes certificates (at least as of now, OpenSSL
1.0.0d and earlier).
As a result settings like ssl_verify_client, ssl_verify_depth,
ssl_prefer_server_ciphers are only configurable on per-socket basis while
with SNI it should be possible to specify them different for two servers
listening on the same socket.
Workaround is to explicitly re-apply settings we care about from context
to ssl connection in servername callback.
Note that SSL_clear_options() is only available in OpenSSL 0.9.8m+. I.e.
with older versions it is not possible to clear ssl_prefer_server_ciphers
option if it's set in default server for a socket.
Non-daemon mode is currently used by supervisord, daemontools and so on
or during debugging. The NOACCEPT signal is only used for online upgrade
which is not supported when nginx is run under supervisord, etc.,
so this change should not break existant setups.
now cache loader processes either as many files as specified by loader_files
or works no more than time specified by loader_threshold during each iteration.
loader_threshold was previously used to decrease loader_files or
to increase loader_timeout and this might eventually result in
downgrading loader_files to 1 and increasing loader_timeout to large values
causing loading cache for forever.
NetBSD 5.0+ has SO_ACCEPTFILTER support merged from FreeBSD, and having
accept filter check in FreeBSD-specific ngx_freebsd_config.h prevents it
from being used on NetBSD. Therefore move the check into configure (and
do the same for Linux-specific TCP_DEFER_ACCEPT, just to be in line).
Previously only first log level was required to be correct, while error_log
directive in fact accepts list of levels (e.g. one may specify "error_log ...
debug_core debug_http;"). This resulted in (avoidable) wierd behaviour on
missing semicolon after error_log directive, e.g.
error_log /path/to/log info
index index.php;
silently skipped index directive and it's arguments (trying to interpret
them as log levels without checking to be correct).
The following configuration causes nginx to hog cpu due to infinite loop
in ngx_http_upstream_get_peer():
upstream backend {
server 127.0.0.1:8080 down;
server 127.0.0.1:8080 down;
}
server {
...
location / {
proxy_pass http://backend;
}
}
Make sure we don't loop infinitely in ngx_http_upstream_get_peer() but stop
after resetting peer weights once.
Return 0 if we are stuck. This is guaranteed to work as peer 0 always exists,
and eventually ngx_http_upstream_get_round_robin_peer() will do the right
thing falling back to backup servers or returning NGX_BUSY.
Flush flag wasn't set in constructed buffer and this prevented any data
from being actually sent to upstream due to SSL buffering. Make sure
we always set flush in the last buffer we are going to sent.
See here for report:
http://nginx.org/pipermail/nginx-ru/2011-June/041552.html
Previously all available data was used as body, resulting in garbage after
real body e.g. in case of pipelined requests. Make sure to use only as many
bytes as request's Content-Length specifies.
enabled in any server. The previous r1033 does not help when unused zone
becomes used after reconfiguration, so it is backed out.
The initial thought was to make SSL modules independed from SSL implementation
and to keep OpenSSL code dependance as much as in separate files.
list and evaluating total cache size. Reading just directory is enough for
this purpose. Elimination of reading cache files saves at least one disk I/O
operation per file.
Preparation for elimination of reading cache files by cache loader:
removing dependencies on the reading:
*) cache node valid_sec and valid_msec are used only for caching errors;
*) upstream buffer size can be used instead of cache node body_start.
by ngx_http_upstream_create_round_robin_peer(), since the peer lives
only during request so the saved SSL session will never be used again
and just causes memory leak
patch by Maxim Dounin
and is shared among all hosts instead of pregenerating for every HTTPS host
on configuraiton phase. This decreases start time for configuration with
large number of HTTPS hosts.