Linux returns EOPNOTSUPP for non-TCP sockets and ENOPROTOOPT for TCP
sockets, because getsockopt(TCP_FASTOPEN) is not implemented so far.
While there, lower the log level from ALERT to NOTICE to match other
getsockopt() failures.
Signed-off-by: Piotr Sikora <piotr@cloudflare.com>
Recent Linux versions started to return EOPNOTSUPP to getsockopt() calls
on unix sockets, resulting in log pollution on binary upgrade. Such errors
are silently ignored now.
This patch fixes incorrect handling of auto redirect in configurations
like:
location /0 { }
location /a- { }
location /a/ { proxy_pass ... }
With previously used sorting, this resulted in the following locations
tree (as "-" is less than "/"):
"/a-"
"/0" "/a/"
and a request to "/a" didn't match "/a/" with auto_redirect, as it
didn't traverse relevant tree node during lookup (it tested "/a-",
then "/0", and then falled back to null location).
To preserve locale use for non-ASCII characters on case-insensetive
systems, libc's tolower() used.
Several warnings silenced, notably (ngx_socket_t) -1 is now checked
on socket operations instead of -1, as ngx_socket_t is unsigned on win32
and gcc complains on comparison.
With this patch, it's now possible to compile nginx using mingw gcc,
with options we normally compile on win32.
Several false positive warnings silenced, notably W8012 "Comparing
signed and unsigned" (due to u_short values promoted to int), and
W8072 "Suspicious pointer arithmetic" (due to large type values added
to pointers).
With this patch, it's now again possible to compile nginx using bcc32,
with options we normally compile on win32 minus ipv6 and ssl.
Precompiled headers are disabled as they lead to internal compiler errors
with long configure lines. Couple of false positive warnings silenced.
Various win32 typedefs are adjusted to work with Open Watcom C 1.9 headers.
With this patch, it's now again possible to compile nginx using owc386,
with options we normally compile on win32 minus ipv6 and ssl.
It was introduced in Linux 2.6.39, glibc 2.14 and allows to obtain
file descriptors without actually opening files. Thus made it possible
to traverse path with openat() syscalls without the need to have read
permissions for path components. It is effectively emulates O_SEARCH
which is missing on Linux.
O_PATH is used in combination with O_RDONLY. The last one is ignored
if O_PATH is used, but it allows nginx to not fail when it was built on
modern system (i.e. glibc 2.14+) and run with a kernel older than 2.6.39.
Then O_PATH is unknown to the kernel and ignored, while O_RDONLY is used.
Sadly, fstat() is not working with O_PATH descriptors till Linux 3.6.
As a workaround we fallback to fstatat() with the AT_EMPTY_PATH flag
that was introduced at the same time as O_PATH.
It was broken in 8e446a2daf48 when the NGX_SENDFILE_LIMIT constant was added
to ngx_linux_sendfile_chain.c having the same name as already defined one in
ngx_linux_config.h.
The newer is needed to overcome a bug in old Linux kernels by limiting the
number of bytes to send per sendfile() syscall. The older is used with
sendfile() on ancient kernels that works with 32-bit offsets only.
One of these renamed to NGX_SENDFILE_MAXSIZE.
In ngx_*_sendfile_chain() when calculating pointer to a first
non-zero sized buf, use "in" as iterator. This fixes processing
of zero sized buf(s) after EINTR. Otherwise function can return
zero sized buf to caller, and later ngx_http_write_filter()
logs warning.
Currently this flag is needed for epoll and rtsig, and though these methods
usually present on different platforms than kqueue, nginx can be compiled to
support all of them.
On Linux x32 inclusion of sys/sysctl.h produces an error. As sysctl() is
only used by rtsig event method code, which is legacy and not compiled
in by default on modern linuxes, the sys/sysctl.h file now only included
if rtsig support is enabled.
Based on patch by Serguei I. Ivantsov.
When several "error_log" directives are specified in the same configuration
block, logs are written to all files with a matching log level.
All logs are stored in the singly-linked list that is sorted by log level in
the descending order.
Specific debug levels (NGX_LOG_DEBUG_HTTP,EVENT, etc.) are not supported
if several "error_log" directives are specified. In this case all logs
will use debug level that has largest absolute value.
Valgrind complains if we pass uninitialized memory to a syscall:
==36492== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==36492== at 0x6B5E6A: sendmsg (in /usr/lib/system/libsystem_kernel.dylib)
==36492== by 0x10004288E: ngx_signal_worker_processes (ngx_process_cycle.c:527)
==36492== by 0x1000417A7: ngx_master_process_cycle (ngx_process_cycle.c:203)
==36492== by 0x100001F10: main (nginx.c:410)
==36492== Address 0x7fff5fbff71c is on thread 1's stack
Even initialization of all members of the structure passed isn't enough, as
there is padding which still remains uninitialized and results in Valgrind
complaint. Note there is no real problem here as data from uninitialized
memory isn't used.
Valgrind intercepts SIGUSR2 in some cases, and nginx might not be able to
start due to sigaction() failure. If compiled with NGX_VALGRIND defined,
we now ignore the failure of sigaction().
On Win32 platforms 0 is used to indicate errors in file operations, so
comparing against -1 is not portable.
This was not much of an issue in patched code, since only ngx_fd_info() test
is actually reachable on Win32 and in worst case it might result in bogus
error log entry.
Patch by Piotr Sikora.
The crypt_r() function returns NULL on errors, check it explicitly instead
of assuming errno will remain 0 if there are no errors (per POSIX, the
setting of errno after a successful call to a function is unspecified
unless the description of that function specifies that errno shall not
be modified).
Additionally, dropped unneeded ngx_set_errno(0) and fixed error handling
of memory allocation after normal crypt(), which was inapropriate and
resulted in null pointer dereference on allocation failures.
This includes "debug_connection", upstreams, "proxy_pass", etc.
(ticket #92)
To preserve compatibility, "listen" specified with a domain name
selects the first IPv4 address, if available. If not available,
the first IPv6 address will be used (ticket #186).
This will result in alphabetical sorting of included files if
the "include" directive with wildcards is used.
Note that the behaviour is now different from that on Windows, where
alphabetical sorting is not guaranteed for FindFirsFile()/FindNextFile()
(used to be alphabetical on NTFS, but not on FAT).
Approved by Igor Sysoev, prodded by many.
Catched by dav_chunked.t on Solaris. In released versions this might
potentially result in corruption of complex protocol responses if they
were written to disk and there were more distinct buffers than IOV_MAX
in a single write.
This fixes unwanted/incorrect cpu_affinity use on dead worker processes
respawn. While this is not ideal, it's expected to be better when previous
situation where multiple processes were spawn with identical CPU affinity
set.
Reported by Charles Chen.
The only thing we could potentially do here in case of error
returned is to complain to error log, but we don't have log
structure available here due to interface limitations.
Prodded by Coverity.
If ngx_spawn_process() failed while starting a process, the process
handle was closed but left non-NULL in the ngx_processes[] array.
The handle later was used in WaitForMultipleObjects() (if there
were multiple worker processes configured and at least one worker
process was started successfully), resulting in infinite loop.
Reported by Ricardo V G:
http://mailman.nginx.org/pipermail/nginx-devel/2012-July/002494.html
HP-UX needs _HPUX_ALT_XOPEN_SOCKET_API to be defined to be able to
use various POSIX versions of networking functions. Notably sendmsg()
resulted in "sendmsg() failed (9: Bad file number)" alerts without it.
See xopen_networking(7) for more details.
Poll event method needs ngx_cycle->files to work, and use of ngx_exit_cycle
without files set caused null pointer dereference in resolver's cleanup
on udp socket close.
This includes trailings dots and spaces, NTFS streams (and short names, as
previously checked). The checks are now also done in ngx_file_info(), thus
allowing to use the "try_files" directive to protect external scripts.
In case of EMFILE/ENFILE returned from accept() we disable accept events,
and (in case of no accept mutex used) arm timer to re-enable them later.
With accept mutex we just drop it, and rely on normal accept mutex handling
to re-enable accept events once it's acquired again.
As we now handle errors in question, logging level was changed to "crit"
(instead of "alert" used for unknown errors).
Note: the code might call ngx_enable_accept_events() multiple times if
there are many listen sockets. The ngx_enable_accept_events() function was
modified to check if connection is already active (via c->read->active) and
skip it then, thus making multiple calls safe.
We now stop on IOV_MAX iovec entries only if we are going to add new one,
i.e. next buffer can't be coalesced into last iovec.
This also fixes incorrect checks for trailer creation on FreeBSD and
Mac OS X, header.nelts was checked instead of trailer.nelts.
The "complete" flag wasn't cleared on loop iteration start, resulting in
broken behaviour if there were more than IOV_MAX buffers and first
iteration was fully completed (and hence the "complete" flag was set
to 1).
POSIX doesn't require it to be defined, and Debian GNU/Hurd doesn't define
it. Note that if there is no MAX_PATH defined we have to use realpath()
with NULL argument and free() the result.
Most of the systems have it included due to namespace pollution, but
relying on this is a bad idea. Explicit include is required for at least
Debian GNU/Hurd.
ZFS reports incorrect st_blocks until file settles on disk, and this
may take a while (i.e. just after creation of a file the st_blocks value
is incorrect). As a workaround we now use st_blocks only if
st_blocks * 512 > st_size, this should fix ZFS problems while still
preserving accuracy for other filesystems.
The problem had appeared in r3900 (1.0.1).
Solaris has AT_FDCWD defined to unsigned value, and comparison of a file
descriptor with it causes warnings in modern versions of gcc. Explicitly
cast AT_FDCWD to ngx_fd_t to resolve these warnings.
The aio_return() must be called regardless of the error returned by
aio_error(). Not calling it resulted in various problems up to segmentation
faults (as AIO events are level-triggered and were reported again and again).
Additionally, in "aio sendfile" case r->blocked was incremented in case of
error returned from ngx_file_aio_read(), thus causing request hangs.
Second argument (cpusetsize) is size in bytes, not in bits. Previously
used constant 32 resulted in reading of uninitialized memory and caused
EINVAL to be returned on some Linux kernels.
FreeBSD kernel checks headers/trailers pointer against NULL, not
corresponding count. Passing NULL if there are no headers/trailers
helps to avoid unneeded work in kernel, as well as unexpected 0 bytes
GIO in traces.
If process exited abnormally while holding lock on some shared memory zone -
unlock it. It may be not safe thing to do (as crash with lock held may
result in corrupted shared memory structure, and other processes will
subsequently crash while trying to access shared data), therefore complain
loudly if unlock succeeds.
It is currently used from master process on abnormal worker termination to
unlock accept mutex (unlocking of accept mutex was broken in 1.0.2). It is
expected to be used in the future to unlock other mutexes as well.
Shared mutex code was rewritten to make this possible in a safe way, i.e.
with a check if lock was actually held by the exited process. We again use
pid to lock mutex, and use separate atomic variable for a count of processes
waiting in sem_wait().
Previously result of last iteration's writev() was returned. This was
unnoticed as return value was only used if chain contained only one or
two buffers.
syscall(2) uses usual libc convention, it returns -1 on error and
sets errno. Obsolete _syscall(2) returns negative value of error.
Thanks to Hagai Avrahami.
On file retest open_file_cache lost is_directio if file wasn't changed.
This caused unaligned operations under Linux to fail with EINVAL.
It wasn't noticeable with AIO though, as errors wasn't properly logged.
Non-daemon mode is currently used by supervisord, daemontools and so on
or during debugging. The NOACCEPT signal is only used for online upgrade
which is not supported when nginx is run under supervisord, etc.,
so this change should not break existant setups.
NetBSD 5.0+ has SO_ACCEPTFILTER support merged from FreeBSD, and having
accept filter check in FreeBSD-specific ngx_freebsd_config.h prevents it
from being used on NetBSD. Therefore move the check into configure (and
do the same for Linux-specific TCP_DEFER_ACCEPT, just to be in line).