TL;DR: NGINX was not launching correctly. Since the process was not writing any logs, I had to use strace to debug what was going on.
There was a weird thing going on with one of our NGINX servers. The sequence of events was like this:
1. Our server rebooted (after many, many months of uptime).
2. After the reboot, NGINX was running but not responding to requests.
Even if I curled localhost like this, nothing happened:
root@amy:/tmp# curl -v http://localhost
* Rebuilt URL to: http://localhost/
* Hostname was NOT found in DNS cache
* Trying ::1...
* Connected to localhost (::1) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.38.0
> Host: localhost
> Accept: */*
* Connection #0 to host localhost left intact
3. Checked the error and access logs. Nothing had been written to them since the reboot.
That was weird...
4. Ran ps to check whether the process was running at all. It was, but only the master process: no NGINX workers had been spawned after launch. How come?
root@amy:/tmp# ps aux | grep nginx
root 880 0.0 0.1 43424 5968 ? Ss 08:40 0:00 nginx: master process /usr/local/nginx/sbin/nginx -g daemon on; master_process on;
root@amy:/tmp#
The system log showed systemd stopping and starting NGINX, with no errors:
Dec 31 08:40:11 amy systemd: Stopping A high performance web server and a reverse proxy server...
Dec 31 08:40:11 amy systemd: Stopped A high performance web server and a reverse proxy server.
Dec 31 08:40:15 amy systemd: Starting A high performance web server and a reverse proxy server...
Dec 31 08:40:15 amy systemd: Started A high performance web server and a reverse proxy server.
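The "is there a master but no workers?" check from step 4 is easy to script. Here is a minimal sketch, assuming the workers carry NGINX's usual "nginx: worker process" title. The function name count_workers is made up for illustration and is not part of any tool.

```shell
# count_workers
# Read `ps aux` output on stdin and print how many NGINX worker
# processes it lists (hypothetical helper, not an NGINX command).
# The [w] keeps the pattern from matching the grep process itself.
# grep -c exits non-zero when the count is 0, so `|| true` keeps the
# function's exit status clean while the count is still printed.
count_workers() {
    grep -c 'nginx: [w]orker process' || true
}
```

Used as `ps aux | count_workers`, the ps output above would yield 0, since only the master process is listed.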
5. Launched strace and attached it to the running NGINX process using its PID:
# strace -p 513 -s 10000 -v -f
6. In a different terminal, reloaded NGINX:
# systemctl reload nginx
Then strace's output showed why no workers were being spawned:
[pid 844] prctl(PR_SET_DUMPABLE, 1) = 0
[pid 844] chdir("/tmp/cores") = -1 ENOENT (No such file or directory)
[pid 844] write(16, "2021/12/31 08:38:08 [alert] 844#0: chdir(\"/tmp/cores\") failed (2: No such file or directory)\n", 93) = 93
[pid 846] fstat(20, <unfinished ...>
[pid 844] exit_group(2) = ?
[pid 844] +++ exited with 2 +++
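On a busier process the one failing call can get buried in the trace. This is a rough sketch of how to pull out only the failed syscalls, assuming the trace was saved to a file first (strace's -o option does that). The function name failed_syscalls and the file path in the usage note are hypothetical.

```shell
# failed_syscalls FILE
# Print only the lines of a saved strace log where a syscall
# returned -1, i.e. lines of the form "... = -1 ERRNAME (description)".
# Hypothetical helper for illustration, not part of strace.
failed_syscalls() {
    grep -E '= -1 E[A-Z0-9]+ \(' "$1"
}
```

For example, after saving a trace with something like `strace -p <PID> -f -o /tmp/nginx.strace`, running `failed_syscalls /tmp/nginx.strace` would keep the chdir line above and drop the successful calls.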
7. It turned out that /tmp/cores, a directory I had configured a long time ago to investigate a SIGSEGV I was having, had been deleted on reboot, so the workers were failing to spawn. Once I created the directory again, NGINX started responding to my requests.
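To keep this from coming back after the next reboot, the directory could be recreated before NGINX starts. This is only a sketch of the idea: ensure_dir is a hypothetical helper, and it does not deal with ownership or permissions, which may matter if the workers run as an unprivileged user.

```shell
# ensure_dir DIR
# Create DIR (and any missing parent directories) if it does not
# exist yet. Safe to run repeatedly: mkdir -p succeeds when DIR is
# already there. Hypothetical helper for illustration.
ensure_dir() {
    mkdir -p "$1"
}
```

Calling `ensure_dir /tmp/cores` before NGINX launches, for example from an ExecStartPre= line in the systemd unit, would be one way to wire it in.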
End of (sad) story. Half an hour I will never get back.
Happy New Year!