Bug: installing with certbot impedes further nginx conf changes without reboot

Here goes a play-by-play of the test session:

$ sudo reboot
[...]
$ ls -l /run/nginx.pid
-rw-r--r-- 1 root root 4 Nov  1 09:14 /run/nginx.pid
$ sudo service nginx status
   returns satisfactorily
$ sudo ps -eF | grep -E "nginx|PID"
UID          PID    PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
root         757       1  0 27379  5576   0 09:14 ?        00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data     760     757  0 27519 13352   0 09:14 ?        00:00:00 nginx: worker process
jerdvo      1120    1035  0  2039  2428   0 09:17 pts/0    00:00:00 grep --color=auto -E nginx|PID
$ sudo service nginx restart
~$
$ sudo vim /etc/nginx/sites-enabled/default
# added new server name
$ sudo nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
$ sudo service nginx restart
$ ls -l /run/nginx.pid
-rw-r--r-- 1 root root 5 Nov  1 09:20 /run/nginx.pid
$ sudo ps -eF | grep -E "nginx|PID"
UID          PID    PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
root        1339       1  0 27379  5572   0 09:20 ?        00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data    1342    1339  0 27519 11220   0 09:20 ?        00:00:00 nginx: worker process
jerdvo      1348    1035  0  2039  2448   0 09:21 pts/0    00:00:00 grep --color=auto -E nginx|PID
$ sudo service nginx status
● nginx.service - A high performance web server and a reverse proxy server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2021-11-01 09:20:17 UTC; 4min 2s ago
       Docs: man:nginx(8)
    Process: 1313 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
    Process: 1325 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
   Main PID: 1339 (nginx)
      Tasks: 16 (limit: 1136)
     Memory: 14.1M
    CGroup: /system.slice/nginx.service
             ├─1326 Passenger watchdog
             ├─1329 Passenger core
             ├─1339 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
             └─1342 nginx: worker process

Nov 01 09:20:17 [...] systemd[1]: nginx.service: Succeeded.
Nov 01 09:20:17 [...] systemd[1]: Stopped A high performance web server and a reverse proxy server.
Nov 01 09:20:17 [...] systemd[1]: Starting A high performance web server and a reverse proxy server...
Nov 01 09:20:17 [...] systemd[1]: Started A high performance web server and a reverse proxy server.

Now invoking certbot:

$ sudo certbot --nginx -d testthree.fidely.club
#  [...]  Successfully received certificate. [...]  Deploying certificate
$ ls -l /run/nginx.pid
-rw-r--r-- 1 root root 5 Nov  1 09:25 /run/nginx.pid
$ sudo service nginx status
● nginx.service - A high performance web server and a reverse proxy server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
     Active: failed (Result: core-dump) since Mon 2021-11-01 09:25:34 UTC; 1min 3s ago
       Docs: man:nginx(8)
    Process: 1313 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
    Process: 1325 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
   Main PID: 1339 (code=dumped, signal=SEGV)
      Tasks: 0 (limit: 1136)
     Memory: 1.6M
     CGroup: /system.slice/nginx.service

Nov 01 09:20:17 [...] systemd[1]: nginx.service: Succeeded.
Nov 01 09:20:17 [...] systemd[1]: Stopped A high performance web server and a reverse proxy server.
Nov 01 09:20:17 [...] systemd[1]: Starting A high performance web server and a reverse proxy server...
Nov 01 09:20:17 [...] systemd[1]: Started A high performance web server and a reverse proxy server.
Nov 01 09:25:34 [...] systemd[1]: nginx.service: Main process exited, code=dumped, status=11/SEGV
Nov 01 09:25:34 [...] systemd[1]: nginx.service: Killing process 1443 (nginx) with signal SIGKILL.
Nov 01 09:25:34 [...] systemd[1]: nginx.service: Killing process 1443 (nginx) with signal SIGKILL.
Nov 01 09:25:34 [...] systemd[1]: nginx.service: Failed with result 'core-dump'.
$ sudo ps -eF | grep -E "nginx|PID"
UID          PID    PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
root        1478       1  0 27646 18180   0 09:25 ?        00:00:00 nginx: master process nginx -c /etc/nginx/nginx.conf
www-data    1501    1478  0 27750 14996   0 09:25 ?        00:00:00 nginx: worker process
jerdvo      1519    1035  0  2039  2584   0 09:27 pts/0    00:00:00 grep --color=auto -E nginx|PID
$ ls -l /run/nginx.pid
-rw-r--r-- 1 root root 5 Nov  1 09:25 /run/nginx.pid

The nginx master process is now running with a different command line, nginx -c /etc/nginx/nginx.conf, compared to the original state and to a restart without certbot's intervention: /usr/sbin/nginx -g daemon on; master_process on;
At this point we are in the failing state.
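
One way to confirm this divergence is to compare the PID stored in /run/nginx.pid with the main PID that systemd tracks for the unit. The following is only a sketch: compare_pids is a hypothetical helper name, not part of nginx or certbot, and the commented usage assumes a systemd recent enough to support the --value option of systemctl show.

```shell
#!/bin/sh
# Hypothetical helper: compare the PID from a pidfile with systemd's MainPID.
# Prints "match" when both are non-empty and equal, "mismatch" otherwise.
compare_pids() {
    pidfile_pid=$1   # contents of /run/nginx.pid (empty if the file is missing)
    systemd_pid=$2   # MainPID as reported by systemd (0 when the unit is not running)
    if [ -n "$pidfile_pid" ] && [ "$pidfile_pid" = "$systemd_pid" ]; then
        echo match
    else
        echo mismatch
    fi
}

# Usage on the affected machine (assumed invocation, not taken from the session above):
#   compare_pids "$(cat /run/nginx.pid 2>/dev/null)" \
#                "$(systemctl show -p MainPID --value nginx)"
```

A "mismatch" here would be consistent with the ps output above, where the running master was not started from the unit's ExecStart line.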

$ sudo service nginx restart
Job for nginx.service failed because the control process exited with error code.
See "systemctl status nginx.service" and "journalctl -xe" for details.
$ systemctl status nginx.service
● nginx.service - A high performance web server and a reverse proxy server
     Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2021-11-01 09:33:44 UTC; 34s ago
       Docs: man:nginx(8)
    Process: 1874 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
    Process: 1875 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=1/FAILURE)

$ journalctl -xe
Nov 01 09:15:56 [...] sshd[1032]: Received disconnect from 5.171.89.128 port 17016:11: disconnected by user
Nov 01 09:15:56 [...] sshd[1032]: Disconnected from user jerdvo 5.171.89.128 port 17016
Nov 01 09:29:18 [...] sshd[1600]: Received disconnect from 5.171.89.128 port 17032:11: disconnected by user
Nov 01 09:29:18 [...] sshd[1600]: Disconnected from user jerdvo 5.171.89.128 port 17032
Nov 01 09:29:28 [...] sshd[1681]: Received disconnect from 5.171.89.128 port 16499:11: disconnected by user
Nov 01 09:29:28 [...] sshd[1681]: Disconnected from user jerdvo 5.171.89.128 port 16499
Nov 01 09:31:29 [...] sshd[1778]: Received disconnect from 5.171.89.128 port 17101:11: disconnected by user
Nov 01 09:31:29 [...] sshd[1778]: Disconnected from user jerdvo 5.171.89.128 port 17101
Nov 01 09:31:58 [...] sshd[1858]: Received disconnect from 5.171.89.128 port 16861:11: disconnected by user
Nov 01 09:31:58 [...] sshd[1858]: Disconnected from user jerdvo 5.171.89.128 port 16861

@_az's suggestion, nginx -s reload, was tried next. Alas...

$ ls -l /run/nginx.pid
ls: cannot access '/run/nginx.pid': No such file or directory
$ sudo nginx -s reload
nginx: [error] open() "/run/nginx.pid" failed (2: No such file or directory)
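
Since the pidfile is gone, nginx -s has nothing to signal. If a master process started outside systemd is still alive at this point (not verified in this session), one possible workaround short of a reboot is to find it by name, ask it to quit, and then start the service through systemd again. This is an untested sketch: nginx_master_pids is a hypothetical helper, and the commented recovery commands are an assumption about what might help, not a confirmed fix.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eo pid=,args=` output on stdin and print
# the PIDs of nginx master processes (lines like "1478 nginx: master process ...").
nginx_master_pids() {
    awk '$2 == "nginx:" && $3 == "master" { print $1 }'
}

# Possible recovery on the affected machine (assumed, run with care):
#   ps -eo pid=,args= | nginx_master_pids | xargs -r sudo kill -QUIT
#   sudo systemctl start nginx
```

QUIT asks nginx to shut down gracefully; xargs -r (GNU xargs) skips the kill when no master process is found.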

I enclose the letsencrypt log, letsencrypt_log.txt (54.5 KB),
as well as process.txt (2.7 KB),
a file that documents the steps taken in creating the VPS before invoking certbot.
I believe this allows the issue to be fully reproduced.
