Bug: installing with certbot impedes further nginx conf changes without reboot

/home/jerdvo/.rbenv/plugins/ruby-build/bin:/home/jerdvo/.rbenv/shims:/home/jerdvo/.rbenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

$ which -a nginx
/usr/sbin/nginx
/sbin/nginx

Thank you, that clarifies the situation.
Until we know what certbot touches improperly and that gets fixed, certbot --webroot is the only usable option in production.
Rebooting could be acceptable in development, which also provides the syntax for .conf files.

We know that certbot might start nginx directly which causes problems in systemd distros like yours (and mine :)). And, _az promised to look at that.

But, this "feels" to me like an inherent issue related to certbot modifying your nginx.conf files while nginx is running. Does passenger or any other monitoring system auto-restart nginx when it detects a change to nginx.conf? Something like that would explain all the facts - especially the troubling unknown pid in the systemd segv error message.

If such an auto-restart is happening, certbot --webroot (or a different ACME client) is your only option, as certbot --nginx will always update the nginx.conf files, both for new issuance and for renewal.
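
For reference, a minimal sketch of the webroot approach. The webroot path and domain are placeholders, and the function name and the `CERTBOT` override variable are hypothetical additions so the arguments can be previewed without contacting the CA; `certonly`, `--webroot`, `-w` and `-d` are standard certbot options.

```shell
# Request a certificate without letting certbot edit or reload nginx.
# $1: webroot directory that nginx already serves, $2: domain name.
# CERTBOT is a hypothetical override; set CERTBOT=echo to preview the arguments.
issue_via_webroot() {
    ${CERTBOT:-certbot} certonly --webroot -w "$1" -d "$2"
}

# Example (placeholder values, typically run as root):
#   issue_via_webroot /var/www/html example.com
```

With this method certbot does not reload nginx, so a reload after renewal has to be arranged separately, for example through certbot's `--deploy-hook` option.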


@rg305

grep: /etc/shadow: Permission denied
grep: /etc/ufw/before6.rules: Permission denied
grep: /etc/ufw/after.init: Permission denied
grep: /etc/ufw/user.rules: Permission denied
grep: /etc/ufw/after.rules: Permission denied
grep: /etc/ufw/after6.rules: Permission denied
grep: /etc/ufw/before.rules: Permission denied
grep: /etc/ufw/before.init: Permission denied
grep: /etc/ufw/user6.rules: Permission denied
grep: /etc/gshadow-: Permission denied
grep: /etc/ssh/ssh_host_ed25519_key: Permission denied
grep: /etc/ssh/ssh_host_rsa_key: Permission denied
grep: /etc/ssh/ssh_host_ecdsa_key: Permission denied
grep: /etc/ssh/ssh_host_dsa_key: Permission denied
grep: /etc/gshadow: Permission denied
grep: /etc/polkit-1/localauthority: Permission denied
/etc/rc2.d/S01nginx:# Try to extract nginx pidfile
/etc/rc2.d/S01nginx:    PID=/run/nginx.pid
/etc/rc4.d/S01nginx:# Try to extract nginx pidfile
/etc/rc4.d/S01nginx:    PID=/run/nginx.pid
grep: /etc/iscsi/iscsid.conf: Permission denied
grep: /etc/iscsi/initiatorname.iscsi: Permission denied
grep: /etc/letsencrypt/keys: Permission denied
grep: /etc/letsencrypt/archive: Permission denied
grep: /etc/letsencrypt/live: Permission denied
grep: /etc/letsencrypt/accounts: Permission denied
grep: /etc/redis/redis.conf: Permission denied
grep: /etc/sudoers: Permission denied
grep: /etc/ssl/private: Permission denied
/etc/init.d/nginx:# Try to extract nginx pidfile
/etc/init.d/nginx:      PID=/run/nginx.pid
grep: /etc/at.deny: Permission denied
/etc/rc1.d/K01nginx:# Try to extract nginx pidfile
/etc/rc1.d/K01nginx:    PID=/run/nginx.pid
grep: /etc/.pwd.lock: Permission denied
/etc/nginx/nginx.conf:pid /run/nginx.pid;
grep: /etc/security/opasswd: Permission denied
/etc/rc6.d/K01nginx:# Try to extract nginx pidfile
/etc/rc6.d/K01nginx:    PID=/run/nginx.pid
grep: /etc/sudoers.d: Permission denied
/etc/rc3.d/S01nginx:# Try to extract nginx pidfile
/etc/rc3.d/S01nginx:    PID=/run/nginx.pid
/etc/rc5.d/S01nginx:# Try to extract nginx pidfile
/etc/rc5.d/S01nginx:    PID=/run/nginx.pid
/etc/systemd/system/multi-user.target.wants/nginx.service:PIDFile=/run/nginx.pid
/etc/systemd/system/multi-user.target.wants/nginx.service:ExecStop=-/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid
grep: /etc/shadow-: Permission denied
/etc/rc0.d/K01nginx:# Try to extract nginx pidfile
/etc/rc0.d/K01nginx:    PID=/run/nginx.pid

@MikeMcQ

I am not aware of such passenger behaviour; its role is related to the associated application (thus as sub-component of the nginx.conf file). However I am light years away from being an expert.

With the attached file though, the case should be replicable for capable hands.

LOL
Maybe you would get better output with sudo.
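
The exact command behind the listing above is not shown in the thread. As a sketch of one way to get the same information with less noise, `grep -s` suppresses the "Permission denied" messages, and running it under sudo additionally lets it read the protected files. The function name is made up for illustration.

```shell
# List every reference to the nginx pidfile under a directory.
# $1: directory to search (defaults to /etc).
# -r recurses into subdirectories, -s silences errors about unreadable files.
find_pidfile_refs() {
    grep -rs 'nginx\.pid' "${1:-/etc}"
}

# Example: sudo grep -rs 'nginx\.pid' /etc
```

Because the dot is escaped here, only literal `nginx.pid` references match, so comment lines such as "Try to extract nginx pidfile" drop out of the result.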

The three most important ones agree on the same location:

/etc/nginx/nginx.conf:pid /run/nginx.pid;
/etc/systemd/system/multi-user.target.wants/nginx.service:PIDFile=/run/nginx.pid
/etc/systemd/system/multi-user.target.wants/nginx.service:ExecStop=-/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid

Yes, and under sudo all the others agree:

/etc/rc2.d/S01nginx:# Try to extract nginx pidfile
/etc/rc2.d/S01nginx:    PID=/run/nginx.pid
/etc/rc4.d/S01nginx:# Try to extract nginx pidfile
/etc/rc4.d/S01nginx:    PID=/run/nginx.pid
/etc/init.d/nginx:# Try to extract nginx pidfile
/etc/init.d/nginx:      PID=/run/nginx.pid
/etc/rc1.d/K01nginx:# Try to extract nginx pidfile
/etc/rc1.d/K01nginx:    PID=/run/nginx.pid
/etc/nginx/nginx.conf:pid /run/nginx.pid;
/etc/rc6.d/K01nginx:# Try to extract nginx pidfile
/etc/rc6.d/K01nginx:    PID=/run/nginx.pid
/etc/rc3.d/S01nginx:# Try to extract nginx pidfile
/etc/rc3.d/S01nginx:    PID=/run/nginx.pid
/etc/rc5.d/S01nginx:# Try to extract nginx pidfile
/etc/rc5.d/S01nginx:    PID=/run/nginx.pid
/etc/systemd/system/multi-user.target.wants/nginx.service:PIDFile=/run/nginx.pid
/etc/systemd/system/multi-user.target.wants/nginx.service:ExecStop=-/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid
/etc/rc0.d/K01nginx:# Try to extract nginx pidfile
/etc/rc0.d/K01nginx:    PID=/run/nginx.pid

The nginx segfault is not easily reproduced. We would be seeing it a vast number of times per day if it were common.

Maybe look in /var/log/dmesg for clues (search for nginx and/or segfault)? If you don't see anything, upload the file and maybe we will spot something helpful.
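
A small sketch of that search, assuming a readable log file; the function name is hypothetical.

```shell
# Print log lines that mention nginx or a segfault, case-insensitively.
# $1: log file to scan (defaults to /var/log/dmesg).
find_nginx_crashes() {
    grep -iE 'nginx|segfault' "${1:-/var/log/dmesg}"
}

# Example: find_nginx_crashes /var/log/dmesg
# Live kernel buffer: sudo dmesg | grep -iE 'nginx|segfault'
```

On some systems /var/log/dmesg may only hold boot-time messages, so the live kernel buffer (`sudo dmesg`) or `journalctl -k` may be worth scanning as well.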


Those strings do not appear in that log. I enclose the contents nonetheless: dmesg.txt (47.6 KB)

Note: with the process file provided, I consistently generate that error, on at least 6 VPS instances now.

Just to be clear to anyone trying to keep up with this topic...
Which versions of certbot and nginx are you using?


$ nginx -v
nginx version: nginx/1.18.0 (Ubuntu)
$ certbot --version
certbot 1.20.0

OK now I'm curious - LOL

To round that off (so even I can put this in a lab):
Which version of Ubuntu?
Were both nginx and certbot installed from apt?
OR was certbot installed via snap?
[OR other... like either or both were compiled from source]


Ubuntu 20.04.
nginx installed via apt
certbot installed via snap (freshly; I even ensured that sudo apt-get remove certbot ran beforehand - it drew a blank)


Also see the process.txt from the earlier post #35 for other package details


Do you have the Perl module enabled in nginx? i.e. Is /etc/nginx/modules-enabled/50-mod-http-perl.conf present?

nginx's master process segfaulting would explain some things. We've had numerous other reports of that module causing segfaults on reload, on Ubuntu servers.

Try disabling it, if it's there.

If that doesn't help, it would be handy if you could gdb attach to the nginx master process before it crashes, and provide a backtrace of the segfault.
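
A rough sketch of how such a gdb session could be started. The helper name is hypothetical, and the default pidfile path is the /run/nginx.pid location found earlier in this thread.

```shell
# Print the PID of the nginx master process as recorded in the pidfile.
# $1: pidfile path (defaults to /run/nginx.pid).
nginx_master_pid() {
    cat "${1:-/run/nginx.pid}"
}

# Hypothetical session (needs root and gdb installed):
#   sudo gdb -p "$(nginx_master_pid)"
#   (gdb) continue     # let nginx run, then trigger the crash from another terminal
#   (gdb) bt full      # once gdb reports SIGSEGV, print the backtrace
```

The backtrace is usually more readable when debug symbols for nginx and its modules are installed, if the distribution provides them.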


Best to have OP answer but this was in the letsencrypt log. Is that sufficient info?

2021-11-01 09:25:31,246:DEBUG:certbot.reverter:Creating backup of /etc/nginx/modules-enabled/50-mod-http-perl.conf


Yes, nicely spotted. Try removing that file @dvo, restarting nginx, then try the entire process again.


Yes. I left the file in place but disabled its only line:
# load_module modules/ngx_http_perl_module.so;
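
For anyone repeating this, commenting out the load_module line could be scripted roughly as follows. This is a sketch, not the exact steps taken in this thread; the function name is made up, and the `-i.bak` backup suffix is an arbitrary choice.

```shell
# Comment out every active load_module line in one module config file.
# $1: path to the file, e.g. /etc/nginx/modules-enabled/50-mod-http-perl.conf
# A copy of the original is kept next to it with a .bak suffix.
disable_module_conf() {
    sed -i.bak -E 's/^([[:space:]]*)(load_module[[:space:]])/\1# \2/' "$1"
}

# Example (as root), with a syntax check before restarting:
#   disable_module_conf /etc/nginx/modules-enabled/50-mod-http-perl.conf
#   nginx -t && service nginx restart
```

The file under modules-enabled may be a symlink; GNU `sed -i` would then replace the link with a regular file, which still works but is worth knowing before running it.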

# added new server name
$ sudo service nginx restart
~$
$ ls -l /run/nginx.pid
-rw-r--r-- 1 root root 5 Nov  2 06:11 /run/nginx.pid
$ sudo ps -eF | grep -E "nginx|PID"
UID          PID    PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
root        1137       1  0 26306  4668   0 06:11 ?        00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data    1138    1137  0 26448  8828   0 06:11 ?        00:00:00 nginx: worker process
jerdvo      1159    1041  0  2039  2520   0 06:13 pts/0    00:00:00 grep --color=auto -E nginx|PID

$ sudo certbot --nginx -d testthree.fidely.club
#  [...]  Successfully received certificate. [...]  Deploying certificate
$ ls -l /run/nginx.pid
-rw-r--r-- 1 root root 5 Nov  2 06:17 /run/nginx.pid
$ sudo ps -eF | grep -E "nginx|PID"
UID          PID    PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
root        1419       1  0 26506 14428   0 06:17 ?        00:00:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
www-data    1746    1419  0 26566  9316   0 06:24 ?        00:00:00 nginx: worker process
jerdvo      1825    1041  0  2039  2516   0 06:25 pts/0    00:00:00 grep --color=auto -E nginx|PID

$ sudo service nginx restart
~$

huzzah

So letsencrypt.log still shows a backup of that file, but now with its load_module directive disabled:

2021-11-02 06:24:37,789:DEBUG:certbot.reverter:Creating backup of /etc/nginx/modules-enabled/50-mod-http-cache-purge.conf
2021-11-02 06:24:37,789:DEBUG:certbot.reverter:Creating backup of /etc/nginx/modules-enabled/50-mod-http-perl.conf
2021-11-02 06:24:37,789:DEBUG:certbot.reverter:Creating backup of /etc/nginx/modules-enabled/50-mod-http-xslt-filter.conf

Further changes to conf files pass nginx tests and the service restarts.

