Certbot fails to start nginx

I am using kernel 3.10 on CentOS 7. I have successfully installed Certbot 1.10.1 as certbot-auto, following the alternative installation instructions. I have manually set certbot-auto to run automatically via systemd as follows:

/etc/systemd/system/certbot-renewal.service:

[Unit]
Description=Certbot Renewal

[Service]
ExecStart=/usr/local/bin/certbot-auto renew --pre-hook "service nginx stop" --post-hook "service nginx start" --quiet --agree-tos

/etc/systemd/system/certbot-renewal.timer:

[Unit]
Description=Timer for Certbot Renewal

[Timer]
OnBootSec=1h
OnUnitActiveSec=1d

[Install]
WantedBy=multi-user.target

Now, certbot-auto successfully renews the SSL certificates when needed. However, the problem is that certbot-auto fails to start nginx.

For example, whenever certbot-auto updates the certificates, my website goes down. If I connect via SSH, I see this:

[root@somedomain ~]# sudo systemctl status nginx
● nginx.service - SYSV: Nginx is an HTTP(S) server, HTTP(S) reverse proxy and IMAP/POP3 proxy server
   Loaded: loaded (/etc/rc.d/init.d/nginx; bad; vendor preset: disabled)
   Active: inactive (dead) since Wed 2021-04-14 16:40:56 UTC; 3min 14s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 5745 ExecStop=/etc/rc.d/init.d/nginx stop (code=exited, status=0/SUCCESS)
  Process: 5737 ExecStart=/etc/rc.d/init.d/nginx start (code=exited, status=0/SUCCESS)
 Main PID: 5708 (code=exited, status=0/SUCCESS)

Apr 14 16:40:56 somedomain.com systemd[1]: Starting SYSV: Nginx is an HTTP(S)....
Apr 14 16:40:56 somedomain.com systemd[1]: Started SYSV: Nginx is an HTTP(S) ....
Hint: Some lines were ellipsized, use -l to show in full.
[root@somedomain ~]# sudo systemctl start nginx
[root@somedomain ~]# sudo systemctl status nginx
● nginx.service - SYSV: Nginx is an HTTP(S) server, HTTP(S) reverse proxy and IMAP/POP3 proxy server
   Loaded: loaded (/etc/rc.d/init.d/nginx; bad; vendor preset: disabled)
   Active: active (running) since Wed 2021-04-14 16:44:45 UTC; 3s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 5745 ExecStop=/etc/rc.d/init.d/nginx stop (code=exited, status=0/SUCCESS)
  Process: 5809 ExecStart=/etc/rc.d/init.d/nginx start (code=exited, status=0/SUCCESS)
 Main PID: 5822 (nginx)
   CGroup: /system.slice/nginx.service
           ├─5822 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.c...
           ├─5823 nginx: worker process
           ├─5824 nginx: worker process
           ├─5826 nginx: cache manager process
           └─5827 nginx: cache loader process

Apr 14 16:44:45 somedomain.com systemd[1]: Starting SYSV: Nginx is an HTTP(S)....
Apr 14 16:44:45 somedomain.com nginx[5809]: Starting nginx: [  OK  ]
Apr 14 16:44:45 somedomain.com systemd[1]: Started SYSV: Nginx is an HTTP(S) ....
Hint: Some lines were ellipsized, use -l to show in full.

Looking at certbot logs does not show anything suspicious:

...

2021-04-14 16:40:46,329:INFO:certbot.compat.misc:Running pre-hook command: service nginx stop
2021-04-14 16:40:46,488:INFO:certbot.compat.misc:Output from pre-hook command service:
Stopping nginx (via systemctl):  [  OK  ] 
2021-04-14 16:40:46,492:DEBUG:certbot.display.util:Notifying user: Renewing an existing certificate for somedomain.com and 4 more domains

...

2021-04-14 16:40:48,149:DEBUG:certbot_nginx._internal.parser:Writing nginx conf tree to /etc/nginx/conf.d/somefile.conf:
...
2021-04-14 16:40:48,221:DEBUG:certbot_nginx._internal.configurator:nginx reload failed:
nginx: [error] open() "/run/nginx.pid" failed (2: No such file or directory) 

...

2021-04-14 16:40:52,071:DEBUG:acme.client:Storing nonce: ...
2021-04-14 16:40:52,072:DEBUG:certbot._internal.error_handler:Calling registered functions
2021-04-14 16:40:52,072:INFO:certbot._internal.auth_handler:Cleaning up challenges 

2021-04-14 16:40:55,267:DEBUG:certbot._internal.storage:Writing new config /etc/letsencrypt/renewal/somedomain.com.conf.new.
2021-04-14 16:40:56,316:DEBUG:certbot.display.util:Notifying user: new certificate deployed with reload of nginx server; fullchain is
/etc/letsencrypt/live/somedomain.com/fullchain.pem
2021-04-14 16:40:56,322:DEBUG:certbot._internal.plugins.selection:Requested authenticator nginx and installer nginx
2021-04-14 16:40:56,324:DEBUG:certbot._internal.plugins.selection:Selecting plugin: * nginx
Description: Nginx Web Server plugin
Interfaces: IAuthenticator, IInstaller, IPlugin
Entry point: nginx = certbot_nginx._internal.configurator:NginxConfigurator
Initialized: <certbot_nginx._internal.configurator.NginxConfigurator object at 0x7f75ee7ab250>
Prep: True
2021-04-14 16:40:56,325:DEBUG:certbot.display.util:Notifying user: 
Congratulations, all renewals succeeded. The following certs have been renewed:
  /etc/letsencrypt/live/somedomain.com/fullchain.pem (success)
2021-04-14 16:40:56,326:DEBUG:certbot._internal.renewal:no renewal failures
2021-04-14 16:40:56,326:INFO:certbot.compat.misc:Running post-hook command: service nginx start
2021-04-14 16:40:56,455:INFO:certbot.compat.misc:Output from post-hook command service:
Starting nginx (via systemctl):  [  OK  ] 

As you can see, the logs indicate that certbot was able to start nginx.

Looking at nginx logs:

... unrelated old entries
2021/04/14 16:40:46 [alert] 5188#0: *1395650 open socket #18 left in connection 10
2021/04/14 16:40:46 [alert] 5188#0: *1395649 open socket #13 left in connection 17
2021/04/14 16:40:46 [alert] 5188#0: aborting
2021/04/14 16:40:48 [notice] 5706#0: signal process started
2021/04/14 16:40:48 [error] 5706#0: open() "/run/nginx.pid" failed (2: No such file or directory)
2021/04/14 16:40:52 [notice] 5715#0: signal process started
2021/04/14 16:40:55 [notice] 5720#0: signal process started 

Nothing looks suspicious to me here either. nginx seems to have been started.

Any idea what could be wrong? Or what I could check?


Welcome to the Let's Encrypt Community, Alex :slightly_smiling_face:

What are the contents of these?

/etc/letsencrypt/renewal/somedomain.com.conf
/etc/letsencrypt/renewal/somedomain.com.conf.new

What is the output of this?

sudo certbot-auto certificates


Just in case: certbot renews certificates without any issues. E.g.:

  1. certbot renews the certificates;
  2. The website is down (nginx is not running);
  3. I connect via SSH and run sudo systemctl start nginx;
  4. The website is running and the new certificate is used.

Anyway:

[root@somedomain ~]# sudo /usr/local/bin/certbot-auto certificates
Your system is not supported by certbot-auto anymore.
certbot-auto and its Certbot installation will no longer receive updates.
You will not receive any bug fixes including those fixing server compatibility
or security problems.
Please visit https://certbot.eff.org/ to check for other alternatives.
Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Found the following certs:
  Certificate Name: somedomain.com
    Serial Number: 3...0
    Key Type: RSA
    Domains: somedomain.com sub1.somedomain.com sub2.somedomain.com sub3.somedomain.com www.somedomain.com
    Expiry Date: 2021-07-13 15:41:35+00:00 (VALID: 89 days)
    Certificate Path: /etc/letsencrypt/live/somedomain.com/fullchain.pem
    Private Key Path: /etc/letsencrypt/live/somedomain.com/privkey.pem
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[root@somedomain ~]#

The /etc/letsencrypt/renewal/somedomain.com.conf.new file does not exist.

The /etc/letsencrypt/renewal/somedomain.com.conf file is:

# renew_before_expiry = 30 days
version = 1.10.1
archive_dir = /etc/letsencrypt/archive/somedomain.com
cert = /etc/letsencrypt/live/somedomain.com/cert.pem
privkey = /etc/letsencrypt/live/somedomain.com/privkey.pem
chain = /etc/letsencrypt/live/somedomain.com/chain.pem
fullchain = /etc/letsencrypt/live/somedomain.com/fullchain.pem

# Options used in the renewal process
[renewalparams]
authenticator = nginx
installer = nginx
account = f...e
server = https://acme-v02.api.letsencrypt.org/directory
post_hook = service nginx start
manual_public_ip_logging_ok = None
pre_hook = service nginx stop

Well... I'm not sure how you can successfully acquire a certificate by:

  1. stopping nginx
  2. using nginx to acquire a certificate
  3. starting nginx

Your renewal configuration file contains exactly the pre and post hooks that would cause problems. Not to worry though, we can straighten that out.
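
To see at a glance which hooks a renewal configuration file carries, a small helper along these lines may be useful. This is only a sketch: the function name list_hooks is hypothetical, and it assumes hook settings appear one per line as "name_hook = command", as in the file shown above.

```shell
#!/bin/sh
# list_hooks FILE
# Print every "<name>_hook = ..." line found in a Certbot renewal
# configuration file. The function name is hypothetical (not part of Certbot).
list_hooks() {
    grep -E '^[[:space:]]*[a-z_]*hook[[:space:]]*=' "$1"
}
```

For example, running list_hooks /etc/letsencrypt/renewal/somedomain.com.conf against the file above should print the pre_hook and post_hook lines.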


Firstly, make sure nginx is actually running.

sudo systemctl start nginx


Secondly, remove the existing certificate and its broken configuration file.

sudo certbot-auto delete --cert-name somedomain.com

Do NOT reload/restart nginx yet!


Thirdly, test acquisition of a new certificate using nginx.

sudo certbot-auto certonly --nginx -d "somedomain.com,www.somedomain.com,sub1.somedomain.com,sub2.somedomain.com,sub3.somedomain.com" --dry-run

If certbot-auto complains about your nginx configuration here, we will need to make some adjustments to proceed.


Fourthly, acquire a new certificate and write the correct renewal configuration file.

sudo certbot-auto certonly --nginx -d "somedomain.com,www.somedomain.com,sub1.somedomain.com,sub2.somedomain.com,sub3.somedomain.com" --deploy-hook "sudo nginx -s reload"


Fifthly, test your renewal.

sudo certbot-auto renew --dry-run


Sixthly, run an actual renewal ONLY ONCE:

sudo certbot-auto renew --force-renewal

The --force-renewal flag is very dangerous because it acquires a new certificate even if the current certificate is not due for renewal. Using it repeatedly will get you rate-limited for a week. Unfortunately, it's the only decent way to test a deployment hook.

Your actual renewal command in your systemd service (or cron job) should look something like this:

sudo certbot-auto renew -q


@griffin, you missed only one step:
Remove the --pre-hook and --post-hook options from the "service" (/etc/systemd/system/certbot-renewal.service).

:eyes: :eyes: :eyes:


Oh... :hushed: Good catch. I removed it from the renewal configuration file (by deleting the file itself), but I missed that it was elsewhere. No wonder things are working strangely.

@GunSmoker

In addition to what @rg305 just mentioned, you need to also remove --agree-tos from /etc/systemd/system/certbot-renewal.service.


[quote="griffin, post:4, topic:149712"]Well... I'm not sure how you can successfully acquire a certificate by:

  1. stopping nginx
  2. using nginx to acquire a certificate
  3. starting nginx

Your renewal configuration file contains exactly the pre and post hooks that would cause problems.[/quote]

I am pretty sure that was in some tutorial that I tried to follow. For example, googling indicates that it is mentioned here: User Guide — Certbot 1.11.0.dev0 documentation

I can also confirm that --dry-run executes without issues.

Perhaps it has something to do with the fact that I am using certbot-auto?

In logs:

2021-04-14 16:34:02,572:DEBUG:certbot._internal.storage:Should renew, less than 30 days before certificate expiry 2021-05-14 10:37:40 UTC.
2021-04-14 16:34:02,572:INFO:certbot._internal.renewal:Cert is due for renewal, auto-renewing...
2021-04-14 16:34:02,572:INFO:certbot._internal.renewal:Non-interactive renewal: random delay of 402.506239019 seconds
2021-04-14 16:40:45,179:DEBUG:certbot._internal.plugins.selection:Requested authenticator nginx and installer nginx
2021-04-14 16:40:45,667:WARNING:certbot_nginx._internal.configurator:NGINX configured with OpenSSL alternatives is not officially supported by Certbot.
2021-04-14 16:40:45,672:DEBUG:certbot._internal.plugins.selection:Single candidate plugin: * nginx
Description: Nginx Web Server plugin
Interfaces: IAuthenticator, IInstaller, IPlugin
Entry point: nginx = certbot_nginx._internal.configurator:NginxConfigurator
Initialized: <certbot_nginx._internal.configurator.NginxConfigurator object at 0x7f75ee7ab250>
Prep: True
2021-04-14 16:40:45,674:DEBUG:certbot._internal.plugins.selection:Single candidate plugin: * nginx
Description: Nginx Web Server plugin
Interfaces: IAuthenticator, IInstaller, IPlugin
Entry point: nginx = certbot_nginx._internal.configurator:NginxConfigurator
Initialized: <certbot_nginx._internal.configurator.NginxConfigurator object at 0x7f75ee7ab250>
Prep: True
2021-04-14 16:40:45,674:DEBUG:certbot._internal.plugins.selection:Selected authenticator <certbot_nginx._internal.configurator.NginxConfigurator object at 0x7f75ee7ab250> and installer <certbot_nginx._internal.configurator.NginxConfigurator object at 0x7f75ee7ab250>
2021-04-14 16:40:45,675:INFO:certbot._internal.plugins.selection:Plugins selected: Authenticator nginx, Installer nginx
2021-04-14 16:40:45,684:DEBUG:certbot._internal.main:Picked account: <Account(RegistrationResource(body=Registration(status=None, terms_of_service_agreed=None, agreement=u'https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf', only_return_existing=None, contact=(u'mailto:masked@somedomain.com',), key=JWKRSA(key=<ComparableRSAKey(<cryptography.hazmat.backends.openssl.rsa._RSAPublicKey object at 0x7f75f004ee10>)>), external_account_binding=None), uri=u'https://acme-v01.api.letsencrypt.org/acme/reg/25059924', new_authzr_uri=u'https://acme-v01.api.letsencrypt.org/acme/new-authz', terms_of_service=u'https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf'), fd59c4236656cee1f91930077bfe254e, Meta(creation_host=u'somedomain.com', register_to_eff=None, creation_dt=datetime.datetime(2017, 11, 29, 15, 4, 4, tzinfo=<UTC>)))> 
...
2021-04-14 16:40:46,329:INFO:certbot.compat.misc:Running pre-hook command: service nginx stop
2021-04-14 16:40:46,488:INFO:certbot.compat.misc:Output from pre-hook command service:
Stopping nginx (via systemctl):  [  OK  ]
2021-04-14 16:40:46,492:DEBUG:certbot.display.util:Notifying user: Renewing an existing certificate for somedomain.com and 4 more domains
2021-04-14 16:40:46,746:DEBUG:certbot.crypto_util:Generating RSA key (2048 bits): /etc/letsencrypt/keys/0016_key-certbot.pem
2021-04-14 16:40:46,751:DEBUG:certbot.crypto_util:Creating CSR: /etc/letsencrypt/csr/0016_csr-certbot.pem
2021-04-14 16:40:46,753:DEBUG:acme.client:Requesting fresh nonce 
2021-04-14 16:40:46,753:DEBUG:acme.client:Sending HEAD request to https://acme-v02.api.letsencrypt.org/acme/new-nonce.
2021-04-14 16:40:46,906:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "HEAD /acme/new-nonce HTTP/1.1" 200 0
2021-04-14 16:40:46,908:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Date: Wed, 14 Apr 2021 16:41:28 GMT
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: 0003HZJAjYFpCFo_sxtkEfUIaZRClhRFl55lSuMQmT81mZY
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800 
2021-04-14 16:40:46,908:DEBUG:acme.client:Storing nonce: 0003HZJAjYFpCFo_sxtkEfUIaZRClhRFl55lSuMQmT81mZY 

As you can see, stopping the server does not seem to be a problem.

Now that I am looking closely at this: it seems the docs mention --pre-hook "service nginx stop" only for the standalone plugin. However, my logs seem to suggest that the standalone plugin is not used; the nginx plugin is used instead. So I assume that a server stop/start/restart is not needed in my case. And perhaps the nginx plugin keeps some references to nginx while the stop/start commands are executed? Could that be the problem?

I will try to follow your advice, thanks...
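
The distinction between the two plugins can be sketched as a tiny helper that prints which hook options suit which authenticator. This is illustrative only: the function name hook_args is hypothetical, the stop/start commands are the ones from this thread, and the reload command is the one suggested above. The reasoning it encodes is that the standalone plugin listens on port 80 itself (so the web server has to be out of the way), while the nginx plugin works through nginx (so nginx should stay up and only needs a reload afterwards).

```shell
#!/bin/sh
# hook_args PLUGIN
# Print the hook options that fit the given authenticator plugin.
# Hypothetical helper for illustration; it only prints text and runs nothing.
hook_args() {
    case "$1" in
        standalone)
            # standalone binds port 80 itself, so nginx must be stopped
            # before the run and started again afterwards
            printf '%s\n' '--pre-hook "service nginx stop" --post-hook "service nginx start"'
            ;;
        nginx)
            # the nginx plugin works through nginx, so leave it running
            # and only reload it once a new certificate is deployed
            printf '%s\n' '--deploy-hook "nginx -s reload"'
            ;;
        *)
            # unknown plugin: print nothing and signal failure
            return 1
            ;;
    esac
}
```

Something like echo "certbot-auto renew --quiet $(hook_args nginx)" would then print a command line to review and paste, rather than executing it.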


I think I got this working in two steps:

  1. Removing pre_hook/post_hook and adding deploy_hook = sudo nginx -s reload inside /etc/letsencrypt/renewal/somedomain.com.conf.
  2. Replacing command-line in /etc/systemd/system/certbot-renewal.service with ExecStart=/usr/local/bin/certbot-auto renew --quiet
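
For reference, step 2 could be scripted roughly as follows. This is a sketch under assumptions: the function name write_renewal_service is hypothetical, and the unit content simply mirrors the service file from the first post with only the ExecStart line changed.

```shell
#!/bin/sh
# write_renewal_service DEST
# Write a hook-free certbot-renewal.service unit to DEST.
# Hypothetical helper; the unit mirrors the one posted earlier in the thread.
write_renewal_service() {
    cat > "$1" <<'EOF'
[Unit]
Description=Certbot Renewal

[Service]
ExecStart=/usr/local/bin/certbot-auto renew --quiet
EOF
}
```

As root, this would be called as write_renewal_service /etc/systemd/system/certbot-renewal.service, followed by systemctl daemon-reload so that systemd picks up the edited unit.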

Well, at least sudo /usr/local/bin/certbot-auto renew --force-renewal executed OK and the server started using the new certificate (without dying in the process).

Thanks again!


Glad things are working as expected now!

:partying_face:

The steps you identified are indeed the shortcut. :wink:


The piece of documentation you're referring to is only an example of one of many uses of the pre and post hooks, and by far not a recommended method. In 999 of 1000 situations there is a better and/or easier solution, so I believe you've misinterpreted that part of the documentation.

But luckily you and @griffin (thanks!) managed to get everything working :smiley:

