Certbot killing apache


#1

Hi, I have several installations of certbot working fine, but on one server certbot is killing Apache each night.
That server (Ubuntu 16.04.5 x64) has Apache running on port 80 and Nginx on 443. When the renewal runs, Apache gets killed because:

[Tue Aug 28 23:58:44.650612 2018] [mpm_prefork:notice] [pid 23067] AH00171: Graceful restart requested, doing restart
AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
(98)Address already in use: AH00072: make_sock: could not bind to address [::]:443
(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:443
[Tue Aug 28 23:58:44.980410 2018] [mpm_prefork:alert] [pid 23067] no listening sockets available, shutting down

This is expected, since port 443 belongs to Nginx. At this point I have 2 problems:

  • how to stop certbot from modifying my ports.conf and adding port 443 each time
  • how to find out what is starting certbot.

The second point is more important than the first (and may also solve it): I have no clue who the hell is starting certbot. I have NOT scheduled anything in my crontabs (for any user); I saw the script in the cron.daily directory and removed it. I restarted the server a dozen times.
Still, each night I see the logs of certbot trying to renew the cert and killing Apache.
I can run certbot manually when needed, no problem with that.
I would simply like to remove the automatic renewal that starts at night. This must be related to one of the latest updates, since the setup had been working for some months.

I’m running certbot 0.26.1 installed via packages as per instructions.
Thanks


#2

Hi @LoZio

It looks like you are using tls-sni-01 validation, which is deprecated. Perhaps switch to http-01 validation; then you can use your running webserver with the --webroot option, and no new program or port 443 is required.

There are a lot of places a job can be scheduled from: cron, at, systemd.
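
A rough sketch of how one might search those three places for anything mentioning certbot. The cron paths are the usual Debian/Ubuntu defaults and are an assumption here; the function name is made up for illustration. Run it as root so per-user crontabs and all at jobs are visible.

```shell
#!/bin/sh
# List common places where a scheduled certbot run could be defined.
find_certbot_jobs() {
    # cron: system-wide files and per-user crontabs (assumed default paths).
    grep -rls certbot /etc/crontab /etc/cron.d /etc/cron.daily \
        /etc/cron.hourly /etc/cron.weekly /var/spool/cron 2>/dev/null || true

    # at: pending one-off jobs, if at is installed.
    if command -v atq >/dev/null 2>&1; then
        atq 2>/dev/null || true
    fi

    # systemd: all timers, including inactive ones, filtered for certbot.
    if command -v systemctl >/dev/null 2>&1; then
        systemctl list-timers --all --no-pager 2>/dev/null | grep -i certbot || true
    fi
    return 0
}

find_certbot_jobs
```

No output means none of these places mention certbot; it does not rule out other mechanisms.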


#3

Thanks Juergen, but I don’t “use” anything; there is something scheduled that does it.
I’ve been running *nix boxes for the last 20 years, so I know a bit about scheduled tasks, and there is nothing I can think of that starts that task. I think it is started from inside some daemon, since there are no logins at that time, and scheduled tasks always have a login entry for the user running them.
I suppose some update to certbot added a daemon or hook; it was working until the update I made a couple of weeks ago.
Editing to list the service I found:


Tried disabling it, let’s see.
I gather some update installed that service.


#4

I believe this is the new systemd timer mechanism, which 20-year Unix veterans (including me!) often find counterintuitive since it didn’t historically exist in Unix.

I think packaging guidelines for systemd-based operating systems are suggesting migrating cron jobs to systemd timers, and Certbot packaging is also gradually following this recommendation.


#5

Yes, it’s a matter of documentation. The documentation on the net is all about configuring a cron job and giving it a random start time. The Ubuntu package itself installs a cron file with the random delay, so I think there’s a need to:

  • remove the overlap
  • output a message when upgrading packages, telling the admin to undo what was done by earlier packages or by sysadmins following the docs

Thank you

#6

To help people who google this issue, on Ubuntu:

systemctl disable certbot.timer
systemctl disable certbot.service
rm /etc/cron.daily/certbot

disables automatic execution of certbot.


#7

I’m sure that isn’t the best end result, right? You do want some form of automated renewal?


#8

I do NOT want any automatic renewal on this server; I will do it the way it should be done here, which requires ACL modifications and so on.
Assuming that all servers are configured the same way and that you can safely crash a running Apache is not the best option, in my opinion.
IF I want to auto-renew the cert, I’ll configure a cron job as it was before.
By the way, after I wrote the message above the server crashed again. I don’t have a clue why, but the auto-renewal started again:

Aug 30 09:14:35 my-server certbot[1031]: Encountered vhost ambiguity when trying to find a vhost for leverify.my.domain but was unable to ask for user guidance in non-interactive mode. Certbot may need vhosts to be explicitly labelled with ServerName or ServerAlias directives.
Aug 30 09:14:35 my-server certbot[1031]: Falling back to default vhost *:443...
Aug 30 09:14:35 my-server certbot[1031]: Error while running apache2ctl graceful.
Aug 30 09:14:35 my-server certbot[1031]: httpd not running, trying to start
Aug 30 09:14:35 my-server certbot[1031]: Action 'graceful' failed.
Aug 30 09:14:35 my-server certbot[1031]: The Apache error log may have more information.
Aug 30 09:14:35 my-server certbot[1031]: AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
Aug 30 09:14:35 my-server certbot[1031]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:443
Aug 30 09:14:35 my-server certbot[1031]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:443
Aug 30 09:14:35 my-server certbot[1031]: no listening sockets available, shutting down
Aug 30 09:14:35 my-server certbot[1031]: AH00015: Unable to open logs
Aug 30 09:14:35 my-server certbot[1031]: Attempting to renew cert (leverify.my.domain) from /etc/letsencrypt/renewal/leverify.my.domain.conf produced an unexpected error: Error while running apache2ctl graceful.
Aug 30 09:14:35 my-server certbot[1031]: httpd not running, trying to start
Aug 30 09:14:35 my-server certbot[1031]: Action 'graceful' failed.
Aug 30 09:14:35 my-server certbot[1031]: The Apache error log may have more information.
Aug 30 09:14:35 my-server certbot[1031]: AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
Aug 30 09:14:35 my-server certbot[1031]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:443
Aug 30 09:14:35 my-server certbot[1031]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:443
Aug 30 09:14:35 my-server certbot[1031]: no listening sockets available, shutting down
Aug 30 09:14:35 my-server certbot[1031]: AH00015: Unable to open logs
Aug 30 09:14:35 my-server certbot[1031]: . Skipping.
Aug 30 09:14:35 my-server certbot[1031]: All renewal attempts failed. The following certs could not be renewed:
Aug 30 09:14:35 my-server certbot[1031]:   /etc/letsencrypt/live/xxx/fullchain.pem (failure)
Aug 30 09:14:35 my-server certbot[1031]: 1 renew failure(s), 0 parse failure(s)
Aug 30 09:14:35 my-server systemd[1]: certbot.service: Main process exited, code=exited, status=1/FAILURE
Aug 30 09:14:35 my-server systemd[1]: certbot.service: Unit entered failed state.
Aug 30 09:14:35 my-server systemd[1]: certbot.service: Failed with result 'exit-code'.
Aug 30 09:14:38 my-server systemd[1]: Stopped Run certbot twice daily.

That was from syslog. Again ports.conf was modified and Apache crashed. Please, does anyone know how to stop this from running, or how to configure it not to use port 443?


#9

It looks like you have certificates that use standalone and tls-sni-01 validation.

In that case a new server is started on port 443.

So you should check that all your certificates use http-01 validation.
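
One way to do that check, as a sketch: print the plugin and challenge settings certbot has stored for each certificate. It assumes the renewal configs live in /etc/letsencrypt/renewal and use the authenticator, installer and pref_challs keys; the function name is hypothetical.

```shell
#!/bin/sh
# Print the stored authenticator/installer/challenge preference per certificate.
# The directory argument is optional and defaults to certbot's usual location.
show_renewal_settings() {
    dir="${1:-/etc/letsencrypt/renewal}"
    for conf in "$dir"/*.conf; do
        # Skip the unexpanded glob when the directory is empty or missing.
        [ -e "$conf" ] || continue
        echo "== $conf"
        grep -E '^(authenticator|installer|pref_challs)[[:space:]]*=' "$conf" || true
    done
    return 0
}

show_renewal_settings
```

If a file shows no pref_challs line, no challenge preference was saved for that certificate and the plugin falls back to its own default, which in this thread appears to have been tls-sni-01.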


#10

Ok, thanks for the reply. The certificates were obtained using the “standard” interactive procedure from the command line; I have no idea what kind of certificate was requested. I doubt it ever required port 443 to be open, since from the internet ports 80 and 443 go to different servers.
BTW, who the hell is starting the renewal now? There is some randomness to it, since it happens at any time of day and never at the same time twice.


#11

Just ran a new test (after setting up a script to modify the ACLs and NAT to direct the traffic to the correct server). If I just request a new certificate in interactive mode (certbot --apache), it requires binding to 443, crashing the server.
If I specify --preferred-challenges=http (thanks for the suggestion), it runs over HTTP only and the request goes through.
Now the problem is finding out which command the auto-I-do-not-know-what uses to check for renewal. Is there a place where my preferred http option is stored so that it gets picked up somehow? That would be good, but I’ll still have to chain that procedure with my ACL/NAT modification, so I need to know who starts the process…


#12

The options that you most recently specified on the command line (including --preferred-challenges) should be saved in /etc/letsencrypt/renewal, which is used by certbot renew.

If you turn out to need to run a script during renewal, there are --pre-hook and --post-hook options to run these scripts (and they’re also saved in the renewal configuration).
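
As an illustration only, a renewal test with both hooks might look like the sketch below. The two hook script paths are hypothetical placeholders for the ACL/NAT scripts mentioned earlier; the certbot options used (renew, --dry-run, --preferred-challenges, --pre-hook, --post-hook) are existing ones. The wrapper takes an optional prefix so the command can be previewed with echo before running it for real.

```shell
#!/bin/sh
# Hypothetical hook scripts; substitute your own ACL/NAT scripts.
PRE_HOOK="/usr/local/sbin/open-acl.sh"
POST_HOOK="/usr/local/sbin/close-acl.sh"

renew_with_hooks() {
    # "$@" lets the caller prepend a command such as `echo` to preview.
    "$@" certbot renew --dry-run \
        --preferred-challenges http \
        --pre-hook "$PRE_HOOK" \
        --post-hook "$POST_HOOK"
}

# Preview only; call `renew_with_hooks` with no argument to actually run it.
renew_with_hooks echo
```

Once the dry run behaves, dropping --dry-run performs a real renewal attempt.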


#13

Certbot creates a job with a random start time, so that not all certificates are renewed at 12:00, 13:00 …


#14

certbot itself does no such thing. It’s the OS package that may introduce such a systemd timer or cron job. Not every distribution does; Gentoo, for example, doesn’t.
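
On Debian/Ubuntu, one way to see what the package itself ships is to list its files and filter for cron and systemd paths. This is a sketch; the package name certbot and the path patterns are assumptions, and the function name is made up.

```shell
#!/bin/sh
# Show cron and systemd files installed by a package (default: certbot).
package_schedulers() {
    pkg="${1:-certbot}"
    if command -v dpkg >/dev/null 2>&1; then
        # dpkg -L lists every file owned by the package.
        dpkg -L "$pkg" 2>/dev/null | grep -E '/cron\.|/systemd/' || true
    fi
    return 0
}

package_schedulers
```

Any .timer or .service file in the output is a candidate for whatever is launching the renewal.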


#15

Here lies the entire problem since my first post: on Ubuntu there is “something” that starts the renewal process. No one is able to tell me what it is, and the systemd and cron entries are disabled (see above). So I have no idea where the script that starts it is, and so I cannot add any pre/post scripts. To be honest, it would be a lot better to write my own scripts with checks and so on, as I always have.


#16

As you can see above, I know it, I wrote it, and I found and disabled THAT script. There is something else that runs the renewal.


#17

Yes, the config is saved:

[renewalparams]
installer = apache
account = xxxxxxxxxxxxxxxxxxxx
authenticator = apache
server = https://acme-v02.api.letsencrypt.org/directory
pref_challs = http-01,

My manual script runs with no problem, but there must definitely be a script with hard-coded options somewhere…


#18

@joohoi, do you know what the Ubuntu package is currently using for automated renewal? Is it not the systemd timer at present?


#19

Apparently ‘disable’ only takes effect when you reboot; you have to use systemctl stop certbot.timer if you want to stop it immediately.

Edit: or you can use

systemctl disable --now certbot.timer

which will both stop it now and prevent it from starting again when you reboot.
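
To confirm the result afterwards, something like this sketch reports both the boot-time state and the current state of each unit. The function name is hypothetical; systemctl is-enabled and is-active are the standard subcommands.

```shell
#!/bin/sh
# Report whether a unit is enabled at boot and whether it is active right now.
unit_state() {
    unit="$1"
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "$unit: systemctl not available"
        return 0
    fi
    # Both subcommands exit non-zero for disabled/inactive units, hence `|| true`.
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null) || true
    active=$(systemctl is-active "$unit" 2>/dev/null) || true
    echo "$unit: enabled=${enabled:-unknown} active=${active:-unknown}"
}

unit_state certbot.timer
unit_state certbot.service
```

If the timer keeps coming back, for example after a package upgrade, systemctl mask certbot.timer is a stronger option, since a masked unit cannot be started at all until it is unmasked.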


#20

I restarted the server several times after the disable command; also, as you can see from the pic above, the service itself had already failed.
Currently the list in the picture no longer shows the certbot line, yet it still runs at random times twice a day.