Yes, the default and often easiest approach is to leave port 80 open, redirecting normal users to HTTPS. Usually it's the same software running on 80 as on 443, so it's not really opening any additional attack surface.
If you can't do that for whatever reason, then you need to use a different challenge type. The challenge that only needs port 443 open is TLS-ALPN-01, which is supported by some clients but not by certbot, and isn't nearly as popular. Or, if you can automate your DNS record updates, you can use DNS-01.
Thanks both for the responses. The only thing that has changed is that we closed http access; this is not a generally available web server, and the intent is to reduce overall unwanted traffic. TLS-ALPN-01 would work, but since certbot doesn't support it and I'd like to stick with certbot for now, that's not an option.
Since we have to leave http open for certbot to work, is there a known range of source IPs we could allow? I understand the range might change, but is that likely?
If I understand correctly, you (and the referenced note) are saying closing port 80 does not meaningfully improve security because typical redirects go to the same software. Agreed. However, closing port 80 may meaningfully reduce unwanted traffic and server load, which is why I want it closed.
Unfortunately, this is on a google compute engine vm and it appears the firewall rules don't support filtering based on url content. Thanks for the suggestion.
I should expound on "Doubtful".
If you are dropping the requests, then you have already heard them.
And the only thing being saved is the small "redirection" reply.
This is like trying to do QoS only on the near side of a restrictive/limited link - too late, the bandwidth has already been consumed.
Not replying at all won't stop the majority of unwanted requests (vulnerability scans/exploit attempts).
They are pre-programmed to either hit all IPs (in a specific range [sometimes the entire IPv4 space]) OR they are fed from gathered lists of IPs that respond on any port.
So, if you respond on any port (like: 443), your IP will be on a list (somewhere) and that list will feed the scanners that will continue to probe every (other) possible/exploitable port.
If the idea is that the server shouldn't be publicly available in general, then you probably want to switch to the DNS-01 challenge, which doesn't care whether your web server is available at all (though it does require your authoritative DNS server to be publicly available). It looks like you're also using Google for the DNS; there's probably a plugin for certbot to update the DNS for you and prove ownership over the name that way. Be aware, though, that you might just be trading one security threat for another, as some system administrators aren't thrilled with their web servers having to store credentials that let them automatically make any changes they want to DNS.
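For anyone who does want to try the DNS-01 route, here is a rough sketch of what it could look like with the certbot-dns-google plugin. The credentials path and domain are placeholders I made up, not values from this thread, and the service account key it points at is exactly the stored DNS credential mentioned above, so treat this as an illustration rather than a recommendation.

```shell
# Placeholder values (assumptions, not from this thread): adjust both.
CREDS="$HOME/.secrets/certbot/google.json"   # service account key allowed to edit the DNS zone
DOMAIN="example.com"

# Request a certificate via DNS-01 using the certbot-dns-google plugin.
# Extra arguments (for example --dry-run) are passed through to certbot.
issue_with_dns01() {
    certbot certonly \
        --dns-google \
        --dns-google-credentials "$CREDS" \
        -d "$DOMAIN" \
        "$@"
}
```

Calling `issue_with_dns01 --dry-run` first exercises the challenge against the staging environment without saving a certificate. With this approach ports 80 and 443 can both stay closed to the validation servers.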
Thanks, I am reluctant to switch to DNS-01 for the reasons you mention. I've replaced the cron certbot line with a script which enables http, does the certbot renew, then disables it. Not ideal but it will do for now.
Glad to hear you found something that worked for you. It's not the first time I've heard of someone scripting their firewall to open for the challenges and close again afterward; I've even suggested it to people in the past, so I should have thought to suggest it here.
One thing you might want to do (though I understand being hesitant to change something once it's working) is move your firewall commands out of the cron job, where they run every time certbot renew runs, and pass them to certbot as --pre-hook and --post-hook arguments when doing a renewal. That way, the firewall only gets opened when a certificate is actually due for renewal, instead of every day (or whenever your cron job runs) when certbot is only checking whether the certificate is close enough to expiration.
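A minimal sketch of the hook approach, assuming the firewall is a GCE firewall rule that allows tcp:80 and is toggled with gcloud. The rule name is hypothetical, and I'm assuming the --disabled / --no-disabled flags of `gcloud compute firewall-rules update` behave as documented for your gcloud version, so check them before relying on this.

```shell
# Hypothetical rule name (assumption): the GCE firewall rule that allows tcp:80.
RULE="allow-http-acme"

# Commands certbot runs around a renewal attempt.
PRE_HOOK="gcloud compute firewall-rules update $RULE --no-disabled"   # open port 80
POST_HOOK="gcloud compute firewall-rules update $RULE --disabled"     # close it again

# certbot only runs these hooks when at least one certificate is due for
# renewal, so the rule stays disabled on ordinary daily checks.
renew() {
    certbot renew --pre-hook "$PRE_HOOK" --post-hook "$POST_HOOK" "$@"
}
```

The cron job would then just call `renew`. Running `renew --dry-run` once by hand is a reasonable way to confirm the rule opens and closes as expected; the post-hook should still run if the renewal attempt fails, so the port is not left open.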