"Error getting validation data"

There is no need to install a web server on a computer that doesn’t have one: every computer with certbot installed already has one, namely Python. That’s the way standalone runs, but it’s perfectly possible to make it permanent.

/etc/systemd/system$ cat acme-challenge.service
[Unit]
Description=acme-challenge
After=network.target
ConditionPathExists=/var/www/letsencrypt

[Service]
WorkingDirectory=/var/www/letsencrypt
# http.server listens on port 8000 by default; append a port number to change it
ExecStart=/usr/bin/python3 -m http.server
Type=simple
SyslogIdentifier=acme-challenge.http.server
User=acme
Group=acme

[Install]
WantedBy=multi-user.target
Alias=acme-challenge.service

If something has to run behind haproxy, that could be python3 -m http.server 1234 to listen on a non-standard port.
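For illustration, here is a minimal Python sketch of a permanent challenge server that answers only under /.well-known/acme-challenge/ and returns 404 for everything else. The names ChallengeOnlyHandler, is_challenge_path and make_server are hypothetical, and the sketch assumes Python 3.7 or later and a webroot laid out the way certbot --webroot writes it. It is not how certbot itself does it.

```python
import functools
import http.server
import posixpath
import urllib.parse

# Path prefix used by the http-01 challenge.
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def is_challenge_path(raw_path):
    """Return True if the normalized request path is a file under the prefix."""
    # Drop the query string, decode %xx escapes, then collapse "." and "..".
    path = urllib.parse.unquote(raw_path.split("?", 1)[0])
    return posixpath.normpath(path).startswith(CHALLENGE_PREFIX)


class ChallengeOnlyHandler(http.server.SimpleHTTPRequestHandler):
    """Hypothetical handler: serve challenge files, 404 everything else."""

    def send_head(self):
        # send_head() is shared by GET and HEAD, so one check covers both.
        if not is_challenge_path(self.path):
            self.send_error(404, "Not Found")
            return None
        return super().send_head()

    def list_directory(self, path):
        # Never expose directory listings.
        self.send_error(404, "Not Found")
        return None


def make_server(webroot, port, host="127.0.0.1"):
    """Build (but do not start) a server rooted at webroot."""
    handler = functools.partial(ChallengeOnlyHandler, directory=webroot)
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Something like make_server("/var/www/letsencrypt", 1234).serve_forever() could then stand in for the plain http.server command, with haproxy forwarding the challenge path to port 1234.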

I’ve always used (rarely) ncat from my Windows laptop - I’ve never done it from a Linux terminal before - I’ve learnt something - thank you

ncat result:

Ncat: Version 7.50 ( https://nmap.org/ncat )
NCAT DEBUG: Initialized fdlist with 103 maxfds
NCAT DEBUG: do_listen("::"): Address family not supported by protocol
Ncat: Listening on 0.0.0.0:54321
NCAT DEBUG: Added fd 3 to list, nfds 1, maxfd 3
NCAT DEBUG: Added fd 0 to list, nfds 2, maxfd 3
NCAT DEBUG: Initialized fdlist with 100 maxfds
NCAT DEBUG: selecting, fdmax 3

curl result:

HTTP/1.0 503 Service Unavailable
Cache-Control: no-cache
Connection: close
Content-Type: text/html

Okay. From that, we learn that something is busted with your haproxy config, or even networking in general. (You can curl 127.0.0.1:54321 while nc is running, right?)

I think from here I would probably take the approach of simplifying your haproxy config until it begins to work. Maybe take a backup and then replace it with something very basic:

global
    daemon
    maxconn 1024

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend ft_http
    bind :80 
    bind :::80
    use_backend be_certbot if { path_beg /.well-known/acme-challenge/ }
    default_backend be_nginx

backend be_nginx
    server s_nginx 10.123.123.11:80

backend be_certbot
    server s_certbot 127.0.0.1:54321
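The routing decision in that frontend can be sketched in a few lines of Python. This only mirrors the use_backend / default_backend logic for illustration and is not haproxy code; pick_backend is a hypothetical name, and the sketch assumes the rule compares only the path, without the query string.

```python
ACME_PREFIX = "/.well-known/acme-challenge/"


def pick_backend(request_target):
    """Mimic the frontend rule: challenge paths go to certbot, the rest to nginx."""
    # Compare only the path portion, ignoring anything after "?".
    path = request_target.split("?", 1)[0]
    if path.startswith(ACME_PREFIX):
        return "be_certbot"
    return "be_nginx"
```

So a request for /.well-known/acme-challenge/<token> should reach 127.0.0.1:54321, while everything else goes to nginx.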

Actually, no, I can’t curl 127… I get either a connection reset by peer (from the 2nd terminal), or a connection refused (from the 1st terminal).

I don’t believe it is an IP issue - mainly because the website comes up just fine if it’s http: - try it, you’ll see.

As the path/stack is: Gateway (Router) => HAProxy (Server) => Nginx (Server), the IP has to be correct. It’s only this damned certificate (i.e. certbot) that is not working :frowning:

Then start a website on your nginx, so you can use --webroot. So you don’t need --standalone.

PS: As a check, a request to

http://www.peregrineit.net/.well-known/acme-challenge/87FbY9gvIKdVKpq7yVjsYzyZlDphO_2CgGLPV9-uRmA

should return an HTTP status 404 - Not Found.

To clarify:

  • Two terminals on the same server
  • First terminal has nc -vvv -l -p 54321 running
  • Then (after nc has started), second terminal runs curl 127.0.0.1:54321

and that gives you a “Connection reset by peer” error on the second terminal?

Correct - you sense a problem, master?

Not possible - hence the use of the HAProxy server in the first place :frowning:

???

If a static solution doesn’t work, why should a temporary solution work?

No, the 2nd terminal gives:

curl: (56) Recv failure: Connection reset by peer

There are political restraints in place that forbid us to expose a web server to the Internet WITHOUT going via a HAProxy server. We’re not just trying to get a certificate this one time; we’re trying to set things up to get a number of certificates for a number of websites over the next several months and automatically renew them.

Hence, it needs to be set up on the HAProxy server, and we can’t use any “temporary workarounds” because our CISO/security dept. won’t allow it.

I’m still not sure whether we are on the same page. An error from curl isn’t a big deal, as long as you can see the request arrive in nc.

It might be easier to do this instead of nc:

python -m SimpleHTTPServer 54321

and try the same curl.

Edit: and blah, I made errors earlier on. The curl needed to include :54321 in the URL. Too tired to be posting, apparently.
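The listener test described above (start something listening, then connect to it from a second terminal) can also be sketched in Python, if nc is behaving oddly. This is only a rough stand-in for nc plus curl; loopback_check and capture_one_request are hypothetical names, and port 0 asks the OS for any free port. Like nc, the listener sends no HTTP response, so the point is whether the request arrives, not what the client gets back.

```python
import socket
import threading


def capture_one_request(listener, received):
    """Accept a single connection and record the first bytes it sends."""
    conn, _addr = listener.accept()
    with conn:
        received.append(conn.recv(4096))
        # Like nc, close without sending any HTTP response.


def loopback_check(port=0):
    """Listen on 127.0.0.1, connect to ourselves, return what the listener saw."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen(1)
    actual_port = listener.getsockname()[1]

    received = []
    worker = threading.Thread(target=capture_one_request, args=(listener, received))
    worker.start()

    # Play the part of curl: send a bare HTTP request over loopback.
    with socket.create_connection(("127.0.0.1", actual_port), timeout=5) as client:
        client.sendall(b"GET / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n")

    worker.join(timeout=5)
    listener.close()
    return received[0] if received else b""
```

If loopback_check(54321) returns the request bytes, loopback networking on that port is fine and the problem is more likely between haproxy and the listener.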

I’m not saying “You should skip your haproxy”. You should skip --standalone, because it’s hard to debug.

PS: Your new 503 error says: Your haproxy works. But if you use standalone, there is (if certbot isn’t running) no web server answering. So debugging the path from your haproxy to your port 5* isn’t possible.
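To make that concrete: a 503 from haproxy typically means that it could not reach any server in the selected backend. A quick way to see whether anything is listening on the backend port at a given moment is a plain TCP connect, sketched here in Python. backend_is_listening is a hypothetical helper, and 54321 is just the port from the sample config earlier in the thread.

```python
import socket


def backend_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

Running backend_is_listening("127.0.0.1", 54321) while certbot is idle would be expected to return False, which is why the standalone path is hard to debug from the haproxy side.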

OK - but the only problem with all that is that this was all working a few days ago, until we had a crash that took the backup of the haproxy server as well (luckily it was only the haproxy server and haproxy backup that went).

So I’ve been rebuilding the haproxy server over the last day and a half (among other things) from installation notes from 2 years ago. We had it working with --standalone on the haproxy server with that config, so we should be able to get it running now.

Obviously I’ve got a typo/misconfig somewhere, which we need to find.

That one gave the raw html code - so that one worked :smile:

This simple haproxy.conf file gives the same error as I get with my more “complex” one :frowning:

If by web server the policy means ‘a computer receiving HTTP connections’, I’m afraid that you are breaking it by running certbot with --standalone, since it does just that. In this case you can NOT implement the http-01 challenge on the haproxy computer, and the optimal solution conforming to your corporate policy is to use the DNS challenge.

If by web server the policy means ‘a process receiving HTTP connections’, then --standalone is acceptable as long as it is behind haproxy… but a permanent web server such as python -m http.server exposing only the files necessary for the Let’s Encrypt exchange is acceptable too.

No need to throw the baby out with the bathwater. Proxying through to --standalone on an alternate port is a valid strategy that Certbot supports.

As OP said, it was working for them before.

That’s just really bizarre. If you can replicate the issue on a second test environment, I’d be happy to log in and take a look if you want.

It’s OK, I found the problem, there was a typo :smile:

Dry Run now working - thanks for your help, everyone

Cheers
