During secondary validation: No valid IP addresses found for sjvapzr2.app-staging.havoc.sh

My domain is: sjvapzr2.app-staging.havoc.sh

I ran this command: I'm using the nginx-proxy docker container from here https://github.com/nginx-proxy/nginx-proxy and the letsencrypt companion container from here https://github.com/nginx-proxy/docker-letsencrypt-nginx-proxy-companion. I used the three step process as described on the letsencrypt companion container github page.

It produced this output: During secondary validation: No valid IP addresses found for sjvapzr2.app-staging.havoc.sh as shown here: https://acme-v02.api.letsencrypt.org/acme/authz-v3/7430402154

My web server is (include version): The latest version of the nginx-proxy container.

The operating system my web server runs on is (include version): Whatever the nginx-proxy container runs.

My hosting provider, if applicable, is: AWS

I can login to a root shell on my machine (yes or no, or I don't know): Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): The latest version of the letsencrypt companion container.

The nginx-proxy logs show that the http validation succeeds so I'm at a loss. How is it connecting if no valid IP address was found?

**nginx.1 |** sjvapzr2.app-staging.havoc.sh 64.78.149.164 - - [24/Sep/2020:07:15:30 +0000] "GET /.well-known/acme-challenge/PqtXEAsYbvxuVZ6KCvErAjswaK7UkcxRGDKCcKaU2d8 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
**nginx.1 |** sjvapzr2.app-staging.havoc.sh 3.128.26.105 - - [24/Sep/2020:07:15:37 +0000] "GET /.well-known/acme-challenge/PqtXEAsYbvxuVZ6KCvErAjswaK7UkcxRGDKCcKaU2d8 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
**nginx.1 |** sjvapzr2.app-staging.havoc.sh 34.209.232.166 - - [24/Sep/2020:07:15:43 +0000] "GET /.well-known/acme-challenge/PqtXEAsYbvxuVZ6KCvErAjswaK7UkcxRGDKCcKaU2d8 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

Welcome to the Let's Encrypt Community, Tom :slightly_smiling_face:

I'm confused by this as well, since sjvapzr2.app-staging.havoc.sh is currently serving the certificate you got 20 minutes ago.

It's because of ACME v1/v2: Validating challenges from multiple network vantage points.

To get a successful validation, the primary and 2 out of the 3 secondary validation servers must be able to perform the challenge.

Per your error message, at least two of the secondary servers claim to have failed to resolve your domain.

It's definitely odd that you have 3x HTTP 200s there, though. It would suggest that only one secondary server failed.
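To make the rule above concrete, here is a minimal sketch of the quorum check as described in this thread (primary plus at least 2 of the 3 secondaries). The function name and parameters are made up for illustration; this is not Boulder's actual code.

```python
from typing import Sequence


def validation_passes(
    primary_ok: bool,
    secondary_ok: Sequence[bool],
    required_secondaries: int = 2,  # assumption: "2 out of the 3", per the post above
) -> bool:
    """Return True when the primary and enough secondary vantage points succeeded."""
    # Each True in secondary_ok counts as one successful secondary validation.
    return primary_ok and sum(secondary_ok) >= required_secondaries


# One secondary failing is tolerated.
print(validation_passes(True, [True, True, False]))   # True
# Two secondaries failing (e.g. DNS not yet visible to them) fails the order.
print(validation_passes(True, [True, False, False]))  # False
```

Under this rule, a "During secondary validation" error implies the primary check succeeded and too few secondaries did.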

@_az

Why did he still get a certificate then? Is this a failure when acquiring a duplicate?

OP probably managed to get the certificate in a subsequent attempt (or got the certificate in some unrelated way?). That order definitely failed.

@_az

It failed a DNS lookup for the IP address, right? Without an address there wouldn't be any way to attempt fetching the challenge file. All three of the requests in @tom.havoc's log are within 20 seconds of the certificate's start time. This is just my curiosity, so don't read too much into it.

Access Log:

Certificate:

Not Before: Sep 24 07:15:49 2020 GMT

Certificates are backdated by an hour. That certificate was issued at 08:15 UTC, which makes for a confusing coincidence.
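The arithmetic, as a small sketch. It assumes the one-hour backdating mentioned above and uses the Not Before value quoted earlier in the thread; the function name is just for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Not Before is set one hour before the actual issuance time.
BACKDATE = timedelta(hours=1)


def issuance_time(not_before: datetime) -> datetime:
    """Estimate when a certificate was issued from its Not Before timestamp."""
    return not_before + BACKDATE


not_before = datetime(2020, 9, 24, 7, 15, 49, tzinfo=timezone.utc)
issued = issuance_time(not_before)
print(issued.isoformat())  # 2020-09-24T08:15:49+00:00
```

So the requests logged at 07:15:30 to 07:15:43 happened roughly an hour before that certificate was issued, not a few seconds before.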

As for the HTTP requests, I don't have an explanation, but I'm sure we'll get one eventually :slight_smile: .

:astonished:

So it was issued 20 minutes before my first reply then. Good to know.

Thanks for the comments!

@_az is correct, I attempted a second time and it worked. So here's a bit more elaboration of what I'm doing and I think it will explain the results I'm getting.

The server is provisioned with a build script and the FQDN is generated on the fly. After the provisioning script creates the instance, it grabs the instance IP and adds an A record to the hosted zone in Route 53.

The nginx containers get pulled as part of the cloud init script and I put a delay in there to try to make sure that the DNS record is in place before that happens. So it must be that the newly added A record hasn't fully propagated yet when the certificate request is taking place. I'll try extending the delay and see if that fixes it.
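Instead of a fixed delay, the script could poll until the record resolves. Below is a rough sketch of that idea, not what my provisioning script currently does. The function names, the timeout values, and the example hostname and IP are all made up. It also only checks the local resolver, so a positive result does not guarantee that every Let's Encrypt vantage point can already see the record.

```python
import socket
import time
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        # Name does not resolve (yet).
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def wait_for_a_record(
    hostname: str,
    expected_ip: str,
    timeout: float = 300.0,   # hypothetical default, tune for your setup
    interval: float = 10.0,   # hypothetical default
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until hostname resolves to expected_ip or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if expected_ip in resolver(hostname):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Placeholder values; substitute the generated FQDN and the instance IP.
    ok = wait_for_a_record("host.example.com", "203.0.113.10", timeout=30, interval=5)
    print("record visible" if ok else "gave up waiting")
```

The `resolver` and `sleep` parameters are there so the loop can be tested without real DNS or real waiting. Adding a short extra pause after the first successful lookup may still be worthwhile, since other resolvers can lag behind.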

Again, the responses are really appreciated.

Best,
Tom

Seems like you've got all your ducks in a row.

:duck: :duck: :duck:

Glad you have your certificate process working!

:partying_face:
