DNS-01 with OVH doesn't seem to work

Hi,

I’m banging my head against the wall.

I’m trying to issue a certificate with the dns-01 challenge. I’m on Ubuntu 16.04, and this was intended as a test, so I’m not on a production server but on my laptop.

The first thing I tried was github.com/antoiner77/letsencrypt.sh-ovh. The error was:

Failed authorization procedure. www.example.net (dns-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: DNS problem: NXDOMAIN looking up TXT for _acme-challenge.www.example.net

And the command was:

certbot certonly -d www.example.net --manual \
    --manual-auth-hook ./manual-auth-hook.py \
    --manual-cleanup-hook ./manual-cleanup-hook.py \
    --config-dir=./config --work-dir=./workdir --logs-dir=./logs \
    --agree-tos --preferred-challenges dns -m remi.desgrange@example.net \
    --manual-public-ip-logging-ok

I replaced my domain with example.net, of course. When I run dig:

dig _acme-challenge.www.example.net TXT

I get the correct token. So I tried without the script, adding the challenge record to the DNS myself (just --manual), and I got the same error.

So if you have any idea whatsoever, thanks.

Hi @RemiDesgrange,

For this kind of DNS issue it's very difficult to help troubleshoot without knowing the domain name to be able to run test queries and examine the logs. Can you share the affected domain name?

Yes, fibrea.net

I tried www.fibrea.net., so _acme-challenge.www.fibrea.net. (I also tried gitlab).

Hi Remi,

I don't see any TXT records for "www.fibrea.net" presently. Can you try to issue using the staging environment and modify the hooks so that the record created by the client isn't removed? That will likely help troubleshoot.

There is now a TXT record for _acme-challenge.www.fibrea.net, created by a run against the staging environment.

Hi @RemiDesgrange,

Your DNS looks good from the tests I have done.

Andrei

It could be a matter of timing. If it takes, say, 10 seconds, or 10 minutes, for DNS changes to be reflected by all of OVH’s nameservers, and the client is having Let’s Encrypt check a moment after creating them…

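The thing is, in the manual hook script I do something like this after creating the record:

import os
import time
from dns.exception import DNSException
from dns.resolver import Resolver

res = Resolver()
# The challenge lives in a TXT record under _acme-challenge.<domain>
name = '_acme-challenge.' + os.environ['CERTBOT_DOMAIN']
while True:
    try:
        res.query(name, 'TXT')  # succeeds once this resolver sees the record
        break
    except DNSException:
        time.sleep(1)  # not visible yet; wait briefly and retry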

This is dirty, but it should work. I saw at https://certbot.eff.org/docs/using.html#pre-and-post-validation-hooks that the script there waits 25 seconds before doing anything; maybe I should try that.

EDIT:

So I tried a 60-second sleep, and now it works. The code that checks for the TXT record (with a DNS request) does not work :(
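For anyone landing here later, the two ideas (poll for the record, then still wait a while) can be combined. This is only a rough sketch, not what certbot or the OVH script actually does: `wait_for_record`, `is_visible`, and the default timings are names and values made up for illustration, and `is_visible` stands for whatever DNS check your hook already performs.

```python
import time
from typing import Callable


def wait_for_record(
    is_visible: Callable[[], bool],
    settle_seconds: float = 60.0,
    poll_interval: float = 5.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until is_visible() is true, then wait settle_seconds more.

    Returns True if the record became visible before the timeout,
    False otherwise. All parameter names are hypothetical.
    """
    deadline = clock() + timeout
    while not is_visible():
        if clock() >= deadline:
            return False  # give up instead of looping forever
        sleep(poll_interval)
    # One resolver seeing the record does not prove that every
    # authoritative server has it, so keep a fixed safety delay.
    sleep(settle_seconds)
    return True
```

The 60-second default only mirrors the delay that happened to work in this thread; it is not a documented OVH figure.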

Thank you all for your help

Even if you do query the DNS servers, it may not be possible for you to check all of them. If they use load balancers, or a globally distributed anycast setup, you may find that some servers are up-to-date, while the Let’s Encrypt resolver hits other ones that aren’t.

The only way to be sure is with detailed infrastructure information or technical promises that many DNS providers don’t provide.

Let’s Encrypt always asks the authoritative nameserver for your domain - propagation is not a concern here.

@jared.m

Are you absolutely sure about that?

Andrei

Pretty sure

Let's Encrypt always asks one of the authoritative nameservers for the zone, chosen at random. "Propagation" in the sense of "are all of the authoritative servers up-to-date" matters very much.
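To put rough numbers on that, here is a toy model. It assumes the validating side picks one authoritative server uniformly at random, and that a local check made by the hook does the same, independently; both are simplifications, and the function names are invented for this example.

```python
from fractions import Fraction


def chance_of_hitting_updated_server(total_servers: int, updated_servers: int) -> Fraction:
    """Chance that one uniformly random pick lands on an updated server."""
    if total_servers <= 0:
        raise ValueError("total_servers must be positive")
    if not 0 <= updated_servers <= total_servers:
        raise ValueError("updated_servers must be between 0 and total_servers")
    return Fraction(updated_servers, total_servers)


def chance_local_check_passes_but_validation_fails(
    total_servers: int, updated_servers: int
) -> Fraction:
    """Chance that a local check sees the record while validation does not.

    Assumes both sides pick a server independently (a toy assumption).
    """
    p = chance_of_hitting_updated_server(total_servers, updated_servers)
    return p * (1 - p)


print(chance_of_hitting_updated_server(4, 3))                # 3/4
print(chance_local_check_passes_but_validation_fails(4, 3))  # 3/16
```

So with three of four servers updated, a single successful local lookup is still followed by a failed validation a noticeable fraction of the time under this model.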

Yes, but “propagation” in the sense of “worldwide distribution that can take up to 24 hours” is not generally a concern with Let’s Encrypt.

Hi @jared.m,

You are correct: propagation to DNS servers around the world might take longer.

In this case the delay was in OVH's internal sync (which the user demonstrated by adding a 60-second delay), so propagation within OVH's internal DNS infrastructure was the issue.

There have also been issues where customers had different servers out of sync and pointing to different resources, so challenges didn’t pass, or where they had nameservers that were not online but were still listed in the zone's NS records.

The key takeaway is that all authoritative nameservers need to be up to date, and when a challenge is added, all of them need to be able to answer that challenge.
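As a sketch of that check: given the list of nameservers for the zone and some way to ask one specific server for the TXT values, report which servers do not yet return the expected value. `stale_nameservers` and `lookup_txt` are hypothetical names; `lookup_txt` is a callable you would have to supply yourself (for example with a DNS library pointed at a single server). With load balancing or anycast, as mentioned earlier in the thread, even a clean result here is not a guarantee.

```python
from typing import Callable, Iterable, List, Set


def stale_nameservers(
    nameservers: Iterable[str],
    lookup_txt: Callable[[str], Set[str]],
    expected_value: str,
) -> List[str]:
    """Return the nameservers that do not serve expected_value yet.

    lookup_txt(ns) must return the TXT values that nameserver `ns`
    serves for the challenge name. A server that cannot be queried
    is counted as stale, since validation against it would fail too.
    """
    stale = []
    for ns in nameservers:
        try:
            values = lookup_txt(ns)
        except Exception:
            values = set()  # offline or erroring server: treat as not ready
        if expected_value not in values:
            stale.append(ns)
    return stale


# Example with canned answers instead of real DNS queries.
answers = {"ns1.example.net": {"token"}, "ns2.example.net": set()}
print(stale_nameservers(answers, lambda ns: answers[ns], "token"))
# ['ns2.example.net']
```

A hook could keep waiting until this list is empty, then still add a short safety delay.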

Andrei

I believe people most often use the term "DNS propagation" to refer to caching resolvers going back to the authoritative servers after a record's TTL expires, rather than whether authoritative servers themselves are aware of a change. DNS propagation in the former sense isn't a consideration for Let's Encrypt because Let's Encrypt never relies on a caching resolver for this purpose.

There are two causes behind the blanket statement "worldwide distribution can take up to 24 hours":

  1. Most recursive DNS resolvers have caches; the relevant resource records (RRs) need to expire from all those caches based on their TTL. This is generally not a concern for Let's Encrypt DNS challenges because (a) the relevant RR, _acme-challenge.example.com, is not commonly requested and so won't be in any cache, (b) the TTL on that RR is usually low, and most importantly (c) Let's Encrypt's resolver doesn't really cache results.
  2. There are many authoritative nameservers for a given domain name, and they are busy, so data updates get batched up and pushed out in chunks, and sometimes not to every authoritative nameserver at once. So it may take some time for all the authoritative servers to have the new data. Most modern authoritative nameservers actually update very quickly, on the order of minutes. However, there's one big exception: the TLD nameservers (e.g. for .com, .net, etc.) are very busy indeed, and there are a lot of them all around the world. When you register a new domain name, each of those TLD nameservers needs to receive updates saying that your domain name exists and pointing to your authoritative nameservers. These can often take longer.

The terminology tends to be fuzzy, but my personal understanding is that (1) is usually called "cache expiration" or similar, and (2) is usually called "propagation," but the two are frequently confused. Note that only (2) is relevant for most people, and certainly it is the only one that is relevant for Let's Encrypt users. But it definitely is relevant for Let's Encrypt users.
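To make sense (1) concrete, here is a minimal model of a caching resolver. It is not how any particular resolver is implemented; `CachingResolver` and `fetch` are invented for the example, with `fetch` standing in for a query to an authoritative server that returns a value and its TTL in seconds.

```python
import time
from typing import Callable, Dict, Tuple


class CachingResolver:
    """Toy resolver that reuses an answer until its TTL has elapsed."""

    def __init__(
        self,
        fetch: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical upstream lookup: name -> (value, ttl)
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (value, expiry)

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh: upstream is not consulted
        value, ttl = self._fetch(name)
        self._cache[name] = (value, now + ttl)
        return value
```

A client behind such a resolver keeps seeing the old value until the TTL runs out, even if every authoritative server was updated immediately. A validator that asks the authoritative servers directly skips this effect, which is why only sense (2) matters for the challenge.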

As a final note: It's really quite hard to figure out how many authoritative nameservers there are for a given domain name, due to the load balancing and anycast situations @mnordhoff mentioned. In a load balancing setup, depending on where in the world you look up the addresses for a set of authoritative nameservers, you might get different IP addresses. This is done in order to spread the load among those IP addresses. In an anycast setup, you always get the same IP addresses for the authoritative nameservers no matter where you query from, but sending traffic to those IP addresses from different parts of the world will route to different physical machines. So it's not possible for an end user to check whether propagation (2) has completed. Some DNS providers, like Route 53, provide an API to check whether propagation has completed, but that is rare.
