Letsencrypt + godaddy = fail

Why does certbot suck so much?!
My domain is on GoDaddy.
And every renewal it's the same BS.
Then, after many retries, it sometimes works.
Isn't there a better way to handle this?!

LOG:

Are you OK with your IP being logged?


(Y)es/(N)o: Y


Please deploy a DNS TXT record under the name
_acme-challenge.zibri.org with the following value:

B2ZEqGdvn-FNuKuoIXbNUVyIZCWuK-cNIbqHtnD5LI0

Before continuing, verify the record is deployed.


Press Enter to Continue


Please deploy a DNS TXT record under the name
_acme-challenge.zibri.org with the following value:

22Ds1uopjKiIy2Y_v4BArnobBmaxkYanmaEfcCkOoUc

Before continuing, verify the record is deployed.


Press Enter to Continue
Waiting for verification…
Cleaning up challenges
Failed authorization procedure. zibri.org (dns-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: During secondary validation: Incorrect TXT record “B2ZEqGdvn-FNuKuoIXbNUVyIZCWuK-cNIbqHtnD5LI0” found at _acme-challenge.zibri.org, zibri.org (dns-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: Incorrect TXT record “22Ds1uopjKiIy2Y_v4BArnobBmaxkYanmaEfcCkOoUc” found at _acme-challenge.zibri.org

IMPORTANT NOTES:

  • The following errors were reported by the server:

    Domain: zibri.org
    Type: unauthorized
    Detail: During secondary validation: Incorrect TXT record
    “B2ZEqGdvn-FNuKuoIXbNUVyIZCWuK-cNIbqHtnD5LI0” found at
    _acme-challenge.zibri.org

    Domain: zibri.org
    Type: unauthorized
    Detail: Incorrect TXT record
    “22Ds1uopjKiIy2Y_v4BArnobBmaxkYanmaEfcCkOoUc” found at
    _acme-challenge.zibri.org

    To fix these errors, please make sure that your domain name was
    entered correctly and the DNS A/AAAA record(s) for that domain
    contain(s) the right IP address.

1 Like

The answer can be found at https://letsencrypt.org/docs/godaddy/

Instead, GoDaddy offers automated renewal with their own certificates, which are an added-cost feature.

If you are issuing certificates manually and are getting errors about an incorrect TXT record, the likely reason is that you did not wait long enough for the GoDaddy nameservers to update.

You can also find other clients which offer a slightly more automated approach: https://github.com/acmesh-official/acme.sh/wiki/dnsapi#4-use-godaddycom-domain-api-to-automatically-issue-cert
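
For reference, the acme.sh flow from that wiki page looks roughly like this. This is only a sketch: the GD_Key/GD_Secret variables and the dns_gd hook come from the acme.sh documentation, and the key/secret pair has to be created in GoDaddy's developer portal first.

    # Production API key/secret from GoDaddy's developer portal (example values).
    export GD_Key="your-godaddy-api-key"
    export GD_Secret="your-godaddy-api-secret"

    # acme.sh creates the _acme-challenge TXT records through the GoDaddy API,
    # waits for them to become visible, then cleans them up after validation.
    acme.sh --issue --dns dns_gd -d zibri.org -d '*.zibri.org'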

3 Likes

It still sucks. Instead of putting both values under the same TXT record name, Let's Encrypt should use two different record names… so I can update them, wait 10 minutes, and then renew. Using the same name adds problems and nothing more.
It's a horrible implementation.

I’m not exactly sure what you mean.

You should be adding both TXT records to _acme-challenge.zibri.org at the same time.

So, when Certbot prompts you, you should end up with both TXT records listed under _acme-challenge in GoDaddy's DNS interface, and then you may wait 10 minutes (or however long) after you have added the second one.
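
If you want to confirm that both values are actually visible before pressing Enter in Certbot, a quick check like this (just a sketch) asks one of the zone's own nameservers directly instead of a caching resolver, and should print both TXT strings from the prompts:

    # Query the first authoritative nameserver for the zone directly,
    # bypassing any local or ISP resolver cache. Both challenge values
    # should appear in the output before you continue.
    dig +short TXT _acme-challenge.zibri.org @"$(dig +short NS zibri.org | head -n 1)"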

As I mentioned, acme.sh can make this slightly less painful, as it will automate the adding and removal of both records for you.

But yes, it obviously sucks either way because Let’s Encrypt renewal is meant to be done by machines, not humans. Choosing a web host that provides AutoSSL can wash all that pain away.

3 Likes

Disclaimer: (I don’t know why I need this, but just in case) I’m definitely not affiliated with Let’s Encrypt.

You can contact the CA/B Forum and ask them to update their requirements. Source Document

You also have the option of using a “premium DNS provider”, which will probably offer much faster update times. Please don’t place the blame on a single entity when the whole process has issues.
(You might also experience the same issue with other certificate authorities, and they will charge you for the certificate.)

2 Likes

Nevermind. I wrote my own script. Thanks.

1 Like

Anyway, just to be clear: certbot should work automatically, from an automated script, without needing all this mess.
As for “premium” provider features, that was not the subject of this post.
For premium fees I can have the world.
Also, about GoDaddy: every domain uses its own set of GoDaddy nameservers, and those are updated IMMEDIATELY if queried directly.
certbot and your servers should do a DNS query, then ask the domain’s designated nameservers directly for the records; that would work flawlessly every time.
No matter the TTL.

That’s exactly what Let’s Encrypt does; they use Unbound for DNS.

You can contribute your script to certbot if you want to; however, I don’t think the certbot developers have time to write scripts for every DNS provider in the world (not to mention that some of them don’t even have API endpoints).
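
That said, a home-grown script doesn't have to live outside certbot either: certbot's manual mode can call it for you via auth/cleanup hooks. A rough sketch (the script paths are hypothetical; certbot passes the domain and the challenge value to the hook in the CERTBOT_DOMAIN and CERTBOT_VALIDATION environment variables):

    # Hypothetical hook scripts: the auth hook publishes the TXT record via the
    # GoDaddy API and waits until it is visible; the cleanup hook removes it.
    certbot certonly --manual --preferred-challenges dns \
      --manual-auth-hook /usr/local/bin/godaddy-dns-auth.sh \
      --manual-cleanup-hook /usr/local/bin/godaddy-dns-cleanup.sh \
      -d zibri.org

With that in place, certbot renew runs unattended, which addresses the "should work from an automatic script" point above.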

1 Like

One thing to add: he can skip the entire GoDaddy DNS problem by switching providers. Use Cloudflare (free, and it works nicely with Certbot or Caddy in my configuration) or another service like Route 53 (I have no personal experience with that one).

EDIT: To clarify, by switching providers I mean DNS providers. You can continue to use GoDaddy’s registrar, hosting, or any other services they offer while changing your domain’s authoritative nameservers.
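
For what it's worth, once the zone is served by Cloudflare, a renewal setup can look roughly like this with the certbot-dns-cloudflare plugin (a sketch, assuming the plugin is installed; the credentials path is just an example, and the file holds a Cloudflare API token with DNS edit rights for the zone):

    # ~/.secrets/certbot/cloudflare.ini (chmod 600) contains a line such as:
    #   dns_cloudflare_api_token = <token with Zone / DNS / Edit for zibri.org>
    certbot certonly --dns-cloudflare \
      --dns-cloudflare-credentials ~/.secrets/certbot/cloudflare.ini \
      --dns-cloudflare-propagation-seconds 60 \
      -d zibri.org -d '*.zibri.org'

The plugin adds and removes the _acme-challenge TXT records itself, so there is nothing to copy and paste at renewal time.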

2 Likes

Is your domain supposed to have a non-existent home page? I get a 404 when I go there. I use GoDaddy too and usually don’t have any problems certifying 15 different domains manually. I use my own website client instead of certbot. I set the TTL of the TXT records to 600 seconds (the smallest value they’ll allow, though it used to be 300 seconds). I find using https://toolbox.googleapps.com/apps/dig/#TXT/ to check my TXT records before continuing with validation can be helpful.

1 Like

I’m not sure why you’re blaming certbot for this. Certbot can do a lot automatically, just not everything. If you’re missing a specific feature, you could make a feature request on its GitHub page or, even better, write a pull request! Certbot is open source :wink:

Please note that almost everybody in this community is not affiliated with Let’s Encrypt; we are enthusiastic volunteers.

The Let’s Encrypt validation servers actually don’t care about TTLs and ask the authoritative DNS servers directly. However, with things like validation from multiple global vantage points, anycast, and all kinds of distribution issues on the DNS servers’ side, worldwide propagation of DNS records might take some time.
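
You can watch roughly what the validation side sees with a trace query, which walks the delegation chain from the root down to the zone's own nameservers instead of asking a caching resolver:

    # Follow the delegation from the root servers down to the authoritative
    # nameservers and show the answer they currently give for the challenge.
    dig +trace TXT _acme-challenge.zibri.org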

4 Likes

It does. But does that script exist for GoDaddy? Maybe.

This might interest you:

3 Likes

@Gowebsmarty For automatic installation into cPanel, you’ve got to pay for the premium version. I see you’re the developer for the plugin, so I’m pretty sure this is spam.      

3 Likes

In my experience, the bigger problems are within the DNS vendors' networks, caused by internal caching systems and/or database replication. I can't speak to GoDaddy, but most large-scale providers I've interacted with use read-through/write-through application caches and DNS caches.

When providers leverage caches in their network (which is standard with large-scale providers), updating a record via an API/control panel will typically only update the primary/persistent database. It will not necessarily update or expire an application-level cache (think Redis, Memcached, etc.) from which records may be queried, or a DNS-level cache (some have multiple tiers of DNS servers within their network), and database replicas may take time to sync up. Some providers offer a "DNS flush" service that will purge records from multiple portions of their internal stack; however, that is usually limited to 1 invocation every 24 hours.

The only effective way around this is to delegate DNS authorization to another service, such as acme-dns.

Here's a typical example of a caching nightmare:

  1. User updates a DNS record with Provider, changing a value from 192.168.0.1 to 192.168.0.2. The TTL is 60 seconds. This directly edits the Provider's persistent storage database.

  2. A first DNS query is made by LetsEncrypt. The nameservers serve the old record, because the TTL was 60s and only 45 seconds have passed since last loading the record.

  3. A second DNS query is made by LetsEncrypt. The record is outside of the TTL, so a query is made upstream within the network. The system employs a "dogpile" generation lock, and serves the old record while it repopulates with the new record.

  4. A third DNS query is made by LetsEncrypt. The server now returns the result generated by the second DNS query. That record is still the original record, because it was valid within the secondary DNS server.

  5. A fourth DNS query is made by LetsEncrypt. There is no valid value in the upstream DNS servers. The Provider's system then queries an internal API for the value; that value is pulled off a read-through cache backed by Redis. Although the TTL for the record is 60s, the application's read-through cache has a 5-minute storage timeout, so the cached value is served. (You don't think this happens?!? It happens a lot.)

  6. A fifth DNS query is made by LetsEncrypt. There are multiple layers of misses and expiries; a dogpile-style regeneration is handled by the application API, and the stale value is returned while the new value is loaded.

  7. A sixth DNS query is made by LetsEncrypt. There are multiple layers of misses and expiries; the value loaded into the application cache from the backing datastore during the fifth request is now returned. This too serves the old value, because the application queries a read-only database replica that has not yet synced with the primary database. While the control panel visible to the user and the authenticated API read from a primary read/write database server, the DNS servers handle enough traffic that they must read from replicas.

  8. A seventh DNS query is made by LetsEncrypt. The new record is returned - as the internal cache failovers are now hitting a fully synced read-only database server.

To deal with these scenarios, some providers offer DNS-Flush services that will try to purge the records for domain(s) from multiple internal cache levels and nodes.

This type of feature usually: (1) can take several minutes; (2) is limited to one invocation per day; and (3) often misses a caching datastore or node.

This is why I migrated to acme-dns.
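
For anyone curious, the delegation is a one-time CNAME in the slow provider's zone; after that, the challenge TXT values are published through the acme-dns HTTP API (or a client hook) and the original provider's caching stops mattering. A rough sketch, assuming an acme-dns instance at auth.example.com and the subdomain returned by its /register endpoint (both illustrative):

    ; One-time record in the GoDaddy-hosted zone:
    _acme-challenge.zibri.org.  IN  CNAME  d420c923-2c47-4bbd-832f-6a4f2b3c9d10.auth.example.com.

From then on, the ACME client only talks to acme-dns when it needs to set a new TXT value.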

(reposted with right reply)

3 Likes

I’m not sure this is true. Unbound, as used by Let’s Encrypt, “walks” the DNS tree up to the authoritative DNS server. Why would the authoritative nameserver serve the old record due to a non-expired TTL? In my experience, when I use dig, only caching (i.e., non-authoritative) DNS servers show a declining TTL, while the authoritative DNS server always shows the non-declining TTL as set in the DNS zone, together with the non-cached result.

2 Likes

Perspective.

This all happens on the provider's internal network. LetsEncrypt/dig talk to the PUBLIC authoritative name server, and that server is configured to internally cache records for the TTL before generating any new values. When generating a new value, the PUBLIC authoritative DNS server may then rely on one or more backend name servers.

Consider this failover visually:

[Public DNS Server | Authoritative] 
-> [Internal DNS Servers]
-> [Internal API Servers]
-> [Internal Application Cache]
-> [Internal Database Replica]
-> [Internal Database Master]

With larger vendors, we can see this:

  1. The DNS Servers have multiple FQDNs (e.g. ns1.example.com, ns2.example.com)
  2. The FQDNs for the DNS servers resolve to multiple IPs each (roundrobin DNS selection)
  3. The IPs of the nameservers are just gateways/loadbalancers to pools of DNS servers

So... two nameserver FQDNs can end up resolving to 6 gateway IPs, which front 12 nameservers responding to DNS queries.

As far as the PUBLIC can see, the 2 FQDNs (which are really 12 responders) are the authoritative nameservers for a domain. INTERNALLY, the provider's network is designed to have them fail over / look up DNS queries against internal nameservers that are load-balanced or sharded in a specific way. Some providers do this because it's fairly simple to leverage existing DNS software for load balancing and caching.

The actual record is never managed in the "authoritative nameserver", which is essentially only authoritative to the general public. As far as the public can tell (and as far as the query-responding servers behave), the data is not cached and is served as-is. In reality, there is a complex multi-tiered caching system in place on the internal network.

Every provider is slightly different in how they implement caching like this. If you can manage to escalate a support request to a dev/ops team or speak to some technical employees at a specific provider, they will sometimes explain their internal systems to you.

A few years ago I had to move 50 domains off a certain provider, because their systems had so much caching that it took over 15 minutes for a domain's records to update publicly after editing. Working with their dev/ops team, we managed to drop the wait to 6 minutes by dropping the TTL to 60 seconds. IIRC, they had an internal caching layer that used BIND to shard customer records. Behind the public authoritative server, there were 2 or 3 levels of DNS servers that cached results (and then the application cache, and it was just a nightmare).

2 Likes

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.