Splitting DNS across Google Domains and Google Cloud DNS

My domain is: 66c.dev

I ran this command:

domains:
  - '*.66c.dev'
certfile: fullchain.pem
keyfile: privkey.pem
challenge: dns
dns:
  provider: dns-google
  google_creds: google.json

(via https://github.com/home-assistant/hassio-addons/tree/master/letsencrypt wrapper)

It produced this output:

Plugins selected: Authenticator dns-google, Installer None
Obtaining a new certificate
Performing the following challenges:
dns-01 challenge for 66c.dev
Attempting refresh to obtain initial access_token
Refreshing access_token
Cleaning up challenges
Attempting refresh to obtain initial access_token
Refreshing access_token
Error finding zone. Skipping cleanup.
Unable to determine managed zone for 66c.dev using zone names: ['66c.dev', 'dev'].

My web server is (include version):

Home Assistant 0.114.4

The operating system my web server runs on is (include version):

Ubuntu 20.04.1 LTS

My hosting provider, if applicable, is:

DynDNS off a local server, updating Google Domains via ddclient

I'm trying to route _acme-challenge only to Google Cloud DNS via a CNAME pointing at
ns-cloud-a1.googledomains.com. In Cloud DNS, I've created a public record for _acme-challenge.66c.dev with data:

ns-cloud-a1.googledomains.com. cloud-dns-hostmaster.google.com. 1 21600 3600 259200 300

google.json has an appropriately scoped role account set up.

I can login to a root shell on my machine (yes or no, or I don't know):

Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):

No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):

hassio-addons/letsencrypt 4.10.0

I'm especially unsure about what the heck I'm up to with the Google Domains -> Google DNS routing (I can't do the whole record, because then ddclient can't auto update the synthetic A record anymore; I want to route the challenge subdomain only).

Thanks in advance!

Sneaky edit: I have a working Let's Encrypt SSL cert deployed at the moment, but it's only on the top domain (not the *.). For some reason I decided to try the new HA plugin tooling and a new challenge method at the same time.


The Internet shows four nameservers that are authoritative for your zone:

66c.dev nameserver = ns-cloud-e1.googledomains.com
66c.dev nameserver = ns-cloud-e2.googledomains.com
66c.dev nameserver = ns-cloud-e3.googledomains.com
66c.dev nameserver = ns-cloud-e4.googledomains.com

Not:

nslookup 66c.dev ns-cloud-a1.googledomains.com
Server:  UnKnown
Address:  216.239.32.106
*** UnKnown can't find 66c.dev: Query refused

Thanks! Is it possible to use a different nameserver for just the _acme-challenge.66c.dev subdomain? I was hoping to retain Google Domains for the main domain and use Cloud DNS only for the ACME challenge, since Google Domains doesn’t have an API that would let me keep the TXT records in sync for renewals.

You could CNAME the whole _acme-challenge entry to any other FQDN.


To be more specific, you can’t have both Google Domains and Google Cloud DNS host the root 66c.dev domain. But you can “delegate” a subdomain like acme.66c.dev to Google Cloud DNS.

Then you add a CNAME in Google Domains for _acme-challenge.66c.dev that points to _acme-challenge.acme.66c.dev and use a client that supports both CNAME challenge aliases and has a Google Cloud DNS plugin. The TXT record should get created at _acme-challenge.acme.66c.dev and you’re good to go.
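
To make that layout concrete, here is a small sketch of which record ends up where. The helper names (challenge_name, alias_target, records_needed) are made up for illustration and are not part of any ACME client, and the nameserver value is a placeholder for whatever Cloud DNS assigns to the delegated zone.

```python
def challenge_name(identifier):
    """Name the ACME server queries for a dns-01 challenge.

    A wildcard such as *.66c.dev is validated at the same name as its
    base domain, so the leading "*." is dropped first.
    """
    base = identifier[2:] if identifier.startswith("*.") else identifier
    return "_acme-challenge." + base.rstrip(".")


def alias_target(delegated_zone):
    """Name inside the delegated zone that the CNAME should point at."""
    return "_acme-challenge." + delegated_zone.rstrip(".")


def records_needed(identifier, delegated_zone, nameservers):
    """List (where, name, type, value) rows for the split setup."""
    zone = delegated_zone.rstrip(".")
    # NS records in the parent zone hand the subdomain over to Cloud DNS.
    rows = [("Google Domains", zone, "NS", ns) for ns in nameservers]
    # The CNAME redirects the challenge lookup into the delegated zone.
    rows.append(("Google Domains", challenge_name(identifier), "CNAME",
                 alias_target(zone)))
    # The TXT record is the only one the ACME client has to write.
    rows.append(("Cloud DNS", alias_target(zone), "TXT",
                 "<token written by the ACME client>"))
    return rows


# Placeholder nameserver: substitute the ones shown for the zone in Cloud DNS.
for row in records_needed("*.66c.dev", "acme.66c.dev",
                          ["<nameserver assigned by Cloud DNS>"]):
    print(*row, sep="\t")
```

Only the last row changes at renewal time, which is why the API-less Google Domains side can stay static.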


Thanks both, I think I’m getting close. Using these handy keywords I stumbled upon this SO which introduced me to the concept of a “zone cut”. I was using a CNAME record, but it sounds like I should use an NS and a CNAME (going from Ryan’s comment). So my new setup is:

And here’s my zone def in Cloud DNS: https://imgur.com/1bxtO0i (can only embed one image per post since I’m a newbie).

Does this look right so far?


Not quite there yet.
66c.dev is controlled by the e1-e4 DNS servers.
You are trying to delegate the acme subdomain (acme.66c.dev) to the a1-a4 DNS servers.
So far, so good.
But your image of the entries in the a1-a4 system doesn’t show the acme subdomain.


Gotcha, thanks Rudy! I created a new zone in Cloud DNS: https://imgur.com/a/E79nldN

Which has given me the following: https://imgur.com/a/6oCjjgE

I noticed I’m now in the d1-d4 system, so I’ve updated my Google Domains entry to point to d1: https://imgur.com/4Q9la0V

Does that look better? I should write up a little guide on this afterwards; while googling I’ve found at least three people who got close and gave up.


Yes.

Sorry to hear about their misfortunes.
DNS seems so simple and straightforward (to me).

I will agree that some very simple concepts are generally misunderstood.
Take the simple CNAME.
If I'm looking for A but it is CNAMEd to B.
What just happened?
Some say - now you must go to B and ask for B there.
Some say - just go to B and ask for A there.
Obviously they can't both be right.

Simple answer - in plain English:
If you are looking for Adam but find a sign on his door that says "go to Bob's house",
then when you get to Bob's house, do you ask for Bob or Adam?
[hint: Who are you looking for again?]


Ha! Great analogy. Might be standing at Bob’s house but I’m still after Adam. You should write a DNS primer for people like me.

I’ll give this an hour to propagate then have another go. Fingers crossed!


No luck I’m afraid - any guesses? Here’s my output:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] file-structure.sh: executing... 
[cont-init.d] file-structure.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[12:06:45] INFO: Selected DNS Provider: dns-google
[12:06:45] INFO: Use propagation seconds: 60
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator dns-google, Installer None
Obtaining a new certificate
Performing the following challenges:
dns-01 challenge for 66c.dev
Attempting refresh to obtain initial access_token
Refreshing access_token
Cleaning up challenges
Attempting refresh to obtain initial access_token
Refreshing access_token
Error finding zone. Skipping cleanup.
Unable to determine managed zone for 66c.dev using zone names: ['66c.dev', 'dev'].
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.

And I found the full log in a distant Docker cache folder:

https://pastebin.pl/view/4cfced0c


You have to be able to tell it to update the zone "acme.66c.dev" (not "66c.dev").
That is where it should place the _acme-challenge TXT record
[even though you want a wildcard cert for "66c.dev"].

Please show the non-private parts of this file:


Perhaps I’ve stuffed the role account creation up? I’d have thought that would manifest in an auth error (in fact, I think it did at an earlier point and I had to give the account permissions for the Cloud DNS API specifically).

# cat /usr/share/hassio/share/google.json
{
  "type": "service_account",
  "project_id": "home-assistant-288014",
  "private_key_id": "__redacted__",
  "private_key": "-----BEGIN PRIVATE KEY-----\n__redcated__\n-----END PRIVATE KEY-----\n",
  "client_email": "ha-861@home-assistant-XXXXX.iam.gserviceaccount.com",
  "client_id": "_redacted_",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/ha-861%40home-assistant-XXXXX.iam.gserviceaccount.com"
}

Will that info change the E1 or the A1 DNS servers?
[I was hoping there would be something more obvious in there]


As I understand it, that role account is capable of writing to any of the zones in my Google Cloud DNS account. I’ve deleted the old zones, so I imagine it will be interacting with B.

Looking at _find_managed_zone_id(), I think the issue is that it’s looking for a zone named either “66c.dev” or “dev”. My zone (in Cloud DNS) is named “acme.66c.dev.”. Other than modifying the Python, which seems like a bad idea, is there a way I can pass this subdomain as an argument to the script? From my reading, perhaps not?
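
The zone names in the error message are consistent with the plugin guessing candidates by stripping labels off the front of the domain. Below is a rough sketch of that behaviour, reconstructed from the error output only; it is not the plugin's actual code, and zone_name_guesses and find_zone are hypothetical names.

```python
def zone_name_guesses(domain):
    """Return candidate zone names, most specific first.

    Each guess drops one more leading label from the domain.
    """
    labels = domain.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def find_zone(domain, managed_zones):
    """Return the first guess that matches a zone in the project, else None.

    managed_zones is a list of zone DNS names as Cloud DNS shows them,
    with or without the trailing dot.
    """
    known = {zone.rstrip(".") for zone in managed_zones}
    for guess in zone_name_guesses(domain):
        if guess in known:
            return guess
    return None


print(zone_name_guesses("66c.dev"))
# A zone that sits below the domain is never among the guesses.
print(find_zone("66c.dev", ["acme.66c.dev."]))
# Asking about a name inside the delegated zone would match.
print(find_zone("_acme-challenge.acme.66c.dev", ["acme.66c.dev."]))
```

Under that assumption, a lookup that starts from 66c.dev can only ever walk upwards towards dev, so a zone named acme.66c.dev. is never considered.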


Apologies for not recognizing this sooner, and there are others who would know better. But you may be stuck, because that Home Assistant add-on appears to use certbot under the hood, and certbot doesn’t have great support for DNS challenges using CNAME aliases. There’s an open issue about it, but it’s rather old and seems to be low priority.

Are there any other addons that use a different client like acme.sh?


Ah, what a pain! That’s a very helpful clue, cheers.

I’ve monkeyed something together to simulate the functionality I need (I guess this is okay until I update the extension, so I’ll work on a PR to do it properly). But it hasn’t quite worked: I can see the TXT record was created:

But I’m getting an error looking it up:

Some challenges have failed.
IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: 66c.dev
   Type:   dns
   Detail: During secondary validation: DNS problem: NXDOMAIN looking
   up TXT for _acme-challenge.66c.dev - check that a DNS record exists
   for this domain

As above, here are my Google Domains entries (the ones that should theoretically be bouncing _acme-challenge.66c.dev to _acme-challenge.acme.66c.dev):

Any guesses?


Trying to follow the trail myself, I’m not quite sure where certbot is getting lost:

 DNS server handling your query: localhost
 DNS server's address:	127.0.0.1#53
 
 Non-authoritative answer:
 _acme-challenge.66c.dev	rdata_46 = CNAME 8 3 3600 20200923080850 20200901080850 51963 66c.dev. ixR+ActeKlcZJ5mt4Rbbt3DCBg6rZbtJkdtfEmWZI32zlS8kAIJIqmpt lOpsNN23yxdZK4EK4uGR1IyJC0Q3h32p6BJc4f7XnUvCq4/ZbKC74ZIe KKeV/2yQugH7Jj6Vv0D7WcBfnUotPqwwERVmcZYRoKEBMR0sSwf15ahK grA=
 _acme-challenge.66c.dev	canonical name = _acme-challenge.acme.66c.dev.

 Authoritative answers can be found from:

If I follow the canonical name, things look good:

 DNS server handling your query: localhost
 DNS server's address:	127.0.0.1#53
 
 Non-authoritative answer:
 _acme-challenge.acme.66c.dev	text = "76ptv0D1D1CU7bYS2odTW8PnOGXiokBKfs4iqAAzoew"
 
 Authoritative answers can be found from:

I’m pretty sure that rdata_ tag on the first result is something to do with DNSSEC, so I’ve turned it off just in case that helps, but apparently it can take up to 2 days to propagate. In the meantime, the only difference I can see is that my ancestor CNAME ends with a “.”, which I need to remove for kloth.net to look up nameservers successfully. I think that’s just the two using different styles; if I try to remove the trailing period in Google Domains it gets grumpy.

Wondering now if I misinterpreted Ryan’s comment, and Certbot’s issues are a little deeper than I understood (e.g., it won’t follow CNAMEs during validation - I thought it was just ignoring them during TXT record population). But this goes against my understanding of who would be doing that step (i.e., I don’t think it’s up to Certbot if / how to parse and validate a domain - that’s up to LE’s ACME servers).
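
For reference, the validation lookup is done by the ACME server's resolver, which follows the CNAME and then asks for the TXT record at the target name. The toy model below mirrors the two lookups above using an in-memory dict rather than real DNS; resolve_txt and RECORDS are illustrative names only.

```python
# Toy record set mirroring the two lookups shown above (not live DNS data).
RECORDS = {
    ("_acme-challenge.66c.dev", "CNAME"): ["_acme-challenge.acme.66c.dev."],
    ("_acme-challenge.acme.66c.dev", "TXT"):
        ["76ptv0D1D1CU7bYS2odTW8PnOGXiokBKfs4iqAAzoew"],
}


def resolve_txt(name, records, max_hops=8):
    """Return the TXT values for name, following CNAMEs along the way.

    The record type being asked for stays TXT at every hop; only the
    name changes when a CNAME is found.
    """
    name = name.rstrip(".").lower()
    for _ in range(max_hops):
        txt = records.get((name, "TXT"))
        if txt:
            return txt
        cname = records.get((name, "CNAME"))
        if not cname:
            return []  # no TXT and no alias: the validator finds nothing
        name = cname[0].rstrip(".").lower()
    raise RuntimeError("too many CNAME hops")


print(resolve_txt("_acme-challenge.66c.dev", RECORDS))
```

The trailing dot on the CNAME target is normalised away here, which matches the observation that the two tools simply display names in different styles.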


I think you're on the right track. The error indicates the failure was during "secondary validation". What that phrase refers to is how the ACME server is doing Multi-Perspective Validation. In short, it not only tries to validate from the primary LE datacenter, but also from a number of other datacenters across the globe to protect against a local traffic/DNS spoofing attack. But since a provider like Google has DNS servers all over the globe, it takes a while for the changes you make to propagate everywhere. Here's a related thread:

In short, you may just need to wait a bit longer after publishing the record and before initiating the validation. The certbot Google DNS plugin has a --dns-google-propagation-seconds parameter that defaults to 60 seconds that you should be able to tweak.
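
If a fixed delay turns out to be fragile, an alternative is to poll each authoritative nameserver until the record is visible before starting validation. This is only a sketch of the idea: wait_for_txt is a hypothetical helper, and the lookup callable is something you would have to supply yourself (for example, a wrapper around dig or a DNS library). Seeing the record on every authoritative server still does not guarantee that every validation vantage point sees it.

```python
import time


def wait_for_txt(lookup, nameservers, name, expected,
                 attempts=30, interval=10, sleep=time.sleep):
    """Poll until every nameserver returns the expected TXT value.

    lookup(nameserver, name) must return a list of TXT strings.
    Returns True once all nameservers agree, False if attempts run out.
    """
    for attempt in range(attempts):
        # Collect the nameservers that do not serve the new value yet.
        pending = [ns for ns in nameservers
                   if expected not in lookup(ns, name)]
        if not pending:
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

With the defaults this waits at most about five minutes, compared with the 60 seconds shown in the add-on log above.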
