Peter, first of all, THANK YOU for your detailed response. Very, very helpful.
We had run into this a long time ago, and I couldn't recall what was going on. Sure enough, this client had their AAAA records pointing to another unrelated server, which apparently was causing the validation to fail.
Hey, it's somebody else using a custom AWS Lambda! I'm curious how big this club is. Are you using an existing ACME library to handle making the requests, or is it just completely custom?
Good question! I am actually not 100% sure on this, as we had a third party put the Lambda together; I think it might be a completely custom integration (written in Node). Our system looks up domains in our own db and, when they are ready to be renewed, automatically calls the Lambda. Well, two Lambdas actually: our first Lambda gets the "verification string" (which we store), and if that works as expected, our second Lambda then attempts to generate the cert. It tries several times automatically, and if it keeps failing, it stops trying.
I'm guessing that your authorization challenges failed, and your code isn't logging or diagnosing that properly. My best guess as to why it failed is that your challenge URL you posted only works on IPv4-only networks. That is, the IPv4 address works but the IPv6 address is returning a file-not-found.
I don't know that there is anything I can do in a case like this, where there is a faulty AAAA record, as it falls outside of our system. Can you think of a way I could programmatically respond better to an unclear response like 403 Forbidden (which I imagine could mean many things)?
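To make that concrete, here is a rough sketch of the kind of handling I have in mind (in Python for brevity, although our Lambda is in Node). It assumes the failed challenge comes back with an ACME problem document, meaning the type and detail fields described in RFC 8555. It also assumes the detail text mentions the address that was contacted, which I have not confirmed for every case. The function name and the hint wording are made up for illustration.

```python
import ipaddress
import re

# Hints keyed by ACME error type (RFC 8555). The wording is my own guess at
# what would be useful to show a client, not anything official.
HINTS = {
    "urn:ietf:params:acme:error:unauthorized":
        "The server was reached but did not return the expected challenge content.",
    "urn:ietf:params:acme:error:connection":
        "The CA could not connect to the server at all.",
    "urn:ietf:params:acme:error:dns":
        "The CA had a problem with the DNS lookup for the domain.",
}


def explain_challenge_error(problem):
    """Turn the 'error' object of a failed challenge into a readable hint."""
    error_type = problem.get("type", "")
    detail = problem.get("detail", "")
    hint = HINTS.get(error_type, "Unrecognized error type; log the full problem document.")

    # Assumption: the detail text may contain the address that was contacted.
    # Collect anything in it that parses as an IPv6 address.
    ipv6_addresses = []
    for token in re.findall(r"[0-9A-Fa-f:.]{3,}", detail):
        try:
            address = ipaddress.ip_address(token)
        except ValueError:
            continue
        if address.version == 6:
            ipv6_addresses.append(str(address))

    if ipv6_addresses:
        hint += " An IPv6 address was involved, so check the domain's AAAA record."

    return {"type": error_type, "hint": hint, "ipv6_addresses": ipv6_addresses}
```

With something like this, our system could at least store the hint next to the failed renewal instead of a bare 403.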
That said, if I could ask a few questions to get a better understanding of what's going on, perhaps I can make some adjustments:
the IPv4 address works but the IPv6 address is returning a file-not-found.
This is confusing to me. The vast majority of our clients do NOT have IPv6 DNS entries, and they are processed without difficulty. I guess the issue here is that the ones which DO have IPv6 entries are going to be tested during the LetsEncrypt verification process, and if they exist, well then they need to be correct. Is that right?
I'm not a networking guy, so I am confused... how is it that LetsEncrypt is issuing requests that use both the IPv4 and the IPv6 DNS entries? I guess their code is designed to look for those DNS entries and if it finds an AAAA record, it will issue both the v4 and v6 requests?
That is baffling to me as I thought all they were doing was hitting the verification url, and if it serves up the correct string, it is valid.
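To check my own understanding, here is a minimal sketch of a pre-flight test we could run before calling the Lambda: it requests the verification URL once per resolved address, IPv4 and IPv6 separately, so a stale AAAA record would show up as a mismatch. This is only my approximation and not how Let's Encrypt itself validates. The function names are hypothetical, it only looks at the HTTP status (not the body or redirects), and it can only test IPv6 from a machine that has working IPv6 connectivity.

```python
import http.client
import socket


def addresses(hostname, family):
    """Return the addresses the local resolver gives for one address family."""
    try:
        infos = socket.getaddrinfo(hostname, 80, family=family, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # no records of this family (or the lookup failed)
    return sorted({info[4][0] for info in infos})


def fetch_status(ip, hostname, path, timeout=10):
    """GET the path from one specific IP, sending the real hostname as Host."""
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": hostname})
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None  # could not connect from this machine
    finally:
        conn.close()


def preflight(hostname, path, resolve=addresses, fetch=fetch_status):
    """Map (family label, ip) to the HTTP status seen at that address."""
    report = {}
    for label, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        for ip in resolve(hostname, family):
            report[(label, ip)] = fetch(ip, hostname, path)
    return report
```

If the IPv4 entry reports 200 and the IPv6 entry reports 404 or None, we could flag the domain for a DNS fix before even attempting the renewal.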
I saw in your curl statement that you used a -6 param. I tried that on my end (with a MacBook), and it did not come back with the 404 that you received; mine came back with the expected verification string response. Any idea why that happened?
I guess, taking a step back, my question is this: what causes a client (any client really, not just LetsEncrypt) to make use of the IPv4 address (the A record) vs. the IPv6 address (the AAAA record) to complete a request? Is it just based on how the browser (or other client) is written? Sorry, again I am not very well versed in networking.
Any insight you can provide is again much appreciated, Peter! I'm a knucklehead with this stuff, and Googling/researching these questions is not providing me with much clarity.