We have a problem where the Let's Encrypt server occasionally doesn't verify dns-01 challenges and keeps them in the "pending" state forever (or until we send "resource": "challenge" again).
For example, we request a new authz for psql02.trm-trans.beep.pl:
2019-03-12 15:09:58,236 - DEBUG - JWS payload:
b'{\n "identifier": {\n "type": "dns",\n "value": "psql02.trm-trans.beep.pl"\n },\n "resource": "new-authz"\n}'
2019-03-12 15:09:58,277 - DEBUG - Sending POST request to https://acme-v01.api.letsencrypt.org/acme/new-authz:
{
[…]
"payload": "ewogICJpZGVudGlmaWVyIjogewogICAgInR5cGUiOiAiZG5zIiwKICAgICJ2YWx1ZSI6ICJwc3FsMDIudHJtLXRyYW5zLmJlZXAucGwiCiAgfSwKICAicmVzb3VyY2UiOiAibmV3LWF1dGh6Igp9"
}
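The "payload" field is the base64url encoding of the JWS payload logged just before it, so it can be decoded to double-check what was actually sent. A minimal sketch, assuming Python 3 and only the standard library; the helper name `b64url_decode` is ours and not part of any ACME client:

```python
import base64
import json


def b64url_decode(data: str) -> bytes:
    """Decode base64url text, restoring the '=' padding that JWS strips."""
    padding = "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(data + padding)


# "payload" value copied from the new-authz request in the log above.
payload = (
    "ewogICJpZGVudGlmaWVyIjogewogICAgInR5cGUiOiAiZG5zIiwKICAgICJ2YWx1ZSI6"
    "ICJwc3FsMDIudHJtLXRyYW5zLmJlZXAucGwiCiAgfSwKICAicmVzb3VyY2UiOiAibmV3"
    "LWF1dGh6Igp9"
)

decoded = json.loads(b64url_decode(payload))
print(decoded["resource"], decoded["identifier"]["value"])
```

The same helper works on the payload of the later challenge request.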
The Let's Encrypt server answers fine:
2019-03-12 15:09:58,684 - DEBUG - Received response:
HTTP 201
Server: nginx
Content-Type: application/json
Content-Length: 1280
Boulder-Requester: 1732128
Link: <https://acme-v01.api.letsencrypt.org/acme/new-cert>;rel="next"
Location: https://acme-v01.api.letsencrypt.org/acme/authz/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes
Replay-Nonce: UpJHtxZ_2Dpi986_vGKSobDTvh8vklCmclALEsp5oSM
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800
Expires: Tue, 12 Mar 2019 14:09:58 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 12 Mar 2019 14:09:58 GMT
Connection: keep-alive
b'{\n "identifier": {\n "type": "dns",\n "value": "psql02.trm-trans.beep.pl"\n },\n "status": "pending",\n "expires": "2019-03-19T14:09:58Z",\n "challenges": [\n {\n "type": "http-01",\n "status": "pending",\n "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958532",\n "token": "urZi6CE5T2T_yy9yTyKCC6xLvhPYUdVX512XRas5Jvs"\n },\n {\n "type": "dns-01",\n "status": "pending",\n "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958534",\n "token": "5BsMA6xfGs1h5tVPL-C7fbv_nIK2lpeai_hfs_QrsPA"\n },\n {\n "type": "tls-sni-01",\n "status": "pending",\n "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958535",\n "token": "hqTsmgpFur34lSeN8fydIn56yyK6hhqOrEaaq9XYMcM"\n },\n {\n "type": "tls-alpn-01",\n "status": "pending",\n "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958536",\n "token": "ckjl5F13rLlbq3VOvOr7P4pAlWg3KIqjehXGsbq6koA"\n }\n ],\n "combinations": [\n [\n 1\n ],\n [\n 0\n ],\n [\n 2\n ],\n [\n 3\n ]\n ]\n}'
We chose dns-01, put the proper records in our DNS zones, and told the Let's Encrypt server about it:
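For anyone reproducing this: the dns-01 record is a TXT record at `_acme-challenge.<domain>` whose value is the base64url-encoded SHA-256 digest of the key authorization (the challenge token, a dot, and the account key thumbprint). A rough sketch of how that value can be computed, assuming Python 3; the function names are ours, and the key authorization is the one from our challenge request in this post:

```python
import base64
import hashlib


def dns01_txt_value(key_authorization: str) -> str:
    """base64url(SHA-256(key authorization)) without '=' padding."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def dns01_record_name(domain: str) -> str:
    """Fully qualified name of the TXT record the server queries."""
    return "_acme-challenge." + domain.rstrip(".") + "."


# Key authorization as it appears in our challenge request.
key_auth = (
    "5BsMA6xfGs1h5tVPL-C7fbv_nIK2lpeai_hfs_QrsPA"
    ".ndBNik9Qn4ddsVca8VLjaHENcFnRCFa1Rg30N_p3M8w"
)

print(dns01_record_name("psql02.trm-trans.beep.pl"), "TXT", dns01_txt_value(key_auth))
```

Comparing the printed value with what the authoritative name servers return is a quick way to rule out a wrong record before blaming the server.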
2019-03-12 15:15:05,484 - DEBUG - JWS payload: b'{\n "resource": "challenge",\n "keyAuthorization": "5BsMA6xfGs1h5tVPL-C7fbv_nIK2lpeai_hfs_QrsPA.ndBNik9Qn4ddsVca8VLjaHENcFnRCFa1Rg30N_p3M8w",\n "type": "dns-01"\n}'
2019-03-12 15:15:05,516 - DEBUG - Sending POST request to https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958534:
{
[…]
"payload": "ewogICJyZXNvdXJjZSI6ICJjaGFsbGVuZ2UiLAogICJrZXlBdXRob3JpemF0aW9uIjogIjVCc01BNnhmR3MxaDV0VlBMLUM3ZmJ2X25JSzJscGVhaV9oZnNfUXJzUEEubmRCTmlrOVFuNGRkc1ZjYThWTGphSEVOY0ZuUkNGYTFSZzMwTl9wM004dyIsCiAgInR5cGUiOiAiZG5zLTAxIgp9"
}
Let's Encrypt accepted that:
2019-03-12 15:15:05,816 - DEBUG - Received response:
HTTP 202
Server: nginx
Content-Type: application/json
Content-Length: 336
Boulder-Requester: 1732128
Link: <https://acme-v01.api.letsencrypt.org/acme/authz/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes>;rel="up"
Location: https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958534
Replay-Nonce: tdoYpnwBNxMYB-6g9hZxgiCdGQEka4apknkD7OjhN2s
Expires: Tue, 12 Mar 2019 14:15:05 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 12 Mar 2019 14:15:05 GMT
Connection: keep-alive
b'{\n "type": "dns-01",\n "status": "pending",\n "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/vwKBbydMj09GXl6Cw-L3awe-gEoczQ0M71NWd4JSWes/13564958534",\n "token": "5BsMA6xfGs1h5tVPL-C7fbv_nIK2lpeai_hfs_QrsPA",\n "keyAuthorization": "5BsMA6xfGs1h5tVPL-C7fbv_nIK2lpeai_hfs_QrsPA.ndBNik9Qn4ddsVca8VLjaHENcFnRCFa1Rg30N_p3M8w"\n}'
But so far it hasn't verified the DNS records, and the status stays "pending".
Now we are stuck.
We can get unstuck if we send "resource": "challenge" again; the Let's Encrypt server then performs the DNS validation. It is as if Let's Encrypt didn't save the information about the first challenge response.
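Our workaround can be sketched roughly as follows: poll the challenge, and if it is still "pending" after a while, send the challenge response again. This is only an illustration of the idea, not our actual client code. `fetch_status` and `resubmit` are hypothetical callables standing in for "GET the challenge URI and read its status" and "POST the signed challenge response again", and the timing values are arbitrary:

```python
import time
from typing import Callable


def wait_for_validation(
    fetch_status: Callable[[], str],
    resubmit: Callable[[], None],
    poll_interval: float = 5.0,
    resubmit_after: float = 60.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status leaves "pending", re-sending the response if it stalls.

    Returns the last status seen, which is still "pending" if the timeout is hit.
    """
    start = clock()
    last_submit = start
    while True:
        status = fetch_status()
        if status != "pending":
            return status  # the server has made a decision
        now = clock()
        if now - start >= timeout:
            return status  # give up; caller decides what to do next
        if now - last_submit >= resubmit_after:
            resubmit()  # nudge the server, as we currently do by hand
            last_submit = now
        sleep(poll_interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised with a fake clock instead of waiting in real time.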
Why is that happening?
Recently we have been seeing this problem a few times per day. Earlier it happened about once per week.