Receiving 400 error code


#1

I’ve been receiving 400 responses to requests coming from Europe. This happened highly inconsistently and at various stages of the certificate issuance process. Eventually my certificate did get issued after about 16 retries; by luck I didn’t get any 400 errors on the last run. So I don’t need help issuing a certificate, but I think there are one or more unhealthy components serving requests in Europe.

I issued six other certificates in the U.S. at the same time, and none of them had an issue.
Hard-coding acme-v02.api.letsencrypt.org in /etc/hosts to an IP that was used in another region did not help; I still received 400 errors.

I’m using dns-01 challenges. The client IP making the API requests is not associated with the DNS name for the certificate being issued.

My domain is: cmon.eu-ams-1.triton.zone

I ran this command:

dehydrated -c

It produced this output:

Details:
HTTP/1.1 400 Bad Request
Server: nginx
Content-Type: application/problem+json
Content-Length: 169
Boulder-Requester: 43620828
Replay-Nonce: REDACTED
Expires: Wed, 10 Oct 2018 16:54:43 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Wed, 10 Oct 2018 16:54:43 GMT
Connection: close

{
  "type": "urn:ietf:params:acme:error:badNonce",
  "detail": "JWS has an invalid anti-replay nonce: \"REDACTED\"",
  "status": 400
}

My web server is (include version):

N/A

The operating system my web server runs on is (include version):

SmartOS (Illumos)

My hosting provider, if applicable, is:

Joyent

I can login to a root shell on my machine (yes or no, or I don’t know):

Yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel):

No


#2

Thanks for posting! Are you by any chance coming from behind a multi-IP NAT? We’ve seen some issues with that before, since we have separate nonce pools per-datacenter, and use IP stickiness to ensure the same client usually hits the same nonce pool. In theory clients should be implementing automatic retry on badNonce errors - not sure if dehydrated does.

I’ll also check to see if we’re seeing elevated nonce-related errors.

Update: Looks like dehydrated doesn’t handle retries, which exacerbates any nonce-related issues: https://github.com/lukas2511/dehydrated/issues/547
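
For anyone writing their own client, here is a rough sketch of what retrying on badNonce could look like. It is not how dehydrated or any particular client does it. The `send` callable is hypothetical: it stands in for whatever signs and POSTs one JWS with a given nonce and returns `(status, headers, body_text)`, and the sketch assumes `headers` is a plain dict keyed by `Replay-Nonce`. The idea is that a badNonce response itself carries a fresh nonce in its `Replay-Nonce` header, so the client can re-sign with that nonce and try again.

```python
import json

BAD_NONCE = "urn:ietf:params:acme:error:badNonce"


def post_with_nonce_retry(send, first_nonce, max_attempts=3):
    """Call send(nonce) until it stops failing with badNonce.

    `send` is a hypothetical callable (an assumption for this sketch): it
    signs and POSTs one JWS using the given nonce and returns
    (status, headers, body_text).
    """
    nonce = first_nonce
    for _ in range(max_attempts):
        status, headers, body = send(nonce)
        if status != 400:
            return status, headers, body
        try:
            problem = json.loads(body)
        except ValueError:
            # Not a JSON problem document, so not something we can retry on.
            return status, headers, body
        if not isinstance(problem, dict) or problem.get("type") != BAD_NONCE:
            # Some other 400 error; retrying with a new nonce will not help.
            return status, headers, body
        # The rejection includes a fresh nonce; use it for the next attempt.
        nonce = headers.get("Replay-Nonce")
        if nonce is None:
            break
    # Out of attempts (or no fresh nonce offered): return the last response.
    return status, headers, body
```

Capping the attempts matters: if the client keeps landing on a different nonce pool, an unbounded loop would just keep hammering the API.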


#3

Nope, this server has a dedicated IP.

As for dehydrated not handling retries, we know about that, but it still fits our use case best. We retry manually if initial issuance fails, and our cron jobs will retry when the cert is nearing expiration.


#4

I checked our logs, and it does look like your traffic is coming from two different IP addresses, which makes nonce issues likely. Maybe this server has two interfaces and is balancing traffic across them?

Thanks for raising this and providing the extra details. It looks like this isn’t an outage, but we always appreciate our community flagging potential availability issues.


#5

Can you tell me which IPs they were (privately somehow? or perhaps just the last octet number?) so I can see where those IPs are?

The cert in question should have 3 IPs associated with it.


#6

I think I figured it out. This server has two default gateways assigned, which is a misconfiguration. And yes, on my operating system it would automatically load balance between them. Thanks for helping me track that down!
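
In case someone else hits this, a small sketch of the check that would have caught it: count the `default` lines in the routing table. It assumes the text comes from something like `netstat -rn` and that the destination column literally prints `default`, which is how it looks on my illumos box; other systems may format it differently. The function name and the sample addresses (documentation ranges) are made up for illustration.

```python
def default_gateways(netstat_output):
    """Return the gateway column of every route whose destination is 'default'."""
    gateways = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "default":
            gateways.append(fields[1])
    return gateways


# Illustrative sample only, not real output from the affected server.
SAMPLE = """\
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.0.2.1            UG        1          0 net0
default              198.51.100.1         UG        1          0 net1
127.0.0.1            127.0.0.1            UH        2         12 lo0
"""

gateways = default_gateways(SAMPLE)
if len(gateways) > 1:
    print("more than one default gateway:", ", ".join(gateways))
```

More than one entry means outbound requests may leave via different paths, which is what made my traffic show up from two IPs.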

