Hi @bruncsak, thanks for the follow-up information!
The nonce pool is per-environment (staging vs prod) and per-datacentre. If the egress IP changes between requests, our load balancing may send a request that retrieved a nonce from one DC to another DC that doesn't know that nonce.
Since the user is only using one IP we can rule this out as a problem.
Great, thanks! I provided my e-mail to the user on the GitHub ticket so they can send the domains to me that way.
This isn't a workaround per se. Even if we resolve this particular problem, retrying on a nonce failure is a best practice that will help in the future if (for example) a nonce expires out of the pool between when you get it and when you use it.
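To make the retry idea concrete, here is a minimal sketch of what a client could do. It is not taken from any particular ACME client: `post_with_nonce_retry`, `send`, and `get_nonce` are hypothetical names, and `send` stands in for whatever code signs the JWS with the given nonce and POSTs it. The sketch assumes the server reports the failure as an HTTP 400 with error type `urn:ietf:params:acme:error:badNonce` and includes a fresh `Replay-Nonce` header in that error response.

```python
from typing import Callable, Dict, Tuple

BAD_NONCE = "urn:ietf:params:acme:error:badNonce"

# (HTTP status, response headers, parsed JSON body)
Response = Tuple[int, Dict[str, str], Dict[str, str]]


def post_with_nonce_retry(
    send: Callable[[str], Response],   # hypothetical: signs with the nonce and POSTs
    get_nonce: Callable[[], str],      # hypothetical: fetches a nonce, e.g. via newNonce
    max_attempts: int = 3,
) -> Response:
    """Retry a signed request when the server rejects its nonce."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    nonce = get_nonce()
    for attempt in range(1, max_attempts + 1):
        response = send(nonce)
        status, headers, body = response
        is_bad_nonce = status == 400 and body.get("type") == BAD_NONCE

        # Return anything that is not a nonce failure, or the last failure
        # once the attempt budget is used up.
        if not is_bad_nonce or attempt == max_attempts:
            return response

        # Prefer the fresh nonce from the error response; otherwise fetch one.
        # The request must be re-signed, because the nonce is inside the JWS.
        nonce = headers.get("Replay-Nonce") or get_nonce()

    raise AssertionError("unreachable")
```

Capping the number of attempts keeps a client from looping forever if every nonce is rejected, which would point at a different problem than an expired or unknown nonce.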
I think you are misunderstanding the purpose of the nonce. I recommend you review Section 6.4 and the ACME threat model. The nonce is strictly to prevent replay attacks from a middle party that terminates the TLS connection (e.g. a CDN). The goal is to prevent the CDN from replaying a request it processed and forwarded to the ACME server previously. A man-in-the-middle attacker is unable to modify the ACME request without breaking the JWS that authenticates it with the user's account key. Retrying requests on nonce-failure will not give a MITM any information they couldn't already observe.