Starting a few hours ago we are seeing challenge failures on http challenges on the staging environment for all domains.
We have seen this before, it was an issue resolved by lets encrypt (Staging Certificate Generation Failure)
We use Acme4j.
We get this output:
Challenge fails:
{
"type": "urn:ietf:params:acme:error:malformed",
"detail": "Method not allowed",
"status": 405
}
I can't reproduce it here because on staging, my domain challenge is stuck in pending state for more than a minute. There seems to be a general problem with staging at the moment.
I do my best to come up with unique ways to break staging each time. In this case, I made all the new multi-VA nodes come up without public IP addresses.
JC takes too much blame for himself! This was caused by an upgrade of a third-party software component (aws terraform provider) which slightly changed behavior.
While we are all inconvenienced by this failure in staging, I hope everyone appreciates that this staging failure enabled us to avoid the failure happening in production -- that would have been a much bigger deal!
We do have private dev environments, but some things (like autoscaling behavior that caused this) don't really show up until you're in something pretty close to the real thing.