Pending, pending, … suddenly valid?

Hello,

I’ve noticed a peculiar problem with some cert orders that seems like it might be a Boulder/LE issue rather than a problem with our client.

I created an order and started doing its authzs’ HTTP challenges. Two of the challenges (https://acme-v02.api.letsencrypt.org/acme/chall-v3/3416094598/mJS7tg and https://acme-v02.api.letsencrypt.org/acme/chall-v3/3416094600/-lCigg) remained “pending” after 30 seconds, so my client gave up on HTTP and switched to DNS (i.e., it created a new order for the same set of domains).

Our DNS challenge logic, though, failed because the authz was valid at that point, and so there was no DNS challenge in the authz object.

I can (and will) update our DNS challenge logic. There’s definitely a race condition built into our workflow; it just seems a bit funny that it would happen for two domains at the same time.
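
One way the DNS logic could cope with this is to look at the authz status before looking for a challenge. This is only a minimal sketch: it assumes the authz has already been fetched and parsed into a dict shaped like an RFC 8555 authorization object, and `pick_dns_challenge` is a hypothetical helper name, not something from our client or from any library.

```python
def pick_dns_challenge(authz: dict):
    """Return the dns-01 challenge to answer, or None if nothing is left to do."""
    status = authz.get("status")
    if status == "valid":
        # A reused authz that was already validated (e.g. over HTTP) needs
        # no further challenge, and may not list a dns-01 challenge at all.
        return None
    if status != "pending":
        # invalid, deactivated, expired, revoked: this authz cannot be used.
        raise RuntimeError(f"authz is {status!r} and cannot be validated")
    for chall in authz.get("challenges", []):
        if chall.get("type") == "dns-01":
            return chall
    raise LookupError("pending authz offers no dns-01 challenge")
```

With this shape, a `None` return means "skip this authz and move on to finalizing the order" rather than an error.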

Are you guys able to look at logs and see what may have happened here? Were some authz polls returning “pending” when they should have returned “valid”?

I’m wondering if that initial order’s authzs were ever valid, or if somehow only the new order’s authzs reflected a successful challenge.

If a challenge neither fails nor succeeds, it will stay pending for a week, and you have a limit of 300 pending challenges.

You can’t just give up on a challenge; your client is badly programmed.

Hi @FGasper

that’s the wrong way.

There was another topic, same problem.

Looks like the new multi-perspective validation sometimes needs some more time.

So add additional sleeps (or something else).

Pending means not finished. If you cancel at that point, that’s wrong: you end up with too many pending orders.

30 seconds is nothing.

PS: There is the topic

Takes really long to switch from “status: pending” to “status: valid” …

The client calls it a “timeout”, but there isn’t a timeout; it’s simply processing.
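
A rough sketch of what "additional sleeps" could look like: keep polling with a growing delay and a much longer overall deadline instead of a fixed 30 seconds. Everything here is an assumption for illustration. `fetch` is a hypothetical callable that returns the parsed authz object (in a real client it would be a signed POST-as-GET to the authz URL), and the 300-second default is just a placeholder, not a value recommended by Let's Encrypt.

```python
import time

# Statuses after which an RFC 8555 authz will not become valid by waiting.
FINAL_STATES = {"valid", "invalid", "deactivated", "expired", "revoked"}

def poll_authz(fetch, timeout=300.0, first_delay=2.0, max_delay=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll until the authz leaves 'pending' or the deadline passes."""
    deadline = clock() + timeout
    delay = first_delay
    while True:
        authz = fetch()
        if authz["status"] in FINAL_STATES:
            return authz
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError("authz still pending after %.0f seconds" % timeout)
        # Never sleep past the deadline; double the delay up to max_delay.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting. A client that honors a `Retry-After` header from the server would use that value in place of the computed delay.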

I think it’s 300 authzs per week, not pending challenges.

We haven’t been hitting the rate limit you reference, which I suspect is because the authz doesn’t stay pending for a week; it just takes longer than we’re willing to wait for it to resolve. Our logic doubtless has room for improvement, but thus far it’s served us and our customers well.

https://letsencrypt.org/docs/rate-limits/
You can have a maximum of 300 Pending Authorizations on your account. Hitting this rate limit is rare, and happens most often when developing ACME clients. It usually means that your client is creating authorizations and not fulfilling them. Please utilize our staging environment if you’re developing an ACME client.

I’m well aware of the rate limits. :wink: As I say, we’ve not been hitting them.

You will, if you abandon challenges without waiting for them :wink:

(I am not sure if the challenge expires in a week or sooner/later, though… you should check it until it either fails or succeeds. Up to a day later should be fine.)

Our client is deployed widely enough—and for a long enough time—that were there an issue with the pending-authzs rate limit we’d almost certainly know.

But for the sake of argument, is there a way to check an account’s current “progress toward rate limit”?

I think the one-week interval is how long a valid authz lasts.

300 pending authzs per account is pretty high. It’s not like a common user would get rate-limited by this.

You should check your logs.

see:

https://letsencrypt.org/docs/rate-limits/

Clearing Pending Authorizations

If you have a large number of pending authorization objects and are getting a rate limiting error, you can trigger a validation attempt for those authorization objects by submitting a JWS-signed POST to one of its challenges, as described in the ACME spec. The pending authorization objects are represented by URLs of the form https://acme-v02.api.letsencrypt.org/acme/authz/XYZ , and should show up in your client logs. Note that it doesn’t matter whether validation succeeds or fails. Either will take the authorization out of ‘pending’ state. If you do not have logs containing the relevant authorization URLs, you need to wait for the rate limit to expire. As described above, there is a sliding window, so this may take less than a week depending on your pattern of issuance.

Note that having a large number of pending authorizations is generally the result of a buggy client. If you’re hitting this rate limit frequently you should double-check your client code.
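
To make the quoted procedure concrete, here is a hedged sketch of the "trigger a validation attempt" step. It assumes you already have the pending authz objects parsed as dicts, and that `post_signed(url, payload)` is a hypothetical function supplied by your own client that sends a JWS-signed POST. The empty JSON object as payload follows my reading of RFC 8555 (responding to a challenge).

```python
from typing import Callable, Iterable, List

def trigger_pending(authzs: Iterable[dict],
                    post_signed: Callable[[str, dict], None]) -> List[str]:
    """POST to one pending challenge per pending authz; return the URLs used."""
    triggered = []
    for authz in authzs:
        if authz.get("status") != "pending":
            continue  # only pending authzs count toward the limit
        for chall in authz.get("challenges", []):
            if chall.get("status") == "pending":
                # Success or failure does not matter here; either outcome
                # takes the authz out of the pending state.
                post_signed(chall["url"], {})
                triggered.append(chall["url"])
                break  # one challenge per authz is enough
    return triggered
```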

that’s a month.

We’d hit it, I suspect, since we have one ACME account per server.

We used to hit the rate limit, years back when this was all fairly new, because we did an internal preflight check before polling LE, which meant some authzs were never polled. Since we fixed that, though, there haven’t been problems.

The challenge status stays pending on its own, unless the ACME server decides to time it out after a very long time. It is the ACME client’s responsibility to POST to the challenge URL to initiate the transition out of this state. As far as I know, the challenge status changes to processing immediately after that POST.
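
The transitions described above can be written down as a small table. This is a sketch based on my reading of the challenge state diagram in RFC 8555, not on Boulder's code; the names `CHALLENGE_TRANSITIONS`, `can_move`, and `is_settled` are made up for illustration.

```python
# Challenge status changes, as I understand RFC 8555:
# pending -> processing happens when the client POSTs to the challenge URL;
# processing -> valid / invalid happens when the server finishes validating.
CHALLENGE_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"valid", "invalid"},
    "valid": set(),
    "invalid": set(),
}

def can_move(current: str, new: str) -> bool:
    """True if a challenge may go directly from `current` to `new`."""
    return new in CHALLENGE_TRANSITIONS.get(current, set())

def is_settled(status: str) -> bool:
    """True once a challenge status will no longer change."""
    return status in CHALLENGE_TRANSITIONS and not CHALLENGE_TRANSITIONS[status]
```

A client could use `is_settled` as the stop condition for polling a challenge, instead of a fixed timeout.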

That’s probably it, and why we’re not hitting the rate limit. We poll for a while, then give up and switch to DNS; that first poll probably moves the authz to processing.

30 seconds used to be enough. Maybe with LE’s switch to repeated (multi-perspective) validation there’s a stronger case for extending that timeout?

Why not just query the ACME server for the Authorization Object before switching to the DNS challenge?

I don’t take your meaning; polling for the authz object’s status is what we already do.