The server SHOULD include a Retry-After header indicating the polling
interval that the ACME server recommends. Conforming clients SHOULD
query the renewalInfo URL again after the Retry-After period has
passed, as the server may provide a different suggestedWindow.
I'm not seeing one on my responses, unless quite possibly I have a bug. Since Pebble doesn't yet implement ARI, I'm having trouble testing this. I guess I could just pretend we received one -- but it's not clear whether Boulder will use a number of seconds or a timestamp (I believe both are allowed). Just wondering -- thank you!
I was also missing this header in my HttpResponseMessage in .NET, though looking at the Boulder source, it seems to set it to 6 hours (in seconds). I just ignored it because my client runs on a daily schedule anyway, but you're not alone in seeing something weird there.
Retry-After can be set both by Boulder and by the firewall/gateway. The firewall/gateway currently uses seconds.
There were some sample headers in the Lounge in the lead-up to the service-busy implementation last Fall. I'm not sure whether you can access those (I assume you have enough privs on this forum for that; if not, the staff should change that). Those are implemented on the load balancers, use seconds, and reference the RFC. The RFC is silent on seconds vs timestamp, and the HTTP specs allow both - but the timestamp must be in the HTTP Date format (for an example, see Retry-After - HTTP | MDN).
Personally, I would expect this to be seconds and look to support the date format as an edge case.
I had actually just implemented support for both formats, so now I have that working. If the header does show up, I'll be ready!
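In case it's useful to anyone else, here's a rough sketch of the approach (in Go for readability, not my actual client code; the 6-hour fallback is just my own assumption, not anything the spec requires):

```go
// Sketch: parse a Retry-After header that may be either delta-seconds or an
// HTTP-date, falling back to a default when it's absent or unparseable.
// Illustrative only; the 6-hour default is an assumption, not from the RFC.
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

const defaultPollInterval = 6 * time.Hour // assumed fallback

func retryAfter(resp *http.Response) time.Duration {
	v := resp.Header.Get("Retry-After")
	if v == "" {
		return defaultPollInterval
	}
	// Delta-seconds form, e.g. "21600".
	if secs, err := strconv.Atoi(v); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second
	}
	// HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
	if t, err := http.ParseTime(v); err == nil {
		if d := time.Until(t); d > 0 {
			return d
		}
		return 0
	}
	return defaultPollInterval
}

func main() {
	// Canned response just to show the seconds form being parsed.
	resp := &http.Response{Header: http.Header{"Retry-After": []string{"21600"}}}
	fmt.Println(retryAfter(resp)) // 6h0m0s
}
```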
(I don't have Lounge access. It's OK -- I'm not very active here except when doing ACME-related work, but I wouldn't mind access either if that's something mods agree on. I hope when I am active, at least, I can contribute.)
Me too. That's why I'm here and super active for a month or two, then a few months away, repeating infinitely... IMHO you're definitely on the list of the most insightful and positive community members here.
Ha, I don't know if I agree, I sometimes feel like a squeaky wheel, or at least less helpful than most of the amazing helpers here -- but thanks, I appreciate being welcomed.
Anyway, right now my clients typically scan certificates for upcoming expiration about every hour by default. If the expectation is more like ~6 hours for ARI polling, it'd be good for the server to emit that header so we can honor it. I'll implement support for Retry-After, but just FYI: once deployed, our clients will be hitting it every hour for every cert until that header appears, I guess.
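To illustrate what I mean, a minimal sketch (Go, with hypothetical names and intervals) of how a client with an hourly default cadence could stretch out its per-cert checks once a Retry-After suggestion shows up:

```go
// Sketch: per-certificate ARI scheduling. The hourly default and the URL
// are made up for illustration; this is not the actual ARI wire format.
package main

import (
	"fmt"
	"time"
)

const defaultInterval = time.Hour // the client's own cadence when no header is seen

type certState struct {
	renewalInfoURL string
	nextCheck      time.Time
}

// schedule sets a cert's next ARI check: use the server's suggestion when
// we got one, otherwise fall back to the client's default.
func schedule(c *certState, serverSuggested time.Duration, now time.Time) {
	interval := defaultInterval
	if serverSuggested > 0 {
		interval = serverSuggested
	}
	c.nextCheck = now.Add(interval)
}

func main() {
	now := time.Now()
	c := &certState{renewalInfoURL: "https://example.test/acme/renewal-info/abc"}

	schedule(c, 0, now) // no Retry-After observed yet: check again in an hour
	fmt.Println("next check in:", c.nextCheck.Sub(now))

	schedule(c, 6*time.Hour, now) // server suggested 21600 seconds
	fmt.Println("next check in:", c.nextCheck.Sub(now))
}
```

The point is just that the cadence stays the client's own default until the server actually says otherwise.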
For what it's worth, I don't either. Though I did at some point in the past. I think it's based on an internal Discourse metric called Trust Level which may change automatically based on activity?
On the subject of Retry-After: while the CA may have recommendations, it's not operating with full knowledge of your environment.
For instance, I have some users managing 18000+ certs on one server and one account. In that case the client will have to check at the very least 24 certs per hour to get through them all in a month, and deferring to the suggested Retry-After may or may not yield the required throughput. On top of that, I believe the expectation/convention is to check every cert every 6-12 hours [so, up to 72000 checks per day in this case].
Personally, I would have liked a single endpoint saying that one or more events is happening (or has happened) that potentially affects certs issued within a date range, rather than having to check every cert every day; but I'm not aware of all the scenarios the standard is intended to cover.
Always remember that, if you think you've found a bug in Boulder, we'd really appreciate you filing a bug in the GitHub repo. That way we're guaranteed to see it (we easily could have missed this thread), and anyone on the team can jump on the fix. Even if you're not sure it's a real bug, we're happy to get the report and we'll investigate.
Part of it is that people tend to expect the problem to be in their own client rather than in Boulder. Boulder is usually really high-quality code, after all.
It's also a bit confusing to client developers that there's been a blog post about ARI but nothing in API Announcements (and client authors are encouraged to subscribe to API Announcements, without any similar encouragement for the blog). So it's not clear (at least to me) just how "live" ARI is supposed to be, or whether it's in some kind of "soft launch" where there are responses in production but client authors aren't yet expected to really integrate with it.
Of course; thanks. I wasn't sure whether it was a bug in my client (which is in development) or in Boulder (which is in production), and since the spec doesn't require Retry-After, I didn't know what to expect.
It is sorta a soft launch. Obviously, the RFC is still a draft and isn't fully finalized. We want client authors to put some effort in, so we get a good sense of how difficult it is to implement and how useful it is, without every niche client implementing it just in case, only to have to reimplement if the RFC ends up changing. Blog posts and community forum posts also serve different purposes with regard to things like fundraising.
I would have expected a thread in the API Announcements category then?
Also, was the ARI feature enabled on staging before production, or was it enabled on both environments at the same time? I ask because there's no blog post mentioning staging.
Also, the blog says "ARI has been standardized in the IETF (…)", but currently it's still a draft, right? I'd think you could call it "standardized" only once it has reached RFC status?
It was enabled in Staging for months prior to being enabled in prod. As I said, we enabled it quietly on purpose, to facilitate a very slow roll-out. Getting threads like these, where a small collection of the best and most engaged client authors are discovering bugs in our own implementation, is vastly preferable to having the entire ecosystem filing dozens of duplicate bug reports, forum threads, emails, and twitter DMs.