Do you mean "on" as in "available" or as in "forcefully required"?
Our understanding is it is available but not mandatory in production. If it is mandatory in staging, then we have no means at all to test certificate renewals against the upcoming June 6 changes.
Since the issuance change has a concrete rollout date within two weeks, while async finalization is a "maybe, for some things, someday", could staging be made to reflect now how production will function on June 6?
That upgrades the version of the underlying dehydrated library to something that supports async finalize.
It seems lua-resty-auto-ssl hasn't received even minor maintenance updates in 3 years. I am not sure we are going to go out of our way to support seemingly abandoned projects.
We had tested against the updated dehydrated during the attempted rollout of async finalization, and it was unfortunately not successful.
Our system is lua-resty-auto-ssl based, and heavily customized. We are not seeking support for either, though, only a stable testing environment that reflects the existing production environment.
We have a greatly increased rate limit from Let's Encrypt for the number of certificates we handle each day, and any failures there would mean a fair number of disappointed people. Should there be any issues with the June 6 issuance changes, we would like to commit our resources to addressing those now, and not to possible-maybe-someday async features.
I understand the desire for a testing environment that reflects the current production environment, but we need to recognize that the staging environment is in a constant state of compromise, trying to balance many different testing needs. In some cases we want it to reflect prod as it is today, and in others we want it to reflect prod as it will be in the future.
In this case, it has revealed that your client does not support asynchronous finalization, a thing that it really should support, and which it may need to support if and when we turn async finalization back on in prod. While I understand the frustration at not being able to test the new chains in staging with your current client, this means that having async finalization required in staging is working as intended. Please use this as an opportunity to change or upgrade your client to one which does support async finalization.
Without a testing ground supporting synchronous calls, Let's Encrypt has effectively gone async-only. We could respect that were it reflected in documentation or clearly communicated elsewhere, but as-is you are correct: it is frustrating.
We'll accept this conversation and the current configuration of the testing environment as a form of communicating that async-only is looming...again. Regardless, it has been on our roadmap to migrate to certbot and fully modernize our system, but neither is likely to be possible before the issuance changes are released to production.
We do not currently have any plans to turn on mandatory async finalization in prod in the future. I wouldn't say that "async-only is looming". However, we may: turn it on for orders with many names; or turn it on for orders that require CAA rechecking due to relying on old validations; or turn it on for any order that takes more than 500ms to finalize; or turn it on for all orders during an emergency that is causing finalization to take unexpectedly long.
Because of this, we are keeping async finalization on in Staging for much the same reasons that we put a random key-value pair in the Directory: to encourage and require client agility. Asynchronous finalization has always been part of the RFC 8555 ACME specification; clients that do not implement it have always been time bombs lying in wait to cause issues for their operators. We ran smack into those problems when we first attempted to turn on async finalization; we are not willing to let clients ossify further.
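As a rough illustration of what that agility means on the client side, here is a minimal Python sketch of directory parsing that tolerates unknown keys. The function name `parse_directory` and the sample URLs are hypothetical; the field names come from RFC 8555 section 7.1.1, and treating these five as required is an assumption of this sketch rather than something stated in this thread.

```python
import json

# Directory fields named in RFC 8555 section 7.1.1. Treating these five as
# required is an assumption of this sketch.
REQUIRED_FIELDS = ("newNonce", "newAccount", "newOrder", "revokeCert", "keyChange")
OPTIONAL_FIELDS = ("newAuthz", "meta")


def parse_directory(body: str) -> dict:
    """Return the known directory entries and silently ignore everything else."""
    raw = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError("directory is missing fields: " + ", ".join(missing))
    # Unknown keys (such as a random key-value pair) are dropped, not rejected.
    return {name: raw[name] for name in REQUIRED_FIELDS + OPTIONAL_FIELDS if name in raw}
```

A client written this way keeps working when the server adds an entry it has never seen, which is the behavior the random directory key is meant to exercise.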
Again, I'm sorry that this makes testing against staging difficult for your particular case. We are a small organization, and we cannot dedicate time and energy to supporting broken clients that have not had active development in over three years.
Hey, I'm a part of a small team, too! SPOILER: it's how we got married to lua-resty-auto-ssl years prior to its stagnation, and we have yet to replace it because it still works (and from our perspective, smoothly, too). We don't have extra time to go around fixing things that aren't broken and/or labeled as deprecated.
We appreciate the extra context, sincerely. The lack of other definitions, however, makes it difficult for a small team to allocate precious man hours, which I trust you understand. If sync calls are end-of-life, just say it. For real, this time (we know you did once and it didn't go well...but maybe there's more to that...). Hard dates and clear announcements can be planned around. Tidbits of info here and there throughout a forum with wishy-washy/contradictory requirements, not so much.
You say "broken client", but it's clearly functioning in production and not labeled as unsupported anywhere. In fact, it's even linked to as a client option, and has been for years.
So, we're trying to stay up-to-date on the API announcements, follow the guidelines, test for the upcoming changes...and we can't. I guess we're out of luck, but don't tell us it's because we're using broken things.
Maybe we could help with some kind of workaround if we understood what exactly it is with the upcoming changes that you're trying to test? The intermediates on staging and production are different anyway (since the staging ones aren't trusted, of course). And it sounds like you haven't run your system against staging since before they turned on async finalization which was quite a long time ago now, so I don't think that you're testing how your system deals with an intermediate change. So, what exactly are you trying to test?
Unfortunately ACME doesn't have client compatibility modes, so clients have to stay up to date or they will break over time as services mature (hopefully within the confines of the RFC 8555 spec, but that's not guaranteed by anyone). All ACME clients have seen this over time.
ZeroSSL and others have had async (i.e. slow) finalization for a long time, and most steps in the ACME process require polling for status changes rather than assuming orders will move to the next expected status immediately.
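The polling described here can be sketched roughly as follows, in Python. This is not how lua-resty-auto-ssl or dehydrated implement it; `wait_for_certificate_url` and the `fetch` callable are hypothetical names, and `fetch` is assumed to perform the POST-as-GET on the order URL and return the order JSON together with any Retry-After value in seconds. The status names ("processing", "valid", "invalid") and the "certificate" field are those defined for order objects in RFC 8555.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical callable: does a POST-as-GET on the order URL and returns
# (order_json, retry_after_seconds_or_None).
Fetch = Callable[[], Tuple[dict, Optional[float]]]


def wait_for_certificate_url(
    fetch: Fetch,
    max_attempts: int = 10,
    default_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a finalized order until it is valid, then return its certificate URL."""
    for _ in range(max_attempts):
        order, retry_after = fetch()
        status = order.get("status")
        if status == "valid":
            # The certificate is ready; the caller downloads it from this URL.
            return order["certificate"]
        if status == "invalid":
            raise RuntimeError("order failed: %r" % (order.get("error"),))
        if status != "processing":
            raise RuntimeError("unexpected order status after finalize: %r" % (status,))
        # Still processing: wait as long as the server asked, else a default.
        sleep(retry_after if retry_after is not None else default_delay)
    raise TimeoutError("order did not become valid in time")
```

A client that instead assumes the finalize response already carries the certificate URL works only while the server happens to answer synchronously.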
As a general rule if you find your client has become incompatible with staging you can bet money you'll eventually be incompatible with production. If you're lucky the problem will affect many clients and therefore be big enough that it gets pushed back for a period of time, but if not then your client has to adapt or become obsolete.
Ironically, there is just no set-and-forget when it comes to ACME services, and client maintenance is the cost of entry.
It does not conform to RFC 8555. Strictly speaking, it therefore does not speak the entire "ACME" language and could be considered (at least partly) broken.
Note that some free ACME CAs don't even have a staging provider. Another possibility might be to run your own Boulder instance. Although I'm not sure how well that's documented.
Best bet is to fork the ACME client used and modify it yourself if development has indeed stopped on the original client.