Heads Up! Staging Under Maintenance!

https://letsencrypt.status.io/

2 Likes

Right in the middle of me trying to alpha test my overhauled ACME client too!

:man_facepalming:

3 Likes

@sahsanu, @jmorahan

If you're around, please pin this topic ASAP.

jillian's here now.

1 Like

The community forum has some customization that shows a banner with the current status from the status page and a link to it. Right now it shows 'Planned Maintenance'; at other times it shows 'Service Disruption'. I've noticed that if you're already viewing the community forum, it might require a refresh to show the banner. But most visitors will likely see the banner, since they will be accessing the community forum "fresh". We'll look into ways to make sure the banner shows up more promptly and without a refresh.

3 Likes

Thanks for getting to this, jillian. :blush:

I'm always here and haven't seen anything, so I figured I'd try to preempt the issue.

2 Likes

Consider subscribing to our status page too! We do our best to keep our status page up to date and to specify the impact of our maintenance windows and incidents. In this case, we knew a prolonged staging outage could be problematic and alarming, so we gave notice in advance. Those who are subscribed to our status page started receiving e-mails one week ago for our upcoming maintenance window, and then again 72 hours before, 1 hour before, and when it started.

6 Likes

That's good advice! :slightly_smiling_face: I had forgotten all about this and so ran into it headlong myself.

I'd give you a like, but I'm out again. :sparkling_heart:

2 Likes

I suspect we may get (hopefully only) a handful of people here tomorrow wondering what happened.

2 Likes

By the way, I was privy to another conversation related to this in which @griffin, @jillian, @jsha, and some other community members helped figure out that the service status banner didn't display properly on mobile browsers, and fixed it so that it now does. So if you're reading this thread now, I think @griffin's concern about the difficulty of seeing these notifications has been largely addressed in the meantime. :grinning:

3 Likes

In the meantime, if you're busy developing an ACME client, you may find Pebble useful (GitHub: letsencrypt/pebble). It is a miniature version of Boulder: a small RFC 8555 ACME test server, not suited for use as a production certificate authority.

I recently developed a new ACME client library from scratch and found Pebble absolutely invaluable. It made development very easy and resulted in a robust client.

4 Likes

This probably isn't worthwhile because total staging outages don't seem to be that common, but synthesizing an {"type": "urn:ietf:params:acme:error:serverInternal", "detail": "Temporarily unavailable due to scheduled maintenance"} error response on the frontend servers, similar to IP blacklisting, could be another way to get the message across for users who do a --dry-run during these windows.

3 Likes

@_az

Did you see the discussion in the #lounge?

https://community.letsencrypt.org/t/critical-please-pin-outages-in-help-category/144552/25

I think the problem+json you are suggesting could go fantastically with the 503.
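
As a rough sketch of what that combination could look like: a 503 whose body is the problem document suggested above. This is not how Let's Encrypt's frontends are implemented. The function name `maintenance_response`, the `Retry-After` header, and its default value are assumptions made purely for illustration.

```python
import json


def maintenance_response(retry_after_seconds=3600):
    """Build (status, headers, body) for a hypothetical maintenance reply.

    The name, the Retry-After header, and the default of one hour are
    illustrative assumptions, not anything taken from Boulder.
    """
    problem = {
        "type": "urn:ietf:params:acme:error:serverInternal",
        "detail": "Temporarily unavailable due to scheduled maintenance",
        "status": 503,
    }
    body = json.dumps(problem).encode("utf-8")
    headers = {
        # Media type for problem documents (RFC 7807), as used by ACME.
        "Content-Type": "application/problem+json",
        # Hint to clients about when it is reasonable to try again.
        "Retry-After": str(retry_after_seconds),
        "Content-Length": str(len(body)),
    }
    return 503, headers, body
```

A client doing a dry run would then get a machine-readable reason for the failure instead of a bare connection error, provided the frontend itself is still reachable.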

2 Likes

There are many different kinds of outages. The vast majority of the time, we do have our outermost line of defense synthesize useful error messages, as you suggest. In this particular case, the physical connection between that outermost layer and the public internet at large was under maintenance, so no fancy error message this time. Sometimes you just need to roll with it, and ensure that your client is resilient to server failures, as it should be anyway.
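
A minimal sketch of that kind of resilience, assuming a client that retries transient network failures with exponential backoff and jitter. The helper name `call_with_backoff`, its parameters, and the choice of exceptions are hypothetical; a real ACME client would plug in its own request function and error types.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call operation(), retrying on connection-level failures.

    Hypothetical helper: waits base_delay * 2**attempt (capped at
    max_delay) plus some random jitter between attempts, and re-raises
    the last error once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Passing `sleep` as a parameter is only there so the waiting can be replaced in tests; in normal use the default `time.sleep` applies. For scheduled jobs such as renewals, simply failing cleanly and trying again on the next run is often enough.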

4 Likes