Any way to bring down LE Staging manually for testing?

Hi,

Is there any way to bring down the LE Staging environment for a few seconds (or for a set period) so we can test transient error handling in our service?

We tried Pebble, which I can bring down manually for testing. The problem is that Pebble is stateless: it does not keep any data about the order I placed before I brought it down, so it cannot continue processing that order.

Please post your thoughts/solutions.

You can mess with your routes, or with your outgoing firewall.

But LE staging is on Cloudflare. Do you really want to do that?

3 Likes

Or DNS. You could simply add the staging hostname to /etc/hosts and point it to localhost or a non-existent (private) IP address.

4 Likes

I thought about it. But DNS is aggressively cached.

3 Likes

Which makes it more realistic to begin with :smiley:

4 Likes

Yeah, you can simulate prolonged outages but not network errors and the like.

Once you have the IP, DNS isn't queried again.

3 Likes

I would recommend resolving the staging API hostname to a specific (private) IP address and applying the routing tricks you recommended earlier to that IP address only. That way, other services can keep using Cloudflare without being affected by the staging tests.

That's not true, not in my case anyway. I can add hosts to /etc/hosts and name resolution quickly switches to the new IP addresses. Perhaps my local DNS cache is very short-lived, I dunno. YMMV.

4 Likes

I wasn't talking about the cache.

I don't think certbot will resolve the name more than once per run, and I wanted to drop the connection after the DNS resolution but before order finalization. :smiley:

3 Likes

I'm pretty famous for bringing the LE staging environment down, whether or not I intend to.

14 Likes

Sounds like you are good at finding unintended features of the LE Staging environment. :rofl:
Not a bad skill to have, but remember: "with great power comes great responsibility".

5 Likes

I’ve used Toxiproxy (Shopify/toxiproxy on GitHub), a TCP proxy to simulate network and system conditions for chaos and resiliency testing, in the past to simulate this sort of thing.

10 Likes

What I'd do is put a proxy in front of Pebble and configure it to return 500s when you want it to. A simple way would be, when you want it to be 'down', put the wrong address in the proxy field so it doesn't actually reach Pebble. Reconfiguring or restarting the proxy won't reset Pebble.
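
A minimal sketch of the "proxy in front of Pebble" idea, under some assumptions: it is a plain TCP pass-through, so when it is "down" it drops connections rather than returning HTTP 500s, and TLS stays end-to-end between the ACME client and Pebble. The names `ToggleProxy`, `take_down` and `bring_up` are made up for illustration, and the Pebble address in the comment is an assumption about a local setup. Because Pebble itself keeps running, its in-memory orders survive the simulated outage.

```python
import socket
import threading


def _close(sock):
    """Shut down and close a socket, waking any thread blocked in recv()."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # already closed or never connected
    sock.close()


class ToggleProxy:
    """Forward TCP connections to `upstream` unless marked as down."""

    def __init__(self, upstream, listen=("127.0.0.1", 0)):
        self.upstream = upstream          # e.g. ("127.0.0.1", 14000), an assumed Pebble address
        self._down = threading.Event()    # set: simulated outage, clear: normal service
        self._conns = set()               # sockets of connections currently in flight
        self._lock = threading.Lock()
        self._server = socket.create_server(listen)
        self.address = self._server.getsockname()
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        while True:
            try:
                client, _ = self._server.accept()
            except OSError:
                return  # listening socket was closed
            if self._down.is_set():
                _close(client)  # drop new connections during the outage
                continue
            try:
                backend = socket.create_connection(self.upstream, timeout=5)
            except OSError:
                _close(client)  # upstream really is unreachable
                continue
            backend.settimeout(None)
            with self._lock:
                self._conns.update((client, backend))
            for src, dst in ((client, backend), (backend, client)):
                threading.Thread(target=self._pipe, args=(src, dst), daemon=True).start()

    def _pipe(self, src, dst):
        """Copy bytes one way until either side closes, then tear down both."""
        try:
            while data := src.recv(65536):
                dst.sendall(data)
        except OSError:
            pass
        finally:
            for sock in (src, dst):
                with self._lock:
                    self._conns.discard(sock)
                _close(sock)

    def take_down(self):
        """Start the outage: refuse new connections and cut existing ones."""
        self._down.set()
        with self._lock:
            conns = list(self._conns)
        for sock in conns:
            try:
                sock.shutdown(socket.SHUT_RDWR)
            except OSError:
                pass

    def bring_up(self):
        """End the outage: new connections are forwarded again."""
        self._down.clear()
```

A test would point the ACME client at `proxy.address` instead of at Pebble directly, call `proxy.take_down()` partway through an order, then `proxy.bring_up()` and check that the client retries and completes. Whether the URLs and certificate Pebble presents match the proxy's address depends on your setup, so treat that as something to verify.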

10 Likes

I solve everything with firewalls [LOL]

I would make a deny rule (above my accept rule) that kicks in only at specific hours of the day (or on certain days of the week).
That way you can predetermine and systematically schedule all your random outages - LOL

In this "example", we can see how access to all defined LE networks can be dropped at midnight and four AM (for one hour) and also during the weekend [48 hours - all Saturday and Sunday]
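
For anyone who wants the same schedule in a test harness instead of a firewall, here is a rough sketch of it as a predicate. It is not a firewall rule. It assumes the windows are one hour each, starting at 00:00 and 04:00 local time, plus all of Saturday and Sunday; the names `OUTAGE_HOURS`, `OUTAGE_WEEKDAYS` and `in_outage_window` are made up for illustration.

```python
from datetime import datetime

OUTAGE_HOURS = {0, 4}       # assumed: 00:00-00:59 and 04:00-04:59 local time
OUTAGE_WEEKDAYS = {5, 6}    # Saturday and Sunday (datetime.weekday(): Monday is 0)


def in_outage_window(now: datetime) -> bool:
    """Return True if `now` falls inside one of the scheduled outage windows."""
    return now.weekday() in OUTAGE_WEEKDAYS or now.hour in OUTAGE_HOURS
```

A harness could poll `in_outage_window(datetime.now())` and block or unblock traffic accordingly.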

7 Likes

As @osiris suggested, an easy method to simulate a particular resource being unavailable is to add or remove a hosts file entry that maps the host to a fake IP. You didn't mention what your test environment is, so I'm guessing it's a Linux-based CI/CD platform.
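
A small sketch of how that toggling could be scripted, assuming the target is the staging API host `acme-staging-v02.api.letsencrypt.org`. The functions work on the hosts file contents as a string, so they can be tried without root. The `MARKER` tag and the function names are made up for illustration.

```python
MARKER = "# outage-test"  # hypothetical tag so the override can be found and removed later


def unblock_host(hosts_text: str, hostname: str) -> str:
    """Return hosts_text without the tagged override line for hostname."""
    kept = []
    for line in hosts_text.splitlines(keepends=True):
        is_override = line.rstrip().endswith(MARKER) and hostname in line.split()
        if not is_override:
            kept.append(line)
    return "".join(kept)


def block_host(hosts_text: str, hostname: str, fake_ip: str = "127.0.0.1") -> str:
    """Return hosts_text with exactly one tagged override line for hostname."""
    cleaned = unblock_host(hosts_text, hostname)
    if cleaned and not cleaned.endswith("\n"):
        cleaned += "\n"
    return f"{cleaned}{fake_ip} {hostname} {MARKER}\n"
```

To apply it for real, read /etc/hosts, pass the text through `block_host` or `unblock_host`, and write the result back (this needs root). As noted earlier in the thread, a client that has already resolved the name may not look it up again during the same run.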

5 Likes

FWIW, nginx will let you test the existence of a file (and other objects) on the operating system during the request, and act appropriately (docs):

if (-f /path/to/semaphore) {}  # file exists
if (!-f /path/to/semaphore) {}  # file doesn't exist

This is a lightweight check, because it leverages some operating system caching. It's often used to set a "downtime" flag, but I often use it in tests -- you can just return a custom error when the flag is set.
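
The same flag-file idea can be tried outside nginx. This is a minimal Python stand-in, not nginx and not a proxy: a tiny HTTP server that answers 503 while a semaphore file exists and 200 otherwise. The path in `flag_path` and the name `FlagHandler` are made up for illustration, and only GET is handled.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class FlagHandler(BaseHTTPRequestHandler):
    """Answer 503 while the flag file exists, 200 otherwise."""

    flag_path = "/tmp/simulate-outage"  # hypothetical semaphore file

    def do_GET(self):
        if os.path.isfile(self.flag_path):
            self._reply(503, b"simulated outage\n")
        else:
            self._reply(200, b"ok\n")

    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


if __name__ == "__main__":
    # Port 8080 is an arbitrary choice for local experiments.
    ThreadingHTTPServer(("127.0.0.1", 8080), FlagHandler).serve_forever()
```

Creating the file (for example with `touch /tmp/simulate-outage`) switches the server to 503, and deleting it restores normal responses, with no restart in between.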

Another option is to use OpenResty, a distribution built on Nginx that integrates server-side scripting with Lua. It would be trivial to create an endpoint to toggle "proxy on/off", and with a bit more work you could simulate custom network conditions. I have several test systems that use this approach.

8 Likes

Yeah, nginx's file existence test is nice. I also use it (mostly for maintenance), but always remember: "If is Evil", so be careful how you use it.

8 Likes
