In your scenario, it sounds like key pinning is the main problem. There's a post about key pinning here. In short: ideally you should be pinning at the root CA level, with multiple CAs. You should also pin an emergency backup key that you keep offline, somewhere safe. If you're following those practices, a compromise followed by revocation and reissuance with a new key won't block access to your site.
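As a rough illustration of the pin set described above, here is a minimal sketch that assumes you're using HTTP Public Key Pinning (RFC 7469), where each pin is the base64-encoded SHA-256 digest of a key's DER-encoded SubjectPublicKeyInfo. The function names, the `max_age` value, and the placeholder key bytes are all made up for the example; in practice you would feed in the real public keys of the root CAs you chose and of your offline backup key.

```python
import base64
import hashlib


def spki_pin(spki_der: bytes) -> str:
    """Return the base64 SHA-256 digest of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def build_pin_header(spki_ders, max_age=5184000):
    """Build a Public-Key-Pins header value from several public keys.

    max_age (in seconds) is an arbitrary example value, not a recommendation.
    """
    pins = [spki_pin(der) for der in spki_ders]
    # Refuse to build a header with only one distinct pin: without a second
    # key, losing or revoking the pinned key would lock visitors out.
    if len(set(pins)) < 2:
        raise ValueError("pin at least two distinct keys, including a backup")
    parts = [f'pin-sha256="{pin}"' for pin in pins]
    parts.append(f"max-age={max_age}")
    return "; ".join(parts)


# Hypothetical placeholders standing in for real DER-encoded public keys:
# two root CAs plus an emergency backup key kept offline.
example_keys = [b"root-ca-one-spki", b"root-ca-two-spki", b"offline-backup-spki"]
header_value = build_pin_header(example_keys)
print("Public-Key-Pins:", header_value)
```

Because none of the pins refers to the key in your current end-entity certificate, replacing that certificate with one that has a new key leaves the header valid.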
I'd also like to point out that you can effectively get past the weekly cap on certificates for subdomains by staggering the initial issuance. If you have 40 subdomains and you want to issue independent certificates for each of them, you can issue the first 20 in week 1, and then the next 20 in week 2. From then on, you can renew all 40 at once if you want, because of the renewal exception.
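The staggering above is simple batching. Here is a small sketch of it, assuming a weekly limit of 20 new certificates as in the example (check the current rate limit documentation for the real number); `weekly_batches` and the subdomain names are hypothetical.

```python
def weekly_batches(names, per_week=20):
    """Split names into consecutive batches of at most per_week, one per week."""
    if per_week < 1:
        raise ValueError("per_week must be at least 1")
    return [names[i:i + per_week] for i in range(0, len(names), per_week)]


# Hypothetical subdomains, each getting its own certificate.
subdomains = [f"sub{i}.example.com" for i in range(1, 41)]

schedule = weekly_batches(subdomains)
for week, batch in enumerate(schedule, start=1):
    print(f"week {week}: issue {len(batch)} certificates")
```

With 40 names this gives two batches of 20. The schedule only matters for the first issuance, since renewals fall under the renewal exception.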
To answer your question about why we have rate limits: Yes, HSMs are our main bottleneck. Database space and write volume are a secondary bottleneck, and we're starting to find that write volume to CT logs is going to be a bit of a bottleneck too.
Providing a free service to the entire Internet is hard, since it's very easy for anyone to take advantage of the service. In order to stay free forever and encrypt the whole web, we need to find ways to provide everyone the services they need at the lowest possible cost to ISRG, which is a non-profit. Rate limits are a part of that.