My ideal would be to never fall back. Just as we have relatively high standards for functioning DNS in order to get a certificate, I'd like to be able to say "If your IPv6 address has reachability problems, either remove the AAAA record (which will have benefits for your visitors as well), or fix the reachability problems." However, since IPv6 validation was introduced after our initial launch, there are some people with unreachable AAAA addresses who have been renewing happily without a hitch. We didn't want to break those people when we deployed IPv6, so at launch we provided fallback to IPv4 on connection timeouts for both TLS and HTTP challenges.
Shortly after launch we got a bunch of reports of failures during HTTP validation due to timeouts. When we checked our logs, we found that these were timeouts after connect. Our interpretation at the time was that the IPv6 address was perfectly routable, but for some reason the HTTP server listening on that address was not responding. We decided not to try to work around that case. However, it turns out that our logs were incorrect due to the race condition I linked: we were getting a connection timeout, but it was being reported as a post-connect timeout. We had to fix that race condition for a number of reasons.
There is an open question of whether we could now adopt a stricter stance on IPv6 fallbacks, since it seems we may never actually have been doing those fallbacks correctly for the HTTP challenge. What do you think?
Yes, as of now, Boulder should correctly implement the logic we originally intended: connection timeouts can fall back to IPv4, but HTTP request/response timeouts cannot.
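To make the intended rule concrete, here is a minimal Go sketch of it. This is not Boulder's actual code: `Phase`, `TimeoutError`, `AttemptFunc`, and `ValidateWithFallback` are hypothetical names I made up for illustration, and the real validation attempt is replaced by a callback so the decision logic stands on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase records where an attempt timed out. Hypothetical type, not Boulder's.
type Phase int

const (
	PhaseConnect Phase = iota // timed out before the connection was established
	PhaseRequest              // timed out after connect, during the HTTP exchange
)

// TimeoutError is a hypothetical error that carries the phase of the timeout.
type TimeoutError struct {
	Addr  string
	Phase Phase
}

func (e *TimeoutError) Error() string {
	if e.Phase == PhaseConnect {
		return fmt.Sprintf("connection timeout to %s", e.Addr)
	}
	return fmt.Sprintf("request/response timeout from %s", e.Addr)
}

// AttemptFunc stands in for one validation attempt against a single address.
type AttemptFunc func(addr string) error

// ValidateWithFallback tries the IPv6 address first. It retries over IPv4
// only when the IPv6 attempt failed with a connection timeout. Any other
// failure, including a post-connect timeout, is returned without a retry.
// The returned string is the address whose result was final.
func ValidateWithFallback(v6, v4 string, attempt AttemptFunc) (string, error) {
	err := attempt(v6)
	if err == nil {
		return v6, nil
	}
	var te *TimeoutError
	if errors.As(err, &te) && te.Phase == PhaseConnect {
		// Unreachable AAAA address: the one case where fallback is allowed.
		return v4, attempt(v4)
	}
	return v6, err
}
```

The whole decision hangs on the error correctly reporting which phase timed out. That is why the logging race mattered: a connection timeout that gets labelled as a post-connect timeout takes the no-fallback branch.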