The explanation is that Windows fetches certificates into its root store lazily. Most programs (including Chrome and Edge) delegate their certificate validation to the operating system: they just hand it an end-entity and the provided chain and say "validate this for me please?". Then Windows does all of the chain building and validation, including fetching any necessary intermediates and including fetching any necessary roots from the Microsoft Trusted Root Program. Then it caches these to make future validations faster.
The real "issue" here is that rasdial is referencing the on-disk root store without using the OS's built-in validation routines, and therefore isn't able to take advantage of the OS's lazy-loading.
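To make the difference concrete, here is a toy model of the two validation paths described above. It is only a sketch: `ToyRootStore` and its methods are hypothetical names invented for illustration, and nothing here reflects how Windows actually implements chain building or root fetching.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRootStore:
    """Toy stand-in for an on-disk root store plus a remote trusted-root list."""

    local: set = field(default_factory=set)           # roots already cached on disk
    remote_trusted: set = field(default_factory=set)  # roots in the vendor's program

    def validate_via_os(self, root_name: str) -> bool:
        """Model of the OS path: fetch a trusted root on demand, then cache it."""
        if root_name in self.local:
            return True
        if root_name in self.remote_trusted:
            self.local.add(root_name)  # cached for future validations
            return True
        return False

    def validate_disk_only(self, root_name: str) -> bool:
        """Model of a client that only looks at what is already on disk."""
        return root_name in self.local


store = ToyRootStore(
    local={"DST Root CA X3"},
    remote_trusted={"DST Root CA X3", "ISRG Root X1"},
)

disk_only_before = store.validate_disk_only("ISRG Root X1")  # root not cached yet
via_os = store.validate_via_os("ISRG Root X1")               # fetches and caches it
disk_only_after = store.validate_disk_only("ISRG Root X1")   # now present on disk
print(disk_only_before, via_os, disk_only_after)
```

In this model the disk-only client starts working only after some other program has gone through the OS path once, which matches the intermittent behaviour people are reporting.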
As far as I understand it, this complex mechanism is one of the reasons (among many others) that Firefox ships its own root store, and that Chrome has announced they will do the same.
It appears Internet Explorer is able to trigger the lazy-loading flow, but other parts of the OS, like rasdial, are not. This is a bit of a pickle for us, as "load this URL in Internet Explorer" is not a viable solution for millions of consumers.
Your help page should include a link like: click here if you are having trouble connecting your VPN with error: 13801
And just ensure they use IE and provide a similarly configured site.
[which just says: Your system should now be updated]
This is not a practical solution. 99% of people do not bother contacting support; they will simply go "This product doesn't work, I'm uninstalling and getting something else."
I fail to follow the logic here.
If they aren't connected to the Internet, what good does it do them to know about any new trusted roots that are on the Internet?
But I don't really need the use case for that.
Here is what you are looking for:
I'm not interested in the logic I must say. I think picking certs from an online store only when required is a rather stupid design for a root store. It does of course have advantages such as quick updates, but a root store should be a complete local root store. Updates can be handled with other methods.
ISRG root is missing from random versions of Windows, even up-to-date ones like 20H2, and from builds released in 2019 and 2020.
Lazy loading doesn't work through rasdial
Combined, these are currently a deal breaker come Sept 2021, at least for our use case. The first one alone can be a problem for everyone running Windows 10 without the ISRG root.
Not if they use IE/Edge (at least once - to any site with such a root)
Something RasDial (Microsoft?) needs to think about (and fix).
You are implying then that no change should ever be made - but the Internet is built on change.
At some point all the current roots will have expired... then what?
@yegor If lazy loading is a general mechanism that applies to all root certs and not only the ISRG root, why would the DST Root CA X3 root certificate behave any differently from the ISRG root?
DST root has been around since 2000 (XP era). ISRG was not packaged into Windows until mid 2018, and as evidence shows, it's not present in all cases, even on post mid-2018 Windows releases.
@rg305 I'm not suggesting change is bad, I'm simply pointing out an issue which is a problem for our use case, and our use case is not unique.
The "fix" has to come from the programs using the root store.
There is a method in place to update and keep that root store updated.
Why RASDIAL fails to implement that process is unknown - but that is where the problem starts.
I don't follow. I'm confused. Earlier @aarongable told us about the lazy loading of root certs in Windows. But that isn't applicable to all root certificates? Is the lazy loading story only applicable to some, i.e. more recent, root certificates?
Not sure if the .sst file is the same as your linked .stl, but it seems to be some kind of Windows Update (WU) root certificate list downloader. Not sure if it uses http:// or https:// though; I can't test it from my Linux workstation.
So the MS local root store consists partly of locally hardcoded root certs and partly of a list of root certs that can be downloaded from Windows Update when needed? Very strange...
I would like to chime in here. I too can confirm the general sentiment of this thread.
My current conclusions are very loosely:
Windows 10 does not consider "ISRG Root X1" a "first-tier" root CA that is hard-coded into the list of trusted roots (and thus appears in the list upon a fresh install of Windows, without connecting to the internet). As already mentioned, this root CA was added by Microsoft in 2018.
ISRG Root X1 is however in Microsoft's trusted root program and can be "lazy loaded" by some mechanism in Windows. This mechanism is clearly BROKEN for RasDial, which is used for setting up IPsec/IKEv2 connections.
The lazy-loading of ISRG Root X1 can be triggered by a browser like Chrome or Internet Explorer and the root magically appears in the trusted root store. Or, you can add it yourself using Import and it will appear (albeit twice...)
Overall this seems to imply lazy-loading of root certificates in Windows 10 is BROKEN for at the very least IKEv2/IPsec connections, which is bad. This means that VPN users are at the mercy of some other mechanism in Windows to trigger the loading of ISRG Root X1 prior to connecting to the VPN, hence the intermittent/inconsistent reports and/or lack of reports in general. I do not see how this is at all resolvable without intervention from Microsoft. Even then, let's say Microsoft pushes an update. If nobody takes the update, the problem still exists. This is a major problem for Let's Encrypt users relying on moving to ISRG Root X1 after DST Root CA X3 expires.
I would love to hear from anyone more familiar with the inner workings of Microsoft Windows and its trusted root store about how this can be addressed. There are undoubtedly people using Let's Encrypt for VPN connections and this is a huge problem that nobody seems to be talking about!
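For anyone who wants to check whether a given machine already has the root on disk, here is a rough Python sketch. The helpers `common_names` and `has_root` are hypothetical names of my own; they only inspect the list of dicts that `ssl.SSLContext.get_ca_certs()` returns. On Windows, Python's default context is populated from the system certificate stores, and as far as I know it only reads what is already there, so running this should not itself trigger the lazy load.

```python
import ssl


def common_names(ca_certs):
    """Collect subject commonName values from SSLContext.get_ca_certs() output."""
    names = set()
    for cert in ca_certs:
        # "subject" is a tuple of RDNs, each RDN a tuple of (key, value) pairs.
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName":
                    names.add(value)
    return names


def has_root(ca_certs, wanted="ISRG Root X1"):
    """Return True if a loaded CA certificate has the wanted commonName."""
    return wanted in common_names(ca_certs)


if __name__ == "__main__":
    # The default context loads the platform's default CA certificates.
    ctx = ssl.create_default_context()
    print("ISRG Root X1 present:", has_root(ctx.get_ca_certs()))
```

Running it before and after opening a site that chains to X1 in a browser would be one way to confirm the behaviour described above on a fresh install.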
@hft @yegor, does the lazy-loading trigger successfully if you just load an image in the browser via <img src> or something?
@yegor is there any way to get your users to run native code, or browse somewhere, that will result in a browser triggering the lazy-load process? Could they, for example, visit your own web site in a browser to trigger it? (If an image load would trigger it, you could inline load an image from a site that presents a chain to X1 even if you don't use such a chain on your main site.)
Overall I agree with @hft's thought that you should probably get Microsoft to comment on whether there's an intended official solution for non-browser Windows software that may need a root that's not yet present locally.