Pattern for validation tokens

maxmayer · July 21, 2024, 5:08pm

Is there a way to understand if the token that arrives at the domain validation request was really generated by a Let's Encrypt request?
In a cloud scenario, multiple servers handle the validation request for issuing a certificate (http-01), so the token/auth association must be kept in some form of shared storage between the servers (e.g., DB, cache). It would be useful to avoid requests to shared storage when a token is formally incorrect. For some time now I have been seeing a lot of fake requests, but since I cannot tell whether they are legitimate, I have to process all of them.
It would be useful to have some sort of expected pattern of the token or a validation of the token itself (e.g., with a CRC linked to a secret key that the server itself sends to the Let's Encrypt API while requesting the validation).

thanks in advance

Osiris · July 21, 2024, 5:50pm

The regex for a token in the Boulder software is pretty simple:

github.com: letsencrypt/boulder/blob/510996b07a69fb41cfff99a1d19256d89c27d8a4/core/util.go#L77

          		panic(fmt.Sprintf("Error reading random bytes: %s", err))
          	}
          	return base64.RawURLEncoding.EncodeToString(b)
          }
          
          // NewToken produces a random string for Challenges, etc.
          func NewToken() string {
          	return RandomString(32)
          }
          
          var tokenFormat = regexp.MustCompile(`^[\w-]{43}$`)
          
          // looksLikeAToken checks whether a string represents a 32-octet value in
          // the URL-safe base64 alphabet.
          func looksLikeAToken(token string) bool {
          	return tokenFormat.MatchString(token)
          }
          
          // Fingerprints
          
          // Fingerprint256 produces an unpadded, URL-safe Base64-encoded SHA256 digest

I'm not sure if this helps you in any way.

mcpherrinm · July 21, 2024, 6:00pm

The tokens are random, so there's not a good way to tell if it genuinely came from Let's Encrypt.
There's no particular guarantee the token format stays the same either. Let's Encrypt tells you what the token is going to be through the API, and your best course of action today is to use that.

While I don't know what your infrastructure has or what the "fake requests" you're getting look like, perhaps rather than having servers contact an external DB/cache, could you push the token(s) out to all the servers when a challenge is underway? If there are too many servers, another option is to have all the servers issue an HTTP redirect to a single static host, S3 bucket, etc., which just serves the tokens. You could even disable that when no issuance is in progress.

MikeMcQ · July 21, 2024, 6:15pm

Could you perhaps filter out "junk" requests by user-agent string? That would at least avoid the DB load.

Be careful not to filter too tightly in case Let's Encrypt changes the strings they use. Or closely monitor cert renewals so you catch lost legit requests.

This is sort of a quick-and-dirty solution, but maybe enough to avoid some of the worst of it.
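
A loose version of that filter might look like the sketch below. The marker string is an assumption based on the user-agent commonly seen from Let's Encrypt's validators; it is not a documented guarantee and may change, and anyone can spoof it, so this only screens out careless scanners. The helper name `looksLikeValidator` is illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// uaMarker is an ASSUMED substring of the validator's user-agent.
// Check it against your own logs of successful validations before relying on it.
const uaMarker = "Let's Encrypt validation server"

// looksLikeValidator (hypothetical helper) matches on a substring only,
// so small changes elsewhere in the user-agent do not break legit requests.
func looksLikeValidator(userAgent string) bool {
	return strings.Contains(userAgent, uaMarker)
}

func main() {
	for _, ua := range []string{
		"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)",
		"curl/8.0.1",
	} {
		fmt.Printf("%v\t%s\n", looksLikeValidator(ua), ua)
	}
}
```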

maxmayer · July 21, 2024, 6:16pm

Thanks for the tip. I believe that the origin of the fake requests is a vulnerability found on some servers that allows malicious scripts to be inserted into the .well-known/acme-challenge folder. The requests I see are probably used to check whether the malicious script is installed on our servers (and obviously this is not the case).
Why has Let's Encrypt never thought of a way to verify the token that clearly doesn't decrease the security of the token itself? The IP addresses are not public so they cannot be filtered, the tokens are completely random (and rightly so), but they are not verifiable. Something should be done because I believe it can be useful to everyone.

maxmayer · July 21, 2024, 6:17pm

Usually the user agent is indistinguishable from the one used by Let's Encrypt.

MikeMcQ · July 21, 2024, 6:50pm

Is it possible these requests are coming from some other system of yours? Like something set up for testing, or a prior config that is still issuing cert requests for your domain name(s)?

I ask because we have seen similar problems here but the user-agent strings are often obviously different. A few other cases were actual requests sent to LE from a "lost" machine.

Is there any pattern to the source IP? While LE does not publish its IPs and rotates them often, all but one of the LE auth server centers are in AWS. That might be hard to assess in real time, but it might be useful info for spot-checking some log entries. Looking at legit cert requests, you can see the locations of these centers (with suitable IP lookup tools).

Legit requests will come in bursts of up to 5 (today): one from the primary center and four from the secondaries. If you are seeing "invalid" requests with the same URI in groups of 4 or 5, it could be a "lost" machine. Or someone accidentally making invalid requests for your domain.

Another idea is to switch to using a DNS Challenge.

This hasn't come up much in this forum and I don't see such rogue log entries myself. I apologize if you have already sorted through these other options. It's not easy to tell how much people have evaluated.

maxmayer · July 21, 2024, 7:42pm

The DNS challenge is not a viable option for us at the moment.

The requests come from all over the world and from different IPs, so it can't be a "lost" machine.

Apparently there doesn't seem to be a pattern in the tokens that are sent.

maxmayer · July 21, 2024, 7:44pm

Just to clarify, we don't have any issues with certificate renewals at this time. We just have a ton of requests with fake tokens.

MikeMcQ · July 21, 2024, 8:15pm

I understand. I was trying to get more precision around what you mean by "fake".

Also, I need to correct what I said about the "bursts". LE made a change in recent weeks, so you would no longer see up to 5 failed challenges. LE used to make all requests asynchronously, but now the primary must succeed before the secondaries are tried.

I'd be curious to see the log entries for some of those fakes. This hasn't been a problem we've seen before that I recall. We have seen similar complaints but they ended up being explained.

maxmayer · July 21, 2024, 9:07pm

Here are some examples:

/.well-known/acme-challenge/KD1D7TSUMLR8S7Y4GGOO-U7QKQ0QNHATGAZC32388IQ
/.well-known/acme-challenge/TFHZJ4EE6GANA440SICB9RTECU1ZA3X8GEJEUTQJCQP
/.well-known/acme-challenge/6BR40A9KR-3GGWXYEDYE--1AYSGLQSQKICQX28WVD81
/.well-known/acme-challenge/CJ2-O-H-O-1AZRDHTSVDAHCZO1Q6C5-V64EPWKG551O
/.well-known/acme-challenge/H2D8OW-DKCHWJ-06WP6VUHAXO9DCK88L1BOMQKQRTVI
/.well-known/acme-challenge/W105-5T-75Z0PE0J0BDD1O3U3-31M0KGVDEHIEHALBS
/.well-known/acme-challenge/IRB21CR364NOTUPXMTZGKFDI8Z-W7R4-X5FW30PCCOT
/.well-known/acme-challenge/UKB1TR-GLK5-9TMUZRPYQT37---WB3RSF13GO181P79
/.well-known/acme-challenge/V4FKDR9TF4D6LPQG-JADTB312VZHNTGBHLCV2WCHCOU
/.well-known/acme-challenge/AE9THOPJEZYF15Y8GFIH9-KCGTX-IEX89GTIS6QYOCE
/.well-known/acme-challenge/N-8-4QFH-4OEDPMI3F58SJ0VGZRVI9V774Q1ZQZKYMJ
/.well-known/acme-challenge/EGTZHXFRC28DOV648FT4M2XJORRJIOW-3MZSUZ5U288
/.well-known/acme-challenge/-MVIDA-R2GDWFWSZ9QPX-9ICIAYE92JUAQ789IORXR4
/.well-known/acme-challenge/UKAQ-759JNQ2T4ERLHVQ4H8EVAKRI-3ZC0L3PMJWE1F
/.well-known/acme-challenge/-RF0TXQZCWGKFMX7TRXED0M9VIEPFEBFJIHAQQT2A96
/.well-known/acme-challenge/3AAVOBV3GBFDHG93W-SZZ5OB8-2KL0G2H7CO70UZQA8
/.well-known/acme-challenge/HNF3AU0VT0-PMUE01QK53-01LL18TLK9N8UU4G5ICIJ
/.well-known/acme-challenge/E2ZQAK3UZ94XJ-IFK4AN0DK-5ZSUWC2RUS8GU5CK52Z
/.well-known/acme-challenge/14G-37-7RH5UXLNLE387U-0YBYOC3JZJVUTCBLIP6NQ
/.well-known/acme-challenge/U89-S7VIH-BQFSWYDH5JHYX9I8-G2C5C3ZJOAD4C7VH
/.well-known/acme-challenge/B-FIFF4SWLU-BTDY9SW-BKUO3L8VH8C7ZHMRM793N3A
/.well-known/acme-challenge/2T7VMFV05380X4IQU8K02AY6IAS-3KF6ORZXWCKDYHU
/.well-known/acme-challenge/H-JX3ZWJNN8SNPP31OFKODKZ3Y-2QM5STAFA1GDFUNY
/.well-known/acme-challenge/EJCHLKD9ZQ8DRKZYDBI9MR-LYTB0J5GDJMFOZ69SFD1
/.well-known/acme-challenge/YDCH07DA1LVNX9FXUIINCF8L8-WPP12OP7O56O9VAVN
/.well-known/acme-challenge/15WGF09VVUOUX-5F-F6E9PJKIIHOZH94X54LXMQLSBS
/.well-known/acme-challenge/AXJRVJE1D3FDG1B24Q8C6E70RG9ZGD0F2K661OLS1C6
/.well-known/acme-challenge/GNNL0MQRSGE6EVFBE6UN56K-ASWXJOP62AFR3EMA0YW
/.well-known/acme-challenge/48-6NSB0PWZVKUINNDRKKBA1IZ9KT2C2PSKXUZVMWDH
/.well-known/acme-challenge/UIJBHT2K1H6SDD2JXNB7NN6XN6OG1R4C3HGT8OTECAW
/.well-known/acme-challenge/7L4FEXXFC0N35JXP8JWFIA662F-YNN9FWDOCS932Y3N
/.well-known/acme-challenge/YA83-HK1IBATD7G5EH0BSN-JU0JWB5-1K7HVAM51NNO
/.well-known/acme-challenge/CKDOJ0BPEXTBB-9WQ740LRR5-U3H00O9YUDLPKNZ8N6
/.well-known/acme-challenge/OG1IDZEB-KCH-0PKW6YY5C5LVYJ0UJS2EHF9FYOLRXJ
/.well-known/acme-challenge/-16TT3119L4PBG9HI9EZ-R3WF7EH-LJDGMKOG2825RT
/.well-known/acme-challenge/ATBAN9BIA43Y4T87MDCK27-SI171E3-L1F7Z9MOI5KJ
/.well-known/acme-challenge/42VWC5NQNL6IJPYXKBPHS-VZEDJTI3VYV56JFYZ4H2R
/.well-known/acme-challenge/1505TZKSV57RE-WCOBX-LHQV-FO0HUO8IXBQ7YHLPEH
/.well-known/acme-challenge/7U--LF1YEZ72OY1-CD-FDLY6ZXDWJEE8UGZJR9NPZUJ
/.well-known/acme-challenge/9JYE746SCVWTDFKUHCRPYD-I11GUA2-JWC06FUQ1X6C
/.well-known/acme-challenge/LQITAWYME-4EYSCPH-K-8RFATFBB07ML837P18ON00M
/.well-known/acme-challenge/H1LDOVYTSQNS4XMSE40A0KUOYUFB9O32LJM44HVV3LU
/.well-known/acme-challenge/FVPCLFQYDSVR7-P5-0X8H3-QXA6I14INL6Z2TY2UZ8O
/.well-known/acme-challenge/IF0GSXALXW8QYZV561EEOZ-WX7NRSAET065KDOX7UUM
/.well-known/acme-challenge/RO5JM-BCS7LF6EGPVO0RKASC2-51YNEXZGSQSDB0JOT
/.well-known/acme-challenge/Z84NGHE1ASR-A08-X7AF-LO4RAIR9-4QS0OSRL6FGGW
/.well-known/acme-challenge/-W8XOVQ2IXMF34VKTXEPK15-72MZN1SZPGAM9G1ZD39
/.well-known/acme-challenge/JCXZPQV59NG5MD5FHMN55-CDSY52-TYISC7SPM7421K
/.well-known/acme-challenge/M12JV4AUT3PMWCAR3SH2B94FEEMOMPTXSR7C1OHC7OE
/.well-known/acme-challenge/GT-NVGR2PO9NDGGI-N96-QPRCC-KCJMW4ZJUMHZ2PRE

Then there are some that are more clearly fake, because the length is 32 characters, but I already know how to filter those.

MikeMcQ · July 21, 2024, 9:10pm

Is that all the info you have in the log? The source IP, user-agent, timestamps and anything else might be helpful. Hard to know sometimes.

I realize I may just come to the same conclusion. But sometimes a different perspective helps.

maxmayer · July 21, 2024, 9:15pm

Of course I have more information, but I can't make it public. For example, I cannot publish the IPs, although I can drop the last 2 octets. I also have the timestamp and obviously the complete URL, but I cannot make those public either. However, I am sure that the IPs are not from Let's Encrypt.

MikeMcQ · July 21, 2024, 9:20pm

Why would the originating IP need to be a secret?

I have a hard time imagining how responses to those requests would be helpful to anyone. But it was a long shot that I would see anything anyway.

Osiris · July 21, 2024, 9:21pm

Weird, those all-caps tokens. I believe the real tokens are a mixture of lowercase and uppercase due to the use of base64 (URL-safe).

Getting an all-uppercase token by chance is certainly possible, but statistically highly unlikely.

MikeMcQ · July 21, 2024, 9:24pm

Good eye!

maxmayer · July 21, 2024, 9:25pm

Unfortunately we have legal and certification constraints. It's just that.

However, I don't think they would have been useful:
110.232..
106.0..
148.251..
152.89..

maxmayer · July 21, 2024, 9:26pm

And why should this identify them as fake if the only constraint Let's Encrypt provides is that the length is 43 characters?

maxmayer · July 21, 2024, 9:29pm

This is exactly the point. We all know that they are fake, but they could potentially be genuine, because Let's Encrypt only says that the length must be 43 characters and that the token may contain uppercase and lowercase characters and the hyphen. There is no constraint on the number of uppercase or lowercase letters, so a token can potentially be all uppercase or all lowercase.

MikeMcQ · July 21, 2024, 9:40pm

Well, there's a practical element and a theoretical one. It was really Osiris's observation; I just thought it was a good catch.

The odds of having an entirely uppercase token value are very small. If you were to reject it, that would just fail the cert request. But so what? Aren't you retrying cert renewals frequently anyway? Occasionally original and renewal cert requests fail for any number of temporary reasons (comms, LE outages, ...). This would just be one more extremely rare failure.

Your concern seemed directed at the load on your shared storage. This looks like one of those cases where you get a high percentage of the benefit for very low effort.
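
As a sketch of that low-effort filter: skip the shared-storage lookup when a token contains no lowercase letter. This is a heuristic, not something Let's Encrypt promises; a genuine token could in principle trip it, in which case that one validation fails and is retried. The helper name `hasLowercase` is illustrative, and the sample token is one of the requests posted earlier in this thread.

```go
package main

import "fmt"

// hasLowercase reports whether the token contains at least one ASCII
// lowercase letter. Tokens without any are treated as junk.
func hasLowercase(token string) bool {
	for i := 0; i < len(token); i++ {
		if token[i] >= 'a' && token[i] <= 'z' {
			return true
		}
	}
	return false
}

func main() {
	// One of the suspicious tokens from the log excerpt above.
	suspicious := "KD1D7TSUMLR8S7Y4GGOO-U7QKQ0QNHATGAZC32388IQ"
	if !hasLowercase(suspicious) {
		fmt.Println("skip shared-storage lookup:", suspicious)
	}
}
```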
