Account public key discrepancy

webprofusion · November 23, 2018, 8:52am

Hi,

Has anyone ever seen public keys for an account vary between the client and the LE server, yet most requests against the account still work (such as retrieving account info from /new-acct/ and changing the key)?

I’m wondering if there is a difference in the algorithm the library I’m using and the boulder implementation that sometimes results in some account public keys having a different result but still being somewhat valid (or is this normal behaviour).

I’ll admit to being a bit rusty at Elliptic curve crypto (and I’m far from being a crypto expert, generally). Happy to be schooled.

For instance, on staging, account id 7392041 locally had the public key:

 "key": {
    "kty": "EC",
    "crv": "P-256",
    "x": "nPTIABcDASY6FNGSNfHCB51tY7qChtgzeVazOtLrwQ",
    "y": "vEEs4V0egJkNyM2Q4pp001zu14VcpQ0_Ei8xOOPxKZs"
  }

and a query to /new-acct/ (which works fine) says the public key is:

 "key": {
    "kty": "EC",
    "crv": "P-256",
    "x": "AJz0yAAXAwEmOhTRkjXxwgedbWO6gobYM3lWszrS68E",
    "y": "vEEs4V0egJkNyM2Q4pp001zu14VcpQ0_Ei8xOOPxKZs"
  }

Requests generally work, except when validating challenge response (DNS or HTTP), which fail because the answer is not as expected, presumably skewed by the difference in key.

The interesting thing is that calling /key-change with new key works fine, and thereafter the new key works for all operations. Clearly the most likely problem is a bug on my side. I’m using the Certes library : https://github.com/fszlin/certes/blob/master/src/Certes/Crypto/AsymmetricCipherKey.cs

JuergenAuer · November 23, 2018, 6:54pm

Hi @webprofusion

checked my own version (key of my stage-account, RSA 4096):

Local:

xPKBq7VJeCWDcIQ0b8EGrbHuuRxQ2ZI3PncDBE1HbZb0qsI9gReJLPzVbfZqW/tf8GmM0UKR8lFglVMSBWbN6g90ZZza2ZqSxLjSkBVDvvechdeExDiZQ3FZTcHfgo+oNW+c4BGOpKtMtu1A+is4cpXzLyEm/YwrBayD2rXOSmn1yv5BG/OSK/dsN77tvZSX4apBwvua5DfigTLhq7t8t2L4NLKlHu11PdoQFZr2C8c+flEJqhY6xjEMpk/osCm0G9DJtJB+PiiOCMlNqVKYY3+OF5T4XTDK5jMwf32oYfrDZ8s40Phnvj+6KS+xuu694H74l7aKT5wqjpKlTag8ZFieue6zKtpePNBy3+zlFybx9UxE4GIsNaZlYrrxq6jinTnmaZl3Kt1yNz72Toql9q4ptef/FlkrfUxOkt5l4NhvP78SHDXHPH6r5aCaGVE847zLquwZGiJ51Nw1tCpLCgaVorugpa3lj/jI1V8xZKgtA9OktDpoAlQ3BZ2hebmlM1AmSQfjshhIFvbrMTdYtUz9+EkxGVGWalQLe4obePCJxfb9Tq1FRb3N+u4gkvpfAeoiRECnzOzZ6kVcsM+3/BNoO8iTj87g7XKOxXnKA4Mh3pF3xa/pNuyw0veawfrQsT+AC8W/XRG9zB1DBIuVdVbg83NdlNq1Q/RZytpxhCk=

Result of /new-account (v2) with {"onlyReturnExisting":true} as payload:

xPKBq7VJeCWDcIQ0b8EGrbHuuRxQ2ZI3PncDBE1HbZb0qsI9gReJLPzVbfZqW_tf8GmM0UKR8lFglVMSBWbN6g90ZZza2ZqSxLjSkBVDvvechdeExDiZQ3FZTcHfgo-oNW-c4BGOpKtMtu1A-is4cpXzLyEm_YwrBayD2rXOSmn1yv5BG_OSK_dsN77tvZSX4apBwvua5DfigTLhq7t8t2L4NLKlHu11PdoQFZr2C8c-flEJqhY6xjEMpk_osCm0G9DJtJB-PiiOCMlNqVKYY3-OF5T4XTDK5jMwf32oYfrDZ8s40Phnvj-6KS-xuu694H74l7aKT5wqjpKlTag8ZFieue6zKtpePNBy3-zlFybx9UxE4GIsNaZlYrrxq6jinTnmaZl3Kt1yNz72Toql9q4ptef_FlkrfUxOkt5l4NhvP78SHDXHPH6r5aCaGVE847zLquwZGiJ51Nw1tCpLCgaVorugpa3lj_jI1V8xZKgtA9OktDpoAlQ3BZ2hebmlM1AmSQfjshhIFvbrMTdYtUz9-EkxGVGWalQLe4obePCJxfb9Tq1FRb3N-u4gkvpfAeoiRECnzOzZ6kVcsM-3_BNoO8iTj87g7XKOxXnKA4Mh3pF3xa_pNuyw0veawfrQsT-AC8W_XRG9zB1DBIuVdVbg83NdlNq1Q_RZytpxhCk

It' ~ the same, the result of Letsencrypt uses the url-safe version: + -> -, / -> _, = at the end removed.

So I see no difference.

_az · November 23, 2018, 8:16pm

It doesn’t really seem possible, on the face of it.

Are you able to show the raw request and response payloads for newAccount{onlyReturnExisting=true}? Neither reveals your private key nor can they be replayed so it is safe.

webprofusion · November 24, 2018, 12:18am

Let me see if I captured anything useful, unfortunately the key change fixed the problematic key on the account, so now I can’t reproduce it.

I agree it’s very odd. At first I suspected I was using a stale key because the method I was using to save the settings perhaps hadn’t fully flushed to disk by the time the next part of the process loads the key and creates an Acme account - but that doesn’t explain why account operation would work.

It could just be an encoding/rounding difference that actually equates to the same result when fitted to the EC curve? I’m making stuff up now.

_az · November 24, 2018, 12:48am

Wow. I think you're right, but I'm not 100% sure.

Quickly decoding both sets of encoded x,y coordinates seems to decode to the same coordinates: Go Playground - The Go Programming Language . I don't know if I've made a mistake there.

I'm not too certain about the integer-to-octet-string encoding itself works either (its documented by another document linked from RFC 7518 - JSON Web Algorithms (JWA)).

_az · November 24, 2018, 1:21am

OK, I think I know what's wrong with your code: the encoding algorithm is wrong, which as you already figured out, causes your key-authz calculation to be wrong.

From the RFC:

The length of this octet string MUST
be the full size of a coordinate for the curve specified in the "crv"
parameter. For example, if the value of "crv" is "P-521", the octet
string must be 66 octets long.

P-256 is 256-bits, so the string must be 32 octets long.

nPTIABcDASY6FNGSNfHCB51tY7qChtgzeVazOtLrwQ is 31 octets in length.

If you decode that, then pad it to 32 bytes, its encoding ends up as AJz0yAAXAwEmOhTRkjXxwgedbWO6gobYM3lWszrS68E, which is what Boulder sees.

So your encoding is not valid, even though decoders seem to tolerate it and decode it to the same coordinate.

Which also explains why Boulder behaves weird. I wonder if Boulder should be more intolerant of this case? cc @jsha

webprofusion · November 24, 2018, 2:04am

Awesome, thanks for taking the time to look at this, it’s very much appreciated!

I’ll check out where the issue is on my side, most likely in Certes, under the covers it’s using the bouncycastle crypto functions but some parts are custom.

webprofusion · November 25, 2018, 8:42am

Thanks again for your help @_az PR to certes now pending: https://github.com/fszlin/certes/pull/174

_az · November 25, 2018, 9:06am

Cool.

I was just pondering about how a byte actually got chopped off to begin with and whether ACME clients might be affected, since a number of them don’t seem to worry about the padding.

webprofusion · November 25, 2018, 9:44am

It’s a good question, here’s an example private key that then causes the issue to manifest:

The X Coord gets converted (using Bouncy Castle) to a Big Integer, then an unsigned byte array. In this case the byte array ends up 31 bytes, which could be an artifact of the conversion to big integer (as that has a variable storage size). It may be unique to the C#/BouncyCastle variant of EC Coords.

webprofusion · November 25, 2018, 9:58am

Also worth noting that it only affects keys randomly (probably based on the size of the X number), so many users where fine but some would report that the auth challenges were failing despite presenting what appeared to be the correct answer.

_az · November 25, 2018, 9:59am

To further document my learning: Go also decodes that as a big int with a byte-length of 31.

~~However, in 60 seconds spent generating 2,888,254 P-256 keys, I only produced ones with ones with a 32-byte length X coordinate.~~

~~So perhaps generating one with 31-bytes is either vanishingly rare or invalid? :S~~

Edit: I was looking at the wrong thing. It generated 32/31 fairly evenly, as well as others:

32:1425640 31:1419220 30:5562 29:19

So the padding is very important (and it is a real issue for all ACME clients, then).

webprofusion · November 25, 2018, 10:08am

Thanks, yes it seems to be pretty common, which makes it even more great that the solution has been found. Maybe it affects acme v2 more? I dunno. Haven’t really seen it before our v4.x release and I’d assumed there were other problems with account key mgmt but as it turns out they were probably red herrings.

I hadn’t realised that it could affect 50% of users though. I wonder how many failed auth reports are actually faulty key encodings? I’m assuming certbot does the right thing, therefore it would only be other clients that are affected (and they’re probably in the minority compared to certbot).

This is actually a pretty cool thing to raise awareness of.

_az · November 25, 2018, 10:09am

Certbot doesn’t support EC for account keys at all, so it has no chance to screw JWK up there.

I think Posh-ACME might have the same mistake? @rmbolger.

Maybe acme.sh too, based on the way it parses the output of openssl ec.

rmbolger · November 25, 2018, 3:58pm

It looks like this might be a BouncyCastle specific bug or perhaps just an implementation detail it expects the caller to handle? I'm pretty sure Posh-ACME is safe for the majority of users because most of the key related code doesn't use BouncyCastle. It uses .NET native crypto libraries which seem (so far) to handle the padding automatically. I also then store account keys as JWK rather than PEM because I didn't want to need to convert back from PEM using BouncyCastle.

I do have a function that is used to convert from the BC key objects to .NET key objects. But it's currently only used by the Google Cloud DNS plugin to parse the Google account's auth key which is PEM formatted. I tried using that against the "bad key" posted earlier and I do get an exception. BC reads the PEM just fine, but translating the key parameters from the BC object to the .NET object results in an exception that basically says the X/Y parameters are not valid.

Am I correct that the "bad key" is not actually bad? It's just that BouncyCastle's interpretation of it doesn't include the necessary padding? So in my function, I could add the padding and avoid the .NET error without breaking the key?

I'll also try running a similar key creation test to @_az's and see whether .NET does indeed handle the padding automatically or not.

rmbolger · November 25, 2018, 5:12pm

So I ran a .NET test with 1,000,000 P-256 key creations and all of them came out with 32 byte X lengths. So I think it's safe if you're doing everything natively.

I started to run the same test against BouncyCastle and in the process I think I found a better way to fix the bug than the PR you submitted, @webprofusion. In short, BouncyCastle will handle the padding automatically but you have to get the parameter byte array slightly differently.

The certes code and my own conversion function is using the following to get the X/Y byte arrays which doesn't add the padding:

ecKey.Q.AffineXCoord.ToBigInteger().ToByteArrayUnsigned()

Instead, we should use this:

ecKey.Q.AffineXCoord.GetEncoded()

which internally to BouncyCastle then does this in order to handle the padding automatically:

return BigIntegers.AsUnsignedByteArray((FieldSize + 7) / 8, ToBigInteger());

This definitely fixes my own function that converts BC keys to .NET keys.

webprofusion · November 25, 2018, 11:36pm

Neat! I think the original Certes code was inspired by other examples that tend to use .ToByteArrayUnsigned

And regarding the original ‘bad’ pem, as far as I’m aware it’s still a good key, but perhaps there’s a related encoding bug - even so, decoding should be independent of that.

jsha · November 26, 2018, 4:20am

This is a great thread. Thanks for the analysis and the discovery, @_az! I’ll try to reproduce tomorrow or Tuesday, but your conclusion makes sense. Most likely we’ll want to submit an upstream patch in go-jose.

jsha · December 5, 2018, 6:14am

I’ve submitted a PR: https://github.com/square/go-jose/pull/210

webprofusion · December 5, 2018, 9:13am

So the latest version of Certify The Web (v4.0.12) has the key fix (not yet merged upstream to the main certes library), seems to be going mostly ok.

Interestingly I still have one user that cannot successfully perform http validation due the key authz being computed incorrectly for the challenge response. I’m now wondering if bouncy castle has subtly different behaviour on different versions of windows/.net/os patch levels which is a bit crazy but I’m grasping at straws. For a future release I’ll probably look to use more of the standard .net crypto functions instead of relying on BC.

Has anyone ever seen LE respond similarly to the following even if the challenge response is actually correct? (I saw something elsewhere about IPv6 but I don’t see how that would apply):
Validation of the required challenges did not complete successfully. The key authorization file from the server did not match this challenge [Br5NVeKTTEgLaPoMCOXuAl-ivmlGBgPxbSvmeO0SKXw.4hOot_D7SDRumLJLG3sciREeT6z7Um0M-l-KsB73Q_E] != [Br5NVeKTTEgLaPoMCOXuAl-ivmlGBgPxbSvmeO0SKXw.AbVkbPD9c9j2MNbLLQVVhqNJlUdoGnJYME7Sv4A_PXc]

Topic		Replies	Views
Misleading Catchall Use of Client Error 'malformed' Detailed as 'Parse error reading JWS' Feature Requests	11	294	May 1, 2025
Sudden increase in http validation errors Client dev	15	1208	September 19, 2018
Error during Account Registration - Key too small: 0 Client dev	13	3443	November 13, 2016
Challenge Account Key Client dev	16	889	May 18, 2024
"Parse error reading JWS" Client dev	29	2471	August 4, 2022

Account public key discrepancy

Related topics