Misleading Catchall Use of Client Error 'malformed' Detailed as 'Parse error reading JWS'

I am an amateur programmer of my personal ACME2 client. Not surprisingly, I use Python 3. My findings are based on supposition from limited testing of the Let's Encrypt production and testing (staging) servers. I am supposing the production and testing servers behave identically for the purposes of this assessment. I have done what I can to write this intelligently. Technical clarification and correction are welcome. I suppose that could include technical clarification of conventions that are used by the community providing the Let's Encrypt ACME2 service but are not so well known or documented.

I will refer to Let's Encrypt as LE. I will refer to ACME2 as simply ACME. ACME1 is no longer used and seems to have served its purpose as a prototype protocol for testing and refinement.

Account keys and proposed account keys sign a message hash to authenticate the message. Certificate keys are not used to authenticate except as an additional option to request certificate revocation. I make a distinction between UNSUPPORTED asymmetrical key or key pair types and digital signing algorithms (DSAs) on the one hand and UNINTELLIGIBLE uses of keys and DSAs on the other. I use the term DSA throughout.

I will refer to the U.S. NIST Prime key P-521 as ‘P-521’. There is no such thing as U.S. NIST Prime key type P-512. The DSA ‘ES512’ uses the key type P-521 to encrypt a message hash/digest value generated with secure hash function SHA-512. The ‘512’ in DSA name ‘ES512’ refers to SHA-512. See “SEC 1: Elliptic Curve Cryptography”, version 2 (or perhaps later if it exists) for the definition of the signature of a NIST Prime key as a pair of coordinate values with standardized representations.

I will refer to Request for Comments (RFC) documents by the Internet Engineering Task Force (IETF) as RFCs with a postpositive number identifier.

These are the RFCs specific to my exposition:
RFC 8555, “Automatic Certificate Management Environment (ACME)”, dated March 2019.
RFC 8037, “CFRG Elliptic Curve Diffie-Hellman (ECDH) and Signatures in JSON Object Signing and Encryption (JOSE)”, dated January 2017.
RFC 7518, “JSON Web Algorithms (JWA)”, dated May 2015.
RFC 7515, “JSON Web Signature (JWS)”, dated May 2015.
RFC 4648, “The Base16, Base32, and Base64 Data Encodings”, dated October 2006.

This is the problem document that (I believe) is currently used by Let's Encrypt servers as a catchall problem response:
{'type': 'urn:ietf:params:acme:error:malformed', 'detail': 'Parse error reading JWS', 'status': 400}

I will refer to the above problem document as the Malformed JWS problem document.

Discussion 1:

My recent experience shows that LE servers respond to a proposed account key of type Ed25519 (as with a rollover) with the Malformed JWS problem document.

LE thread “Parse error reading JWS”, dated 02 October 2021, documents that an LE server responded with the Malformed JWS problem document because the client was using the key type secp256k1, which is used by Bitcoin.

LE thread “New client - parse error reading JWS”, dated December 2020 (see post #4 by _az), documents that an LE server responded with the Malformed JWS problem document because the client was representing the full key pair or private key rather than just the public key. The party who identified the client programming error wrote ‘Pebble [the ACME server] should definitely make a much less opaque error’. The thread creator responded in part, ‘I was expecting something like "bad public key"’. That was four years ago.

LE thread “{Urn ietf:params:acme error malformed Parse error reading JWS status 400}”, dated March 2018, documents that an LE server responded to a thoroughly malformed JWS message with the Malformed JWS problem document. Someone not only identified several problems with the code of the thread creator, but with extraordinary involvement provided a working revision of that code.

My recent experience shows that LE servers support RSA signing keys paired with DSA ‘RS256’ and respond to RSA signing keys paired with DSAs ‘RS384’ and ‘RS512’ with the Malformed JWS problem document.

Finding 1: LE servers respond identically to UNSUPPORTED key types and DSAs and UNINTELLIGIBLE uses of keys and DSAs with the Malformed JWS problem document.

Discussion 2:

Software implementations that send correctly formed HTTP messages to servers with bodies that are correctly formed JSON Web Signatures (JWSs) have correct message implementation. Algorithm choices not supported by the ACME server are not errors of message construction, and they are not necessarily errors of choice by the client software or client party.

Finding 2: Feasible authentication algorithms not supported by the ACME server are not JWS message format errors by the ACME client.

Discussion 3:

The JWS message (protected header and payload) is authenticated with an encrypted hash of the message. The message itself is not encrypted. Values of the JWS protected header and payload are encoded in base64url for transmission with HTTP, but that can be decoded by the well-known base64url algorithm, specified by RFC 7515, part 2 by reference to RFC 4648, part 5. Trailing padding with ‘=’ characters is forbidden by RFC 7515. Octet synchronization with strings of base64url encoded octets is not needed for ACME. In the JWS signing input, a period or full stop delimits the protected header from the payload. ACME carries the protected header, payload, and signature as the ‘protected’, ‘payload’, and ‘signature’ members of a JSON object (the flattened JSON serialization required by RFC 8555, part 6.2).
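A minimal Python 3 sketch of unpadded base64url as described above, using only the standard library. The helper names `b64url_encode` and `b64url_decode` and the sample header values are my own assumptions for illustration, not taken from any RFC or ACME server.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Encode octets as base64url and strip the '=' padding (RFC 7515, part 2)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode unpadded base64url by restoring the padding the decoder expects."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# Hypothetical protected header, for illustration only.
header = {
    "alg": "ES256",
    "nonce": "example-nonce",
    "url": "https://example.test/acme/new-acct",
}
encoded = b64url_encode(json.dumps(header).encode("utf-8"))
decoded = json.loads(b64url_decode(encoded))
print(decoded["alg"])
```

Anyone holding the message, including the server before it verifies anything, can recover the header this way.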

Because the protected header is NOT encrypted, the server can easily identify the intended account key type and the associated DSA type. It makes no sense (to me at least) for an ACME server to try to parse an HTTP message body as a JWS with a known but unsupported key type or DSA to verify that those algorithms actually work only to reject that key type and/or DSA as UNSUPPORTED rather than to reject the JWS as malformed. The JWS is NOT malformed simply because the ACME server does not support the key type, DSA pairing.

A public key representation of the private key or key pair of the proposed or actual account is conveyed (marshaled) to the ACME server as a JSON Web Key (JWK) with a ‘kty’ header and value. The server uses the public key representation to verify the authenticity of the message signature in the JWS. That public key representation is retained by the ACME server so that a ‘kid’ header value provided by the ACME server may be used in lieu of a ‘jwk’ header value.
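As a sketch of sending only the public key representation, the following Python 3 function keeps just the public JWK members for each ‘kty’ (RFC 7518, part 6 for ‘RSA’ and ‘EC’; RFC 8037, part 2 for ‘OKP’). The function name `public_jwk` and the placeholder key values are assumptions for illustration.

```python
# Public members of a JWK by key type (RFC 7518, part 6; RFC 8037, part 2).
PUBLIC_MEMBERS = {
    "RSA": ("e", "kty", "n"),
    "EC": ("crv", "kty", "x", "y"),
    "OKP": ("crv", "kty", "x"),
}


def public_jwk(jwk: dict) -> dict:
    """Return a copy of jwk holding only the public members for its 'kty'."""
    kty = jwk.get("kty")
    if kty not in PUBLIC_MEMBERS:
        raise ValueError(f"unrecognized kty: {kty!r}")
    missing = [name for name in PUBLIC_MEMBERS[kty] if name not in jwk]
    if missing:
        raise ValueError(f"JWK is missing public members: {missing}")
    return {name: jwk[name] for name in PUBLIC_MEMBERS[kty]}


# Placeholder values, not a real key; 'd' is the private member of an EC JWK.
full_key = {"kty": "EC", "crv": "P-256", "x": "x-value", "y": "y-value", "d": "private-value"}
print(public_jwk(full_key))
```

Filtering on the client side like this would avoid the error from the December 2020 thread, where the private members were sent along with the public ones.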

Finding 3: ACME servers can easily identify the names of the proposed key type and/or DSA that they reject by simply looking at the normal parse results of a well-formed message.
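To illustrate Finding 3, here is a hedged Python 3 sketch of how a server could classify a protected header after ordinary parsing. It is not Boulder's or Pebble's code. The names `classify_protected` and `problem`, the detail strings, and the supported lists are all assumptions for illustration; only the error type names come from RFC 8555.

```python
import base64
import json

ERROR_NS = "urn:ietf:params:acme:error:"
# Assumed server policy, for illustration only; not any real server's lists.
SUPPORTED_ALGS = ["RS256", "ES256", "ES384"]
SUPPORTED_KEYS = {("RSA", None), ("EC", "P-256"), ("EC", "P-384")}


def problem(kind, detail, **extra):
    """Build a problem document of the given ACME error type with status 400."""
    return {"type": ERROR_NS + kind, "detail": detail, "status": 400, **extra}


def classify_protected(protected_b64):
    """Return a problem document for a protected header, or None if acceptable."""
    try:
        padded = protected_b64 + "=" * (-len(protected_b64) % 4)
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON: truly UNINTELLIGIBLE input.
        return problem("malformed", "Protected header is not base64url-encoded JSON")
    if not isinstance(header, dict) or not isinstance(header.get("alg"), str):
        return problem("malformed", "Protected header lacks an 'alg' string")
    alg = header["alg"]
    if alg not in SUPPORTED_ALGS:
        return problem("badSignatureAlgorithm", f"Unsupported alg: {alg}",
                       algorithms=SUPPORTED_ALGS)
    jwk = header.get("jwk")
    if jwk is None:
        return None  # A 'kid' request; the stored account key is checked elsewhere.
    if not isinstance(jwk, dict) or not isinstance(jwk.get("kty"), str):
        return problem("malformed", "JWK lacks a 'kty' string")
    crv = jwk.get("crv")
    if crv is not None and not isinstance(crv, str):
        return problem("malformed", "JWK 'crv' is not a string")
    if (jwk["kty"], crv) not in SUPPORTED_KEYS:
        return problem("badPublicKey", f"Unsupported key type: {jwk['kty']} {crv}")
    return None
```

The point of the sketch is the ordering: the ‘malformed’ result is reserved for input that cannot be read at all, while a readable but unsupported ‘alg’ or ‘kty’ gets its own error type.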

Discussion 4:

From the perspective of an ACME client programmer, there is a big difference between a truly malformed JWS, which indicates a coding error in the client code, and an unsupported account key type, certificate key type, or DSA selection.

A coding error could cause the JWK representation of the public key to be malformed and unclear. The ‘kty’ field could be omitted, or the ‘kty’ key name could be misspelled, or the ‘kty’ value could be a nonsensical garbage value like ‘XE4rdd’.

A correctly identified but unsupported key type is a matter of server-client negotiation, which is not an error in the client message but a normal social process, even with automation, that is practically required for the goal of ubiquitously automated domain validation (DV) certificates. Making bad use of client programmer time and energy with misleading error responses is counterproductive to ubiquitous adoption. If the ‘kty’ field is available and has a value that can be obtained by normal parsing, why is it not simply used? The ‘alg’ field in the protected header of the JWS similarly declares the DSA to the ACME server.

Finding 4: ACME client programmers need ACME server responses to clearly distinguish between UNSUPPORTED key types and DSAs and UNINTELLIGIBLE uses of keys and DSAs.

Discussion 5:

Edwards key type Ed25519 is a competitive alternative to U.S. NIST P-256, and Edwards key type Ed448 is a competitive alternative to U.S. NIST P-384 and P-521 (not P-512), as the comparative bits of security suggest.

The manipulation of the Data Encryption Standard (DES) by U.S. NSA is well known: see the English Wikipedia entry ‘Data Encryption Standard’.

The choice of 521-bit coordinate values for P-521 is curious, at least to the layman with some study, because 521 is not only not evenly divisible by 8 (the fundamental bit size of octet communication and processing) but its modulus 8 is the remainder value 1. That's one more octet to represent one last bit.
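The octet arithmetic can be checked with a few lines of Python 3. The function name `octets_for_bits` is my own. The doubling reflects that an ECDSA signature in a JWS is the two values R and S concatenated (RFC 7518, part 3.4).

```python
def octets_for_bits(bits: int) -> int:
    """Number of whole octets needed to hold a value of the given bit length."""
    return (bits + 7) // 8


for curve, bits in (("P-256", 256), ("P-384", 384), ("P-521", 521)):
    coordinate = octets_for_bits(bits)
    # Columns: curve, bits modulo 8, octets per coordinate, octets per signature.
    print(curve, bits % 8, coordinate, 2 * coordinate)

# Prints:
# P-256 0 32 64
# P-384 0 48 96
# P-521 1 66 132
```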

Daniel J. Bernstein, Tanja Lange, and others evidently developed the two well-known Edwards curves to be very useful and safe for privacy. The general lack of adoption of Ed25519 and Ed448 in Web applications is curious and, by all accounts, is not due to any weakness in their designs. The algorithms for key types Ed25519 and Ed448 have been available for nearly a decade at least. Recent versions of OpenSSL support them. Edwards curve key types Ed25519 and Ed448 are not only well known but provide equivalent or better encryption (and authentication) than the NIST Prime keys P-256, P-384, and P-521 do. Your political sentiments will likely guide your opinion if you have one.

For some reason, RFC 7518 identifies the ‘kty’ of NIST Prime keys P-256, P-384, and P-521 as ‘EC’, meaning ‘Elliptic Curve’, whereas RFC 8037 identifies the ‘kty’ family of Edwards curve keys by Daniel J. Bernstein et al. as ‘OKP’, which means ‘Octet Key Pair’. Whether those distinct ‘kty’ designations derive from technical or non-technical reasons I don't know.

The use of field name ‘kty’ to specify the key type in some cases (e.g. ‘RSA’) and the key type family in others (e.g. ‘EC’ and ‘OKP’) is confusing and harder to code correctly than a consistently simple specification with only the ‘kty’ field. If an ACME server were to identify the supported keys, would it use the independent ‘kty’ values and the supplemental leaf node ‘crv’ values as the identifier names?
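One possible answer to that question, sketched in Python 3. The ‘kty/crv’ naming scheme and the function name `key_type_name` are purely hypothetical; no RFC or ACME server that I know of defines such identifiers.

```python
def key_type_name(jwk: dict) -> str:
    """Name a key type from 'kty', adding 'crv' where 'kty' names only a family."""
    kty = jwk["kty"]
    if kty in ("EC", "OKP"):
        # 'EC' and 'OKP' are families; the curve completes the key type.
        return f"{kty}/{jwk['crv']}"
    return kty


print(key_type_name({"kty": "RSA"}))
print(key_type_name({"kty": "EC", "crv": "P-521"}))
print(key_type_name({"kty": "OKP", "crv": "Ed25519"}))
```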

The ‘alg’ field does not have that complication for DSAs limited to the RSA and NIST Prime key types, but it does (or would) with the addition of the Bernstein et al. Edwards key types. Further exposition on that point is deferred to discussion section 9.

Finding 5: The Ed25519 and Ed448 key types are well-known and competitively viable for Web applications and for ACME certification specifically, but those key types have so far been hardly used for them.

Discussion 6:

RFC 7518, part 3.1, the table lists the RSA key DSA ‘RS256’ as ‘Recommended’ and the RSA key DSAs ‘RS384’ and ‘RS512’ as ‘Optional’. The practical advantage of reduced computational overhead for the smallest hash size of RS256 is presumably waning if not already irrelevant.

The same table of RFC 7518 shows that the three U.S. NIST Prime key types P-521 (not P-512), P-384, and P-256 are represented with a single DSA for each: ‘ES512’ (not ‘ES521’), ‘ES384’, and ‘ES256’, respectively.

RFC 8037, part 3.1 identifies the ‘EdDSA’ family of DSAs and lists the two member DSAs ‘Ed25519’ and ‘Ed448’.

RFC 8555, as its title indicates, defines the initial production version of ACME, known as ACME version 2 or just ACME2.

RFC 8555, part 6.2:
An ACME server MUST implement the "ES256" signature algorithm [RFC7518] and SHOULD implement the "EdDSA" signature algorithm using the "Ed25519" variant (indicated by "crv") [RFC8037].

Finding 6: DSAs ‘RS384’, ‘RS512’, ‘Ed25519’, and ‘Ed448’ are well-known algorithms and viable choices at least on par with the well-known and generally accepted DSAs ‘RS256’, ‘ES256’, ‘ES384’, and ‘ES512’.

Discussion 7:

RFC 8555, part 6.7:
When the server responds with an error status, it SHOULD provide additional information using a problem document [RFC7807]. To facilitate automatic response to errors, this document defines the following standard tokens for use in the "type" field (within the ACME URN namespace "urn:ietf:params:acme:error:"):

[A table with headings ‘Type’ and ‘Description’ and these two entries among others.]
badPublicKey                 The JWS was signed by a public key the server does not support
badSignatureAlgorithm  The JWS was signed with an algorithm the server does not support

Notice the words ‘not support’ in the descriptions of the errors ‘badPublicKey’ and ‘badSignatureAlgorithm’. Rendered in all caps as one word that would be known as UNSUPPORTED.

Finding 7: ACME servers SHOULD detect any of the UNSUPPORTED key types used to sign a JWS message and respond with a problem document that has the error ‘badPublicKey’.

Discussion 8:

RFC 8555, part 6.7 has provisioned the use of error badSignatureAlgorithm as shown in the previous discussion section. A finding for DSAs analogous to previous finding 7 is already shown, but wait, there's more.

To make ubiquitous automated certificate renewal practical, ACME clients require the ability to ‘negotiate’ for the best supported account key type, certificate key type, and DSA type. ACME servers can upgrade protocols as they wish without breaking the client code with such a negotiation feature. It is my understanding that Web servers and Web browsers negotiate the connection algorithms to use. Therefore, I posit that practicality requires that ACME servers respond to an unsupported key type and/or DSA with a list of supported key types and/or DSA types. With regard to DSAs, that is not just my opinion.

RFC 8555, part 6.2:
If the client sends a JWS signed with an algorithm that the server does not support, then the server MUST return an error with status code 400 (Bad Request) and type "urn:ietf:params:acme:error:badSignatureAlgorithm". The problem document returned with the error MUST include an "algorithms" field with an array of supported "alg" values.

Finding 8: ACME servers SHOULD detect any of the UNSUPPORTED DSAs used to authenticate a JWS message and respond with a problem document that has the error ‘badSignatureAlgorithm’, and if they do, the reply MUST also include an ‘array’ (a.k.a. a list) of supported ‘alg’ values.
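A hedged Python 3 sketch of the client side of such a negotiation, assuming the server returns the ‘algorithms’ array that RFC 8555, part 6.2 requires. The function name `next_algorithm`, the sample reply, and the preference order are assumptions for illustration.

```python
BAD_ALG = "urn:ietf:params:acme:error:badSignatureAlgorithm"


def next_algorithm(problem: dict, preferences: tuple):
    """Pick the client's most preferred 'alg' that the server says it supports.

    Returns None when the problem document is of another type or when nothing
    in `preferences` is offered; the client never goes outside its own list.
    """
    if problem.get("type") != BAD_ALG:
        return None
    offered = problem.get("algorithms")
    if not isinstance(offered, list):
        return None
    for alg in preferences:
        if alg in offered:
            return alg
    return None


# Hypothetical server reply and client preference order (largest hash first).
reply = {
    "type": BAD_ALG,
    "detail": "Unsupported alg",
    "status": 400,
    "algorithms": ["RS256", "ES256", "ES384"],
}
print(next_algorithm(reply, ("RS512", "RS384", "RS256")))  # RS256
```

Because the client only ever chooses from its own preference tuple, a forged reply can at worst push it to the weakest algorithm the client already accepts.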

Discussion 9:

RFC 7518, part 3.1, the table lists in part these values for the ‘alg’ field of JWS: RS256, RS384, RS512, ES256, ES384, and ES512. The first three require an RSA key. The last three respectively require the U.S. NIST Prime keys P-256, P-384, and P-521 (not P-512). So long as LE or any other ACME service provider supports only RSA and NIST Prime keys (as they currently do, but not P-521), then the ‘alg’ value alone specifies a DSA.

RFC 8037, part 3.1 breaks (or would break if implemented) that one-to-one relationship for Edwards keys. The document portion specifies the ‘alg’ field value ‘EdDSA’ and the supplemental ‘crv’ field values ‘Ed25519’ and ‘Ed448’.

Finding 9: The ‘alg’ values that an ACME server SHOULD or MUST use to identify the DSAs it supports are (or could become) unclear without prescriptive clarification because RFC 8037 (and perhaps other specifications) broke the one-to-one relationship between the value of the ‘alg’ field and a particular DSA.
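The pairing rules from RFC 7518, part 3.1 and RFC 8037, part 3.1 can be written down as a small Python 3 table. This is a sketch only: the function name `resolve_dsa` and the ‘EdDSA/Ed25519’ style of combined name are my own assumptions, chosen to show that ‘EdDSA’ alone does not identify one DSA.

```python
# 'alg' value -> ('kty', allowed 'crv' values); None means 'crv' does not apply.
ALG_REQUIREMENTS = {
    "RS256": ("RSA", None),
    "RS384": ("RSA", None),
    "RS512": ("RSA", None),
    "ES256": ("EC", ("P-256",)),
    "ES384": ("EC", ("P-384",)),
    "ES512": ("EC", ("P-521",)),
    "EdDSA": ("OKP", ("Ed25519", "Ed448")),
}


def resolve_dsa(alg: str, jwk: dict) -> str:
    """Return a name that identifies one DSA, or raise ValueError on a bad pairing."""
    if alg not in ALG_REQUIREMENTS:
        raise ValueError(f"unrecognized alg: {alg!r}")
    kty, curves = ALG_REQUIREMENTS[alg]
    if jwk.get("kty") != kty:
        raise ValueError(f"alg {alg} requires kty {kty}")
    if curves is None:
        return alg
    crv = jwk.get("crv")
    if crv not in curves:
        raise ValueError(f"alg {alg} does not pair with crv {crv!r}")
    # Only 'EdDSA' needs the curve to tell its member DSAs apart.
    return alg if len(curves) == 1 else f"{alg}/{crv}"


print(resolve_dsa("ES512", {"kty": "EC", "crv": "P-521"}))
print(resolve_dsa("EdDSA", {"kty": "OKP", "crv": "Ed448"}))
```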

It's only tangentially related to your post, but since you mentioned it, here's some crypto trivia:

The NIST SECP curve P-521 has 521 bits because 2^521 - 1 is a prime; a prime of this form is known as a Mersenne prime. In fact, all of the NIST SECP curves use either Mersenne primes or generalized Mersenne primes (for P-256 it is 2^256 - 2^224 + 2^192 + 2^96 - 1, for P-384 it is 2^384 - 2^128 - 2^96 + 2^32 - 1). Mersenne primes have efficient implementations (you can do modular reduction with any of these without doing any multiplications). However, there are also not that many Mersenne primes: only 52 are known as of today. So while 512 would've been the "nice number that fits better in your register", 521 being a Mersenne prime was simply superior performance-wise to anything else you could choose in that bit-range.
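For anyone who wants to check this at a Python 3 prompt, here is a small sketch using the Lucas-Lehmer test, plus a reduction modulo 2^p - 1 that uses only shifts, masks, and additions. The function names are my own, and real elliptic curve libraries do this with fixed-width limbs rather than Python integers.

```python
def is_mersenne_prime(p: int) -> bool:
    """Lucas-Lehmer test: for an odd prime p, report whether 2**p - 1 is prime."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def reduce_mod_mersenne(x: int, p: int) -> int:
    """Reduce a non-negative x modulo 2**p - 1 without multiplying or dividing."""
    m = (1 << p) - 1
    while x > m:
        # 2**p is congruent to 1, so the high part can simply be added to the low part.
        x = (x & m) + (x >> p)
    return 0 if x == m else x


print(is_mersenne_prime(521))  # True
print(is_mersenne_prime(11))   # False, since 2**11 - 1 = 2047 = 23 * 89
```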


In general I agree with the idea behind this post. Boulder should honor RFC 8555:

If the client sends a JWS signed with an algorithm that the server does not support, then the server MUST return an error with status code 400 (Bad Request) and type "urn:ietf:params:acme:error:badSignatureAlgorithm". The problem document returned with the error MUST include an "algorithms" field with an array of supported "alg" values. See Section 6.7 for more details on the structure of error responses.

if it doesn't do already (didn't check).

pebble checks about key type and make dedicated error but throw malformed error for it

Boulder has a catch-all that reports any JWS parse error as malformed.

I appreciate your thoughtful comment. The choice of P-521 seems more reasonable knowing that it was the closest Mersenne prime to a 512 bit coordinate size. Real mathematicians are really smart. To contemplate such things without modern libraries in ancient and medieval times must have required genius. With all the ugly things that happen in history, it's nice to know that the competent pursuit of knowledge can and perhaps tends to happen.

I am supposing that when I use LE staging/testing at https://acme-staging-v02.api.letsencrypt.org/directory that I am using the pebble implementation, and when I use LE real at https://acme-v02.api.letsencrypt.org/ I am using the boulder implementation. I did not notice a difference in behavior and supposed the two servers are essentially identical for the concerns I expressed for this thread.

Looking at the boulder code, it looks like the return 'err' value is never used to particularize the error message to the client. To me it also looks like only the supported algorithms are considered as not errors, which I think is your point about boulder.

The Pebble function is less clear to me. I can't clearly understand your statement: "pebble checks about key type and make dedicated error but throw malformed error for it"

What is 'Algorithm field on the JWK'? Is that the 'kty' field, and not the 'alg' field that is in the JWS? I think it means key type. It looks like Pebble is addressing one or both of the badPublicKey and badSignatureAlgorithm cases to whatever extent.

Notice that Pebble assumes there is only one possible algorithm, as I read the code. RSA can be upgraded to SHA-384 and SHA-512.

RFC 8555, part 10.1:
ACME does not protect against other types of abuse by a MitM on the ACME channel. For example, such an attacker could send a bogus "badSignatureAlgorithm" error response to downgrade a client to the lowest-quality signature algorithm that the server supports.

The idea of not downgrading is clearly preferred. I don't like having to hardcode ('RS256',) rather than ('RS256', 'RS384', 'RS512'). Those are Python, version 3 tuples.

I wonder if the directory could just list the supported key types and digital signature algorithms, perhaps in the ranked order of preference from the server's perspective. I would like to see some sort of negotiation on key type and signing algorithm. They are intertwined selections, which makes treating them as completely separate considerations perhaps a bad idea. From a philosophical standpoint, some idea constellations are 'irreducible'.

I appreciate your thoughtful reply. I tried my best to grasp your points.

No, both services are running Let's Encrypt's boulder. It's just that staging sometimes runs a slightly newer build of boulder and sometimes has features enabled that production boulder hasn't. Essentially, Let's Encrypt's staging boulder is tomorrow's production boulder.

Pebble is also from Let's Encrypt, and is a testing-only ACME CA that can be used for ACME client development or similar tests.

Hi @abyss, and welcome to the forums!

I don't think you need to convince anyone, I completely agree that JWSes using unsupported key or signature algorithm types should be rejected with BadPublicKey or BadSignatureAlgorithm, not Malformed.

In fact, our source code shows that we do exactly that: we respond with BadSignatureAlgorithm if an unsupported algorithm is claimed, and respond with BadPublicKey if presented with an unsupported key type.

If you have a reproduction case where using an Ed25519 key results in Malformed instead of one of those errors, please file a bug against the Boulder repository describing exactly what steps you followed and the full contents of the error message returned by the API.

I found that the checkalgorithm function is never reachable from a web request because parseJWS is called before validXXXJWS, short-circuiting as malformederror.

I made a PR about this on Boulder: “separately handle badsignalgo at JWS parse time” by orangepizza (Pull Request #8091, letsencrypt/boulder, GitHub).

I just tested the Let's Encrypt staging ACME2 server for Ed25519 key and for RSA with RS512 on 01 April 2025. Not joking today. I really don't like that tradition. Anyway, I found that the catchall malformed error is still returned in both cases. I have some invocation input and output if you want it, but I don't think you need it.

For those of us not so well informed on process, you (orangepizza) have solved the problem of problem identification. I was not sure what you meant by 'solved', but your later posts below show that the fix is an ongoing process.

Thank you for your kind efforts. If you would like me to submit a bug report as suggested by aarongable, let me know and I will. I will be hacking away on my code at some point. Might cause a delay in my ability to check if the fix works for me.

@orangepizza already submitted a PR correcting the bug per the link @orangepizza posted above, so I marked that post as the solution from a community perspective. It's up to the LE staff now to address that PR accordingly.
