Certbot should not read comments

Good day,

We ran Certbot on Apache, but a certificate could not be installed.

In the debug log, /var/log/letsencrypt/letsencrypt.log, a line was: “UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1705: invalid continuation byte”.
Note: in both Latin-1 and Latin-2, the byte 0xe9 represents the letter é.
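
To illustrate the note above, here is a minimal sketch that reproduces the same kind of failure outside Certbot. The helper name try_decode is made up for this example, and the bytes are assumed to be what a Latin-1 editor would write for the comment line.

```python
def try_decode(raw: bytes, encoding: str):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Assumption: the comment as an editor saving in Latin-1 would store it.
comment = "# Pépinière\n".encode("latin-1")

text, err = try_decode(comment, "utf-8")
if err is not None:
    # 0xe9 announces a multi-byte UTF-8 sequence, but the next byte is a
    # plain ASCII "p", so strict UTF-8 decoding gives up here.
    print(f"utf-8 failed at byte {err.start}: {err.reason}")

text, err = try_decode(comment, "latin-1")
print(text, end="")
```

The same bytes decode cleanly as Latin-1, which is why the file looked fine in the editor but not to a strict UTF-8 reader.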

After some head scratching, I found out it was caused by the character “é” in a comment line in the httpd.conf file. After changing the comment line from # Pépinière to # Pepiniere, we could install the certificate.

Someone should tell Certbot: “Please don’t read comments! You don’t need to.”

Welcome to the Let's Encrypt Community 🙂

Thanks for the feedback!

@certbot-devs

This one's for you.

What version of Certbot are you running?

Certbot doesn't interpret comments. The more general problem is that it assumes all files are UTF-8, since that encoding is pretty universal these days, and it has to read the whole file just to find out where the comments are. (Re-saving the conf as UTF-8 would also have fixed the error without mangling your name.) Maybe it should fall back to Latin-1 when it encounters a UnicodeDecodeError, before giving up.
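
A minimal sketch of that fallback idea, for illustration only. This is not Certbot's code, and the function name decode_with_fallback is hypothetical.

```python
def decode_with_fallback(raw: bytes) -> str:
    """Decode as UTF-8 first; fall back to Latin-1 on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every possible byte to a code point, so this
        # second attempt cannot raise. It may still pick the wrong
        # letters if the file is really in some other 8-bit encoding.
        return raw.decode("latin-1")


print(decode_with_fallback("# Pépinière".encode("latin-1")))
print(decode_with_fallback("# Pépinière".encode("utf-8")))
```

The trade-off is that the fallback never fails, so a file in, say, a Cyrillic code page would be silently misread. That is harmless inside a comment but less so inside a directive.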

Certbot version 1.16.0
Apache/2.4.37
CentOS Linux release 8.4.2105
System is up to date via yum update
All in US English, except one line in httpd.conf, which had this comment line: # Pépinière

What output do you get from file -bi httpd.conf? This reports the file's MIME type and character encoding.

Running command: file -bi httpd.conf outputs:
text/plain; charset=us-ascii

Thanks. I'm not entirely sure what we should do about this.

In the nginx plugin, we only support UTF-8 encoding in the configuration files. If the file does not decode properly as UTF-8 (because it is non-ASCII but also non-UTF-8), we raise an error:

Could not read file: /etc/nginx/sites-enabled/foo due to invalid character. Only UTF-8 encoding is supported.

The most likely outcome is that we will catch the UnicodeDecodeError and print a similar message.

The only saving grace I see is that the Augeas parsing library gets us 99% of the way to success, so it might be viable to match its behavior in the one place in Certbot where we're not using Augeas (where we copy and paste the virtualhost into an SSL virtualhost). I will try to find out.
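
As a rough sketch of what "detect UnicodeDecodeError and print a similar message" could look like: the exception class ConfigEncodingError and the function read_config_text are invented for this example and are not Certbot's API. Only the message wording is taken from the nginx error quoted above.

```python
from pathlib import Path


class ConfigEncodingError(Exception):
    """Hypothetical error for a config file that is not valid UTF-8."""


def read_config_text(path: str) -> str:
    """Read a config file as UTF-8, turning decode failures into a clear error."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        # Keep the original exception chained for the debug log.
        raise ConfigEncodingError(
            f"Could not read file: {path} due to invalid character. "
            "Only UTF-8 encoding is supported."
        ) from exc
```

A caller would catch ConfigEncodingError and show its message to the user instead of a raw traceback.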

There's quite a good discussion of Python 3 encoding options here: "Processing Text Files in Python 3" in Nick Coghlan's Python Notes (curiousefficiency.org). Not sure if it's useful.

@D2S-cloud, do you know which editor was used to write that comment in the config file? A file can start out as plain ASCII and then be "upgraded" to UTF-8 by introducing characters outside the ASCII range. I'd imagine that if you changed your comment back to # Pépinière, running file -bi httpd.conf would now say utf-8. Editors can explicitly mark a text file as UTF-8 (etc) by adding a byte order mark (BOM) at the start of the file, but it's not mandatory. If you do save your file as UTF-8 with the BOM enabled, you'll probably find everything works well.
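
To make the ASCII versus UTF-8 versus BOM distinction concrete, here is a small sketch that classifies raw bytes. It does not reproduce the heuristics of the file command; the function name guess_charset and the labels it returns are assumptions made for this example.

```python
import codecs


def guess_charset(raw: bytes) -> str:
    """Very rough classification of a byte string's text encoding."""
    if raw.startswith(codecs.BOM_UTF8):
        # The three bytes EF BB BF explicitly mark the file as UTF-8.
        return "utf-8-sig"
    try:
        raw.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some 8-bit encoding (Latin-1, Latin-2, ...); bytes alone cannot say which.
        return "unknown"


print(guess_charset(b"# Pepiniere\n"))
print(guess_charset("# Pépinière\n".encode("utf-8")))
print(guess_charset("# Pépinière\n".encode("latin-1")))
```

Every pure ASCII file is also valid UTF-8, which is why a file with no accented characters is reported as ASCII regardless of which editor saved it.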

I usually use WinSCP’s Internal Text Editor.

WinSCP current version is 5.17.8
