Dry-run works, missing privkey.pem on live try

I think there’s actually a bug in Certbot that you found, so it could be very helpful to try to get to the bottom of it.

The tricky thing is that it might well have been fixed in a newer version of Certbot (which you could check with certbot-auto), but because of the server-side rate limits, you can’t issue a new certificate until Sunday. The exception would be if you had a third domain name (including a subdomain) that you could include in the certificate request—then the requested certificate would not have the “exact set of names”, which is restricted by the rate limit that you ran into. The rate limit was probably triggered by trying repeatedly in the presence of this bug.

@bmw, I went back to the release commit for this version of Certbot and found the line that generated the exception. It’s renewal.py with the invocation of the logger here:

            logger.warning("Attempting to renew cert from %s produced an "
                           "unexpected error: %s. Skipping.", renewal_file, e)

Apparently either renewal_file or e was not of type str and couldn’t be coerced to str to be interpolated in the log message. Which is extra-annoying in this case, because if we had that log message, we would know what the underlying error was. :frowning:

Also, from the crt.sh logs, it looks like the renewal succeeded (in terms of issuance) but an uncaught exception was raised before returning to renewal.py here.

Is it just buried in these letsencrypt.log.1-31 files in this folder? How do I find it? Do you want a zip of all the logs sent privately, or would you rather SSH into the box and look around?

I want to help as much as possible since this is a free service; that is sorta the cost of using it in my book.

Not knowing what else to do, I changed locales a few times and tried again before going back to en_US.UTF-8, and tried lots of other things as well that may have scrambled it.

My two suggestions would be

  • create an additional subdomain even though you don’t plan to use it, and add web server configuration for that subdomain that matches your current configuration; then you can try to get a certificate right away for all three names (base domain, www, and the other subdomain), perhaps using certbot-auto and perhaps also adding --debug

  • wait until Sunday and try again with certbot-auto and perhaps with --debug for your existing two names

I’m not sure that you’re going to have more information in an existing log.

But I am curious whether you ever tried to use either filenames or file contents that were non-ASCII, containing accented or non-Latin characters, during the time that you were experimenting with a different locale. Sometimes there can be character encoding problems that cause weird crashes when we have string-handling bugs.

No, I didn't change any filenames or contents, just the locale with the Debian setup tool. I tried the other en_US locale once and changed back.

Remember that the original problem I had Sunday (and before) was file not found, because the “live” symlink was pointing to the nonexistent variant folder in the archive directory. There was the www version but it was looking in the non-www folder in archive, or vice versa (can't remember). Then I copied that directory without the www (or added it, whichever) and matched permissions, and that didn't work either.

I'll try a third domain later if I feel up to it, but I've got to get this done with or without SSL, so I'll probably reinstall and forgo HTTPS. I really don't see a need for it on the site I am doing; it's just a WordPress site for my friend's small-town bar. I was just doing it to make the Google search ranking happier, if they actually rate SSL sites higher than non-SSL.

Thanks for all your help. If you think of anything you need off that server for your purposes, let me know, and if I haven't wiped it yet I will provide it. But if it's for ME, don't worry about it. I'm going to give a third domain setup a 15-minute shot later, then call it after that and send in the format monster.

I do think it would be great to get to the bottom of this error in case it’s a Certbot bug that could affect other people.

While it doesn't necessarily propose a path forward, this looks relevant. The error message in the output above is:

TypeError: __str__ returned non-string (type SysCallError)

and SysCallError is from pyOpenSSL.
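For anyone following along, the failure mode can be reproduced in isolation. The sketch below is not Certbot's code: BrokenError and describe are hypothetical names, and BrokenError only stands in for an exception whose __str__ returns something other than a string, which is what the TypeError above suggests happened. It shows why the %s interpolation fails and how falling back to repr() would still let the underlying error be reported.

```python
class BrokenError(Exception):
    """Hypothetical stand-in for an exception with a misbehaving __str__."""

    def __str__(self):
        # Returning a non-string here is the bug being illustrated.
        return 42


def describe(exc):
    """Format an exception for a log message without crashing on bad __str__."""
    try:
        # %s calls str(exc), which raises TypeError if __str__ is broken.
        return "unexpected error: %s" % exc
    except TypeError as type_error:
        # repr() does not go through __str__, so it still works.
        return "unexpected error: %r (str() failed: %s)" % (exc, type_error)


if __name__ == "__main__":
    print(describe(ValueError("ordinary exception")))
    print(describe(BrokenError()))
```

A fallback like this trades a tidy message for one that at least names the exception type, which is the piece of information that was lost in this thread.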

In particular, I'd like a log where renewal fails not due to a rate limit. Running grep -lr SysCallError /var/log/letsencrypt may help with this.

Alternatively, sharing all of your logs with me wouldn't hurt and would provide me with more information. If you'd like to do this privately, you can email them to my current username @eff.org.

Inbound, let me wrap them up.

I sure hope this is a bug and not just something about the way I got my certs or otherwise messed up. I'd hate to have you guys digging into it this much if it was user error. Regardless, thanks a lot. I appreciate the free certs and your efforts here.

I won't have time to try the third domain option till later tonight. I know it may not take long, but with nginx I can either do something right away or it takes 4 hours, lol. There is no middle ground. I believe all I'll need is a new record in my DNS and a server block set up for the new domain's requests. Should be easy in theory.

Thanks again guys, and any girls reading to try to help that may not have needed to say anything yet.

You should have got that log zip by now, BMW. Let me know if you haven't.

OMG, adding a domain worked as far as issuing a certificate, and I am getting a much better handle on this whole certificate thing.

Maybe I figured out the problem?

When I got the cert I had trouble. It was months ago, and I've been devving on a Minecraft server and have done a million things since then, none of which I should be attempting, so I don't remember.

But back then I did it twice. Now, after I got the three-domain cert, I tried a dry-run renewal, and it failed!! But I looked and it was a parsing error dealing with mycert.conf-0001, and this new three-domain cert is 0002. So I looked in the renewal folder and I had both a mydomain-0001.com.conf AND a mydomain.com.conf.

So maybe this was user error after all: I had two renewal confs for the same domain. After I renamed them both, my third cert renewed in a dry run, and a real run detected it wasn't due for renewal yet!

Maybe it's still a bug, because two confs shouldn't do that or something. If you need those two old conf files, I have them. But as for me, my server is up and all green-locked and I am good. I am not going to automate the renewal yet; I'll wait till I get the email (I thought I got an email before) so I can watch it when it happens live after a few dry tests.

Thanks again for looking into this, and I am sorry if I wasted your time. The good news for me is I think I have a decent grasp of this now, and it's not so complicated with invisible moving parts like I thought. So this is how I learn. Sorry I dragged you guys into it!

I got the zip file. Thanks for sending it to me.

Yes, when you have multiple certificates named after the same domain name, you're supposed to have one renewal configuration file per certificate. (However, getting into this position in the first place is often a sign that something unintended happened.) This is an intended design and is definitely not supposed to break renewals in any way.
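If you want to see at a glance whether a domain has more than one renewal configuration file, something like the following may help. It is only a rough sketch: group_renewal_configs is a hypothetical helper, the /etc/letsencrypt/renewal path is the usual default, and the "-NNNN" suffix pattern (as in example.com-0001.conf) is an assumption about how the extra files are named. Having several files for one domain is not an error by itself; this just lists them so you can decide which ones are still wanted.

```python
import os
import re
from collections import defaultdict

# Assumption: additional lineages for the same name get a "-NNNN" suffix,
# e.g. example.com-0001.conf alongside example.com.conf.
SUFFIX = re.compile(r"-\d{4}$")


def group_renewal_configs(renewal_dir):
    """Return {base name: [conf files]} for bases with more than one .conf file."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(renewal_dir)):
        if not name.endswith(".conf"):
            continue  # ignores renamed files such as example.com.conf.bak
        lineage = name[:-len(".conf")]
        groups[SUFFIX.sub("", lineage)].append(name)
    return {base: names for base, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    renewal_dir = "/etc/letsencrypt/renewal"  # assumed default location
    if os.path.isdir(renewal_dir):
        for base, names in group_renewal_configs(renewal_dir).items():
            print("%s: %s" % (base, ", ".join(names)))
```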

Edit: By the way, congratulations on getting your cert working!

Unfortunately, there were a number of things going on.

There were a number of failures due to network issues. Most of these happened as soon as Certbot tried to connect to Let’s Encrypt (and therefore before Certbot issued any certs), but there was one occasion where Certbot’s request to Let’s Encrypt to issue the new cert timed out, causing the output you included above.

The main cause of the problems that caused you to get rate limited without getting a cert, however, was a corrupted /etc/letsencrypt directory. For your own reference, the relevant lines for this in the logs are:

IOError: [Errno 2] No such file or directory: '/etc/letsencrypt/archive/placeholder.com/privkey2.pem'

Did you manually modify any files in /etc/letsencrypt? If not, do you have any additional logs? It appears that some have been deleted.

If you did modify /etc/letsencrypt, I’m sorry to say that Certbot isn’t very robust yet at handling improperly configured directories. What was happening here is we’d download the cert from Let’s Encrypt but then fail when trying to write it to disk. This caused you to get rate limited as Certbot was never able to save a renewed certificate. Even worse, because this problem only occurred when we were trying to write the certificate, --dry-run (which doesn’t modify your existing certificates) was unaffected. I’ve created https://github.com/certbot/certbot/issues/5009 for us to look into and fix this issue.
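Since --dry-run doesn't catch this kind of problem, a quick manual check for symlinks that point at missing files can be useful before a real renewal. The following is a minimal sketch and not part of Certbot; find_dangling_symlinks is a hypothetical helper, and /etc/letsencrypt/live is assumed to be the directory holding the symlinks (you will likely need root to read it).

```python
import os


def find_dangling_symlinks(root):
    """Return sorted paths of symlinks under root whose targets do not exist."""
    dangling = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it,
            # so exists() is False when the target is missing.
            if os.path.islink(path) and not os.path.exists(path):
                dangling.append(path)
    return sorted(dangling)


if __name__ == "__main__":
    # Assumed default location; adjust if your configuration lives elsewhere.
    for path in find_dangling_symlinks("/etc/letsencrypt/live"):
        print("dangling: %s -> %s" % (path, os.readlink(path)))
```

Any path this prints is a link like the one described in this thread, where the live entry pointed into an archive directory that was not there.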

Moving forward for you, I recommend you delete the two old configuration files in /etc/letsencrypt/renewal. Alternatively, you can move the entire /etc/letsencrypt directory somewhere else for temporary safekeeping, run Certbot again to obtain a cert for your three domains, and after you’re sure that works, delete your previous /etc/letsencrypt directory. This will give you a new configuration in a clean state.

With all that said, I’m glad you’ve got things working and thanks for taking the time to help us debug the problem. Again, if you don’t think you modified /etc/letsencrypt yourself, I’d love any additional information you can provide.

OK, missing logs. Early on, after it wasn't working, I deleted all the logs, ran it, and then tried looking at the one log in an attempt to get rid of all the noise. It had been failing autorenew for a month or so, and I was overwhelmed and wanted to make sure I was looking at the log for that event only.

I did not modify /etc/letsencrypt beyond copying the placeholder.com.001 folder in the archive directory and renaming the copy with a www. I left the original, so archive had a placeholder.com-001 directory and, after I duplicated and renamed it, it had both. I had noticed the live privkey was symlinked to a nonexistent directory with the www in front; that is why I duplicated the non-www directory, to try to give it something to find. This was after getting the IOError Errno 2 the first few times.

This is a super cheap 256 MB of RAM VPS host I got through an ad on lowendbox (love that site) for 12 bucks a year. Maybe it was as simple as disk I/O problems timing out? It's not very spunky. Go to the now-working site and you will see it's quite chunkety.

That's about the whole story. I did that over a few weeks, waiting every time it limited out.

The two old configs I renamed with different extensions (the ol' .bak) and left in that directory. Now I need to wait 60 days to try it? Again, I am not cronning it yet so I can watch when it renews, but I am hopeful!

If there is anything else I can tell you, don't hesitate to let me know. Thanks again for your diligence and thoroughness.

Trevor C

But you don't know how the live placeholder.com ended up symlinked to the www.placeholder.com archive directory? That's strange. We have yet to see a configuration directory that was corrupted by Certbot itself. If you (or anyone else) become able to provide instructions for how to reproduce this, please let me know, as this would be a high-priority bug to fix.

The two old configs I renamed with different extensions (the ol' .bak) and left in that directory. Now I need to wait 60 days to try it? Again, I am not cronning it yet so I can watch when it renews, but I am hopeful!

Changing the extension of those files (such as appending .bak) should work just fine. If you want, you can force a real renewal with --force-renewal, but doing this is still subject to rate limits.

We're happy to help, and again, thanks for taking the time to provide us with so much information to help us improve Certbot.
