Deploying Boulder in Production

Agreed…
But why am I getting the text/html error?

Hey,

You need to point Certbot at Boulder’s directory URL, not at the Boulder home page.

For example,

certbot register --server https://boulder.ctao6.net/directory --register-unsafely-without-email

Now, your other problem, as already pointed out, is that the absolute URLs generated by Boulder point to localhost:4000.

To solve this, in your nginx proxy_pass configuration, you need to pass two headers:

  1. Host: boulder.ctao6.net
  2. X-Forwarded-Proto: https
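
For reference, here is a minimal sketch of what those two headers might look like in an nginx location block, plus a quick check that the directory no longer advertises localhost:4000. The function names are made up for illustration, and the upstream address http://127.0.0.1:4000 is an assumption based on the localhost:4000 URLs mentioned above; adjust it to wherever Boulder actually listens.

```shell
# Hypothetical helper: print an nginx location block that passes the two
# headers. The upstream address is an assumption, not taken from Boulder docs.
print_boulder_proxy_block() {
    cat <<'EOF'
location / {
    proxy_pass http://127.0.0.1:4000;
    proxy_set_header Host boulder.ctao6.net;
    proxy_set_header X-Forwarded-Proto https;
}
EOF
}

# Hypothetical check: read an ACME directory document on stdin and fail if
# any URL in it still points at localhost:4000.
directory_urls_ok() {
    if grep -q 'localhost:4000'; then
        echo "directory still advertises localhost:4000" >&2
        return 1
    fi
    return 0
}

# Usage, after reloading nginx (needs network access to your instance):
#   curl -s https://boulder.ctao6.net/directory | directory_urls_ok
```

If the check still fails after adding the headers, the proxy is probably not the one serving the request, or nginx was not reloaded.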

I’m quite sure that the Docker environments included with Boulder are not intended for production use. It’s been raised before, but there really isn’t any public “production ops manual” available for Boulder.

Thanks!
That gets me to the next stage.
Now it is claiming that my domain is not registered...
(I removed the real domain name from the printout because I'm not sure I want it on a public forum.)
I can include it if it is important.

When I run certbot I get the error below:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator nginx, Installer nginx
Obtaining a new certificate
Performing the following challenges:
tls-sni-01 challenge for censored_server_name.com
Waiting for verification...
Cleaning up challenges
Failed authorization procedure. censored_server_name.com (tls-sni-01): urn:acme:error:unknownHost :: The server could not resolve a domain name :: No valid IP addresses found for censored_server_name.com

IMPORTANT NOTES:

  • The following errors were reported by the server:

    Domain: censored_server_name.com
    Type: unknownHost
    Detail: No valid IP addresses found for censored_server_name.com

    To fix these errors, please make sure that your domain name was
    entered correctly and the DNS A/AAAA record(s) for that domain
    contain(s) the right IP address.

  • Your account credentials have been saved in your Certbot
    configuration directory at /etc/letsencrypt. You should make a
    secure backup of this folder now. This configuration directory will
    also contain certificates and private keys obtained by Certbot so
    making regular backups of this folder is ideal.

All of my servers and containers can resolve this address.
However, when I log in to the Boulder Docker container, it cannot resolve any DNS names (not even google.com):

docker exec -i -t boulder_boulder_run_1 curl www.google.com
curl: (6) Could not resolve host: www.google.com

Is this somehow related to the FAKE_DNS parameter mentioned in the letsencrypt/boulder GitHub repository? I initialized this to boulder.ctao6.net, but that was probably a bad choice.

So I think that the Docker environment for Boulder does not provide real DNS resolvers, and instead points to something called challtestsrv (which does not do real DNS).

In production, Let’s Encrypt runs Unbound (https://www.unbound.net/) for its resolvers.

I think you need to set up some resolvers and then configure the validation authority (test/config/va.json) to use them:

    "dnsResolvers": [
      "127.0.0.1:8053",
      "127.0.0.1:8054"
    ],

You will need to run your own recursive nameservers (rather than using 1.1.1.1 or 8.8.8.8) because you need to make queries without any caching happening whatsoever.

You can configure Unbound with https://github.com/jsha/unboundtest/blob/master/unbound.conf as a template that vaguely mimics what they use in production.

I really have no idea what FAKE_DNS is about - it appears to still involve challtestsrv. Maybe @cpu can comment … or provide better advice on how to approach setting Boulder up for real use.

Disclaimer: this is all gleaned info; I’m not sure any of it is current or accurate.

If I had access to the Dockerfile, I could try to reverse engineer and debug it.
However, the docker-compose.yml only pulls a prebuilt Docker image: letsencrypt/boulder-tools-go${TRAVIS_GO_VERSION:-1.10.2}:2018-06-12
Any idea where I can see the Dockerfile for that?
The answer to my DNS problems might be in there…

I can confirm what @_az says is correct. You’ll need to change the dnsResolvers settings to point at a DNS server that can resolve the hostnames you want to issue for. If those hostnames are only resolvable from inside your internal network, you’ll need to point at your internal nameserver.

Also, just to make sure: You understand that certificates from your own Boulder instance will not be trusted by most browsers, right? Only certificates from a publicly trusted CA will be trusted by default in browsers.

Sure, this is not meant for web browsing. This is meant for some custom software.

I tried doing this in both files (changing my DNS to multiple resolvers and relaunching), but I haven't got it to work yet. The Boulder container itself won't even ping google.com. I am running this on AWS, so I guess any DNS resolver should be fine (I am using Google's 8.8.8.8).
I notice that the default JSON files have port numbers in them. Do I need to add port numbers to my DNS resolvers?

OK,
Found the problem!
I needed to specify port 53 explicitly.
“8.8.8.8” on its own was not enough; it had to be 8.8.8.8:53.
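
In other words, the dnsResolvers entries seem to need the host:port form. A small sketch of a helper that adds the default DNS port when it is missing (the function name is hypothetical, and it only handles IPv4 addresses and hostnames, not bare IPv6 literals):

```shell
# Hypothetical helper: append port 53 when the resolver is given without a port.
# Anything that already contains a colon is assumed to carry its own port.
with_dns_port() {
    case "$1" in
        *:*) printf '%s\n' "$1" ;;      # already host:port, leave untouched
        *)   printf '%s:53\n' "$1" ;;   # add the standard DNS port
    esac
}

# Usage:
#   DNS_SERVER_WITH_PORT=$(with_dns_port 8.8.8.8)
```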

Be cautious about using public resolvers. They will cache answers (such as for _acme-challenge.example.org), which will cause Boulder to incorrectly fail your authorizations, depending on timing.

Thanks.
At this stage, I just need it to work (at all).
My next stage will be to set up my own resolver.
Do you have any recommendations?

Now that the DNS issue is resolved,
certbot gives me the following error:

2018-07-16 09:04:21,667:DEBUG:urllib3.connectionpool:https://boulder.ctao6.net:443 "POST /acme/new-reg HTTP/1.1" 500 111
2018-07-16 09:04:21,667:DEBUG:acme.client:Received response:
HTTP 500
Server: nginx/1.10.3 (Ubuntu)
Date: Mon, 16 Jul 2018 09:04:21 GMT
Content-Type: application/problem+json
Content-Length: 111
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
Replay-Nonce: hn99XZMmsGSWiDs2SR3EwHMwkY2xRCFM1L78Ev3wjg0

{
  "type": "urn:acme:error:serverInternal",
  "detail": "Failed to get registration by key",
  "status": 500
}
2018-07-16 09:04:21,668:DEBUG:acme.client:Storing nonce: hn99XZMmsGSWiDs2SR3EwHMwkY2xRCFM1L78Ev3wjg0
2018-07-16 09:04:21,668:DEBUG:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/usr/bin/certbot", line 11, in <module>
    load_entry_point('certbot==0.25.0', 'console_scripts', 'certbot')()
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1323, in main
    return config.func(config, plugins)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 1078, in run
    le_client = _init_le_client(config, authenticator, installer)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 642, in _init_le_client
    acc, acme = _determine_account(config)
  File "/usr/lib/python3/dist-packages/certbot/main.py", line 521, in _determine_account
    config, account_storage, tos_cb=_tos_cb)
  File "/usr/lib/python3/dist-packages/certbot/client.py", line 174, in register
    regr = perform_registration(acme, config, tos_cb)
  File "/usr/lib/python3/dist-packages/certbot/client.py", line 199, in perform_registration
    tos_cb)
  File "/usr/lib/python3/dist-packages/acme/client.py", line 747, in new_account_and_tos
    regr = self.client.register(regr)
  File "/usr/lib/python3/dist-packages/acme/client.py", line 284, in register
    response = self._post(self.directory[new_reg], new_reg)
  File "/usr/lib/python3/dist-packages/acme/client.py", line 93, in _post
    return self.net.post(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/acme/client.py", line 1082, in post
    return self._post_once(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/acme/client.py", line 1096, in _post_once
    return self._check_response(response, content_type=content_type)
  File "/usr/lib/python3/dist-packages/acme/client.py", line 956, in _check_response
    raise messages.Error.from_json(jobj)
acme.messages.Error: urn:acme:error:serverInternal :: The server experienced an internal error :: Failed to get registration by key
2018-07-16 09:04:21,669:ERROR:certbot.log:An unexpected error occurred:
2018-07-16 09:04:21,669:ERROR:certbot.log:The server experienced an internal error :: Failed to get registration by key

EDIT:
After tinkering with the docker-compose file a little bit I found some clues:

“curl www.google.com” works if I remove this line from the docker-compose.yml:
dns: 10.77.77.77

sed -i -e "s/dns: 10.77.77.77//g" docker-compose.yml

However, once I do this, I get the error “boulder_1 | E133216 boulder-wfe [AUDIT] grpc: the naming watcher stops working due to lookup ra.boulder on 127.0.0.11:53: read udp 127.0.0.1:46231->127.0.0.11:53: i/o timeout.” (naturally, ra1.boulder cannot be resolved).

So when I remove this line, I can resolve google.com but not ra1.boulder. But if I leave it I cannot resolve google.com.

It’s hard to say without knowing more about your use case, but if you are just deploying internal certificates to a set of hosts you control, you may find minica much easier to use: https://github.com/jsha/minica.

This server (currently in prototype stage) will be public.
Using "Boulder" is a requirement.

So after I hack the configuration together, and get a basic setup up and running, I will probably do a deeper dive on a proper deployment setup.

I managed to hack the docker-compose file so that it uses both 8.8.8.8 and bluenet for DNS.
Both curl and internal dns lookup seem to be working fine (assuming my hack was not bad).

sed -i -e  "s/        dns: 10.77.77.77/        dns:\n          - 10.77.77.77\n          - 8.8.8.8/g" docker-compose.yml
sed -i -e  "s/              - sa1.boulder/              - sa.boulder\n              - ca.boulder\n              - ra.boulder\n              - va.boulder\n              - sa1.boulder/g" docker-compose.yml

However, now I get a new error. When I run certbot, I see that Boulder is trying to challenge my NGINX on port 5001 instead of the normal HTTP ports. If I run this against letsencrypt.org instead of Boulder, it works on the regular HTTP ports.

This is the log from Boulder. Why is it challenging on port 5001 instead of using HTTP or HTTPS?
(I censored my real domain and IP in this log because I don’t want them published on this forum, but I can publish them if it’s important.)

boulder_1    | I082633 boulder-va [AUDIT] Checked CAA records for censored.domaine.name, [Present: false, Account ID: 2, Challenge: tls-sni-01, Valid for issuance: true] Records=null
boulder_1    | I082633 boulder-va tls-sni-01 [{dns censored.domaine.name}] Attempting to validate for 123.123.123.123:5001 dc5324e4d0e679a52b47a646fee4e025.97cd89ca360817c235e9aaf889564670.acme.invalid
boulder_1    | I082643 boulder-va tls-sni-01 connection failure for {dns censored.domaine.name}. err=[&net.OpError{Op:"dial", Net:"tcp", Source:net.Addr(nil), Addr:(*net.TCPAddr)(0xc4203e1e60), Err:(*poll.TimeoutError)(0xdd96a0)}] errStr=[dial tcp 123.123.123.123:5001: i/o timeout]
boulder_1    | I082643 boulder-va [AUDIT] Validation result JSON={"ID":"UjmtTqZu7IB2CUJCnPbMjWpeSaNqKaUM5GvtD-lUhh8","Requester":2,"Hostname":"censored.domaine.name","ValidationRecords":[{"hostname":"censored.domaine.name","port":"5001","addressesResolved":["123.123.123.123"],"addressUsed":"123.123.123.123"}],"Challenge":{"id":6,"type":"tls-sni-01","status":"invalid","error":{"type":"connection","detail":"Timeout during connect (likely firewall problem)","status":400},"token":"sixbh8-ddZMXU8OdRRouc4eRLhKSA-2C8_qmWI1Jwx4","keyAuthorization":"sixbh8-ddZMXU8OdRRouc4eRLhKSA-2C8_qmWI1Jwx4.YrFtUtSnwWALydhpQ12kJSXQqcqAPIzSSSAJG8qH_ek","validationRecord":[{"hostname":"censored.domaine.name","port":"5001","addressesResolved":["123.123.123.123"],"addressUsed":"123.123.123.123"}]},"RequestTime":"2018-07-17T08:26:33.037906453Z","ResponseTime":"0001-01-01T00:00:00Z","Error":"connection :: Timeout during connect (likely firewall problem)"}
boulder_1    | I082643 boulder-va Validations: {ID:UjmtTqZu7IB2CUJCnPbMjWpeSaNqKaUM5GvtD-lUhh8 Identifier:{Type: Value:} RegistrationID:2 Status: Expires:<nil> Challenges:[] Combinations:[] Wildcard:false}

Check va.json. You will find the ports used for challenge validation there:

    "portConfig": {
      "httpPort": 5002,
      "httpsPort": 5001,
      "tlsPort": 5001
    },
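
If your clients answer challenges on the standard ports, a sketch like the following rewrites all three values in one pass. It assumes the portConfig layout shown above and that you want 80 for HTTP and 443 for the TLS-based challenges; the function name is hypothetical.

```shell
# Hypothetical filter: read va.json on stdin and write a copy to stdout with
# the three challenge ports set to the standard values.
use_standard_ports() {
    sed -e 's/"httpPort": *[0-9][0-9]*/"httpPort": 80/' \
        -e 's/"httpsPort": *[0-9][0-9]*/"httpsPort": 443/' \
        -e 's/"tlsPort": *[0-9][0-9]*/"tlsPort": 443/'
}

# Usage:
#   use_standard_ports < test/config/va.json > va.json.new
#   mv va.json.new test/config/va.json
```

Writing to a new file first makes it easy to diff the result before replacing the original.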

Eureka!
It works.

On the off chance that anyone reads this thread in the future, here is what got it working for me (make sure you scroll all the way to the right for the sed commands):

    # Pin the Boulder revision these edits were written against.
    git checkout 36a83150adf342764eb5b31271337a51f2fd9e15
    DNS_SERVER=8.8.8.8
    DNS_SERVER_WITH_PORT=$DNS_SERVER:53
    # docker-compose.yml: add a second DNS server and un-numbered service aliases.
    sed -i -e "s/        dns: 10.77.77.77/        dns:\n          - 10.77.77.77\n          - $DNS_SERVER/g" docker-compose.yml
    sed -i -e "s/              - sa1.boulder/              - sa.boulder\n              - ca.boulder\n              - ra.boulder\n              - va.boulder\n              - sa1.boulder/g" docker-compose.yml
    # ra.json and va.json: replace the two test resolvers with the real one.
    echo "Changing ra.json dns servers for production"
    sed -i -e "s/127.0.0.1:8054/$DNS_SERVER_WITH_PORT/g" test/config/ra.json
    sed -i -e "s/\"127.0.0.1:8053\",/ /g" test/config/ra.json
    echo "Changing va.json dns servers for production"
    sed -i -e "s/127.0.0.1:8054/$DNS_SERVER_WITH_PORT/g" test/config/va.json
    sed -i -e "s/\"127.0.0.1:8053\",/ /g" test/config/va.json
    # va.json: validate TLS challenges on 443 instead of the test port 5001.
    echo "changing challenge port configuration"
    sed -i -e "s/: 5001/: 443/g" test/config/va.json

and here is the generated diff

git diff
diff --git a/docker-compose.yml b/docker-compose.yml
index 8332e9a..81e73f5 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -14,6 +14,10 @@ services:
           bluenet:
             ipv4_address: 10.77.77.77
             aliases:
+              - sa.boulder
+              - ca.boulder
+              - ra.boulder
+              - va.boulder
               - sa1.boulder
               - ca1.boulder
               - ra1.boulder
@@ -35,7 +39,9 @@ services:
         # forward the query to this IP (running sd-test-srv). We have
         # special logic there that will return multiple IP addresses for
         # service names.
-        dns: 10.77.77.77
+        dns:
+          - 10.77.77.77
+          - 8.8.8.8
         ports:
           - 4000:4000 # ACME
           - 4001:4001 # ACMEv2
diff --git a/test/config/ra.json b/test/config/ra.json
index da4a2a2..34be49e 100644
--- a/test/config/ra.json
+++ b/test/config/ra.json
@@ -5,8 +5,8 @@
     "maxContactsPerRegistration": 100,
     "dnsTries": 3,
     "dnsResolvers": [
-      "127.0.0.1:8053",
-      "127.0.0.1:8054"
+       
+      "8.8.8.8:53"
     ],
     "debugAddr": ":8002",
     "hostnamePolicyFile": "test/hostname-policy.json",
diff --git a/test/config/va.json b/test/config/va.json
index 579225f..179ec13 100644
--- a/test/config/va.json
+++ b/test/config/va.json
@@ -4,14 +4,14 @@
     "debugAddr": ":8004",
     "portConfig": {
       "httpPort": 5002,
-      "httpsPort": 5001,
-      "tlsPort": 5001
+      "httpsPort": 443,
+      "tlsPort": 443
     },
     "maxConcurrentRPCServerRequests": 100000,
     "dnsTries": 3,
     "dnsResolvers": [
-      "127.0.0.1:8053",
-      "127.0.0.1:8054"
+       
+      "8.8.8.8:53"
     ],
     "issuerDomain": "happy-hacker-ca.invalid",
     "tls": {

You don’t want 8.8.8.8 in the docker-compose DNS, you want it only in the VA config.

Basically there are two ways Boulder uses DNS:

  • service discovery: looking up other boulder components. This is the docker-compose setting.
  • validation: looking up domain names to validate. This is the va.json setting.

If you have both 8.8.8.8 and 10.77.77.77 in your docker-compose.yml, it will probably work but is likely to be flaky.
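
To check that the docker-compose DNS setting is still used only for service discovery, something like this sketch can list any extra resolvers that crept in. It assumes 10.77.77.77 is the service-discovery address, as in the diff above, and it only understands the two simple YAML forms shown in this thread; the function name is hypothetical.

```shell
# Hypothetical check: read a docker-compose file on stdin and print every
# resolver listed under a "dns:" key other than 10.77.77.77.
extra_compose_dns() {
    awk '
        /^ *dns:/ {
            in_dns = 1
            sub(/^ *dns: */, "")          # keep an inline value, if any
            if ($0 != "") {
                if ($0 != "10.77.77.77") print $0
                in_dns = 0
            }
            next
        }
        in_dns && /^ *- */ {
            sub(/^ *- */, "")             # strip the list marker
            if ($0 != "10.77.77.77") print $0
            next
        }
        { in_dns = 0 }                    # any other line ends the dns list
    '
}

# Usage:
#   extra_compose_dns < docker-compose.yml
```

Empty output means the compose file only points at the service-discovery resolver, and the validation resolvers live solely in va.json.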

Also, it would be useful to share a description of the service you are prototyping. We can probably help guide you around some obstacles you might otherwise encounter.

The main thing to be aware of is that we consider the docker-compose file a testing tool rather than a deployment one, so we don’t guarantee backwards compatibility.

You’ll probably want to review the Component Model (https://github.com/letsencrypt/boulder#component-model) and Deployment and Implementation Guide (https://github.com/letsencrypt/boulder/wiki/Deployment-&-Implementation-Guide).

In short: You should set up a system to run each Boulder component, not use start.py (which our test docker environment does). Because start.py is a testing tool, it has various non-production behaviors, like bringing down all services if a single service fails. It also runs services you don’t need, like ct-test-srv and challtestsrv. You should be good setting up systemd units (or equivalent) for each item in start.py that doesn’t have “test” in the name. The set of boulder binaries hardly ever changes, so this is a one-time cost.
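
As a rough illustration of the systemd approach, the sketch below prints a unit file for one component. Everything specific in it is an assumption: the /opt/boulder/bin and /etc/boulder paths are hypothetical, and the --config flag should be checked against the command lines in start.py for your Boulder revision.

```shell
# Hypothetical generator: print a systemd unit for one Boulder component.
#   $1 = binary name (e.g. boulder-wfe), $2 = path to its JSON config.
print_boulder_unit() (
    component=$1
    config=$2
    cat <<EOF
[Unit]
Description=Boulder component: ${component}
After=network-online.target

[Service]
ExecStart=/opt/boulder/bin/${component} --config ${config}
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
)

# Usage:
#   print_boulder_unit boulder-wfe /etc/boulder/wfe.json \
#       | sudo tee /etc/systemd/system/boulder-wfe.service
```

With one unit per component, a single failing service gets restarted on its own instead of taking the others down with it.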

You can run all Boulder components on a single machine if you like; you can also separate them onto multiple machines talking via gRPC.

You’ll want to check in a copy of test/config/* into your config management software, then make modifications there to fit your local configuration. When you update Boulder, you should compare the updated copies of test/config/* to what you have checked in, and decide for each config value whether you want to copy the updated value from our tests or set your own value.
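
One way to do that comparison is a recursive diff between your checked-in copy and the updated test/config directory. A minimal sketch, with a hypothetical function name and example paths:

```shell
# Hypothetical helper: show differences between your checked-in configs ($1)
# and the configs shipped with the updated Boulder checkout ($2).
# Exits 0 when the trees match and 1 when they differ, like diff itself.
config_drift() {
    diff -ru "$1" "$2"
}

# Usage (paths are examples):
#   config_drift /srv/config-mgmt/boulder boulder/test/config
```

Lines prefixed with "-" come from your copy and lines prefixed with "+" from the updated Boulder tree, so each hunk is one decision about whether to adopt the new value.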
