First certificate behind an external Load Balancer

My domain is: cdp.obdo.dev

I ran this command:

I installed an RKE cluster behind a load balancer.

I installed cert-manager 1.0.3:
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.3/cert-manager.crds.yaml
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v1.0.3
kubectl -n cert-manager rollout status deploy/cert-manager

I installed Rancher:

helm install rancher rancher-stable/rancher --namespace cattle-system --set hostname=cdp.obdo.dev --set ingress.tls.source=letsEncrypt --set letsEncrypt.email=<email>

The certificate is not issued. cert-manager's propagation check expects one response, but the challenge URL returns a different one depending on which IP the request goes to.

It produced this output:

E1020 09:41:34.245669 1 sync.go:183] cert-manager/controller/challenges "msg"="propagation check failed" "error"="did not get expected response when querying endpoint, expected \"UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w.XtUsdNKCseioTqRG85H2jizyozIIzOVqshkq223I_DM\" but got: UEJgSbJKKACGNq2NX1DK7R-l... (truncated)" "dnsName"="cdp.obdo.dev" "resource_kind"="Challenge" "resource_name"="tls-rancher-ingress-bcxgf-1464574410-2707586409" "resource_namespace"="cattle-system" "resource_version"="v1" "type"="HTTP-01"

This is the challenge response when I query a cluster node behind the external load balancer directly:

curl -H 'Host: cdp.obdo.dev' http://146.59.197.177/.well-known/acme-challenge/UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w
UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w.XtUsdNKCseioTqRG85H2jizyozIIzOVqshkq223I_DM

This is the challenge response when I query the load balancer IP (the one the DNS record points to):

curl -H 'Host: cdp.obdo.dev' http://51.91.60.230/.well-known/acme-challenge/UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w
UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w.4E3VCTFsySjUrqnCg0ooULx-3kbdPBygi0aWkvg5Gd8
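For readers comparing the two outputs above: an HTTP-01 response body is the challenge token, a dot, and the thumbprint of the ACME account key that answered. Both responses share the token but differ after the dot. A minimal sketch that makes the comparison explicit (the function names are made up for illustration and are not part of cert-manager or curl):

```shell
#!/bin/sh
# Split an HTTP-01 key authorization ("<token>.<thumbprint>") into its parts.
# Helper names are hypothetical, chosen only for this example.

ka_token()      { printf '%s\n' "${1%%.*}"; }  # text before the first dot
ka_thumbprint() { printf '%s\n' "${1#*.}"; }   # text after the first dot

# Report whether two key authorizations carry the same account key thumbprint.
same_account() {
  if [ "$(ka_thumbprint "$1")" = "$(ka_thumbprint "$2")" ]; then
    echo "same account key"
  else
    echo "different account keys"
  fi
}

# The two responses quoted in this post.
node_response='UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w.XtUsdNKCseioTqRG85H2jizyozIIzOVqshkq223I_DM'
lb_response='UEJgSbJKKACGNq2NX1DK7R-l32tUAr9gsF5AkwdXM5w.4E3VCTFsySjUrqnCg0ooULx-3kbdPBygi0aWkvg5Gd8'

same_account "$node_response" "$lb_response"
```

A differing thumbprint suggests that something other than cert-manager's solver answered the request that went through the load balancer.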

The operating system my web server runs on is (include version): Ubuntu 20.08

My hosting provider, if applicable, is: OVH VPS for servers + OVH Load Balancer

I can login to a root shell on my machine (yes or no, or I don't know): Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): cert-manager 1.0.3

I have installed cert-manager many times without problems, but this is the first time behind an OVH Load Balancer.

Any help would be appreciated!

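The thumbprint in the response you get through the load balancer (4E3VCTFsySjUrqnCg0ooULx-3kbdPBygi0aWkvg5Gd8, the part after the dot) is OVH's Let's Encrypt account key.

Basically, OVH is intercepting the validation request (because it also issues SSL certificates for your domains), so it never reaches your k8s cluster.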

Architecturally, I think you need to decide whether you want to terminate SSL yourself (in which case using OVH's managed HTTP load balancer doesn't really make sense), or whether you want OVH to terminate SSL for you (in which case it doesn't really make sense to run cert-manager).

You could also change the OVH Load Balancer to be TCP mode instead of HTTP mode, and that will prevent the ACME requests from being intercepted.

I want to terminate SSL myself. So what should I replace the Network Load Balancer with? I still need something to balance load between my 3 RKE nodes.

(thanks for your rapid answer)

You could change the OVH load balancer to be TCP mode only, instead of HTTP.

It will allow you to terminate SSL yourself, and it will allow your ClusterIssuer to issue certificates without OVH interfering.

This assumes that you do not rely on any HTTP-specific features of the load balancer (like functionality relating to HTTP headers).

OK, I changed the LB to TCP mode. The error changed:

E1020 10:18:13.651211       1 controller.go:158] cert-manager/controller/CertificateKeyManager "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-rancher-ingress\": the object has been modified; please apply your changes to the latest version and try again" "key"="cattle-system/tls-rancher-ingress" 
I1020 10:18:13.678585       1 conditions.go:233] Setting lastTransitionTime for CertificateRequest "tls-rancher-ingress-kvsq2" condition "Ready" to 2020-10-20 10:18:13.678577536 +0000 UTC m=+207.356741017
I1020 10:18:14.515667       1 setup.go:270] cert-manager/controller/issuers "msg"="verified existing registration with ACME server" "related_resource_kind"="Secret" "related_resource_name"="letsencrypt-production" "related_resource_namespace"="cattle-system" "resource_kind"="Issuer" "resource_name"="rancher" "resource_namespace"="cattle-system" "resource_version"="v1" 
I1020 10:18:14.515704       1 conditions.go:92] Setting lastTransitionTime for Issuer "rancher" condition "Ready" to 2020-10-20 10:18:14.515696006 +0000 UTC m=+208.193859547
I1020 10:18:14.535468       1 setup.go:170] cert-manager/controller/issuers "msg"="skipping re-verifying ACME account as cached registration details look sufficient" "related_resource_kind"="Secret" "related_resource_name"="letsencrypt-production" "related_resource_namespace"="cattle-system" "resource_kind"="Issuer" "resource_name"="rancher" "resource_namespace"="cattle-system" "resource_version"="v1" 
E1020 10:18:14.649199       1 controller.go:158] cert-manager/controller/certificaterequests-issuer-acme "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificaterequests.cert-manager.io \"tls-rancher-ingress-kvsq2\": the object has been modified; please apply your changes to the latest version and try again" "key"="cattle-system/tls-rancher-ingress-kvsq2" 
I1020 10:18:16.651214       1 acme.go:184] cert-manager/controller/certificaterequests-issuer-acme/sign "msg"="certificate issued" "related_resource_kind"="Order" "related_resource_name"="tls-rancher-ingress-kvsq2-1464574410" "related_resource_namespace"="cattle-system" "related_resource_version"="v1" "resource_kind"="CertificateRequest" "resource_name"="tls-rancher-ingress-kvsq2" "resource_namespace"="cattle-system" "resource_version"="v1" 
I1020 10:18:16.651694       1 conditions.go:222] Found status change for CertificateRequest "tls-rancher-ingress-kvsq2" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2020-10-20 10:18:16.651686744 +0000 UTC m=+210.329850226
I1020 10:18:16.704803       1 conditions.go:162] Found status change for Certificate "tls-rancher-ingress" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2020-10-20 10:18:16.704795346 +0000 UTC m=+210.382958830
E1020 10:18:16.731947       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-rancher-ingress\": the object has been modified; please apply your changes to the latest version and try again" "key"="cattle-system/tls-rancher-ingress" 
I1020 10:18:16.732547       1 conditions.go:162] Found status change for Certificate "tls-rancher-ingress" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2020-10-20 10:18:16.732540313 +0000 UTC m=+210.410703797

E1020 10:18:16.736830       1 controller.go:158] cert-manager/controller/CertificateIssuing "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-rancher-ingress\": the object has been modified; please apply your changes to the latest version and try again" "key"="cattle-system/tls-rancher-ingress" 
E1020 10:18:17.117403       1 controller.go:158] cert-manager/controller/CertificateReadiness "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-rancher-ingress\": the object has been modified; please apply your changes to the latest version and try again" "key"="cattle-system/tls-rancher-ingress" 
I1020 10:18:17.118717       1 conditions.go:162] Found status change for Certificate "tls-rancher-ingress" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2020-10-20 10:18:17.118707602 +0000 UTC m=+210.796871088
E1020 10:18:17.253584       1 controller.go:158] cert-manager/controller/CertificateKeyManager "msg"="re-queuing item  due to error processing" "error"="Operation cannot be fulfilled on certificates.cert-manager.io \"tls-rancher-ingress\": the object has been modified; please apply your changes to the latest version and try again" "key"="cattle-system/tls-rancher-ingress"

The solver is gone now, but no certificate was issued.

I reinstalled Rancher and got this error. I then uninstalled Rancher, deleted the secrets, and reinstalled, but I get the same error...

I just checked, and there is a "certificate issued" line in the log.

And this command gives:
k get certificate -A
NAMESPACE       NAME                  READY   SECRET                AGE
cattle-system   tls-rancher-ingress   True    tls-rancher-ingress   11m
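Since the READY column is what settles whether issuance succeeded, here is a small sketch that filters that listing down to the certificates that are not ready. The `not_ready` helper is hypothetical, and the second sample row is invented purely to show a non-empty result:

```shell
#!/bin/sh
# Print "<namespace>/<name>" for every certificate whose READY column is not "True".
# Expects the table printed by `kubectl get certificate -A` on stdin.
# not_ready is a made-up helper name, not a kubectl feature.
not_ready() {
  awk 'NR > 1 && $3 != "True" { print $1 "/" $2 }'
}

# Sample input: the first data row is from this thread, the second is invented.
not_ready <<'EOF'
NAMESPACE       NAME                  READY   SECRET                AGE
cattle-system   tls-rancher-ingress   True    tls-rancher-ingress   11m
demo            pending-cert          False   pending-cert          2m
EOF
```

Against a live cluster the same filter would be used as `kubectl get certificate -A | not_ready`; no output means every certificate reports Ready.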

Looking at the cert-manager issue tracker it looks like it could be a rate limiting issue.

Looks like there's a certificate now:

$ curl -I --resolve cdp.obdo.dev:443:146.59.197.177 https://cdp.obdo.dev
HTTP/2 200

but not on the 51.91.60.230 IP.

Did you set up the TCP load balancer to forward 443 as well?
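To repeat the check above for both addresses, something like the following sketch could be used. It pins the hostname to one IP with curl's `--resolve` option and reports whether the HTTPS request succeeded. The `check_https` name and the `CURL` override variable are assumptions made for this example:

```shell
#!/bin/sh
# Check whether https://<host> answers with a valid certificate when <host>
# is forced to resolve to a specific IP.
# check_https and the CURL variable are illustrative, not an existing tool.
# Usage: check_https <host> <ip>
check_https() {
  if ${CURL:-curl} -sS -o /dev/null -I --max-time 10 \
       --resolve "$1:443:$2" "https://$1"; then
    echo "$2: HTTPS OK"
  else
    echo "$2: HTTPS FAILED"
  fi
}

# Example calls (these perform real network requests):
#   check_https cdp.obdo.dev 146.59.197.177   # a node behind the load balancer
#   check_https cdp.obdo.dev 51.91.60.230     # the load balancer itself
```

If the node answers but the load balancer does not, the forwarding of port 443 on the load balancer is the likely place to look.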

Yes, I did. But after checking, there was an error in the port configuration. I set up 443 again, correctly this time...

And it WORKS.
