The operating system my web server runs on is (include version):
Ubuntu 20.04
I can login to a root shell on my machine (yes or no, or I don't know):
Yes
I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
No
The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
0.40.0
I chose the forced redirect when setting up the cert. It works fine for users, but robots are served the http version. For example, Google is indexing some http pages, and Redirect Checker (and other tools) show a 200 when given an http link, no matter which user agent is chosen.
I am using Full SSL in Cloudflare and have also tried Flexible. I have set the site URL as https://www in WordPress.
The sites-available .conf looks like this. If I uncomment lines 20-23 inclusive (the commented-out Rewrite lines) I get an http > https > http redirect chain.
# Added to mitigate CVE-2017-8295 vulnerability
UseCanonicalName On
<VirtualHost *:80>
ServerAdmin X
ServerName broadband4europe(d0t)com
ServerAlias www.broadband4europe(d0t)com
DocumentRoot /var/www/broadband4europe(d0t)com/public_html
<Directory /var/www/broadband4europe(d0t)com/public_html/>
Options FollowSymLinks
AllowOverride All
Require all granted
</Directory>
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
#RewriteEngine on
#RewriteCond %{SERVER_NAME} =broadband4europe(d0t)com [OR]
#RewriteCond %{SERVER_NAME} =www.broadband4europe(d0t)com
#RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>
I have other sites running on this server with no issues.
Given that Apache's redirects are commented out, any redirect is probably being done by Cloudflare. Do you have cache rules or something similar set up that would serve robots.txt from there?
In any event, server redirect strategy is not much of a Let's Encrypt issue. I think asking about this on the Cloudflare forum will help you more directly.
CF SSL/TLS is already set to "Full" and I have already tried "Flexible" as mentioned in the OP. This ("Full" plus Let's Encrypt) is the same setup as I have on all of my other sites including those running on this server.
There are no specific caching rules set up in Cloudflare. Robots.txt looks the same as the other sites I have set up that are not experiencing this issue.
I have tried turning off CF SSL; this results in a redirect loop when accessing the site with either the http or the https version.
I have tried adding this to .htaccess and restarting Apache to no avail.
RewriteEngine On
RewriteBase /
RewriteCond %{HTTPS} =on
RewriteCond %{HTTP_HOST} ^www.broadband4europe(d0t)com
sites-available le-ssl.conf looks like this:
<IfModule mod_ssl.c>
<VirtualHost *:443>
ServerAdmin X
ServerName broadband4europe(d0t)com
ServerAlias www.broadband4europe(d0t)com
DocumentRoot /var/www/broadband4europe(d0t)com/public_html
<Directory /var/www/broadband4europe(d0t)com/public_html/>
Options FollowSymLinks
AllowOverride All
Require all granted
</Directory>
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
Include /etc/letsencrypt/options-ssl-apache.conf
SSLCertificateFile /etc/letsencrypt/live/broadband4europe(d0t)com/fullchain.pem
SSLCertificateKeyFile /etc/letsencrypt/live/broadband4europe(d0t)com/privkey.pem
</VirtualHost>
</IfModule>
Bots are not having issues accessing the site. They are not being redirected to https when viewing http URLs.
Googlebot (whichever version they are using right now to crawl from a user submission in GSC) and Toolbot (which Redirect Checker is using) are two examples. The same issue occurs with the other user agents that website lets you select. I am not sure how to extract the exact user agent string of these two bots.
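One way to get at the exact user agent strings, as a rough sketch: both vhosts above log in the combined format, whose last quoted field is the User-Agent header. The Python below tallies (user agent, status) pairs from an access log. The path /var/log/apache2/access.log is an assumption (the usual value of APACHE_LOG_DIR on Ubuntu), and the names parse_line and agents_by_status are made up for this example.

```python
import re
import sys
from collections import Counter

# Apache "combined" format:
#   %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return the named fields of one combined-format line, or None.

    Lines that do not fit the pattern (including ones with escaped
    quotes inside a quoted field) are skipped rather than guessed at.
    """
    match = COMBINED.match(line.strip())
    return match.groupdict() if match else None


def agents_by_status(lines):
    """Count how often each (user agent, status code) pair appears."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry:
            counts[(entry["agent"], entry["status"])] += 1
    return counts


# Only runs when a log path is given on the command line, e.g.
#   python3 agents.py /var/log/apache2/access.log   (assumed path)
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as fh:
        for (agent, status), n in agents_by_status(fh).most_common(20):
            print(f"{n:6d}  {status}  {agent}")
```

Because the configs shown write every vhost and both ports to the same access.log, and the combined format records neither the vhost nor the scheme, this only tells you which agents got which status codes, not whether the request arrived over http or https.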
sudo apachectl -t -D DUMP_VHOSTS returns:
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.16.0.5. Set the 'ServerName' directive globally to suppress this message
VirtualHost configuration:
*:443 is a NameVirtualHost
default server othersite1(d0t)com (/etc/apache2/sites-enabled/000-default-le-ssl.conf:2)
port 443 namevhost othersite1(d0t)com (/etc/apache2/sites-enabled/000-default-le-ssl.conf:2)
alias www.othersite1(d0t)com
port 443 namevhost broadband4europe(d0t)com (/etc/apache2/sites-enabled/broadband4europe(d0t)com-le-ssl.conf:2)
alias www.broadband4europe(d0t)com
*:80 is a NameVirtualHost
default server othersite1(d0t)com (/etc/apache2/sites-enabled/000-default.conf:4)
port 80 namevhost othersite1(d0t)com (/etc/apache2/sites-enabled/000-default.conf:4)
alias www.othersite1(d0t)com
port 80 namevhost broadband4europe(d0t)com (/etc/apache2/sites-enabled/broadband4europe(d0t)com.conf:4)
alias www.broadband4europe(d0t)com
port 80 namevhost othersite2(d0t)com (/etc/apache2/sites-enabled/othersite2(d0t)com.conf:4)
alias www.othersite2(d0t)com
port 80 namevhost othersite3(d0t)com (/etc/apache2/sites-enabled/othersite3(d0t)com.conf:4)
alias www.$domain
I see, so is X-Redirect-By: WordPress not relevant? I don't use WordPress, but I'd risk the assumption that the redirect is being issued by WordPress (not Apache or Cloudflare).
Could X-Redirect-By be contributing to this? If so I can try to disable this response header. That said, I've never had to do that before on WordPress, and I haven't done anything with this install that should affect redirects.
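To see which hop actually carries X-Redirect-By, it may help to walk the redirect chain one request at a time instead of letting a client follow it. Below is a minimal sketch using only Python's standard library. The function names fetch_hop and trace, the HEAD method, and the 10-hop limit are choices made for this example, not anything taken from the thread. It records the status, the X-Redirect-By header and the Server header for each hop, and stops when a URL repeats.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_hop(url, user_agent):
    """Send one HEAD request without following redirects.

    Returns (status, headers) with header names lowercased.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, {k.lower(): v for k, v in resp.getheaders()}
    finally:
        conn.close()


def trace(url, user_agent, max_hops=10, fetch=fetch_hop):
    """Follow Location headers by hand and list every hop.

    Each hop is (url, status, x-redirect-by, server). A repeated URL
    ends the trace with a final entry whose status is "loop".
    """
    hops = []
    seen = set()
    while len(hops) < max_hops:
        status, headers = fetch(url, user_agent)
        hops.append((url, status,
                     headers.get("x-redirect-by"), headers.get("server")))
        location = headers.get("location")
        if status not in REDIRECT_CODES or not location:
            break
        seen.add(url)
        url = urljoin(url, location)  # Location may be relative
        if url in seen:
            hops.append((url, "loop", None, None))
            break
    return hops


# Example call (placeholder host, substitute the real one):
#   for hop in trace("http://www.example.com/", "Googlebot/2.1"):
#       print(hop)
```

A hop that shows X-Redirect-By: WordPress was issued by WordPress itself, while a redirect with no such header and a Server value naming Cloudflare more likely came from the proxy. Running it with a browser user agent and then a bot user agent should show whether the two really get different answers.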