What makes Open Graph checkers unable to detect Open Graph data?

After I added an SSL certificate to my page, Facebook and Twitter can no longer fetch a preview when the link is shared. I have followed the Open Graph protocol and included the following Open Graph tags:

<meta property="og:type" content="article" />
<meta property="og:title" content="Corner Timer: gently make you feel guilty on time-wasting apps" />
<meta property="og:url" content="https://lyminhnhat.com/resources/productivity/corner-timer-gently-make-you-feel-guilty-on-time-wasting-apps/" />
<meta property="og:description" content="Make you feel guilty for your unproductive curiosity" />
<meta property="article:published_time" content="2019-04-26T10:50:30+00:00" />
<meta property="article:modified_time" content="2019-08-06T07:11:42+00:00" />
<meta property="og:site_name" content="Lý Minh Nhật" />
<meta property="og:image" content="https://lyminhnhat.com/wp-content/uploads/2019/04/Screenshot_2019-04-11-11-31-39.png" />
<meta property="og:image:width" content="480" />
<meta property="og:image:height" content="800" />
<meta property="og:locale" content="en_US" />
<meta name="twitter:site" content="@ooker777" />
<meta name="twitter:text:title" content="Corner Timer: gently make you feel guilty on time-wasting apps" />
<meta name="twitter:image" content="https://lyminhnhat.com/wp-content/uploads/2019/04/Screenshot_2019-04-11-11-31-39.png?w=640" />
<meta name="twitter:card" content="summary_large_image" />

However, all three Open Graph checkers I use (OpenGraphCheck.com, Abhinay Rathore’s Open Graph Tester, and Facebook’s Object Debugger) report that there is no Open Graph implementation. There is one exception, though: Iframely’s Embed Codes.

Since all three checkers have a problem with it, this is probably not just a Facebook issue, as suggested in FB OpenGraph og:image not pulling images (possibly https?). Nevertheless, nothing changed even though I tried using HTML links only, stripping trailing whitespace, and adding <html prefix="og: http://ogp.me/ns#">.
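As a local sanity check, the markup itself parses fine, which suggests the checkers are failing to fetch the page rather than failing to parse it. Below is a minimal sketch of an Open Graph extractor, similar in spirit to what a checker does once it has a page in hand (using only Python's standard `html.parser`; the sample tags are copied from above):

```python
from html.parser import HTMLParser

# Minimal Open Graph extractor: collect og:/article:/twitter: meta tags.
class OGParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        a = dict(attrs)
        key = a.get("property") or a.get("name")
        if key and key.split(":")[0] in ("og", "article", "twitter"):
            self.tags[key] = a.get("content")

sample = """
<meta property="og:type" content="article" />
<meta property="og:title" content="Corner Timer: gently make you feel guilty on time-wasting apps" />
<meta name="twitter:card" content="summary_large_image" />
"""

p = OGParser()
p.feed(sample)
print(p.tags["og:type"])       # article
print(p.tags["twitter:card"])  # summary_large_image
```

If a parser this simple can read the tags, a checker reporting "no Open Graph implementation" most likely never received the HTML in the first place.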

This person suggests that this may be a server issue. A misconfiguration, perhaps. Do you know why this happens or how to identify the problem?

Other information:

  • SSL certificate: Let’s Encrypt
  • Control Panel: DirectAdmin
  • Server: Nginx

Also asked on Webmasters Stack Exchange: Why can’t Open Graph checkers detect Open Graph data?

Hi @ooker,

I looked at your certificate manually and using the Qualys SSL Labs tool

https://www.ssllabs.com/ssltest/

I don’t see any problems with your certificate or your HTTPS configuration. Your certificate is correct and your configuration is compatible with a wide range of clients. So I think the most likely case is that the compatibility problem lies elsewhere—that it’s not an HTTPS-related problem.

I did notice that two of the links you shared are testing Open Graph data on https://quảcầu.com/ rather than https://lyminhnhat.com/. I’m not sure if that was intentional, as the content on the two sites is completely different.

I thought that perhaps the use of accented Vietnamese characters in the domain quảcầu.com could be confusing some of the software here. This is a minor possibility, but on the other hand the certificate for the IDN ASCII form xn--qucu-hr5aza.com is valid and so if the accented characters are any part of the problem, it’s still not a problem related to HTTPS.

Another thing that I noticed is that both sites appear to have some kind of measure (perhaps based on examining User-Agent: headers) to block non-browser access. In particular, when I access the sites with curl, they return no content at all, whereas when I access them with a web browser, they return site content. This could very likely be a problem because the bots that are crawling for Open Graph data may send User-Agent strings that accurately identify them as bots rather than browsers. If your web server then blocks these bots, presumably they won’t be able to see the data you intended to show them!
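To illustrate why such a measure would hit crawlers, here is a hypothetical sketch of a naive User-Agent filter. This is not the actual rule on the server (which is unknown): browser UAs conventionally begin with "Mozilla/", while curl and the social-media crawlers identify themselves honestly and would all be rejected by it.

```python
# Hypothetical sketch of a naive User-Agent filter; the server's real
# rule is unknown. Browser UAs conventionally start with "Mozilla/".
def looks_like_browser(user_agent: str) -> bool:
    return user_agent.startswith("Mozilla/")

agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # a real browser
    "curl/7.65.3",                                # curl
    "facebookexternalhit/1.1",                    # Facebook's crawler
    "Twitterbot/1.0",                             # Twitter's crawler
]
for ua in agents:
    print(ua, "->", "allowed" if looks_like_browser(ua) else "blocked")
```

Under such a filter only the browser gets through; curl, facebookexternalhit, and Twitterbot are all blocked, which matches the symptoms in this thread.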


Wow, thank you for your detailed help. I really appreciate that.

Yes, both domains are mine. They are hosted on the same shared hosting, and after I added the certificates they both started having problems. The Vietnamese site, quảcầu.com (xn--qucu-hr5aza.com), is a fresh install, so a problem in the WordPress source code or a plugin conflict is unlikely.

Here is the output when using curl on both sites over HTTPS:

PS C:\Users\Ooker> curl https://lyminhnhat.com
curl: (56) OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054
PS C:\Users\Ooker> curl https://xn--qucu-hr5aza.com
curl: (56) OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054

For HTTP requests, curl xn--qucu-hr5aza.com returns normally, but curl lyminhnhat.com returns this:

PS C:\Users\Ooker> curl -v lyminhnhat.com
*   Trying 103.254.12.54:80...
* TCP_NODELAY set
* Connected to lyminhnhat.com (103.254.12.54) port 80 (#0)
> GET / HTTP/1.1
> Host: lyminhnhat.com
> User-Agent: curl/7.65.3
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< Server: nginx
< Date: Tue, 20 Aug 2019 07:26:42 GMT
< Content-Type: text/html; charset=UTF-8
< Content-Length: 0
< Connection: keep-alive
< X-Powered-By: PHP/7.2.18
< Pragma: no-cache
< Vary: Accept-Encoding,Cookie,User-Agent
< Expires: Tue, 20 Aug 2019 08:26:42 GMT
< Cache-Control: max-age=3600
< X-Redirect-By: WordPress
< Set-Cookie: PHPSESSID=e38b6e538848185698247c8cf365b08a; path=/
< Location: https://lyminhnhat.com/
<
* Connection #0 to host lyminhnhat.com left intact

The user agent in both requests is curl/7.65.3. What do you think about this?

I also ask this on Server Fault Stack Exchange: Curl returns SSL_ERROR_SYSCALL even though the certificate is correct

Hi @ooker

I don’t know if this is the problem.

But checking your domain - https://check-your-website.server-daten.de/?q=lyminhnhat.com

Your HTTP headers are OK; there is a Content-Type header:

Server: nginx
Date: Tue, 20 Aug 2019 08:15:15 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: close
Vary: Accept-Encoding,Accept-Encoding,Cookie,User-Agent
X-Powered-By: PHP/7.2.18
Cache-Control: max-age=3, must-revalidate

But your site has only a

<meta charset="UTF-8">

element. The very old standard element

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

is missing.

Normally that should work, since your HTTP header is correct. But sometimes tools are buggy, and with non-ASCII content that may be a problem.

So add the old standard element.

I added the old standard element, but the curl results don’t change. Do you have any idea?

The URL check doesn’t work. But copying the raw page code into that HTML form works.

Looks like it is something in your server configuration: bot detection that blocks some bots.

Using https://www.uptrends.com/de/tools/uptime, most checks work. Checked two times: Toronto doesn’t work; Tampa failed one time, São Paulo one time.


PS: Yep, there is a problem with your configuration.

Using my old download.exe to check the headers. 4 times.

Result:

D:\temp>download https://lyminhnhat.com/ -h
SystemDefault
SSL-Zertifikat is valide
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding,Accept-Encoding,Cookie,User-Agent
Cache-Control: max-age=3, must-revalidate
Content-Type: text/html; charset=UTF-8
Date: Tue, 20 Aug 2019 10:01:49 GMT
Server: nginx
X-Powered-By: PHP/7.2.18

Status: 200 OK

1641,72 milliseconds
1,64 seconds

D:\temp>download https://lyminhnhat.com/ -h
SystemDefault
SSL-Zertifikat is valide
SSL-Zertifikat is valide
Error (1): Die zugrunde liegende Verbindung wurde geschlossen: Die Verbindung wurde unerwartet getrennt…
ConnectionClosed
3

2586,38 milliseconds
2,59 seconds

D:\temp>download https://lyminhnhat.com/ -h
SystemDefault
SSL-Zertifikat is valide
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding,Accept-Encoding,Cookie,User-Agent
Cache-Control: max-age=3, must-revalidate
Content-Type: text/html; charset=UTF-8
Date: Tue, 20 Aug 2019 10:02:14 GMT
Server: nginx
X-Powered-By: PHP/7.2.18

Status: 200 OK

3119,55 milliseconds
3,12 seconds

D:\temp>download https://lyminhnhat.com/ -h
SystemDefault
SSL-Zertifikat is valide
SSL-Zertifikat is valide
Error (1): Die zugrunde liegende Verbindung wurde geschlossen: Die Verbindung wurde unerwartet getrennt…
ConnectionClosed
3

33597,23 milliseconds
33,60 seconds

(1) is OK, (2) is closed unexpectedly after about 2.6 seconds, (3) is OK, (4) is closed unexpectedly after about 33.6 seconds. (The German error message translates to: “The underlying connection was closed: the connection was terminated unexpectedly.”)
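That alternating pattern can be quantified with a simple repeated probe. The sketch below uses a stand-in `flaky_fetch` that simulates the observed behaviour rather than making a real HTTP call:

```python
# Repeated probe to quantify intermittent connection failures.
def probe(fetch, attempts=4):
    results = []
    for _ in range(attempts):
        try:
            fetch()
            results.append("ok")
        except ConnectionError:
            results.append("closed")
    return results

# Stand-in simulating the observed behaviour: every second attempt drops.
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] % 2 == 0:
        raise ConnectionError("connection closed unexpectedly")

results = probe(flaky_fetch)
print(results)  # ['ok', 'closed', 'ok', 'closed']
```

In real use, `fetch` would perform an HTTPS request and raise on a dropped connection; running such a probe many times against the live site would show whether failures correlate with timing, source IP, or User-Agent.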

Maybe it is bot detection, maybe an unstable server.


Really appreciate your time. I’ll check with my hosting provider.

If possible, can you tell me where that header is set? It doesn’t seem to be in header.php at all.

Also, is there any difference between using curl, ping and download?

Thanks again.

These are the normal HTTP headers.

Perhaps they are set in other places: config files, .htaccess, the application.

curl and my own download.exe are HTTP tools: they send an HTTP request and show the content. Ping is only a network connectivity tool.
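To make the difference concrete, here is a sketch of the raw request an HTTP tool writes over its TCP connection, matching the "> " lines in the curl -v output earlier in the thread. Ping, by contrast, sends ICMP echo packets and never opens a TCP connection or speaks HTTP at all.

```python
# Build the raw request an HTTP tool such as curl sends over TCP.
def build_get_request(host: str, user_agent: str = "curl/7.65.3") -> bytes:
    lines = [
        "GET / HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Accept: */*",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

print(build_get_request("lyminhnhat.com").decode())
```

This is also why a server can treat the two tools differently: the HTTP request carries a User-Agent header it can filter on, while a ping carries no such identification.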

PS: Checked with the Twitter Card validator:

https://cards-dev.twitter.com/validator

There is the error:

ERROR: Failed to fetch page due to: ChannelClosed

Looks like you have a bot detection that blocks some user agents or ip addresses.


I think it is user agents, because I could access the site with a browser but not with curl from my own machine.

Just an update: it seems to work again. I have submitted this bug to them here: Crawler is unable to fetch images, but adding a brand new, unique query string can make it work for one first time - Facebook for Developers

I also posted all of my research in this thread: https://stackoverflow.com/a/57606813/3416774

Thank you everyone for supporting me.


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.