Using nfqueue on Linux as a novel, webserver-agnostic HTTP authenticator

While messing with this I just realized this is literally forging the ACME challenge reply out of band; it just happens to be aimed at a real web server, and the middleman happens to run on the same box.
I'm not sure how picky LE (and Go, since that's what makes the request) is about oddities at the TCP layer, and how well we've forged things in this challenge.

5 Likes

btw, can I ask why it uses the iptables command if nfqueue (judging by the name) requires nftables?

5 Likes

From what I've observed, the iptables CLI is more ubiquitous than the nft CLI on most Linux distros, even though it is usually just a frontend into the nftables backend.

Certainly there are a number of equivalent methods of adding and removing the required rule, and a production version of such a plugin should probably support a couple of them, as well as check for/load the nfnetlink_queue module, etc.
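
For illustration, the rule could be managed through either front end; a minimal sketch in Python, assuming queue number 1 and a made-up table name (the actual plugin may well do this differently):

```python
# Hypothetical helper: add/remove the port-80 NFQUEUE rule via either CLI.
# Queue number 1 and the table name "certbot_nfq" are assumptions.
import subprocess

IPTABLES_RULE = ["-p", "tcp", "--dport", "80", "-j", "NFQUEUE", "--queue-num", "1"]

def add_rule_iptables():
    subprocess.run(["iptables", "-I", "INPUT", *IPTABLES_RULE], check=True)

def remove_rule_iptables():
    subprocess.run(["iptables", "-D", "INPUT", *IPTABLES_RULE], check=True)

def add_rule_nft():
    # Equivalent nft wording: a dedicated table makes cleanup a single delete.
    subprocess.run(["nft", "add", "table", "ip", "certbot_nfq"], check=True)
    subprocess.run(["nft", "add", "chain", "ip", "certbot_nfq", "input",
                    "{ type filter hook input priority 0 ; }"], check=True)
    subprocess.run(["nft", "add", "rule", "ip", "certbot_nfq", "input",
                    "tcp", "dport", "80", "queue", "num", "1"], check=True)

def remove_rule_nft():
    subprocess.run(["nft", "delete", "table", "ip", "certbot_nfq"], check=True)
```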

6 Likes

I ported an equivalent solver for lego:

6 Likes

nfqueue dangling session tcp.pcapng (2.6 KB)
This makes the TCP session we steal a packet from fail to close; it looks like it dangles for 2~3 minutes and creates a bunch of TCP retransmissions from the ACME server side. Before we release this we should close the TCP channel on both sides: I wonder whether sending a RST to both sides will be enough, whether we need something else, or whether we should just write another packet with RST+ACK?

Sending a RST to the server will make the server stop caring about this session, but our webserver still keeps sending to the ACME server. The VA doesn't care, though.
I'm not sure how to write a packet back to our webserver, either.
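
For the ACME-server side, I imagine something like this scapy sketch (the seq/ack values are placeholders that would have to come from the hijacked session and our forged reply):

```python
# Rough sketch (scapy): forge a RST+ACK from our port 80 toward the ACME VA.
# All values here are placeholders taken from the hijacked session.
from scapy.all import send
from scapy.layers.inet import IP, TCP

def reset_va_side(our_ip, va_ip, va_port, reply_seq, reply_len, va_next_seq):
    rst = (
        IP(src=our_ip, dst=va_ip)
        / TCP(
            sport=80,
            dport=va_port,
            flags="RA",                  # RST+ACK
            seq=reply_seq + reply_len,   # sequence right after our forged reply
            ack=va_next_seq,             # next byte we expect from the VA
        )
    )
    send(rst, verbose=False)
```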

4 Likes

Definitely it should take over the TCP session and do a graceful FIN/ACK for the connections from the ACME server.

At that point it may well be convenient to reuse the same logic for the connection to the local webserver, though I don't think it's a big deal to RST that one.

5 Likes

So is the local webserver aware of this? I assumed these packets would never hit it, and there wouldn't be any connection to the local backend.

4 Likes

The takeover only kicks in when the HTTP request comes in, so the local webserver is aware of an open TCP connection without any data sent yet.

6 Likes

When we catch the ACME request there is already a connection between the ACME VA and our backend, because until the TCP handshake has finished and the payload starts flowing there is no way for the firewall to know this is for ACME. When we inject the HTTP reply we advance the sequence number of that session by len(reply), which our main webserver doesn't know it did, so both sides (unless we send a RST and close the session) keep retransmitting but can't talk, because seq/ack no longer match.

I'm not sure how to inject a packet back to our backend with a forged source address, though.
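
To make the seq/ack mismatch concrete, here's a small illustration (just the arithmetic, not plugin code):

```python
# Illustrative only: sequence numbers on the VA <-> port-80 session after we
# inject a forged HTTP reply that the real webserver never sent.
def forged_reply_numbers(req_seq, req_ack, req_payload_len, reply_len):
    reply_seq = req_ack                    # next byte the VA expects from "us"
    reply_ack = req_seq + req_payload_len  # acknowledges the VA's request
    # Once the VA receives the forged reply, it will ACK up to this point:
    va_acks = reply_seq + reply_len
    # ...but the backend webserver's own send sequence is still reply_seq,
    # so its segments and the VA's ACKs no longer line up.
    return reply_seq, reply_ack, va_acks
```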

5 Likes

Ah, that makes sense.

IIRC, leaving these open should not matter much on Nginx (it can handle many slow/orphaned connections), but it can seriously degrade performance on Apache, so closing the connection would be needed there. I'm not sure about other platforms.

5 Likes

After playing with this for a while, it does seem to be a problem.

  • Can't forge a FIN/RST towards the backend with raw packets, because that bypasses the kernel's network stack. The kernel ends up thinking the connections are still ESTAB, and things like epoll then don't work properly.
  • I've been trying to use nfqueue's mangle, but either I am screwing up the packet and it's being ignored, or it's too late in the networking stack's processing pipeline and it's not actually possible to alter the connection state at that point. It's probably possible to directly delete the connection state using libnetfilter_conntrack (rough sketch below), but then the hackery involved is getting wildly out of control.
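
For the record, the conntrack route would look roughly like this via the conntrack(8) CLI (a userspace frontend to libnetfilter_conntrack); purely a hypothetical sketch:

```python
# Hypothetical: drop the tracked connection for the hijacked session so the
# kernel forgets about it. Requires conntrack-tools to be installed.
import subprocess

def delete_conntrack_entry(va_ip, our_ip, va_port):
    subprocess.run(
        ["conntrack", "-D", "-p", "tcp",
         "-s", va_ip, "-d", our_ip,
         "--sport", str(va_port), "--dport", "80"],
        check=False,  # non-zero exit just means no matching entry was found
    )
```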

At least, forging FIN/ACK to the ACME servers seems to work, but ofc retransmits still come from Linux.

This is what I've been trying.

5 Likes

This may not be relevant, but I am bringing it up just in case. Many years ago, I ran into the issue of Python's requests library not being able to give me actual information about the connection, which caused a lot of blockages/issues in troubleshooting. I eventually realized the cause of our problems was domains that had multiple DNS records, and we had no way to determine what IP address we connected to (our issue) OR what their SSL certificate was (another group's issue that was essentially the same as ours, and one we eventually needed solved).

The underlying reason for this was the manner in which requests utilizes urllib3, and that urllib3 closes the socket connection without logging any info or offering hooks to capture data. Suggested "workarounds" all involved a second connection, which is not guaranteed to be similar to the first. We eventually found a workaround technique for persisting IP data, but could not persist the SSL certificate data without a fork or monkeypatch. urllib3 and requests are open to a new debug object, but no one involved had enough time to fully spec this out and build enough consensus to generate a PR that would be accepted.

Anyways, my suggestion is to check the fnfqueue source to see if they are closing something or just not persisting some variable or connection. There might also be an opportunity for a new hook.

6 Likes

:partying_face: This was the issue and it's fixed now. Probably the seq did not add up when I stripped the payload. If I set only the RST flag to the inbound packet and mangle it, everything works OK and the connection gets immediately closed. nginx (or whoever runs on port 80) sees a connection reset, even via epoll.

It can be done "more properly" but I'm happy for now, no more rogue retransmissions.

Should work the same in the Go nfqueue library I think, mangle the packet to add RST flag.
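
Roughly, the idea looks like this; sketched with the python netfilterqueue binding plus scapy rather than fnfqueue, so treat it as illustrative rather than the plugin's actual code:

```python
# Illustrative sketch: mangle the intercepted inbound packet into a RST so the
# local webserver tears the connection down. In the real flow this is only
# done for the matched ACME request, after the forged reply has gone out.
from netfilterqueue import NetfilterQueue
from scapy.layers.inet import IP, TCP

QUEUE_NUM = 1  # assumed queue number

def reset_inbound(pkt):
    ip = IP(pkt.get_payload())
    if ip.haslayer(TCP) and ip[TCP].dport == 80:
        ip[TCP].flags = "R"   # only flip the flags; leave seq and payload alone
        del ip[TCP].chksum    # let scapy recompute the checksums
        del ip.chksum
        pkt.set_payload(bytes(ip))
    pkt.accept()              # re-inject the (now mangled) packet

nfq = NetfilterQueue()
nfq.bind(QUEUE_NUM, reset_inbound)
try:
    nfq.run()
finally:
    nfq.unbind()
```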

So what remains is:

  • Check whether IPv6 support needs any changes
  • UX around having the right netfilter module loaded and nfqueue library installed
  • ...
6 Likes

For IPv6: hw_protocol holds the Ethernet frame protocol, 0x0800 for IPv4 and 0x86DD for IPv6.
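
Something like this, assuming the binding exposes that protocol value (attribute names differ between libraries):

```python
# Dispatch on the link-layer protocol value reported with the queued packet.
from scapy.layers.inet import IP
from scapy.layers.inet6 import IPv6

ETH_P_IP = 0x0800
ETH_P_IPV6 = 0x86DD

def parse_l3(hw_protocol, raw_payload):
    if hw_protocol == ETH_P_IP:
        return IP(raw_payload)
    if hw_protocol == ETH_P_IPV6:
        return IPv6(raw_payload)
    return None  # not IP traffic; pass it through untouched
```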

3 Likes

I kinda feel like it should fail safe: even if certbot is killed before cleanup is called, we should make sure we don't leave the firewall rule behind and block the webserver. If the client is killed but the nfqueue rule is still there, then all traffic to port 80 will be dropped.

It looks like NFQUEUE has --queue-bypass, which lets packets pass through if nothing is listening on the queue. We should add this, but a duplicate rule would still mess up the next renewal, as we would send two replies.
Zeroing out the token when we pass the packet along with the RST would work around that, though.

edit: adding --queue-bypass makes it not send any packets to us, hmmm
edit 2: it was my code taking too long to process; optimizing it fixed the problem
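
For reference, the bypass variant of the rule would be something like this (queue number 1 assumed):

```python
# NFQUEUE rule with --queue-bypass: traffic flows normally if no one is
# attached to the queue, so a stale rule can't black-hole port 80.
import subprocess

subprocess.run(
    ["iptables", "-I", "INPUT", "-p", "tcp", "--dport", "80",
     "-j", "NFQUEUE", "--queue-num", "1", "--queue-bypass"],
    check=True,
)
```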

4 Likes

Ah, that's very cool and worth using. Nice find. Works for me.

I also found this note:

This feature is broken from kernel 3.10 to 3.12: when using a recent iptables, passing the option --queue-bypass has no effect on these kernels.

5 Likes

What should it do when there is no webserver running on that port? As-is, the challenge will fail because the kernel will send a RST, so we never see the HTTP traffic. There are a few options:

  1. Just let the challenge fail
  2. Test port binding and fail with a message to use the normal standalone authenticator (see the sketch after this list)
  3. We call the normal standalone solver ourselves
  4. We bind port 80 ourselves so something is listening (but this sounds like a really roundabout way)
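
Option 2 could be a simple bind probe before arming the rule; a rough sketch:

```python
# Hypothetical check for option 2: is anything already listening on port 80?
import errno
import socket

def port_80_in_use(host="0.0.0.0", port=80):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True   # a webserver is (probably) listening
        raise
    else:
        return False      # nothing there; better to suggest plain standalone
    finally:
        s.close()
```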
4 Likes

I like this one because it catches users who might be holding the plugin wrong, early on. I've applied it.

I also replaced the iptables invocation with pure Python netlink code. Right now I think it only depends on the kernel module, not on any C libraries (other than libc). But time will tell when I test it on some older distros.

5 Likes

Hm, looks complicated :stuck_out_tongue:

I don't see the variable port being used in the expression, except earlier for removing the table. Is that normal? I don't see 80 (or 0x50) anywhere, for that matter. Maybe I'm blind :angel:

5 Likes

Oops, nice catch. b"\x00P" is 0x0050. The API takes bytes for some reason, I don't know. Just virtual machine things. Fixed!
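
For anyone else squinting at it, the port really is there, just encoded as two big-endian bytes:

```python
import struct

struct.pack("!H", 80)    # b'\x00P'  (0x0050, network byte order)
(80).to_bytes(2, "big")  # same bytes, without the struct module
```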

4 Likes