Maybe you’re tired of all the fail2ban log messages about attempts to crack your ssh passwords. Maybe you have developers who need to access a web port from a machine not on your VPN. For all sorts of reasons you might want to allow access to specific ports on a server and port knocking is an easy way to do that.
There are dedicated knocking managers like knockd, but I don’t want to run another daemon if it’s not necessary. Fortunately, IPTables contains the necessary functionality to implement port knocking with the “recent” module. Documentation on “recent” is thin, and good instructions for multi-level port knocking are even harder to find. You can check the Arch Wiki, or, even better, the Digital Ocean tutorial, but even that one didn’t work for me, and it cannot work with some tools commonly used for knocking (probably because of bugs in those tools). Still, it largely inspired the structure of this post. Thank you, Justin. Please read that one for additional insight.
First, a bit of theory. IPTables can have many, well, “tables” of rules (strictly speaking, iptables calls these “chains” and reserves “table” for filter, nat and mangle, but I’ll keep saying “table” here). On a simple server you might only ever edit the ‘INPUT’ table. The rules here are just copied from an old EL5 cluster, so the default table is called ‘RH-Firewall-1-INPUT’; the name doesn’t matter. If your setup just uses ‘INPUT’ that’s fine too. Adjust to taste. If you happen to find yourself looking at a machine with firewalld (EL7) then burn it with fire and come back here when you have a text file to work with (/etc/sysconfig/iptables on Fedora-derived systems).
If you’re just using a single knock port you can just use a pair of simple rules using “-m recent”. The “recent” module keeps named lists of IP addresses. For a given matching condition, you can add an IP to a named list or remove it. You can also match whether an IP is on a list as a condition. What makes this double-knock job difficult is that you cannot use multiple “-m recent” conditions in a single rule (say, a match and a set). So you wind up having to build a small state machine using IPTables. This example will be for a two-port knock, so there are three possible states: “not knocked”, “first knock packet seen”, “second knock packet seen”. You can extend this logic to an arbitrary number of tables.
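For the single-knock case mentioned above, the pair of rules might look something like the sketch below. It is written as a small shell function that prints the two lines for pasting into the config file, so the ports can be varied. The function name, the list name KNOCK_OK and the 30-second window are illustrative assumptions, not part of the original setup, and the sketch assumes an ESTABLISHED,RELATED accept rule earlier in the ruleset so that a connection survives once the window closes.

```shell
#!/bin/sh
# single_knock_rules CHAIN KNOCK_PORT SERVICE_PORT
# Prints two rules in iptables-save syntax. Each rule uses "-m recent" once:
#   1. accept new connections to SERVICE_PORT from IPs that knocked recently
#   2. a hit on KNOCK_PORT puts the source IP on the (hypothetical) KNOCK_OK list
single_knock_rules() {
    chain=$1 knock_port=$2 service_port=$3
    cat <<EOF
-A $chain -m state --state NEW -m tcp -p tcp --dport $service_port -m recent --rcheck --name KNOCK_OK --seconds 30 -j ACCEPT
-A $chain -m state --state NEW -m tcp -p tcp --dport $knock_port -m recent --set --name KNOCK_OK -j REJECT --reject-with icmp-host-prohibited
EOF
}
```

Calling `single_knock_rules INPUT 1111 22` prints the pair for a knock on 1111 that opens ssh. As with the rules later in this post, they belong just before the default deny.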
If a packet matches the first knock packet, we’ll put the IP on the KNOCK_1_OK list. If it matches the second knock packet (after the first) we’ll put it on the KNOCK_2_OK list. After we’re done with other packet processing, we’ll jump to the KNOCKCONTROL table. We’ll dispatch based on state here. If the second packet has been matched, jump to the KNOCKED table to accept connections. If the first packet has been matched, jump to the table that will check to see if the second packet matches. If nothing has been matched, jump to a check for the first packet (this is our ‘default’ case).
To make these tables, we’ll add this to our config file:
:KNOCK1 - [0:0]
:KNOCK2 - [0:0]
:KNOCKED - [0:0]
:KNOCKCONTROL - [0:0]
Put them after the existing declaration for INPUT, which looks just like these (a line beginning with ‘:INPUT’). If you have a complex iptables configuration with *nat and *mangle sections, put these in the *filter section.
Now go down and look for your default deny rule. It’s probably at the bottom. Add these just before:
## port knocking
-A RH-Firewall-1-INPUT -j KNOCKCONTROL
-A KNOCK1 -m state --state NEW -m tcp -p tcp --dport 1111 -m recent --name KNOCK_1_OK --set -j REJECT --reject-with icmp-host-prohibited
-A KNOCK1 -j REJECT --reject-with icmp-host-prohibited
-A KNOCK2 -m recent --name KNOCK_1_OK --remove
-A KNOCK2 -m state --state NEW -m tcp -p tcp --dport 2222 -m recent --name KNOCK_2_OK --set -j REJECT --reject-with icmp-host-prohibited
-A KNOCK2 -j KNOCK1
# retries from nc, nmap will break with this:
#-A KNOCKED -m recent --name KNOCK_2_OK --remove
-A KNOCKED -m state --state NEW -m tcp -p tcp --dport 3333 -m recent --name KNOCK_2_OK --remove -j REJECT --reject-with icmp-host-prohibited
-A KNOCKED -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A KNOCKED -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT
-A KNOCKED -j KNOCK1
# the last one can't timeout with --seconds because then 8080 access would be unusable
-A KNOCKCONTROL -m recent --rcheck --name KNOCK_2_OK -j KNOCKED
-A KNOCKCONTROL -m recent --rcheck --name KNOCK_1_OK --seconds 10 -j KNOCK2
-A KNOCKCONTROL -j KNOCK1
# default reject, probably never reached with knocking in place
These rules create a situation where a knock on 1111 then 2222 will open up the ports for your IP. 3333 will close down the ports.
So, to write out the logic as prose (this can be a mind-full at first glance): First we jump to the KNOCKCONTROL table on all packets that don’t get handled by the non-knock-related rules. Then we check to see if the IP is on the “second knock packet received” list. If it is, we jump to the KNOCKED table to accept connections. If only the first packet has been received, we jump to a check for the second packet. If no matching packets have been received, we check for the first one and then reject the packet. That reject replaces the default deny; a RETURN would also work here if the knock tables need to fit into a more complex setup, but explicit is good in security.
As for the nuances of each step: When checking for the first packet (in the KNOCK1 table) we look for a packet that matches the port (1111) and if we have it, we use the recent module to add the IP to the “KNOCK_1_OK” list (this is the criterion we used back in KNOCKCONTROL to dispatch to KNOCK2). We then reject the packet, so that the knocking tool (e.g. ‘nc’) will immediately return and speed up the connection process. The original recipe uses DROP, which delays the connection and also gives an adversary a fingerprint to discover. We also want our default deny rule to be an ICMP host-prohibited packet, for the sanity of network engineers everywhere.
In the KNOCK2 table, where we check for the second packet, there are a few tricks. First, the thing we do right away is to remove the IP from KNOCK_1_OK – our logic in KNOCKCONTROL could not work if both flags were set! Then we check for the second knock port (2222) and add the IP to KNOCK_2_OK if it matches (and then reject). Lastly, we jump to KNOCK1. Why? Two reasons – we want to handle the case where the user starts over, with port 1111 a second time, and also our default deny rule is at the end of it.
Finally, in the KNOCKED table we first check to see if we’re matching the ‘close’ packet (3333). If we see it, take the IP off the KNOCK_2_OK list, effectively closing the port knock for that IP. Next, accept our TCP connections as we would have before knocking. Note the use of the state module here to catch only the beginnings of the TCP stream, to minimize CPU load. Lastly, jump to KNOCK1 again in case we’re starting again with a 1111 packet, and to default-deny if none of these rules match.
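If the dispatch logic is easier to follow as code, here is a toy model of it in shell. It is only a sketch for reasoning about sequences of packets: the function names and the state labels (none, knock1, open) are made up for illustration, the ten-second window is ignored, and it only models packets that actually reach KNOCKCONTROL.

```shell
#!/bin/sh
# knock_step STATE PORT -> prints "NEWSTATE VERDICT"
# STATE: none (no knock seen), knock1 (on KNOCK_1_OK), open (on KNOCK_2_OK).
knock_step() {
    state=$1 port=$2
    case "$state:$port" in
        open:3333)         echo "none REJECT" ;;   # KNOCKED: the close knock
        open:22|open:8080) echo "open ACCEPT" ;;   # KNOCKED: the real services
        open:*)            echo "open REJECT" ;;   # falls through to KNOCK1, still open
        knock1:2222)       echo "open REJECT" ;;   # KNOCK2: second knock matched
        *:1111)            echo "knock1 REJECT" ;; # KNOCK1: first knock, or a restart
        *)                 echo "none REJECT" ;;   # anything else resets to the start
    esac
}

# knock_run PORT... -> final "STATE VERDICT" after feeding the ports in order
knock_run() {
    state=none verdict=
    for port in "$@"; do
        result=$(knock_step "$state" "$port")
        state=${result% *}
        verdict=${result#* }
    done
    echo "$state $verdict"
}
```

For example, `knock_run 1111 2222 22` ends in "open ACCEPT", while `knock_run 1111 25 2222 22` ends in "none REJECT" because the stray packet on port 25 throws away the first knock.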
OK, hopefully that’s clear. Now for the harder bits. First, notice that the KNOCK_1_OK check has a “--seconds 10” match before it jumps to KNOCK2. What’s that about? To find out, have a look at the Digital Ocean tutorial for just SSH access. There, when it gets into the equivalent of the KNOCKED table, the IP is removed from the equivalent of the KNOCK_2_OK list. This is OK if you just need an ssh connection. But if you have a user also using an interactive port (here a webserver on port 8080) then you have to leave that IP on the KNOCK_2_OK list or only the first connection to the webserver will work (without knocking again). Your users will not love you for that! Additionally, if any other traffic is arriving from the same IP (say it’s also a mail server and the mail client is downloading in the background) then the very next packet will invalidate the first state, setting you back to square one. This would make it almost impossible to port-knock. So, we have a situation where we want to get a first packet and a second packet, but we can’t close the door behind the second packet and we have to put up with other irrelevant packets in the meantime. So, to compensate we say, “the second packet has to arrive within ten seconds of the first packet”. That gives the bad guys ten seconds to find your two magic ports with full port scans, in the right order. We’ll have to count on not being disastrously unlucky on that one. If you’re on Earth you can probably reduce that to five seconds since we’re using icmp rejects at the end of the processing chain, so the port-knock client has no need for sleep() commands to cause delays. Your latency may vary (YLMV?).
The other reason we can’t remove the KNOCK_2_OK state immediately is that the knocking clients (nc, nmap) send multiple packets, no matter how much you ask them not to. Go ahead and run tcpdump on the server while you run ‘nc -w 1 server port’ or ‘nmap -Pn --host_timeout 201 --max-retries 0 -p 1111 server’ (etc) and you’ll see two packets on 1111, then two packets on 2222. The second 1111 is fine, but if you’re invalidating state on KNOCK_2_OK in the KNOCKED table then the first 2222 will enter the state and it will be gone after the second one, so your port 22 or port 8080 connection will wind up back at the ‘no packets seen’ state and you’ll never get in. Did you see where I told nmap to never retry? nmap don’t care (at least EL6 nmap, where my test client is). The downside is you have to yell at the kids … err, users … to remember to close the door. Consider setting --seconds LARGEVALUE on KNOCK_2_OK to make them re-knock after some period of time if your kids … err, users … can’t clean up after themselves. The close can be easily added to a script, so it’s not that bad. For ssh you might have:
$ nc $SERVER 1111; nc $SERVER 2222; ssh user@$SERVER; nc $SERVER 3333
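The same one-liner can be turned into a pair of shell functions if you’d rather not retype it. This is a minimal sketch, not a polished tool: the function names and the KNOCK_CMD override are hypothetical (the override is only there so you can dry-run with echo), and the ports are the ones used in this post.

```shell
#!/bin/sh
# knock HOST PORT...: hit each port in order.
# KNOCK_CMD is a hypothetical override for dry runs; the default is "nc -w 1".
knock() {
    host=$1; shift
    for port in "$@"; do
        # The firewall rejects every knock, so nc failing here is expected.
        ${KNOCK_CMD:-nc -w 1} "$host" "$port" </dev/null || true
    done
}

# knocked_ssh HOST [ssh options...]: open the door, connect, always close it.
knocked_ssh() {
    host=$1; shift
    knock "$host" 1111 2222
    ssh "$@" "$host"
    status=$?
    knock "$host" 3333
    return "$status"
}
```

With these sourced into your shell, `knocked_ssh foo.example.com -l user` knocks, runs ssh, and sends the close knock when ssh exits, even if ssh fails.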
Here’s a test-run of what all this looks like in action:
$ export SERVER=foo.example.com
$ nc $SERVER 22
$ printf "GET / HTTP/1.0\r\n\r\n" | nc $SERVER 8080 | grep DOCTYPE
$ nc $SERVER 1111
$ nc $SERVER 22
$ printf "GET / HTTP/1.0\r\n\r\n" | nc $SERVER 8080 | grep DOCTYPE
$ nc $SERVER 1111
$ nc $SERVER 2222
$ nc $SERVER 22
SSH-2.0-OpenSSH_4.3
$ printf "GET / HTTP/1.0\r\n\r\n" | nc $SERVER 8080 | grep DOCTYPE
$ nc $SERVER 3333
$ nc $SERVER 22
$
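While running a test like this, it can help to see on the server which list your client IP has landed on. The recent module exposes each named list as a file; the sketch below assumes they live under /proc/net/xt_recent (older kernels, likely including EL5, use /proc/net/ipt_recent instead). The function name is made up for illustration.

```shell
#!/bin/sh
# knock_lists IP [DIR]: print the names of the "recent" lists that contain IP.
# DIR defaults to /proc/net/xt_recent; pass /proc/net/ipt_recent on old kernels.
knock_lists() {
    ip=$1
    dir=${2:-/proc/net/xt_recent}
    for list in "$dir"/*; do
        [ -f "$list" ] || continue
        # Entries look like "src=192.0.2.7 ttl: 64 last_seen: ..."; the
        # trailing space keeps 192.0.2.7 from matching 192.0.2.70.
        if grep -qF "src=$ip " "$list"; then
            basename "$list"
        fi
    done
}
```

Run as root on the server, `knock_lists 192.0.2.7` should print KNOCK_1_OK after the first knock and KNOCK_2_OK after the second, and nothing once the close knock on 3333 has been seen.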
So, the results are as expected and the user interface is pretty easy. This can be easily added to a connect script. On Android, Port Knocker works great and fronts JuiceSSH just fine (makes for a nice knock for your mosh session on mobile). Still TODO:
1) is there any way to get .ssh/config to pre-launch a knock script for a server? So many people are writing ssh wrappers which seems ridiculous.
2) corral these rules into puppet-iptables declarations for easy cluster deployment.
Please leave comments below if you’ve done either of those two things and I’ll add them to the article.