I write a lot of network-enabled programs for my day job. Often, I have the opposite problem of the classic “It Works On My Machine” issue. Instead, I get a lot of “It Doesn’t Work On My Machine Only” from customers. The program will work perfectly fine everywhere but on the one customer’s network. I have the luxury of controlling the machine and OS my software runs on, as it’s more of an embedded system. But I never have control of the network and the clients that talk to my machine. So sometimes I need to get right down to the network card and take a look at what’s actually happening to get to the bottom of the problem. Honestly, the logs my program generates give me everything I need to know, but capturing the actual traffic on the network card will either give me more proof that the problem is not me, or provide a valuable clue as to where my code has gone wrong.

Now, I need to capture these packets over the course of several days. Weeks possibly. Because if I get to the point where I’m bothering to do this, we have an intermittent problem that’s proven elusive. I can run tshark on the command line to capture the packets. This is handy because my embedded systems are command line only. There is no GUI desktop interface to run Wireshark on, and I’m usually connected over an SSH terminal anyway. Plus, you know me: I’m more comfortable on a command line. That’s where I was born and raised on computer interfaces.

So we run into another problem. I can’t consume all of the disk space on a single, giant pcap file that spans weeks. I want the file to roll over like my log files do. Normally, I use logrotate, and rotate the logs every 24 hours, keeping only 2 weeks’ worth. If I can do this with the pcap files, that would be ideal. But if I use logrotate on the file generated by tshark, it gets broken when I snip it from the outside. It’s not a nice text file, or even a SQLite3 database file like my log files are. I assume some closure and cleanup is done whenever I kill tshark manually that’s not getting done with the typical logrotate process, because the resulting file is useless after it’s been rotated.

The solution is dumpcap. It’s my understanding that dumpcap is the capture tool used under the hood by tshark and Wireshark. My understanding is limited, because I’ve got a job to do and don’t have time to dig into the bowels of some tool that needs to just do its job. But I do know that dumpcap will capture just like tshark, with the added benefit that it provides options that do the rotation of the capture files without any other outside tools. I can even use the same capture filters I’m familiar with from tshark.

So, here’s the command I’ve come up with:

dumpcap -i eth0 -w pcap.cap -b files:14 -b duration:86400 -f "tcp" -f "host" -f "port 502" &

Here’s the breakdown:

  • Listen on interface eth0 (we can get that from ifconfig)
  • Output file name is pcap.cap (dumpcap will insert a file number and a creation timestamp into this name to differentiate the many files it will create)
  • Keep 14 files (two weeks)
  • Start a new file every 24 hours. (24 hours x 60 minutes x 60 seconds = 86400)
  • Capture only TCP traffic to or from the remote host at address, on port 502
  • End the command with an ampersand so that it runs in the background.
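
One caveat on the filter. As far as I can tell, dumpcap does not AND repeated -f flags together: each -f after an -i sets the filter for that interface, so the last one would win. A safer way to express "TCP, this host, this port" is a single filter expression. Here's a minimal sketch that builds and prints the command instead of running it. The address 192.0.2.10 is a made-up placeholder from the documentation range, and the variable names are mine, not anything dumpcap requires.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute your own.
IFACE="eth0"               # interface from the post
REMOTE_HOST="192.0.2.10"   # placeholder address, NOT the real remote host
PORT=502                   # port from the post

# One capture filter expression: all three conditions must match.
FILTER="tcp and host ${REMOTE_HOST} and port ${PORT}"

# Build the command as a string and print it for review rather than running it.
CMD="dumpcap -i ${IFACE} -w pcap.cap -b files:14 -b duration:86400 -f \"${FILTER}\""
echo "${CMD}"
```

Once the printed command looks right, paste it into the terminal and add the trailing ampersand.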

Now, I know the amount of traffic to expect between the server and the target machine, but if I didn’t, I’d risk the possibility of having gigantic files in each 24-hour period. If I were worried about that, I could also limit the capture file size using -b filesize:16384 (no space after the colon; the value is in kB) for a maximum size of roughly 16 MB per file.
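
The nice thing about combining a file count with a file size is that the worst-case disk usage becomes simple arithmetic. A quick sketch, assuming the 14 files and 16384 kB from above (the variable names are just mine):

```shell
#!/bin/sh
FILES=14            # from -b files:14
FILESIZE_KB=16384   # from -b filesize:16384 (dumpcap takes this in kB)

# The ring buffer never holds more than FILES files of FILESIZE_KB each.
WORST_CASE_KB=$((FILES * FILESIZE_KB))
echo "ring buffer worst case: ${WORST_CASE_KB} kB"
```

Keep in mind that with both duration and filesize set, dumpcap moves to the next file when either limit is hit. On a chatty network the 14 files could cover a lot less than two weeks.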

So that’s it. I finally got tired of looking this up every time I needed it, and wrote this blog post so that it would be easy to find.

You Have Permission

Both tshark and dumpcap are very skittish when it comes to permissions. It seems like they follow a different and more restrictive set of rules when it comes to where they can even write their output files.

So I run this program as root, and any output files go to /root/.

“Oh nooo!” cry all the online professional Linux people, “Do not be root!” Listen: I don’t have time to navigate the absurd permissions issues of a Linux machine. I get it, it’s “safer”. But let’s just assume I’m a big boy and I understand the risk and have made a calculated decision to get on with my day and root this whole process. This gives me a full 8 hours of sleep.

Still, if you try this at home, make sure you are the admin of the machine in question. Do the research, understand and accept the responsibility. Or spend two days figuring out which flags need to be set where just to get something simple done. I’m not the boss of you.