Reverse Engineering an Unknown Binary Protocol

Plenty of devices talk in a custom binary protocol that no public spec describes. Decoding it opens up the device’s commands, and often its weakest authentication. The work splits between captures, which give you theories, and the firmware parser, which gives you facts.
Decide Where the Protocol Lives
Before capturing anything, work out the transport. A protocol over TCP or UDP is captured with a network tap. A protocol over the device’s serial port or an internal bus needs a different probe. A quick port scan and a look at the traffic during normal use usually tell you which one you are dealing with.
# is the control channel on the network at all?
nmap -sS -p- 192.168.1.50 | grep open
22/tcp open ssh
80/tcp open http
5000/tcp open upnp? <- proprietary control channel
An odd open port with a service nmap cannot name, like 5000 here, is the usual home of a custom protocol. If nothing on the network looks proprietary, the protocol is probably on the wire between the app and a companion device, and you capture it at the serial or USB layer instead.
Capture Clean Samples
Record traffic during distinct, deliberate actions: a login, a status poll, a single setting change. Keep a log of exactly what you did and when, because that log is what lets you line up bytes with behavior later. A capture filter keeps the noise down.
# capture only traffic to and from the device on its control port
tcpdump -i eth0 -w device.pcap 'host 192.168.1.50 and port 5000'
tcpdump: listening on eth0, link-type EN10MB (Ethernet)
^C
47 packets captured
47 packets received by filter
Repeat each action two or three times. Repetition is the lever that separates fixed framing from the part of the message that actually changes, and it costs nothing but a few extra minutes at the bench.
Diff Captures to Find the Fields
Pull the payloads out as hex and compare them. Bytes that never change are framing, length markers, or constants. Bytes that move predictably are counters or payload. Changing one input at a time and watching which byte moves is the fastest way to map a field.
# two captures of the same 'set channel' command, channel 6 then channel 11
xxd payload_ch6.bin
00000000: aa55 0006 12 06 1d 0d .U.........
xxd payload_ch11.bin
00000000: aa55 0006 12 0b 22 0d .U.........
The differences tell the story. aa55 is a fixed start marker. 0006 is a length. 12 is the command byte for set-channel. The byte that changed from 06 to 0b is the channel value, 6 then 11. The byte after it also moved, from 1d to 22. It rose by 5, exactly as much as the channel value did, which marks it as a checksum that covers the payload and hints that the checksum is a simple sum.
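The same comparison can be scripted once you have more than a couple of captures. The sketch below is a minimal illustration, assuming the payloads are the same length and already extracted; `changed_offsets` is a name invented here, and the two byte strings are the set-channel payloads shown above.

```python
def changed_offsets(*captures: bytes) -> list[int]:
    """Return the byte offsets whose value differs across same-length captures."""
    if len({len(c) for c in captures}) != 1:
        raise ValueError("captures must have equal length to diff byte by byte")
    # zip(*captures) yields one tuple per offset: the byte each capture has there
    return [i for i, column in enumerate(zip(*captures)) if len(set(column)) > 1]


ch6 = bytes.fromhex("aa55 0006 12 06 1d 0d")   # set channel 6
ch11 = bytes.fromhex("aa55 0006 12 0b 22 0d")  # set channel 11

for offset in changed_offsets(ch6, ch11):
    print(f"offset {offset}: {ch6[offset]:02x} -> {ch11[offset]:02x}")
# offset 5: 06 -> 0b
# offset 6: 1d -> 22
```

Feeding it three or four captures of the same action at once is a quick way to separate true constants from fields that only happened to match in two samples.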
Find the Framing
Most binary protocols share a shape: a start marker, a length field, a type or command byte, the payload, and a checksum. Confirm the length field by sending payloads of different sizes and watching one byte track the change. The checksum is usually the last byte or two.
# does a simple sum of the body match the checksum byte?
python3 -c "d=bytes.fromhex('aa55000612061d'); print('sum8:', hex(sum(d[:-1]) & 0xff))"
sum8: 0x1d
# matches the second-to-last byte of the frame -> 8-bit additive checksum over every byte before it, start marker included
Once that sum matches, you have an 8-bit additive checksum, the most common choice on small devices. If it had not matched, a CRC-16 would be the next guess. Being able to compute the checksum yourself is what lets you forge valid messages later.
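When the first guess does not match, it helps to try several cheap checksums over several possible starting offsets in one pass. The following is a rough sketch, not a complete checksum identifier: the three candidate functions are common 8-bit schemes chosen as assumptions, `matching_checksums` is a hypothetical helper, and CRC variants are deliberately left out.

```python
from functools import reduce


def sum8(data: bytes) -> int:
    return sum(data) & 0xFF


def xor8(data: bytes) -> int:
    return reduce(lambda a, b: a ^ b, data, 0)


def twos_complement8(data: bytes) -> int:
    # value that makes the 8-bit sum of data plus checksum equal zero
    return (-sum(data)) & 0xFF


CANDIDATES = {"sum8": sum8, "xor8": xor8, "twos_complement8": twos_complement8}


def matching_checksums(frame: bytes, checksum_index: int) -> list[tuple[str, int]]:
    """Try each candidate over every span ending just before the checksum byte.

    Returns (algorithm name, start offset of the covered span) for every match.
    """
    expected = frame[checksum_index]
    hits = []
    for name, fn in CANDIDATES.items():
        for start in range(checksum_index):
            if fn(frame[start:checksum_index]) == expected:
                hits.append((name, start))
    return hits


frame = bytes.fromhex("aa55000612061d0d")  # set-channel capture from above
print(matching_checksums(frame, checksum_index=6))
# [('sum8', 0)]
```

A single hit on one frame can be a coincidence, so run it across every capture you have and keep only the algorithm and offset that match all of them.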
Let the Parser Settle It
Captures give you a strong theory. The parsing code in the firmware turns it into fact. Find the function that reads from the socket or serial buffer and follow how it splits the bytes. That code is the protocol’s specification.
// decompiled receive handler, cleaned up
if (buf[0] == 0xAA && buf[1] == 0x55) {        // start marker
    len = (buf[2] << 8) | buf[3];              // big-endian length
    cmd = buf[4];                              // command byte
    payload = &buf[5];
    if (checksum(buf, len + 4) != buf[len + 4])
        return ERR_CRC;
    dispatch(cmd, payload, len);               // no auth check before dispatch
}
Reading the parser resolves the ambiguities captures leave behind: endianness, whether the length includes the header, and what happens at the edges when a field is malformed. It also exposes the security posture in a single glance.
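To test your own reading of the handler, the same checks can be mirrored in a few lines of Python. This is a sketch under stated assumptions, not the firmware's code: it assumes `checksum()` is the 8-bit sum found earlier and that the length counts the command byte plus the payload, as the indexing in the handler implies. `parse_frame` and `FrameError` are names made up for this example, and the explicit bounds check is an addition the decompiled handler does not show.

```python
MAGIC = b"\xaa\x55"


class FrameError(ValueError):
    """Raised when a buffer does not hold one well-formed frame."""


def parse_frame(buf: bytes) -> tuple[int, bytes]:
    """Return (command, payload) for one frame, following the decompiled checks."""
    if len(buf) < 6 or buf[:2] != MAGIC:
        raise FrameError("missing start marker or truncated header")
    length = int.from_bytes(buf[2:4], "big")  # big-endian, as in the handler
    if length < 1 or len(buf) < length + 5:
        # bounds check added here; the firmware indexes buf[len + 4] unchecked
        raise FrameError("length field does not fit the received bytes")
    if sum(buf[:length + 4]) & 0xFF != buf[length + 4]:
        raise FrameError("checksum mismatch")  # assumed 8-bit additive sum
    return buf[4], buf[5:length + 4]


# the reboot frame used later in this article: command 0x20, empty payload
print(parse_frame(bytes.fromhex("aa55000120200d")))
# (32, b'')
```

If a captured frame fails here but the device accepts it, your model of the length or checksum span is wrong somewhere, which is exactly what you want to find out before building a client.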
Map the Command Table
The dispatch function usually indexes a table of handlers by the command byte. Recovering that table converts your handful of observed commands into the device’s full vocabulary, including the commands you never triggered during testing.
# command byte -> handler, recovered from the jump table
0x10 cmd_status
0x12 cmd_set_channel
0x20 cmd_reboot
0x30 cmd_read_config
0x31 cmd_write_config
0x40 cmd_firmware_update <- high value
6 commands in the recovered table
2 undocumented recovered: 0x31 cmd_write_config, 0x40 cmd_firmware_update
The capture showed you status and set-channel. The table reveals cmd_write_config and cmd_firmware_update, which are the commands an attacker actually wants. They were there all along, just never exercised during your captures.
Look for the Weak Spots
Two questions decide how serious the protocol’s weaknesses are. First, is the length field trusted without bounds checking? A memcpy sized by an attacker-controlled length is a buffer overflow waiting for a crafted packet. Second, is the checksum doing security or just integrity?
An additive checksum or a plain CRC stops corruption, not an attacker who recomputes it. The most common and most serious finding is the one visible in the parser above: dispatch happens with no authentication. The device trusts anything that arrives in the right shape, which means the firmware-update command is reachable by anyone who can talk to the port.
Build a Client and Test Authentication
Once you can frame and checksum a message, a small client turns your map into a working tool. Replaying a captured command tests whether the device verifies who is talking or merely that the bytes parse.
# send a forged 'reboot' command (cmd 0x20) with a correct checksum
python3 send_cmd.py --host 192.168.1.50 --port 5000 --cmd 0x20
[*] framed: aa5500012020 0d
[*] sent 7 bytes
[*] device acked: aa5500010600 0d (status OK)
[*] target rebooting... no credentials were sent
A device that reboots on an unauthenticated command will do far worse on the commands that write config or push firmware. That single test is frequently the headline of the whole protocol review.
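A client along these lines does not need much code. What follows is a minimal sketch of what such a script might contain, not the send_cmd.py used above: it assumes the framing recovered from the parser (length counts command plus payload, 8-bit sum over everything before the checksum, trailing 0x0d), and `build_frame` and `send_command` are hypothetical names.

```python
import socket

MAGIC = b"\xaa\x55"
TERMINATOR = b"\x0d"


def build_frame(cmd: int, payload: bytes = b"") -> bytes:
    """Frame one command: marker, big-endian length, command, payload, sum8, 0x0d."""
    body = bytes([cmd]) + payload
    head = MAGIC + len(body).to_bytes(2, "big") + body
    return head + bytes([sum(head) & 0xFF]) + TERMINATOR


def send_command(host: str, port: int, cmd: int, payload: bytes = b"",
                 timeout: float = 3.0) -> bytes:
    """Send one framed command over TCP and return whatever the device answers."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_frame(cmd, payload))
        return sock.recv(4096)


if __name__ == "__main__":
    # framing only; no network traffic is generated here
    print(build_frame(0x20).hex())
    # aa55000120200d
```

Calling `send_command("192.168.1.50", 5000, 0x20)` against a device you are authorized to test would reproduce the reboot experiment above. Keeping `build_frame` separate from the socket code means the framing can be checked against captures without touching the target.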
Fuzz the Parser Once You Understand It
Knowing the framing lets you fuzz intelligently rather than blindly. Hold the structure valid and mutate one field at a time, especially the length, so the parser sees inputs its author never tested.
# mutate the length field while keeping framing + checksum valid
boofuzz_run.py --target 192.168.1.50:5000 --template custom_proto.json
[fuzz] case 312: len=0xFFFF payload=8B -> no response, connection reset
[fuzz] case 318: len=0x0000 payload=64B -> device watchdog reboot
[!] crash signature: oversized length triggers reset (likely overflow)
A length field that crashes the device is a denial of service at minimum and often memory corruption underneath. Structure-aware fuzzing finds those edges in minutes because every case is a valid frame with one hostile field.
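The core of a structure-aware length fuzzer is small enough to sketch in plain Python. This example does not use any fuzzing framework's API, and it is only an illustration: the list of boundary values is an assumption, `length_fuzz_cases` is a made-up name, and the checksum is recomputed over the bytes actually sent, on the assumption that it is the 8-bit sum identified earlier.

```python
MAGIC = b"\xaa\x55"
TERMINATOR = b"\x0d"

# boundary values for a 16-bit length field; extend with whatever the parser suggests
HOSTILE_LENGTHS = (0x0000, 0x0001, 0x00FF, 0x0100, 0x7FFF, 0x8000, 0xFFFF)


def length_fuzz_cases(cmd: int, payload: bytes):
    """Yield (claimed_length, frame) pairs where only the length field is hostile.

    The start marker, command, payload, and terminator stay as in a normal frame,
    and the checksum byte is the 8-bit sum of everything that precedes it.
    """
    for claimed in HOSTILE_LENGTHS:
        head = MAGIC + claimed.to_bytes(2, "big") + bytes([cmd]) + payload
        yield claimed, head + bytes([sum(head) & 0xFF]) + TERMINATOR


for claimed, frame in length_fuzz_cases(0x12, b"\x06"):
    print(f"len=0x{claimed:04x} {frame.hex()}")
```

Send each case on a fresh connection and record whether the device answers, resets the connection, or stops responding, so that a crash can be tied to one specific length value.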
Defending a Custom Protocol
The fix is rarely a new protocol. It is usually adding an authenticated session or signing commands, so that the right shape is no longer enough to be obeyed. Bounds-check every length against the real buffer size, and treat the checksum as integrity only, never as a substitute for authentication.
I flag the length handling separately in any report, because an unauthenticated parser that also has a bounds bug is not just a logic gap. It is remote code execution waiting to happen. The two findings compound, and they should be fixed together.
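As one illustration of what signing commands can look like, the sketch below authenticates each command with an HMAC and rejects replays with a counter. Everything specific in it is an assumption made for the example rather than a recommendation for any particular device: HMAC-SHA256 with a pre-shared key, a 4-byte big-endian counter, a 64-byte payload limit, and the function names themselves.

```python
import hashlib
import hmac

TAG_LEN = 32  # full HMAC-SHA256 tag


def sign_command(key: bytes, counter: int, cmd: int, payload: bytes) -> bytes:
    """Client side: counter + command + payload, followed by the HMAC tag."""
    msg = counter.to_bytes(4, "big") + bytes([cmd]) + payload
    return msg + hmac.new(key, msg, hashlib.sha256).digest()


def verify_command(key: bytes, last_counter: int, blob: bytes,
                   max_payload: int = 64) -> tuple[int, int, bytes]:
    """Device side: return (counter, cmd, payload) or raise ValueError."""
    # bounds check against the real buffer limit before anything is parsed
    if not 5 + TAG_LEN <= len(blob) <= 5 + max_payload + TAG_LEN:
        raise ValueError("length out of bounds")
    msg, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    # constant-time comparison, so the check does not leak how many bytes matched
    if not hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest()):
        raise ValueError("bad authentication tag")
    counter = int.from_bytes(msg[:4], "big")
    if counter <= last_counter:
        raise ValueError("replayed or stale command")
    return counter, msg[4], msg[5:]


key = b"k" * 32  # placeholder key for the demonstration only
blob = sign_command(key, counter=1, cmd=0x20, payload=b"")
print(verify_command(key, last_counter=0, blob=blob))
# (1, 32, b'')
```

The order matters: size limits first, authentication second, and only then dispatch, so that an unauthenticated sender never reaches a command handler.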
Length-Prefix Versus Delimiter Framing
Two framing styles cover most custom protocols, and telling them apart early saves confusion. A length-prefixed protocol announces how many bytes follow, as the example above does. A delimiter-framed protocol ends each message with a fixed byte, often a newline or a null, and you split on that instead.
# does the protocol use a trailing delimiter? look for a repeated end byte
xxd capture.bin | tail -4
00000040: aa55 0006 1206 1d0d ... 0d
00000050: aa55 0008 3001 0000 ... 0d
# every message ends in 0x0d -> delimiter framing alongside the length field
This protocol uses both, a length field and a trailing 0x0d, which is common and slightly redundant. Knowing which style you face tells you how to chunk a stream of bytes back into individual messages, which is the first thing your client needs to get right.
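Chunking a stream with both cues available might look like the sketch below. It is an illustration under assumptions, not a hardened reassembler: it takes the length to count command plus payload as the parser does, uses the trailing 0x0d only as a sanity check for resynchronizing, and does not verify checksums. `split_frames` is a hypothetical name.

```python
MAGIC = b"\xaa\x55"


def split_frames(stream: bytes) -> tuple[list[bytes], bytes]:
    """Split a byte stream into complete frames; return (frames, unconsumed tail)."""
    frames = []
    pos = 0
    while True:
        start = stream.find(MAGIC, pos)
        if start < 0:
            return frames, b""               # no further start marker in the buffer
        if len(stream) - start < 4:
            return frames, stream[start:]    # header not fully received yet
        length = int.from_bytes(stream[start + 2:start + 4], "big")
        end = start + 4 + length + 2         # header + body + checksum + 0x0d
        if end > len(stream):
            return frames, stream[start:]    # frame not fully received yet
        if stream[end - 1] == 0x0D:
            frames.append(stream[start:end])
            pos = end
        else:
            pos = start + 1                  # false marker match; look for the next


reboot = bytes.fromhex("aa55000120200d")
status = bytes.fromhex("aa55000110100d")
frames, tail = split_frames(b"\x00\x01" + reboot + status + reboot[:3])
print([f.hex() for f in frames], tail.hex())
# ['aa55000120200d', 'aa55000110100d'] aa5500
```

Prepend the returned tail to the next chunk read from the socket. One known limitation of this sketch is that a false marker followed by a large bogus length looks like an incomplete frame, so a real client would also cap the accepted length.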
When the Payload Is Encrypted or Obfuscated
Sometimes the payload past the header looks like noise even though the framing is clear. Before assuming strong encryption, check for the cheap obfuscation that small devices favor: a fixed XOR key, a rolling counter, or a simple byte swap. An entropy check and a guessed-key XOR rule those in or out fast.
# try the most common obfuscation: a single-byte XOR, with the key recovered by guessing that the first plaintext byte is '{' (0x7b)
python3 -c "d=open('payload.bin','rb').read(); k=d[0]^0x7b; print(bytes(b^k for b in d)[:23])"
b'{"cmd":"status","seq":3'
# single-byte XOR with key 0x37 -> the payload is JSON underneath
A payload that turns into readable JSON after a one-byte XOR was never really protected. If it survives every cheap guess and the byte distribution stays flat, with entropy close to 8 bits per byte, then it is genuinely encrypted, and the key hunt moves into the firmware where the device decrypts it.
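Both checks fit in a short script. The sketch below is one possible way to do it, with its assumptions spelled out: the plausibility test for a decoded payload is "parses as JSON", chosen only because that is what this example device sends, and the function names are invented here. A single-byte XOR only relabels byte values, so it leaves byte entropy unchanged; a low figure on the obfuscated payload is already a hint that no real cipher is involved.

```python
import json
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Bits per byte, from 0.0 (one repeated value) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in Counter(data).values())


def xor_keys_yielding_json(data: bytes) -> list[int]:
    """Try all 256 single-byte XOR keys; keep those whose output parses as JSON."""
    keys = []
    for key in range(256):
        try:
            json.loads(bytes(b ^ key for b in data).decode("ascii"))
        except ValueError:  # covers UnicodeDecodeError and json.JSONDecodeError
            continue
        keys.append(key)
    return keys


# stand-in for payload.bin: a JSON message obfuscated with key 0x37
sample = bytes(b ^ 0x37 for b in b'{"cmd":"status","seq":3}')
print(f"entropy: {shannon_entropy(sample):.2f} bits/byte")
print([hex(k) for k in xor_keys_yielding_json(sample)])
# second line printed: ['0x37']
```

Short payloads can never reach 8 bits per byte, so judge entropy on several concatenated messages rather than on one.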
Documenting It With a Formal Grammar
As the structure firms up, write it down in a machine-readable grammar rather than scattered notes. Kaitai Struct lets you describe the framing once and generates a parser and a hex-tree view, so every new capture is decoded automatically and your understanding stays testable.
# kaitai spec excerpt for the framing recovered above
seq:
  - id: magic
    contents: [0xaa, 0x55]
  - id: length
    type: u2be          # big-endian, as the parser showed
  - id: command
    type: u1
  - id: payload
    size: length - 1
  - id: checksum
    type: u1
A written grammar is also the cleanest possible deliverable. It hands the product team an unambiguous description of their own protocol, which is often the first time anyone has written it down completely, and it makes the security findings concrete rather than narrative.
Where This Fits
Decoding a custom protocol is a core part of a product security assessment and a penetration test, and it feeds directly into fuzzing the parser and hardening the command surface. If you have a proprietary protocol you want mapped and pressure-tested before someone else does it, that is the kind of work we do at Berkner Tech.