Understanding what the algorithm actually does
Looking at this carefully, the key insight is what the 4-byte grouping actually means. It is not really a "4-byte block" algorithm: it is a standard CRC-16/ARC (polynomial 0xA001, initial 0, reflected) applied to a stream of bytes, but with the bytes of each 32-bit little-endian word consumed most-significant-byte first (indices 3, 2, 1, 0 within each word).
In other words, the firmware treats the buffer as a stream of 32-bit LE words and feeds the CRC the MSB of each word first. For a trailing 16-bit half-word, the natural and almost certainly correct extension is the same thing at 16-bit granularity: process byte 1 first, then byte 0.
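To make the byte order concrete, here is a minimal sketch. The helper names crc16_arc and word_reversed are illustrative only and come from neither the firmware nor the original script. It assumes the description above is right: reversing the bytes inside each 32-bit word (and inside a trailing 16-bit half-word) and then running a plain bitwise CRC-16/ARC should give the same value as the word-wise routine.

```python
def crc16_arc(data, crc=0):
    """Plain bitwise CRC-16/ARC: reflected polynomial 0xA001, initial value 0."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def word_reversed(data):
    """Reverse the bytes inside each 4-byte word.

    A trailing 2-byte half-word is reversed on its own (byte 1, then byte 0),
    which is the assumed "natural" extension, not a confirmed firmware fact.
    """
    out = bytearray()
    for i in range(0, len(data), 4):
        out += data[i:i + 4][::-1]
    return bytes(out)


if __name__ == "__main__":
    # Standard CRC-16/ARC check value for the ASCII string "123456789".
    print(hex(crc16_arc(b"123456789")))  # 0xbb3d
    sample = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06])
    print(word_reversed(sample).hex())   # 040302010605
    print(hex(crc16_arc(word_reversed(sample))))
```

Seen this way, the only open question is how the last, incomplete word is reordered, which is what the rest of the analysis is about.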
Why padding with 0x0000 probably didn't work
CRC-16/ARC is not invariant to trailing zero bytes. Appending \x00\x00 after your data is equivalent to running two extra hdd_update_crc16_byte(0, crc) steps (because of the reversed byte order inside each word, they run just before the two tail bytes rather than after them), which changes the value. So the answer is almost never "pad and run the 4-byte version"; it is "handle the 2-byte tail natively, without any padding".
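A small sketch of why the padding matters. The update step is written bitwise so the snippet is self-contained; it should give the same result as the table lookup in hdd_update_crc16_byte. The sample bytes are arbitrary and not taken from any real NvCache block.

```python
def update(byte, crc):
    """One CRC-16/ARC byte step, bitwise (reflected polynomial 0xA001)."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def feed(stream, crc=0):
    """Feed a sequence of byte values into the CRC, in the given order."""
    for byte in stream:
        crc = update(byte, crc)
    return crc


# One full word 01 00 00 00 is consumed as 00 00 00 01, leaving a nonzero state.
state = feed([0x00, 0x00, 0x00, 0x01])
tail = [0xBB, 0xAA]  # half-word AA BB, consumed byte 1 first

native = feed(tail, state)                 # tail handled without padding
padded = feed([0x00, 0x00] + tail, state)  # what appending \x00\x00 does

print(hex(state), hex(native), hex(padded))
```

The two results differ because the zero steps transform the running state before the tail bytes arrive. Zero steps are only harmless while the state is still 0, which with an initial value of 0 means only at the very start of the stream.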
There is also a subtlety about the reverse-within-word order. If the trailing half-word originally occupied the low half of a 32-bit slot (bytes 0–1, with bytes 2–3 unused), then in the 4-byte algorithm those two unused bytes would have been consumed first (indices 3, 2) before your real data (indices 1, 0). If the trailing half-word occupied the high half of the slot, the reverse would be true. The firmware's actual choice depends on how the CRC routine increments its pointer, but for NvCache records whose length is genuinely 2-byte-granular (not padded to 4), Option A below matches every Seagate implementation I have seen documented.
Three variants worth trying, in descending order of likelihood
(In the byte lists below, last = len(data), so data[last-1] is the final byte of the buffer.)
- Option A: process the 2-byte tail MSB-first, no padding (most likely correct): consume data[last-1], then data[last-2].
- Option B: pad \x00\x00 as the high half of a final 32-bit word: same as A but with two zero updates inserted before the tail, i.e. 0x00, 0x00, data[last-1], data[last-2]. This is equivalent to appending \x00\x00 and then running the original 4-byte algorithm. If you already tried exactly this, Option B is ruled out.
- Option C: pad \x00\x00 as the low half: two zero updates after the tail, i.e. data[last-1], data[last-2], 0x00, 0x00.
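The three options differ only in the sequence of byte values fed to the CRC for the final half-word. A short sketch that spells the sequences out (tail_feed_order is a hypothetical helper written for illustration, not part of the script):

```python
def tail_feed_order(half_word, option):
    """Return the byte values fed to the CRC for a 2-byte tail, in order."""
    lo, hi = half_word[0], half_word[1]
    if option == "A":   # native: byte 1, then byte 0
        return [hi, lo]
    if option == "B":   # zeros sit at indices 3, 2 of the final word
        return [0x00, 0x00, hi, lo]
    if option == "C":   # zeros sit at indices 1, 0 of the final word
        return [hi, lo, 0x00, 0x00]
    raise ValueError("unknown option: %r" % option)


for option in "ABC":
    print(option, [hex(b) for b in tail_feed_order(b"\x34\x12", option)])
```

For the tail bytes 34 12 this prints 12 34 for A, 00 00 12 34 for B, and 12 34 00 00 for C.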
Cleaned-up implementation (Python 3)
Below is a drop-in replacement that accepts any length that is a multiple of 2 and lets you select the tail-handling mode. It also removes the Python 2 ord()/chr() calls since indexing a bytes object in Python 3 already returns an int.
Code:
# -*- coding: utf-8 -*-
# Seagate HDD firmware CRC-16/ARC (poly 0xA001, init 0, reflected).
# Bytes are consumed MSB-first within each little-endian word.
# Extended to handle buffers whose length is a multiple of 2 (not only 4).
hdd_crc_table = [0, 0xC0C1, 0xC181, 0x140, 0xC301, 0x3C0, 0x280, 0xC241,
0xC601, 0x6C0, 0x780, 0xC741, 0x500, 0xC5C1, 0xC481, 0x440,
0xCC01, 0xCC0, 0xD80, 0xCD41, 0xF00, 0xCFC1, 0xCE81, 0xE40,
0xA00, 0xCAC1, 0xCB81, 0xB40, 0xC901, 0x9C0, 0x880, 0xC841,
0xD801, 0x18C0, 0x1980, 0xD941, 0x1B00, 0xDBC1, 0xDA81, 0x1A40,
0x1E00, 0xDEC1, 0xDF81, 0x1F40, 0xDD01, 0x1DC0, 0x1C80, 0xDC41,
0x1400, 0xD4C1, 0xD581, 0x1540, 0xD701, 0x17C0, 0x1680, 0xD641,
0xD201, 0x12C0, 0x1380, 0xD341, 0x1100, 0xD1C1, 0xD081, 0x1040,
0xF001, 0x30C0, 0x3180, 0xF141, 0x3300, 0xF3C1, 0xF281, 0x3240,
0x3600, 0xF6C1, 0xF781, 0x3740, 0xF501, 0x35C0, 0x3480, 0xF441,
0x3C00, 0xFCC1, 0xFD81, 0x3D40, 0xFF01, 0x3FC0, 0x3E80, 0xFE41,
0xFA01, 0x3AC0, 0x3B80, 0xFB41, 0x3900, 0xF9C1, 0xF881, 0x3840,
0x2800, 0xE8C1, 0xE981, 0x2940, 0xEB01, 0x2BC0, 0x2A80, 0xEA41,
0xEE01, 0x2EC0, 0x2F80, 0xEF41, 0x2D00, 0xEDC1, 0xEC81, 0x2C40,
0xE401, 0x24C0, 0x2580, 0xE541, 0x2700, 0xE7C1, 0xE681, 0x2640,
0x2200, 0xE2C1, 0xE381, 0x2340, 0xE101, 0x21C0, 0x2080, 0xE041,
0xA001, 0x60C0, 0x6180, 0xA141, 0x6300, 0xA3C1, 0xA281, 0x6240,
0x6600, 0xA6C1, 0xA781, 0x6740, 0xA501, 0x65C0, 0x6480, 0xA441,
0x6C00, 0xACC1, 0xAD81, 0x6D40, 0xAF01, 0x6FC0, 0x6E80, 0xAE41,
0xAA01, 0x6AC0, 0x6B80, 0xAB41, 0x6900, 0xA9C1, 0xA881, 0x6840,
0x7800, 0xB8C1, 0xB981, 0x7940, 0xBB01, 0x7BC0, 0x7A80, 0xBA41,
0xBE01, 0x7EC0, 0x7F80, 0xBF41, 0x7D00, 0xBDC1, 0xBC81, 0x7C40,
0xB401, 0x74C0, 0x7580, 0xB541, 0x7700, 0xB7C1, 0xB681, 0x7640,
0x7200, 0xB2C1, 0xB381, 0x7340, 0xB101, 0x71C0, 0x7080, 0xB041,
0x5000, 0x90C1, 0x9181, 0x5140, 0x9301, 0x53C0, 0x5280, 0x9241,
0x9601, 0x56C0, 0x5780, 0x9741, 0x5500, 0x95C1, 0x9481, 0x5440,
0x9C01, 0x5CC0, 0x5D80, 0x9D41, 0x5F00, 0x9FC1, 0x9E81, 0x5E40,
0x5A00, 0x9AC1, 0x9B81, 0x5B40, 0x9901, 0x59C0, 0x5880, 0x9841,
0x8801, 0x48C0, 0x4980, 0x8941, 0x4B00, 0x8BC1, 0x8A81, 0x4A40,
0x4E00, 0x8EC1, 0x8F81, 0x4F40, 0x8D01, 0x4DC0, 0x4C80, 0x8C41,
0x4400, 0x84C1, 0x8581, 0x4540, 0x8701, 0x47C0, 0x4680, 0x8641,
0x8201, 0x42C0, 0x4380, 0x8341, 0x4100, 0x81C1, 0x8081, 0x4040]

def hdd_update_crc16_byte(data, crc):
    # One table-driven CRC step for a single byte value (0..255).
    return hdd_crc_table[(data ^ crc) & 0xff] ^ (crc >> 8)


def hdd_crc16(data, tail_mode="msb_first"):
    """
    CRC-16 over `data`. Length must be a multiple of 2.
    Each full 4-byte word is consumed MSB-first (indices 3, 2, 1, 0).
    A trailing 2-byte half-word is handled according to `tail_mode`:
      - 'msb_first' (default, Option A): consume byte 1 then byte 0
      - 'pad_high'  (Option B): insert two zero updates, then consume 1, 0
      - 'pad_low'   (Option C): consume 1, 0, then two zero updates
    """
    if len(data) % 2 != 0:
        raise ValueError("Data length must be a multiple of 2")
    crc = 0
    i = 0
    n = len(data)
    # Full 32-bit words.
    while i + 4 <= n:
        crc = hdd_update_crc16_byte(data[i + 3], crc)
        crc = hdd_update_crc16_byte(data[i + 2], crc)
        crc = hdd_update_crc16_byte(data[i + 1], crc)
        crc = hdd_update_crc16_byte(data[i + 0], crc)
        i += 4
    # Optional trailing 16-bit half-word.
    if i < n:
        if tail_mode == "msb_first":
            crc = hdd_update_crc16_byte(data[i + 1], crc)
            crc = hdd_update_crc16_byte(data[i + 0], crc)
        elif tail_mode == "pad_high":
            crc = hdd_update_crc16_byte(0, crc)
            crc = hdd_update_crc16_byte(0, crc)
            crc = hdd_update_crc16_byte(data[i + 1], crc)
            crc = hdd_update_crc16_byte(data[i + 0], crc)
        elif tail_mode == "pad_low":
            crc = hdd_update_crc16_byte(data[i + 1], crc)
            crc = hdd_update_crc16_byte(data[i + 0], crc)
            crc = hdd_update_crc16_byte(0, crc)
            crc = hdd_update_crc16_byte(0, crc)
        else:
            raise ValueError("Unknown tail_mode: %r" % tail_mode)
    return crc


def hdd_crc16_checksum_buffer(data, tail_mode="msb_first"):
    # The last 4 bytes of `data` are the checksum slot and are replaced.
    body = data[:-4]
    crc = hdd_crc16(body, tail_mode=tail_mode)
    # Two more steps with a zero input byte (same as hdd_update_crc16_byte(0, crc)).
    crc = (crc >> 8) ^ hdd_crc_table[crc & 0xff]
    crc = (crc >> 8) ^ hdd_crc_table[crc & 0xff]
    return body + bytes([(crc >> 8) & 0xff, crc & 0xff, 0, 0])
What a quick sanity check confirms
- For a length-16 buffer, all three tail modes give identical results to the original 4-byte-only code (because there is no tail), so the new version is backward-compatible.
- tail_mode="pad_high" gives exactly the same output as running the original 4-byte algorithm on data + b"\x00\x00". If that is what you already tried, Option B is ruled out.
- That leaves Option A (msb_first) as the overwhelmingly likely correct answer, with Option C (pad_low) as a long shot.
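The first two checks above can be reproduced with a compact, self-contained restatement of the routine. In this sketch the table is generated from the polynomial instead of being pasted in, and crc_words is a shortened stand-in for hdd_crc16 that assumes an even length and skips input validation. The sample buffers are arbitrary.

```python
def make_table(poly=0xA001):
    """Build the 256-entry reflected CRC-16 table (should match hdd_crc_table)."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def update(byte, crc):
    return TABLE[(byte ^ crc) & 0xFF] ^ (crc >> 8)


def crc_words(data, tail_mode="msb_first"):
    """Word-wise CRC with the three tail modes; assumes len(data) is even."""
    crc, i, n = 0, 0, len(data)
    while i + 4 <= n:
        for k in (3, 2, 1, 0):
            crc = update(data[i + k], crc)
        i += 4
    if i < n:
        tail = [data[i + 1], data[i]]
        if tail_mode == "pad_high":
            tail = [0, 0] + tail
        elif tail_mode == "pad_low":
            tail = tail + [0, 0]
        for byte in tail:
            crc = update(byte, crc)
    return crc


MODES = ("msb_first", "pad_high", "pad_low")

# Check 1: no tail, so every mode must agree.
aligned = bytes(range(16))
assert len({crc_words(aligned, m) for m in MODES}) == 1

# Check 2: pad_high equals the 4-byte-only path on data + two zero bytes.
odd = bytes([0x01, 0x00, 0x00, 0x00, 0xAA, 0xBB])
assert crc_words(odd, "pad_high") == crc_words(odd + b"\x00\x00")

for m in MODES:
    print(m, hex(crc_words(odd, m)))
```

Neither check says which mode the firmware uses; only a match against a CRC stored in a real block can settle that.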
Recommended plan
- Re-run your NvCache blocks with tail_mode="msb_first" (the default). If the computed CRCs match the stored ones, you are done.
- If a specific block still fails, try tail_mode="pad_low" on that one block. It is rare but it does appear in a couple of firmware families where the record length is stored in 32-bit words and the CRC unit flushes after the last partial word.
- If neither matches, either the ROM dump has corruption in that NvCache region, or the NvCache uses a different CRC polynomial or seed than the main ROM section. It would not be unheard of: Seagate has been known to use CRC-16/CCITT (poly 0x1021) for some firmware structures and CRC-16/ARC for others. In that case, pull the CRC routine out of the ROM (look for references to a 256-entry table near known init code) and confirm the polynomial directly in the disassembly rather than guessing.
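If it comes to the last step of the plan, a quick way to screen candidates before opening the disassembler is to run a few common catalogue CRC-16 variants over the same byte stream. This is only a sketch: which 0x1021 variant (if any) the firmware uses is an assumption, the function names are made up for illustration, and the stream passed in must already be in the order the firmware feeds it.

```python
def crc16_reflected(stream, poly=0xA001, init=0):
    """LSB-first CRC-16 (CRC-16/ARC with the default parameters)."""
    crc = init
    for byte in stream:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc


def crc16_msb_first(stream, poly=0x1021, init=0xFFFF):
    """MSB-first CRC-16 (CRC-16/CCITT-FALSE with the default parameters)."""
    crc = init
    for byte in stream:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc


def candidates(stream):
    """Map a label to the CRC that variant produces for `stream`."""
    return {
        "CRC-16/ARC (0xA001 reflected, init 0)": crc16_reflected(stream),
        "CRC-16/XMODEM (0x1021, init 0)": crc16_msb_first(stream, init=0),
        "CRC-16/CCITT-FALSE (0x1021, init 0xFFFF)": crc16_msb_first(stream),
    }


if __name__ == "__main__":
    # Published check values for "123456789": 0xbb3d, 0x31c3, 0x29b1.
    for name, value in candidates(b"123456789").items():
        print("%-45s %s" % (name, hex(value)))
```

Compare each value with the CRC stored in the block; a match with one of the 0x1021 variants would be a strong hint, but the disassembly remains the authoritative source.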
One more suggestion: since the ROM is available, the CRC table should be locatable inside the image itself. Searching for the little-endian byte pattern 00 00 00 00 C1 C0 00 00 81 C1 00 00 40 01 00 00 (the first four 32-bit-padded entries of the table) will find it instantly if it is there in the same form. If the table is stored as packed 16-bit entries instead, the same four entries read 00 00 C1 C0 81 C1 40 01. If the routine references a table with different contents, that alone tells you the NvCache uses a different CRC.
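A minimal sketch of that search. It regenerates the table from the polynomial and looks for it in two layouts, 32-bit and 16-bit little-endian entries. Both layouts are guesses about how the firmware might store it, and find_table is an illustrative helper. The demo uses a synthetic image, not a real dump.

```python
import struct


def make_table(poly=0xA001):
    """Build the 256-entry reflected CRC-16 table from the polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


def find_table(image):
    """Return the offset of the full table per layout, or -1 when absent."""
    table = make_table()
    patterns = {
        "32-bit LE entries": struct.pack("<256I", *table),
        "16-bit LE entries": struct.pack("<256H", *table),
    }
    return {name: image.find(pattern) for name, pattern in patterns.items()}


if __name__ == "__main__":
    # Synthetic image: filler, then the table as 16-bit entries, then filler.
    # For a real dump, read the file instead, e.g. open(path, "rb").read().
    image = b"\xFF" * 32 + struct.pack("<256H", *make_table()) + b"\xFF" * 8
    print(find_table(image))  # {'32-bit LE entries': -1, '16-bit LE entries': 32}
```

Matching all 256 entries is stricter than matching the first four, so a hit here is hard to explain as coincidence. If nothing is found, searching for only the first 16 bytes of each pattern is a reasonable fallback.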
[hr]
Technical analysis completely drafted by Claude.ai Opus 4.7 (Anthropic). In short, I didn't write a single character myself; if it works, this is just the surface of the power you can get by combining your skills with the support of contemporary AI LLMs. Opus 4.7 is only available with a PRO plan, which in Italy costs 19,00 euro per month.