RCTF 2025 RE writeup

chaos

RCTF{AntiDbg_KeyM0d_2025_R3v3rs3}

chaos2

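The binary contains junk instructions (花指令).

I patched all of them out.

Take care when patching.

For example, in main:

Not only the part in the red box needs to be NOPed; the part in the green box does too.

After that, fix the function boundaries (ask me if you don't know how), then U, C, P in IDA (undefine, convert to code, create function).

With the logic recovered, it turns out to be RC4.

main calls EnumUILanguagesA to trigger all the anti-debug callbacks, then uses rc4_init / rc4_apply to decrypt a 128-byte ciphertext table and print the flag.

The function at 0x401440 searches the module for the marker 0x12345678; once found, it records the address of the 0x80 bytes that follow it (0x402818). The RC4 key is stored there, initially flag:{Th1sflaglsG00ds}.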
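The marker scan is easy to reproduce offline on a dumped image. The following is a minimal sketch, not the binary's actual code: `find_key_region` is a hypothetical helper, and it assumes the marker is stored as a little-endian dword and that the 0x80-byte key region starts immediately after it. The demo image is synthetic.

```python
MARKER = (0x12345678).to_bytes(4, "little")  # assumption: little-endian dword
KEY_LEN = 0x80


def find_key_region(image: bytes):
    """Return (offset, key_bytes) of the region following the marker, or None."""
    pos = image.find(MARKER)
    if pos < 0:
        return None
    start = pos + len(MARKER)  # assumption: key directly follows the marker
    return start, image[start:start + KEY_LEN]


# Synthetic stand-in for the module image: padding, marker, key, trailing bytes.
initial_key = b"flag:{Th1sflaglsG00ds}".ljust(KEY_LEN, b"\x00")
demo_image = b"\x00" * 16 + MARKER + initial_key + b"\xcc" * 8

key_offset, key_region = find_key_region(demo_image)
print(hex(key_offset), key_region[:22])
```

On the real binary the returned offset would still have to be converted to a virtual address before comparing it with 0x402818.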

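Each anti-debug function, when no debugger is detected, corrects one byte of the key so that it finally becomes flag:{ThisflagIsGoods}:

check_PEB_BeingDebugged (0x401090) changes key[8] from '1' to 'i'.
check_ProcessDebugPort (0x401200) calls NtQueryInformationProcess with class 7; if it returns 0, it sets key[14] to 'I'.
exception_key_patch (0x4012a0) uses NtClose to raise an exception and writes 'o' to key[17] inside the SEH handler.
check_ProcessDebugFlags (0x4013a0) queries class 31; if it returns 1, it writes 'o' to key[18].

Once these four patches have succeeded, rc4_init runs a standard RC4 KSA with the 128-byte key, and rc4_apply XORs the ciphertext to yield RCTF{AntiDbg_Reversing_2025_v2.0_Ch4llenge}.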
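As a quick check of the four byte patches described above, the small sketch below applies them to the initial key string. The indices and characters come from the analysis; the variable names are only illustrative.

```python
initial = bytearray(b"flag:{Th1sflaglsG00ds}")

# index -> corrected character, one entry per anti-debug callback
patches = {8: "i", 14: "I", 17: "o", 18: "o"}
for index, ch in patches.items():
    initial[index] = ord(ch)

patched = bytes(initial)
print(patched.decode())  # flag:{ThisflagIsGoods}
```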

exp

def rc4_ksa(key_bytes):
    # Key scheduling: start from the identity permutation and mix in the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key_bytes[i % len(key_bytes)]) & 0xFF
        s[i], s[j] = s[j], s[i]  # swap S[i] and S[j]
    return s

def rc4_prga(state, data) -> bytes:
    # Keystream generation: XOR every input byte with the next keystream byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        k = state[(state[i] + state[j]) & 0xFF]
        out.append(byte ^ k)
    return bytes(out)

cipher = bytes([
  15, 26, 138, 90, 34, 171, 30, 99, 25, 90, 135, 242, 230, 233, 215, 209,
  151, 249, 248, 50, 91, 222, 45, 214, 163, 79, 126, 203, 97, 178, 63, 191,
  183, 27, 10, 132, 179, 180, 222, 3, 70, 123, 131, 240, 196, 179, 171, 123,
  41, 188, 31, 254, 138, 121, 38, 218, 8, 1, 133, 102, 125, 187, 238, 15,
  137, 89, 212, 95, 172, 24, 174, 11, 78, 240, 183, 5, 92, 129, 4, 159,
  164, 28, 93, 160, 185, 7, 146, 92, 138, 83, 243, 255, 247, 167, 221, 46,
  230, 237, 15, 119, 44, 74, 34, 241, 54, 79, 167, 238, 13, 214, 4, 115,
  85, 94, 62, 147, 164, 52, 41, 103, 252, 35, 121, 25, 216, 201, 43, 207
])

key = bytearray(b"flag:{Th1sflaglsG00ds}" + b"\x00" * (128 - 22))
key[8] = ord('i')
key[14] = ord('I')
key[17] = key[18] = ord('o')

state = rc4_ksa(bytes(key))
flag = rc4_prga(state, cipher).rstrip(b"\x00")
print(flag.decode())

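Onion [post-contest]

I did this one after the contest, and it is actually not a hard challenge. During the contest I was mostly away at R3con, and when I got back I really didn't feel like doing it (I hadn't looked closely and assumed every layer used a different encryption algorithm 😂).

sub_171D0 contains the logic that interprets the opcodes.

Just peel it like an onion. There are a few ways to do that:

Brute effort: reverse every layer by hand.
Write an automated script, since the layers follow a fixed pattern rather than each using a different algorithm.

sub_171D0 loads full_vmcode, runs a custom VM inside a 0x10000-byte buffer, and writes the 50 user-supplied 64-bit keys into VM memory starting at 0xE000.

0x0000 - 0xDFFF : VM bytecode

0xE000 - 0xE18F (400 bytes): 50 user-supplied 64-bit keys

0xE190 - 0xFFFF : stack space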
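The layout above can be mirrored in a few lines when emulating the VM. This is a minimal sketch under stated assumptions: `load_keys` is a hypothetical helper, and the keys are assumed to be stored little-endian (consistent with the VM running on x86 and with the little-endian immediates in the bytecode).

```python
import struct

VM_SIZE = 0x10000
KEY_BASE = 0xE000
KEY_COUNT = 50
MASK64 = 0xFFFFFFFFFFFFFFFF


def load_keys(mem: bytearray, keys) -> None:
    """Write the 50 user keys to 0xE000 + i * 8 (little-endian is an assumption)."""
    if len(keys) != KEY_COUNT:
        raise ValueError(f"expected {KEY_COUNT} keys, got {len(keys)}")
    for i, key in enumerate(keys):
        struct.pack_into("<Q", mem, KEY_BASE + i * 8, key & MASK64)


mem = bytearray(VM_SIZE)
keys = [0] * KEY_COUNT
keys[2] = 0xA28F38BD0463522C  # the key the outermost layer reads from 0xE010
load_keys(mem, keys)

print(hex(KEY_BASE + KEY_COUNT * 8))  # first byte after the key area
```

The key area ends exactly where the stack region of the table begins, which is a handy consistency check on the layout.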

These are the opcode semantics I identified:

opcode	Semantics (64-bit registers r0~r7, 16-bit pointer1/pointer2, stack pointer stack_ptr)
0x00	NOP
0x01 target	Unconditional jump: pc = target
0x02 target	Jump when flag == 0 ("if not equal")
0x03 target	Jump when flag != 0 ("if equal")
0x11 addr	ptr1 = addr (16-bit)
0x12 addr	ptr2 = addr
0x15 r	r = qword[ptr1]; flag = (r==0)
0x16 r, imm	r = imm; flag = (r==0)
0x17 dst, src	dst = src; flag = (dst==0)
0x18 r, +off	r = qword[(ptr1 + off) mod 0x10000]; flag = (r==0)
0x19 r	qword[ptr1] = r
0x1A dst, offReg	Load byte: dst = byte[(ptr1 + (offReg & 0xFFFF))]; flag = (dst==0)
0x1B dst, offReg	Store byte: byte[(ptr1 + (offReg & 0xFFFF))] = dst & 0xFF
0x1C r	r = (r + 1) & MASK64; flag = (r==0)
0x1D r	r = (r - 1) & MASK64; flag = (r==0)
0x1E r, sh	r >>= sh; flag = (r==0)
0x1F r	ptr1 = (ptr1 + (r & 0xFFFF)) & 0xFFFF
0x25 dst, src	dst &= src; flag = (dst==0)
0x26 dst, src	dst ^= src; flag = (dst==src)?
0x27 r, sh	r <<= sh; flag = (r==0)
0x29 r, imm	tmp=r; r ^= imm; flag = (tmp==imm)
0x2A r, imm	r &= imm; flag = (r==0)
0x2B dst, offReg	Load byte: dst = byte[(ptr2 + (offReg & 0xFFFF))]
0x2C dst, offReg	Store byte: byte[(ptr2 + (offReg & 0xFFFF))] = dst & 0xFF
0x32 r, imm	Compare: flag = (r == imm)
0x80	"Save PC" instruction, used by the following 0x81/0x82
0x81 idx	call_table[idx] = saved_pc + 3
0x82 idx	Call: push the return address, pc = call_table[idx]
0x83	RET: pop the return address from the stack
0x84 r	Push r onto the VM stack (stack_ptr -= 8)
0x85 r	Pop from the VM stack into r
0x90 byte	Output a character (used when printing the flag)
0xFF	HALT (stop and return some register value)
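To sanity-check the table, a handful of these opcodes can be modelled directly. This is a hedged sketch, not the logic of sub_171D0: `run` is a hypothetical helper covering only 0x01, 0x03, 0x16, 0x29, 0x32 and 0xFF, and it assumes one byte per register operand, little-endian 16-bit jump targets and little-endian 64-bit immediates.

```python
def run(code: bytes, max_steps: int = 10_000):
    """Interpret a tiny subset of the VM and return the register file at HALT."""
    regs = [0] * 8
    flag = False
    pc = 0
    for _ in range(max_steps):
        op = code[pc]
        if op in (0x16, 0x29, 0x32):              # opcode, reg, imm64
            r = code[pc + 1]
            imm = int.from_bytes(code[pc + 2:pc + 10], "little")
            if op == 0x16:                        # mov_imm: flag = (r == 0)
                regs[r] = imm
                flag = regs[r] == 0
            elif op == 0x29:                      # xor_imm: flag = (old r == imm)
                flag = regs[r] == imm
                regs[r] ^= imm
            else:                                 # cmp_imm: flag = (r == imm)
                flag = regs[r] == imm
            pc += 10
        elif op == 0x01:                          # jmp target
            pc = int.from_bytes(code[pc + 1:pc + 3], "little")
        elif op == 0x03:                          # je: taken when flag != 0
            target = int.from_bytes(code[pc + 1:pc + 3], "little")
            pc = target if flag else pc + 3
        elif op == 0xFF:                          # halt
            return regs
        else:
            raise ValueError(f"opcode 0x{op:02x} not modelled in this sketch")
    raise RuntimeError("step limit reached")


def imm64(value: int) -> bytes:
    return value.to_bytes(8, "little")


program = (
    bytes([0x16, 0]) + imm64(5)                   # 0x00: mov_imm r0, 5
    + bytes([0x29, 0]) + imm64(3)                 # 0x0a: xor_imm r0, 3
    + bytes([0x32, 0]) + imm64(6)                 # 0x14: cmp_imm r0, 6
    + bytes([0x03]) + (0x2C).to_bytes(2, "little")  # 0x1e: je 0x2c
    + bytes([0x16, 1]) + imm64(0xBAD)             # 0x21: mov_imm r1, 0xBAD
    + bytes([0xFF])                               # 0x2b: halt
    + bytes([0x16, 1]) + imm64(1)                 # 0x2c: mov_imm r1, 1
    + bytes([0xFF])                               # 0x36: halt
)
result = run(program)
print(result[0], result[1])
```

The demo program ends with r0 = 6 and r1 = 1, because the comparison sets the flag and the je branch skips the 0xBAD path.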

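Next, write a small script that translates the first layer's opcodes into asm. To recap what sub_171D0 does:

1. Read full_vmcode and load it into a 0x10000-byte buffer.
2. Prompt "Enter 50 64-bit keys" and write the 50 user-supplied 64-bit values into VM memory; each key is stored at 0xE000 + i * 8.
3. Initialize the VM (eight 64-bit registers, two 16-bit data pointers, the stack / call table, the condition flag, etc.), then run the interpreter with full_vmcode as the instruction area.

Based on this understanding, I had an AI write a rough translation script: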

from __future__ import annotations
from pathlib import Path
import argparse

OPCODES = {
    0x00: ("nop", []),
    0x01: ("jmp", ["addr16"]),
    0x02: ("jne", ["addr16"]),
    0x03: ("je", ["addr16"]),
    0x11: ("setp1", ["imm16"]),
    0x12: ("setp2", ["imm16"]),
    0x15: ("loadp1", ["reg"]),
    0x16: ("mov_imm", ["reg", "imm64"]),
    0x17: ("mov", ["reg", "reg"]),
    0x18: ("loadp1_off", ["reg", "imm16"]),
    0x19: ("storep1", ["reg"]),
    0x1A: ("loadp1_idx8", ["reg", "reg"]),
    0x1B: ("storep1_idx8", ["reg", "reg"]),
    0x1C: ("inc", ["reg"]),
    0x1D: ("dec", ["reg"]),
    0x1E: ("shr", ["reg", "imm8"]),
    0x1F: ("addp1_reg", ["reg"]),
    0x25: ("and", ["reg", "reg"]),
    0x26: ("xor", ["reg", "reg"]),
    0x27: ("shl", ["reg", "imm8"]),
    0x29: ("xor_imm", ["reg", "imm64"]),
    0x2A: ("and_imm", ["reg", "imm64"]),
    0x2B: ("loadp2_idx8", ["reg", "reg"]),
    0x2C: ("storep2_idx8", ["reg", "reg"]),
    0x32: ("cmp_imm", ["reg", "imm64"]),
    0x80: ("save_pc", []),
    0x81: ("store_call", ["imm8"]),
    0x82: ("call", ["imm8"]),
    0x83: ("ret", []),
    0x84: ("push", ["reg"]),
    0x85: ("pop", ["reg"]),
    0x90: ("print_byte", ["imm8"]),
    0xFF: ("halt", []),
}

REGS = {i: f"r{i}" for i in range(8)}


def read_u16(buf: bytes, off: int) -> tuple[int, int]:
    return int.from_bytes(buf[off:off+2], "little"), off + 2


def read_u64(buf: bytes, off: int) -> tuple[int, int]:
    return int.from_bytes(buf[off:off+8], "little"), off + 8


def disasm(buf: bytes, start: int, end: int | None = None):
    pc = start
    limit = len(buf) if end is None else min(len(buf), end)
    while pc < limit:
        opcode = buf[pc]
        mnemonic, operands = OPCODES.get(opcode, (None, None))
        cur = pc + 1
        if mnemonic is None:
            # Unknown opcode: emit it as a raw byte and keep decoding.
            print(f"{pc:04x}: .byte 0x{opcode:02x}")
            pc += 1
            continue
        values: list[str] = []
        for operand in operands:
            if operand == "addr16" or operand == "imm16":
                val, cur = read_u16(buf, cur)
                values.append(f"0x{val:04x}")
            elif operand == "imm8":
                val = buf[cur]
                cur += 1
                values.append(f"0x{val:02x}")
            elif operand == "imm64":
                val, cur = read_u64(buf, cur)
                values.append(f"0x{val:016x}")
            elif operand == "reg":
                val = buf[cur]
                cur += 1
                values.append(REGS.get(val, f"r{val}"))
            else:
                raise ValueError(f"unknown operand kind {operand}")
        ops = ", ".join(values)
        spacing = " " if ops else ""
        print(f"{pc:04x}: {mnemonic}{spacing}{ops}")
        pc = cur


def main():
    parser = argparse.ArgumentParser(description="Disassemble VM bytecode range")
    parser.add_argument("start", type=lambda x: int(x, 0))
    parser.add_argument("end", nargs="?", type=lambda x: int(x, 0))
    parser.add_argument("file", nargs="?", default="full_vmcode")
    args = parser.parse_args()
    buf = Path(args.file).read_bytes()
    disasm(buf, args.start, args.end)


if __name__ == "__main__":
    main()

python3 disasm_vm.py 0x5a9 0x6f4

05a9: setp1 0xe000
05ac: loadp1_off r0, 0x0010
05b0: mov r5, r0
05b3: push r5
05b5: mov_imm r5, 0x48f0e6421ac66dea
05bf: xor_imm r5, 0xffffffffffffffff
05c9: push r6
05cb: push r7
05cd: mov_imm r6, 0x0000000000000001
05d7: mov r7, r5
05da: and r7, r6
05dd: xor r5, r6
05e0: cmp_imm r7, 0x0000000000000000
05ea: je 0x05f6
05ed: shl r7, 0x01
05f0: mov r6, r7
05f3: jmp 0x05d7
05f6: pop r7
05f8: pop r6
05fa: push r6
05fc: push r7
05fe: mov r6, r5
0601: mov r7, r0
0604: and r7, r6
0607: xor r0, r6
060a: cmp_imm r7, 0x0000000000000000
0614: je 0x0620
0617: shl r7, 0x01
061a: mov r6, r7
061d: jmp 0x0601
0620: and_imm r0, 0xffffffffffffffff
062a: pop r7
062c: pop r6
062e: and_imm r0, 0xffffffffffffffff
0638: pop r5
063a: xor_imm r0, 0x5074d85b9194e696
0644: push r5
0646: mov_imm r5, 0x5566488c9c5cf234
0650: xor_imm r5, 0xffffffffffffffff
065a: push r6
065c: push r7
065e: mov_imm r6, 0x0000000000000001
0668: mov r7, r5
066b: and r7, r6
066e: xor r5, r6
0671: cmp_imm r7, 0x0000000000000000
067b: je 0x0687
067e: shl r7, 0x01
0681: mov r6, r7
0684: jmp 0x0668
0687: pop r7
0689: pop r6
068b: push r6
068d: push r7
068f: mov r6, r5
0692: mov r7, r0
0695: and r7, r6
0698: xor r0, r6
069b: cmp_imm r7, 0x0000000000000000
06a5: je 0x06b1
06a8: shl r7, 0x01
06ab: mov r6, r7
06ae: jmp 0x0692
06b1: and_imm r0, 0xffffffffffffffff
06bb: pop r7
06bd: pop r6
06bf: and_imm r0, 0xffffffffffffffff
06c9: pop r5
06cb: xor_imm r0, 0x8cb331163a92fc19
06d5: setp1 0x7200
06d8: mov_imm r1, 0x36b1cc9fe433713d
06e2: storep1 r1
06e4: push r0
06e6: mov_imm r0, 0x0000000000000008
06f0: addp1_reg r0
06f2: pop r0

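The logic in pseudo-C (written as the inverse, i.e. recovering the key from the value that is fed to Speck):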

uint64_t inverse_transform(uint64_t y) {
    const uint64_t C1 = 0x48f0e6421ac66deaULL;
    const uint64_t K1 = 0x5074d85b9194e696ULL;
    const uint64_t C2 = 0x5566488c9c5cf234ULL;
    const uint64_t K2 = 0x8cb331163a92fc19ULL;
    const uint64_t MASK = 0xFFFFFFFFFFFFFFFFULL;

    uint64_t x3 = (y ^ K2);
    uint64_t x2 = (x3 + C2) & MASK;
    uint64_t x1 = (x2 ^ K1);
    uint64_t x  = (x1 + C1) & MASK;

    return x;
}
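The repeated and / xor / shl loops in the listing are a carry-propagation adder, and the `xor_imm r5, 0xffffffffffffffff` followed by adding 1 is a two's-complement negation, which is why the constants end up being subtracted. The sketch below mirrors that gadget; `vm_add` and `vm_sub_const` are illustrative names, and masking the shifted carry to 64 bits is an assumption about how the VM's shl behaves.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def vm_add(a: int, b: int) -> int:
    """Add two 64-bit values the way the gadget does: xor for the sum, and+shl for the carry."""
    while True:
        carry = a & b          # and r7, r6
        a ^= b                 # xor r0, r6
        if carry == 0:         # cmp_imm r7, 0 / je
            return a & MASK64
        b = (carry << 1) & MASK64  # shl r7, 1 / mov r6, r7


def vm_sub_const(x: int, c: int) -> int:
    """x - c, built as x + (~c + 1) exactly like the 'sub gadget'."""
    neg_c = vm_add(c ^ MASK64, 1)  # xor_imm r5, 0xff..ff, then add 1
    return vm_add(x, neg_c)


key = 0xA28F38BD0463522C
c1 = 0x48F0E6421AC66DEA
print(hex(vm_sub_const(key, c1)))
```

Once the gadget is recognised, every such block can be replaced by a plain `(x - C) & MASK64` or `(x + C) & MASK64` when modelling a layer.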

Dumping the whole range directly:

python3 disasm_vm.py 0x02be 0x0715

02be: push r0
02c0: push r1
02c2: push r2
02c4: push r3
02c6: push r4
02c8: push r5
02ca: push r6
02cc: push r7
02ce: mov r1, r0
02d1: shr r1, 0x20
02d4: and_imm r0, 0x00000000ffffffff
02de: loadp1 r2
02e0: push r0
02e2: mov_imm r0, 0x0000000000000008
02ec: addp1_reg r0
02ee: pop r0
02f0: loadp1 r3
02f2: push r0
02f4: mov_imm r0, 0x0000000000000008
02fe: push r5
0300: mov_imm r5, 0x0000000000000010
030a: xor_imm r5, 0xffffffffffffffff
0314: push r6
0316: push r7
0318: mov_imm r6, 0x0000000000000001
0322: mov r7, r5
0325: and r7, r6
0328: xor r5, r6
032b: cmp_imm r7, 0x0000000000000000
0335: je 0x0341
0338: shl r7, 0x01
033b: mov r6, r7
033e: jmp 0x0322
0341: pop r7
0343: pop r6
0345: push r6
0347: push r7
0349: mov r6, r5
034c: mov r7, r0
034f: and r7, r6
0352: xor r0, r6
0355: cmp_imm r7, 0x0000000000000000
035f: je 0x036b
0362: shl r7, 0x01
0365: mov r6, r7
0368: jmp 0x034c
036b: and_imm r0, 0xffffffffffffffff
0375: pop r7
0377: pop r6
0379: and_imm r0, 0xffffffffffffffff
0383: pop r5
0385: addp1_reg r0
0387: pop r0
0389: mov r4, r2
038c: mov r5, r3
038f: and_imm r2, 0x00000000ffffffff
0399: shr r4, 0x20
039c: and_imm r3, 0x00000000ffffffff
03a6: shr r5, 0x20
03a9: push r7
03ab: mov r7, r3
03ae: mov r3, r4
03b1: mov r4, r7
03b4: pop r7
03b6: mov_imm r6, 0x0000000000000000
03c0: push r7
03c2: mov r7, r0
03c5: push r6
03c7: push r7
03c9: mov r6, r7
03cc: mov r7, r7
03cf: shr r6, 0x08
03d2: shl r7, 0x18
03d5: xor r6, r7
03d8: and_imm r6, 0x00000000ffffffff
03e2: pop r7
03e4: mov r7, r6
03e7: pop r6
03e9: mov r0, r7
03ec: push r6
03ee: push r7
03f0: mov r6, r1
03f3: mov r7, r0
03f6: and r7, r6
03f9: xor r0, r6
03fc: cmp_imm r7, 0x0000000000000000
0406: je 0x0412
0409: shl r7, 0x01
040c: mov r6, r7
040f: jmp 0x03f3
0412: and_imm r0, 0xffffffffffffffff
041c: pop r7
041e: pop r6
0420: and_imm r0, 0x00000000ffffffff
042a: xor r0, r2
042d: and_imm r0, 0x00000000ffffffff
0437: mov r7, r1
043a: push r6
043c: push r7
043e: mov r6, r7
0441: mov r7, r7
0444: shl r6, 0x03
0447: shr r7, 0x1d
044a: xor r6, r7
044d: and_imm r6, 0x00000000ffffffff
0457: pop r7
0459: mov r7, r6
045c: pop r6
045e: mov r1, r7
0461: xor r1, r0
0464: and_imm r1, 0x00000000ffffffff
046e: pop r7
0470: cmp_imm r6, 0x000000000000001a
047a: je 0x054a
047d: push r0
047f: push r1
0481: mov r0, r3
0484: mov r1, r2
0487: mov r2, r6
048a: push r7
048c: mov r7, r0
048f: push r6
0491: push r7
0493: mov r6, r7
0496: mov r7, r7
0499: shr r6, 0x08
049c: shl r7, 0x18
049f: xor r6, r7
04a2: and_imm r6, 0x00000000ffffffff
04ac: pop r7
04ae: mov r7, r6
04b1: pop r6
04b3: mov r0, r7
04b6: push r6
04b8: push r7
04ba: mov r6, r1
04bd: mov r7, r0
04c0: and r7, r6
04c3: xor r0, r6
04c6: cmp_imm r7, 0x0000000000000000
04d0: je 0x04dc
04d3: shl r7, 0x01
04d6: mov r6, r7
04d9: jmp 0x04bd
04dc: and_imm r0, 0xffffffffffffffff
04e6: pop r7
04e8: pop r6
04ea: and_imm r0, 0x00000000ffffffff
04f4: xor r0, r2
04f7: and_imm r0, 0x00000000ffffffff
0501: mov r7, r1
0504: push r6
0506: push r7
0508: mov r6, r7
050b: mov r7, r7
050e: shl r6, 0x03
0511: shr r7, 0x1d
0514: xor r6, r7
0517: and_imm r6, 0x00000000ffffffff
0521: pop r7
0523: mov r7, r6
0526: pop r6
0528: mov r1, r7
052b: xor r1, r0
052e: and_imm r1, 0x00000000ffffffff
0538: pop r7
053a: mov r2, r1
053d: mov r3, r4
0540: mov r4, r5
0543: mov r5, r0
0546: pop r1
0548: pop r0
054a: inc r6
054c: cmp_imm r6, 0x000000000000001b
0556: je 0x055c
0559: jmp 0x03c0
055c: shl r1, 0x20
055f: push r6
0561: push r7
0563: mov r6, r1
0566: mov r7, r0
0569: and r7, r6
056c: xor r0, r6
056f: cmp_imm r7, 0x0000000000000000
0579: je 0x0585
057c: shl r7, 0x01
057f: mov r6, r7
0582: jmp 0x0566
0585: and_imm r0, 0xffffffffffffffff
058f: pop r7
0591: pop r6
0593: pop r7
0595: pop r6
0597: pop r5
0599: pop r4
059b: pop r3
059d: pop r2
059f: pop r1
05a1: pop r6
05a3: ret
05a4: store_call 0x20
05a6: jmp 0x0003
05a9: setp1 0xe000
05ac: loadp1_off r0, 0x0010
05b0: mov r5, r0
05b3: push r5
05b5: mov_imm r5, 0x48f0e6421ac66dea
05bf: xor_imm r5, 0xffffffffffffffff
05c9: push r6
05cb: push r7
05cd: mov_imm r6, 0x0000000000000001
05d7: mov r7, r5
05da: and r7, r6
05dd: xor r5, r6
05e0: cmp_imm r7, 0x0000000000000000
05ea: je 0x05f6
05ed: shl r7, 0x01
05f0: mov r6, r7
05f3: jmp 0x05d7
05f6: pop r7
05f8: pop r6
05fa: push r6
05fc: push r7
05fe: mov r6, r5
0601: mov r7, r0
0604: and r7, r6
0607: xor r0, r6
060a: cmp_imm r7, 0x0000000000000000
0614: je 0x0620
0617: shl r7, 0x01
061a: mov r6, r7
061d: jmp 0x0601
0620: and_imm r0, 0xffffffffffffffff
062a: pop r7
062c: pop r6
062e: and_imm r0, 0xffffffffffffffff
0638: pop r5
063a: xor_imm r0, 0x5074d85b9194e696
0644: push r5
0646: mov_imm r5, 0x5566488c9c5cf234
0650: xor_imm r5, 0xffffffffffffffff
065a: push r6
065c: push r7
065e: mov_imm r6, 0x0000000000000001
0668: mov r7, r5
066b: and r7, r6
066e: xor r5, r6
0671: cmp_imm r7, 0x0000000000000000
067b: je 0x0687
067e: shl r7, 0x01
0681: mov r6, r7
0684: jmp 0x0668
0687: pop r7
0689: pop r6
068b: push r6
068d: push r7
068f: mov r6, r5
0692: mov r7, r0
0695: and r7, r6
0698: xor r0, r6
069b: cmp_imm r7, 0x0000000000000000
06a5: je 0x06b1
06a8: shl r7, 0x01
06ab: mov r6, r7
06ae: jmp 0x0692
06b1: and_imm r0, 0xffffffffffffffff
06bb: pop r7
06bd: pop r6
06bf: and_imm r0, 0xffffffffffffffff
06c9: pop r5
06cb: xor_imm r0, 0x8cb331163a92fc19
06d5: setp1 0x7200
06d8: mov_imm r1, 0x36b1cc9fe433713d
06e2: storep1 r1
06e4: push r0
06e6: mov_imm r0, 0x0000000000000008
06f0: addp1_reg r0
06f2: pop r0
06f4: mov_imm r1, 0xf97646d69c84ebd8
06fe: storep1 r1
0700: setp1 0x7200
0703: call 0x20
0705: cmp_imm r0, 0xda19ba6b81c83f61
070f: je 0x0715
0712: call 0x01
0714: halt

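I dumped everything together here.

Strictly speaking, it should be

python3 disasm_vm.py 0x02be 0x05a4, i.e. the bytecode that call-table entry 0x20 points to (starting at full_vmcode:0x02be and ending at 0x05a3); this is the complete callee of that call 0x20.

The "first pass" consists of two pieces: 1) the entry code at full_vmcode:0x05a9–0x0714 (reproducible with python3 disasm_vm.py 0x5a9 0x714), which fetches the key from 0xE010, applies two invertible add/XOR scramblings, performs the Speck call and compares the result; 2) the code behind call 0x20, i.e. the large block of nested loops at full_vmcode:0x02be–0x05a3 in layer1_call20.asm (a Speck64/128 implementation). Put the two pieces together and you have the complete instruction stream of the "first pass".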

Solving the first layer's key

#!/usr/bin/env python3
"""Model the outermost VM layer and recover the first 64-bit key.

The layer performs a couple of invertible 64-bit add/xor scramblings on the
user-controlled key and then runs a 27-round Speck64/128 encryption on the
result with a fixed 128-bit key.  The Speck output must equal the constant
`0xDA19BA6B81C83F61`, so we can invert the whole chain and recover the key.
"""

from __future__ import annotations

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
ROUNDS = 27

# Constants pulled directly from the first-layer bytecode near 0x05b0.
CONST_SUB_1 = 0x48F0_E642_1AC6_6DEA
CONST_SUB_2 = 0x5566_488C_9C5C_F234
CONST_XOR_1 = 0x5074_D85B_9194_E696
CONST_XOR_2 = 0x8CB3_3116_3A92_FC19

# Speck key material written to VM memory at 0x7200 before the call.
SPECK_KEY_WORDS = [
    0xE433_713D,
    0x36B1_CC9F,
    0x9C84_EBD8,
    0xF976_46D6,
]

# Ciphertext that the VM compares against after the Speck call.
TARGET_CT = 0xDA19_BA6B_81C8_3F61


def _rol32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def _ror32(x: int, r: int) -> int:
    return ((x >> r) | ((x << (32 - r)) & MASK32)) & MASK32


def speck_round_keys(key_words: list[int]) -> list[int]:
    """Generate 27 round keys for Speck64/128 (matches the VM's key schedule)."""

    l = list(key_words[1:])
    k = key_words[0]
    keys: list[int] = []
    for i in range(ROUNDS):
        keys.append(k)
        idx = i % (len(key_words) - 1)
        val = (_ror32(l[idx], 8) + k) & MASK32
        val ^= i
        l[idx] = val
        k = (_rol32(k, 3) ^ val) & MASK32
    return keys


ROUND_KEYS = speck_round_keys(SPECK_KEY_WORDS)


def speck_encrypt_block(block: int) -> int:
    x = block & MASK32
    y = (block >> 32) & MASK32
    for k in ROUND_KEYS:
        x = (_ror32(x, 8) + y) & MASK32
        x ^= k
        y = _rol32(y, 3) ^ x
    return (y << 32) | x


def speck_decrypt_block(block: int) -> int:
    x = block & MASK32
    y = (block >> 32) & MASK32
    for k in reversed(ROUND_KEYS):
        y = _ror32(y ^ x, 3)
        x ^= k
        x = _rol32((x - y) & MASK32, 8)
    return (y << 32) | x


def layer1_forward(user_key: int) -> int:
    """Exact logic executed by the first layer before the comparison."""

    state = (user_key - CONST_SUB_1) & MASK64
    state ^= CONST_XOR_1
    state = (state - CONST_SUB_2) & MASK64
    state ^= CONST_XOR_2
    return speck_encrypt_block(state)


def solve_layer1() -> int:
    """Invert the layer to retrieve the 64-bit key expected by the VM."""

    state = speck_decrypt_block(TARGET_CT)
    state ^= CONST_XOR_2
    state = (state + CONST_SUB_2) & MASK64
    state ^= CONST_XOR_1
    state = (state + CONST_SUB_1) & MASK64
    return state


if __name__ == "__main__":
    key = solve_layer1()
    assert layer1_forward(key) == TARGET_CT
    print(f"first key = 0x{key:016x}")

Running it yields the third key (key[2], the one read from 0xE010):

0xa28f38bd0463522c

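Some explanations:

Locating the first-layer fragment

Before the real logic runs, the VM builds its call table with 0x80 / 0x81 idx / 0x03, each entry looking like … 80 01 A4 05 … 81 20 …. The range in python3 disasm_vm.py 0x5a9 0x6f4 was found by searching for 0x81 0x20, which gives the offset where call index 0x20 is registered, full_vmcode:0x05a4. Starting right after it, at setp1 0xe000 (full_vmcode:0x05a9), is exactly the body of the outermost "fetch the third key and process it" logic. There you can see the read from 0xE010, a run of mov_imm / xor_imm / "bit-by-bit addition" loops, and then the writes to 0x7200 and 0x7208 followed by call 0x20.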
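The call-table pattern described above can be searched for mechanically. The sketch below is an illustration only: `find_call_table` is a hypothetical helper, and it assumes every registration has the shape seen in the listing (save_pc, a jmp over the function body, and a store_call idx at the jump target), so the function body starts 4 bytes after the save_pc opcode. A raw byte scan can produce false positives inside immediates, so the results should be checked against the disassembly.

```python
def find_call_table(buf: bytes) -> dict:
    """Map call index -> function entry for `80 01 lo hi ... <target>: 81 idx` patterns."""
    table = {}
    for off in range(len(buf) - 3):
        if buf[off] != 0x80 or buf[off + 1] != 0x01:   # save_pc; jmp
            continue
        target = int.from_bytes(buf[off + 2:off + 4], "little")
        if target + 1 < len(buf) and buf[target] == 0x81:  # store_call idx
            table[buf[target + 1]] = off + 4           # body follows the jmp
    return table


# Synthetic buffer reproducing the bytes quoted above for call index 0x20.
demo = bytearray(0x600)
demo[0x2BA] = 0x80                                    # 02ba: save_pc
demo[0x2BB] = 0x01                                    # 02bb: jmp 0x05a4
demo[0x2BC:0x2BE] = (0x05A4).to_bytes(2, "little")
demo[0x5A4] = 0x81                                    # 05a4: store_call 0x20
demo[0x5A5] = 0x20

call_table = find_call_table(bytes(demo))
print({hex(k): hex(v) for k, v in call_table.items()})
```

On the synthetic buffer this recovers entry 0x02be for index 0x20, matching the start of the Speck routine in the listing.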

This part looks like RC4:

python3 disasm_vm.py 0x0116 0x02c0

0116: setp1 0x7100
0119: storep1 r0
011b: setp1 0x7000
011e: mov_imm r2, 0x0000000000000000
0128: storep1_idx8 r2, r2
012b: inc r2
012d: cmp_imm r2, 0x0000000000000100
0137: jne 0x0128
013a: mov_imm r2, 0x0000000000000000
0144: mov_imm r3, 0x0000000000000000
014e: mov_imm r7, 0x00000000000000ff
0158: setp1 0x7000
015b: loadp1_idx8 r5, r2
015e: mov r4, r2
0161: and_imm r4, 0x0000000000000007
016b: setp1 0x7100
016e: loadp1_idx8 r4, r4
0171: push r6
0173: push r7
0175: mov r6, r5
0178: mov r7, r3
017b: and r7, r6
017e: xor r3, r6
0181: cmp_imm r7, 0x0000000000000000
018b: je 0x0197
018e: shl r7, 0x01
0191: mov r6, r7
0194: jmp 0x0178
0197: and_imm r3, 0xffffffffffffffff
01a1: pop r7
01a3: pop r6
01a5: push r6
01a7: push r7
01a9: mov r6, r4
01ac: mov r7, r3
01af: and r7, r6
01b2: xor r3, r6
01b5: cmp_imm r7, 0x0000000000000000
01bf: je 0x01cb
01c2: shl r7, 0x01
01c5: mov r6, r7
01c8: jmp 0x01ac
01cb: and_imm r3, 0xffffffffffffffff
01d5: pop r7
01d7: pop r6
01d9: and r3, r7
01dc: setp1 0x7000
01df: loadp1_idx8 r6, r3
01e2: storep1_idx8 r6, r2
01e5: storep1_idx8 r5, r3
01e8: inc r2
01ea: cmp_imm r2, 0x0000000000000100
01f4: jne 0x0158
01f7: mov_imm r2, 0x0000000000000000
0201: mov_imm r3, 0x0000000000000000
020b: mov_imm r6, 0x0000000000000000
0215: cmp_imm r1, 0x0000000000000000
021f: je 0x02b7
0222: inc r2
0224: and r2, r7
0227: setp1 0x7000
022a: loadp1_idx8 r5, r2
022d: push r6
022f: push r7
0231: mov r6, r5
0234: mov r7, r3
0237: and r7, r6
023a: xor r3, r6
023d: cmp_imm r7, 0x0000000000000000
0247: je 0x0253
024a: shl r7, 0x01
024d: mov r6, r7
0250: jmp 0x0234
0253: and_imm r3, 0xffffffffffffffff
025d: pop r7
025f: pop r6
0261: and r3, r7
0264: loadp1_idx8 r0, r3
0267: storep1_idx8 r0, r2
026a: storep1_idx8 r5, r3
026d: push r6
026f: push r7
0271: mov r6, r0
0274: mov r7, r5
0277: and r7, r6
027a: xor r5, r6
027d: cmp_imm r7, 0x0000000000000000
0287: je 0x0293
028a: shl r7, 0x01
028d: mov r6, r7
0290: jmp 0x0274
0293: and_imm r5, 0xffffffffffffffff
029d: pop r7
029f: pop r6
02a1: and r5, r7
02a4: loadp1_idx8 r4, r5
02a7: loadp2_idx8 r5, r6
02aa: xor r5, r4
02ad: storep2_idx8 r5, r6
02b0: inc r6
02b2: dec r1
02b4: jmp 0x0215
02b7: ret
02b8: store_call 0x10
02ba: save_pc
02bb: jmp 0x05a4
02be: push r0

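Meanwhile:

0715: mov r0, r5          ; restore the original key
0718: setp2 0x072a         ; target buffer = 0x072a
071b: mov_imm r1, 0x6057   ; length 0x6057 bytes
0725: call 0x10           ; call the subroutine at index 0x10
0727: jmp 0x072a          ; jump into the freshly decrypted code

The PRGA half of call 0x10 lives at full_vmcode:0x0200–0x02b7 (python3 disasm_vm.py 0x0200 0x02c0 shows the instructions); it is a standard RC4 PRGA.

The first half of the routine (full_vmcode:0x0116–0x01f7) sets the S-box at 0x7000 to 0…255 and then runs the RC4 KSA with the 8-byte key at 0x7100. That key is written by setp1 0x7100; storep1 r0, i.e. it is whatever r0 holds on entry. sub_171D0 memsets all registers to 0, so with an untouched r0 this would amount to an all-zero 8-byte RC4 key; by the time call 0x10 actually runs, however, r0 has been reloaded with the layer key, as explained below.

Inside call 0x10 you can see the typical PRGA: r2 = (r2+1)&0xFF, r3 = (r3+S[i])&0xFF, swap S[i]/S[j], take the keystream byte S[(S[i]+S[j])&0xFF], and finally XOR it with the ciphertext byte that ptr2 points at and write the result back through ptr2. The loop count is r1 = 0x6057.

So call 0x10 is not an arbitrary XOR but an RC4 decryptor. At its entry (full_vmcode:0x0116) it writes the current r0 to 0x7100 as the 8-byte key, then runs KSA/PRGA to XOR the ciphertext that ptr2 points at. After the first layer passes its check, the original key is put back into r0 (0x0715: mov r0, r5), so the RC4 key is exactly the first-layer key we just computed, 0xA28F38BD0463522C.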

from pathlib import Path

start = 0x072a
length = 0x6057
full = Path("full_vmcode").read_bytes()
enc = full[start:start+length]
key = (0xA28F38BD0463522C).to_bytes(8, "little")

S = list(range(256)); j = 0
for i in range(256):
    j = (j + S[i] + key[i % len(key)]) & 0xFF
    S[i], S[j] = S[j], S[i]

i = j = 0
out = bytearray(length)
for idx in range(length):
    i = (i + 1) & 0xFF
    j = (j + S[i]) & 0xFF
    S[i], S[j] = S[j], S[i]
    k = S[(S[i] + S[j]) & 0xFF]
    out[idx] = enc[idx] ^ k

Path("stage2.bin").write_bytes(out)

python3 disasm_vm.py 0 0x200 stage2.bin

0000: setp1 0xe000
0003: loadp1_off r0, 0x0008
0007: mov r5, r0
000a: xor_imm r0, 0x95714c91bc8b306f
0014: xor_imm r0, 0x4303f92241dd9a9f
001e: xor_imm r0, 0x311e18c91413b58c
0028: push r5
002a: mov_imm r5, 0x8df6073d0dbbff09
0034: xor_imm r5, 0xffffffffffffffff
003e: push r6
0040: push r7
0042: mov_imm r6, 0x0000000000000001
004c: mov r7, r5
004f: and r7, r6
0052: xor r5, r6
0055: cmp_imm r7, 0x0000000000000000
005f: je 0x0795
0062: shl r7, 0x01
0065: mov r6, r7
0068: jmp 0x0776
006b: pop r7
006d: pop r6
006f: push r6
0071: push r7
0073: mov r6, r5
0076: mov r7, r0
0079: and r7, r6
007c: xor r0, r6
007f: cmp_imm r7, 0x0000000000000000
0089: je 0x07bf
008c: shl r7, 0x01
008f: mov r6, r7
0092: jmp 0x07a0
0095: and_imm r0, 0xffffffffffffffff
009f: pop r7
00a1: pop r6
00a3: and_imm r0, 0xffffffffffffffff
00ad: pop r5
00af: push r5
00b1: mov_imm r5, 0xee5744efe81e97b7
00bb: xor_imm r5, 0xffffffffffffffff
00c5: push r6
00c7: push r7
00c9: mov_imm r6, 0x0000000000000001
00d3: mov r7, r5
00d6: and r7, r6
00d9: xor r5, r6
00dc: cmp_imm r7, 0x0000000000000000
00e6: je 0x081c
00e9: shl r7, 0x01
00ec: mov r6, r7
00ef: jmp 0x07fd
00f2: pop r7
00f4: pop r6
00f6: push r6
00f8: push r7
00fa: mov r6, r5
00fd: mov r7, r0
0100: and r7, r6
0103: xor r0, r6
0106: cmp_imm r7, 0x0000000000000000
0110: je 0x0846
0113: shl r7, 0x01
0116: mov r6, r7
0119: jmp 0x0827
011c: and_imm r0, 0xffffffffffffffff
0126: pop r7
0128: pop r6
012a: and_imm r0, 0xffffffffffffffff
0134: pop r5
0136: push r6
0138: push r7
013a: mov_imm r6, 0xf8a82a8dbdb78c3f
0144: mov r7, r0
0147: and r7, r6
014a: xor r0, r6
014d: cmp_imm r7, 0x0000000000000000
0157: je 0x088d
015a: shl r7, 0x01
015d: mov r6, r7
0160: jmp 0x086e
0163: pop r7
0165: pop r6
0167: push r6
0169: push r7
016b: mov_imm r6, 0x58e8abfc7618f5fd
0175: mov r7, r0
0178: and r7, r6
017b: xor r0, r6
017e: cmp_imm r7, 0x0000000000000000
0188: je 0x08be
018b: shl r7, 0x01
018e: mov r6, r7
0191: jmp 0x089f
0194: pop r7
0196: pop r6
0198: xor_imm r0, 0x99d88c4fa4cc68aa
01a2: setp1 0x7200
01a5: mov_imm r1, 0x8d85b3156df9f721
01af: storep1 r1
01b1: push r0
01b3: mov_imm r0, 0x0000000000000008
01bd: addp1_reg r0
01bf: pop r0
01c1: mov_imm r1, 0x28e3d33340bc0884
01cb: storep1 r1
01cd: setp1 0x7200
01d0: call 0x20
01d2: cmp_imm r0, 0x659391a5dc3522b3
01dc: je 0x090c
01df: call 0x01
01e1: halt
01e2: mov r0, r5
01e5: setp2 0x0921
01e8: mov_imm r1, 0x0000000000005e60
01f2: call 0x10
01f4: jmp 0x0921
01f7: .byte 0xbc
01f8: .byte 0x38
01f9: .byte 0xcd
01fa: .byte 0x98
01fb: .byte 0xe7
01fc: addp1_reg r5
01fe: .byte 0x49
01ff: push r136

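As you can see, the second layer's flow is almost identical to the first: fetch key[1] (offset 0xE008), XOR / subtract several 64-bit constants in turn, finally XOR 0x99d8…, and write the new Speck key to 0x7200. Then comes the same call 0x20 (Speck64/128, with the key changed to 0x8d85b3156df9f721 || 0x28e3d33340bc0884), a comparison of r0 against 0x659391A5DC3522B3, and on success the original key is restored and call 0x10 is invoked with that key as the RC4 key to decrypt the 0x5E60-byte third layer.

The second layer is the same as the first, just with a different set of invertible operators and a different Speck key. Running call 0x10 with the first-layer key (0xA28F38BD0463522C) as the RC4 key decrypts the 0x6057 bytes starting at 0x072a into stage2.bin, and from its first instructions you can read off the following pipeline: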

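setp1 0xE000; loadp1_off r0, 0x0008 fetches the second 64-bit key.
Three XORs in sequence (0x95714C91BC8B306F, 0x4303F92241DD9A9F, 0x311E18C91413B58C).
Two "sub gadgets" subtract 0x8DF6073D0DBBFF09 and 0xEE5744EFE81E97B7 from the register (as in the first layer: take the two's complement of the constant, then add it).
Two "add gadgets" add 0xF8A82A8DBDB78C3F and 0x58E8ABFC7618F5FD back (no complement this time, so it is plain addition).
Then XOR 0x99D88C4FA4CC68AA.
As in the previous layer, write the new Speck64/128 key to 0x7200/0x7208 (split little-endian into [0x6DF9F721, 0x8D85B315, 0x40BC0884, 0x28E3D333]), call 0x20, compare r0 against 0x659391A5DC3522B3, and on success put the original key back into r0 and use it for RC4 to decrypt the next layer.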
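The split of the two stored qwords into four 32-bit Speck key words, as described in the last step above, can be written down explicitly. This is a small sketch; `speck_key_words` is a hypothetical helper name, and the word order (low half first for each qword) is taken from the lists used in the solver scripts.

```python
MASK32 = 0xFFFFFFFF


def speck_key_words(qword_7200: int, qword_7208: int) -> list:
    """Split the qwords written to 0x7200 and 0x7208 into [k0, l0, l1, l2]."""
    return [
        qword_7200 & MASK32,
        (qword_7200 >> 32) & MASK32,
        qword_7208 & MASK32,
        (qword_7208 >> 32) & MASK32,
    ]


layer1_words = speck_key_words(0x36B1CC9FE433713D, 0xF97646D69C84EBD8)
layer2_words = speck_key_words(0x8D85B3156DF9F721, 0x28E3D33340BC0884)
print([hex(w) for w in layer1_words])
print([hex(w) for w in layer2_words])
```

Having this as a function means an automated solver only needs to extract the two mov_imm r1 immediates from each layer.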

I encoded this whole flow into test1.py; it now models the first layer and can also invert the second. Just run the script:

#!/usr/bin/env python3
"""Model the two outermost VM layers and recover their 64-bit keys.

Each layer performs a few invertible 64-bit add/sub/xor scramblings on the
user-controlled key and then runs a 27-round Speck64/128 encryption on the
result with a fixed 128-bit key.  The Speck output must equal a per-layer
constant (`0xDA19BA6B81C83F61` for layer 1, `0x659391A5DC3522B3` for layer 2),
so we can invert the whole chain and recover each key.
"""

from __future__ import annotations

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
ROUNDS = 27

# Layer 1 constants pulled from bytecode around 0x05b0.
L1_CONST_SUB = (0x48F0_E642_1AC6_6DEA, 0x5566_488C_9C5C_F234)
L1_CONST_XOR = (0x5074_D85B_9194_E696, 0x8CB3_3116_3A92_FC19)

L1_SPECK_KEY = [
    0xE433_713D,
    0x36B1_CC9F,
    0x9C84_EBD8,
    0xF976_46D6,
]

L1_TARGET_CT = 0xDA19_BA6B_81C8_3F61

# Layer 2 constants read from stage2.bin.
L2_CONST_XOR_PREFIX = (
    0x9571_4C91_BC8B_306F,
    0x4303_F922_41DD_9A9F,
    0x311E_18C9_1413_B58C,
)
L2_CONST_SUB = (
    0x8DF6_073D_0DBB_FF09,
    0xEE57_44EF_E81E_97B7,
)
L2_CONST_ADD = (
    0xF8A8_2A8D_BDB7_8C3F,
    0x58E8_ABFC_7618_F5FD,
)
L2_CONST_XOR_SUFFIX = 0x99D8_8C4F_A4CC_68AA

L2_SPECK_KEY = [
    0x6DF9_F721,
    0x8D85_B315,
    0x40BC_0884,
    0x28E3_D333,
]

L2_TARGET_CT = 0x6593_91A5_DC35_22B3


def _rol32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def _ror32(x: int, r: int) -> int:
    return ((x >> r) | ((x << (32 - r)) & MASK32)) & MASK32


def speck_round_keys(key_words: list[int]) -> list[int]:
    """Generate 27 round keys for Speck64/128 (matches the VM's key schedule)."""

    l = list(key_words[1:])
    k = key_words[0]
    keys: list[int] = []
    for i in range(ROUNDS):
        keys.append(k)
        idx = i % (len(key_words) - 1)
        val = (_ror32(l[idx], 8) + k) & MASK32
        val ^= i
        l[idx] = val
        k = (_rol32(k, 3) ^ val) & MASK32
    return keys


def speck_encrypt_block(block: int, round_keys: list[int]) -> int:
    x = block & MASK32
    y = (block >> 32) & MASK32
    for k in round_keys:
        x = (_ror32(x, 8) + y) & MASK32
        x ^= k
        y = _rol32(y, 3) ^ x
    return (y << 32) | x


def speck_decrypt_block(block: int, round_keys: list[int]) -> int:
    x = block & MASK32
    y = (block >> 32) & MASK32
    for k in reversed(round_keys):
        y = _ror32(y ^ x, 3)
        x ^= k
        x = _rol32((x - y) & MASK32, 8)
    return (y << 32) | x


L1_ROUND_KEYS = speck_round_keys(L1_SPECK_KEY)
L2_ROUND_KEYS = speck_round_keys(L2_SPECK_KEY)


def layer1_forward(user_key: int) -> int:
    """Exact logic executed by the first layer before the comparison."""

    state = (user_key - L1_CONST_SUB[0]) & MASK64
    state ^= L1_CONST_XOR[0]
    state = (state - L1_CONST_SUB[1]) & MASK64
    state ^= L1_CONST_XOR[1]
    return speck_encrypt_block(state, L1_ROUND_KEYS)


def solve_layer1() -> int:
    """Invert the layer to retrieve the 64-bit key expected by the VM."""

    state = speck_decrypt_block(L1_TARGET_CT, L1_ROUND_KEYS)
    state ^= L1_CONST_XOR[1]
    state = (state + L1_CONST_SUB[1]) & MASK64
    state ^= L1_CONST_XOR[0]
    state = (state + L1_CONST_SUB[0]) & MASK64
    return state


def layer2_forward(user_key: int) -> int:
    state = user_key
    for c in L2_CONST_XOR_PREFIX:
        state ^= c
    for c in L2_CONST_SUB:
        state = (state - c) & MASK64
    for c in L2_CONST_ADD:
        state = (state + c) & MASK64
    state ^= L2_CONST_XOR_SUFFIX
    return speck_encrypt_block(state, L2_ROUND_KEYS)


def solve_layer2() -> int:
    state = speck_decrypt_block(L2_TARGET_CT, L2_ROUND_KEYS)
    state ^= L2_CONST_XOR_SUFFIX
    for c in reversed(L2_CONST_ADD):
        state = (state - c) & MASK64
    for c in reversed(L2_CONST_SUB):
        state = (state + c) & MASK64
    for c in reversed(L2_CONST_XOR_PREFIX):
        state ^= c
    return state


if __name__ == "__main__":
    key1 = solve_layer1()
    assert layer1_forward(key1) == L1_TARGET_CT
    print(f"layer1 key = 0x{key1:016x}")

    key2 = solve_layer2()
    assert layer2_forward(key2) == L2_TARGET_CT
    print(f"layer2 key = 0x{key2:016x}")

layer1 key = 0xa28f38bd0463522c

layer2 key = 0xbf11b34d0ce941cc

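So the next steps are:

Invert Speck with the second layer's constants to compute key[1] (same method as the first-layer script, just swap the constants).
Use that key for RC4 to decrypt full_vmcode[0x0921 : 0x0921+0x5E60] into the third layer, then continue with the same analysis.

It really is peeling an onion… the logic is about the same every time, so an automated script is needed.

It has to do the following:

solve the key -> get the next segment's length and offset -> RC4 decrypt -> identify what changed -> solve the next key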
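The "get the next segment's length and offset" step can be read straight out of the unpacking stub that follows each check. The sketch below is an assumption-laden illustration: `parse_unpack_stub` is a hypothetical helper, and it expects the exact byte layout seen at 0x0715 in the first layer (mov r0, r5; setp2 A; mov_imm r1, N; call 0x10; jmp A), which later layers may or may not preserve.

```python
def parse_unpack_stub(buf: bytes, off: int):
    """Return (start, length) of the next encrypted segment from the stub at `off`."""
    if buf[off:off + 3] != bytes([0x17, 0x00, 0x05]):      # mov r0, r5
        raise ValueError("expected mov r0, r5")
    if buf[off + 3] != 0x12:                                # setp2 addr16
        raise ValueError("expected setp2")
    start = int.from_bytes(buf[off + 4:off + 6], "little")
    if buf[off + 6:off + 8] != bytes([0x16, 0x01]):         # mov_imm r1, imm64
        raise ValueError("expected mov_imm r1")
    length = int.from_bytes(buf[off + 8:off + 16], "little")
    if buf[off + 16:off + 18] != bytes([0x82, 0x10]):       # call 0x10
        raise ValueError("expected call 0x10")
    if buf[off + 18] != 0x01:                               # jmp addr16
        raise ValueError("expected jmp")
    if int.from_bytes(buf[off + 19:off + 21], "little") != start:
        raise ValueError("jmp target differs from setp2 target")
    return start, length


# Stub bytes rebuilt from the first-layer listing (0x0715..0x0729).
stub = (
    bytes([0x17, 0x00, 0x05])
    + bytes([0x12]) + (0x072A).to_bytes(2, "little")
    + bytes([0x16, 0x01]) + (0x6057).to_bytes(8, "little")
    + bytes([0x82, 0x10])
    + bytes([0x01]) + (0x072A).to_bytes(2, "little")
)
next_start, next_length = parse_unpack_stub(b"\x00" * 4 + stub, 4)
print(hex(next_start), hex(next_length))
```

The stub is 21 bytes long, which matches the listing: it starts at 0x0715 and the decrypted code begins at 0x072a.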

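I ran into a problem while testing the decryption:

the RC4 decryption is nested as well.

In other words, layer 50 has to go through 50 decryptions, so the earlier RC4 code needs to be written differently (decrypt in place inside the full image instead of extracting a single slice):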

from pathlib import Path

start = 0x072a
length = 0x6057
full = Path("full_vmcode").read_bytes()
print(f"Full file length: {hex(len(full))}")

# Extract the encrypted region
enc = full[start:start+length]
print(f"Encrypted data length: {hex(len(enc))}")

# RC4 decryption
key = (0xA28F38BD0463522C).to_bytes(8, "little")

S = list(range(256)); j = 0
for i in range(256):
    j = (j + S[i] + key[i % len(key)]) & 0xFF
    S[i], S[j] = S[j], S[i]

i = j = 0
out = bytearray(length)
for idx in range(length):
    i = (i + 1) & 0xFF
    j = (j + S[i]) & 0xFF
    S[i], S[j] = S[j], S[i]
    k = S[(S[i] + S[j]) & 0xFF]
    out[idx] = enc[idx] ^ k

new_full = bytearray(full)
new_full[start:start+length] = out
Path("stage2.bin").write_bytes(new_full)

from pathlib import Path

start = 0x0921
length = 0x5E60
full = Path("stage2.bin").read_bytes()
print(f"Full file length: {hex(len(full))}")

# Extract the encrypted region
enc = full[start:start+length]
print(f"Encrypted data length: {hex(len(enc))}")

# RC4 decryption
key = (0xBF11B34D0CE941CC).to_bytes(8, "little")

S = list(range(256)); j = 0
for i in range(256):
    j = (j + S[i] + key[i % len(key)]) & 0xFF
    S[i], S[j] = S[j], S[i]

i = j = 0
out = bytearray(length)
for idx in range(length):
    i = (i + 1) & 0xFF
    j = (j + S[i]) & 0xFF
    S[i], S[j] = S[j], S[i]
    k = S[(S[i] + S[j]) & 0xFF]
    out[idx] = enc[idx] ^ k

new_full = bytearray(full)
new_full[start:start+length] = out
Path("stage3.bin").write_bytes(new_full)
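The two scripts above differ only in offset, length and key, so they can be folded into one helper that peels layers in place. This is a sketch under stated assumptions: `rc4_xor` and `peel_in_place` are illustrative names, and the 8-byte little-endian key encoding is the one used by the scripts above.

```python
def rc4_xor(key: bytes, data: bytes) -> bytes:
    """Standard RC4: KSA over `key`, then XOR `data` with the keystream."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray(len(data))
    for n, byte in enumerate(data):
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out[n] = byte ^ s[(s[i] + s[j]) & 0xFF]
    return bytes(out)


def peel_in_place(image: bytearray, layers) -> bytearray:
    """Decrypt each (start, length, key) region inside the full image, in order."""
    for start, length, key in layers:
        key_bytes = key.to_bytes(8, "little")
        region = bytes(image[start:start + length])
        image[start:start + length] = rc4_xor(key_bytes, region)
    return image


LAYERS = [
    (0x072A, 0x6057, 0xA28F38BD0463522C),  # layer 1 key decrypts layer 2
    (0x0921, 0x5E60, 0xBF11B34D0CE941CC),  # layer 2 key decrypts layer 3
]

# Both regions end at the same offset, which is why the decryptions nest.
print(hex(0x072A + 0x6057), hex(0x0921 + 0x5E60))
```

With the real file, the call would be `peel_in_place(bytearray(Path("full_vmcode").read_bytes()), LAYERS)`, extending `LAYERS` as each new key is recovered.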

final exp:

import struct
from pathlib import Path

MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF
ROUNDS = 27


def _rol32(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32


def _ror32(x, r):
    return ((x >> r) | ((x << (32 - r)) & MASK32)) & MASK32

OPCODES = {
    0x00: ("nop", []),
    0x01: ("jmp", ["imm16"]),
    0x02: ("jne", ["imm16"]),
    0x03: ("je", ["imm16"]),
    0x11: ("setp1", ["imm16"]),
    0x12: ("setp2", ["imm16"]),
    0x15: ("loadp1", ["reg"]),
    0x16: ("mov_imm", ["reg", "imm64"]),
    0x17: ("mov", ["reg", "reg"]),
    0x18: ("loadp1_off", ["reg", "imm16"]),
    0x19: ("storep1", ["reg"]),
    0x1f: ("addp1_reg", ["reg"]),
    0x25: ("and", ["reg", "reg"]),
    0x26: ("xor", ["reg", "reg"]),
    0x27: ("shl", ["reg", "imm8"]),
    0x29: ("xor_imm", ["reg", "imm64"]),
    0x2a: ("and_imm", ["reg", "imm64"]),
    0x32: ("cmp_imm", ["reg", "imm64"]),
    0x80: ("save_pc", []),
    0x81: ("store_call", ["imm8"]),
    0x82: ("call", ["imm8"]),
    0x83: ("ret", []),
    0x84: ("push", ["reg"]),
    0x85: ("pop", ["reg"]),
    0x90: ("print_byte", ["imm8"]),
    0xFF: ("halt", []),
}

REGS = {i: f"r{i}" for i in range(8)}


def read_u16(buf, off):
    return int.from_bytes(buf[off:off+2], "little"), off + 2


def read_u64(buf, off):
    return int.from_bytes(buf[off:off+8], "little"), off + 8


def decode(buf, off):
    opcode = buf[off]
    mnemonic, operands = OPCODES.get(opcode, (None, None))
    if mnemonic is None:
        raise ValueError(f"Unknown opcode 0x{opcode:02x} at {off}")
    cur = off + 1
    values = []
    for operand in operands:
        if operand == "imm16":
            val, cur = read_u16(buf, cur)
            values.append(val)
        elif operand == "imm8":
            val = buf[cur]
            cur += 1
            values.append(val)
        elif operand == "imm64":
            val, cur = read_u64(buf, cur)
            values.append(val)
        elif operand == "reg":
            val = buf[cur]
            cur += 1
            values.append(REGS.get(val, f"r{val}"))
        else:
            raise ValueError(f"Unknown operand {operand}")
    return mnemonic, values, cur


def speck_round_keys(words):
    l = list(words[1:])
    k = words[0]
    keys = []
    for i in range(ROUNDS):
        keys.append(k)
        idx = i % (len(words) - 1)
        val = (_ror32(l[idx], 8) + k) & MASK32
        val ^= i
        l[idx] = val
        k = (_rol32(k, 3) ^ val) & MASK32
    return keys


def speck_decrypt(ct, round_keys):
    x = ct & MASK32
    y = (ct >> 32) & MASK32
    for k in reversed(round_keys):
        y = _ror32(y ^ x, 3)
        x ^= k
        x = _rol32((x - y) & MASK32, 8)
    return (y << 32) | x


def rc4_decrypt(data, key_qword):
    key = key_qword.to_bytes(8, "little")
    S = list(range(256))
    j = 0
    key_len = len(key)
    for i in range(256):
        j = (j + S[i] + key[i % key_len]) & 0xFF
        S[i], S[j] = S[j], S[i]
    i = 0
    j = 0
    out = bytearray(len(data))
    for idx, byte in enumerate(data):
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        k = S[(S[i] + S[j]) & 0xFF]
        out[idx] = byte ^ k
    return bytes(out)


def skip_until_pop(buf, pos, reg):
    depth = 0
    while pos < len(buf):
        mnemonic, operands, nxt = decode(buf, pos)
        pos = nxt
        if mnemonic == "push" and operands[0] == reg:
            depth += 1
        elif mnemonic == "pop" and operands[0] == reg:
            if depth == 0:
                break
            depth -= 1
    return pos


def analyze_layer(buf):
    entry = buf.find(b"\x11\x00\xe0")
    if entry == -1:
        raise ValueError("setp1 0xE000 not found in layer")
    pos = entry
    last_setp1 = None
    collecting_speck = False
    speck_words = []
    operations = []
    info = {
        "key_offset": None,
        "target_ct": None,
        "next_offset": None,
        "next_length": None,
    }
    pending_add = False
    last_mov_r1 = None
    while pos < len(buf):
        mnemonic, operands, nxt = decode(buf, pos)
        if mnemonic == "setp1":
            last_setp1 = operands[0]
            if operands[0] == 0x7200 and len(speck_words) < 2:
                collecting_speck = True
        elif mnemonic == "push" and operands[0] == "r6":
            pending_add = True
        elif mnemonic == "push" and operands[0] == "r5":
            pending_add = False
        elif mnemonic == "loadp1_off" and last_setp1 == 0xE000 and operands[0] == "r0":
            info["key_offset"] = operands[1]
        elif mnemonic == "xor_imm" and operands[0] == "r0":
            operations.append(("xor", operands[1]))
        elif mnemonic == "mov_imm" and operands[0] == "r5":
            # Subtract macro: mov_imm r5, C immediately followed by
            # xor_imm r5, ~0 (bitwise NOT); recorded as r0 -= C
            mnemonic2, operands2, nxt2 = decode(buf, nxt)
            if mnemonic2 == "xor_imm" and operands2[0] == "r5" and operands2[1] == MASK64:
                operations.append(("sub", operands[1]))
                pos = skip_until_pop(buf, nxt2, "r5")
                pending_add = False
                continue
        elif mnemonic == "mov_imm" and operands[0] == "r6":
            if operands[1] > 0xFFFF and pending_add:
                operations.append(("add", operands[1]))
                pos = skip_until_pop(buf, nxt, "r6")
                pending_add = False
                continue
        elif mnemonic == "storep1" and operands[0] == "r1" and collecting_speck:
            if last_mov_r1 is not None:
                speck_words.append(last_mov_r1)
                if len(speck_words) == 2:
                    collecting_speck = False
        elif mnemonic == "mov_imm" and operands[0] == "r1":
            last_mov_r1 = operands[1]
            if info["next_offset"] is not None and info["next_length"] is None and not collecting_speck:
                info["next_length"] = operands[1]
        elif mnemonic == "cmp_imm" and operands[0] == "r0":
            info["target_ct"] = operands[1]
        elif mnemonic == "setp2":
            # debug
            # print(f"setp2 at {hex(pos)} -> {hex(operands[0])}")
            info["next_offset"] = operands[0]
        elif mnemonic == "jmp" and info["next_offset"] is not None and operands[0] == info["next_offset"]:
            pos = nxt
            break
        pos = nxt
    info["operations"] = operations
    info["speck_words"] = speck_words
    return info


def invert_layer(info):
    if info["key_offset"] is None or info["target_ct"] is None or len(info["speck_words"]) != 2:
        raise ValueError("Incomplete layer info")
    key_words = []
    for qword in info["speck_words"]:
        key_words.append(qword & MASK32)
        key_words.append((qword >> 32) & MASK32)
    round_keys = speck_round_keys(key_words)
    state = speck_decrypt(info["target_ct"], round_keys)
    for op, val in reversed(info["operations"]):
        if op == "xor":
            state ^= val
        elif op == "sub":
            state = (state + val) & MASK64
        elif op == "add":
            state = (state - val) & MASK64
    return state


def main(max_layers: int = 50, dump_stages: bool = True):
    workspace = bytearray(Path("full_vmcode").read_bytes())
    offset = 0x5A9
    key_table = {}
    for layer in range(1, max_layers + 1):
        stage_view = workspace[offset:]
        info = analyze_layer(stage_view)
        key = invert_layer(info)
        key_index = info["key_offset"] // 8 if info["key_offset"] is not None else None
        print(f"Layer {layer}: key_index={key_index}, key=0x{key:016x}")
        if key_index is not None:
            key_table[key_index] = key
        if info["next_offset"] is None or info["next_length"] is None:
            break
        start = info["next_offset"]
        end = start + info["next_length"]
        if end > len(workspace):
            print("[!] Next layer exceeds VM code length, stopping.")
            break
        dec = rc4_decrypt(workspace[start:end], key)
        workspace[start:end] = dec
        if dump_stages:
            Path(f"stage{layer+1}.bin").write_bytes(workspace)
        offset = start
    if key_table:
        print("\nKeys in index order:")
        for idx in sorted(key_table):
            print(f"key[{idx:02d}] = 0x{key_table[idx]:016x}")


if __name__ == "__main__":
    main()
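The inversion in invert_layer relies on undoing the recorded xor/add/sub operations in reverse order, modulo 2^64. A small round-trip sketch of that idea follows; the operation constants are made up and apply_ops is a hypothetical forward helper that is not part of the script above.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def apply_ops(value, ops):
    # Hypothetical forward direction: what a layer would do to the input key.
    for op, val in ops:
        if op == "xor":
            value ^= val
        elif op == "add":
            value = (value + val) & MASK64
        elif op == "sub":
            value = (value - val) & MASK64
    return value


def invert_ops(value, ops):
    # Same logic as the loop in invert_layer: reverse order, inverse op.
    for op, val in reversed(ops):
        if op == "xor":
            value ^= val
        elif op == "add":
            value = (value - val) & MASK64
        elif op == "sub":
            value = (value + val) & MASK64
    return value


# Made-up constants, only to show that the two directions cancel out.
ops = [("xor", 0x1122334455667788),
       ("add", 0xDEADBEEF12345678),
       ("sub", 0x0123456789ABCDEF)]
key = 0xA28F38BD0463522C
assert invert_ops(apply_ops(key, ops), ops) == key
```

The order matters because add and xor do not commute, which is why the operations are replayed with reversed().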

This gives the keys:

Layer 1: key_index=2, key=0xa28f38bd0463522c
Layer 2: key_index=1, key=0xbf11b34d0ce941cc
Layer 3: key_index=20, key=0xef320f9e6ae31520
Layer 4: key_index=17, key=0x36646367b78c2f91
Layer 5: key_index=48, key=0xa1570f48caceb3dd
Layer 6: key_index=29, key=0x497cff13eaa5bf76
Layer 7: key_index=8, key=0xcd05f91609d653fa
Layer 8: key_index=18, key=0x9eed7637cd5eaa26
Layer 9: key_index=31, key=0xa922933b0b315a10
Layer 10: key_index=30, key=0xd51ceddab7795459
Layer 11: key_index=41, key=0x4f749f6bbca2014c
Layer 12: key_index=43, key=0x9c73a6d3f711e66e
Layer 13: key_index=15, key=0xac1b4e2750778a01
Layer 14: key_index=47, key=0x5e68e47d3a360a80
Layer 15: key_index=32, key=0xcabd557ffa1df043
Layer 16: key_index=5, key=0xfe13c54ceb12fea8
Layer 17: key_index=7, key=0xad1f6be84bbb4680
Layer 18: key_index=14, key=0xb5e1534e1dc36c87
Layer 19: key_index=3, key=0x79ed5d84199dd9cb
Layer 20: key_index=38, key=0x8d4c8f2124957228
Layer 21: key_index=33, key=0xe0459b855188d045
Layer 22: key_index=0, key=0xba610b6c5d80c91a
Layer 23: key_index=28, key=0x7e1a125dcfa56359
Layer 24: key_index=9, key=0x55493aa141fbe86f
Layer 25: key_index=35, key=0xc01552dff3a12f67
Layer 26: key_index=26, key=0xf4d25540ed584887
Layer 27: key_index=12, key=0x5fcca9a9cb65130d
Layer 28: key_index=11, key=0xd8817dda43824d2c
Layer 29: key_index=42, key=0xb1e1adc831c8d567
Layer 30: key_index=22, key=0x1a9a0626a035fb9d
Layer 31: key_index=16, key=0xc8f82d07316dcd3b
Layer 32: key_index=4, key=0x4d9c56b2a1d77a0d
Layer 33: key_index=44, key=0x2ab305ec4e07b0b4
Layer 34: key_index=27, key=0xc12422512500c887
Layer 35: key_index=6, key=0x494a63fc85b9953a
Layer 36: key_index=49, key=0xd6ab1c9a18ebb936
Layer 37: key_index=13, key=0x6f3ed35da24dacfa
Layer 38: key_index=25, key=0x749e8082db34037d
Layer 39: key_index=19, key=0xff546a0085041459
Layer 40: key_index=46, key=0xc409de0e72c1029e
Layer 41: key_index=40, key=0x554fca602792e879
Layer 42: key_index=24, key=0x8a0bf5239eed75c4
Layer 43: key_index=21, key=0x1e00a4b9e25488f6
Layer 44: key_index=10, key=0x25bc9aff736b80a8
Layer 45: key_index=37, key=0x0e189fa829657913
Layer 46: key_index=36, key=0x0615548ece7312fb
Layer 47: key_index=23, key=0xe2f1eb0e5248cd2c
Layer 48: key_index=45, key=0x98a16d274bb044d2
Layer 49: key_index=34, key=0x82700d6f6a986873
Layer 50: key_index=39, key=0x451572c65bcb3425

Keys in index order:
key[00] = 0xba610b6c5d80c91a
key[01] = 0xbf11b34d0ce941cc
key[02] = 0xa28f38bd0463522c
key[03] = 0x79ed5d84199dd9cb
key[04] = 0x4d9c56b2a1d77a0d
key[05] = 0xfe13c54ceb12fea8
key[06] = 0x494a63fc85b9953a
key[07] = 0xad1f6be84bbb4680
key[08] = 0xcd05f91609d653fa
key[09] = 0x55493aa141fbe86f
key[10] = 0x25bc9aff736b80a8
key[11] = 0xd8817dda43824d2c
key[12] = 0x5fcca9a9cb65130d
key[13] = 0x6f3ed35da24dacfa
key[14] = 0xb5e1534e1dc36c87
key[15] = 0xac1b4e2750778a01
key[16] = 0xc8f82d07316dcd3b
key[17] = 0x36646367b78c2f91
key[18] = 0x9eed7637cd5eaa26
key[19] = 0xff546a0085041459
key[20] = 0xef320f9e6ae31520
key[21] = 0x1e00a4b9e25488f6
key[22] = 0x1a9a0626a035fb9d
key[23] = 0xe2f1eb0e5248cd2c
key[24] = 0x8a0bf5239eed75c4
key[25] = 0x749e8082db34037d
key[26] = 0xf4d25540ed584887
key[27] = 0xc12422512500c887
key[28] = 0x7e1a125dcfa56359
key[29] = 0x497cff13eaa5bf76
key[30] = 0xd51ceddab7795459
key[31] = 0xa922933b0b315a10
key[32] = 0xcabd557ffa1df043
key[33] = 0xe0459b855188d045
key[34] = 0x82700d6f6a986873
key[35] = 0xc01552dff3a12f67
key[36] = 0x0615548ece7312fb
key[37] = 0x0e189fa829657913
key[38] = 0x8d4c8f2124957228
key[39] = 0x451572c65bcb3425
key[40] = 0x554fca602792e879
key[41] = 0x4f749f6bbca2014c
key[42] = 0xb1e1adc831c8d567
key[43] = 0x9c73a6d3f711e66e
key[44] = 0x2ab305ec4e07b0b4
key[45] = 0x98a16d274bb044d2
key[46] = 0xc409de0e72c1029e
key[47] = 0x5e68e47d3a360a80
key[48] = 0xa1570f48caceb3dd
key[49] = 0xd6ab1c9a18ebb936
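If the keys need to go back into the VM as one input buffer, one possible packing looks like this. It assumes each key is stored as a little-endian qword at index * 8 (the script derives key_index as key_offset // 8 and the RC4 key bytes are little-endian), which has not been verified against the VM itself. pack_keys is a hypothetical helper and only three entries from the table are used.

```python
import struct

# Three (index, key) pairs copied from the listing above.
keys = {
    0: 0xBA610B6C5D80C91A,
    1: 0xBF11B34D0CE941CC,
    2: 0xA28F38BD0463522C,
}


def pack_keys(table, count):
    # Assumption: slot i lives at byte offset i * 8, little-endian.
    buf = bytearray(8 * count)
    for idx, key in table.items():
        struct.pack_into("<Q", buf, idx * 8, key)
    return bytes(buf)


blob = pack_keys(keys, 3)
print(blob.hex())
```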

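Final flag

We would need to identify the 0x90 (print_byte) opcodes. (Actually there is no need: the last layer's info gives 0x6706 as the offset, so we can just disassemble from there and read them off.)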

python3 disasm_vm.py 0x6706 0x6800 stage51.bin

6706: print_byte 0x52
6708: print_byte 0x43
670a: print_byte 0x54
670c: print_byte 0x46
670e: print_byte 0x7b
6710: print_byte 0x56
6712: print_byte 0x4d
6714: print_byte 0x5f
6716: print_byte 0x41
6718: print_byte 0x4c
671a: print_byte 0x55
671c: print_byte 0x5f
671e: print_byte 0x53
6720: print_byte 0x4d
6722: print_byte 0x43
6724: print_byte 0x5f
6726: print_byte 0x52
6728: print_byte 0x43
672a: print_byte 0x34
672c: print_byte 0x5f
672e: print_byte 0x53
6730: print_byte 0x50
6732: print_byte 0x45
6734: print_byte 0x43
6736: print_byte 0x4b
6738: print_byte 0x21
673a: print_byte 0x5f
673c: print_byte 0x35
673e: print_byte 0x39
6740: print_byte 0x33
6742: print_byte 0x65
6744: print_byte 0x62
6746: print_byte 0x36
6748: print_byte 0x30
674a: print_byte 0x37
674c: print_byte 0x39
674e: print_byte 0x64
6750: print_byte 0x32
6752: print_byte 0x64
6754: print_byte 0x61
6756: print_byte 0x36
6758: print_byte 0x63
675a: print_byte 0x31
675c: print_byte 0x38
675e: print_byte 0x37
6760: print_byte 0x65
6762: print_byte 0x64
6764: print_byte 0x34
6766: print_byte 0x36
6768: print_byte 0x32
676a: print_byte 0x62
676c: print_byte 0x30
676e: print_byte 0x33
6770: print_byte 0x33
6772: print_byte 0x66
6774: print_byte 0x65
6776: print_byte 0x65
6778: print_byte 0x33
677a: print_byte 0x34
677c: print_byte 0x7d
677e: print_byte 0x0a
6780: halt

# All the hex bytes from the print_byte instructions
bytes_hex = [
    0x52, 0x43, 0x54, 0x46, 0x7b,
    0x56, 0x4d, 0x5f, 0x41, 0x4c, 0x55, 0x5f,
    0x53, 0x4d, 0x43, 0x5f,
    0x52, 0x43, 0x34, 0x5f,
    0x53, 0x50, 0x45, 0x43, 0x4b, 0x21, 0x5f,
    0x35, 0x39, 0x33, 0x65, 0x62, 0x36, 0x30, 0x37,
    0x39, 0x64, 0x32, 0x64, 0x61, 0x36, 0x63, 0x31,
    0x38, 0x37, 0x65, 0x64, 0x34, 0x36, 0x32, 0x62,
    0x30, 0x33, 0x33, 0x66, 0x65, 0x65, 0x33, 0x34,
    0x7d
]

flag = ''.join(chr(b) for b in bytes_hex)
print(flag)
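Instead of copying the bytes by hand, the print_byte immediates could be pulled out of the decrypted buffer directly. This is a minimal sketch assuming only the two encodings seen in the listing (0x90 followed by one immediate byte, 0xFF for halt). extract_printed is a hypothetical helper, not the disasm_vm.py used above, and the sample buffer is synthetic.

```python
def extract_printed(buf, start=0):
    # Walk print_byte (0x90 imm8) instructions until halt (0xFF).
    out = bytearray()
    pos = start
    while pos < len(buf):
        op = buf[pos]
        if op == 0x90 and pos + 1 < len(buf):
            out.append(buf[pos + 1])
            pos += 2
        elif op == 0xFF:
            break
        else:
            raise ValueError(f"unexpected opcode 0x{op:02x} at 0x{pos:x}")
    return out.decode("latin-1")


# Synthetic sample: four print_byte instructions followed by halt.
sample = bytes([0x90, 0x52, 0x90, 0x43, 0x90, 0x54, 0x90, 0x46, 0xFF])
print(extract_printed(sample))  # prints RCTF
```

On the real data this would presumably be called on the contents of stage51.bin with start=0x6706.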