A Quick Study of OLLVM Obfuscation

Adapted from: [原创]深入浅出 Ollvm 混淆原理及反混淆技术 (Android Security board, Kanxue security community 看雪安全社区)

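OLLVM has three core obfuscation passes: instruction substitution, bogus control flow, and control flow flattening.

Instruction substitution

An obfuscation technique that raises code complexity through mathematically equivalent rewrites. The core idea is to replace a simple instruction (such as an addition or an XOR) with a functionally equivalent but much harder to read instruction sequence.

For example, a+b can be rewritten as a-b+c-c+b+b.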
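
To see that such a rewrite preserves behavior, the identity can be checked numerically. The following is a minimal sketch and not OLLVM's own code: the names MASK, add_plain, add_obfuscated and add_via_negation are hypothetical, and the 32-bit mask is an assumption that models wrap-around int arithmetic.

```python
import random

MASK = 0xFFFFFFFF  # assumption: model 32-bit wrap-around integers


def add_plain(a, b):
    """The original, readable operation."""
    return (a + b) & MASK


def add_obfuscated(a, b, c):
    """The example rewrite a - b + c - c + b + b; c is an arbitrary junk value."""
    return (a - b + c - c + b + b) & MASK


def add_via_negation(a, b):
    """Another equivalent form of an addition: a - (-b)."""
    return (a - (-b)) & MASK


def check(trials=1000, seed=0):
    """Compare the rewrites against the plain addition on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.getrandbits(32) for _ in range(3))
        if add_obfuscated(a, b, c) != add_plain(a, b):
            return False
        if add_via_negation(a, b) != add_plain(a, b):
            return False
    return True


print(check())
```

Random testing only gives evidence, not a proof, but here the algebra is simple enough to confirm by hand: the c terms cancel and -b + b + b leaves b.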

Removal tool: D-810

image-20260303222645703

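As the screenshot shows, D-810 removed part of the obfuscation, but the core logic (the XOR step of RC4) is still buried under a deeper layer of MBA (mixed Boolean-arithmetic) expressions and was not fully recovered.

When D-810 cannot fully recover the code, or when you have to deal with very heavy MBA expressions, GAMBA is the tool to reach for.

For detailed usage of the tool, see: [分享]Ollvm 指令替换混淆还原神器:GAMBA 使用指南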
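
For intuition about what an MBA expression looks like, here is a small sketch that checks two well-known identities on random 32-bit values. It is only an illustration: xor_mba, add_mba and identities_hold are hypothetical names, and this is not how D-810 or GAMBA work internally.

```python
import random

MASK = 0xFFFFFFFF  # assumption: 32-bit operands


def xor_mba(x, y):
    """x ^ y written with OR, AND and a subtraction."""
    return ((x | y) - (x & y)) & MASK


def add_mba(x, y):
    """x + y written with XOR, AND and a multiplication by 2."""
    return ((x ^ y) + 2 * (x & y)) & MASK


def identities_hold(trials=1000, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        if xor_mba(x, y) != (x ^ y):
            return False
        if add_mba(x, y) != ((x + y) & MASK):
            return False
    return True


print(identities_hold())
```

Obfuscators nest and combine identities like these, which is why a single XOR can end up as a long mixed expression.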

image-20260304132744747

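Bogus control flow

Injects "dead code" blocks that never execute, together with hard-to-predict conditional jumps, in order to disrupt control flow graph (CFG) analysis.

Opaque predicate: a conditional expression whose truth value is fixed at run time, but which a static analyzer has difficulty proving to be always true or always false.

Example: if (x * (x + 1) % 2 != 0) { ... }. For any integer x, x(x+1) is the product of two consecutive integers and is therefore even, so the condition is always false. Without deeper algebraic analysis, however, IDA treats it as a legitimate branch.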
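
The claim that this predicate never holds is easy to confirm by brute force. A minimal sketch, with hypothetical function names; the second check assumes 32-bit wrap-around, which does not change parity because 2 divides 2^32.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit wrap-around for the second check


def opaque_predicate(x):
    """The predicate from the example: true only if x*(x+1) is odd."""
    return (x * (x + 1)) % 2 != 0


def opaque_predicate_wrapped(x):
    """Same predicate with the product truncated to 32 bits."""
    return ((x * (x + 1)) & MASK) % 2 != 0


def never_true(lo=-5000, hi=5000):
    """Return True if neither form of the predicate fires in [lo, hi]."""
    return not any(
        opaque_predicate(x) or opaque_predicate_wrapped(x)
        for x in range(lo, hi + 1)
    )


print(never_true())
```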

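Unreachable block: a basic block that is never executed on any real execution path but still exists in the control flow graph.

D-810 ships with opaque predicate matchers that automatically recognize common OLLVM predicate patterns and optimize them away.

Changing the data segment's attributes and initial values (to exploit the decompiler's optimizations) can also remove part of the obfuscation.

OLLVM often uses global variables (usually uninitialized, located in the .bss segment) as the operands of its opaque predicates.
Core idea: since IDA does not know the values of these variables, we assign them a fixed value ourselves and tell IDA that the value is "read-only". IDA's decompiler then applies constant propagation and prunes the dead branches automatically.

OLLVM generates structures like this: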

if (g_flag) {
    real_block();
} else {
    bogus_block();
}

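Where:

  • g_flag is a global variable
  • located in .bss
  • uninitialized (default value 0)
  • but set to a fixed value somewhere early in the program

To the CPU it is a fixed value at run time (for example, always 1).

To IDA, however, it is a memory variable that might be modified, so its value cannot be determined and the decompiler has to treat it conservatively.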

image-20260303224554083

Helper script

import ida_segment
import ida_bytes

# Look up the .bss segment
seg = ida_segment.get_segm_by_name('.bss')
if seg is None:
    raise RuntimeError("no .bss segment found")

# 1. Bulk assignment: initialize every variable in the segment to 2 (or any other fixed value)
# Step of 4 bytes (int-sized variables)
for ea in range(seg.start_ea, seg.end_ea, 4):
    ida_bytes.patch_bytes(ea, int(2).to_bytes(4, 'little'))

# 2. Make the segment read-only
# seg.perm bits: 4=Read, 2=Write, 1=Execute
seg.perm = 0b100  # keep only the read permission (R--)
ida_segment.update_segm(seg)

print("[+] BSS segment patched: Read-only & Initialized.")

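In the .bss segment, apply "Convert to data" (shortcut D) to the relevant variables, then return to the decompiler view and press F5 to refresh. Many branches disappear because their conditions are now known and IDA optimizes them away.

Alternatively, patch the assembly directly so that the loads become immediates.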

import ida_xref
import ida_idaapi
from ida_bytes import get_bytes, patch_bytes, get_item_size
import ida_segment

def do_patch(ea):
    # Check the instruction pattern: mov reg, [mem] (opcode 0x8B)
    # Note: only this one mov encoding is handled; adjust it for the instructions in your target
    size = get_item_size(ea)
    if get_bytes(ea, 1) == b"\x8B" and size >= 5:
        # The reg field of the ModRM byte (bits 5..3) names the destination register
        reg = (ord(get_bytes(ea + 1, 1)) & 0b00111000) >> 3

        # Build the replacement: mov reg, 0
        # Opcode: 0xB8 + reg, followed by a 4-byte immediate
        # Pad with NOPs so the patch is exactly as long as the original instruction
        new_code = (0xB8 + reg).to_bytes(1, 'little') + b'\x00\x00\x00\x00' + b'\x90' * (size - 5)

        patch_bytes(ea, new_code)
        print(f"[+] Patched at {hex(ea)}: mov reg, 0")
    else:
        print(f"[-] Skip unknown instruction at {hex(ea)}")

# Walk the opaque-predicate variables in the .bss segment
seg = ida_segment.get_segm_by_name('.bss')
for addr in range(seg.start_ea, seg.end_ea, 4):
    # Visit every code location that references this variable
    ref = ida_xref.get_first_dref_to(addr)
    while ref != ida_idaapi.BADADDR:
        do_patch(ref)
        ref = ida_xref.get_next_dref_to(addr, ref)

print("[+] All opaque predicates patched.")

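This approach cuts off the source of the opaque predicates at the assembly level: when IDA re-analyzes the code, it sees that these registers hold constants.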
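
The byte-level part of this patch can be tried outside IDA. The sketch below only builds the replacement bytes; modrm_reg and encode_mov_imm32 are hypothetical helper names, and it assumes a 32-bit destination register without a REX prefix, like the mov handled by the script above.

```python
def modrm_reg(modrm):
    """Extract the reg field (bits 5..3) of a ModRM byte."""
    return (modrm >> 3) & 0b111


def encode_mov_imm32(reg, imm, total_len):
    """Encode `mov r32, imm32` (opcode 0xB8 + reg) and pad with NOPs to total_len bytes."""
    if not 0 <= reg <= 7:
        raise ValueError("reg must be in 0..7")
    if total_len < 5:
        raise ValueError("mov r32, imm32 needs at least 5 bytes")
    body = bytes([0xB8 + reg]) + (imm & 0xFFFFFFFF).to_bytes(4, "little")
    return body + b"\x90" * (total_len - 5)


# Example: 8B 0D <disp32> is a 6-byte load into ecx; ModRM 0x0D has reg field 1.
reg = modrm_reg(0x0D)
print(reg, encode_mov_imm32(reg, 0, 6).hex())
```

Padding to the original instruction length matters: the replacement must not spill into the next instruction or leave stray operand bytes behind.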

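Control flow flattening

A heavy obfuscation technique, abbreviated FLA, that aims to destroy a program's structural information. It introduces a central dispatcher and "flattens" the function's originally nested, ordered basic blocks so that they all appear to sit at the same level in the control flow graph (CFG).

  1. Prologue: the function entry. Its main job is to initialize the state variable.
  2. Main dispatcher: the heart of the obfuscation. Usually a large while(true) loop wrapping a switch(state). It repeatedly reads the current state value and decides which block runs next.
  3. Sub-dispatcher (variant): more complex FLA variants nest several levels of switch, or compute the jump target with arithmetic, to hide the state transitions further.
  4. Relevant blocks (real blocks): the blocks that contain the original program logic. They no longer jump directly to the next real block; each one is isolated as a case of the switch.
  5. Predispatcher / state update: a real block does not jump directly when it finishes. Instead it updates the state variable (for example state = NEXT_KEY) and jumps unconditionally back to the main dispatcher, which schedules the next block from the new state on the next loop iteration.
  6. Return block: the function exit. When the state variable reaches a particular value, execution leaves the loop and returns.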

image-20260304102647824

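Example execution flow

  1. Initialization: the prologue sets state = 1.
  2. Dispatch: the main dispatcher checks state and jumps to case 1 (real block A).
  3. Execute and update: real block A runs its logic and sets state to 2 at the end.
  4. Loop back: control jumps back to the main dispatcher.
  5. Dispatch again: the main dispatcher checks state (now 2) and jumps to case 2 (real block B).
  6. This repeats until state reaches the end marker.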
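
The same flow can be imitated in a few lines of Python. This is a hand-written toy, not OLLVM output: sum_to_n and sum_to_n_flattened are hypothetical functions, and the state numbers are arbitrary.

```python
def sum_to_n(n):
    """Original, structured version: 1 + 2 + ... + n."""
    total = 0
    i = 1
    while i <= n:
        total += i
        i += 1
    return total


def sum_to_n_flattened(n):
    """The same logic after a manual 'flattening' into a state machine."""
    state = 1            # prologue: initialize the state variable
    total = i = 0
    while True:          # main dispatcher
        if state == 1:   # real block A: initialization
            total, i = 0, 1
            state = 2
        elif state == 2:  # real block B: loop condition picks the next state
            state = 3 if i <= n else 0
        elif state == 3:  # real block C: loop body
            total += i
            i += 1
            state = 2
        elif state == 0:  # return block
            return total


print(sum_to_n(10), sum_to_n_flattened(10))
```

In the flattened version the loop structure is no longer visible in the code layout; it exists only in the sequence of values that state takes, which is exactly what a deobfuscator has to recover.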

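D-810 can remove it to some extent.

Deflat.py, an angr script (copies are available online; writing your own also works)

Using the Unicorn framework (recovery by dynamic emulation)

By emulating execution we can record the path the program really takes at run time (the trace), and so sidestep the complicated static obfuscation logic.

Core technical path

  1. Static extraction: use IDA to identify and extract all basic block information (real blocks, fake blocks, dispatcher).
  2. Dynamic emulation: run the program under Unicorn and record the block-to-block transitions together with their context (the ZF flag).
  3. Static repair: based on the recorded transitions, patch the binary to bypass the dispatcher and rebuild the CFG.

Step 1: static extraction (IDA Python)

First we need IDA to tell us which blocks are real and which belong to the dispatcher or are fake.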

import idaapi
import idc

# ================= Configuration =================
TARGET_FUNC_EA = 0x401E80   # entry of the flattened function, usually its first block
PREPROCESSOR_EA = 0x402697  # dispatcher / pre-dispatcher address: typically a loop plus switch or indirect jump
                            # that selects a handler block from the state variable; usually near the function tail

# ================= Data structures =================
true_blocks = set()  # set of real blocks
fake_blocks = set()  # set of fake blocks
# Both sets hold (start_ea, end_ea) pairs describing the range of one basic block.

func = idaapi.get_func(TARGET_FUNC_EA)
flowchart = idaapi.FlowChart(func, flags=idaapi.FC_PREDS)
# idaapi.get_func returns the function object that contains TARGET_FUNC_EA
# FlowChart(func, FC_PREDS) builds the function's basic block graph with predecessor info (preds)

print(f"[*] Analyzing function at 0x{TARGET_FUNC_EA:x}...")

# ================= CFG analysis =================
for block in flowchart:
    start_ea = block.start_ea
    # Address of the block's last instruction (block.end_ea itself is exclusive)
    end_ea = idc.prev_head(block.end_ea)

    # 1. Identify the dispatcher
    # The dispatcher itself counts as a fake block
    if start_ea == PREPROCESSOR_EA:
        fake_blocks.add((start_ea, end_ea))

        # Its predecessors are usually real blocks (they jump back here when done)
        for pred in block.preds():
            true_blocks.add((pred.start_ea, idc.prev_head(pred.end_ea)))
        continue

    # 2. Identify the return block
    # A block without successors is usually the function exit
    succs = list(block.succs())
    if not succs:
        print(f"[+] Found Return Block: 0x{start_ea:x}")
        # The return block is part of the real logic but does not take part in the loop
        continue

    # 3. Identify real blocks
    # A block whose successor is the dispatcher is a scheduled real block
    if any(succ.start_ea == PREPROCESSOR_EA for succ in succs):
        true_blocks.add((start_ea, end_ea))
        continue

    # 4. Identify the prologue
    if start_ea == TARGET_FUNC_EA:
        print(f"[+] Found Prologue: 0x{start_ea:x}")
        print(f"[+] Prologue_end: 0x{end_ea:x}")
        continue

    # 5. Everything else is a fake block
    # Skip blocks already marked as real
    if (start_ea, end_ea) not in true_blocks:
        fake_blocks.add((start_ea, end_ea))

# ================= Output =================
print(f"\n[+] True Blocks Count: {len(true_blocks)}")
print("TBS =", sorted(true_blocks))

print(f"\n[+] Fake Blocks Count: {len(fake_blocks)}")
print("FBS =", sorted(fake_blocks))

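Step 2: dynamic emulation (Unicorn trace)

With the block list in hand, we emulate the function with Unicorn and record the real jump path.

An article for learning the Unicorn framework itself: [原创]深入浅出 Unicorn 框架学习

Key point: record not only where execution goes, but also the ZF (Zero Flag) value at the moment of the jump, because the conditional jumps (JZ/JNZ) depend on it entirely.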
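
ZF is bit 6 of the EFLAGS register, so extracting it is a shift and a mask. A minimal sketch with hypothetical helper names:

```python
ZF_BIT = 6  # position of the Zero Flag in EFLAGS


def zero_flag(eflags):
    """Return the ZF bit (0 or 1) of an EFLAGS value."""
    return (eflags >> ZF_BIT) & 1


def jz_taken(eflags):
    """JZ/JE jumps when ZF == 1."""
    return zero_flag(eflags) == 1


def jnz_taken(eflags):
    """JNZ/JNE jumps when ZF == 0."""
    return zero_flag(eflags) == 0


print(zero_flag(0x246), zero_flag(0x202))
```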

from unicorn import *
from unicorn.x86_const import *
from capstone import *

# ============================================================
# 1. Global configuration (address layout & target function)
# ============================================================
BASE_ADDR = 0x400000
CODE_ADDR = BASE_ADDR
CODE_SIZE = 1024 * 1024
STACK_ADDR = 0x0
STACK_SIZE = 1024 * 1024

MAIN_ADDR = 0x401E80
MAIN_END = 0x40269C

# ============================================================
# 2. Basic block information (from IDA static analysis)
# ============================================================
# Real block list produced by the step 1 script
TBS = [
(4203066, 4203066), (4203071, 4203098), (4203103, 4203157),
(4203162, 4203314), (4203319, 4203341), (4203346, 4203366),
(4203371, 4203398), (4203403, 4203428), (4203433, 4203457),
(4203462, 4203490), (4203495, 4203514), (4203519, 4203558),
(4203563, 4203585), (4203590, 4203609), (4203614, 4203636),
(4203641, 4203651), (4203656, 4203689), (4203694, 4203737),
(4203742, 4203776), (4203781, 4203804), (4203809, 4203831),
(4203836, 4203856), (4203861, 4203888), (4203893, 4203918),
(4203923, 4203957), (4203962, 4203981), (4203986, 4204025),
(4204030, 4204040), (4204045, 4204067), (4204072, 4204091),
(4204096, 4204118), (4204123, 4204133), (4204138, 4204171)]

# Trace record: list of ((tb_start, tb_end), zf_value)
tb_trace = []

# ============================================================
# 3. Disassembler & emulator initialization
# ============================================================
cs = Cs(CS_ARCH_X86, CS_MODE_64)
uc = Uc(UC_ARCH_X86, UC_MODE_64)

# ============================================================
# 4. Hook: instruction-level hook (the core)
# ============================================================
def hook_code(uc, address, size, user_data):
    # 1. Patch up the emulated environment:
    #    read the instruction and handle call and ret
    try:
        code = uc.mem_read(address, size)
    except UcError:
        return

    for insn in cs.disasm(bytes(code), address):
        # Skip calls: FLA usually stays inside the current function, no need to follow callees
        if insn.mnemonic == "call":
            # print(f"[Skip Call] 0x{address:x}")
            uc.reg_write(UC_X86_REG_RIP, address + size)
            return

        # On ret the function is finished: stop emulating
        if insn.mnemonic == "ret":
            print("[*] Function Return hit. Stopping...")
            uc.emu_stop()

            # Print the final trace for use in the next step
            print("\n" + "="*30)
            print("real_flow = [")
            for item in tb_trace:
                print(f" {item},")
            print("]")
            print("="*30 + "\n")
            return

    # 2. Record the execution trace
    # Check whether the current address is the "end address" of a real block
    for tb_start, tb_end in TBS:
        if address == tb_end:
            # Record ZF at this point (bit 6 of EFLAGS)
            eflags = uc.reg_read(UC_X86_REG_EFLAGS)
            zf = (eflags >> 6) & 1

            tb_trace.append(((tb_start, tb_end), zf))
            break

# ============================================================
# 5. Hooks: invalid memory access / interrupts (for debugging)
# ============================================================
def hook_mem_invalid(uc, access, address, size, value, user_data):
    access_type = {
        UC_MEM_READ_UNMAPPED: "READ",
        UC_MEM_WRITE_UNMAPPED: "WRITE",
        UC_MEM_FETCH_UNMAPPED: "FETCH",
    }.get(access, "UNKNOWN")
    # Print the memory error
    print(f"[MEM {access_type}] 0x{address:x}, size={size}")
    return False


def hook_intr(uc, intno, user_data):
    print(f"[INT] interrupt {intno}")
    return False

# ============================================================
# 6. Unicorn initialization
# ============================================================
def init_unicorn(uc, code_data):
    # Map memory
    uc.mem_map(CODE_ADDR, CODE_SIZE, UC_PROT_ALL)
    uc.mem_map(STACK_ADDR, STACK_SIZE, UC_PROT_ALL)

    # Write the code
    uc.mem_write(CODE_ADDR, code_data)

    # Initialize the stack pointer
    uc.reg_write(UC_X86_REG_RSP, STACK_ADDR + STACK_SIZE // 2)

    # Install the instruction hook
    uc.hook_add(UC_HOOK_CODE, hook_code)

    # Unmapped memory accesses
    uc.hook_add(UC_HOOK_MEM_UNMAPPED, hook_mem_invalid)

    # Interrupts (int 0x80 / syscall / ud2, etc.)
    uc.hook_add(UC_HOOK_INTR, hook_intr)

# ============================================================
# 7. Main flow
# ============================================================
if __name__ == "__main__":
    # Read the binary file
    with open(r"./test-fla", "rb") as f:
        CODE_DATA = f.read()

    init_unicorn(uc, CODE_DATA)

    print("[*] Starting Emulation...")
    try:
        uc.emu_start(MAIN_ADDR, MAIN_END)
    except UcError as e:
        print(f"[Error] {e}")

Step 3: static repair (IDA Python patch)

Once we have real_flow (the real execution flow), we can rebuild the CFG in IDA.
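
Before patching, the trace has to be turned into a successor map. The sketch below shows the idea on a tiny made-up trace: the block labels, demo_trace, build_successors and classify are all hypothetical. Note one deliberate difference from the repair script in this step, which decides by the number of ZF values seen; this sketch decides by the number of distinct successors.

```python
from collections import defaultdict


def build_successors(trace):
    """Map block -> ZF value -> set of blocks observed right after it."""
    succ = defaultdict(lambda: defaultdict(set))
    for (cur, zf), (nxt, _next_zf) in zip(trace, trace[1:]):
        succ[cur][zf].add(nxt)
    return succ


def classify(succ):
    """Label a block 'conditional' if more than one distinct successor was seen."""
    kinds = {}
    for block, by_zf in succ.items():
        targets = set()
        for nxt in by_zf.values():
            targets |= nxt
        kinds[block] = "conditional" if len(targets) > 1 else "direct"
    return kinds


# Hypothetical trace: labels stand in for (start, end) address pairs.
demo_trace = [("A", 0), ("B", 1), ("C", 0), ("B", 0), ("D", 1)]
demo_succ = build_successors(demo_trace)
print(classify(demo_succ))
```

A single trace only covers the paths that one input exercised, so a block that looks "direct" here may still be conditional on other inputs.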

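Repair strategy

  1. Single-successor block: if block A is always followed by block B, rewrite the end of block A as JMP Block_B.
  2. Two-successor block (conditional jump): if block A is sometimes followed by B (ZF=1) and sometimes by C (ZF=0), it is a conditional jump.
    • Problem: the end of the original block A may not have room for a JZ plus a JMP instruction.
    • Trick: use the now useless fake blocks as trampolines.
    • Procedure: Block_A -> JMP Fake_Block -> [JZ Target_True; JMP Target_False]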
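
The trampoline needs 11 bytes: a near JZ (0F 84 rel32, 6 bytes) followed by a near JMP (E9 rel32, 5 bytes). Both displacements are relative to the address of the next instruction. A minimal sketch of the encoding, with hypothetical helper names and made-up addresses:

```python
import struct


def encode_jmp(frm, to):
    """jmp rel32 placed at address frm: E9 + signed 32-bit displacement."""
    return b"\xE9" + struct.pack("<i", to - (frm + 5))


def encode_jz(frm, to):
    """jz rel32 placed at address frm: 0F 84 + signed 32-bit displacement."""
    return b"\x0F\x84" + struct.pack("<i", to - (frm + 6))


def encode_jz_jmp(ea, true_target, false_target):
    """Trampoline body: jz true_target; jmp false_target."""
    jz = encode_jz(ea, true_target)
    return jz + encode_jmp(ea + len(jz), false_target)


# Made-up addresses purely for illustration.
print(encode_jmp(0x1000, 0x1000).hex())
print(encode_jz_jmp(0x2000, 0x2100, 0x1F00).hex())
```

Packing with "<i" handles backward jumps, where the displacement is negative and must be stored in two's complement.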

import idaapi
import ida_bytes
import ida_ua
import ida_kernwin
from collections import defaultdict, deque

# ============================================================
# Input data
# ============================================================
# fake_blocks:
# the "fake block list" obtained in step 1 from the CFG / dispatcher analysis
# each entry has the form (start_ea, end_ea)
# in the final logic these blocks only serve as trampolines or get NOPed out
fake_blocks = [(4202133, 4202159), (4202165, 4202165), (4202170, 4202187), (4202193, 4202193), (4202198, 4202215),
(4202221, 4202221), (4202226, 4202243), (4202249, 4202249), (4202254, 4202271), (4202277, 4202277),
(4202282, 4202299), (4202305, 4202305), (4202310, 4202327), (4202333, 4202333), (4202338, 4202355),
(4202361, 4202361), (4202366, 4202383), (4202389, 4202389), (4202394, 4202411), (4202417, 4202417),
(4202422, 4202439), (4202445, 4202445), (4202450, 4202467), (4202473, 4202473), (4202478, 4202495),
(4202501, 4202501), (4202506, 4202523), (4202529, 4202529), (4202534, 4202551), (4202557, 4202557),
(4202562, 4202579), (4202585, 4202585), (4202590, 4202607), (4202613, 4202613), (4202618, 4202635),
(4202641, 4202641), (4202646, 4202663), (4202669, 4202669), (4202674, 4202691), (4202697, 4202697),
(4202702, 4202719), (4202725, 4202725), (4202730, 4202747), (4202753, 4202753), (4202758, 4202775),
(4202781, 4202781), (4202786, 4202803), (4202809, 4202809), (4202814, 4202831), (4202837, 4202837),
(4202842, 4202859), (4202865, 4202865), (4202870, 4202887), (4202893, 4202893), (4202898, 4202915),
(4202921, 4202921), (4202926, 4202943), (4202949, 4202949), (4202954, 4202971), (4202977, 4202977),
(4202982, 4202999), (4203005, 4203005), (4203010, 4203027), (4203033, 4203033), (4203038, 4203055),
(4203061, 4203061), (4204183, 4204183)]

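# real_flow:
# the real execution path obtained in step 2 (dynamic run / symbolic execution / manual tracing)
# each entry has the form:
# ((block_start, block_end), zf)
# meaning:
# when execution reached the end of this real block, ZF had the value zf
real_flow = [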
((4203071, 4203098), 0), ((4203103, 4203157), 0), ((4203162, 4203314), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 0), ((4203403, 4203428), 0), ((4203433, 4203457), 1),
((4203462, 4203490), 0), ((4203495, 4203514), 1), ((4203519, 4203558), 1), ((4203563, 4203585), 1),
((4203590, 4203609), 0), ((4203614, 4203636), 1), ((4203641, 4203651), 1), ((4203319, 4203341), 1),
((4203346, 4203366), 1), ((4203371, 4203398), 1), ((4203403, 4203428), 1), ((4203656, 4203689), 1),
((4203694, 4203737), 1), ((4203742, 4203776), 1), ((4203781, 4203804), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 0), ((4203893, 4203918), 0), ((4203923, 4203957), 0),
((4203962, 4203981), 1), ((4203986, 4204025), 1), ((4204030, 4204040), 1), ((4204045, 4204067), 1),
((4204072, 4204091), 0), ((4204096, 4204118), 1), ((4204123, 4204133), 1), ((4203809, 4203831), 1),
((4203836, 4203856), 1), ((4203861, 4203888), 1), ((4203893, 4203918), 1), ((4204138, 4204171), 1)
]

# Start and end addresses of the function's prologue block
# Used to patch the entry of main so it jumps straight to the first real block
PROLOGUE_START = 0x401E80
PROLOGUE_END = 0x401E8B

# Final return block (e.g. the block holding the epilogue / leave; ret)
RETURN_BLOCK = 0x402690

# ============================================================
# Processing logic
# ============================================================

# ------------------------------------------------------------
# 1. Build the real control flow mapping
# ------------------------------------------------------------

# block_next_map:
# structure:
# block_next_map[block][zf] = {next_block1, next_block2, ...}
#
# meaning:
# when execution reaches block with ZF == zf,
# which real blocks may come next
block_next_map = defaultdict(lambda: defaultdict(set))

# block_zf_map:
# block_zf_map[block] = {0, 1}
#
# meaning:
# which ZF values were observed for this real block during execution
block_zf_map = defaultdict(set)

# Fill both maps from real_flow
for i in range(len(real_flow) - 1):
    cur_block, zf = real_flow[i]        # current real block and its ZF
    next_block, _ = real_flow[i + 1]    # next real block (its ZF is irrelevant here)

    block_zf_map[cur_block].add(zf)
    block_next_map[cur_block][zf].add(next_block)

# ------------------------------------------------------------
# 2. Prepare the pool of fake blocks
# ------------------------------------------------------------

# A deque is used to:
# - hand out fake blocks in order
# - avoid reusing one
fake_queue = deque(fake_blocks)

# Fake blocks that ended up being used as trampolines
used_fake = set()


def alloc_fake_block(min_size=10):
    """
    Allocate an available fake block from fake_blocks.

    Requirements:
    - not used yet
    - large enough (must hold jz + jmp, about 11 bytes)

    Returns:
    (start_ea, end_ea)
    """
    while fake_queue:
        fb = fake_queue.popleft()
        if (fb[1] - fb[0]) >= min_size:
            used_fake.add(fb)
            return fb
    raise Exception("No more fake blocks available!")


# ------------------------------------------------------------
# Utility functions
# ------------------------------------------------------------

def nop_range(start, end):
    """
    Fill the whole range [start, end] with NOPs.
    Used to:
    - clear the original FLA junk code
    - keep leftover logic from being executed by mistake
    """
    ea = start
    while ea <= end:
        ida_bytes.patch_byte(ea, 0x90)
        ea += 1


def get_last_insn_ea(block_start, block_end):
    """
    Search backwards inside a block for its last "valid instruction".

    Purpose:
    in FLA the end of a block is usually the jump to the dispatcher;
    we need to locate exactly that instruction in order to patch it
    """
    ea = ida_bytes.prev_head(block_end + 1, block_start)
    while ea != idaapi.BADADDR and ea >= block_start:
        if ida_bytes.is_code(ida_bytes.get_full_flags(ea)):
            return ea
        ea = ida_bytes.prev_head(ea, block_start)
    return idaapi.BADADDR


def patch_jmp(frm, to):
    """
    Force the bytes at frm to become:
    jmp to

    Used to:
    - replace the original jump to the dispatcher
    - replace an original jcc / indirect jump
    """
    ida_bytes.del_items(frm, ida_bytes.DELIT_SIMPLE)
    ida_ua.create_insn(frm)
    ida_bytes.patch_byte(frm, 0xE9)
    # Mask to 32 bits so a backward (negative) displacement is written in two's complement
    rel = (to - (frm + 5)) & 0xFFFFFFFF
    ida_bytes.patch_dword(frm + 1, rel)


def emit_jz_jmp(ea, true_target, false_target):
    """
    Build the following logic inside a fake block:

    jz true_target
    jmp false_target

    Used to:
    - restore a real if / while / for conditional branch
    - ZF == 1 -> true_target
    - ZF == 0 -> false_target
    """

    # jz true_target
    ida_bytes.del_items(ea, ida_bytes.DELIT_SIMPLE)
    ida_ua.create_insn(ea)
    ida_bytes.patch_byte(ea, 0x0F)
    ida_bytes.patch_byte(ea + 1, 0x84)
    rel = (true_target - (ea + 6)) & 0xFFFFFFFF
    ida_bytes.patch_dword(ea + 2, rel)
    ea += 6

    # jmp false_target
    ida_bytes.del_items(ea, ida_bytes.DELIT_SIMPLE)
    ida_ua.create_insn(ea)
    ida_bytes.patch_byte(ea, 0xE9)
    rel = (false_target - (ea + 5)) & 0xFFFFFFFF
    ida_bytes.patch_dword(ea + 1, rel)
    ea += 5

    return ea


print("[*] Starting Patching...")

# ------------------------------------------------------------
# 3. Repair the function prologue block
# ------------------------------------------------------------
# The prologue of main should no longer enter the dispatcher;
# it jumps straight to the first real block
first_real_block = real_flow[0][0][0]
patch_jmp(
    get_last_insn_ea(PROLOGUE_START, PROLOGUE_END),
    first_real_block
)

# ------------------------------------------------------------
# 4. Repair all real blocks
# ------------------------------------------------------------
for block, zf_set in block_zf_map.items():
    start, end = block
    last_insn = get_last_insn_ea(start, end)
    branches = block_next_map[block]

    # --------------------------------------------------------
    # Case A: only one ZF value was seen for this real block
    # -> effectively a "degenerate condition" or a direct-jump block
    # --------------------------------------------------------
    if len(zf_set) == 1:
        zf = list(zf_set)[0]
        target = list(branches[zf])[0][0]
        patch_jmp(last_insn, target)

    # --------------------------------------------------------
    # Case B: both ZF=0 and ZF=1 were seen for this real block
    # -> a genuine conditional branch
    # --------------------------------------------------------
    else:
        # Allocate a fake block as the conditional trampoline
        fb_start, fb_end = alloc_fake_block()

        # The original real block jumps unconditionally to the fake block
        patch_jmp(last_insn, fb_start)

        # Work out the real targets for ZF=1 / ZF=0
        true_target = list(branches[1])[0][0]
        false_target = list(branches[0])[0][0]

        # Clear the fake block
        nop_range(fb_start, fb_end)

        # Write:
        # if (ZF) goto true_target
        # else goto false_target
        emit_jz_jmp(fb_start, true_target, false_target)

# ------------------------------------------------------------
# 5. Repair the last real block -> return block
# ------------------------------------------------------------
last_true_block_start = real_flow[-1][0][0]
last_true_block_end = real_flow[-1][0][1]
patch_jmp(
    get_last_insn_ea(last_true_block_start, last_true_block_end),
    RETURN_BLOCK
)

# ------------------------------------------------------------
# 6. Clean up all unused fake blocks
# ------------------------------------------------------------
# Keeps leftover FLA junk logic from lingering
for fb in fake_blocks:
    if fb not in used_fake:
        nop_range(fb[0], fb[1])

print("[+] Patching Done! Press F5 to decompile.")

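[Example] RoarCTF2019 polyre

P.S. IDA 9 now seems to handle this kind of OLLVM obfuscation better and can recover the logic normally.

angr

Here I ran two scripts, one from each of two authors: "angr符号执行对抗ollvm" (Qmeimei’s Blog) and "[原创]深入浅出 Ollvm 混淆原理及反混淆技术" (Kanxue security community).

The second script is the one shown above (it is easier to follow; the first is more thorough and covers more variants). The two produce different results because the second classifies the dispatcher itself as a fake block (which seems reasonable to me; I will try both later).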
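
The first script starts by locating loop heads, since the main dispatcher of a flattened function is a loop head. The sketch below shows that search on a plain dictionary instead of an IDA flow chart. It is an illustration only: find_loop_heads, toy_cfg and the block names are hypothetical, and the path-carrying BFS is the simple approach, which can become slow on large graphs.

```python
from collections import deque


def find_loop_heads(cfg, entry):
    """Return the sorted nodes that some path from entry reaches a second time."""
    heads = set()
    queue = deque([(entry, ())])
    while queue:
        node, path = queue.popleft()
        if node in path:
            # We came back to a node already on this path: it heads a loop.
            heads.add(node)
            continue
        path = path + (node,)
        queue.extend((succ, path) for succ in cfg.get(node, ()))
    return sorted(heads)


# Toy flattened CFG: every real block returns to the dispatcher via "pre".
toy_cfg = {
    "prologue": ["dispatcher"],
    "dispatcher": ["A", "B", "ret"],
    "A": ["pre"],
    "B": ["pre"],
    "pre": ["dispatcher"],
    "ret": [],
}
print(find_loop_heads(toy_cfg, "prologue"))
```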

python1 (if the logic is hard to follow, read the author's blog post directly):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
import idaapi
import idc
from collections import deque  # used for the BFS traversal


def get_basic_block(ea):
    func = idaapi.get_func(ea)
    if not func:
        return None
    f = idaapi.FlowChart(func)  # control-flow graph of the function
    for block in f:
        if block.start_ea <= ea < block.end_ea:
            return block
    return None


def find_loop_head(start_ea):
    loop_heads = set()
    queue = deque()  # BFS queue
    block = get_basic_block(start_ea)  # basic block containing the start address
    queue.append((block, []))
    while len(queue) > 0:
        cur_block, path = queue.popleft()
        if cur_block.start_ea in path:
            loop_heads.add(cur_block.start_ea)  # found a loop head
            continue
        path = path + [cur_block.start_ea]  # extend the path
        queue.extend((s, path) for s in cur_block.succs())  # enqueue successors

    all_loop_heads = list(loop_heads)
    all_loop_heads.sort()  # ascending, so the main loop head comes first
    print("[+]Find loop heads:", [hex(lh) for lh in all_loop_heads], " -- total:", len(all_loop_heads))
    return all_loop_heads


def find_converge_addr(loop_head_addr):
    converge_addr = 0
    block = get_basic_block(loop_head_addr)  # loop head
    pred_list = list(block.preds())  # predecessor basic blocks

    if len(pred_list) == 2:  # standard ollvm: the loop head has two predecessors, a prologue block and a converge block
        for pred in pred_list:
            tmp_list = list(pred.preds())
            if len(tmp_list) > 1:  # the block with several predecessors is the converge block
                converge_addr = pred.start_ea
    print("[+]Find converge_addr:", hex(converge_addr))
    return converge_addr


def get_block_size(block):
    return block.end_ea - block.start_ea


def find_ret_block(blocks):
    for block in blocks:
        succs_list = list(block.succs())  # successor blocks

        end_ea = block.end_ea  # end_ea is the address right after the last instruction of the block
        last_inst_ea = idc.prev_head(end_ea)  # address of the last instruction in the block
        mnem = idc.print_insn_mnem(last_inst_ea)  # instruction mnemonic

        if len(succs_list) == 0:  # no successors
            if mnem == "retn":  # the last instruction is a ret
                ori_ret_block = block

                # walk upwards to find a better-suited ret block
                while True:
                    pred_list = list(block.preds())
                    if len(pred_list) == 1:  # exactly one predecessor
                        block = pred_list[0]
                        if get_block_size(block) == 4:  # single-instruction block
                            continue
                        else:
                            break
                    else:  # several predecessors, or none
                        break

                # handle the sub-dispatcher case
                block2 = block
                num = 0
                i = 0
                while True:
                    i += 1
                    succs_block = block2.succs()
                    for succ in succs_block:
                        child_succs = succ.succs()
                        succ_list = list(child_succs)
                        if len(succ_list) != 0:
                            block2 = succ
                            num += 1
                    if num > 2:
                        block = ori_ret_block
                        break
                    if i > 2:
                        break
                print("[+]ret block", hex(block.start_ea))
                return block.start_ea


def find_all_real_blocks(fun_ea):
    blocks = idaapi.FlowChart(idaapi.get_func(fun_ea))
    loop_heads = find_loop_head(fun_ea)
    all_real_blocks = []

    for loop_head_addr in loop_heads:
        loop_head_block = get_basic_block(loop_head_addr)
        converge_addr = find_converge_addr(loop_head_addr)
        real_blocks = []

        # find the prologue
        loop_head_preds = list(loop_head_block.preds())
        loop_head_preds_addr = [b.start_ea for b in loop_head_preds]
        if loop_head_addr != converge_addr:
            loop_head_preds_addr.remove(converge_addr)
            print("Prologue blocks:", [hex(x) for x in loop_head_preds_addr])
            real_blocks.extend(loop_head_preds_addr)

        converge_block = get_basic_block(converge_addr)
        list_preds = list(converge_block.preds())

        for pred in list_preds:
            size = get_block_size(pred)
            if size > 5:  # larger than a single-instruction block that is only a jump
                real_blocks.append(pred.start_ea)

        real_blocks.sort()  # sorted, the first one is the prologue block
        all_real_blocks.append(real_blocks)

        print("Sub-loop head and its real blocks", [hex(child_block_ea) for child_block_ea in real_blocks])

    ret_addr = find_ret_block(blocks)
    all_real_blocks.append(ret_addr)
    print("all_real_blocks:", all_real_blocks)

    all_real_block_list = []
    for real_blocks in all_real_blocks:
        if isinstance(real_blocks, list):
            all_real_block_list.extend(real_blocks)
        else:
            all_real_block_list.append(real_blocks)

    print(f"\nAll real blocks collected. Real block count: {len(all_real_block_list)}")
    print(all_real_block_list)

    # all_child_prologue_addr = all_real_blocks.copy()
    # all_child_prologue_addr.remove(ret_addr)
    # all_child_prologue_addr.remove(all_child_prologue_addr[0])  # drop the main prologue block
    # print("All sub-loops and their real blocks", all_child_prologue_addr)
    return 0


find_all_real_blocks(0x400620)

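Core idea: find the loop head first, then the converge block, and treat the predecessors of the converge block as the real blocks.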

Propagating type information...
Function argument information has been propagated
The initial autoanalysis has been finished.
400536: positive sp value 8 has been found
40053D: variable 'v3' is possibly undefined

[+]Find loop heads: ['0x40063f'] -- total: 1
[+]Find converge_addr: 0x4020cc
Prologue blocks: ['0x400620']
Sub-loop head and its real blocks ['0x400620', '0x401121', '0x401198', '0x4011de', '0x40124f', '0x40125e', '0x4012a4', '0x4012f6', '0x401305', '0x401326', '0x40136c', '0x4013b2', '0x4013cf', '0x4013ef', '0x401435', '0x401481', '0x401490', '0x4014ae', '0x4014d2', '0x4014e8', '0x4014f7', '0x401506', '0x401521', '0x401567', '0x4015b6', '0x4015c5', '0x4015d4', '0x4015ed', '0x4015fc', '0x401642', '0x401691', '0x4016a0', '0x4016e6', '0x401739', '0x401748', '0x401765', '0x4017ab', '0x4017fc', '0x40180b', '0x401830', '0x401849', '0x401861', '0x4018a7', '0x4018fa', '0x401909', '0x401926', '0x401940', '0x401960', '0x40197d', '0x40199b', '0x4019e1', '0x401a3d', '0x401a4c', '0x401a73', '0x401a8d', '0x401ad3', '0x401b25', '0x401b34', '0x401b4e', '0x401b5d', '0x401b75', '0x401bbb', '0x401c0d', '0x401c1c', '0x401c2b', '0x401c46', '0x401c69', '0x401caf', '0x401d03', '0x401d12', '0x401d2d', '0x401d45', '0x401d54', '0x401d9a', '0x401e00', '0x401e0f', '0x401e2d', '0x401e73', '0x401eb9', '0x401ed6', '0x401efa', '0x401f09', '0x401f2d', '0x401f3c', '0x401f60', '0x401f97', '0x401fa6', '0x401fb5', '0x401fcd', '0x401fe5', '0x401ff4', '0x40200c', '0x40201b', '0x402033', '0x40204d', '0x402072', '0x402096', '0x4020b3', '0x4020c2']
[+]ret block 0x401f54
all_real_blocks: [[4195872, 4198689, 4198808, 4198878, 4198991, 4199006, 4199076, 4199158, 4199173, 4199206, 4199276, 4199346, 4199375, 4199407, 4199477, 4199553, 4199568, 4199598, 4199634, 4199656, 4199671, 4199686, 4199713, 4199783, 4199862, 4199877, 4199892, 4199917, 4199932, 4200002, 4200081, 4200096, 4200166, 4200249, 4200264, 4200293, 4200363, 4200444, 4200459, 4200496, 4200521, 4200545, 4200615, 4200698, 4200713, 4200742, 4200768, 4200800, 4200829, 4200859, 4200929, 4201021, 4201036, 4201075, 4201101, 4201171, 4201253, 4201268, 4201294, 4201309, 4201333, 4201403, 4201485, 4201500, 4201515, 4201542, 4201577, 4201647, 4201731, 4201746, 4201773, 4201797, 4201812, 4201882, 4201984, 4201999, 4202029, 4202099, 4202169, 4202198, 4202234, 4202249, 4202285, 4202300, 4202336, 4202391, 4202406, 4202421, 4202445, 4202469, 4202484, 4202508, 4202523, 4202547, 4202573, 4202610, 4202646, 4202675, 4202690], 4202324]

All real blocks collected. Real block count: 100
[4195872, 4198689, 4198808, 4198878, 4198991, 4199006, 4199076, 4199158, 4199173, 4199206, 4199276, 4199346, 4199375, 4199407, 4199477, 4199553, 4199568, 4199598, 4199634, 4199656, 4199671, 4199686, 4199713, 4199783, 4199862, 4199877, 4199892, 4199917, 4199932, 4200002, 4200081, 4200096, 4200166, 4200249, 4200264, 4200293, 4200363, 4200444, 4200459, 4200496, 4200521, 4200545, 4200615, 4200698, 4200713, 4200742, 4200768, 4200800, 4200829, 4200859, 4200929, 4201021, 4201036, 4201075, 4201101, 4201171, 4201253, 4201268, 4201294, 4201309, 4201333, 4201403, 4201485, 4201500, 4201515, 4201542, 4201577, 4201647, 4201731, 4201746, 4201773, 4201797, 4201812, 4201882, 4201984, 4201999, 4202029, 4202099, 4202169, 4202198, 4202234, 4202249, 4202285, 4202300, 4202336, 4202391, 4202406, 4202421, 4202445, 4202469, 4202484, 4202508, 4202523, 4202547, 4202573, 4202610, 4202646, 4202675, 4202690, 4202324]
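
The "loop head, then converge block, then its predecessors" idea can be tried outside IDA on a toy graph. The sketch below is only an illustration: the `CFG` dictionary and its block names are invented, and the size filter and ret-block search of the IDA script are left out.

```python
from collections import deque

# Hypothetical flattened CFG: block name -> list of successor names.
CFG = {
    "prologue":   ["dispatcher"],
    "dispatcher": ["real_1", "real_2", "ret"],
    "real_1":     ["converge"],
    "real_2":     ["converge"],
    "converge":   ["dispatcher"],
    "ret":        [],
}


def predecessors(cfg, node):
    """All blocks that list `node` as a successor."""
    return [n for n, succs in cfg.items() if node in succs]


def find_loop_heads(cfg, entry):
    """BFS over paths; a block that reappears on its own path is a loop head."""
    heads, queue = set(), deque([(entry, [])])
    while queue:
        node, path = queue.popleft()
        if node in path:
            heads.add(node)
            continue
        queue.extend((s, path + [node]) for s in cfg[node])
    return sorted(heads)


def find_real_blocks(cfg, entry):
    real = []
    for head in find_loop_heads(cfg, entry):
        preds = predecessors(cfg, head)
        # The converge block is the predecessor that itself has several predecessors.
        converge = next(p for p in preds if len(predecessors(cfg, p)) > 1)
        real += [p for p in preds if p != converge]  # the prologue block
        real += predecessors(cfg, converge)          # the real blocks
    return real


print(find_loop_heads(CFG, "prologue"))
print(find_real_blocks(CFG, "prologue"))
```

On this toy graph the only loop head is the dispatcher, and the blocks feeding the converge block are reported as real blocks together with the prologue.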

Output of the second script:

[*] Analyzing function at 0x400620...
[+] Found Prologue: 0x400620
[+] Prologue_end: 0x400635
[+] Found Return Block: 0x401f54

[+] True Blocks Count: 99
TBS = [(4198684, 4198684), (4198689, 4198803), (4198808, 4198873), (4198878, 4198986), (4198991, 4199001), (4199006, 4199071), (4199076, 4199153), (4199158, 4199168), (4199173, 4199201), (4199206, 4199271), (4199276, 4199341), (4199346, 4199370), (4199375, 4199402), (4199407, 4199472), (4199477, 4199548), (4199553, 4199563), (4199568, 4199593), (4199598, 4199629), (4199634, 4199651), (4199656, 4199666), (4199671, 4199681), (4199686, 4199708), (4199713, 4199778), (4199783, 4199857), (4199862, 4199872), (4199877, 4199887), (4199892, 4199912), (4199917, 4199927), (4199932, 4199997), (4200002, 4200076), (4200081, 4200091), (4200096, 4200161), (4200166, 4200244), (4200249, 4200259), (4200264, 4200288), (4200293, 4200358), (4200363, 4200439), (4200444, 4200454), (4200459, 4200491), (4200496, 4200516), (4200521, 4200540), (4200545, 4200610), (4200615, 4200693), (4200698, 4200708), (4200713, 4200737), (4200742, 4200763), (4200768, 4200795), (4200800, 4200824), (4200829, 4200854), (4200859, 4200924), (4200929, 4201016), (4201021, 4201031), (4201036, 4201070), (4201075, 4201096), (4201101, 4201166), (4201171, 4201248), (4201253, 4201263), (4201268, 4201289), (4201294, 4201304), (4201309, 4201328), (4201333, 4201398), (4201403, 4201480), (4201485, 4201495), (4201500, 4201510), (4201515, 4201537), (4201542, 4201572), (4201577, 4201642), (4201647, 4201726), (4201731, 4201741), (4201746, 4201768), (4201773, 4201792), (4201797, 4201807), (4201812, 4201877), (4201882, 4201979), (4201984, 4201994), (4201999, 4202024), (4202029, 4202094), (4202099, 4202164), (4202169, 4202193), (4202198, 4202229), (4202234, 4202244), (4202249, 4202280), (4202285, 4202295), (4202300, 4202319), (4202336, 4202386), (4202391, 4202401), (4202406, 4202416), (4202421, 4202440), (4202445, 4202464), (4202469, 4202479), (4202484, 4202503), (4202508, 4202518), (4202523, 4202542), (4202547, 4202568), (4202573, 4202605), (4202610, 4202641), (4202646, 4202670), (4202675, 4202685), (4202690, 4202690)]

[+] Fake Blocks Count: 199
FBS = [(4195903, 4195929), (4195935, 4195935), (4195940, 4195957), (4195963, 4195963), (4195968, 4195985), (4195991, 4195991), (4195996, 4196013), (4196019, 4196019), (4196024, 4196041), (4196047, 4196047), (4196052, 4196069), (4196075, 4196075), (4196080, 4196097), (4196103, 4196103), (4196108, 4196125), (4196131, 4196131), (4196136, 4196153), (4196159, 4196159), (4196164, 4196181), (4196187, 4196187), (4196192, 4196209), (4196215, 4196215), (4196220, 4196237), (4196243, 4196243), (4196248, 4196265), (4196271, 4196271), (4196276, 4196293), (4196299, 4196299), (4196304, 4196321), (4196327, 4196327), (4196332, 4196349), (4196355, 4196355), (4196360, 4196377), (4196383, 4196383), (4196388, 4196405), (4196411, 4196411), (4196416, 4196433), (4196439, 4196439), (4196444, 4196461), (4196467, 4196467), (4196472, 4196489), (4196495, 4196495), (4196500, 4196517), (4196523, 4196523), (4196528, 4196545), (4196551, 4196551), (4196556, 4196573), (4196579, 4196579), (4196584, 4196601), (4196607, 4196607), (4196612, 4196629), (4196635, 4196635), (4196640, 4196657), (4196663, 4196663), (4196668, 4196685), (4196691, 4196691), (4196696, 4196713), (4196719, 4196719), (4196724, 4196741), (4196747, 4196747), (4196752, 4196769), (4196775, 4196775), (4196780, 4196797), (4196803, 4196803), (4196808, 4196825), (4196831, 4196831), (4196836, 4196853), (4196859, 4196859), (4196864, 4196881), (4196887, 4196887), (4196892, 4196909), (4196915, 4196915), (4196920, 4196937), (4196943, 4196943), (4196948, 4196965), (4196971, 4196971), (4196976, 4196993), (4196999, 4196999), (4197004, 4197021), (4197027, 4197027), (4197032, 4197049), (4197055, 4197055), (4197060, 4197077), (4197083, 4197083), (4197088, 4197105), (4197111, 4197111), (4197116, 4197133), (4197139, 4197139), (4197144, 4197161), (4197167, 4197167), (4197172, 4197189), (4197195, 4197195), (4197200, 4197217), (4197223, 4197223), (4197228, 4197245), (4197251, 4197251), (4197256, 4197273), (4197279, 4197279), (4197284, 4197301), (4197307, 
4197307), (4197312, 4197329), (4197335, 4197335), (4197340, 4197357), (4197363, 4197363), (4197368, 4197385), (4197391, 4197391), (4197396, 4197413), (4197419, 4197419), (4197424, 4197441), (4197447, 4197447), (4197452, 4197469), (4197475, 4197475), (4197480, 4197497), (4197503, 4197503), (4197508, 4197525), (4197531, 4197531), (4197536, 4197553), (4197559, 4197559), (4197564, 4197581), (4197587, 4197587), (4197592, 4197609), (4197615, 4197615), (4197620, 4197637), (4197643, 4197643), (4197648, 4197665), (4197671, 4197671), (4197676, 4197693), (4197699, 4197699), (4197704, 4197721), (4197727, 4197727), (4197732, 4197749), (4197755, 4197755), (4197760, 4197777), (4197783, 4197783), (4197788, 4197805), (4197811, 4197811), (4197816, 4197833), (4197839, 4197839), (4197844, 4197861), (4197867, 4197867), (4197872, 4197889), (4197895, 4197895), (4197900, 4197917), (4197923, 4197923), (4197928, 4197945), (4197951, 4197951), (4197956, 4197973), (4197979, 4197979), (4197984, 4198001), (4198007, 4198007), (4198012, 4198029), (4198035, 4198035), (4198040, 4198057), (4198063, 4198063), (4198068, 4198085), (4198091, 4198091), (4198096, 4198113), (4198119, 4198119), (4198124, 4198141), (4198147, 4198147), (4198152, 4198169), (4198175, 4198175), (4198180, 4198197), (4198203, 4198203), (4198208, 4198225), (4198231, 4198231), (4198236, 4198253), (4198259, 4198259), (4198264, 4198281), (4198287, 4198287), (4198292, 4198309), (4198315, 4198315), (4198320, 4198337), (4198343, 4198343), (4198348, 4198365), (4198371, 4198371), (4198376, 4198393), (4198399, 4198399), (4198404, 4198421), (4198427, 4198427), (4198432, 4198449), (4198455, 4198455), (4198460, 4198477), (4198483, 4198483), (4198488, 4198505), (4198511, 4198511), (4198516, 4198533), (4198539, 4198539), (4198544, 4198561), (4198567, 4198567), (4198572, 4198589), (4198595, 4198595), (4198600, 4198617), (4198623, 4198623), (4198628, 4198645), (4198651, 4198651), (4198656, 4198673), (4198679, 4198679), (4202700, 4202700)]

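Once the real blocks are known, the next step is to determine the order in which they execute.

In OLLVM flattening, the successor of a real block is not chosen by an explicit jz/jnz branch. Instead, a conditional-select instruction such as cmovxx (x86) or csel (arm64) writes one of two values into state (the dispatch variable), and control then jumps back to the dispatcher. So to recover the real CFG (which real block each real block leads to), the key is: for every real block, run both outcomes of the cmovxx, treat each outcome as one successor edge, and store the edges in a mapping table for the later patch/rebuild step.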

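The entry (prologue) of a flattened function usually does the following:

  • sets up the stack frame and saves registers (prologue)
  • initializes the state variable (dispatcher index)
  • initializes some global/local variables

If you skip the prologue and start running directly from some real block:

  • the stack frame (rbp, local variables) may contain garbage
  • the state/global variable values are wrong
  • the computations inside the real block go astray
  • the state finally written back for the dispatcher is wrong as well

Run the prologue first to initialize the environment; only then are the execution results repeatable and controllable.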

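jz/jnz are control-flow branches; when angr meets one it naturally forks into two states.

cmovxx is a data-flow select:

  • control flow stays a single line (sequential execution)
  • only the register value turns into a conditional expression such as ITE(cond, a, b)

So by default angr represents ecx as: ecx = if cond then eax else ecx_old

It then carries that expression along and does not produce two paths on its own.

In flattened code a real block does not fork directly with jz loc_A / jnz loc_B, because that would make the CFG obvious.

Instead it does this:

  1. Compute a condition (flags such as ZF/CF/SF are set by cmp/test)
  2. Prepare two candidate state values (e.g. ecx = A, eax = B)
  3. Use a conditional-select instruction to write one of them into state:
    • x86: cmovnz ecx, eax (if the condition holds, eax is written into ecx; otherwise ecx keeps its value)
    • arm64: csel x, a, b, cond
  4. Store this state to memory (e.g. [rbp+var_114] or a global)
  5. jmp to the dispatcher unconditionally

In the example block: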

cmp     eax, 0
setz    sil
cmp     ecx, 0Ah
setl    dil
or      sil, dil
test    sil, 1
mov     eax, 0F37184F0h
mov     ecx, 0A105D2C4h
cmovnz  ecx, eax
mov     [rbp+var_114], ecx
jmp     loc_4020CC

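This means:

  • the computation at the top decides ZF (that is, the nz condition)
  • ecx first holds the "default state = 0xA105D2C4"
  • eax holds the "other state = 0xF37184F0"
  • cmovnz ecx, eax: if the condition holds, state is changed to the other value
  • the result is then written to [rbp+var_114] and control jumps back to the dispatcher
  • the dispatcher jumps to a different real block depending on the value of [rbp+var_114]

The successor edge is determined by the state that cmovxx writes.

For each real block: start from the same initialized snapshot; when a cmovxx is reached, do not let angr treat it as an ITE expression but fork two states by hand (cond true/false), run each one through the dispatcher to the next real block, and so recover both successor edges of the block. The edges are saved in ZF=1/ZF=0 order in a mapping table used for the later patch.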
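
To make the "run both cmov outcomes" idea concrete without angr, here is a minimal pure-Python model. Everything in it is invented for illustration (the state constants, the `DISPATCH` table and the three block functions); it only mirrors the shape of the technique, not the actual binary.

```python
# Toy model of a flattened function: each real block returns the next
# state value, and the dispatcher maps state values to real blocks.
DISPATCH = {0x10: "A", 0x20: "B", 0x30: "C", 0x40: "RET"}


def block_a(zf):
    # mov eax, 0x30 ; mov ecx, 0x20 ; cmovnz ecx, eax
    ecx, eax = 0x20, 0x30
    if not zf:  # cmovnz moves only when ZF == 0
        ecx = eax
    return ecx  # mov [state], ecx ; jmp dispatcher


def block_b(zf):
    return 0x40  # unconditional: always the same next state


def block_c(zf):
    return 0x40


BLOCKS = {"A": block_a, "B": block_b, "C": block_c}


def recover_edges(blocks, dispatch):
    """Force both ZF values for every block and record the distinct successors."""
    edges = {}
    for name, body in blocks.items():
        succs = []
        for zf in (True, False):  # ZF=1 first, then ZF=0
            target = dispatch[body(zf)]
            if target not in succs:
                succs.append(target)
        edges[name] = succs
    return edges


print(recover_edges(BLOCKS, DISPATCH))
```

A block with a conditional select ends up with two successors in ZF=1/ZF=0 order, while a block that always writes the same state ends up with one, which is the same shape as the `path` table built by the angr script.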

import logging
import angr
from tqdm import tqdm

logging.getLogger('angr').setLevel(logging.ERROR)


def capstone_decode_cmovxx(insn):
    operands = insn.op_str.replace(" ", "").split(",")
    dst_reg = operands[0]
    src_reg = operands[1]
    print(f"cmovxx decoded: dst register: {dst_reg}, src register: {src_reg}")
    return dst_reg, src_reg


def find_state_succ_cmovxx(proj, base, local_state, flag, real_blocks, real_block_addr, path):
    # Called only after find_block_succ has recognized a cmov
    ins = local_state.block().capstone.insns[0]
    dst_reg, src_reg = capstone_decode_cmovxx(ins)

    # flag == True  -> skip the move (for cmovnz this is the ZF=1 case)
    # flag == False -> perform the move (for cmovnz this is the ZF=0 case)
    if not flag:  # the move has to be performed
        try:
            # state.regs has no .get() method, so use getattr/setattr
            src_val = getattr(local_state.regs, src_reg)
            setattr(local_state.regs, dst_reg, src_val)
        except Exception as e:
            print(f"register access error: {e}")

    # Important: step over the cmov by hand so angr does not execute it again
    local_state.regs.ip += ins.size

    sm = proj.factory.simgr(local_state)

    while len(sm.active):
        for active_state in sm.active:
            try:
                ins_offset = active_state.addr - base
                if ins_offset in real_blocks:
                    value = path[real_block_addr]
                    if ins_offset not in value:
                        value.append(ins_offset)
                    return ins_offset
            except Exception:
                pass
        sm.step(num_inst=1)


def find_block_succ(proj, base, func_offset, state, real_block_addr, real_blocks, path):
    msm = proj.factory.simgr(state)  # build the simulation manager
    while len(msm.active):
        for active_state in msm.active:
            offset = active_state.addr - base
            if offset == real_block_addr:  # reached the real block
                print("Found real block:", hex(real_block_addr))
                mstate = active_state.copy()  # copy the state to search for successors
                msm2 = proj.factory.simgr(mstate)
                msm2.step(num_inst=1)  # move to the next instruction inside the block so it is not confused with the outer state

                while len(msm2.active):

                    for mactive_state in msm2.active:
                        ins_offset = mactive_state.addr - base
                        if ins_offset in real_blocks:  # block without a branch (or unconditional jump)
                            # With an unconditional jump, if at least two paths reach real blocks
                            # at the same time, take the real block that is not the ret block
                            msm2_len = len(msm2.active)
                            if msm2_len > 1:
                                tmp_addrs = []
                                for s in msm2.active:
                                    moffset = s.addr - base
                                    tmp_value = path[real_block_addr]
                                    if moffset in real_blocks and moffset not in tmp_value:
                                        tmp_addrs.append(moffset)
                                if len(tmp_addrs) > 1:
                                    print("At least two paths reached a real block at the same time:", [hex(tmp_addr) for tmp_addr in tmp_addrs])
                                    ret_addr = real_blocks[len(real_blocks) - 1]
                                    if ret_addr in tmp_addrs:
                                        tmp_addrs.remove(ret_addr)
                                    ins_offset = tmp_addrs[0]
                                    print("Successor chosen among them:", hex(ins_offset))

                            value = path[real_block_addr]
                            if ins_offset not in value:
                                value.append(ins_offset)
                            print(f"Unconditional edge: {hex(real_block_addr)}-->{hex(ins_offset)}")
                            return
                        # possibly a cmovnz select
                        ins = mactive_state.block().capstone.insns[0]
                        if ins.mnemonic == 'cmovnz' or ins.mnemonic == 'cmovne':
                            print("Found cmovnz/cmovne, forking:", hex(ins_offset))
                            state_true = mactive_state.copy()
                            state_true_succ_addr = find_state_succ_cmovxx(proj, base, state_true, True, real_blocks, real_block_addr, path)

                            state_false = mactive_state.copy()
                            state_false_succ_addr = find_state_succ_cmovxx(proj, base, state_false, False, real_blocks, real_block_addr, path)
                            if state_true_succ_addr is None or state_false_succ_addr is None:
                                print("cmovnz/cmovne failed at instruction:", hex(ins_offset))
                                print(f"cmovnz/cmovne successors are wrong: {hex(real_block_addr)}-->{hex(state_true_succ_addr) if state_true_succ_addr is not None else state_true_succ_addr},"
                                      f"{hex(state_false_succ_addr) if state_false_succ_addr is not None else state_false_succ_addr}")
                                return "error"
                            print(f"cmovnz/cmovne edges: {hex(real_block_addr)}-->{hex(state_true_succ_addr)} zf = 1, {hex(state_false_succ_addr)} zf != 1")
                            return
                        if ins.mnemonic == 'cmovz' or ins.mnemonic == 'cmove':
                            print("Found cmovz/cmove, forking:", hex(ins_offset))
                            # For cmovz the move happens when ZF=1, so the flags are swapped
                            state_true = mactive_state.copy()
                            state_true_succ_addr = find_state_succ_cmovxx(proj, base, state_true, False, real_blocks, real_block_addr, path)

                            state_false = mactive_state.copy()
                            state_false_succ_addr = find_state_succ_cmovxx(proj, base, state_false, True, real_blocks, real_block_addr, path)
                            if state_true_succ_addr is None or state_false_succ_addr is None:
                                print("cmovz/cmove failed at instruction:", hex(ins_offset))
                                print(f"cmovz/cmove successors are wrong: {hex(real_block_addr)}-->{hex(state_true_succ_addr) if state_true_succ_addr is not None else state_true_succ_addr},"
                                      f"{hex(state_false_succ_addr) if state_false_succ_addr is not None else state_false_succ_addr}")
                                return "error"
                            print(f"cmovz/cmove edges: {hex(real_block_addr)}-->{hex(state_true_succ_addr)} zf = 1, {hex(state_false_succ_addr)} zf != 1")
                            return

                    msm2.step(num_inst=1)
                # If no successor was found, this is the return block (the last entry of the real-block list)
                return
        msm.step(num_inst=1)


def angr_main(real_blocks, func_offset, file_path):
    proj = angr.Project(file_path, auto_load_libs=False)
    base = 0
    func_addr = base + func_offset
    init_state = proj.factory.blank_state(addr=func_addr)
    init_state.options.add(angr.options.CALLLESS)

    path = {addr: [] for addr in real_blocks}
    ret_addr = real_blocks[len(real_blocks) - 1]

    first_block = proj.factory.block(func_addr)
    first_block_insns = first_block.capstone.insns
    # last instruction of the main prologue
    first_block_last_ins = first_block_insns[len(first_block_insns) - 1]
    print(hex(first_block_last_ins.address))

    for real_block_addr in tqdm(real_blocks):
        if ret_addr == real_block_addr:
            continue

        state = init_state.copy()
        print("Looking for:", hex(real_block_addr))

        def jump_to_address(state):
            # The -6 presumably compensates for the hook length that angr adds
            # to pc when the hook returns (this assumes a 6-byte instruction)
            state.regs.pc = base + real_block_addr - 6
            print("Jumping to address:", hex(base + real_block_addr - 6))
            proj.unhook(0x400675)
        print(hex(real_block_addr), hex(func_offset))

        if real_block_addr != func_offset:
            print("Prologue finished")
            proj.hook(0x400675, jump_to_address, first_block_last_ins.size)

        ret = find_block_succ(proj, base, func_offset, state, real_block_addr, real_blocks, path)

        if ret == "error":
            return

    hex_dict = {
        hex(key): [hex(value) for value in values]
        for key, values in path.items()
    }

    for i in hex_dict.keys():
        print(f"{i}: {hex_dict[i]}")
    print(hex_dict)
    return hex_dict

all_real_blocks: list[int] =[4195872, 4198689, 4198808, 4198878, 4198991, 4199006, 4199076, 4199158, 4199173, 4199206, 4199276, 4199346, 4199375, 4199407, 4199477, 4199553, 4199568, 4199598, 4199634, 4199656, 4199671, 4199686, 4199713, 4199783, 4199862, 4199877, 4199892, 4199917, 4199932, 4200002, 4200081, 4200096, 4200166, 4200249, 4200264, 4200293, 4200363, 4200444, 4200459, 4200496, 4200521, 4200545, 4200615, 4200698, 4200713, 4200742, 4200768, 4200800, 4200829, 4200859, 4200929, 4201021, 4201036, 4201075, 4201101, 4201171, 4201253, 4201268, 4201294, 4201309, 4201333, 4201403, 4201485, 4201500, 4201515, 4201542, 4201577, 4201647, 4201731, 4201746, 4201773, 4201797, 4201812, 4201882, 4201984, 4201999, 4202029, 4202099, 4202169, 4202198, 4202234, 4202249, 4202285, 4202300, 4202336, 4202391, 4202406, 4202421, 4202445, 4202469, 4202484, 4202508, 4202523, 4202547, 4202573, 4202610, 4202646, 4202675, 4202690, 4202324]


angr_main(all_real_blocks, 0x400620, r"./attachement")

The final output has this shape:

path = {
    real_block_addr1: [succ1, succ2?],
    real_block_addr2: [succ1, succ2?],
    ...
}

capstone_decode_cmovxx(insn): parses the two register operands of a cmov

operands = insn.op_str.replace(" ", "").split(",")
dst_reg = operands[0]
src_reg = operands[1]

For example, cmovnz ecx, eax is parsed into:

  • dst = "ecx"
  • src = "eax"
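
The same string splitting can be tried without capstone. The sketch below assumes the operand string has the form capstone exposes as `op_str` (for example `"ecx, eax"`); the function name `parse_cmov_operands` and the rejection of memory operands are additions for illustration, not part of the original script.

```python
def parse_cmov_operands(op_str):
    """Split an operand string such as "ecx, eax" into (dst, src).

    Memory source operands are rejected because the surrounding approach
    copies register to register via getattr/setattr on state.regs.
    """
    parts = [p.strip() for p in op_str.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected two operands, got {op_str!r}")
    dst, src = parts
    if "[" in src or "[" in dst:
        raise ValueError(f"memory operand not supported: {op_str!r}")
    return dst, src


print(parse_cmov_operands("ecx, eax"))
```

Rejecting memory operands early makes the failure visible, instead of silently ending up in the "register access error" branch later.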

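find_state_succ_cmovxx(...): performs or skips the cmov by hand, then keeps running until the next real block. This function is called only when a cmov has been detected. It does three things.

Fetch the current instruction (assuming execution is sitting exactly on the cmov)

ins = local_state.block().capstone.insns[0]
dst_reg, src_reg = capstone_decode_cmovxx(ins)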

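flag decides whether the move semantics of this cmov are carried out

The comments in the code describe the logic for cmovnz:

  • flag == True is read as ZF=1: the cmovnz condition is not met → no move
  • flag == False is read as ZF=0: the cmovnz condition is met → perform the move

So the code is:

if not flag:  # the move has to be performed
    src_val = getattr(local_state.regs, src_reg)
    setattr(local_state.regs, dst_reg, src_val)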
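
Because flag really means "skip the move", its value for a given ZF depends on the mnemonic. The helper below is a small sketch of that mapping; the names `cmov_performs_move` and `flag_for` are hypothetical, and only the four ZF-based mnemonics handled by the script are covered.

```python
def cmov_performs_move(mnemonic, zf):
    """Return True when a ZF-based cmov copies src into dst for the given ZF."""
    if mnemonic in ("cmovz", "cmove"):
        return zf == 1
    if mnemonic in ("cmovnz", "cmovne"):
        return zf == 0
    raise ValueError(f"unsupported mnemonic: {mnemonic}")


def flag_for(mnemonic, zf):
    """Value to pass as `flag` (True = skip the move) to simulate the given ZF."""
    return not cmov_performs_move(mnemonic, zf)


for m in ("cmovnz", "cmovz"):
    print(m, "ZF=1 ->", flag_for(m, 1), "| ZF=0 ->", flag_for(m, 0))
```

This matches how the script calls the function: for cmovnz the ZF=1 fork passes True, while for cmovz the ZF=1 fork passes False.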

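Continue single-stepping after the cmov until the next real block is hit

sm = proj.factory.simgr(local_state)
while len(sm.active):
    for active_state in sm.active:
        ins_offset = active_state.addr - base
        if ins_offset in real_blocks:
            path[real_block_addr].append(ins_offset)
            return ins_offset
    sm.step(num_inst=1)

In other words: stop as soon as the entry of the next real block is reached, and record it in path[current real block]

The hook installed at the end of the prologue redirects execution to the target block:

def jump_to_address(state):
    state.regs.pc = base + real_block_addr - 6
    proj.unhook(0x400675)

This avoids running through the whole prologue/dispatcher every time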

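Summary:

This code runs one small experiment in angr per real block, accumulates the results in path, and finally converts path into hex_dict.

For each real block B = real_block_addr, start from the same initial state init_state, run angr until block B is entered, then keep running to see which real block is entered next, and write that next real block into path[B].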

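Example:

The outer loop is currently processing:

real_block_addr = 0x401eb9
state = init_state.copy()
find_block_succ(..., real_block_addr=0x401eb9, path)

At this point:

path[0x401eb9] == []

Step 1: run from the function entry to the end of the prologue, then jump to the target block

Step 2: the outer loop of find_block_succ() finds the block entry

While msm.step(num_inst=1) keeps single-stepping, at some moment:

active_state.addr == 0x401eb9

This branch is hit:

if offset == real_block_addr:
    mstate = active_state.copy()
    msm2 = proj.factory.simgr(mstate)
    msm2.step(num_inst=1)

From here on msm2 keeps running inside the block, looking for successors.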

Step 3: keep executing inside the block until a cmov is reached (the key point)

At every step inside the block the current instruction is fetched:

ins = mactive_state.block().capstone.insns[0]

Suppose the real block at 0x401eb9 eventually reaches something like this (addresses and constants are illustrative):

0x401ed0:  mov ecx, 0xAAAAAAA1     ; default state
0x401ed5:  mov eax, 0xBBBBBBB2     ; the other state
0x401eda:  cmovnz ecx, eax         ; key: if the condition holds, ecx becomes eax
0x401ede:  mov [state], ecx
0x401ee4:  jmp dispatcher

Then, when msm2 reaches 0x401eda, this is triggered:

if ins.mnemonic == 'cmovnz':
    state_true = mactive_state.copy()
    succ_true = find_state_succ_cmovxx(... flag=True ...)
    state_false = mactive_state.copy()
    succ_false = find_state_succ_cmovxx(... flag=False ...)

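Step 4: first fork (flag=True), "do not perform the move"

Enter:

find_state_succ_cmovxx(... local_state=state_true, flag=True ...)

It does the following:

  1. Parse the cmov:
    • dst = "ecx"
    • src = "eax"
  2. Because flag=True, the semantics are: ZF=1 → cmovnz not satisfied → no move

So this statement is not executed:

if not flag:
    ecx = eax

ecx therefore keeps its original value 0xAAAAAAA1

  3. Step over the cmov instruction by hand: local_state.regs.ip += ins.size

    Otherwise the next sm.step() would make angr execute this cmov once more

  4. Keep single-stepping (sm.step) through the dispatcher and its dispatch logic until angr's RIP lands on the entry of some real block.

Suppose this path finally arrives at:

active_state.addr == 0x401ed6

Then this is hit:

if ins_offset in real_blocks:
    path[0x401eb9].append(0x401ed6)
    return 0x401ed6

At this point:

path[0x401eb9] == [0x401ed6]

The state value is not created by the cmov instruction itself; cmov merely picks one of two candidate values that were prepared beforehand.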

Step 5: second fork (flag=False), "perform the move"

In the same way, enter:

find_state_succ_cmovxx(... local_state=state_false, flag=False ...)

This time not flag is true, so:

src_val = state.regs.eax      # = 0xBBBBBBB2
state.regs.ecx = src_val      # ecx becomes 0xBBBBBBB2

After stepping over the cmov, execution continues and finally reaches the entry of another real block, for example:

active_state.addr == 0x401f09

So:

path[0x401eb9].append(0x401f09)
return 0x401f09

Now:

path[0x401eb9] == [0x401ed6, 0x401f09]

At the end, the outer angr_main() converts path into a hex dictionary:

hex_dict["0x401eb9"] = ["0x401ed6", "0x401f09"]

Finally, patch:

import ida_funcs
import idaapi
import idautils
import idc


# Overwrite every byte of the instruction at `ins` with 0x90
def patch_ins_to_nop(ins):
    size = idc.get_item_size(ins)
    for i in range(size):
        idc.patch_byte(ins + i, 0x90)


# Write a bytes object to the given address
def patch_bytes(addr, data):
    for i, b in enumerate(data):
        idc.patch_byte(addr + i, b)


def fill_nop(start_ea, end_ea):
    # the size is end - start; the other way round would be negative
    size = end_ea - start_ea
    if size > 0:
        # write the whole NOP run in one call
        patch_bytes(start_ea, b'\x90' * size)


# Find the basic block containing `ea` in the FlowChart of its function
def get_block_by_address(ea):
    func = idaapi.get_func(ea)
    blocks = idaapi.FlowChart(func)
    for block in blocks:
        if block.start_ea <= ea < block.end_ea:
            return block
    return None


def generate_jmp_code(src, dst):
    # E9 xx xx xx xx
    offset = dst - (src + 5)
    return b'\xE9' + offset.to_bytes(4, 'little', signed=True)


def generate_jz_code(src, dst):
    # 0F 84 xx xx xx xx
    offset = dst - (src + 6)
    return b'\x0F\x84' + offset.to_bytes(4, 'little', signed=True)


def patch_branch(patch_dict):
    for ea in patch_dict:
        values = patch_dict[ea]
        if len(values) == 0:  # no successors: almost always the return block, nothing to patch
            continue
        block = get_block_by_address(int(ea, 16))
        start_ea = block.start_ea
        end_ea = block.end_ea
        # block.end_ea is the address right after the last instruction, so step back one instruction
        last_ins_ea = idc.prev_head(end_ea)
        if len(values) == 2:
            for ins in idautils.Heads(start_ea, end_ea):
                if idc.print_insn_mnem(ins).startswith("cmov"):
                    print("find cmov")
                    jz_code = generate_jz_code(ins, int(values[0], 16))
                    jmp_code = generate_jmp_code(ins + len(jz_code), int(values[1], 16))

                    # write the new branch bytes into the database
                    patch_bytes(ins, jz_code)
                    patch_bytes(ins + len(jz_code), jmp_code)

                    # fill the rest of the block with NOPs
                    nop_start = ins + len(jz_code) + len(jmp_code)
                    fill_nop(nop_start, end_ea)
        if len(values) == 1:
            mnem = idc.print_insn_mnem(last_ins_ea)
            if mnem.startswith("jmp"):
                jmp_code = generate_jmp_code(last_ins_ea, int(values[0], 16))
                patch_bytes(last_ins_ea, jmp_code)
                nop_start = last_ins_ea + len(jmp_code)
                fill_nop(nop_start, end_ea)


def find_all_useless_block(func_ea, real_blocks):
    blocks = idaapi.FlowChart(idaapi.get_func(func_ea))
    local_real_blocks = real_blocks.copy()
    useless_blocks = []
    for block in blocks:
        start_ea = block.start_ea
        end_ea = block.end_ea
        if start_ea not in local_real_blocks:
            useless_blocks.append([start_ea, end_ea])

    print("All useless blocks:", [b for b in useless_blocks])
    return useless_blocks


def patch_useless_blocks(useless_blocks):
    for useless_block in useless_blocks:
        print(f"Nop-ing useless block from {hex(useless_block[0])} to {hex(useless_block[1])}")
        fill_nop(useless_block[0], useless_block[1])
    print("Useless blocks NOPed")

func_ea = 0x400620
all_real_blocks =[4195872, 4198689, 4198808, 4198878, 4198991, 4199006, 4199076, 4199158, 4199173, 4199206, 4199276, 4199346, 4199375, 4199407, 4199477, 4199553, 4199568, 4199598, 4199634, 4199656, 4199671, 4199686, 4199713, 4199783, 4199862, 4199877, 4199892, 4199917, 4199932, 4200002, 4200081, 4200096, 4200166, 4200249, 4200264, 4200293, 4200363, 4200444, 4200459, 4200496, 4200521, 4200545, 4200615, 4200698, 4200713, 4200742, 4200768, 4200800, 4200829, 4200859, 4200929, 4201021, 4201036, 4201075, 4201101, 4201171, 4201253, 4201268, 4201294, 4201309, 4201333, 4201403, 4201485, 4201500, 4201515, 4201542, 4201577, 4201647, 4201731, 4201746, 4201773, 4201797, 4201812, 4201882, 4201984, 4201999, 4202029, 4202099, 4202169, 4202198, 4202234, 4202249, 4202285, 4202300, 4202336, 4202391, 4202406, 4202421, 4202445, 4202469, 4202484, 4202508, 4202523, 4202547, 4202573, 4202610, 4202646, 4202675, 4202690, 4202324]
useless_blocks = find_all_useless_block(func_ea,all_real_blocks)
patch_branch({'0x400620': ['0x401121'], '0x401121': ['0x401198'], '0x401198': ['0x401f60', '0x4011de'], '0x4011de': ['0x401f60', '0x40124f'], '0x40124f': ['0x40125e'], '0x40125e': ['0x401f97', '0x4012a4'], '0x4012a4': ['0x401f97', '0x4012f6'], '0x4012f6': ['0x401305'], '0x401305': ['0x401326'], '0x401326': ['0x401fa6', '0x40136c'], '0x40136c': ['0x401fa6', '0x4013b2'], '0x4013b2': ['0x4015d4', '0x4013cf'], '0x4013cf': ['0x4013ef'], '0x4013ef': ['0x401fb5', '0x401435'], '0x401435': ['0x401fb5', '0x401481'], '0x401481': ['0x401490'], '0x401490': ['0x4014ae', '0x4014f7'], '0x4014ae': ['0x4014d2'], '0x4014d2': ['0x4014e8'], '0x4014e8': ['0x4015d4'], '0x4014f7': ['0x401506'], '0x401506': ['0x401521'], '0x401521': ['0x401fcd', '0x401567'], '0x401567': ['0x401fcd', '0x4015b6'], '0x4015b6': ['0x4015c5'], '0x4015c5': ['0x40125e'], '0x4015d4': ['0x4015ed'], '0x4015ed': ['0x4015fc'], '0x4015fc': ['0x401fe5', '0x401642'], '0x401642': ['0x401fe5', '0x401691'], '0x401691': ['0x4016a0'], '0x4016a0': ['0x401ff4', '0x4016e6'], '0x4016e6': ['0x401ff4', '0x401739'], '0x401739': ['0x401748'], '0x401748': ['0x401d54', '0x401765'], '0x401765': ['0x40200c', '0x4017ab'], '0x4017ab': ['0x40200c', '0x4017fc'], '0x4017fc': ['0x40180b'], '0x40180b': ['0x401830'], '0x401830': ['0x401849'], '0x401849': ['0x401861'], '0x401861': ['0x40201b', '0x4018a7'], '0x4018a7': ['0x40201b', '0x4018fa'], '0x4018fa': ['0x401909'], '0x401909': ['0x401c2b', '0x401926'], '0x401926': ['0x401940'], '0x401940': ['0x401960'], '0x401960': ['0x401a73', '0x40197d'], '0x40197d': ['0x40199b'], '0x40199b': ['0x402033', '0x4019e1'], '0x4019e1': ['0x402033', '0x401a3d'], '0x401a3d': ['0x401a4c'], '0x401a4c': ['0x401b4e'], '0x401a73': ['0x401a8d'], '0x401a8d': ['0x40204d', '0x401ad3'], '0x401ad3': ['0x40204d', '0x401b25'], '0x401b25': ['0x401b34'], '0x401b34': ['0x401b4e'], '0x401b4e': ['0x401b5d'], '0x401b5d': ['0x401b75'], '0x401b75': ['0x402072', '0x401bbb'], '0x401bbb': ['0x402072', '0x401c0d'], '0x401c0d': ['0x401c1c'], 
'0x401c1c': ['0x401849'], '0x401c2b': ['0x401c46'], '0x401c46': ['0x401c69'], '0x401c69': ['0x402096', '0x401caf'], '0x401caf': ['0x402096', '0x401d03'], '0x401d03': ['0x401d12'], '0x401d12': ['0x401d2d'], '0x401d2d': ['0x401d45'], '0x401d45': ['0x4015fc'], '0x401d54': ['0x4020b3', '0x401d9a'], '0x401d9a': ['0x4020b3', '0x401e00'], '0x401e00': ['0x401e0f'], '0x401e0f': ['0x401e2d'], '0x401e2d': ['0x4020c2', '0x401e73'], '0x401e73': ['0x4020c2', '0x401eb9'], '0x401eb9': ['0x401ed6', '0x401f09'], '0x401ed6': ['0x401efa'], '0x401efa': ['0x401f3c'], '0x401f09': ['0x401f2d'], '0x401f2d': ['0x401f3c'], '0x401f3c': ['0x401f54'], '0x401f60': ['0x4011de'], '0x401f97': ['0x4012a4'], '0x401fa6': ['0x40136c'], '0x401fb5': ['0x401435'], '0x401fcd': ['0x401567'], '0x401fe5': ['0x401642'], '0x401ff4': ['0x4016e6'], '0x40200c': ['0x4017ab'], '0x40201b': ['0x4018a7'], '0x402033': ['0x4019e1'], '0x40204d': ['0x401ad3'], '0x402072': ['0x401bbb'], '0x402096': ['0x401caf'], '0x4020b3': ['0x401d9a'], '0x4020c2': ['0x401e73'], '0x401f54': []})

patch_useless_blocks(useless_blocks)
ida_funcs.reanalyze_function(ida_funcs.get_func(func_ea))  # refresh the function's control-flow graph
print("Control-flow graph refreshed")

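generate_jmp_code(src, dst)

Generates a 5-byte near jump, E9 rel32, where rel32 = dst - (src + 5), i.e. the displacement is measured from the end of the instruction.

Returns b'\xE9' + rel32 (rel32 packed as 4 little-endian bytes).

generate_jz_code(src, dst)

Generates a 6-byte near JZ, 0F 84 rel32, where rel32 = dst - (src + 6).

Returns b'\x0F\x84' + rel32.

The idea: replace the cmov + dispatcher logic of the flattened function with fixed-length near jmp/jz instructions.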
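The two helpers are only described above, not shown. A minimal sketch of what they could look like, assuming plain x86-64 near-jump encodings and `struct` for the little-endian packing (the function bodies here are a reconstruction, not the original script):

```python
import struct

def generate_jmp_code(src: int, dst: int) -> bytes:
    """Encode `jmp dst` for an instruction placed at src (E9 rel32, 5 bytes)."""
    rel32 = dst - (src + 5)            # displacement from the end of the jmp
    return b"\xE9" + struct.pack("<i", rel32)

def generate_jz_code(src: int, dst: int) -> bytes:
    """Encode `jz dst` for an instruction placed at src (0F 84 rel32, 6 bytes)."""
    rel32 = dst - (src + 6)            # displacement from the end of the jz
    return b"\x0F\x84" + struct.pack("<i", rel32)

# Forward jump: 0x401010 - (0x401000 + 5) = 0x0B
print(generate_jmp_code(0x401000, 0x401010).hex())
# Backward jump: 0x400FF0 - (0x401000 + 6) = -22, packed as a signed value
print(generate_jz_code(0x401000, 0x400FF0).hex())
```

Packing with the signed format `"<i"` lets backward jumps (negative displacements) work without manual masking.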

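patch_dict is the successor table obtained with angr:

  • key: start address of a real block (hex string)
  • value:
    • length 1: unconditional successor [succ]
    • length 2: conditional successors [succ_when_zf1, succ_when_zf0]
    • empty: return block

If a block has two successors, find the cmov inside the block and rewrite it as jz + jmp: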

if len(values) == 2:
    for ins in Heads(start_ea, end_ea):
        if print_insn_mnem(ins).startswith("cmov"):
            # jz goes to the ZF=1 successor, the following jmp to the ZF=0 successor
            jz_code = generate_jz_code(ins, values[0])
            jmp_code = generate_jmp_code(ins + len(jz_code), values[1])
            patch_bytes(ins, jz_code)
            patch_bytes(ins + len(jz_code), jmp_code)
            fill_nop(...)  # pad the remaining space in the block with NOPs
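To see the byte layout this rewrite produces without opening IDA, here is a small standalone sketch. `build_cond_patch` and its `room` parameter (the number of bytes available from the cmov to the end of the block) are hypothetical names made up for this illustration; the addresses in the demo are taken from the successor table above, while the room of 16 bytes is invented.

```python
import struct

NOP = b"\x90"

def build_cond_patch(ea: int, room: int, succ_zf1: int, succ_zf0: int) -> bytes:
    """Return `jz succ_zf1; jmp succ_zf0` assembled at ea, NOP-padded to room bytes."""
    jz = b"\x0F\x84" + struct.pack("<i", succ_zf1 - (ea + 6))
    jmp_ea = ea + len(jz)                       # the jmp starts right after the jz
    jmp = b"\xE9" + struct.pack("<i", succ_zf0 - (jmp_ea + 5))
    code = jz + jmp
    if len(code) > room:
        raise ValueError(f"need {len(code)} bytes, only {room} available")
    return code + NOP * (room - len(code))      # pad so no stale bytes remain

patch = build_cond_patch(0x401198, 16, 0x401F60, 0x4011DE)
print(patch.hex())
```

The size check matters in practice: the pair takes 11 bytes, so a cmov that sits too close to the end of its block cannot be rewritten in place and needs a trampoline elsewhere.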

unicorn

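angr: to recover the CFG, every possible successor has to be enumerated, so the two outcomes of each cmov are forked by hand.

Unicorn: concrete execution, which follows only one real path. It runs the program once and records which real blocks were actually executed (plus the value of ZF at the end of each block), producing an execution trace, real_flow.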
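One consequence of tracing a single concrete run is that a block seen with only one ZF value may still be a real conditional whose other side was never taken. A small sketch of how one might check the coverage of a trace, assuming the `((block_start, block_end), zf)` entry format used for real_flow (the helper name and the toy trace are made up):

```python
from collections import defaultdict

def split_by_coverage(real_flow):
    """Group traced blocks by how many distinct ZF values were observed."""
    seen = defaultdict(set)
    for block, zf in real_flow:
        seen[block].add(zf)
    both = sorted(b for b, s in seen.items() if s == {0, 1})
    single = sorted(b for b, s in seen.items() if len(s) == 1)
    return both, single

# Toy trace: the first block is visited twice with different ZF values
toy_flow = [
    ((0x10, 0x1F), 1),
    ((0x20, 0x2F), 0),
    ((0x10, 0x1F), 0),
    ((0x30, 0x3F), 1),
]
both, single = split_by_coverage(toy_flow)
print("both outcomes seen:", both)
print("one outcome seen  :", single)
```

Blocks in the second list are either unconditional or under-covered; telling the two apart needs another run with different input or the angr successor table.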

Feed the outputs above into the scripts one after another and run them.

The original code fails with this error:

(angr) D:\Matriy\Desktop\VN\VN13>python rebuild.py
[*] Starting Emulation...
[MEM READ] 0x603054, size=4
[Error] Invalid memory read (UC_ERR_READ_UNMAPPED)

The current loader simply writes the raw bytes of the whole file to 0x400000:

uc.mem_map(0x400000, 1024 * 1024)   # 1 MB
uc.mem_write(0x400000, file_bytes)

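That is not how an ELF is loaded. In memory, the Program Headers map different segments to different virtual addresses. The 1 MB mapping above ends at 0x4FFFFF, so the read at 0x603054 lands in unmapped memory.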
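For comparison, a loader that respects the Program Headers would walk the PT_LOAD entries. A minimal sketch for little-endian ELF64 files using only `struct` (the function names are made up for this example, and the Unicorn calls are left as comments because they depend on the emulator setup):

```python
import struct

PT_LOAD = 1
PAGE = 0x1000

def pt_load_segments(data: bytes):
    """Return (vaddr, file_offset, filesz, memsz) for every PT_LOAD segment."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    segments = []
    for i in range(e_phnum):
        (p_type, _flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_offset, p_filesz, p_memsz))
    return segments

def page_span(vaddr: int, memsz: int):
    """Page-aligned (base, size) covering [vaddr, vaddr + memsz)."""
    base = vaddr & ~(PAGE - 1)
    end = (vaddr + memsz + PAGE - 1) & ~(PAGE - 1)
    return base, end - base

# Intended use with Unicorn (not executed here):
#   for vaddr, off, filesz, memsz in pt_load_segments(file_bytes):
#       uc.mem_map(*page_span(vaddr, memsz), UC_PROT_ALL)
#       uc.mem_write(vaddr, file_bytes[off:off + filesz])
# Segments whose page spans overlap would need to be merged before mapping.
```

memsz can be larger than filesz (that is where .bss lives), which is why the mapping size is derived from memsz while only filesz bytes are copied.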

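Map pages on demand in hook_mem_invalid

Let the emulator map a page automatically whenever it touches unmapped memory and keep running, instead of stopping right away: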

from unicorn import *

PAGE = 0x1000

def hook_mem_invalid(uc, access, address, size, value, user_data):
    page = address & ~(PAGE - 1)   # round down to the page base
    try:
        uc.mem_map(page, PAGE, UC_PROT_ALL)
        # Optional: no need to write zeros, a fresh mapping is already zero-filled
        print(f"[MAP] mapped page at 0x{page:x} for access 0x{address:x}")
        return True   # tell Unicorn the fault was handled; keep executing
    except UcError as e:
        print(f"[MEM] map failed at 0x{page:x}: {e}")
        return False

Key point: returning True tells Unicorn that you handled the fault, and it retries the access.
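The hook above maps only the page that contains `address`. An access that straddles a page boundary touches two pages, so a slightly more defensive variant could compute every page involved and map them all up front. A sketch of just the address arithmetic (`pages_for_access` is a hypothetical helper, not part of Unicorn):

```python
PAGE = 0x1000

def pages_for_access(address: int, size: int):
    """Base addresses of all pages touched by an access of `size` bytes."""
    first = address & ~(PAGE - 1)
    last = (address + max(size, 1) - 1) & ~(PAGE - 1)
    return list(range(first, last + PAGE, PAGE))

# The failing read from the log stays within one page
print([hex(p) for p in pages_for_access(0x603054, 4)])
# A 4-byte read two bytes before a boundary spans two pages
print([hex(p) for p in pages_for_access(0x603FFE, 4)])
```

Inside the hook one would then call `uc.mem_map(page, PAGE, UC_PROT_ALL)` for each returned page that is not mapped yet.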

from unicorn import *
from unicorn.x86_const import *
from capstone import *

# ============================================================
# 1. Global configuration (address layout & target function)
# ============================================================
BASE_ADDR = 0x400000
CODE_ADDR = BASE_ADDR
CODE_SIZE = 1024 * 1024
STACK_ADDR = 0x0
STACK_SIZE = 1024 * 1024

MAIN_ADDR = 0x400620
MAIN_END = 0x402150

# ============================================================
# 2. Basic-block information (from IDA static analysis)
# ============================================================
# List of real blocks produced by the step-one script
TBS = [
(4198684, 4198684), (4198689, 4198803), (4198808, 4198873), (4198878, 4198986), (4198991, 4199001), (4199006, 4199071), (4199076, 4199153), (4199158, 4199168), (4199173, 4199201), (4199206, 4199271), (4199276, 4199341), (4199346, 4199370), (4199375, 4199402), (4199407, 4199472), (4199477, 4199548), (4199553, 4199563), (4199568, 4199593), (4199598, 4199629), (4199634, 4199651), (4199656, 4199666), (4199671, 4199681), (4199686, 4199708), (4199713, 4199778), (4199783, 4199857), (4199862, 4199872), (4199877, 4199887), (4199892, 4199912), (4199917, 4199927), (4199932, 4199997), (4200002, 4200076), (4200081, 4200091), (4200096, 4200161), (4200166, 4200244), (4200249, 4200259), (4200264, 4200288), (4200293, 4200358), (4200363, 4200439), (4200444, 4200454), (4200459, 4200491), (4200496, 4200516), (4200521, 4200540), (4200545, 4200610), (4200615, 4200693), (4200698, 4200708), (4200713, 4200737), (4200742, 4200763), (4200768, 4200795), (4200800, 4200824), (4200829, 4200854), (4200859, 4200924), (4200929, 4201016), (4201021, 4201031), (4201036, 4201070), (4201075, 4201096), (4201101, 4201166), (4201171, 4201248), (4201253, 4201263), (4201268, 4201289), (4201294, 4201304), (4201309, 4201328), (4201333, 4201398), (4201403, 4201480), (4201485, 4201495), (4201500, 4201510), (4201515, 4201537), (4201542, 4201572), (4201577, 4201642), (4201647, 4201726), (4201731, 4201741), (4201746, 4201768), (4201773, 4201792), (4201797, 4201807), (4201812, 4201877), (4201882, 4201979), (4201984, 4201994), (4201999, 4202024), (4202029, 4202094), (4202099, 4202164), (4202169, 4202193), (4202198, 4202229), (4202234, 4202244), (4202249, 4202280), (4202285, 4202295), (4202300, 4202319), (4202336, 4202386), (4202391, 4202401), (4202406, 4202416), (4202421, 4202440), (4202445, 4202464), (4202469, 4202479), (4202484, 4202503), (4202508, 4202518), (4202523, 4202542), (4202547, 4202568), (4202573, 4202605), (4202610, 4202641), (4202646, 4202670), (4202675, 4202685), (4202690, 4202690)]

# Recorded result: [((tb_start, tb_end), zf_value), ...]
tb_trace = []

# ============================================================
# 3. Disassembler & emulator initialization
# ============================================================
cs = Cs(CS_ARCH_X86, CS_MODE_64)
uc = Uc(UC_ARCH_X86, UC_MODE_64)

# ============================================================
# 4. Hook: instruction-level hook (the core)
# ============================================================
def hook_code(uc, address, size, user_data):
    # 1. Patch up the emulated environment
    # Read the instruction and handle call and ret
    try:
        code = bytes(uc.mem_read(address, size))
    except UcError:
        return

    for insn in cs.disasm(code, address):
        # Skip calls: flattening usually stays inside the current function,
        # so there is no need to follow callees
        if insn.mnemonic == "call":
            # print(f"[Skip Call] 0x{address:x}")
            uc.reg_write(UC_X86_REG_RIP, address + size)
            return

        # On ret: the function is done, stop emulating
        if insn.mnemonic == "ret":
            print("[*] Function Return hit. Stopping...")
            uc.emu_stop()

            # Print the final trace for the next step
            print("\n" + "=" * 30)
            print("real_flow = [")
            for item in tb_trace:
                print(f"    {item},")
            print("]")
            print("=" * 30 + "\n")
            return

    # 2. Record the execution trace
    # Check whether the current address is the "end address" of a real block
    for tb_start, tb_end in TBS:
        if address == tb_end:
            # Record the ZF flag at this point (bit 6 of EFLAGS)
            eflags = uc.reg_read(UC_X86_REG_EFLAGS)
            zf = (eflags >> 6) & 1

            tb_trace.append(((tb_start, tb_end), zf))
            break

# ============================================================
# 5. Hooks: invalid memory access / interrupts (for debugging)
# ============================================================
PAGE = 0x1000

def hook_mem_invalid(uc, access, address, size, value, user_data):
    page = address & ~(PAGE - 1)   # round down to the page base
    try:
        uc.mem_map(page, PAGE, UC_PROT_ALL)
        # Optional: no need to write zeros, a fresh mapping is already zero-filled
        print(f"[MAP] mapped page at 0x{page:x} for access 0x{address:x}")
        return True   # tell Unicorn the fault was handled; keep executing
    except UcError as e:
        print(f"[MEM] map failed at 0x{page:x}: {e}")
        return False


def hook_intr(uc, intno, user_data):
    print(f"[INT] interrupt {intno}")
    return False

# ============================================================
# 6. Unicorn initialization
# ============================================================
def init_unicorn(uc, code_data):
    # Map memory
    uc.mem_map(CODE_ADDR, CODE_SIZE, UC_PROT_ALL)
    uc.mem_map(STACK_ADDR, STACK_SIZE, UC_PROT_ALL)

    # Write the code
    uc.mem_write(CODE_ADDR, code_data)

    # Set up the stack (RSP in the middle of the stack region)
    uc.reg_write(UC_X86_REG_RSP, STACK_ADDR + STACK_SIZE // 2)

    # Install the instruction hook
    uc.hook_add(UC_HOOK_CODE, hook_code)

    # Accesses to unmapped memory
    uc.hook_add(UC_HOOK_MEM_UNMAPPED, hook_mem_invalid)

    # Interrupts (int 0x80 / syscall / ud2, etc.)
    uc.hook_add(UC_HOOK_INTR, hook_intr)

# ============================================================
# 7. Main flow
# ============================================================
if __name__ == "__main__":
    # Read the binary file
    with open(r"./attachment", "rb") as f:
        CODE_DATA = f.read()

    init_unicorn(uc, CODE_DATA)

    print("[*] Starting Emulation...")
    try:
        uc.emu_start(MAIN_ADDR, MAIN_END)
    except UcError as e:
        print(f"[Error] {e}")

ida_patch

import idaapi
import ida_bytes
import ida_ua
import ida_kernwin
from collections import defaultdict, deque

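# ============================================================
# Input data
# ============================================================
# fake_blocks:
#   The list of fake blocks obtained in step one from the CFG / dispatcher analysis.
#   Each entry has the form (start_ea, end_ea).
#   In the final logic these blocks only serve as trampolines or get NOPed out.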
fake_blocks = [(4195903, 4195929), (4195935, 4195935), (4195940, 4195957), (4195963, 4195963), (4195968, 4195985), (4195991, 4195991), (4195996, 4196013), (4196019, 4196019), (4196024, 4196041), (4196047, 4196047), (4196052, 4196069), (4196075, 4196075), (4196080, 4196097), (4196103, 4196103), (4196108, 4196125), (4196131, 4196131), (4196136, 4196153), (4196159, 4196159), (4196164, 4196181), (4196187, 4196187), (4196192, 4196209), (4196215, 4196215), (4196220, 4196237), (4196243, 4196243), (4196248, 4196265), (4196271, 4196271), (4196276, 4196293), (4196299, 4196299), (4196304, 4196321), (4196327, 4196327), (4196332, 4196349), (4196355, 4196355), (4196360, 4196377), (4196383, 4196383), (4196388, 4196405), (4196411, 4196411), (4196416, 4196433), (4196439, 4196439), (4196444, 4196461), (4196467, 4196467), (4196472, 4196489), (4196495, 4196495), (4196500, 4196517), (4196523, 4196523), (4196528, 4196545), (4196551, 4196551), (4196556, 4196573), (4196579, 4196579), (4196584, 4196601), (4196607, 4196607), (4196612, 4196629), (4196635, 4196635), (4196640, 4196657), (4196663, 4196663), (4196668, 4196685), (4196691, 4196691), (4196696, 4196713), (4196719, 4196719), (4196724, 4196741), (4196747, 4196747), (4196752, 4196769), (4196775, 4196775), (4196780, 4196797), (4196803, 4196803), (4196808, 4196825), (4196831, 4196831), (4196836, 4196853), (4196859, 4196859), (4196864, 4196881), (4196887, 4196887), (4196892, 4196909), (4196915, 4196915), (4196920, 4196937), (4196943, 4196943), (4196948, 4196965), (4196971, 4196971), (4196976, 4196993), (4196999, 4196999), (4197004, 4197021), (4197027, 4197027), (4197032, 4197049), (4197055, 4197055), (4197060, 4197077), (4197083, 4197083), (4197088, 4197105), (4197111, 4197111), (4197116, 4197133), (4197139, 4197139), (4197144, 4197161), (4197167, 4197167), (4197172, 4197189), (4197195, 4197195), (4197200, 4197217), (4197223, 4197223), (4197228, 4197245), (4197251, 4197251), (4197256, 4197273), (4197279, 4197279), (4197284, 4197301), 
(4197307, 4197307), (4197312, 4197329), (4197335, 4197335), (4197340, 4197357), (4197363, 4197363), (4197368, 4197385), (4197391, 4197391), (4197396, 4197413), (4197419, 4197419), (4197424, 4197441), (4197447, 4197447), (4197452, 4197469), (4197475, 4197475), (4197480, 4197497), (4197503, 4197503), (4197508, 4197525), (4197531, 4197531), (4197536, 4197553), (4197559, 4197559), (4197564, 4197581), (4197587, 4197587), (4197592, 4197609), (4197615, 4197615), (4197620, 4197637), (4197643, 4197643), (4197648, 4197665), (4197671, 4197671), (4197676, 4197693), (4197699, 4197699), (4197704, 4197721), (4197727, 4197727), (4197732, 4197749), (4197755, 4197755), (4197760, 4197777), (4197783, 4197783), (4197788, 4197805), (4197811, 4197811), (4197816, 4197833), (4197839, 4197839), (4197844, 4197861), (4197867, 4197867), (4197872, 4197889), (4197895, 4197895), (4197900, 4197917), (4197923, 4197923), (4197928, 4197945), (4197951, 4197951), (4197956, 4197973), (4197979, 4197979), (4197984, 4198001), (4198007, 4198007), (4198012, 4198029), (4198035, 4198035), (4198040, 4198057), (4198063, 4198063), (4198068, 4198085), (4198091, 4198091), (4198096, 4198113), (4198119, 4198119), (4198124, 4198141), (4198147, 4198147), (4198152, 4198169), (4198175, 4198175), (4198180, 4198197), (4198203, 4198203), (4198208, 4198225), (4198231, 4198231), (4198236, 4198253), (4198259, 4198259), (4198264, 4198281), (4198287, 4198287), (4198292, 4198309), (4198315, 4198315), (4198320, 4198337), (4198343, 4198343), (4198348, 4198365), (4198371, 4198371), (4198376, 4198393), (4198399, 4198399), (4198404, 4198421), (4198427, 4198427), (4198432, 4198449), (4198455, 4198455), (4198460, 4198477), (4198483, 4198483), (4198488, 4198505), (4198511, 4198511), (4198516, 4198533), (4198539, 4198539), (4198544, 4198561), (4198567, 4198567), (4198572, 4198589), (4198595, 4198595), (4198600, 4198617), (4198623, 4198623), (4198628, 4198645), (4198651, 4198651), (4198656, 4198673), (4198679, 4198679), (4202700, 4202700)]

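# real_flow:
#   The real execution path obtained in step two through dynamic execution /
#   symbolic execution / manual tracing.
#   Each entry has the form:
#       ((block_start, block_end), zf)
#   meaning:
#       when execution reaches this real block, ZF has the value zf
real_flow = [
    ((4198689, 4198803), 1),
    ((4198808, 4198873), 0),
    ....
]

# Start and end addresses of the function prologue block.
# Used to fix up the entry of main so it jumps straight to the first real block.
PROLOGUE_STAR = 0x401E80
PROLOGUE_END = 0x401E8B

# The final return block (e.g. the block holding the epilogue / leave; ret)
RETURN_BLOCK = 0x402690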

# ============================================================
# Processing logic
# ============================================================

# ------------------------------------------------------------
# 1. Build the real control-flow mapping
# ------------------------------------------------------------

# block_next_map:
#   Structure:
#       block_next_map[block][zf] = {next_block1, next_block2, ...}
#
#   Meaning:
#       when execution reaches block with ZF == zf,
#       which real blocks may come next
block_next_map = defaultdict(lambda: defaultdict(set))

# block_zf_map:
#   block_zf_map[block] = {0, 1}
#
#   Meaning:
#       which ZF values were observed for this real block during execution
block_zf_map = defaultdict(set)

# Build both mappings from real_flow
for i in range(len(real_flow) - 1):
    cur_block, zf = real_flow[i]        # current real block and its ZF
    next_block, _ = real_flow[i + 1]    # next real block (its ZF is irrelevant here)

    block_zf_map[cur_block].add(zf)
    block_next_map[cur_block][zf].add(next_block)

# ------------------------------------------------------------
# 2. Prepare the pool of fake blocks
# ------------------------------------------------------------

# A deque is used to:
#   - hand out fake blocks in order
#   - avoid reusing a block
fake_queue = deque(fake_blocks)

# Records which fake blocks were used as trampolines
used_fake = set()


def alloc_fake_block(min_size=10):
    """
    Allocate an available fake block from fake_blocks.

    Requirements:
      - not used yet
      - large enough (must hold at least jz + jmp, about 11 bytes)

    Returns:
      (start_ea, end_ea)
    """
    while fake_queue:
        fb = fake_queue.popleft()
        if (fb[1] - fb[0]) >= min_size:
            used_fake.add(fb)
            return fb
    raise Exception("No more fake blocks available!")


# ------------------------------------------------------------
# General utility functions
# ------------------------------------------------------------

def nop_range(start, end):
    """
    Fill the whole range [start, end] with NOPs.
    Used to:
      - wipe the original flattening junk code
      - keep leftover logic from being executed by mistake
    """
    ea = start
    while ea <= end:
        ida_bytes.patch_byte(ea, 0x90)
        ea += 1


def get_last_insn_ea(block_start, block_end):
    """
    Search backwards inside a block for its last "valid instruction".

    Purpose:
      In flattened code the end of a block is usually the jump to the
      dispatcher; we need to locate exactly that instruction to patch it.
    """
    ea = ida_bytes.prev_head(block_end + 1, block_start)
    while ea != idaapi.BADADDR and ea >= block_start:
        if ida_bytes.is_code(ida_bytes.get_full_flags(ea)):
            return ea
        ea = ida_bytes.prev_head(ea, block_start)
    return idaapi.BADADDR


def patch_jmp(frm, to):
    """
    Force-patch the address frm into:
        jmp to

    Used to:
      - replace the original jump to the dispatcher
      - replace the original jcc / indirect jump
    """
    rel = (to - (frm + 5)) & 0xFFFFFFFF    # rel32, masked so backward jumps stay 32-bit
    ida_bytes.del_items(frm, ida_bytes.DELIT_SIMPLE, 5)
    ida_bytes.patch_byte(frm, 0xE9)
    ida_bytes.patch_dword(frm + 1, rel)
    ida_ua.create_insn(frm)                # decode again once the new bytes are in place


def emit_jz_jmp(ea, true_target, false_target):
    """
    Build the following logic inside a fake block:

        jz  true_target
        jmp false_target

    Used to:
      - restore real if / while / for conditional branches
      - ZF == 1 → true_target
      - ZF == 0 → false_target
    """

    # jz true_target
    rel = (true_target - (ea + 6)) & 0xFFFFFFFF
    ida_bytes.del_items(ea, ida_bytes.DELIT_SIMPLE, 6)
    ida_bytes.patch_byte(ea, 0x0F)
    ida_bytes.patch_byte(ea + 1, 0x84)
    ida_bytes.patch_dword(ea + 2, rel)
    ida_ua.create_insn(ea)
    ea += 6

    # jmp false_target
    rel = (false_target - (ea + 5)) & 0xFFFFFFFF
    ida_bytes.del_items(ea, ida_bytes.DELIT_SIMPLE, 5)
    ida_bytes.patch_byte(ea, 0xE9)
    ida_bytes.patch_dword(ea + 1, rel)
    ida_ua.create_insn(ea)
    ea += 5

    return ea


print("[*] Starting Patching...")

# ------------------------------------------------------------
# 3. Fix the function prologue block
# ------------------------------------------------------------
# The prologue of main should no longer enter the dispatcher;
# jump straight to the first real block instead
first_real_block = real_flow[0][0][0]
patch_jmp(
    get_last_insn_ea(PROLOGUE_STAR, PROLOGUE_END),
    first_real_block
)

# ------------------------------------------------------------
# 4. Fix every real block
# ------------------------------------------------------------
for block, zf_set in block_zf_map.items():
    start, end = block
    last_insn = get_last_insn_ea(start, end)
    branches = block_next_map[block]

    # --------------------------------------------------------
    # Case A: only one ZF value was seen for this real block
    #   → effectively a degenerate condition or a straight jump
    # --------------------------------------------------------
    if len(zf_set) == 1:
        zf = list(zf_set)[0]
        target = list(branches[zf])[0][0]
        patch_jmp(last_insn, target)

    # --------------------------------------------------------
    # Case B: both ZF=0 and ZF=1 were seen for this real block
    #   → a genuine conditional branch
    # --------------------------------------------------------
    else:
        # Allocate a fake block as the conditional trampoline
        fb_start, fb_end = alloc_fake_block()

        # The real block jumps unconditionally to the fake block
        patch_jmp(last_insn, fb_start)

        # Determine the real targets for ZF=1 / ZF=0
        true_target = list(branches[1])[0][0]
        false_target = list(branches[0])[0][0]

        # Clear the fake block
        nop_range(fb_start, fb_end)

        # Write:
        #   if (ZF) goto true_target
        #   else    goto false_target
        emit_jz_jmp(fb_start, true_target, false_target)

# ------------------------------------------------------------
# 5. Fix the last real block → return block
# ------------------------------------------------------------
last_true_block_start = real_flow[-1][0][0]
last_true_block_end = real_flow[-1][0][1]
patch_jmp(
    get_last_insn_ea(last_true_block_start, last_true_block_end),
    RETURN_BLOCK
)

# ------------------------------------------------------------
# 6. Clean up all unused fake blocks
# ------------------------------------------------------------
# Keeps leftover flattening junk logic out of the result
for fb in fake_blocks:
    if fb not in used_fake:
        nop_range(fb[0], fb[1])

print("[+] Patching Done! Press F5 to decompile.")

What remains is a simple algorithm, so I won't go through it in detail.

flag = [
    0xBC8FF26D43536296, 0x520100780530EE16,
    0x4DC0B5EA935F08EC, 0x342B90AFD853F450,
    0x8B250EBCAA2C3681, 0x55759F81A2C68AE4
]

for i in range(6):
    enc = ""
    x = flag[i]
    # Undo 64 rounds of the shift-and-xor transform, one bit per round
    for k in range(64):
        if x & 1:
            x = (x ^ 0xB0004B7679FA26B3) >> 1
            x = x | 0x8000000000000000
        else:
            x = x >> 1

    flag[i] = x
    # Emit the recovered 64-bit value as 8 little-endian bytes
    for _ in range(8):
        enc += chr(x & 0xff)
        x = x >> 8
    print(enc, end="")

flag{6ff29390-6c20-4c56-ba70-a95758e3d1f8}
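For readers who want to see why the solver works: each round of the loop above is the exact inverse of a left-shift-and-conditional-xor step. The forward direction below is inferred by inverting the solver, not copied from the binary, so treat `forward` as an assumption about what the program does. The sketch just checks that the two directions round-trip on an 8-byte block.

```python
MASK = 0xFFFFFFFFFFFFFFFF
MSB = 0x8000000000000000
POLY = 0xB0004B7679FA26B3   # constant taken from the solver above

def forward(x: int) -> int:
    """Assumed encryption: 64 rounds of shift-left, xor POLY if the top bit fell out."""
    for _ in range(64):
        if x & MSB:
            x = ((x << 1) & MASK) ^ POLY
        else:
            x = (x << 1) & MASK
    return x

def inverse(x: int) -> int:
    """Same rounds as the solver: a set low bit means the xor happened, so undo it."""
    for _ in range(64):
        if x & 1:
            x = ((x ^ POLY) >> 1) | MSB
        else:
            x >>= 1
    return x

block = int.from_bytes(b"flag{6ff", "little")
assert inverse(forward(block)) == block
print("round trip ok")
```

The low bit works as the discriminator because POLY is odd: after a left shift the low bit is 0, so it ends up 1 exactly when the xor was applied.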