Skip to content

Commit 87c582e

Browse files
authored
Merge pull request #2036 from HackTricks-wiki/research_update_src_reversing_reversing-tools-basic-methods_satisfiability-modulo-theories-smt-z3_20260319_212048
Research Update Enhanced src/reversing/reversing-tools-basic...
2 parents 8ef64d8 + 98513a2 commit 87c582e

1 file changed

Lines changed: 132 additions & 0 deletions

File tree

src/reversing/reversing-tools-basic-methods/satisfiability-modulo-theories-smt-z3.md

Lines changed: 132 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -104,6 +104,31 @@ solve(x < 0)
104104
solve(ULT(x, 0))
105105
```
106106

107+
108+
### Bit-vector helpers commonly needed in reversing
109+
110+
When you are **lifting checks from assembly or decompiler output**, it is usually better to model every input byte as a `BitVec(..., 8)` and then rebuild words exactly like the target code does. This avoids bugs caused by mixing mathematical integers with machine arithmetic.
111+
112+
```python
113+
from z3 import *
114+
115+
b0, b1, b2, b3 = BitVecs('b0 b1 b2 b3', 8)
116+
eax = Concat(b3, b2, b1, b0) # little-endian bytes -> 32-bit register value
117+
low_byte = Extract(7, 0, eax) # AL
118+
high_word = Extract(31, 16, eax) # upper 16 bits
119+
signed_b0 = SignExt(24, b0) # movsx eax, byte ptr [...]
120+
unsigned_b0 = ZeroExt(24, b0) # movzx eax, byte ptr [...]
121+
rot = RotateLeft(eax, 13) # rol eax, 13
122+
logical = LShR(eax, 3) # shr eax, 3
123+
arith = eax >> 3 # sar eax, 3 (signed shift)
124+
```
125+
126+
Some common pitfalls while translating code into constraints:
127+
128+
- `>>` is an **arithmetic** right shift for bit-vectors. Use `LShR` for the logical `shr` instruction.
129+
- Use `UDiv`, `URem`, `ULT`, `ULE`, `UGT` and `UGE` when the original comparison/division was **unsigned**.
130+
- Keep widths explicit. If the binary truncates to 8 or 16 bits, add `Extract` or rebuild the value with `Concat` instead of silently promoting everything to Python integers.
131+
107132
### Functions
108133

109134
**Interpreted functio**ns such as arithmetic where the **function +** has a **fixed standard interpretation** (it adds two numbers). **Uninterpreted functions** and constants are **maximally flexible**; they allow **any interpretation** that is **consistent** with the **constraints** over the function or constant.
@@ -183,9 +208,116 @@ else:
183208
print "failed to solve"
184209
```
185210

211+
212+
### Reversing workflows
213+
214+
If you need to **symbolically execute the binary and collect constraints automatically**, check the angr notes here:
215+
216+
{{#ref}}
217+
angr/README.md
218+
{{#endref}}
219+
220+
If you are already looking at the decompiled checks and only need to solve them, raw Z3 is usually faster and easier to control.
221+
222+
#### Lifting byte-based checks from a crackme
223+
224+
A very common pattern in crackmes and packed loaders is a long list of byte equations over a candidate password. Model bytes as 8-bit vectors, constrain the alphabet, and only widen them when the original code widens them.
225+
226+
<details>
227+
<summary>Example: rebuild a serial check from decompiled arithmetic</summary>
228+
229+
```python
230+
from z3 import *
231+
232+
flag = [BitVec(f'flag_{i}', 8) for i in range(8)]
233+
s = Solver()
234+
235+
for c in flag:
236+
s.add(c >= 0x20, c <= 0x7e)
237+
238+
w0 = Concat(flag[3], flag[2], flag[1], flag[0])
239+
w1 = Concat(flag[7], flag[6], flag[5], flag[4])
240+
241+
s.add((ZeroExt(24, flag[0]) + ZeroExt(24, flag[5])) == 0x90)
242+
s.add((flag[1] ^ flag[2] ^ flag[3]) == 0x5a)
243+
s.add(RotateLeft(w0, 7) ^ w1 == BitVecVal(0x4f625a13, 32))
244+
s.add(ULE(flag[6], flag[7]))
245+
s.add(LShR(w1, 5) == BitVecVal(0x03a1f21, 32))
246+
247+
if s.check() == sat:
248+
m = s.model()
249+
print(bytes(m[c].as_long() for c in flag))
250+
```
251+
252+
</details>
253+
254+
This style maps well to real-world reversing because it matches what modern writeups do in practice: recover the arithmetic/bitwise relations, turn each comparison into a constraint, and solve the whole system at once.
255+
256+
#### Incremental solving with `push()` / `pop()`
257+
258+
While reversing, you often want to test several hypotheses without rebuilding the whole solver. `push()` creates a checkpoint and `pop()` discards the constraints added after that checkpoint. This is useful when you are not sure whether a branch is signed or unsigned, whether a register is zero-extended or sign-extended, or when you are trying several candidate constants extracted from disassembly.
259+
260+
```python
261+
from z3 import *
262+
263+
x = BitVec('x', 32)
264+
s = Solver()
265+
s.add((x & 0xff) == 0x41)
266+
267+
s.push()
268+
s.add(UGT(x, 0x1000))
269+
print(s.check())
270+
s.pop()
271+
272+
s.push()
273+
s.add(x == 0x41)
274+
print(s.check())
275+
print(s.model())
276+
s.pop()
277+
```
278+
279+
#### Enumerating more than one valid input
280+
281+
Some keygens, license checks, and CTF challenges intentionally admit **many** valid inputs. Z3 does not enumerate them automatically, but you can add a **blocking clause** after every model to force the next result to differ in at least one position.
282+
283+
```python
284+
from z3 import *
285+
286+
xs = [BitVec(f'x{i}', 8) for i in range(4)]
287+
s = Solver()
288+
for x in xs:
289+
s.add(And(x >= 0x30, x <= 0x39))
290+
s.add(xs[0] + xs[1] == xs[2] + 1)
291+
s.add(xs[3] == xs[0] ^ 3)
292+
293+
while s.check() == sat:
294+
m = s.model()
295+
print(''.join(chr(m[x].as_long()) for x in xs))
296+
s.add(Or([x != m.eval(x, model_completion=True) for x in xs]))
297+
```
298+
299+
#### Tactics for ugly bit-vector formulas
300+
301+
Z3's default solver is usually enough, but decompiler-generated formulas with lots of equalities and bit-vector rewrites often become easier after a first normalization pass. In those cases it can be useful to build a solver from tactics:
302+
303+
```python
304+
from z3 import *
305+
306+
t = Then('simplify', 'solve-eqs', 'bit-blast', 'sat')
307+
s = t.solver()
308+
```
309+
310+
This is specially helpful when the problem is almost entirely **bit-vector + Boolean logic** and you want Z3 to simplify and eliminate obvious equalities before handing the formula to the SAT backend.
311+
312+
#### CRCs and other custom checkers
313+
314+
Recent reversing challenges still use Z3 for constraints that are annoying to brute-force but straightforward to model, such as CRC32 checks over ASCII-only input, mixed rotate/xor/add pipelines, or many chained arithmetic predicates extracted from a JITed/obfuscated checker. For CRC-like problems, keep the state as bit-vectors and apply per-byte ASCII constraints early to shrink the search space.
315+
186316
## References
187317

188318
- [https://ericpony.github.io/z3py-tutorial/guide-examples.htm](https://ericpony.github.io/z3py-tutorial/guide-examples.htm)
319+
- [https://microsoft.github.io/z3guide/docs/theories/Bitvectors/](https://microsoft.github.io/z3guide/docs/theories/Bitvectors/)
320+
- [https://theory.stanford.edu/~nikolaj/programmingz3.html](https://theory.stanford.edu/~nikolaj/programmingz3.html)
189321

190322
{{#include ../../banners/hacktricks-training.md}}
191323

0 commit comments

Comments
 (0)