x86 instruction generator

Here's something amusing. I spent the first half of the day writing a short Haskell program that generates x86 instructions in MASM syntax. The program generates all variants of the non-privileged instructions listed in the opcodes.chm file of the MASM32 package. This means the instruction generator is far from complete: FPU, MMX, SSE and other newer-than-486 instructions are not covered. Nevertheless, the generator already produces nearly 150,000 different x86 instructions.

When assembled with MASM32, the resulting file is more than 600 KB in size. Trying to disassemble it with a few standard disassemblers turns out to be a problem. IDA fails to disassemble an instruction roughly 5% of the way into the executable and never manages to recover afterwards; lots of manual help is necessary to convince IDA to go on. OllyDbg manages to disassemble that instruction but leaves huge gaps at many other points of the disassembly. I'd say the resulting file is an interesting test file for x86 disassemblers.

The Haskell program is only about 300 lines long. 280 of those lines are the definitions of the instructions and the operands they can take. Generating the instructions from those definitions takes just 20 lines, and only 8 of them are strictly necessary. I love Haskell's expressiveness.
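
To give a rough idea of how a generator like this can be structured, here is a minimal sketch. It is not the actual program from the package: the names `OperandKind`, `InstrDef`, `operands`, `variants` and `generate`, the tiny operand tables, and the three example definitions are all assumptions made up for illustration. It only shows the general split between a table of definitions and a short piece of code that expands them.

```haskell
import Data.List (intercalate)

-- Operand classes a slot may accept (hypothetical and heavily reduced;
-- memory operands, segment registers etc. are left out).
data OperandKind = Reg8 | Reg16 | Reg32 | Imm8
  deriving (Eq, Show)

-- A mnemonic plus every operand-kind combination it accepts.
data InstrDef = InstrDef
  { mnemonic :: String
  , forms    :: [[OperandKind]]
  }

-- Concrete MASM operand strings for each operand class.
operands :: OperandKind -> [String]
operands Reg8  = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
operands Reg16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
operands Reg32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
operands Imm8  = ["0", "7Fh"]

-- Illustrative definitions only; a real table would be far longer.
definitions :: [InstrDef]
definitions =
  [ InstrDef "nop" [[]]
  , InstrDef "inc" [[Reg8], [Reg16], [Reg32]]
  , InstrDef "add" [[Reg8, Reg8], [Reg8, Imm8], [Reg32, Reg32]]
  ]

-- Format one instruction line in MASM syntax.
render :: String -> [String] -> String
render name []  = name
render name ops = name ++ " " ++ intercalate ", " ops

-- All concrete variants of one definition. In the list monad,
-- mapM yields the Cartesian product of the per-slot operand lists.
variants :: InstrDef -> [String]
variants (InstrDef name fs) =
  [ render name ops | form <- fs, ops <- mapM operands form ]

-- Expand the whole definition table.
generate :: [InstrDef] -> [String]
generate = concatMap variants
```

Loaded into GHCi, `mapM_ putStrLn (generate definitions)` prints one instruction per line. Almost all of the work sits in the definition table, while the expansion itself is a single list comprehension.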

Anyway, click here to see the Haskell source or click here to download the whole package including the Haskell program (source + EXE), the generated output of the Haskell program, a MASM32 source file that can be used to assemble the test file, and the test file EXE itself.

Comments

bw on :

An interesting sample :). I guess the best results I got were from HIEW.

Rolf Rolles on :

If you disable "make final analysis pass" in IDA's kernel options #1, "perform no-return analysis" in kernel options #2, and "disassemble zero opcode instructions" in processor options, IDA produces a proper listing, terminated only by jmp and ret/retn instructions. Whether the disassembly is correct is a different story; I haven't checked and wouldn't make any bets one way or the other.

Interesting work -- I note that IDA disassembles bound and imul (and a few other) instructions very slowly.

igorsk on :

If you have any issues with IDA disassembly, please send test files :-) I already made some improvements for next version and will check this file as well.

James on :

If you tell OllyDbg not to "analyse", it seems not to have any gaps, at a glance.

I think the analysis stops it displaying the ones with seemingly broken memory constants.

sp on :

Thank you for pointing this out. I need this file to be disassembled properly at some point in the next four weeks and I was ready to complain to Ilfak. :-)

sp on :

Good to know. I wonder whether this can be used by programs to screw with OllyDbg. Someone else should try that though, it's not what I am trying to do with the file.

sp on :

Yeah, that was the plan. I would have complained in time when I needed a proper disassembly of this file. :-)
