Friday, December 12. 2008
Chaos Communication Congress 2008
I am going to attend the Chaos Communication Congress 2008 in two weeks. People who want to meet me there to talk about BinNavi, Hexer, static code analysis, reverse engineering in general, or why the movie Hackers is more realistic than most people think, please contact me using one of the options you can find on the navigation bar on the right side of this page.
Saturday, November 15. 2008
x86 instruction generator
Here's something amusing. I spent the first half of the day writing a short Haskell program which generates x86 instructions in MASM syntax. The program generates all variants of the non-privileged instructions from the opcodes.chm file of the MASM32 package. This means that the instruction generator is far from complete. FPU, MMX, SSE and other newer-than-486 instructions are not covered. Nevertheless the generator already generates nearly 150,000 different x86 instructions. When assembled with MASM32 the resulting file is more than 600 KB in size.
Trying to disassemble this thing with a few standard disassemblers turns out to be a problem. IDA fails to disassemble an instruction after maybe 5% of the executable and never manages to recover afterwards. Lots of manual help is necessary to convince IDA to go on. OllyDBG manages to disassemble that instruction but has huge gaps at many, many other points of the disassembly. The created file is an interesting test file for x86 disassemblers, I'd say.
The Haskell program is just about 300 lines long. 280 of those lines are the definitions of the instructions and what operands they can take. The generation of the instructions from the instruction definitions is just 20 lines, and all but 8 of those lines are not even strictly necessary. I love Haskell's expressiveness.
Anyway, click here to see the Haskell source or click here to download the whole package including the Haskell program (source + EXE), the generated output of the Haskell program, a MASM32 source file that can be used to assemble the test file, and the test file EXE itself.
Wednesday, October 15. 2008
hack.lu 2008
Next week I'm going to attend this year's hack.lu conference in Luxembourg.
So if anybody else who attends hack.lu wants to talk about the new version of BinNavi, Hexer, static code analysis, reverse engineering in general, or some other topic, please contact me (see the right side bar for contact options).
Saturday, July 12. 2008
Some Win32 API usage statistics
Yesterday I saw a talk given by Frank Boldewin where he mentioned the FreeIconList trick to fool code emulators. At this point I started to wonder what other Win32 API functions are basically unused. Using Ero Carrera's Python library pefile to parse PE files I wrote a small Python script that tries to find out which Win32 API functions are basically unused.
The modus operandi was simple. I read the exported functions of all DLL files in WindowsDir and WindowsDir/system32 and compared them to the functions imported by all EXE/DLL files in WindowsDir, WindowsDir/system32 and my entire Program Files directory.
The first result is that most exported functions are apparently never used. My script managed to find 127569 exported functions in 1225 DLL files. 104608 of those are never used by the 6615 EXE/DLL files which import functions ("used" is liberally defined as "imported through the import directory" here, of course). That leaves 22961 functions which are actually used.
Here are some output files which show the exported DLL functions sorted by their usage. The numeric column contains the number of PE files which import the function statically. That means that 3475 of the 6615 files I tested import GetLastError, for example.
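The counting step described above can be sketched roughly as follows. This is not the script offered for download, just a minimal illustration that assumes the export and import names have already been pulled out of the PE files (for instance with pefile) into plain dictionaries. The function names `count_import_usage` and `unused_exports`, the dictionary layout, and the sample data are all made up for this example.

```python
from collections import Counter


def count_import_usage(exports, imports):
    """Count, for every exported function, how many files import it.

    exports: dict mapping a DLL name to the set of function names it exports.
    imports: dict mapping a file path to a set of (dll_name, function_name)
             pairs taken from that file's import directory.
    Both layouts are assumptions made for this sketch.
    """
    usage = Counter()
    for imported in imports.values():
        # Normalize the DLL name's case and deduplicate, so that each file
        # counts at most once per imported function.
        normalized = {(dll.lower(), func) for dll, func in imported}
        usage.update(normalized)

    result = {}
    for dll, funcs in exports.items():
        for func in funcs:
            key = (dll.lower(), func)
            result[key] = usage.get(key, 0)
    return result


def unused_exports(usage):
    """Return the sorted list of exported functions that no file imports."""
    return sorted(key for key, count in usage.items() if count == 0)


# Invented sample data, only to show the shape of the input.
exports = {
    "kernel32.dll": {"GetLastError", "Sleep"},
    "shell32.dll": {"FreeIconList"},
}
imports = {
    "a.exe": {("KERNEL32.dll", "GetLastError")},
    "b.dll": {("kernel32.dll", "GetLastError"), ("kernel32.dll", "Sleep")},
}

usage = count_import_usage(exports, imports)
for (dll, func), count in sorted(usage.items(), key=lambda item: -item[1]):
    print(count, dll, func)
print("unused:", unused_exports(usage))
```

As in the post, "used" here only means "listed in an import directory". Functions resolved at run time through GetProcAddress would still show up as unused.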
Random notes
Click here to download the Python script.
Tuesday, June 10. 2008
RECON 2008
Unless something goes spectacularly wrong I'll be in Montreal from next Thursday to Sunday to attend RECON 2008. If you wanna do one or more of the following things, make sure to meet me at some point:
I'll be the guy with the yellow Sabre Security bag. Alternatively you can shoot me an email or ask around.
Thursday, December 13. 2007
Pair Reverse Engineering
Two days ago I had the pleasure of participating in an informal reverse engineering session with three other guys. Between dinner and way too long after midnight we debugged a popular piece of malware that is floating around the internet right now. The first guy had already reverse-engineered an earlier version of the malware. He was the guy in charge who did most of the debugging. The second guy was the author of a program that monitors and logs the behaviour of processes, especially malware processes. The goal of the session was to find out why the malware sample worked perfectly in VMWare (after we patched out the VMWare check, at least) but crashed as soon as the second guy's monitoring tool was active. The third guy was very familiar with the malware too but on a higher level (behaviour, network activity, how it spreads, its historic development and usage, ...). I was the fourth guy. Without a direct interest in the malware or the malware monitoring tool I just wanted to see what goes wrong. Furthermore I was the guy for snarky comments from the background like "see, I told you to take a VMWare snapshot before stepping over that call".
Anyway, so much for the introduction. This was not the first time I debugged binaries with someone else, but in the past I always had the keyboard. This time I stayed in the background and observed what happened. Primarily a software developer and only a hobbyist reverse engineer, I compared what I saw to pair programming where two people sit in front of the same computer and write code together. While I believe that pair programming is at least moderately useful, I got the impression that there are serious problems with pair reverse engineering (or quad reverse engineering).
Continue reading "Pair Reverse Engineering"
Thursday, November 1. 2007
A brief analysis of 40,000 leaked MySpace passwords
Over the last few days some group released passwords to nearly 45000 MySpace accounts and announced that they will release another 30000 passwords in the next few days. I spent a few hours before Saturday's lunch writing a small program that analyzes the passwords that were released so far.
At worst the results of this are a useless time-filler, at best it's a case study of what happens if a website forces its users to choose passwords with certain minimum requirements. MySpace demands that every password contains at least one non-alphabetical character (like 0, 1, 2, or !, ?, @). How the users adhered to this requirement can be seen in the tables below.
It is my understanding that the 43713 passwords that were leaked so far come from phishing sites that tricked people into entering their passwords. This makes the passwords less reliable than a password list hacked straight from the MySpace servers. People could have misspelled their MySpace passwords or they could have entered fake information after they noticed that someone was trying to steal their password. A quick analysis has shown that probably less than 1% of the leaked passwords suffer from these problems.
Continue reading "A brief analysis of 40,000 leaked MySpace passwords"
Wednesday, October 10. 2007
I finished college, yay
If you have ever wondered about things like "gee, why does that sp guy never update his site anymore" I have some good news. I was busy being a grad student. At least until 27 minutes ago when I finished the presentation of my thesis and answered the last pesky question I was asked about it by a professor. Being a grad student turned out to be way more demanding than being an undergrad. Less time to slack off led to fewer site updates. A shocking concept, I know.
Anyway, unless some kind of nightmare happens like
Administration Guy: Hey sp, what was your second elective?
I will soon receive a nifty little Master of Science (in Computer Science) diploma. It will be my second most important formal proof of qualification, topped only by my beloved Windows 95 Power User certificate I received from Brainbench like a decade ago (it looks like this but I don't have a pic of my own one here).
Well, that gave me an excuse to write a site update. Time to slack off a bit now.
Between moving out of my student apartment, doing a lot of administrative stuff at college, visiting a bunch of people (potentially) for the last time, and finding a job, I probably won't make another site update in the next 2-3 weeks. But after I'm done with all of that, my goal is to write more site updates again.
Monday, January 22. 2007
Data-mining Wikipedia II
Here are some more details about the program I used to create some graphs from Wikipedia two days ago. The C# source code is now available. The program takes five command-line parameters.
The key to cool graphs is to choose a keyword that has lots of articles which nevertheless belong closely together. An example of a bad keyword is "Mathematics". There are thousands of math-related articles in Wikipedia but they don't belong closely together because math is a huge and fragmented field. The resulting graphs of keywords like math degenerate into trees or unconnected subgraphs.
Generating a graph takes approximately 5 minutes on my computer. In most cases nearly all the time is spent on parsing the 8 GB XML file. Generating the actual graph is nearly always a matter of seconds. Only for keywords like Germany or America, which have tens of thousands of relevant articles, does generating the graph take a few more minutes.
Saturday, January 20. 2007
Data-mining Wikipedia
Recently I finished Ian Shaw's book The Oxford History of Ancient Egypt. It was pretty interesting but it introduces more names per page than your average Tolkien book. I really could have used a chronologically ordered graph that shows names of important people and places and how they are connected. Creating a graph like that manually is obviously too much work so I tried to use the power of Wikipedia to create a graph automatically. My original plan didn't quite work out. Apparently the problem of data-mining from texts written in a natural language like English can't be solved in a few hours on a lazy Saturday afternoon. At least not by me. Nevertheless I managed to get some interesting partial results.
Continue reading "Data-mining Wikipedia"
Friday, January 19. 2007
Pythia 1.1
Here's a new version of my Delphi obfuscator Pythia. The main reason for the update was a necessary bugfix. Version 1.0 incorrectly obfuscated some files with forms that contained complex controls (that is controls like Panels which consist of several subcontrols). I've added two additional command-line options too. The first option prints information about the input file (Click to see an example), the second option prints information about the obfuscation process (Click to see an example).
Here's a short example of what Pythia does. Click here to see a screenshot of the information that can be gathered about a typical Delphi file. Click here to see the same file after obfuscation. Curiously enough the obfuscated file is actually a Borland C++ Builder file. Pythia can handle BCB++ files too because the underlying VCL model is the same in BCB++ and in Delphi.
Click here to download Pythia 1.1
Wednesday, December 13. 2006
It's the little surprises ...
It's the little surprises that make programming so interesting. In chapter 4 (Pragmatic Paranoia) of The Pragmatic Programmer the authors Andrew Hunt and David Thomas teach the reader that assuming things about code is a bad idea (one example they give is about minutes which have more or less than 60 seconds). You should double and triple check every assumption you make and then let the program check your assumption again with an assert statement.
Yesterday one of these assumptions cost me about 90 minutes. What do you think is the result of the expression 0xFFFFFFFF12345678 & 0xFFFFFFFF where & is the bitwise AND operator?
Continue reading "It's the little surprises ..."
Tuesday, December 12. 2006
Some thoughts on freshmen programming classes
My college career is slowly coming to an end. If everything goes according to plan I am going to graduate with a Master of Science degree in Computer Science at the end of the next semester. During my time at college I worked as a TA for five classes and I will probably work as a TA for a sixth class next semester. Four of the five classes were Intro to C++ classes taught to 4th semester students. The other class was an Intro to Programming class taught to freshmen.
During the last few years I was in the role of the student and in the role of the teacher in programming classes. In that time I developed opinions about the curriculum and pretty much everything that went on in class. I already promised some of my teachers that I am going to write down everything once I hold my degree in my hands. Every complaint and every suggestion will be neatly categorized in a visually appealing LaTeX document. That's the plan at least. One of the suggestions I'm planning to write down concerns the way introductory programming classes are taught and especially the programming language used in class.
The style of introductory programming classes and the programming languages taught in college are certainly popular topics to complain about on the Internet. Joel Spolsky decries JavaSchools where graduates know neither pointers nor recursion. Dan Zambonini touches on the topic as part of his idea of a decent CS curriculum (the article started a Slashdot discussion with 654 comments too). On Digg someone raised a comparable question and received 154 answers. And once in a while you can hear surprising news like MIT getting rid of Scheme and replacing it with Python in undergraduate classes.
Continue reading "Some thoughts on freshmen programming classes"
Monday, November 20. 2006
Converting strings to upper case is tricky
While I wrote my last blog entry I remembered another curiosity about localization which I like to mention once in a while in various exciting places on the Internet. Converting strings between upper case and lower case is difficult. It is in fact so difficult that I don't think software will ever get it perfectly right.
Here is the example I like to give. The German word for street is Straße. The ß character (called Eszett or sharp S) is a curious character. It only exists in lower case form. There is no corresponding upper case character for it. So how do we capitalize the word Straße? The grammatically correct capitalization of the word Straße is STRASSE. The lower case ß character is mapped to two upper case S characters.
Continue reading "Converting strings to upper case is tricky"
Saturday, November 18. 2006
A plea for unique error codes
A few days ago Elliotte Rusty Harold blogged about The Downside of Localization. He gave the example of the Xerces XML parser and how error messages are now available in various languages. Mr Harold correctly recognized that this could introduce a new problem: Developers working with localized error messages are less likely to find a solution for their problem on Google.
"Paste an English exception message into Google, and you’ll probably find 10 people who have already had and solved your problem. But the same message in Greek or Croatian? Maybe not."
I wrote a comment on his blog mentioning that the problem could easily be solved if only error messages contained a unique error code. That error code could then be used in Google queries. In a reply to my comment someone pointed out that ideas like that are rarely talked about.
Now I'm not an expert on localization; I've localized exactly one commercial application in my life (a small web application based on JSP/Servlets I wrote during a summer internship three years ago).
Nevertheless I like to read blogs and websites about localization efforts and therefore I know that proper use of Unicode characters is shockingly difficult, that not all cultures read and write from left to right and you therefore better watch out how you design your GUI, and that not all countries agree on the exact position of national borders and people might boycott your product if the localized version of your software includes the wrong map. I think this qualifies me to write something about the topic on my site.
Continue reading "A plea for unique error codes"