Sunday, November 27. 2005
It's quine time
Quines: In computing, a quine is a program (a form of metaprogram) that produces its complete source code as its only output. For amusement, hackers sometimes attempt to develop the shortest possible quine in any given programming language.
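The quines in this post rely on one trick: the program text is kept in a format string, and that string is printed with itself as the %s argument while the quote character is passed as its numeric code 34. A minimal sketch in C of that substitution step, assuming an ASCII execution character set; the helper name reproduces_source is made up for this illustration, and the function is a checker, not a quine itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not from the post: expands template t the way
   `printf t,34,t,34` does and compares the result with the program text.
   Returns 1 if the expansion equals source, 0 otherwise. */
int reproduces_source(const char *source, const char *t)
{
    char out[512];
    int n;

    assert(source != NULL && t != NULL);
    /* 34 fills both %c slots with '"' (ASCII); t itself fills the %s slot. */
    n = snprintf(out, sizeof out, t, 34, t, 34);
    if (n < 0 || (size_t)n >= sizeof out)
        return 0;   /* formatting failed or the output was truncated */
    return strcmp(out, source) == 0;
}
```

Called with the Ruby program text as source and the contents of its string t as the template, the check succeeds; replacing the double quotes in the source with single quotes makes it fail, because the expansion always emits character 34.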
Recently I've had the pleasure of writing two of them. Here's my Ruby quine. I swear it's a total coincidence that it nearly looks like one I found on this website.
t="t=%c%s%c;printf t,34,t,34";printf t,34,t,34
And I quickly ported it to Java too. Thanks to the new powers of Java 1.5 this one's shorter than any Java quine you can find on this website.
class Q { public static void main(String[] a) { String t = "class Q { public static void main(String[] a) { String t = %c%s%c; System.out.printf(t, 34, t, 34); } }"; System.out.printf(t, 34, t, 34); } }
Both of these quines only work on systems that use ASCII because I (ab-)use the fact that 34 is the ASCII code for the character ".
Thursday, November 17. 2005
Two new F4I license infringements found
Third update for today! I swear I'm not making this stuff up but we've found two additional potential license infringements.
Rolf from Sabre Security was kind enough to point out that we had missed a giant copyright string.
000C48C0 4641 4143 202D 2046 7265 6577 6172 6520  FAAC - Freeware 
000C48D0 4164 7661 6E63 6564 2041 7564 696F 2043  Advanced Audio C
000C48E0 6F64 6572 2028 6874 7470 3A2F 2F77 7777  oder (http://www
000C48F0 2E61 7564 696F 636F 6469 6E67 2E63 6F6D  .audiocoding.com
000C4900 2F29 0A20 436F 7079 7269 6768 7420 2843  /). Copyright (C
000C4910 2920 3139 3939 2C32 3030 302C 3230 3031  ) 1999,2000,2001
000C4920 2020 4D65 6E6E 6F20 4261 6B6B 6572 0A20    Menno Bakker. 
000C4930 436F 7079 7269 6768 7420 2843 2920 3230  Copyright (C) 20
000C4940 3032 2C32 3030 3320 204B 727A 7973 7A74  02,2003  Krzyszt
000C4950 6F66 204E 696B 6965 6C0A 5468 6973 2073  of Nikiel.This s
000C4960 6F66 7477 6172 6520 6973 2062 6173 6564  oftware is based
000C4970 206F 6E20 7468 6520 4953 4F20 4D50 4547   on the ISO MPEG
000C4980 2D34 2072 6566 6572 656E 6365 2073 6F75  -4 reference sou
000C4990 7263 6520 636F 6465 2E0A 0000 312E 3234  rce code....1.24
Yeah. Apparently FAAC code was used too. I positively identified several functions myself. For starters: The function at virtual offset 0x1007BA80 is known as WriteFAACStr in the file bitstream.c of the FAAC project. You can work your way through other FAAC functions from there. I don't know for sure if that's GPL or LGPL. I think it's LGPL though.
And while we're at it: Matti found mpg123 references. In his opinion this is how the mpglib code made it into the OCX. It still needs to be determined if there's more mpg123 code in the OCX besides the mpglib stuff. If that's the case another GPL infringement can be added to the list.
Thursday, November 17. 2005
Proof that F4I violates the GPL
Due to the importance of the latest discoveries, here's another update. For the first time I'm updating twice in one day. I'm sure you've already been waiting for some proof of the GPL infringement by F4I. This post contains it in the already well-known form of a comparison between the original C code and an annotated disassembly of the F4I binary. All C code is from the function DoShuffle in the file drms.c, which is part of the VideoLAN project.
I want to mention though that I'm not going to explain all of the code because the function is pretty long. I've picked two parts of the function where it's easily recognizable that the two functions are basically the same (there's one tiny difference, explained later). Nevertheless I'm going to provide the other parts of the function too, I just won't comment on them. Here's the full disassembly.
Continue reading "Proof that F4I violates the GPL"
Wednesday, November 16. 2005
Is F4I in violation of the LGPL? - Part III
"Frank" posted the following comment to my last update.
"What does this code fragment do? It accesses data that is well-defined by an open specification. There is little leeway for a software developer to do things differently. So far, this may be coincidence, or a case of developers being "inspired" by looking at other source code -- like the famous "stolen" Unix code fragments in Linux. This is an invitation for more research, yes, but I don't see a smoking gun just yet." This is a valid concern and I wanted to address this anyway, so let's do it right here. I think it's important enough to write a new update instead of just replying to Frank. I've produced a complete annotated disassembly of the function in question. It matches the function from the LAME code 99%. I say 99% because there are differences which can be reasonably explained by common compiler optimization techniques. I've mentioned these techniques in the annotated disassembly where appropriate. If the 99%/100% match of a 90 lines C function is a coincidence it goes beyond what I'm capable to detect using my tools. Click here for the LAME source code. Click here for the annotated disassembly. Edit: I want to ask the people familiar with the LAME function in question something. Look at the following four lines from the LAME source code: if( buf[0] != VBRTag[0] && buf[0] != VBRTag2[0] ) return 0; / fail / if( buf[1] != VBRTag[1] && buf[1] != VBRTag2[1]) return 0; /* header not found*/ if( buf[2] != VBRTag[2] && buf[2] != VBRTag2[2]) return 0; if( buf[3] != VBRTag[3] && buf[3] != VBRTag2[3]) return 0; This piece of code attempts to check if buf contains 'Xing' or 'Info'. I don't know about the underlying data structures but the way it checks looks wrong to me. This piece of code passes if buf is a combination of 'Xing' and 'Info', like 'Iing' or 'Xnfo' which is probably not the desired functionality. Can anybody confirm that this is a bug or is not a bug? Because if it's a bug it's also part of the F4I code. 
A bug shared by both would solidify the assumption that the code was stolen from LAME even more.
Edit: OK guys, I'm going to proclaim victory now as I've found undeniable proof that this match is not a coincidence. I just took the functions GetVbrTag and ExtractI4 from the LAME code and compiled them myself using the freely available Visual C++ 2003 command-line tools. The only compiler parameter I used is /Ox to turn on maximum optimizations. The resulting code is byte for byte the code from the F4I OCX file, including all my correctly predicted compiler optimizations (function inlining, if-clause merging, operation re-ordering, ...). Click here to see the disassembly of my own compiled version of the LAME code. It matches the disassembly posted earlier perfectly. Note that I didn't bother to rename variables or to insert comments.
Tuesday, November 15. 2005
Is Sony in violation of the LGPL? - Part II
Hey,
I'm sure you've been waiting for updates that prove what we're talking about. Here it comes. I want to talk about the file ECDPlayerControl.ocx which the fantastic muzzy found yesterday while I had nothing better to do than to listen to my pillow. It uses LAME code (and code from at least one other LGPL library). At virtual offset 0x100607D0 you can find a function that's called GetVbrTag in the LAME source code (it can be found in the file VbrTag.c). Here's some code straight from the LAME source code (it's only the first part of the function, I don't want this post to get too long):
    int GetVbrTag(VBRTAGDATA *pTagData, unsigned char *buf)
    {
        int i, head_flags;
        int h_bitrate,h_id, h_mode, h_sr_index;
        int enc_delay,enc_padding;

        /* get Vbr header data */
        pTagData->flags = 0;

        /* get selected MPEG header data */
        h_id = (buf[1] >> 3) & 1;
        h_sr_index = (buf[2] >> 2) & 3;
        h_mode = (buf[3] >> 6) & 3;
        h_bitrate = ((buf[2]>>4)&0xf);
        h_bitrate = bitrate_table[h_id][h_bitrate];

        /* check for FFE syncword */
        if ((buf[1]>>4)==0xE)
            pTagData->samprate = samplerate_table[2][h_sr_index];
        else
            pTagData->samprate = samplerate_table[h_id][h_sr_index];
        ...
Now compare this to the disassembly. You can easily spot "pTagData->flags = 0", the three shifts, the array access and the if-comparison (although the code was a bit optimized by the compiler). To make it easier, here's the flow-chart diagram of the function too.
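As a reference point for reading the listing, the shifts and masks from the C excerpt can be tried on their own. This is a minimal sketch in C, not LAME code: the struct and function names are made up for this illustration, and the row width of 16 for bitrate_table is inferred from the "shl ebp, 4" in the listing rather than taken from the LAME headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical holder for the values; field names follow the excerpt. */
struct header_fields {
    int h_id;
    int h_sr_index;
    int h_mode;
    int h_bitrate_index;   /* index into bitrate_table, before the lookup */
};

/* Same shifts and masks as in the quoted part of GetVbrTag. */
struct header_fields read_header_fields(const unsigned char *buf)
{
    struct header_fields f;

    assert(buf != NULL);
    f.h_id            = (buf[1] >> 3) & 1;
    f.h_sr_index      = (buf[2] >> 2) & 3;
    f.h_mode          = (buf[3] >> 6) & 3;
    f.h_bitrate_index = (buf[2] >> 4) & 0xf;
    return f;
}

/* Syncword test as written in the C source. */
int is_ffe_sync_shift(unsigned char b1) { return (b1 >> 4) == 0xE; }

/* Syncword test as it appears in the listing:
   and dl, 0F0h followed by cmp dl, 0E0h. */
int is_ffe_sync_mask(unsigned char b1) { return (b1 & 0xF0) == 0xE0; }

/* Flat element index the listing computes for
   bitrate_table[h_id][h_bitrate]: shl ebp, 4 then add eax, ebp,
   assuming 16 entries per row. */
int flat_bitrate_index(int h_id, int h_bitrate_index)
{
    return (h_id << 4) + h_bitrate_index;
}
```

For the four header bytes FF FB 90 64, read_header_fields yields h_id 1, h_sr_index 0, h_mode 1 and a bitrate index of 9. The two syncword tests agree for every possible byte value, which is why the compiler is free to emit the mask-and-compare form.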
.text:100607D0 GetVbrTag proc near ; CODE XREF: sub_10059240+77p
.text:100607D0
.text:100607D0 arg_0 = dword ptr 4
.text:100607D0 arg_4 = dword ptr 8
.text:100607D0
.text:100607D0 mov ecx, [esp+arg_4]
.text:100607D4 push ebx
.text:100607D5 push ebp
.text:100607D6 push esi
.text:100607D7 xor eax, eax
.text:100607D9 push edi
.text:100607DA mov edi, [esp+10h+arg_0]
.text:100607DE mov dword ptr [edi+8], 0
.text:100607E5 mov dl, [ecx+1]
.text:100607E8 movzx ebx, byte ptr [ecx+3]
.text:100607EC mov al, dl
.text:100607EE and dl, 0F0h
.text:100607F1 shr ebx, 6
.text:100607F4 shr eax, 3
.text:100607F7 and eax, 1
.text:100607FA mov ebp, eax
.text:100607FC movzx eax, byte ptr [ecx+2]
.text:10060800 mov esi, eax
.text:10060802 mov [esp+10h+arg_0], ebp
.text:10060806 shr eax, 4
.text:10060809 shl ebp, 4
.text:1006080C add eax, ebp
.text:1006080E mov eax, ds:bitrate_table[eax*4]
.text:10060815 mov ebp, [esp+10h+arg_0]
.text:10060819 shr esi, 2
.text:1006081C and esi, 3
.text:1006081F cmp dl, 0E0h
.text:10060822 mov [esp+10h+arg_4], eax
.text:10060826 jnz short loc_10060831
.text:10060828 mov edx, ds:samplerate_table2[esi*4]
.text:1006082F jmp short loc_1006083B
I think you agree with me that this is a clear case. I also want to mention that the entire id3lib library (also LGPL software) is in the file too. Thankfully id3lib is written in C++ and not in C and therefore finding matches is significantly faster as the original function names are part of the binary files (thanks for the debug build too). Just click this link to see some of the id3lib functions in the file.
I want to summarize what we have and raise a few questions at this point:
- The LGPL is not mentioned on the CD.
- That means there is no copyright notice either, although the LGPL demands one.
- Does an OCX qualify as a linked library? Probably. But I am not able to re-create the OCX file because it contains at least two LGPL libraries and additional (probably proprietary) code. Is this necessary to be LGPL-compliant? (see the end of the article).
- Are the files part of this LGPL-licensed software by Sony? Does that have any effect on the legality of the OCX? The first two points would still stand.
All this legalese is killing me, I can only report on the code. I think we've reached a certain point where it's time to take a break. We've definitely found LGPL code in the software. Now it's time for the license gurus to find out if that constitutes a license violation. Until this is cleared up I think I'm going to do something else.
Edit: There are differences in opinion about what constitutes an LGPL infringement. Wikipedia says "Essentially, it must be possible for the software to be linked with a newer version of the LGPL-covered program. The most commonly used method for doing so is to use "a suitable shared library mechanism for linking". Alternatively, static linking is allowed if either source code or linkable object files are provided." Does that mean everybody must be able to recreate the OCX file? Or what does that mean?
Monday, November 14. 2005
Is Sony in violation of the LGPL?
Update: Click here
I'm sure you've already heard about the Sony rootkit that was first revealed by Mark Russinovich of Sysinternals. After the Finnish hacker Matti Nikki (aka muzzy) found some revealing strings in one of the files (go.exe) that are part of the copy protection software, the rootkit is also suspected to be in violation of the open-source license LGPL. The strings indicate that code from the open-source project LAME was used in the copy protection software in a way that's not compatible with the LGPL license which is used by LAME.
On Slashdot muzzy mentioned that he doesn't have access to Sabre BinDiff, a tool that can be used to compare binary files. I was in the opposite position as I have BinDiff but I didn't have the file in question (go.exe). I mailed muzzy and he hooked me up with the file. I compared go.exe with a VC++-compiled version of lame_enc.dll but unfortunately BinDiff didn't find a single relevant matched function. A quick manual check didn't reveal any LAME functions in go.exe either.
Even though go.exe apparently does not contain any LAME code, a considerable number of tables and constants from the LAME source files can be found in the go.exe file. Here's a list of the LAME tables I've been able to locate. The first column shows the hex address where the table can be found in the go.exe file, the second column shows the name of the table as it appears in the LAME source code and the third column shows the LAME source file where the table can be found. I have to add though, that not a single table actually seems to be used by the go.exe code.
What does that mean? I've asked random people and I've heard speculation ranging between "accidentally linked" and "encrypted code in go.exe that uses the tables and can't be found in the disassembler". Further analysis needs to be made but at this point I'm leaning towards more or less accidental inclusion.
Friday, November 11. 2005
Generating a word list from Wikipedia
Holy shit, a site update! And after only 6 weeks too! Great.
This update is mainly a small program that shows how to parse huge XML files (about 3.5 GB) with C#. Recently I needed a giant word list and all word lists I found on the internet were very unsatisfactory. Therefore I decided to make my own, and the best source for words right now is probably Wikipedia (which you can thankfully download in XML format). No, I didn't need that word list for a dictionary attack on some unsuspecting victim. Let's just pretend I was inspired by this flash movie and I wanted to find out what the highest scoring Scrabble words are. Unfortunately that Scrabble program is on hold right now because I realized that I've never actually played Scrabble (except for like 10 games with a Shareware game while I developed my program) and there were some discrepancies between Scrabble score lists available online and the results I calculated. Now I don't know if I'm wrong or if they are wrong as I'm not really familiar with Scrabble rules at this point.
Continue reading "Generating a word list from Wikipedia"