Proof that F4I violates the GPL
Due to the importance of the latest discoveries, here's another update. For the first time I'm updating twice in one day. I'm sure you've already been waiting for proof of the GPL infringement by F4I. This post contains it in the already well-known form of a comparison between the original C code and an annotated disassembly of the F4I binary. All C code is from the function DoShuffle from the file drms.c which is part of the VideoLAN project.
I want to mention though that I'm not going to explain all code because the function is pretty long. I've picked two parts of the function where it's easily recognizable that the two functions are basically the same (there's one tiny difference explained later). Nevertheless I'm going to provide the other parts of the function too, I just won't comment them. Here's the full disassembly.
First: I'm sure all the code is going to break the HTML formatting. I'm not going to fix it though.
Let's start right at the beginning of the function where we can find the code that decrypts the ROT13-encrypted Apple copyright message.
Here's the C code.
    static uint32_t i_secret = 0;

    static uint32_t p_secret1[] =
    {
        0xAAAAAAAA,
        0x01757700, 0x00554580, 0x01724500, 0x00424580, 0x01427700, 0x00000080,
        0xC1D59D01, 0x80144981, 0x815C8901, 0x80544981, 0x81D45D01, 0x00000080,
        0x81A3BB03, 0x00A2AA82, 0x01A3BB03, 0x0022A282, 0x813BA202, 0x00000080,
        0x6D575737, 0x4A5275A5, 0x6D525725, 0x4A5254A5, 0x6B725437, 0x00000080,
        0xD5DDB938, 0x5455A092, 0x5D95A013, 0x4415A192, 0xC5DD393A, 0x00000080,
        0x55555555
    };

    static char p_secret2[] =
        "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy Evtugf Erfreirq.";

    if( i_secret == 0 )
    {
        REVERSE( p_secret1, sizeof(p_secret1)/sizeof(p_secret1[ 0 ]) );
        for( ; p_secret2[ i_secret ] != '\0'; i_secret++ )
        {
    #define ROT13(c) (((c)>='A'&&(c)<='Z')?(((c)-'A'+13)%26)+'A':\
                      ((c)>='a'&&(c)<='z')?(((c)-'a'+13)%26)+'a':c)
            p_secret2[ i_secret ] = ROT13(p_secret2[ i_secret ]);
        }
        i_secret++; /* include zero terminator */
    }

And here's the disassembly.
    .text:10089E00 sub_10089E00 proc near ; CODE XREF: sub_1008A0C0+159p
    .text:10089E00 ; sub_1008A0C0+1F1p ...
    .text:10089E00
    .text:10089E00 var_A8 = dword ptr -0A8h
    .text:10089E00 var_98 = dword ptr -98h
    .text:10089E00 var_94 = dword ptr -94h
    .text:10089E00 var_90 = dword ptr -90h
    .text:10089E00 var_8C = dword ptr -8Ch
    .text:10089E00 var_88 = dword ptr -88h
    .text:10089E00 var_84 = dword ptr -84h
    .text:10089E00 var_80 = dword ptr -80h
    .text:10089E00 var_50 = dword ptr -50h
    .text:10089E00 var_4C = dword ptr -4Ch
    .text:10089E00 var_3E = dword ptr -3Eh
    .text:10089E00 var_3A = dword ptr -3Ah
    .text:10089E00 p_shuffle = dword ptr 4
    .text:10089E00 p_buffer = dword ptr 8
    .text:10089E00 i_size = dword ptr 0Ch
    .text:10089E00
    .text:10089E00 sub esp, 98h
    .text:10089E06 push ebx
    .text:10089E07 push ebp
    .text:10089E08 mov ebp, [esp+0A0h+p_shuffle] ; uint32_t *p_bordel = p_shuffle->p_bordel;
    .text:10089E0F push esi
    .text:10089E10 push edi
    .text:10089E11 mov edi, i_secret
    .text:10089E17 test edi, edi
    .text:10089E19 lea esi, [ebp+54h]
    .text:10089E1C jnz short loc_10089E83 ; if (i_secret == 0)
    .text:10089E1E mov al, byte ptr aPbclevtugPNccyrPbzchgreVa ; "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy"...
    .text:10089E23 test al, al ; p_secret2[ i_secret ] != '\0'
    .text:10089E25 jz short loc_10089E7C
    .text:10089E27
    .text:10089E27 The REVERSE is missing because it's a pre-processor macro
    .text:10089E27 that is defined to nothing on platforms where the bytes don't
    .text:10089E27 need to be reversed.
    .text:10089E27
    .text:10089E27 mov ecx, offset aPbclevtugPNccyrPbzchgreVa ; "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy"...
    .text:10089E2C lea esp, [esp+0]
    .text:10089E30
    .text:10089E30 First line of the ROT13 code.
    .text:10089E30
    .text:10089E30 loc_10089E30: ; CODE XREF: sub_10089E00+7Aj
    .text:10089E30 cmp al, 'A'
    .text:10089E32 jl short loc_10089E4B
    .text:10089E34 cmp al, 'Z'
    .text:10089E36 jg short loc_10089E4B
    .text:10089E38 movsx eax, al
    .text:10089E3B sub eax, '4'
    .text:10089E3E cdq
    .text:10089E3F mov ebx, 26
    .text:10089E44 idiv ebx
    .text:10089E46 add edx, 'A'
    .text:10089E49 jmp short loc_10089E69
    .text:10089E4B ; ---------------------------------------------------------------------------
    .text:10089E4B
    .text:10089E4B Second line of the ROT13 code.
    .text:10089E4B
    .text:10089E4B loc_10089E4B: ; CODE XREF: sub_10089E00+32j
    .text:10089E4B ; sub_10089E00+36j
    .text:10089E4B cmp al, 'a'
    .text:10089E4D jl short loc_10089E66
    .text:10089E4F cmp al, 'z'
    .text:10089E51 jg short loc_10089E66
    .text:10089E53 movsx eax, al
    .text:10089E56 sub eax, 'T'
    .text:10089E59 cdq
    .text:10089E5A mov ebx, 26
    .text:10089E5F idiv ebx
    .text:10089E61 add edx, 'a'
    .text:10089E64 jmp short loc_10089E69
    .text:10089E66 ; ---------------------------------------------------------------------------
    .text:10089E66
    .text:10089E66 loc_10089E66: ; CODE XREF: sub_10089E00+4Dj
    .text:10089E66 ; sub_10089E00+51j
    .text:10089E66 movsx edx, al
    .text:10089E69
    .text:10089E69 loc_10089E69: ; CODE XREF: sub_10089E00+49j
    .text:10089E69 ; sub_10089E00+64j
    .text:10089E69 inc edi ; i_secret++ (from the for-loop)
    .text:10089E6A mov [ecx], dl
    .text:10089E6C mov al, byte ptr aPbclevtugPNccyrPbzchgreVa[edi] ; "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy"...
    .text:10089E72 test al, al ; p_secret2[ i_secret ] != '\0'; (for-loop condition)
    .text:10089E74 lea ecx, aPbclevtugPNccyrPbzchgreVa[edi] ; "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy"...
    .text:10089E7A jnz short loc_10089E30
    .text:10089E7C
    .text:10089E7C loc_10089E7C: ; CODE XREF: sub_10089E00+25j
    .text:10089E7C inc edi ; i_secret++ (from after the for loop)
    .text:10089E7D mov i_secret, edi
    .text:10089E83
    .text:10089E83 loc_10089E83: ; CODE XREF: sub_10089E00+1Cj
    .text:10089E83 lea edx, [ebp+4]
    .text:10089E86 mov edi, 20 ; This is the 20 from the < 20 part of the next loop.
    .text:10089E8B jmp short loc_10089E90
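If you want to check the ROT13 part yourself, here's a minimal sketch. It assumes an ASCII character set, and the names rot13_in_place and decodes_to_apple_notice are hypothetical helpers of mine, not part of drms.c. Only the macro and the string literal are taken from the C code above.

```c
#include <assert.h>
#include <string.h>

/* Same ROT13 macro as in the drms.c excerpt quoted above. */
#define ROT13(c) (((c)>='A'&&(c)<='Z')?(((c)-'A'+13)%26)+'A':\
                  ((c)>='a'&&(c)<='z')?(((c)-'a'+13)%26)+'a':c)

/* Hypothetical helper: decode a NUL-terminated string in place and
 * return its length including the terminator, the way i_secret ends up. */
static size_t rot13_in_place(char *s)
{
    size_t i;
    assert(s != NULL);
    for (i = 0; s[i] != '\0'; i++)
        s[i] = (char)ROT13(s[i]);
    return i + 1; /* include zero terminator */
}

/* Hypothetical self-check: decode the literal and compare it with the
 * plain-text copyright notice. Returns 1 on a match, 0 otherwise. */
static int decodes_to_apple_notice(void)
{
    char secret[] = "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy Evtugf Erfreirq.";
    size_t n = rot13_in_place(secret);
    return n == sizeof(secret) &&
           strcmp(secret, "copyright (c) Apple Computer, Inc. "
                          "All Rights Reserved.") == 0;
}
```

The same arithmetic explains the odd-looking character constants in the listing: in ASCII, 'A' - 13 is 0x34, which the disassembler displays as '4', and 'a' - 13 is 0x54, displayed as 'T'. So `sub eax, '4'` and `sub eax, 'T'` are just the folded forms of `(c)-'A'+13` and `(c)-'a'+13`.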
At this point it would not actually be necessary to go on. Matching non-trivial string literals, non-trivial tables and similar control-flow don't happen accidentally. I nevertheless want to present another part of the function.
The next part is from the very end of the function. Here comes the C code.
    if( p_shuffle->i_version == 0x01000300 )
    {
        DoExtShuffle( p_bordel );
    }

    /* Convert our newly randomised p_bordel to big endianness and take
     * its MD5 hash. */
    InitMD5( &md5 );
    for( i = 0; i < 16; i++ )
    {
        p_big_bordel[ i ] = U32_AT(p_bordel + i);
    }
    AddMD5( &md5, (uint8_t *)p_big_bordel, 64 );
    if( p_shuffle->i_version == 0x01000300 )
    {
        AddMD5( &md5, (uint8_t *)p_secret1, sizeof(p_secret1) );
        AddMD5( &md5, (uint8_t *)p_secret2, i_secret );
    }
    EndMD5( &md5 );

    /* XOR our buffer with the computed checksum */
    for( i = 0; i < i_size; i++ )
    {
        p_buffer[ i ] ^= md5.p_digest[ i ];
    }
And here's the annotated disassembly.
    .text:10089F5F InitMD5 was inlined
    .text:10089F5F
    .text:10089F5F lea edi, [esp+0A8h+var_80]
    .text:10089F63 mov [esp+0A8h+var_90], 67452301h
    .text:10089F6B mov [esp+0A8h+var_8C], 0EFCDAB89h
    .text:10089F73 mov [esp+0A8h+var_88], 98BADCFEh
    .text:10089F7B mov [esp+0A8h+var_84], 10325476h
    .text:10089F83 rep stosd
    .text:10089F85 xor ecx, ecx
    .text:10089F87 lea edi, [esp+0A8h+var_3E]
    .text:10089F8B lea ebp, [esp+0A8h+var_3A]
    .text:10089F8F sub edi, esi
    .text:10089F91 mov [esp+0A8h+var_98], ecx
    .text:10089F95 mov [esp+0A8h+var_94], ecx
    .text:10089F99 lea eax, [esi+6]
    .text:10089F9C sub ebp, esi
    .text:10089F9E mov edi, edi
    .text:10089FA0
    .text:10089FA0 for( i = 0; i < 16; i++ )
    .text:10089FA0
    .text:10089FA0 loc_10089FA0: ; CODE XREF: sub_10089E00+221j
    .text:10089FA0 xor edx, edx
    .text:10089FA2
    .text:10089FA2 U32_AT is a preprocessor macro that's expanded to a function
    .text:10089FA2 which is then inlined. I've cut out a lot of bit-operations to make
    .text:10089FA2 the post smaller
    .text:10089FA2
    .text:10089FA2 mov dh, [eax-6]
    ... snip ...
    .text:1008A027 lea eax, [esp+68h]
    .text:1008A02B push 64
    .text:1008A02D push eax
    .text:1008A02E lea ebx, [esp+0B0h+var_98]
    .text:1008A032 call sub_10088200 ; AddMD5( &md5, (uint8_t *)p_big_bordel, 64 );
    .text:1008A037 mov ecx, [esp+0B0h+p_shuffle]
    .text:1008A03E mov eax, [ecx]
    .text:1008A040 add esp, 8
    .text:1008A043 cmp eax, 1000300h
    .text:1008A048 jz short loc_1008A051 ; if( p_shuffle->i_version == 0x01000300 )
    .text:1008A04A cmp eax, 1000400h
    .text:1008A04F jnz short loc_1008A078 ; Mystery line!
    .text:1008A051
    .text:1008A051 loc_1008A051: ; CODE XREF: sub_10089E00+248j
    .text:1008A051 push 80h
    .text:1008A056 push offset unk_10108018
    .text:1008A05B lea ebx, [esp+0B0h+var_98]
    .text:1008A05F call sub_10088200 ; AddMD5( &md5, (uint8_t *)p_secret1, sizeof(p_secret1) );
    .text:1008A064 mov edx, i_secret
    .text:1008A06A push edx
    .text:1008A06B push offset aPbclevtugPNccyrPbzchgreVa ; "pbclevtug (p) Nccyr Pbzchgre, Vap. Nyy"...
    .text:1008A070 call sub_10088200 ; AddMD5( &md5, (uint8_t *)p_secret2, i_secret );
    .text:1008A075 add esp, 10h
    .text:1008A078
    .text:1008A078 loc_1008A078: ; CODE XREF: sub_10089E00+24Fj
    .text:1008A078 lea esi, [esp+0A8h+var_98]
    .text:1008A07C call sub_10088320 ; EndMD5( &md5 );
    .text:1008A081 mov edx, [esp+0A8h+i_size]
    .text:1008A088 test edx, edx
    .text:1008A08A jbe short loc_1008A0AF ; This is the initial for-loop condition check
    .text:1008A08C mov eax, [esp+0A8h+p_buffer]
    .text:1008A093 lea ecx, [esp+0A8h+var_90]
    .text:1008A097 sub ecx, eax
    .text:1008A099 lea esp, [esp+0]
    .text:1008A0A0
    .text:1008A0A0 for( i = 0; i < i_size; i++ )
    .text:1008A0A0
    .text:1008A0A0 loc_1008A0A0: ; CODE XREF: sub_10089E00+2ADj
    .text:1008A0A0 mov ebx, [eax]
    .text:1008A0A2 mov esi, [ecx+eax]
    .text:1008A0A5 xor ebx, esi
    .text:1008A0A7 mov [eax], ebx ; p_buffer[ i ] ^= md5.p_digest[ i ];
    .text:1008A0A9 add eax, 4
    .text:1008A0AC dec edx
    .text:1008A0AD jnz short loc_1008A0A0
    .text:1008A0AF
    .text:1008A0AF loc_1008A0AF: ; CODE XREF: sub_10089E00+28Aj
    .text:1008A0AF pop edi
    .text:1008A0B0 pop esi
    .text:1008A0B1 pop ebp
    .text:1008A0B2 pop ebx
    .text:1008A0B3 add esp, 98h
    .text:1008A0B9 retn
    .text:1008A0B9 sub_10089E00 endp

Check the line with the comment "Mystery line!". There's an additional version check of some sort in the assembler code that's not part of the C code. There are at least four possible explanations: 1) The C code is older than the assembler code; 2) The C code is newer than the assembler code; 3) The assembler code wasn't generated from VideoLAN code but from a project that uses VideoLAN code and changed that line; 4) The F4I developers added the line. Maybe one of our readers knows the answer.
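To make the difference concrete, here's a sketch of the two conditions side by side. The first is the condition from the C code above; the second is only my reading of the compare-and-jump sequence at 1008A043 to 1008A04F (jump into the block on 0x01000300, otherwise skip it unless the value is 0x01000400). The function names are hypothetical, and I'm not claiming this is how any real source file spells it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Condition as written in the quoted drms.c excerpt. */
static bool adds_secrets_in_c_code(uint32_t i_version)
{
    return i_version == 0x01000300;
}

/* Condition implied by the disassembly (an assumption based on the
 * listing): cmp/jz on 1000300h, then cmp/jnz on 1000400h. */
static bool adds_secrets_in_binary(uint32_t i_version)
{
    return i_version == 0x01000300 || i_version == 0x01000400;
}

/* Hypothetical self-check: the two conditions disagree only for
 * version 0x01000400. */
static void check_version_conditions(void)
{
    assert(adds_secrets_in_c_code(0x01000300));
    assert(adds_secrets_in_binary(0x01000300));

    assert(!adds_secrets_in_c_code(0x01000400));
    assert(adds_secrets_in_binary(0x01000400));

    assert(!adds_secrets_in_c_code(0x01000200));
    assert(!adds_secrets_in_binary(0x01000200));
}
```

In other words, the binary hashes p_secret1 and p_secret2 for one more version value than the C code shown here does.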
Besides that line (and another part with exactly the same additional version check somewhere in the middle of the function) the functions are absolutely identical I think.
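A few constants in the listing can also be cross-checked without the VideoLAN source. The sketch below relies on two assumptions of mine: that the four immediates stored by the inlined InitMD5 are the standard MD5 initial state from RFC 1321, and that U32_AT is a big-endian 32-bit read (the comment in the C code talks about converting to big endianness). The names md5_init_state and u32_at_be are hypothetical stand-ins, not VideoLAN identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard MD5 initial chaining values (RFC 1321). They are the same
 * four values the listing stores to var_90, var_8C, var_88 and var_84. */
static const uint32_t md5_init_state[4] = {
    0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u
};

/* Hypothetical stand-in for U32_AT: read 32 bits as big-endian,
 * independent of the host byte order. */
static uint32_t u32_at_be(const void *p)
{
    const uint8_t *b = (const uint8_t *)p;
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical self-check of the constants discussed in the post. */
static void check_listing_constants(void)
{
    const uint8_t version_bytes[4] = { 0x01, 0x00, 0x03, 0x00 };

    assert(sizeof(md5_init_state) == 16);
    assert(u32_at_be(version_bytes) == 0x01000300u);

    /* p_secret1 has 32 uint32_t entries, so sizeof(p_secret1) is
     * 128 bytes, i.e. the 80h pushed before the second AddMD5 call. */
    assert(32 * sizeof(uint32_t) == 0x80);
}
```

The last check is a small one, but it's another place where the numbers line up: the `push 80h` at 1008A051 is exactly `sizeof(p_secret1)` for the 32-entry table in the C code.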
If you have trouble finding the two assignments right behind the continue in the loop with the switch-statement don't think those lines are missing in the disassembled code. There's some very tricky compiler optimization going on but the code is there.
At this point I think we're basically done. It can't get any better than finding GPL code. Sure, there are maybe other libraries illegally used in the code too besides LAME, mpglib and VideoLAN but it's not my job to check for that. If you're affiliated with any decent open-source multimedia project you should maybe check for yourself whether you were ripped off too.
I'm going to write emails to 6 - 8 people who are in some way affiliated with these projects or open-source licenses in general. But before that I'm going to relax for a while. I think it's about time for taking a break.
By the way, it's possible that an old public domain id3lib library was used by F4I. That's why I didn't list it among the other libraries a few lines above. The potential "loss" of id3lib is negligible though as there's still LAME and mpglib from the LGPL department.
Trackbacks
Ansel's Brain on : Ironic turn of events in Sony DRM scandal
In the latest turn of events, it appears that First4Internet's [http://en.wikipedia.org/wiki/Digital_rights_management DRM] software may have violated the license terms of DRM-busting software authored by [http://en.wikipedia.org/wiki/Jon_Lech_Johansen ...
kirjoittaessani.twoday.net on : XCP
Heise reports, citing The Interweb, that ...
Comments
Dana Cline on :
D3z on :
It seems that the 'mystery line' was added to handle a version check and probably some remote auto update of the code... am i wrong?
Ben Fowler on :
Too difficult for commercial programmers?
See
history-of-a-gpl-violation about two fifths of the way down.
gpierce on :
Albert on :
Even with a greedy lawyer and a bad settlement, this makes the LGPL and GPL authors quite rich. They can easily retire in comfort, at Sony's expense.
Go for it. The lawyer can accept payment as a portion of the winnings.
Everybody: please help find and contact the authors so they know.
I dearly wish Sony would steal my LGPL code too. I feel so left out.
Paolo Bonzini on :