It's the little surprises ...

It's the little surprises that make programming so interesting. In chapter 4 (Pragmatic Paranoia) of The Pragmatic Programmer, the authors Andrew Hunt and David Thomas teach the reader that assuming things about code is a bad idea (one example they give is about minutes that have more or fewer than 60 seconds). You should double- and triple-check every assumption you make and then let the program check your assumption again with an assert statement.

Yesterday one of these assumptions cost me about 90 minutes. What do you think is the result of the expression 0xFFFFFFFF12345678 & 0xFFFFFFFF where & is the operator of the bit-wise AND operation?

Well, let's ask Java.

OK, Java says the result is 0x12345678. That was my guess too. What does C# say about this issue?

Another vote for 0x12345678. Let's ask Ruby now.

Wow, the answer 0x12345678 seems to be pretty popular. OK, one more. Python.

Python agrees with the other languages. 0xFFFFFFFF12345678 & 0xFFFFFFFF is 0x12345678. But what about Jython, the language I used yesterday?

Yeah, awesome. That's just awesome. Jython says that the bit-wise AND of 0xFFFFFFFF12345678 and 0xFFFFFFFF is 0xFFFFFFFF12345678. After playing around with this for a while, I came to the following conclusion: 0xFFFFFFFF is treated as a signed 32-bit integer (that is, -1), which is sign-extended to the signed long value 0xFFFFFFFFFFFFFFFF before the bit-wise AND operation. ANDing with an all-ones mask then leaves the original value unchanged.
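To make that conclusion concrete, here is a small sketch in plain Python 3 that simulates the widening by hand. The helper name `sign_extend_32_to_64` is made up for illustration, and the code only imitates the sign extension described above; it does not run on or inspect Jython itself.

```python
def sign_extend_32_to_64(value):
    """Treat the low 32 bits of value as a signed 32-bit int and
    return the 64-bit pattern you get after sign extension."""
    value &= 0xFFFFFFFF                 # keep only the low 32 bits
    if value & 0x80000000:              # sign bit set: the int is negative
        value |= 0xFFFFFFFF00000000     # fill the upper 32 bits with ones
    return value


wide = 0xFFFFFFFF12345678
mask = 0xFFFFFFFF

widened_mask = sign_extend_32_to_64(mask)
surprising = wide & widened_mask        # what a sign-extended mask produces
expected = wide & mask                  # what a plain 32-bit mask produces

print(hex(widened_mask))  # 0xffffffffffffffff
print(hex(surprising))    # 0xffffffff12345678
print(hex(expected))      # 0x12345678
```

A mask whose sign bit is clear, such as 0x7FFFFFFF, passes through the helper unchanged, which is why the surprise only shows up for masks with bit 31 set.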

The lesson I learned yesterday is to never assume that even the most basic operations of a language work as expected. If you think about it, sign-extending a signed int to a signed long before performing the operation probably makes sense. After all, for most other operations it's probably the right thing to do. If you want to clear the upper 32 bits of a 64-bit value, though, you're going to run into problems.

By the way, the proper way to do this in Jython is to explicitly mark the 0xFFFFFFFF as a long (0xFFFFFFFFl).
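In the spirit of the assert advice from the book, the masking can also be wrapped in a helper that checks its own result. This is only a sketch in plain Python 3; `LOW_32_MASK` and `low_32_bits` are hypothetical names. Python 3 has no `l` suffix and its ints are arbitrary precision, so the mask is not sign-extended there. The assert exists to fail loudly on a runtime that behaves differently.

```python
LOW_32_MASK = 0xFFFFFFFF  # hypothetical constant name


def low_32_bits(value):
    """Return value with everything above bit 31 cleared."""
    result = value & LOW_32_MASK
    # Check the assumption instead of trusting it:
    # no bit above bit 31 may survive the mask.
    assert result >> 32 == 0, "upper bits were not cleared: %#x" % result
    return result


print(hex(low_32_bits(0xFFFFFFFF12345678)))  # 0x12345678
```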

Edit: And as I found out 5 seconds after posting, the behaviour is probably inherited from Java. Using longs instead of BigInteger in Java causes the same behaviour.

Comments

Massimo on :

A little error.

They are both unsigned, as they are expressed in hex.
However, C always performs conversion to the bigger operand of the two, and returns a result in the same fashion.
So, int64&int32==int64.
The 'bug' is in other languages, as C casting is ok.

Massimo on :

I forgot to add:

conversion is performed with standard operators, see
http://faydoc.tripod.com/cpu/cdq.htm
for example.

Regards,
