Sunday, December 14, 2008

The dread -lcrt0.o error on Mac OS X

I periodically run into this problem compiling software on the Mac:

ld_classic: can't locate file for: -lcrt0.o
collect2: ld returned 1 exit status
make: *** [muscle] Error 1

In this case, I was compiling MUSCLE, a multiple alignment program.

The solution is to track the -static flag down in your Makefile, and remove it. GCC/OSX does NOT like -static.
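If the flag is scattered across a large Makefile, a few lines of Python can strip it. This is only a sketch: the helper name strip_static_flag and the sample Makefile text are made up for illustration, and it assumes -static appears as a standalone, whitespace-separated token.

```python
import re

# Matches a standalone "-static" token plus the whitespace before it,
# but not longer flags such as "-static-libgcc".
STATIC_FLAG = re.compile(r"[ \t]+-static(?=\s|$)")


def strip_static_flag(makefile_text):
    """Return the Makefile text with every standalone -static flag removed."""
    return STATIC_FLAG.sub("", makefile_text)


# Hypothetical Makefile fragment, not taken from the MUSCLE sources.
example = "CFLAGS = -O3 -static\nLDLIBS = -lm -static\n"
print(strip_static_flag(example))
```

Review the result by hand before building; a Makefile that assembles its flags from variables may need a different fix.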

Saturday, December 13, 2008

On the color of bits

Programmers and mathematicians tend to have a hard time with Intellectual Property law, and Copyright in particular. Fundamentally, when speaking of information, lawyers and math types aren't speaking the same language. This essay manages to capture the difference better than anything I've read: What Colour are your bits?

Tuesday, November 25, 2008

A Minor Quibble with Javascript, or: Why Automatic Type Coercion is Evil

Earlier today, I posted a question to mozilla.dev.tech.js-engine, to wit:

isNaN( [1] ) = false. Why?


And got the answer I feared: it's in the spec!

As best I can follow, the spec lays out this sequence for determining whether [1] is, in fact, a number:

isNaN( [1] ) -> ToNumber( [1] ) -> ToPrimitive( [1] ) -> [1].toString() -> "1"

And "1" is coerced into a number.

Automatic coercion, of which this is a rather contorted example, produces counter-intuitive results every time. Language designers should never, ever make it core to their language. That is all.

Tuesday, October 28, 2008

Berserk, ep 25

i finished the anime treatment of the Berserk manga last night. overall, it's good: well-drawn and voiced, and full of those moral ambiguities that make for interesting drama.

but the last episode felt more than anything like a bored
father who's gotten sick of telling his kids a bedtime story and
decides to finish it, and finish it fast:

"so then the broken Prince was reunited with his loyal soldiers
and fled to a field. and then, uhh, they were all eaten by monsters.
yeah. the end. now GO TO SLEEP!"

the Berserk Manga wikipedia entry suggests a much lengthier, more baroque plot arc centering on an ongoing battle between the survivors of the last anime episode (ie, Guts) and demons, but that doesn't change the fact that supernatural occurrences were pretty subtle until the last couple of episodes, making the series finale pretty jarring.

so i'll echo my brother's advice: if you decide to rent Berserk, you'll sleep better if you stop at episode 19.

Wednesday, October 15, 2008

The limits of Open Source

One of the arguments made against Open Source in the bad old days, back before IBM started making Real Money on Linux and Microsoft's attitude became schizophrenic, was that Open Source developers don't come up w/ truly new ideas, but simply re-implement old ones.

Hogwash, but I can think of two technologies we should have thought up first:

Dtrace
ZFS

Just saying.

Wednesday, September 24, 2008

Reading the iPhone

The iPhone has a wonderfully clear, hi-rez screen, and seems a perfectly usable platform for ebooks -- not as good as the Kindle, but better than the Palm V on which I once read "Don Quixote."

Sadly, Plucker hasn't been ported to the iPhone yet, putting it somewhere behind the 1999 Palm V in functionality.

There are a plethora of eBook readers available for download from the App Store. Most of them cost money; Stanza is one of the exceptions. It's got some decent books available for download, and the library interface is only moderately wonky, but... but. The book reader is TERRIBLE. Rendering takes minutes--not downloading, but rendering--and tilting the screen triggers a complete re-render, meaning that any accidental tilt interrupts your reading for 2-3min. Completely unusable.

The best book reader I've found is the iPhone's built-in PDF viewer, accessed via Air Sharing, an app that was free a week ago but now costs $6.99. Air Sharing allows you to connect to a network share on your iPhone via WebDAV, which initially sounds only mildly interesting, until you realize that files within that share can be opened for viewing on the iPhone.

PDFs rendered this way open fast, rotate even faster, and the viewer state--ie, which page you're on--is saved between Air Sharing sessions. It would be perfect were it not for the sometimes screen-unfriendly nature of PDFs.

Text and HTML files can also be viewed, and the Text viewer retains state, but render times are prohibitive.

Wednesday, September 17, 2008

Sync'ing Google Calendar w/ the iPhone

Do we have a word for the relationship milestone marked by my sync'ing my girlfriend's Google Calendar with my iPhone?

It's pretty easy to do, at least on a Mac.

Tuesday, September 9, 2008

A loathsome and indefensible lie

From Livejournal, by way of Pharyngula:

The theory of childhood, also known as child origin, is a damnable, loathsome and indefensible lie. How can any thinking person suppose all humans used to be babies once? ...
The development of children has been well-researched in our six-month study following a sample of one thousand children and adults of various ages. We have conclusively proven that while there are minor changes in features like height and body fat ... incontravertibly still every creature in the study that started out as a child had only slightly more adult features at the end of the observation period than at its beginning.

Monday, September 8, 2008

Two patched protein subtypes and a conserved domain of group I proteins that regulates turnover.

I'd almost forgotten about this:

Two patched protein subtypes and a conserved domain of group I proteins that regulates turnover.

Kawamura S, Hervold K, Ramirez-Weber FA, Kornberg TB.

Biochemistry & Biophysics, University of California, San Francisco, CA 94158.

Patched (Ptc) is a twelve-cross membrane protein that binds the secreted Hedgehog protein. Its regulation of the Hedgehog signaling pathway is critical to normal development and to a number of human diseases. This report analyzes features of sequence similarity and divergence in the Ptc protein family and identifies two subtypes distinguished by novel conserved domains. We used these results to propose a rational basis for classification. We show that one of the conserved sequence regions in the C-terminal domain of Ptc 1 is responsible, at least in part, for rapid turnover. This sequence is absent from the stable Ptc 2 protein.

Wednesday, September 3, 2008

Levenshtein distance

A Levenshtein distance, or "edit distance," is a measure of the similarity of two given strings as embodied in the number of changes--insertions, deletions, or substitutions--required to transform one string into the other.

This is an incredibly useful tool for grouping and ordering strings, and in particular, non-language strings -- we already have a convention for ordering words in English, but if you're staring at 200 protein sequences, alphabetic order doesn't do anything for you.

There's a fast C implementation of the Levenshtein algorithm and though it doesn't seem to have a proper project homepage, it can be found in the Pootle & Translate Toolkit.

I needed to add a wildcard parameter to the Levenshtein algorithm so that the distance between, eg, ABC and AXC is 0, given that 'X' is the specified wildcard character. (Normally, the Levenshtein distance between "ABC" and "AXC" is 1, of course: it takes one substitution to turn one string into the other.)

So I did.

Before I attempted to modify this C version, though, I put together a simple, unoptimized Python version, and the difference in performance was shocking. On my test data set, the C code took 0.3s to run, and the Python version took ... over 3 minutes. That's at least 600x slower.

After adding the couple of if statements and loops necessary for the wildcard code, the C code slowed to 1.2s.
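For illustration, here is a minimal pure-Python sketch of the wildcard idea. It is not the C code from the Translate Toolkit, and not the Python version timed above; the function name and the wildcard parameter are invented for this example, and it assumes the wildcard character may appear in either string.

```python
def levenshtein(a, b, wildcard=None):
    """Edit distance between a and b.

    If `wildcard` is given, that character matches any single character
    at no cost, whichever string it appears in (an assumption).
    """
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            same = ca == cb or (wildcard is not None and wildcard in (ca, cb))
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (0 if same else 1),  # match or substitution
            ))
        previous = current
    return previous[-1]


print(levenshtein("ABC", "AXC"))                # plain distance
print(levenshtein("ABC", "AXC", wildcard="X"))  # 'X' matches anything
```

It keeps only two rows of the table in memory, but the nested loops still run in the interpreter, which is where the gap to the C version comes from.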

Tuesday, August 19, 2008

MochiKit, IE6, and DOM manipulation

I'd recommend MochiKit to anyone who has to write Javascript code -- it lives up to its motto and "makes Javascript suck less." I'd recommend it twice as forcefully to Python coders, as features like the iteration tools are explicitly inspired by Python; and thrice as forcefully to any functional-programming-minded Python coders, as it provides handy functions like map and zip, plus short-cuts for partial application and the like.

Bob Ippolito is a smart man.

MochiKit also provides some convenient DOM manipulation tools, such that swapping in a whole table takes only 3 or 4 lines of code:

swapDOM( $('mytable'),
    TABLE({'id':'mytable'},
        TBODY(null, [TR(null, TD(null, "a cell"))] ) ) );

(assuming you already had a DOM element w/ id='mytable')

A quick caveat, though: Internet Explorer 6 is picky about tables, and requires that a table created via the DOM have a TBODY or nothing will be displayed.

Thursday, July 17, 2008

Python class inconsistencies

Update:

Masklinn's correction is obviously correct -- this isn't primarily a class v. instance variable problem, it's the difference between the operators I'm using on the instance variables.

So, an assignment (k.t = ...) points the attribute at a wholly new object, while calling one of the attribute's methods (j.l.append) actually alters the attribute's state in place.

(Which is where the confusion over mutability comes into play, but it was still confusion on my part.)
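A small sketch of that distinction, written for this note rather than taken from the original post: looking at each instance's __dict__ shows where the attribute actually lives after each operation.

```python
class K(object):
    t = (1, 2, 3)
    l = [1, 2, 3]


k = K()
j = K()

k.t = ('a', 'b', 'c')  # assignment: creates a new attribute on the instance k
j.l.append(10)         # method call: mutates the list object stored on the class

print(k.__dict__)      # k now has its own 't'
print(j.__dict__)      # j has nothing of its own; 'l' is still found on K
print(K.l)             # the shared list carries the appended value
print(k.l is K.l)      # both names refer to the same list object
```

The same thing would happen with a list assignment: k.l = [] would give k its own list and leave K.l alone.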

My original posting:

This makes a modicum of sense if you have a basic grasp of the distinction between mutables and immutables in Python, but it still seems like a mess.


>>> class K(object):
...     t = (1,2,3)
...     l = [1,2,3]
...
>>> k = K()
>>> j = K()
>>>
>>> k.t = ('a','b','c')
>>> j.l.append( 10 )
>>>
>>> j.t
(1, 2, 3)
>>> k.l
[1, 2, 3, 10]

Do you see it?

Class K defines two attributes, a tuple t and a list l.

If you instantiate K twice (k and j), then change the t attribute of one and the l attribute of the other, the change to l will be shared between instances, while the change to t will only affect the instance.

Whether an attribute belongs to the class or the instance depends on whether its type is mutable or not!

There's a fix in one case: you can use __init__ to make mutable data instance-specific:

>>> class K(object):
...     t = (1,2,3)
...     l = [1,2,3]
...     def __init__(self):
...         self.l = [1,2,3]
...

Now changes to k.l won't affect j.l.

However, I can't find a way to accomplish the opposite--create a class attribute for an immutable data type--w/out resorting to __getattribute__ magic.
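One possibility, offered as a sketch rather than a definitive answer: rebinding the attribute through the class itself, rather than through an instance, changes what every instance sees, provided the instance has not shadowed the attribute with its own.

```python
class K(object):
    t = (1, 2, 3)


k = K()
j = K()

# Rebind on the class, not on an instance: both k and j look t up on K.
K.t = ('a', 'b', 'c')

print(k.t)
print(j.t)

# An instance-level assignment still shadows the class attribute for k only.
k.t = ('x',)
print(k.t, j.t)
```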

Saturday, June 14, 2008

Pyrex for performance and obfuscation

I've recently been asked to obfuscate a bunch of Python code. Encryption is one possibility, but the user needs the key along with the encrypted code in order to run the code, so this is really just a round-about form of obfuscation. And if multi-billion dollar (and rather unsavory) industries can't get this right, I'd rather not even try.

One novel form of obfuscation is compilation to C-code, a task made relatively simple by Pyrex and, more recently, Cython. Both projects are mainly intended to ease the integration of C libraries with Python; both accomplish this by compiling native Python code into a .so shared object. This .so file should, in turn, be slightly harder to decipher than Python bytecode.

Pyrex isn't as actively maintained as Cython, but it is available via Macports, so I'm using Pyrex for now.

Pyrex appears to work by first translating your Python code into C, then compiling this C against the Python libraries. Unannotated Python objects remain PyObject * pointers -- it's quite possible that the Python interpreter, or VM, or whatever lies underneath, is still doing most of the heavy lifting with Pyrex-translated code; I can't make any sense of it.

But Pyrex also allows you to write Python-like code that gets translated to native C, with all the implied performance gains. As a simple example, I've done a naive implementation of the Fibonacci sequence in plain Python and in Pyrex' C/Python intermediary. Here's the file, called "pyrex_fib.pyx":

cdef _cfib( int i ):
    if i < 3:
        return 1
    else:
        return _cfib(i-1) + _cfib(i-2)

def cfib( i ):
    return _cfib( i )

def pyfib( i ):
    if i < 3:
        return 1
    else:
        return pyfib(i-1) + pyfib(i-2)


_cfib and pyfib are the same function, w/ _cfib implemented in Pyrex C notation; cfib is a wrapper around _cfib. (Native C functions can't be called directly from Python and must be wrapped.)

"pyrex_fib.pyx" is compiled to a Python-friendly .so file via distutils; here's the contents of "setup.py" -- lifted from Michael's Guide to Pyrex:

from distutils.core import setup
from distutils.extension import Extension
from Pyrex.Distutils import build_ext

setup(
    name = "PyrexGuide",
    ext_modules=[
        Extension("pyrex_fib", ["pyrex_fib.pyx"])
    ],
    cmdclass = {'build_ext': build_ext}
)

The compilation is accomplished via python setup.py build_ext --inplace, but note that bugs can result in the rather cryptic error message error: Pyrex does not appear to be installed on platform 'posix'


I also wrote a plain Python version, "py_fib.py":

def pyfib( i ):
    if i < 3:
        return 1
    else:
        return pyfib(i-1) + pyfib(i-2)


This enables me to compare Pyrex-translated Python code stored in a .so to the same code stored in a regular Python module, and compare them both to the "native" version.

I do this comparison via a simple module that imports both forms and runs them, timing each invocation. Here's the output:

kieran@host:~/tmp/pyrex$ ./time_fib.py
Sat Jun 14 12:15:07 2008
pyrex_fib.cfib(40) = 102334155 in 9.6s
pyrex_fib.pyfib(40) = 102334155 in 121.5s
py_fib.pyfib(40) = 102334155 in 98.0s

Interestingly, the Pyrex-translated Python code is about 20% slower than the regular Python; presumably, it's not benefiting from various interpreter optimizations. The "C" implementation blows them both out of the water.
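The timing script itself isn't reproduced in this post, so what follows is a hypothetical reconstruction of the idea, not the original time_fib.py. The helper name timed is invented, only a local copy of pyfib is timed (the pyrex_fib module would have to be compiled first), and the argument is reduced to 20 so it finishes quickly.

```python
import time


def pyfib(i):
    """Naive recursive Fibonacci, same shape as the versions above."""
    if i < 3:
        return 1
    return pyfib(i - 1) + pyfib(i - 2)


def timed(label, func, arg):
    """Call func(arg) and return (result, elapsed seconds, report line)."""
    start = time.perf_counter()
    result = func(arg)
    elapsed = time.perf_counter() - start
    line = "%s(%d) = %d in %.1fs" % (label, arg, result, elapsed)
    return result, elapsed, line


print(time.ctime())
result, elapsed, line = timed("py_fib.pyfib", pyfib, 20)
print(line)
```

With the extension built, the same helper could be pointed at pyrex_fib.cfib and pyrex_fib.pyfib to get the three-way comparison.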

Pyrex looks great for wrapping C libraries for Python, and might serve for code obfuscation, but the major limitation is the difficulty of moving non-scalar data types between C and Python: it wouldn't have been easy to return a Python dictionary from my cfib routine, for example.

Friday, May 30, 2008

Ouroboros

Circular dependencies will break your Python code!



Given some module A that depends upon module B (ie, import B), and given a module B which depends upon module A, you'll get this rather cryptic error message:

ImportError: cannot import name A

This works at any remove, of course -- the circle could stretch through 1,000 modules, but once you forge that loop, you're toast.

The solution is to refactor your code: put the routines from A that B needs into a separate module C that depends upon neither.
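Here is a self-contained sketch of both the failure and the refactoring, written for Python 3. Everything in it is hypothetical: the module names (circ_a, circ_b, fixed_a, fixed_b, fixed_c) and the helper try_import exist only for this demonstration, which writes tiny modules to a temporary directory and tries to import them. It assumes the modules use the from-import form, which is where this error shows up.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two modules that import names from each other: the loop.
CIRCULAR = {
    "circ_a": "from circ_b import b_func\n\ndef a_func():\n    return 'a'\n",
    "circ_b": "from circ_a import a_func\n\ndef b_func():\n    return 'b'\n",
}

# The refactoring: the shared routine moves to fixed_c, which imports nothing.
REFACTORED = {
    "fixed_c": "def shared():\n    return 'shared'\n",
    "fixed_b": "from fixed_c import shared\n",
    "fixed_a": "from fixed_c import shared\nimport fixed_b\n",
}


def try_import(directory, sources, entry):
    """Write the modules, import `entry`, and return the ImportError or None."""
    for name, body in sources.items():
        Path(directory, name + ".py").write_text(body)
    sys.path.insert(0, directory)
    try:
        importlib.invalidate_caches()
        importlib.import_module(entry)
        return None
    except ImportError as exc:
        return exc
    finally:
        # Leave the interpreter as we found it.
        sys.path.remove(directory)
        for name in sources:
            sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    circular_error = try_import(tmp, CIRCULAR, "circ_a")
with tempfile.TemporaryDirectory() as tmp:
    refactored_error = try_import(tmp, REFACTORED, "fixed_a")

print("circular:", circular_error)
print("refactored:", refactored_error)
```

The exact wording of the error differs between Python versions, so the sketch prints whatever the interpreter reports.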

Wednesday, May 28, 2008

Leopard emacs is broken...

Which means that GNUplot is broken under Leopard, and with it, scipy.

Fortunately, there's a solution:


sudo mv /usr/bin/emacs-i386 /usr/bin/emacs-i386.backup
sudo /usr/libexec/dumpemacs -d
emacs --version
emacs


It's quite beyond me why the emacs shipping w/ Leopard is broken out of the box, but can be repaired via the dumpemacs command, but there it is.

Thursday, February 14, 2008

Redirecting from mod_python

The sort of thing I'm inclined to forget:


from mod_python import apache

def handler(req):
    req.headers_out['location'] = 'http://www.modpython.org/'
    req.status = apache.HTTP_MOVED_TEMPORARILY
    req.send_http_header()
    return apache.OK


Care of the mod_python FAQ.

Monday, January 21, 2008

mod_python, Leopard, and the trouble w/ 64 bits

mod_python is broken under Leopard, at least on 64-bit capable systems.

Actually, this is only half true. Leopard ships w/ Apache 2, but more importantly, it ships w/ a 64-bit capable Apache 2. Because it's the 64-bit version that runs on capable systems, any modules linked by Apache need to be 64-bit capable.

Apache is 64-bit capable:

kieran@bali:~$ file /usr/sbin/httpd
/usr/sbin/httpd: Mach-O universal binary with 4 architectures
/usr/sbin/httpd (for architecture ppc7400): Mach-O executable ppc
/usr/sbin/httpd (for architecture ppc64): Mach-O 64-bit executable ppc64
/usr/sbin/httpd (for architecture i386): Mach-O executable i386
/usr/sbin/httpd (for architecture x86_64): Mach-O 64-bit executable x86_64

And mod_python, built by simply running ./configure and make, is 64-bit:

kieran@bali:~$ file /usr/libexec/apache2/mod_python.so
/usr/libexec/apache2/mod_python.so: Mach-O 64-bit bundle x86_64

So that much works. And basic mod_python functionality is intact. The problem comes when you try to compile python libraries that depend upon native C/C++ code. Neither command line options nor the CFLAGS environment variable can force python distutils to build 64-bit friendly libraries.

I haven't figured out how to force distutils to cooperate yet, but in the meantime, rebuilding the entire software stack in fink creates a usable environment.

The first step is to simply ask for mod_python:

  1. stop apache: sudo service org.apache.httpd stop

  2. install the necessary packages: fink install libapache2-mod-python-py24

  3. accept the lengthy list of dependencies


Installing the PHP module wasn't so easy; I wound up building it by hand. First, make sure the Postgres include files are installed, assuming you need Postgres:

fink install postgresql82-dev

./configure --with-apxs2=/sw/bin/apxs2 --without-iconv --with-mysql=/sw/ --with-pgsql=/sw/

Monday, January 14, 2008

Magsafe Connector Disassembly

The cord nearest the Magsafe plug on a modern Apple laptop is under a good deal of mechanical stress, and mine finally gave out; the outer layer of wire looked like loose steel wool. So I decided to take the plug apart:






Notice that black bit on the PCB in the left-hand image? That's a chip, identified by the following text:

A3
2100
613A1

It most likely controls the switch between the orange "charging" LED indicator and the green "powered" LED. However, I wonder if this chip doesn't hold the answer to another mystery: why won't the Macbook battery charge off the Magsafe airline adapter?

For further information on the subject, I'd suggest Stuart Schmitt's guide to hacking the MagSafe cable for use w/ a standard DC transformer.

Tuesday, January 8, 2008

mod_python under Leopard

OS upgrades break my workaround for compiling mod_python under OSX.

This is a good guide for fixing mod_python under Leopard, but with a caveat: if you've upgraded your system from Tiger, the "missing symbol" problem might persist. Run otool -L /usr/libexec/apache2/mod_python.so -- if you see a reference to /Library/Frameworks/Python.framework/Versions/2.4/, that's your problem. I simply renamed /Library/Frameworks/Python.framework to /Library/Frameworks/_Python.framework and rebuilt.

Update: I take it all back; mod_python essentially doesn't work under Leopard, at least on a 64-bit capable system. See my Jan 21 post for more information.