Thursday, September 6, 2007

On convergent evolution, and the invocation of holy wars.

Python and Ruby are the same thing.

There, with that out of the way, let me explain myself.

I was looking at a post about seam carving image retargeting, whose author thoughtfully includes a link to his Python implementation of seam carving, all nicely packaged in git.

So I grabbed it, used Emacs to import it into a new file, and started converting it to Ruby, since that's the language that's hitting my happy button right now (Haskell is for when I want to hurt myself). (NB: I plan to post said conversion here, or at least link to SVN for it.)

Take a look at an excerpt:


class CostMatrix(ndarray):
    def calculate(self, energy_map):
        if not energy_map.shape == self.shape:
            raise Exception, "Wrong shape"
        (h, w) = self.shape
        self[0] = energy_map[0].copy()
        self[0] = self[0]
        for y in range(1, h):
            for x in range(0, w):
                bestcost = inf
                bestx = x
                for dx in range(x - 1, x + 2):
                    if dx >= 0 and dx < w:
                        if self[y - 1, dx] < bestcost:
                            bestcost = self[y - 1, dx]
                            bestx = dx
                self[y, x] = self[y - 1, bestx] + energy_map[y, x]
        self._calculated = True

    def _get_max_index(self, row, startcol = 0):
        maxx = startcol
        maxval = self[row, maxx]

        for x in range(0, len(self[row])):
            if self[row, x] > maxval:
                maxx = x
                maxval = self[row, x]

        return maxx

    def find_shortest_path(self):
        (h, w) = self.shape

        x = self._get_max_index(-1)
        path = [x]
        for y in range(h - 2, -1, -1):
            bestcost = inf
            for dx in range(x - 1, x + 2):
                if dx >= 0 and dx < w:
                    if self[y, dx] < bestcost:
                        bestcost = self[y, dx]
                        x = dx
            path.append(x)

        path.reverse()
        return path


    def get_image(self):
        scaling = 0.03
        (h, w) = self.shape
        im = Image.new("L", (w, h))
        im.putdata(self.flatten() * scaling)
        return im


Now, if you're a Python person, that should be fine. But what if you're a Ruby person? That looks like Ruby where someone added a lot of colons and forgot their end keywords. Oh, and someone's using parentheses oddly.

So, having converted those things, I'm confronted with a syntactically valid chunk of Ruby code. It no longer throws parse errors.

This is pretty mindblowing, to me. Maybe it's something that's long since been obvious to the old hands...
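
For anyone who wants to run the dynamic-programming step from the excerpt without NumPy or PIL, here is a minimal sketch in plain Python. It is not the original author's code: the names cumulative_cost and cheapest_seam are made up for illustration, it works on lists of lists rather than an ndarray subclass, and it starts backtracking from the cheapest cell in the last row (the usual seam carving choice), whereas the excerpt starts from _get_max_index(-1).

```python
def cumulative_cost(energy):
    """Build the cumulative cost matrix for vertical seams.

    energy is a list of equal-length rows of numbers (an assumption of
    this sketch; the excerpt uses a NumPy array instead).
    """
    h, w = len(energy), len(energy[0])
    cost = [list(energy[0])]  # the first row is just the energy itself
    for y in range(1, h):
        prev = cost[y - 1]
        row = []
        for x in range(w):
            # Cheapest of the up-to-three cells above (x-1, x, x+1),
            # clipped to the edges of the grid.
            best = min(prev[max(x - 1, 0):min(x + 2, w)])
            row.append(best + energy[y][x])
        cost.append(row)
    return cost


def cheapest_seam(cost):
    """Backtrack from the cheapest bottom cell; returns one x per row, top first."""
    h, w = len(cost), len(cost[0])
    x = min(range(w), key=lambda i: cost[-1][i])
    path = [x]
    for y in range(h - 2, -1, -1):
        lo, hi = max(x - 1, 0), min(x + 2, w)
        # Move to the cheapest neighbour in the row above.
        x = min(range(lo, hi), key=lambda i: cost[y][i])
        path.append(x)
    path.reverse()
    return path


energy = [[1, 2, 3],
          [4, 1, 6],
          [7, 8, 1]]
cost = cumulative_cost(energy)
seam = cheapest_seam(cost)
```

Each row of the cost matrix only depends on the row above it, which is why the original can fill it in with three nested loops.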

Anyway, this makes me wonder why Python and Ruby aren't implemented on the same core compiler/interpreter. I know Microsoft is doing something akin to this with their Dynamic Language Runtime, but why aren't the Ruby people stealing like mad from the Python people, and vice versa?

That said, this, to me, is only the midway step between Python and Haskell. Ruby will probably take a week to do anything fun with it, so, much as I might like to throw up a free image resizing service, I'm thinking I'd rather do it in HAppS, where at least it will be fast.

Any thoughts? Am I an idiot for not seeing this already?

PS there's also another implementation of seam carving based resizing that I've been looking at.

8 comments:

Allen Short said...

Different semantics. Ruby has continuations. Python has generators. Different C APIs. Ruby uses setjmp()/longjmp() for exception handling. Python uses return codes to indicate presence or absence of an exception.
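
To make the generator half of that contrast concrete: a Python generator is a function whose frame is suspended at each yield and resumed on demand, which is a narrower mechanism than a general continuation. A minimal sketch (countdown is a made-up name for illustration):

```python
def countdown(n):
    """Yield n, n-1, ..., 1, pausing after each value."""
    while n > 0:
        yield n  # execution stops here until the caller asks for more
        n -= 1


gen = countdown(3)
first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # resumes the same frame until it finishes
```

The caller can only resume the generator from where it last paused; it cannot capture "the rest of the program" and re-enter it later, which is the part a continuation adds.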

All the similarities (of which there are many) between Python and Ruby are conceptual; everything's just slightly different implementationally.

Justin George said...

Right, that's what I'm saying: given all the conceptual similarities, why aren't they moving towards a shared core implementation of the minimal set needed to do either, and then building libraries from there?

Are they too proud to work together? Or is there a larger hurdle I'm not seeing?

Given that Microsoft is doing it, I doubt the hurdle can be that high, unless it's merely a question of throwing cash at it.

Brandon Corfman said...

What would this rewrite accomplish? It would be a lot of hard work going over all the same ground that's been plowed before. And then when the Python folks were done, we could have, what ... Rails?

I think the onus is on the Ruby folks if they want access to Python libraries since they're more established and, frankly, more interesting. (NumPy & Pygame come to mind as ones I use that have no good Ruby equivalent).

J.V. Toups said...

I think it's just the huge barrier of no infrastructure or impetus for merger. It's not as if, just because there is a syntactic similarity between the two languages, they can just up and merge. As a previous poster indicated, Ruby is continuation based. I am not a Ruby expert, but I suspect that implies a deep divide between the implementation and internals of Ruby vs Python. Let's not forget that the power of Python at least is in its library. A change of engine or union with Ruby would require rewriting a LOT of code. I wouldn't expect something like that to happen without some really good reason.

Konrad said...

I can think of two reasons:
1. Finding similarities in already implemented languages, abstracting them away, and reimplementing these languages using the abstraction is a huge amount of work that brings (almost) nothing new.
2. I have never heard of a project with two BDFLs...

Mark Lee Smith said...

Ruby and Python may look similar; both are object-oriented and include functional aspects. These similarities are only skin deep: when you look closer, you notice that Python and Ruby have very different object models and use functions in astoundingly different ways.

If for instance, a programmer wanted to use a Ruby library from Python, they'd find themselves defining a lot of named functions. This is because Ruby uses Blocks (anonymous function arguments, or higher-order functions) everywhere. Python is not equipped to deal with this style of programming because lambda in Python is sadly crippled, and will be so for the foreseeable future.
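
A small sketch of what that looks like from the Python side. The helper each_with_index below is hypothetical, a stand-in for a Ruby-style API that expects a block; it is not a real library function. A lambda is enough while the body is a single expression, but anything with statements has to become a named def:

```python
def each_with_index(items, callback):
    """Hypothetical Ruby-style iterator: call callback(item, index) per element."""
    for index, item in enumerate(items):
        callback(item, index)


seen = []

# A lambda works while the body is a single expression...
each_with_index(["a", "b"], lambda item, i: seen.append((i, item)))

# ...but a body with statements (an if/else and augmented assignments here)
# cannot be written as a lambda, so it needs a named function.
totals = {"even": 0, "odd": 0}


def tally(item, i):
    if i % 2 == 0:
        totals["even"] += item
    else:
        totals["odd"] += item


each_with_index([10, 20, 30], tally)
```

In Ruby the second callback would simply be a multi-line block attached to the call.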

A programmer who wanted to use a Python library from Ruby may similarly find it difficult. Python uses functions/procedures in a lot of places where Ruby would use objects. This implies a forced style of programming (in either language), due to differences in programming styles.

This is just scratching the surface; there are far more damaging differences between the two languages. So much so that I don't see this happening. Not only is it impractical, but it would be bad for both languages.

A major problem with this kind of common VM, is that you will end up with a code base written in N languages, meaning that you need more (~N) programmers on average to maintain it effectively. This doesn't sound good, at least to me. It also requires the continued existence, development and use of each language.


You can, in theory, do this right now using Python and Ruby on the CLR or JVM, but I don't consider that much of an option. Maybe I'll change my mind when/if Parrot is finished?

Stephen said...

If the Parrot project succeeds, you won't even have to rewrite any code: http://www.parrotcode.org

Ian Bicking said...

No one is writing the shared VM. That's all there is to it, I think. The idea of intimately sharing code between the two is very hard. Notably the design decisions for the CLR (with IronPython, IronRuby, etc) keep the objects very separated. Closer to how you interact with C code in these languages currently, than really treating them like a single runtime. I think this is reasonable.

Outside of the CLR, there is of course Parrot. But it's hard to get excited about Parrot after all this time. On the Python side, PyPy offers the potential of targeting different backends for Python. In terms of open source, the LLVM and maybe eventually the HLVM (built on LLVM) seem like interesting candidates, and maybe more interesting than Parrot because they seem more focused on practical goals. The LLVM is a working backend for PyPy now, for instance.

All that said, the similarities between Python and Ruby don't mean a shared VM will *necessarily* help a lot. It could; mostly if it helps with things like making C libraries accessible to both languages easily at the same time, or if there's interesting infrastructure like hosting environments, VM sandboxing, etc.

But my conservative side at the moment is more likely to do communication between environments using something like HTTP, or simply the command-line (e.g., calling out to a resizing script).
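
A minimal sketch of the command-line route, assuming Python 3.7 or later. The "resizing script" here is only a stand-in: a Python one-liner that echoes its arguments, since no real script is named above. The helper name run_tool and the file name and size arguments are likewise made up for illustration.

```python
import subprocess
import sys


def run_tool(argv):
    """Run an external command and return its stdout as text; raise if it fails."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Stand-in for a hypothetical resizing script written in another language:
# this "tool" just prints the arguments it was given.
fake_resizer = [
    sys.executable,
    "-c",
    "import sys; print('resized', sys.argv[1], sys.argv[2])",
]

output = run_tool(fake_resizer + ["photo.png", "200x100"])
```

The two environments only share a process boundary and some text, so neither runtime needs to know anything about the other's objects.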