sayap's blog

while 1: yield None

Fun with Google Trends - the Linux ecosystem

written by sayap, on Sep 24, 2008 10:00:00 PM.

One of the hottest topics recently is the controversial keynote delivered by GregKH about how little Canonical contributes to the Linux ecosystem. Dustin Kirkland, a Canonical employee, wrote a rebuttal by playing the "our company is so small" card and painting the criticism from GregKH as an attack from a much bigger and more established competitor.

As much as I love to see an underdog succeed, I don't buy Dustin's argument at all. Look, GregKH is a smart guy. Coming from a competing company, he would have expected this kind of attack. Thus, regardless of whether he did so consciously or not, the slides centered on Canonical versus Gentoo, not Canonical versus Red Hat/Novell/IBM. Gentoo, as a smaller player, wins or ties in all comparisons, rendering Dustin's argument worthless.

Anyway, in his futile attempt to further prove his point, Dustin showed us this masterpiece:

Linux versus Ubuntu

The fun part about Google Trends is that you can interpret it however you want. As a biased Gentoo user, I see that

  • Ubuntu is doing well, with noticeable spikes after release dates

  • Linux is getting worse all the time

  • If you combine their search volumes together, the whole pie is actually shrinking

Which is essentially what GregKH told us: Ubuntu is not helping the Linux ecosystem (actually, it's hurting, but let's try not to read into the graph too much).

Sigh. * shake head in disbelief *

I zip, I slice, and I zip again

written by sayap, on Sep 21, 2008 3:24:00 AM.

Yesterday, James Reeves posted some Haskell code on Slashdot and sort of challenged others to come up with an equivalent solution in other languages:

listToForest :: Eq a => [[a]] -> Forest a
listToForest = map toBranch . groupBy ((==) `on` head) . filter (/= [])
           where toBranch = Node . (head . head) <*> (listToForest . map tail)

According to James, "assuming you know Haskell pretty well, [the code]'s fairly clear as well". He may be right, but for anyone who doesn't know Haskell, it looks downright scary. Anyway, what the code does is convert data in grid form (a list of lists in Python, e.g. the rows of a query result):

A I A
A I G
B D B
B W H

into some hierarchical form:

A -> I -> A
       -> G
B -> D -> B
  -> W -> H

Sounds easy? I thought so. After several failed attempts with Python, I finally realized that the key to the Haskell code being so concise was groupBy. And sure enough, Python has an equivalent in itertools.groupby. Nice. With that, here's a version in Python that is (hopefully) understandable by a mere mortal:

from itertools import groupby

def make_tree(data):
    return [[node, make_tree([x[1:] for x in iterator if x[1:]])]
            for node, iterator in groupby(data, lambda x: x[0]) if node]

>>> data = [['A', 'I', 'A'], ['A', 'I', 'G'], ['B', 'D', 'B'], ['B', 'W', 'H']]
>>> print make_tree(data)
[['A', [['I', [['A', []], ['G', []]]]]], ['B', [['D', [['B', []]]], ['W', [['H', []]]]]]]

With groupby, slicing the data is a piece of cake. We iterate through the iterator returned by groupby, chop everyone's head off, and pass the bodies as a list to the next recursive call. Simple, and it gets the job done.
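To make the head-chopping concrete, here is a small sketch of what groupby hands back for the sample data. It is written for Python 3 (an assumption; the code above is Python 2), and show_groups is a made-up helper name, not part of itertools. One thing worth keeping in mind: groupby only merges consecutive rows that share a key, so the rows need to be sorted, or at least clustered, by their first column.

```python
from itertools import groupby

def show_groups(data):
    """Return (head, bodies) pairs: each leading value together with the
    remainder of every row in its run. Hypothetical helper for illustration."""
    return [(head, [row[1:] for row in rows])
            for head, rows in groupby(data, key=lambda row: row[0])]

data = [['A', 'I', 'A'], ['A', 'I', 'G'], ['B', 'D', 'B'], ['B', 'W', 'H']]
groups = show_groups(data)
for head, bodies in groups:
    print(head, bodies)

# Unsorted input produces separate runs instead of one group per key.
unsorted_heads = [head for head, _ in show_groups([['A'], ['B'], ['A']])]
```

The bodies are exactly what gets passed down to the next level of recursion in make_tree.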

Looking further into the documentation for itertools, I found izip and islice. Despite the naming, they are not made by Apple, though they are still cool. They basically allow you to zip and slice an iterator as if it's a list:

from itertools import groupby, islice, izip

def make_tree(data):
    return [[node, make_tree(izip(*islice(izip(*iterator), 1, None)))]
            for node, iterator in groupby(data, lambda x: x[0]) if node]

I am not sure if this iterator version performs better than the initial version, but I am pretty sure it is more dangerous, especially if the dataset is large, if you catch my drift.
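One possible reading of "dangerous" (a guess, not something spelled out above): the * in a call unpacks its argument eagerly, so each group is pulled entirely into memory as positional arguments before the zip even starts, and zip silently truncates ragged rows to the shortest one. Here is a Python 3 sketch of both effects, where the built-in zip stands in for izip; make_tree_iter and counting are names invented for this example.

```python
from itertools import groupby, islice

def make_tree_iter(data):
    # Python 3 port of the iterator version: zip is already lazy, like izip.
    return [[node, make_tree_iter(zip(*islice(zip(*rows), 1, None)))]
            for node, rows in groupby(data, lambda x: x[0]) if node]

pulled = []

def counting(rows):
    # Hypothetical helper: records each row as it is pulled from the source.
    for row in rows:
        pulled.append(row)
        yield row

data = [['A', 'I', 'A'], ['A', 'I', 'G'], ['B', 'D', 'B'], ['B', 'W', 'H']]

# Star-unpacking drains the generator before zip is even called.
columns = zip(*counting(data))
drained = len(pulled)  # every row has been read, though no column was requested

tree = make_tree_iter(data)

# Ragged rows: zip stops at the shortest row, so the trailing 'Y' is dropped.
ragged = make_tree_iter([['A', 'X', 'Y'], ['A', 'Z']])
```

On rectangular data the result matches the list-slicing version, so the iterator variant buys little beyond the fun of writing it.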

Anyway, what James wanted to achieve is to then transform the tree into XML. Here's how to do so with my pseudo-tree:

indent = 4
def write_tag(tree, level=0):
    for node, children in tree:
        if children:
            print '%s<%s>' % (indent * level * ' ', node)
            write_tag(children, level+1)
            print '%s</%s>' % (indent * level * ' ', node)
        else:
            print '%s<%s/>' % (indent * level * ' ', node)

>>> tree = make_tree(data)
>>> write_tag(tree)
<A>
    <I>
        <A/>
        <G/>
    </I>
</A>
<B>
    <D>
        <B/>
    </D>
    <W>
        <H/>
    </W>
</B>
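For anyone on Python 3, where print is a function, here is a sketch of the same traversal that collects the lines in a list instead of printing them directly. tag_lines is a name made up for this example, and the tree literal is simply the make_tree output shown earlier.

```python
INDENT = 4

def tag_lines(tree, level=0):
    """Return the XML-ish lines for a [[node, children], ...] pseudo-tree."""
    pad = ' ' * (INDENT * level)
    lines = []
    for node, children in tree:
        if children:
            lines.append('%s<%s>' % (pad, node))
            lines.extend(tag_lines(children, level + 1))
            lines.append('%s</%s>' % (pad, node))
        else:
            # Leaves become self-closing tags.
            lines.append('%s<%s/>' % (pad, node))
    return lines

tree = [['A', [['I', [['A', []], ['G', []]]]]],
        ['B', [['D', [['B', []]]], ['W', [['H', []]]]]]]
xml_lines = tag_lines(tree)
print('\n'.join(xml_lines))
```

Returning the lines rather than printing them makes the function easier to test or to write to a file; note that, like the original, it does no escaping of node names.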

It's late and I need to sleep. To recap, the Haskell solution took 3 lines, and the Python solution took 3 lines. It's a draw. Peace. Let's all point to Java and laugh.

How to start a blog

written by sayap, on Sep 19, 2008 9:59:00 PM.

Here's how a normal person starts a blog:

  1. Make a coin toss. Heads for WordPress, tails for Blogger, middle for others.

  2. Sign up.

  3. Blog like a normal person.

Of course, blogging like a normal person is pretty boring. For one, if you blog like a normal person, you won't even get arrested under ISA. Boring. * yawn *

So here's how a real man starts a blog:

  1. Spend a few days looking for a server that is cheap, fast, and has a reasonably not-slow connection to whatever 3rd world country you happen to live in.

  2. Call your credit card company to authorize the payment for the server, all the while resisting the urge to explain that the transaction is not for porn.

  3. Spend a few days looking for blog software that doesn't have any freaking release yet, let alone a stable one. Spend a few nights setting it up -- in a 3rd world country, you get a slightly less sucky connection in the off-peak hours.

  4. Realize that unstable really does mean unstable, and the software is just unusable. Nonsense. A magical touch of hg revert -r 486 . makes it production ready, and you are good to go.

  5. Blog about how to start a blog (to mask the fact that you have totally forgotten what you wanted to blog about in the first place).