Saturday 30 June 2007

Python, os.fork() and GSoC

This one was a bit trickier to find, so I'm going to post it here: if you use os.fork() in Python, you will want to call os._exit() in the child process. Otherwise the child will clean up things it shouldn't on exit, and you will get segfaults from double frees.

That said, Python's webbrowser module is quite useful; check it out if you ever have to open a browser from a Python script or program. The only downside I see is exactly that it doesn't take care of the forking for you. I'm using it in bug-triage to show a bug report in the user's browser.

Talking about GSoC, during the last week the Debian BTS was updated, breaking, among other things, btsutils 0.1. The current development version of btsutils, which uses BeautifulSoup to parse the HTML more reliably, wasn't affected by the update.

That's all for now, time to code.
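A minimal sketch of the os.fork() advice above, written for current Python 3 on a Unix-like system. The helper name open_in_child and its opener parameter are hypothetical and are not taken from bug-triage. The sketch also waits for the child so its exit code can be inspected, which a real tool may not want to do.

```python
import os
import webbrowser


def open_in_child(url, opener=webbrowser.open):
    """Fork, call opener(url) in the child, and return the child's exit code.

    `opener` is a hypothetical hook for illustration; it defaults to
    webbrowser.open.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: it must never fall back into the parent's code path.
        status = 1
        try:
            opener(url)
            status = 0
        finally:
            # os._exit() ends the process immediately, skipping atexit
            # handlers and the rest of the interpreter's shutdown cleanup.
            os._exit(status)

    # Parent process: reap the child so it does not linger as a zombie.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    return -1  # the child was terminated by a signal
```

Calling open_in_child() with the URL of a bug report would open it through webbrowser.open() in the child, while the parent only sees the exit code.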

Sunday 24 June 2007

bug-triage 0.1

Yet another GSoC release; it's the first release of the real tool this time. From the release announcement sent to bug-triage-devel:

----

I'm pleased to announce the release of bug-triage 0.1. bug-triage is a tool to help triaging Debian bugs.

Current features:
* Show all bugs which match a given bug number, source package, package, maintainer or submitter
* Show details about one of the returned bugs on the user's web browser

The source code for bug-triage is available at http://bug-triage.alioth.debian.org/

Debian packages are being worked on and should be available soon.

Monday 18 June 2007

The world is full of (different) bugzillas....

... but KDE's wins the strangeness award so far...
>>> import urllib2
>>> opener = urllib2.build_opener()
>>> f = opener.open("http://bugs.kde.org")
>>> print f.read()

[...]

<h1>Page not found</h1>

<p>KDE has switched to bugzilla. Please go to the <a href="/">main page</a>
to search for your bug.</p>

[...]

>>> opener.addheaders = [("User-agent", "bzutils")]
>>> f = opener.open("http://bugs.kde.org")
>>> print f.read()

[...]

<h1>KDE Bug Tracking System</h1>
<p>This is KDE's bug tracking system which files details of wishes, bugs and crashes
reported by users and developers.  Each report is given a number, and is kept on file until it is
marked as having been dealt with. For participating you need a personal account which will gain
you the ability to post reports and comments as well as voting for specific reports and observe
development. You'll need to enable cookies for this site for staying logged in.</p>
Questions: 1) Why does KDE's Bugzilla require a User-Agent header to work? 2) Why doesn't it return something more descriptive?
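The workaround from the session above can be wrapped in a small helper. This is a sketch for current Python 3, where urllib2 became urllib.request; the function name make_opener is hypothetical and is not part of bzutils.

```python
import urllib.request


def make_opener(user_agent):
    """Return an opener that sends the given User-Agent with every request."""
    opener = urllib.request.build_opener()
    # Replace the default header list, which only carries urllib's own
    # "Python-urllib/x.y" User-agent string.
    opener.addheaders = [("User-agent", user_agent)]
    return opener
```

make_opener("bzutils").open("http://bugs.kde.org") would then repeat the second request of the session. The sketch does not check whether the server still behaves that way.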

Saturday 16 June 2007

bzutils 0.1

Continuing with the Summer of Code, I've just released bzutils 0.1. The release announcement:

---

I'm pleased to announce the release of bzutils 0.1. bzutils is a python module to interact with bugzilla servers.

Current features:
* Query bug reports through boogle (gnome's bugzilla search improvements) or boolean charts.
* For each bug report, gets the following metadata: id, product, component, status, resolution, reporter, assignee, summary, priority and severity

The source code for bzutils 0.1 is available at http://bug-triage.alioth.debian.org/

Debian packages are being worked on and should be available soon.

Thursday 14 June 2007

BeautifulSoup: Parsing html in Python

Parsing HTML to get the information you need can be a very hard task when you deal with complex pages like the ones generated by the Debian Bug Tracking System, which I need to do in my GSoC project until the debbugs people finish the SOAP interface. I was doing it through regular expressions, heavily based on the reportbug-ng code, when my mentor (thanks, Loïc) mentioned BeautifulSoup, a Python module (with a strange name :P) to parse HTML. If you ever need to parse HTML in Python, I strongly suggest you take a look at it. As usual with Python stuff, it's very well documented, and it has a very good set of features which allow one to easily find anything inside an HTML document. It also has an XML module, which I haven't tried (yet).

BTW, did I already say I think GSoC is a great learning experience? Even I'm surprised by how fast I've been able to apply GSoC-acquired knowledge in other activities, as I'm already using BeautifulSoup in another project.
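To illustrate why a parser beats regular expressions for this kind of job, here is a small sketch. BeautifulSoup is a third-party package, so this example deliberately uses the standard library's html.parser as a stand-in and is not BeautifulSoup's API. The names LinkCollector and extract_links are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return its links as (href, text) tuples."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

The parser tracks which element it is inside, so markup split across lines or carrying extra attributes does not need a special case the way it would in a regular expression.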

Sao Paulo's Metro Strike

I live in São Paulo, the most important (IMO) city in Brazil. It's also the fifth most populous metropolitan region in the world. One of the main public transportation systems in São Paulo is its metro, which was once regarded as a high-quality transportation system. Lately, however, it has been sinking. Fast. Very fast. It just can't keep up with the demand; the trains are getting fuller and fuller, and the schedules are getting more irregular as time passes.

As if this weren't enough, the metro workers' union seems to be made up of a bunch of selfish clowns. So, today, 3.3 million people are without transport because these clowns want a 13% pay raise. Now, where are the laws which state that this kind of public service can't be paralyzed? The government should just send these clowns back to the circus they fled from.

This brings us to another topic: the pathetic laws that regulate public/government workers. They can just work (or not work) however they want, and can't be fired. The ultimate job security here is to get a government job.

Finally, the solution for the (metro) problems: just privatize the damn thing already.

Well, rant done, so let's go on with our daily schedules (or what is possible of them without the metro, for the paulistanos).

Sunday 10 June 2007

btsutils 0.1.1

I've recently released the first version of btsutils, a Python module to interact with debbugs servers (such as the Debian Bug Tracking System). btsutils is part of my Google Summer of Code project, the bug triage and forward tool. Currently, btsutils can query the BTS based on bug number, source package, package, maintainer or submitter. A Debian package of btsutils 0.1.1 is already waiting to be processed in the NEW queue. Some useful links:

Saturday 2 June 2007

Python Soul

I find it very interesting how different programming languages have different styles. My Google Summer of Code project, the Bug Triage and Forward Tool, is my first Python software, and working on it over the last few days, I've got the feeling that the way I've been structuring the code doesn't fit very well with the way Python packages and namespaces work. I already wanted to separate the project into three independent codebases, so I'll go ahead and do that. These codebases will be:
  • python-btsutils: python module to interact with the Debian BTS / Debbugs servers
  • python-bugzilla: python module to interact with Bugzilla
  • bug-triage: The tool itself.
BTW, I'm just loving Python :)