A MusicBrainz report on Pseudo-Releases

To support translated or transliterated tracklists on foreign releases, MusicBrainz has something called “Pseudo-Releases.” They are meant to be used alongside the database entry for the original, real release, and hold the alternate tracklists.

Unfortunately, many MusicBrainz users weren’t quite sure when the “Pseudo-Release” status should be used, and set it on some releases in place of the correct status, such as “Official”, “Bootleg”, or “Other”. A few translated tracklists do have the status set correctly, but aren’t linked back to the original version of the release.

To help find and fix these issues in the database, I’ve created a new report, “pseudo-releases without transl*tion relationships”, which is in code review now. It looks pretty sweet.

But 2261 releases to fix up means there’s a ways to go.
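The check behind such a report can be sketched in a few lines. This is only an illustration over in-memory data, not the actual report code: the `Release` class, the `TRANSL_TYPES` labels, and the sample titles are all hypothetical stand-ins for whatever the MusicBrainz schema really uses.

```python
from dataclasses import dataclass, field


@dataclass
class Release:
    name: str
    status: str
    # Labels of the relationships attached to this release (hypothetical names).
    relationships: set = field(default_factory=set)


# Hypothetical labels for "this is a translation/transliteration of that release".
TRANSL_TYPES = {"translation", "transliteration"}


def unlinked_pseudo_releases(releases):
    """Return pseudo-releases that have no translation-style relationship."""
    return [
        r for r in releases
        if r.status == "Pseudo-Release" and not (r.relationships & TRANSL_TYPES)
    ]


sample = [
    Release("Original album", "Official"),
    Release("Album (transliterated)", "Pseudo-Release", {"transliteration"}),
    Release("Album (mislabelled)", "Pseudo-Release"),
]
print([r.name for r in unlinked_pseudo_releases(sample)])
```

Both kinds of mistake described above land in the same list: a release that should have been “Official” and a genuine translation that was never linked both show up as pseudo-releases without the relationship.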

MusicBrainz contributions – Search result and stats improvements.

I’m pleased to see that two of my MusicBrainz contributions have been merged in time for the 2012-01-26 server update, and are now live on the site!

The first change isn’t exactly visible to the naked eye, but it should improve how MusicBrainz artist pages appear in Google search results. Each artist page now carries a nicely formatted meta description tag (<meta name="description">), which helps Google (and other search engines, of course) list a more relevant “snippet” of the page.
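As a rough idea of what generating such a tag involves, here is a minimal Python sketch. It is not the MusicBrainz server code; the function name `meta_description` and the 155-character limit are assumptions made for the example.

```python
from html import escape


def meta_description(text, limit=155):
    """Build a <meta name="description"> tag from free-form text.

    `limit` is an assumed cut-off for snippet length, not an official value.
    """
    # Collapse runs of whitespace (including newlines) into single spaces.
    collapsed = " ".join(text.split())
    if len(collapsed) > limit:
        # Leave room for the ellipsis so the result is at most `limit` characters.
        collapsed = collapsed[: limit - 1].rstrip() + "…"
    # Escape &, <, > and quotes so the text is safe inside an attribute value.
    return '<meta name="description" content="{}">'.format(escape(collapsed, quote=True))


print(meta_description('An   artist known for\n"loud" rock & roll'))
```

The escaping step matters most: artist names and descriptions routinely contain quotes and ampersands, which would otherwise break the attribute.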

The second change is in a little-known area of the site, the timeline statistics graph. I talk about this a bit more on my Google+ post, but it’s nice to see the feature live.

Vala Bindings for libmusicbrainz4

When developing Riker, I had a bit of a choice – I could either write (from scratch) a new library to interface with the MusicBrainz XML webservice, or I could create bindings to access the existing libmusicbrainz library from within Vala. Up to today, I’ve gone a little ways down both paths, and both have problems.

If I write a new library from scratch, it will have some nice features, like integrating into the GLib main loop and automatically determining proxy settings from the environment. But it will be a lot of coding, and even more debugging.

The existing libmusicbrainz code is better tested, and writing bindings is less overall code to write. Unfortunately, I’m writing Vala bindings for the C bindings to a C++ library. The extra steps cause some weirdness, which means that the bindings are more complicated than I would like.

And then there are a few things that the C bindings to libmusicbrainz simply get wrong. For example, they have no working type checking! As a result, even some of the internal test code gets types mixed up, causing hard-to-debug issues. I’m working on a patch to correct this, which will change the C bindings API slightly (but, curiously enough, not the ABI).

But in the end, simply to get started faster, I decided that the bindings are the way to go. The hypothetical GObject-based MusicBrainz webservice library will have to wait for another day.

Take a look at my progress so far on the Vala bindings at libmusicbrainz4.vapi.

Mercurial Frustrates Me

Maybe I’m just used to having too much power at my fingertips. Git was designed, from the ground up, to provide operations to do absolutely anything to a repository, right down to the most basic level of manually creating individual objects in the repository, all through user-visible command-line tools that can be operated from scripts. It is literally possible to reimplement most of the user-visible Git commands (such as “commit”!) using shell scripts and some of the more basic Git “plumbing” commands.
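To give a feel for how simple those basic objects are, the sketch below computes the ID Git assigns to a file’s contents (a “blob”) without calling Git at all. It assumes a repository using the default SHA-1 object format; the function name `git_blob_id` is made up for the example.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the object ID Git would give a blob with these contents.

    Assumes the SHA-1 object format: the ID is the hash of a small
    header ("blob <size>" followed by a NUL byte) plus the raw data.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


print(git_blob_id(b"hello\n"))  # → ce013625030ba8dba906f756967f9e9ca394464a
```

This should match what `git hash-object` prints for a file containing the same bytes, which is the kind of plumbing command a scripted “commit” would be built on.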

So, there’s no surprise that Git commands support special script-friendly output modes. A good example: git log --numstat will print out a summary of the changes in a commit in a parser-friendly tab-delimited format. Mercurial doesn’t have this option. (Someone wrote a patch to add it, but it was rejected!)
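Here is a minimal sketch of why the tab-delimited format is so convenient for scripts. It assumes output shaped like `git log --pretty=oneline --numstat`: one “hash subject” line per commit, followed by “added, deleted, path” lines separated by tabs, with “-” in place of the counts for binary files. The function name and the sample data are invented for the example, and a subject line that itself looks like a numstat line would confuse it.

```python
def parse_numstat_log(text):
    """Parse `git log --pretty=oneline --numstat` style output.

    Returns a list of dicts: {"sha", "subject", "files"}, where "files"
    is a list of (added, deleted, path) tuples. Counts are None for
    binary files, which are reported as "-".
    """
    commits = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines carry no information
        parts = line.split("\t", 2)
        is_stat = len(parts) == 3 and all(p == "-" or p.isdigit() for p in parts[:2])
        if is_stat and commits:
            added, deleted, path = parts
            commits[-1]["files"].append((
                None if added == "-" else int(added),
                None if deleted == "-" else int(deleted),
                path,
            ))
        else:
            # Anything else starts a new commit: "<sha> <subject>".
            sha, _, subject = line.partition(" ")
            commits.append({"sha": sha, "subject": subject, "files": []})
    return commits


sample = (
    "abc123 Fix report\n"
    "3\t1\tlib/Report.pm\n"
    "-\t-\troot/logo.png\n"
    "\n"
    "def456 Add stats\n"
    "10\t0\tstats.js\n"
)
for commit in parse_numstat_log(sample):
    print(commit["sha"], commit["subject"], commit["files"])
```

Doing the same with the human-oriented `--stat` output would mean parsing padded columns and `+++---` bars, which is exactly the sort of thing a script should not have to do.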

And then there are the issues with speed. I’m writing a script that generates a summary of the differences between two branches based on which commits have been merged into each branch.

$ time git log --pretty=oneline --numstat brancha..branchb
real 0m1.081s

So, about one second. Not bad for a command that’s summarizing around 400 commits from a 250 MiB repository. Let’s see if Mercurial can keep up:

$ time hg log -r "ancestors('branchb') - ancestors('brancha')" --template "{node} {desc|firstline}\n" --stat
real 3m51.994s

Ok… That command took roughly 215× as long in Mercurial as in Git (about 232 seconds against about 1.1). This is completely unusable. Amusingly, the same repository in Mercurial is around 500 MiB, twice the size! (For the record, it’s mainly the --stat that slows it down. If I remove the stat, Git takes 0.167s and Mercurial 0.421s.)
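The ratios can be checked directly from the `time` figures quoted in this post. This is just a small sketch of the arithmetic; the helper name `parse_real` is made up, and it only understands the `NmS.SSSs` format shown here.

```python
import re


def parse_real(line):
    """Convert a `time` line such as "real 3m51.994s" into seconds."""
    match = re.fullmatch(r"(?:real\s+)?(\d+)m(\d+(?:\.\d+)?)s", line.strip())
    if match is None:
        raise ValueError("unrecognised time format: {!r}".format(line))
    return int(match.group(1)) * 60 + float(match.group(2))


git_with_stat = parse_real("real 0m1.081s")
hg_with_stat = parse_real("real 3m51.994s")
print(round(hg_with_stat / git_with_stat, 1))  # → 214.6

# Without the per-file statistics the gap is far smaller.
print(round(0.421 / 0.167, 1))  # → 2.5
```

So almost all of the difference comes from computing the per-file change counts, not from walking the history.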

So, why is Mercurial around 200× slower to calculate the differences contained in a commit? I don’t know. But they really should fix it.