Wednesday, July 11, 2012

Comparing objects in C++

The standard C++ pattern for comparing and ordering objects is to overload operator<() for your class. This function is what all the standard containers (std::set, std::priority_queue, etc.) and algorithms (std::sort, std::binary_search, etc.) use for comparing objects.

So if you have a class that you would like to put in a standard container, you have to overload operator<(). Your class usually has more than one member, and you usually want to order your objects first by the value of one member, then by the value of another member, and so on. For example, if you have a class person with string members firstname and lastname, and you would like a "phone book" ordering, you would order by lastname first, then by firstname, e.g.:

Anderson, Clive
Anderson, Lucy
Anderson, Thomas
Armstrong, Lance
Armstrong, Neil
...

To achieve this using operator<(), implemented only in terms of std::string::operator<(), you could do something like this:

struct person
{
    std::string firstname;
    std::string lastname;

    bool operator<(const person &other) const
    {
        /* Order first by lastname */
        if (lastname < other.lastname)
            return true;
        if (other.lastname < lastname)
            return false;

        /* The lastnames were equal, so try to order by firstname instead */
        return firstname < other.firstname;
    }
};

This code is not optimal. To see why, consider a case where you have two persons with the same lastname. The first check will fall through (lastname < other.lastname is false); so will the second (other.lastname < lastname is also false). Each check had to loop over the lastname when a single loop should have sufficed.

Fortunately, std::string happens to implement a member function compare() which returns an int. The semantics are similar to C's strcmp(a, b), which also returns an int. The return value is less than 0 if a < b, 0 if they are equal, or greater than 0 if a > b. Using this function, we can implement a much more efficient operator<():

struct person
{
    std::string firstname;
    std::string lastname;

    bool operator<(const person &other) const
    {
        int d = lastname.compare(other.lastname);
        if (d < 0)
            return true;
        if (d > 0)
            return false;

        /* The lastnames were equal, so try to order by firstname instead */
        return firstname < other.firstname;
    }
};

To see the difference in running time, we can implement a simple loop which inserts the same element many times to a multiset:

int main(int argc, char *argv[])
{
    std::multiset<person> persons;

    for (unsigned int i = 0; i < 1000000; ++i)
        persons.insert(person("Thomas", "Anderson with a rather long last name for the sake of the argument"));
}

We exaggerate the number of elements and the length of the lastname in order to amplify the difference. Running this on my laptop, it takes approximately 3.9 seconds to run the original code and 2.9 seconds to run the new version.

This is all well and good, but the actual problem runs a bit deeper than this. What if our code uses other compound data types than strings? Take e.g. std::pair. It is hopelessly broken, precisely because it doesn't implement a compare() function! std::pair only implements operator<() and only expects its elements to implement operator<() too. Have a look at the libstdc++ implementation (simplified):

template<class _T1, class _T2>
inline bool operator<(const pair<_T1, _T2>& __x, const pair<_T1, _T2>& __y)
{
    return __x.first < __y.first
        || (!(__y.first < __x.first) && __x.second < __y.second);
}

Can you spot the performance bug? Yep, it calls checks both __x.first < __y.first and __y.first < __x.first in the worst case. This also means that std::pair<std::string, std::string> has exactly the same problem as our first implementation of struct person.

In order to show just how bad this really is, consider the case where we have nested pairs:

typedef std::pair<int, int> p1; // 2 elements
typedef std::pair<p1, p1> p2;   // 4 elements
typedef std::pair<p2, p2> p3;   // 8 elements
// ...
typedef std::pair<p5, p5> p6;   // 64 elements

This is not a realistic example per se, but it illustrates very well how big the problem really is. Let's assume we have an object x6 constructed in the following way:

p1 x1(0, 0);
p2 x2(x1, x1);
// ...
p6 x6(x5, x5);

What happens when you try to compare x6 to itself like this?

if (x6 < x6)
    /* ... */;

Let's take a step back. x6 has 64 int elements, which can be reached in the following ways:

x6.first.first.first.first.first.first
x6.first.first.first.first.first.second
x6.first.first.first.first.second.first
x6.first.first.first.first.second.second
...

Intuitively, we should only need to perform 128 comparisons in the worst case (two comparisons for each pair of corresponding elements from the two objects). If we had two arrays of 64 ints, we would do it like this:

int compare(const int a[64], const int b[64])
{
    for (unsigned int i = 0; i < 64; ++i) {
        if (a[i] < b[i])
            return -1;
        if (b[i] < a[i])
            return 1;
    }

    return 0;
}

In order to answer the question posed ("How many comparisons between two ints are we actually making when we compare x6 against itself?"), we can simply program it using a wrapper class for int that counts the number of comparisons:

static unsigned int nr_comparisons = 0;

struct myint
{
    int value;

    myint()
    {
    }

    myint(int v):
        value(v)
    {
    }

    bool operator<(const myint &other) const
    {
        ++nr_comparisons;
        return value < other.value;
    }
};

Now we just have to change the definition of p1 from

typedef std::pair<int, int> p1;

to

typedef std::pair<myint, myint> p1;

and finally print nr_comparisons. The answer? 729.

Because std::pair has three comparisons in the worst case, instead of just two (which would have been sufficient), the complexity of comparing nested pairs has just climbed from 2^n (where n is the number of levels of pairs) to 3^n. That's pretty bad!

Part of the solution, as employed by std::string and our improved struct person, is to define the most fundamental comparison between objects as a function that can tell you whether a < b, a == b, or a > b in a single call rather than determine this using two calls. It is clear that the two are semantically equivalent (you can both define operator<() in terms of compare() and the other way around), but one is computationally superior for compound objects.

This is still not the full story, however. We wouldn't get any performance improvement from std::set if we changed it to call compare() instead of operator<() — because it doesn't need to compare both ways! In fact, if we changed it, we would now find that std::set<int> operations take more time, since calling compare() on an int would have to check both a < b and b < a even when std::set only needs to know the result of one of them! (This is of course tighly coupled to the std::set implementation — implemented as a tree, lookups would typically traverse the tree by comparing the element that is searched for with the current node, and update the current node depending on the result of the comparison.)

The real solution, therefore, seems to be to implement both operator<() and compare() for all types. std::pair's would look something like this:

bool operator<(const pair &other) const
{
    int d = first.compare(other.first);
    if (d < 0)
        return true;
    if (d > 0)
        return false;

    return second < other.second;
}

int compare(const pair &other) const
{
    int d = first.compare(other.first);
    if (d != 0)
        return d;

    return second.compare(other.second);
}

Now containers and algorithms can use the most efficient one — std::set would still use operator<() on its elements, but std::pair::operator<() would now use compare() (for its first element), and so on.

Maybe we could adopt Perl's <=> operator for the next version of C++?

Tuesday, January 26, 2010

Space invaders?

After my exams in December, I relaxed by writing a sort of space game using C++/SDL/OpenGL:



I ripped the sprites from Space Invaders and Galaga/Galaxian, please forgive me. If whoever wants to play/hack, the source code can be found here:

http://github.com/vegard/spaceinv

The video is a bit poor, mostly because the frame rate is supposed to be 60 FPS, but the Ogg/Theora capture was 25 FPS while the actual gameplay was probably closer to ~10 while doing the capture. Need to fix that. It looks better when playing.

Thursday, September 10, 2009

First Jato release

So I somewhat promised to say something more on my project for the Google Summer of Code.

I participated for the first time, and it was really a great experience. This is the first time I've been paid to do what I would probably have done in either case (because I immensely enjoy it), namely to work on an interesting free software project. So this time it was Jato, a new implementation of the Java Virtual Machine.

Jato is written in C and aims to be a JIT-only compiler/VM. So before the summer started, it wouldn't run a lot of Java programs. It couldn't even run the trivial "Hello World", but more about that later. It could do a few things like (java.lang.)String operations, but that was mostly because it was riding piggyback on another VM, Jam VM, which would load classes and interpret a lot of the library code (i.e. most of the standard Java API implemented by GNU Classpath).

My own project was to replace the parts that were borrowed from Jam VM with completely new, shiny code. Well, in fact, I had already worked on a standalone library called cafebabe, which would parse the Java .class file and present it to C programs using a very thin layer of structs -- the idea was to do as little as possible with the actual data and just make it easily accessible, e.g. so cafebabe doesn't do any verification of the bytecode or make sure that a class is its own superclass.

So we linked this library with Jato, and that worked pretty well. (I believe strongly in modularisation; with the class loader in a separate library, there's no way to do stupid things like making the class loader itself depend on the rest of the VM for functioning properly.) There were problems, of course, the biggest being that after developing in my own private branch for a couple of weeks, the rest of the project with its three other GSoC student participants was moving ahead too quickly for me to keep up; I had to "fix" just about every new mainline commit, since most of the VM data structures (classes, methods, fields) now had different names, different fields, etc.

At around this point, we started to notice that Jam VM wasn't being used for nearly as little as just class loading. In fact, static initializers (the "<clinit>" methods) and their calls were all being interpreted and executed by Jam VM, not the JIT compiler of Jato. And of course, once we started using Jato for those methods which had previously been executed by Jam VM, we found tons of bugs everywhere: The core VM (mostly my code replacing that of Jam VM), including improperly implemented Java semantics, missing bytecode instruction handlers, and the compiler (instruction selection, liveness analysis, register allocation). But it was fun, and extremely educative.

Did you know that a simple System.out.println() will make a whopping 650550 Java-method calls? That's using GNU Classpath for the Java API.

So we had to implement a lot of stuff just to make that work. In fact, we have so much infrastructure that we can even run a few other programs. And thus it was decided that it was time to make a release. The version 0.0.1 announcement:

http://thread.gmane.org/gmane.comp.java.vm.jato.devel/1725

The "next big" missing feature is garbage collection. I expect that'll be version 0.0.2. In other words, if you're interested in Java, garbage collection, or just compilers in general, you're welcome to join us...

Our team consisted of:
  • Pekka Enberg; Original creator, project manager and mentor.
  • Tomek Grabiec; His original project proposal included implementing exception handling, but he ended up somwhat like Ingo Molnar of the Linux kernel; Tomek also implemented a generic radix tree, implemented the basic threading support (java.lang.Thread), rewrote large parts of the register allocator, wrote the VM side of the JNI interface from scratch, and also did a lot of other core-VM work. Tons of other things too, including a lot of debugging.
  • Arthur Huillet; Floating-point support, register allocator fixes, other missing bytecodes, also a lot of debugging. Various other things.
  • Eduard-Gabriel Munteanu; Ported Jato mostly to x86-64. A tough task, since the i386 parts were moving at close to light speed.

The Jato IRC logger

Here's the recipe for the Jato IRC logger. Nothing fancy, but works surprisingly well.

/home/vegard/jato-irc-logger/irssi-config:
settings = {
core = {
real_name = "#jato IRC logger";
user_name = "vegard";
nick = "jato-irc-logger";
};

"fe-text" = {
actlist_sort = "refnum";
};

"fe-common/core" = {
autolog = "Yes";
autolog_path = "logs/$0/%Y-%m-%d.txt";
};
};

servers = (
{
address = "irc.freenode.net";
chatnet = "Freenode";
port = "6667";
autoconnect = "Yes";
},
);

chatnets = {
Freenode = {
type = "IRC";
autosendcmd = "/^msg nickserv identify vegard password";
};
};

channels = (
{
name = "#jato";
chatnet = "Freenode";
autojoin = "Yes";
},
);


/home/vegard/jato-irc-logger/screenrc:
screen irssi --config=irssi-config


/home/vegard/jato-irc-logger/start-logger.sh:
#! /bin/bash -e

screen -c screenrc -dmS jato-irc-logger


crontab:
# m h  dom mon dow   command
0 * * * * rsync -r -t --chmod=a+r jato-irc-logger/logs/#jato/ vegardno@hostname:www_docs/jato-irc-logs


/etc/rc.local:
#!/bin/sh -e

cd /home/vegard/jato-irc-logger
sudo -u vegard ./start-logger.sh &

exit 0

Wednesday, June 17, 2009

kmemcheck in mainline

First of all, I should say that on April 22, I went to Denmark to give a talk on kmemcheck. I was invited to DIKU (Datalogisk Institut på Københavns Universitet) by Julia Lawall, who held a workshop on Coccinelle (or, more generally, "finding bugs in operating system software"). It was really nice to be there, not (just) because I got a chance to talk about kmemcheck, but because I learned so much from all the other talks! Feel free to check out my slides.

I had asked my university (the University of Oslo) beforehand to sponsor the trip in exchange for a trip report, but I got no reply to my e-mail. I guess they don't see any value in collaboration with universities abroad.

And now, for the news: Yesterday, kmemcheck was merged in mainline Linux! It is quite an incredible feeling after having spent so much time to make it work properly... (Not to mention the rebase marathon that ensued after Linus initially rejected it)

I had also asked Redpill Linpro (a Norwegian/Swedish open source developer) beforehand to sponsor my work on kmemcheck for the summer, but they apparently weren't interested. At least I got a polite reply. (Instead, I got a stipend from Google for working on Jato for the summer, but more on that in a later post!)

Thursday, May 21, 2009

Hiragana tutor

An evening spent with Unicode charts, HTML, CSS, and JavaScript resulted in this:

The online Hiragana tutor

If you find any errors, please tell me.

Tuesday, May 5, 2009

LADSPA

Hurrah for LADSPA.

I started (yet another) project, this time it's a sort of audio synthesisizer and sequencer library. I've always been fascinated with digital audio, since there are so many cool things we can do with it! Like e.g. create and apply filters, warp the sound waves any way we want, add cool effects, and generally do things that I can't otherwise do. Not to mention that I like music... but who doesn't? :-)

My program isn't revolutionary in any way; in fact, the short story is that the program tries to "simulate" the way real, analogue audio components can be put together, as e.g. an effect rack (the guitar connects into a distortion module, which connects into a reverb module, which connects to the amplifier, which connects to a mixer board, which connects to the speakers, etc.). I stole most of the concepts from an existing non-free (but very nice) program called Reason [http://www.propellerheads.se/]. I'm sure there are also free programs that do what I'm trying to do (but part of the fun is doing it myself).

One really nice thing is that there exist a lot of free plugins that do most of the work when it comes to synthesizing instruments and applying effects. These plugins are implemented using a free library, called LADSPA (Linux Audio Developer's Simple Plugin API) [http://www.ladspa.org/], which is the interface between host programs (such as the one I'm making, but also others, like Audacity) and the plugins.

So the plugins form a directed, acyclic graph, where the nodes with no outgoing edges are the speakers (or, well, sound card), and the leaf nodes are typically synthesizers (instruments). Arranging the plugins in this way is a really nice thing, because it ensures that there are no vicious feedback cycles (if we really want some kind of feedback, like reverb, we can implement it as a plugin!) It also means that we can do a topological sort to find the order in which the plugins are to be "run".

Anyway, I've implemented these plugins for my program: A LADSPA plugin wrapper, so we can load any LADSPA plugin as a plugin for my program. An ALSA playback plugin for playing audio in real time. A WAV playback plugin for streaming to disk using libsndfile, which supports a few more formats beside WAV, by the way. A MIDI input sequencer, which reads MIDI files and presents the notes as "control output" to the synthesizer plugins. A mixer plugin, which simply combines several inputs into one output.

It would probably be possible to create an ALSA capture plugin for capturing audio in real time as well. That would be pretty cool, and we'd have a software guitar amp in no time. There seems to be a problem with using blocking ALSA streams, though, because the application sleeps while the sound card is playing the data, and we're left with almost no CPU-time to actually synthesize the sound. This can probably be solved using threads, though.

Anyway, I made something cool (I think, anyway), by hooking the CMT organ [http://www.ladspa.org/cmt/] to a plate reverb [http://plugin.org.uk/], and feeding some classical pieces into the MIDI sequencer (which now supports polyphonic songs!):


Will probably work some more on this, try to write some plugins (I have a few ideas), actually implement error handling (oops), and try to add the missing fundamental features (like reading the correct MIDI tempo, stop output at the end of the MIDI file, etc.).

One of the goals of the library is to be small and efficient, and not accept inferior solutions because they're easier to implement (some trade-offs are allowed). The next major feature will probably be to add multi-core support using pthreads, so that all CPU cores can be utilized.

The code itself can be found at github (as usual): http://github.com/vegard/trick2