Archive for the ‘Computing’ Category

openSUSE 12.3 installer formatting encrypted devices

Wednesday, April 17th, 2013

Hi, this morning I wanted to replace my openSUSE 12.2 with openSUSE 12.3 by doing a fresh installation from a USB stick (using a recent image of a KDE live system). I am using an encrypted partition to store the volumes of an LVM volume group (root, home and swap). root should get reformatted (from btrfs back to ext4), but of course I chose to keep my home-partition: it should get reused without being formatted. I entered this setup in the graphical installation program (YaST installer). I had done that before, quite often, so what should go wrong? I went on without noticing the entry “Set up encrypted DM device on /dev/sda2”. What does that mean? Will it create some entries in configuration files for handling the encrypted device? No, it means that it will try to reformat the entire encrypted partition using “cryptsetup luksFormat”, thus wiping out all volumes stored there although it was only told to format a single volume.

Of course my last backup was quite old… but I was lucky: when reinstalling I always mount my home-partition, both for accessing my files and for noticing when the installation program tries to touch the volume. I could not imagine that this would happen and thought this would be a quite useless precautionary measure, but it actually saved my day, and not only this day: the installer (or cryptsetup or some layer in between) noticed that there was a mounted partition using this device and refused to format /dev/sda2. At first I did not understand the error messages, but then, when realising the situation, I was shocked. Nervously unmounting the partition would have caused a total loss.

Afterwards there were even more surprises: This is a known bug! It had been reported before the release of 12.3. According to the report it can even wipe out your data although you did not press the button to start the installation! And it also causes the formatting of LUKS-encrypted devices which are completely unrelated to the installation, they just have to be open. Later I retried the installer with an external hard drive plugged in, containing an open encrypted LVM with ext4/btrfs volumes. I did not select it in the installation, but the installer wanted to format the partition (and it still wanted to format the partition containing my root volume, although there was no btrfs volume left, iirc). The bug had been marked as critical and as a potential “ship stopper”, but then somebody decided that encountering this bug would be too unlikely! Well, they might not have known all situations where this bug can occur, the particular steps to reproduce described in the report may appear unlikely, and the bug may have been there before without anybody complaining. But, seriously, they wanted to risk total data loss just because a particular situation seems to be unlikely, although it actually happened? And of course you never know for sure whether such a critical bug may affect more situations than you thought.

The bug was fixed in the libstorage codebase before the release. However, the fix was not included in the release, and the disk images provided still contain the bug and might erroneously format some partitions. About two weeks after the release there were updates to libstorage and yast2-storage. After having updated the packages in the live system the installation worked as expected and I am currently using it. But my confidence in openSUSE releases has been shaken.

MeeGo? Where are you?

Monday, July 9th, 2012

There is a rumour that a small Finnish company called Jolla, led by ex-Nokia employees, is trying to revive MeeGo (here, here and here). Is there a new chance for a real GNU/Linux mobile operating system with great hardware support, being free like in freedom, not free like in Android? And they want to use Qt instead of this hip, kinky HTML5+JavaScript (cf. Tizen). Sounds like really good news, but unfortunately I am not that confident. There have been so many setbacks. Maybe it will stay a dream in the near future. :( What are your opinions about it?

Yes, I know about Mer. However, I cannot judge its current state. Of course I hope there will not be too much fragmentation of Mer/MeeGo/Jolla/Tizen.

KDevelop-PG-Qt: Generalised Expression Parsing

Sunday, February 26th, 2012

Hi guys and non-guys,

apparently I had no motivation for blogging in the last few months, however…

Currently (once in a while) I am porting the KDevelop-PG-Qt grammar from Flex+Bison to KDevelop-PG-Qt: a parser generator should trust itself, and I hope it will increase the quality of the code (and of course it will allow using Unicode). Some time ago I implemented specialised parsing for (arithmetic) expressions: the recursive descent approach is not very good in those cases, you have to take care not to use left-recursion and you will get very, very deep syntax trees, even for simple expressions like “0”, thus increasing memory usage significantly (e.g. in KDevelop’s PHP plugin). What was the approach? You can specify postfix, prefix, binary and ternary operators and parentheses. For each operator you have to specify the type of the tokens which should be interpreted as operator and a priority (higher ≙ stronger binding); additionally, binary and ternary operators have an associativity. I implemented it using a shunting-yard-like approach (actually without knowing that term at that time).
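
To make the idea of an operator table with priorities and associativities a bit more concrete, here is a minimal C++ sketch of a plain shunting-yard conversion to reverse Polish notation. It is not the KDevelop-PG-Qt implementation: the names OpInfo and toRpn are made up for illustration, tokens are simply strings, and only binary operators and parentheses are handled (no prefix, postfix or ternary operators).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical operator description: priority (higher binds stronger) and associativity.
struct OpInfo {
    int priority;
    bool rightAssoc;
};

// Converts a tokenised infix expression to reverse Polish notation.
// Every token that is neither an operator nor a parenthesis counts as an operand.
std::vector<std::string> toRpn(const std::vector<std::string>& tokens,
                               const std::map<std::string, OpInfo>& ops)
{
    std::vector<std::string> output;
    std::vector<std::string> stack; // pending operators and "("
    for (const std::string& tok : tokens) {
        auto it = ops.find(tok);
        if (it != ops.end()) {
            // Pop operators binding stronger, or equally strong if tok is left-associative.
            while (!stack.empty() && stack.back() != "(") {
                const OpInfo& top = ops.at(stack.back());
                const bool pop = top.priority > it->second.priority
                    || (top.priority == it->second.priority && !it->second.rightAssoc);
                if (!pop) break;
                output.push_back(stack.back());
                stack.pop_back();
            }
            stack.push_back(tok);
        } else if (tok == "(") {
            stack.push_back(tok);
        } else if (tok == ")") {
            while (!stack.empty() && stack.back() != "(") {
                output.push_back(stack.back());
                stack.pop_back();
            }
            assert(!stack.empty()); // otherwise the ")" has no matching "("
            stack.pop_back();       // discard the "("
        } else {
            output.push_back(tok);
        }
    }
    // Flush the remaining operators (an unmatched "(" is silently dropped in this sketch).
    while (!stack.empty()) {
        if (stack.back() != "(") output.push_back(stack.back());
        stack.pop_back();
    }
    return output;
}
```

With a table containing "+" (priority 1), "*" (priority 2) and a right-associative "^" (priority 3), the tokens 1 + 2 * 3 come out as 1 2 3 * +, and 2 ^ 3 ^ 2 as 2 3 2 ^ ^.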

However, when porting the grammar of KDevelop-PG-Qt I noticed that this approach did not suffice, but of course I wanted to use the specialised expression parsing for regular expressions etc.: I needed an “empty binary operator” for the usual concatenations in regular expressions (you can just write ab and do not have to use an operator to join them, e.g. a.b). It had been on my TODO list for a long time, and now I had to implement it. Well, when you have to change something, of course you want to choose a more general approach, thus instead of adding a special syntax for “concat-operators” or something like that I allowed using arbitrary expressions as operators (I had already tinkered with that idea a long time ago): the parser will parse whatever syntactic unit is specified as operator instead of eating a single token. Now to the interesting observation I want to tell you about. Let us start with an example: with this new feature we could implement Knuth’s hyperoperator: 3↑3, 3↑↑3, 3↑↑↑3 etc., the operator would be specified by the expression “UP_ARROW+” or something like that. Interestingly this is kind of a ternary operator: we can specify an arbitrary integer for the number of arrows, and the number is encoded in unary representation. Let us choose a more comfortable notation: 3[12]3 for twelve arrows, or why not 3[1+3*2]3? Now it is obviously a normal ternary operator taking the arithmetic expressions as parameters, but we can interpret it as a binary operator where the operator symbol itself is an expression surrounded by brackets. In terms of the classical ternary operator: “cond ? opt1 : opt2” is a binary expression, where “? opt1 :” is the operator.

In fact we can handle any n-ary infix operators that way as binary infix operators, n-ary prefix operators as unary prefix operators and n-ary postfix operators as unary postfix operators, since the inner arguments do not matter for precedence. The expression parsing problem thus gets reduced to these three basic cases and parentheses, of course excluding certain kinds of clashes between the symbols used (e.g. a colon for both a ternary (cond ? opt1 : opt2) and a binary (x:y) operator), where probably all usual linear-time parsing algorithms will fail and which are unsuitable for programming languages. This allows a great simplification of the parsing algorithm: I had a lot of code dedicated to handling ternary operators, and when composing the syntax tree the code had to look for them explicitly (of course now it is obvious that this was unnecessary). For now I have removed the support for ternary operators. Later I might add syntactic sugar to implement such operators easily without caring about this technique; for that I have to allow custom fields in the AST for specific operators, but the shunting yard routine will not have to be aware of this. I will stop with those details. I hope you got the interesting point: how shunting yard can be used in a very general way, and how expression parsing gets simplified to the few basic cases found in most programming languages by allowing everything to be an “operator”.

Back from NWERC 2011

Monday, November 28th, 2011

Hi folks!

This weekend I participated in the North Western European Regional Contest, a regional preliminary round for the ACM ICPC. It is an international student contest about solving as many as possible out of 10 algorithmic problems in 5 hours, in teams of three students with one PC. It was my first time there and it was quite fun.

On Saturday we visited the historic harbour of Bremen Vegesack, which was partly boring and partly quite nice. On Sunday there was the actual contest at the quite decadent and walled campus of Jacobs University in Bremen. You even have to pay tuition of ~€25000 (~$33000) per year there, an amount I could live on for a long time; at public universities you pay about €400-1400, and even at elitist business universities like EBS and WHU you have to pay much less. We used KDevelop 4.2 for solving the tasks and enjoyed the nice completion, source code reformatting (nice when you are not sure any longer which braces belong together and do not have much time :)), and inline documentation of STL functions. Thank you, KDevelop guys. :)

Our team from RWTH solved only three out of 10 tasks (30th place out of 70). B and E were quite trivial (I did not even have to read B since my team members solved it). I want to tell a bit about the other problems; I hope it is interesting for some of you. Of course you may try to solve the problems yourself, it is quite a lot of fun. ;)


Given a natural number x<10^15 we should compute all n,k such that binomial(n,k)=x. It was quite simple, but we had a bug in our implementation and I do not have any clue what the problem was. We chose to check the cases binomial(n,1)=n and binomial(n,2)=n·(n-1)/2 manually and brute force the solution for all other cases by computing lines of Pascal’s triangle for k values such that binomial(n,k)≤x and terminating when binomial(n,2)>x. Alternatively you could check more terms explicitly (the team from LMU did it) and compute the whole triangle for the remaining values or use binary search to find the parameters.
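
The binary search variant mentioned at the end could look roughly like the following C++ sketch. It is not our contest code; the function names are invented here, x ≥ 2 is assumed, and unsigned __int128 (a GCC/Clang extension) is used so that intermediate products stay exact.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// binomial(n, k) for n >= k, but capped: returns limit + 1 as soon as the value exceeds limit.
std::uint64_t cappedBinomial(std::uint64_t n, std::uint64_t k, std::uint64_t limit)
{
    unsigned __int128 result = 1;
    for (std::uint64_t i = 1; i <= k; ++i) {
        // After this step result == binomial(n - k + i, i), so the division is exact.
        result = result * (n - k + i) / i;
        if (result > limit) return limit + 1;
    }
    return static_cast<std::uint64_t>(result);
}

// All pairs (n, k) with binomial(n, k) == x, sorted by n and then by k.
std::vector<std::pair<std::uint64_t, std::uint64_t>> binomialPreimages(std::uint64_t x)
{
    assert(x >= 2);
    std::vector<std::pair<std::uint64_t, std::uint64_t>> found;
    // Only k <= n/2 is searched; the mirrored solution (n, n - k) is added afterwards.
    for (std::uint64_t k = 1; cappedBinomial(2 * k, k, x) <= x; ++k) {
        // binomial(n, k) grows with n, and binomial(x, k) >= x, so the answer lies in [2k, x].
        std::uint64_t lo = 2 * k, hi = x;
        while (lo < hi) {
            const std::uint64_t mid = lo + (hi - lo) / 2;
            if (cappedBinomial(mid, k, x) < x) lo = mid + 1;
            else hi = mid;
        }
        if (cappedBinomial(lo, k, x) == x) {
            found.emplace_back(lo, k);
            if (lo != 2 * k) found.emplace_back(lo, lo - k);
        }
    }
    std::sort(found.begin(), found.end());
    return found;
}
```

For x = 10 this yields (5,2), (5,3), (10,1) and (10,9). Since binomial(2k,k) exceeds 10^15 for fairly small k, the outer loop only runs a few dozen times.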


Given a sequence of films watched by a person, we should answer for each time position how many different films have been watched since the last time he watched the current one. By maintaining a binary tree containing the films ordered by the last time they were watched, and a map containing this time for each film, these queries can easily be answered if you have a way to get the number of nodes after a specific one in the tree in O(log n). For that you would have to store the size of each subtree, and STL set/map unfortunately do not support this. Thus we decided to write our own tree; we went for a treap with random priorities since it is the easiest way to avoid degeneration (easier to implement than AVL and red-black trees). Unfortunately we did not have enough time for debugging the tree and we even forgot to update the counters when balancing the tree after insertion. There were simpler solutions using segment trees.
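
As an illustration of such a simpler solution, here is a small C++ sketch using a Fenwick tree (binary indexed tree) instead of a treap or segment tree. This is a swapped-in technique, not what we wrote at the contest: the tree keeps a 1 at the latest position of every film seen so far, so the number of marks between two viewings is the number of distinct films in between. The names and the convention of returning -1 for a first viewing are assumptions of this sketch, not part of the task statement.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Fenwick (binary indexed) tree over positions 1..n with point updates and prefix sums.
class Fenwick {
public:
    explicit Fenwick(int n) : tree_(n + 1, 0) {}
    void add(int pos, int delta) {
        assert(pos >= 1);
        for (; pos < static_cast<int>(tree_.size()); pos += pos & -pos) tree_[pos] += delta;
    }
    int prefixSum(int pos) const {
        int sum = 0;
        for (; pos > 0; pos -= pos & -pos) sum += tree_[pos];
        return sum;
    }
private:
    std::vector<int> tree_;
};

// For every position: number of distinct other films watched since the previous viewing
// of the same film, or -1 if the film has not been watched before.
std::vector<int> distinctSinceLastViewing(const std::vector<int>& films)
{
    const int n = static_cast<int>(films.size());
    Fenwick latest(n);                    // 1 at the latest position of each film seen so far
    std::unordered_map<int, int> lastPos; // film id -> latest 1-based position
    std::vector<int> answer(n, -1);
    for (int i = 1; i <= n; ++i) {
        const int film = films[i - 1];
        auto it = lastPos.find(film);
        if (it != lastPos.end()) {
            const int prev = it->second;
            // Marks strictly between prev and i: exactly one per distinct film.
            answer[i - 1] = latest.prefixSum(i - 1) - latest.prefixSum(prev);
            latest.add(prev, -1); // prev is no longer the latest viewing of this film
        }
        latest.add(i, 1);
        lastPos[film] = i;
    }
    return answer;
}
```

For the sequence 1, 2, 3, 1, 2, 1 the result is -1, -1, -1, 2, 2, 1. Each position costs O(log n), so the whole sequence is processed in O(n log n).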

For this task there were particularly many wrong submissions. The man presenting the solutions assumed that many teams tried to solve this one because the first team solved it after only 20 minutes, so they thought it would be the easiest one. He even presented a very funny solution using a linear-time Java indexOf() call, which was of course way too slow.


This task was about deciding whether it is possible to partition a given pattern of black and white squares into disjoint, L-shaped “puzzle pieces” consisting of one black square (in the corner) and two white squares. We had no idea how to solve that; we thought a lot about backtracking, but there seemed to be no clever way to do that in polynomial time. The trick was to encode this problem in Boolean formulas and to use a standard algorithm (e.g. resolution) for the 2-SAT problem, using variables encoding “white square x,y belongs to the same puzzle piece as the white square x+1,y+1” etc.


Nobody solved this problem at the contest (though there were some people at the semi-live contest at Spoj solving it; a coach of another team said he knew a very similar task from a different contest). Look at it, it looks like really complicated spam. Only one team tried. Cool geometric structures, quad-trees etc. will not help you to solve this problem: 500×500 fields, each one is either water or grass. You want to build a pool area. The border has to be grass and every pool of water has to be surrounded by a wall, which will cost $a per wall. You can transform water fields into grass, which will cost $b per field, and grass into water for $c. You want to build the cheapest pool area possible.

Actually it is a simple max-flow problem, since a minimum cut in a certain flow network encoding the landscape is the optimal solution. You have a source, a sink and a node for every field. A cut should divide the nodes into water nodes and grass nodes. Source-to-x edges should have a capacity equal to the price needed to transform x into a grass field, and x-to-sink edges should have a capacity equal to the price needed to transform x into a water field, unless the node represents a field at the border, in which case the capacity should be infinity. Additionally you add x-to-y edges with a capacity equal to the price of building a wall for adjacent nodes x and y. That is it (if I am not wrong, I hope so :D). Really awesome.


I will post a full description later. For each observation you can derive an inequality, and you can complete the information in a Floyd-Warshall-like manner.


Spam. You have 13 out of 52 regular playing cards and want to partition them into as few bunches as possible, where each bunch is a single card, a pair, a triple, a quadruple, a full house or a straight of at least five consecutive cards. You can simply use brute force: test all combinations of up to two full houses and up to two straights (there cannot be more for 13 cards) and greedily take the remaining cards as quadruples, triples, pairs and single cards. We tried that with a lot of copy-and-paste code and our program printed wrong answers, but we have no idea in which cases it failed. Alternatively you can choose a probably more compact solution by building a directed acyclic graph of all 2^13 combinations of your cards, where edges represent adding one of the allowed bunches. Then you use breadth-first search to get the solution. This was one of those typical tasks with useless information in the input (cards were encoded as 4h for ♥4 and Jd for a ♦Jack, in fact the suits did not even matter).


Given integer coordinates of up to 10000 RFID chips, 250000 RFID sensors and 25 walls, compute which sensors can detect which chips. Each sensor can detect any chip within a radius of r, while each wall in between reduces the radius by 1. No two sensors are closer to each other than r (“due to interference”^^). We used a sweep-line algorithm. Whenever the sweep line reaches the value x where x+r is the abscissa of a chip, we add the chip to an ordered binary tree (std::set in C++) ordered by the ordinate. When reaching the value x where x-r is the abscissa of a chip, we erase it from the tree. If x-r is the abscissa of a sensor, we query all chips in the tree with ordinates between y-r and y+r using lower_bound and upper_bound, where y is the ordinate of the sensor. We count the number of walls intersecting the line between sensor and chip (we check that by adding up angle(p1, q1, p2)+angle(q1, p2, q2)+angle(p1, q2, p2)+angle(q1, p1, q2), which should be 2π (aka 360 degrees) for intersecting line segments) and check whether the chip is within the resulting reduced radius. This solution has a complexity of O((n+m)log(n+m)+n·w), where n is the number of chips, m the number of sensors and w the number of walls. This is guaranteed since the distance between any two sensors is at least r, thus at most 7 sensors can detect one chip (imagine a regular hexagon of sensors with one sensor in the center). Thus for many walls you would probably require a more sophisticated algorithm taking walls into account when sweeping.
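
The wall check can also be done without any angles. The following C++ sketch uses the common orientation (cross product) test instead of the angle sum described above, so it is an alternative and not our contest code. All names are made up for illustration, and it is an assumption of this sketch that a wall which is merely touched by the line of sight already counts as crossed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Point { long long x, y; };
struct Wall { Point a, b; };

// Sign of the cross product (b-a) x (c-a): 1 for a left turn, -1 for a right turn, 0 if collinear.
int orientation(const Point& a, const Point& b, const Point& c)
{
    const long long cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (cross > 0) - (cross < 0);
}

// True if c lies within the bounding box of segment ab (enough when a, b, c are collinear).
bool onSegment(const Point& a, const Point& b, const Point& c)
{
    return std::min(a.x, b.x) <= c.x && c.x <= std::max(a.x, b.x)
        && std::min(a.y, b.y) <= c.y && c.y <= std::max(a.y, b.y);
}

// True if the closed segments p1p2 and q1q2 share at least one point.
bool segmentsIntersect(const Point& p1, const Point& p2, const Point& q1, const Point& q2)
{
    const int o1 = orientation(p1, p2, q1);
    const int o2 = orientation(p1, p2, q2);
    const int o3 = orientation(q1, q2, p1);
    const int o4 = orientation(q1, q2, p2);
    if (o1 != o2 && o3 != o4) return true; // the endpoints lie on different sides
    // Collinear and touching cases.
    if (o1 == 0 && onSegment(p1, p2, q1)) return true;
    if (o2 == 0 && onSegment(p1, p2, q2)) return true;
    if (o3 == 0 && onSegment(q1, q2, p1)) return true;
    if (o4 == 0 && onSegment(q1, q2, p2)) return true;
    return false;
}

// True if the sensor detects the chip: every wall crossed by the line of sight
// reduces the detection radius r by 1.
bool detects(const Point& sensor, const Point& chip, long long r, const std::vector<Wall>& walls)
{
    assert(r >= 0);
    long long reduced = r;
    for (const Wall& w : walls)
        if (segmentsIntersect(sensor, chip, w.a, w.b)) --reduced;
    if (reduced < 0) return false;
    const long long dx = sensor.x - chip.x, dy = sensor.y - chip.y;
    return dx * dx + dy * dy <= reduced * reduced; // compare squared distances, no sqrt needed
}
```

Everything stays in integer arithmetic, which avoids the rounding problems an angle sum with floating-point numbers can have.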

Surprisingly only 7 teams solved it, I thought it would be kinda a standard-geometric-sweep-line-problem, it was even easier than E for us (there we used int instead of long long :D). Alternatively you can use binning (since the coordinate space is limited) or quad-trees to solve this problem.


You have a couple of train stations and ≤1000 connections between these stations. Each connection runs once an hour at a certain minute (e.g. 13:47, 14:47 and so on). However, there is a certain probability that the train will be late. If it is late, the delay is uniformly distributed in a range from 1 to some number ≤120. You have to compute the minimum expected duration of a trip from the start to the target. We solved that problem by a fixed-point iteration: for each station and each minute (0≤m<60) we store the best known expected duration to the target. We propagate that information till it does not change any longer. Unfortunately it timed out, probably because of floating point issues: we recognised changes which were insignificant or caused by rounding. Only one team solved this problem, three teams tried.


Yesterday I returned from Bremen, entered the first train at about 17:00, arrived at 0:45 and slept till 12:00. :) It was a really nice experience. If you are a student, like algorithms and want to participate: go ahead, look for two comrades, ask your professor to register you and join your regional contest! :) (at some universities there is a local preliminary contest)

Some visual impressions:

[Photo: Coloured wave]

The camera of my N97 mini mobile phone had a bug or something like that, however, when trying to photograph an ugly super market or something like that in the early evening and accidentally moving the camera too early I got this nice photo. Could not reproduce. :(

[Photo: Some of my fingers] Could not reproduce :(

The contest was sponsored by Facebook and we had to attend a boring presentation by a Facebook employee. It was quite funny how he tried to advertise working there. He assumed everybody in the room would use Facebook, what? We could get free t-shirts when answering his silly questions, did he think we would like to work there because of a t-shirt? It happened multiple times that he said he was not allowed to talk about xy; really a great opportunity to work for a company where you cannot even superficially talk openly about the things you are doing there. And he told us that they would finally delete all messages six days after you have pressed the delete button, haha. :D

I have pushed some solutions here.

Switched to LuaLaTeX

Monday, November 28th, 2011


Have you ever tried to write a slightly complex command in LaTeX? I did on some occasions, and finally it somehow worked, but it has always been ugly. However, there is LuaTeX/LuaLaTeX, which provides real scripting within your documents:

  for i=0, 15 do
    tex.print("Math: $x_{" .. i .. "}$")
  end

That is just awesome, in plain LaTeX this would be ugly, but it gets even more ugly if you have to deal with floating points, external files etc. Well, for now I do not need any complex macro, so I cannot talk about actual experiences with Lua stuff, but I encountered some problems when translating my LaTeX document to LuaLaTeX two weeks ago.

unicode-math does not work properly

When enabling the unicode-math package for using Unicode characters in formulas (⊂∀∃∂ etc.) I have to select a font using the \setmathfont command, otherwise Unicode symbols will not get displayed. I tried “Latin Modern Math”, “STIXGeneral”, “XITS Math”, “Neo Euler” and “Asana Math”. However, with all of these fonts formulas do not look as good as with the standard LaTeX lmodern package. That package is usable from LuaLaTeX, too, but \setmathfont will override it. Some of the fonts are too bold, some have ugly ℝ, ℚ, ℂ (\mathbb) and \mathcal symbols etc. Thus I decided to port the uniinput package provided by the Neo project (they create the keyboard layout I am using) to LuaLaTeX. I thought it would be a nice opportunity to check the Lua capabilities, however, I faced the next problem.

Lua is not Unicode aware

That is really annoying. LuaLaTeX claims to provide a) sophisticated scripting and b) native Unicode support. However, they chose Lua as scripting language, which does not support Unicode natively. I could not find the functions I needed to write a succinct macro for declaring a Unicode character in math mode (for example ℝ should be replaced with \mathbb{R}): simply something to split an 8-bit UTF-8 string into its Unicode characters and to do conversions between character codes and strings. I did not want to write it myself. Thus I chose a quick and dirty way: using some regexp magic and a Ruby script to convert uniinput.sty into uniinput-lualatex.sty. It works now, you can use it if you want to…
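
For readers wondering what the two missing operations amount to, here is a minimal sketch in C++ (not Lua, and not part of uniinput): splitting a UTF-8 byte string into code points and converting a code point back into UTF-8 bytes. The function names are invented for illustration, and the decoder assumes well-formed input, it does not reject overlong encodings or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a UTF-8 encoded byte string into Unicode code points (assumes well-formed input).
std::vector<char32_t> decodeUtf8(const std::string& bytes)
{
    std::vector<char32_t> codePoints;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra; // number of continuation bytes following the lead byte
        char32_t cp;       // payload bits collected so far
        if (lead < 0x80)           { extra = 0; cp = lead; }        // 0xxxxxxx
        else if (lead >> 5 == 0x6) { extra = 1; cp = lead & 0x1F; } // 110xxxxx
        else if (lead >> 4 == 0xE) { extra = 2; cp = lead & 0x0F; } // 1110xxxx
        else                       { extra = 3; cp = lead & 0x07; } // 11110xxx
        for (std::size_t j = 1; j <= extra && i + j < bytes.size(); ++j)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + j]) & 0x3F);
        codePoints.push_back(cp);
        i += extra + 1;
    }
    return codePoints;
}

// Encodes a single code point as UTF-8 bytes.
std::string encodeUtf8(char32_t cp)
{
    assert(cp <= 0x10FFFF);
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example ℝ is U+211D, which is stored as the three bytes E2 84 9D, so decodeUtf8 turns those three bytes into the single value 0x211D and encodeUtf8 does the reverse.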

Making it work with KileIP

KileIP currently has the latex command hard-coded to display previews of formulas. I was too lazy to fix that and I wanted to be able to fall back to LaTeX if there were unexpected problems, thus I made my document work with both LaTeX and LuaLaTeX:


Well, next time I need a complex macro I will certainly use Lua and it will hopefully work with my setup. :)

KDevelop-PG-Qt 1.0 Beta

Tuesday, October 4th, 2011

Today KDevelop-PG-Qt 1.0 Beta (aka 0.9.82) got released, the parser generator used by several KDevelop (and Quanta) language plugins. There are some new features, and various bugs have been reported in the last few months since 0.9.5 and are now fixed.

New features

Most effort has been spent on implementing the lexer generation. You can now write a simple specification of the lexical structure using regular expressions. The generated token stream class, which generates tokens from the input data, can be used directly by the generated parser, but you can also just create a lexer without a parser. Thus, despite the name, KDevelop-PG-Qt is no longer just a parser generator. My motivation for writing the lexer generator was the lack of decent Unicode support in most lexers. Quex is quite good, but it is not free software (despite the LGPL-based license) because it excludes military usage. The lexer can not only read different encodings, but also use different encodings for internal processing, e.g. it can either convert a UTF-8 stream to UTF-32 first or directly operate on UTF-8 bytes. Simple example:

%lexer ->
    for FOR ;
    "some keyword" SOME_KEYWORD ;
    {alphabetic}* WORD ;

It also includes limited support for look-ahead (full support like in Flex would make the generated code more complicated and inefficient), and a similar (but in my experience more useful for modern languages) feature called “barriers”, which will be explained in the documentation soon.

The API used by the generated files has been cleaned up, some method names were very counterintuitive, some methods were not used at all. This break requires updating KDevelop-PG-Qt to build the latest versions of PHP, QMake, SGML and CSS plugins (you may have to do a clean build).

There are also some minor new features: AST structs now get forward declared, reducing dependencies; code which relied on the definitions in some places was fixed some time ago. The token types do not get declared by the parser any longer. Additionally you no longer need “;;” in the grammar files, though it is still supported for compatibility.

Bug fixes

The bug fixes include proper line numbering. With GCC you can now see correct line numbers referring to the location of wrong code in the grammar file; for other compilers you may have to activate --use-line-directive, then it will use the #line directive instead of the GCC-specific syntax, but you will not see the location in the generated code. The CMake macro will do that automatically. Some compatibility errors have been reported by Fedora and Debian packagers and fixed, special thanks to them. KDevelop-PG-Qt builds with QT_STRICT_ITERATORS now and also builds under Windows with MSVC (MSVC is not perfect for C++11 and Flex is not perfect for Windows). Annoying wrongly formatted error messages in the lexer and parser of KDevelop-PG-Qt have been fixed, too.

Some bug fixes might still be necessary, thus there is this beta version first. There is some work left to do: the documentation and the Kate grammar file for proper syntax highlighting have to be updated.


This is the second release using C++11 features, especially the auto keyword and STL hashes (unordered_map and unordered_set). I had used variadic templates before to construct UTF-8 automata, but for compatibility with MSVC it has been replaced with a Ruby script generating some code, the variadic template had been quite ugly anyway.

Future development of KDevelop-PG-Qt

  • Parser and lexer should be rewritten using KDevelop-PG-Qt
  • Long requested: there should be a way to mark conflicts as resolved; for that the conflict reporting has to be refactored a bit
  • The next release will hopefully support LL(k) parsing, making some stupid look-aheads obsolete
  • Cleaning up further code ;)

Special thanks to Milian Wolff for all his patience, bug reports and support with the release.

The tar-ball can be found here.

MeeGo finally dead, absolutely dead

Thursday, September 29th, 2011

Some people could not believe that Nokia’s deal with Microsoft killed MeeGo, a promising Linux distribution for mobile devices, though it had been debatable to drop Maemo and not to use Plasma Mobile. I hope they will believe it now: even Intel is dropping MeeGo. Tizen is coming slowly, probably not providing a full-featured GNU/Linux (why should it, if everything is HTML5/JavaScript-focused) and providing no big benefit compared to Android; Necessitas is there, Tizle is not. There is finally, obviously, no more prospect for MeeGo, forget it. I hope there are chances for Free Software and KDE on Android. Although that is not optimal, since it is no full-featured GNU/Linux and has an uncertain future because of Chrome OS, I currently do not see any alternative.

EcmaScript is no assembly language

Wednesday, September 28th, 2011


Windows 8, Tizen: there seems to be an HTML 5 + JavaScript hype. The myriads of web developers are expected to write all their stupid apps to make a platform popular. Some may know that I really do not like EcmaScript; however, it is good to allow using it, there may be domains where it is appropriate, and there may be cases where it is the best choice to reuse some EcmaScript code. But seriously, that way technical possibilities get wasted. Reusing all the high-quality desktop application code: nearly impossible. Using certain paradigms of great programming languages: nearly impossible. I know, you can translate LLVM IL to EcmaScript, but EcmaScript is really not an assembly language. That is not its purpose and it is not suitable for it. Do not tell me V8 would be good, it cannot transform EcmaScript into an assembly language. Even Google is now promoting Native Client (NaCl), really cool technology, to allow any language (great developments of the past!) to be used, by reusing existing languages that are suitable as universal low-level representations: x86 machine code and LLVM IL. It took a long time until there was finally a usable version of Qt for Android (Necessitas); the reimplementation of everything and Java got in the way of a quick port, although there has always been the possibility to use native object files. How long would something usable take for Tizen? And do not forget: Qt is huge and popular, it may be unaffordable for other projects somehow relying on certain OS capabilities. Thus, why not just supply existing APIs? Why not just keep a great ecosystem, being open for new innovation and new technology which is compatible with it? Why limit everything to EcmaScript’s poor capabilities? Web developers can still be attracted, EcmaScript and even HTML 5 can perfectly be combined with Qt. Even Apple failed with its original web-apps-only strategy, why should anybody else succeed with that nonsense now, years later?

Microsoft is desperate, Google wants to control everything with Chrome OS, but why the hell should MeeGo be dropped?

New HTML/JavaScript-focused architectures: you suck.

Tizen wants to allow native access, but they “anticipate” that most stuff should be written with EcmaScript. Thus it will certainly not be easier than with Android to reuse existing technology.

GPLv3, LGPLv3, AGPLv3 Discussion

Wednesday, August 24th, 2011


Just a short notice: I have started a discussion about the exclusion of GPLv3, LGPLv3 and AGPLv3 by the current licensing policy at the kde-licensing mailing list, as promised in my previous blog post discussing some arguments. I will be offline for a few days, so do not be surprised if I am not answering.


Unexpected Bottleneck: vector<bool>

Tuesday, August 23rd, 2011

Hi folks!

The last paragraph is actually the interesting one for every C++ programmer; the lead-in might not be understandable.

The new lexer component of KDevelop-PG-Qt is not very fast, it can take several minutes with complex grammars, especially when using UTF-8 as internal encoding for the automata. Of course, first I want to finish the most important features, and for that I am writing an example lexer for PHP to check its capabilities and to spot bugs and missing features, but now I was curious and asked Valgrind/Callgrind for an analysis. I expected a lot of useless copying/assignments because I have not done any optimisation yet and tried to be “defensive” by using too much copying. Indeed: for my first example about 78% of the runtime was spent on copying. But then I tried a full (as far as it is finished) PHP example with UTF-8 and it was a real surprise:

[Figure: KCacheGrind overview of the Callgrind results for KDevelop-PG-Qt: a lot of comparisons on vector<bool>]

It spends most of the time comparing instances of std::vector<bool>! For the powerset construction I have to maintain the follow-up states for subsets of states in the NFA, and after that I map those subsets to new unique IDs for the states of the DFA. For both cases I wanted to use a map with vector<bool> (a subset of states) as key_type. I could not decide whether I should use map or unordered_map; I used map, but there is also some code to make it possible to use QBitArray with unordered_map, just in case. However: I expected the minimisation to be a bottleneck (disclaimer: I stopped Valgrind after some time, maybe it becomes a bottleneck later), but now the comparisons used in the map implementation are crucial. I was not very economical with those map operations: I could have done the mapping while computing the necessary subsets. It has to be optimised and I may use unordered_map and unordered_set (hashes) instead of map and set, but that is not the reason why I am writing this blog post. ;)

The interesting thing about it is the usage of std::_Bit_reference. std::vector<bool> has a specialised implementation: the booleans get packed, thus it will only need size()/8 + const bytes when it is full. There is a special hashing implementation directly using the integers used to store the bits, but there is only a generic comparison implementation. Thus it will iterate over all the bits instead of comparing chunks (like 32 or 64 bits, depending on the CPU). Of course that is much slower; keep in mind that each iteration requires some bit manipulation. Especially in my case of a red-black tree the vector will get compared to very similar ones next to it in the tree, and in that case it takes an especially long time. Be aware of vector<bool>: it is often awesome (saves a lot of memory), but something like this slow comparison may happen, and always keep in mind that the index operator might be slower than for vector<char>. Unfortunately there seems to be no API for accessing the integer chunks. Btw. QBitArray does not support comparison at all. I have filed a bug report to GCC/libstdc++.
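
One possible way around the bitwise comparison is to pack the bits yourself. The following C++ sketch is not the code used in KDevelop-PG-Qt; StateSet and dfaStateId are hypothetical names. It stores a subset of NFA states in 64-bit words, so that the comparisons needed by std::map work on whole words. The resulting order differs from the bit-by-bit lexicographic order of vector<bool>, but any strict weak ordering is good enough for a map key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// A subset of NFA states, packed into 64-bit words.
class StateSet {
public:
    explicit StateSet(std::size_t size) : words_((size + 63) / 64, 0) {}
    void insert(std::size_t state) {
        assert(state / 64 < words_.size());
        words_[state / 64] |= std::uint64_t(1) << (state % 64);
    }
    bool contains(std::size_t state) const {
        return (words_[state / 64] >> (state % 64)) & 1;
    }
    // Lexicographic comparison of the words: 64 bits are compared per step.
    friend bool operator<(const StateSet& a, const StateSet& b) { return a.words_ < b.words_; }
    friend bool operator==(const StateSet& a, const StateSet& b) { return a.words_ == b.words_; }
private:
    std::vector<std::uint64_t> words_;
};

// Returns the DFA state ID for a subset of NFA states, assigning a fresh ID on first use.
int dfaStateId(std::map<StateSet, int>& ids, const StateSet& subset)
{
    auto it = ids.find(subset);
    if (it != ids.end()) return it->second;
    const int id = static_cast<int>(ids.size());
    ids.emplace(subset, id);
    return id;
}
```

Two very similar subsets of, say, 1000 states then need at most 16 word comparisons instead of up to 1000 single-bit steps.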