Performance of C++11 regular expressions

Since I want to talk about C++11 program performance, I should start by saying that I am running g++ (GCC) 4.9.1 on Linux (Manjaro).
In "What’s in a regular expression?", I presented some preliminary testing of the regular expression functionality in C++ that I could use in the scope of the assembler parser for Nand2Tetris.
In "Linux application profiling can be spectacular", I explained how I divided the running time of my assembler by 100, by constructing regular expressions only once instead of once per row to parse.
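To illustrate the difference between the two approaches, here is a minimal sketch. It is not the assembler's code: the pattern and the function names (kPattern, countMatchesRebuilding, countMatchesReusing) are made up for the example, and the input is assumed to be already split into lines.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical pattern for illustration: an identifier, '=', then digits ("x=42").
static const char* const kPattern = R"(^([A-Za-z_]\w*)=(\d+)$)";

// Slow variant: the std::regex object is constructed again for every line.
std::size_t countMatchesRebuilding(const std::vector<std::string>& lines) {
    std::size_t count = 0;
    for (const auto& line : lines) {
        const std::regex re(kPattern);  // pattern is compiled on each iteration
        if (std::regex_match(line, re)) {
            ++count;
        }
    }
    return count;
}

// Faster variant: the std::regex object is constructed once and reused.
std::size_t countMatchesReusing(const std::vector<std::string>& lines) {
    static const std::regex re(kPattern);  // pattern is compiled on first call only
    std::size_t count = 0;
    for (const auto& line : lines) {
        if (std::regex_match(line, re)) {
            ++count;
        }
    }
    return count;
}
```

Both functions return the same count; only the number of times the pattern gets compiled differs. How much this saves depends on the pattern and on the standard library implementation.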
After the regular expression improvement, I could not help rewriting the parser (only 130 lines of code) with std::string member functions and a few standard library helpers instead of regular expressions, to see what running time I would get.
The functions I used were the following:

  • std::remove_if()
  • string::erase()
  • std::isdigit()
  • std::isspace()
  • string::substr()
  • string::find()

I guess you get the picture (I won’t show the code because it is contrary to Nand2Tetris’ policy).
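For readers who have not used these functions together, here is a generic sketch on a made-up "key = value" line format. It is deliberately unrelated to the Nand2Tetris assembler, and the names stripSpaces, isNumber, KeyValue and parseLine are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns a copy of s with every whitespace character removed (erase-remove idiom).
std::string stripSpaces(std::string s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) { return std::isspace(c) != 0; }),
            s.end());
    return s;
}

// Returns true when s is non-empty and consists only of decimal digits.
bool isNumber(const std::string& s) {
    if (s.empty()) {
        return false;
    }
    for (unsigned char c : s) {
        if (std::isdigit(c) == 0) {
            return false;
        }
    }
    return true;
}

// Hypothetical result type: ok is true only for "<key>=<digits>".
struct KeyValue {
    std::string key;
    std::string value;
    bool ok;
};

// Splits a line on the first '=' after removing whitespace.
KeyValue parseLine(const std::string& raw) {
    const std::string line = stripSpaces(raw);
    const std::string::size_type eq = line.find('=');
    if (eq == std::string::npos || eq == 0) {
        return {"", "", false};  // no '=' or empty key
    }
    const std::string key = line.substr(0, eq);
    const std::string value = line.substr(eq + 1);
    return {key, value, isNumber(value)};
}
```

With this sketch, parseLine(" count = 20000 ") yields the key "count" and the value "20000". The lambdas take unsigned char because passing a negative char to std::isspace() or std::isdigit() is undefined behaviour.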
The numbers to assemble a 20,000-line assembly file are as follows,
with regular expressions:

and with std::string:

So the running time is 30% better with std::string.
My Parser.cpp is about the same size in both cases, but I guess that is because the pattern rules are simple. If they got more complicated, the regular expression version would be cheaper to maintain and would not grow as much in size.
