Profiling only parts of your code with perf.
20 Jan 2014
Perf is an excellent tool for profiling Linux software. It uses hardware performance counters to give you information on your code's performance. This makes it a very low-overhead alternative to, for instance, Valgrind, which uses instrumentation. Both ways of profiling software have their merits, but perf is a great tool to just keep running all the time to find severe performance bottlenecks with ease.
That said, I frequently have to profile only a part of my code. This is because I write lots of database benchmarks, which frequently load or even generate an initial dataset which they then run queries on. I usually care about query performance rather than loading performance, so I'd like to profile only the query part of my application. This can be done using API calls to the excellent performance counters C API, but in my case an easier way of achieving what I need is to use a "perf wrapper". In your code, that wrapper looks like this:
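A minimal sketch of such a call site, assuming a System::profile(name, body) function that takes a section name and a lambda. The loadDatabase and runQueries functions, and the stand-in body of System::profile, are hypothetical placeholders so the snippet compiles on its own:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Stand-in so this snippet compiles on its own: it only records the section
// name and runs the body. The real wrapper would start and stop perf here.
struct System {
    static std::vector<std::string>& sections() {
        static std::vector<std::string> names;
        return names;
    }
    static void profile(const std::string& name,
                        const std::function<void()>& body) {
        assert(body);  // an empty std::function cannot be called
        sections().push_back(name);
        body();
    }
};

// Hypothetical benchmark phases: build a small table, then scan it.
static std::vector<int> table;

void loadDatabase() {
    for (int i = 0; i < 1000; ++i) table.push_back(i);
}

long runQueries() {
    long sum = 0;
    for (int value : table) sum += value;
    return sum;
}

long runBenchmark() {
    long result = 0;
    // Each call wraps one phase; the name selects the output file.
    System::profile("loading", [&]() { loadDatabase(); });
    System::profile("queries", [&]() { result = runQueries(); });
    return result;
}
```

Capturing by reference with [&] lets the body write its result back into the surrounding scope, so wrapping a phase does not force you to restructure the benchmark.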
For each System::profile part you will receive a performance counter dump, one named loading.data and one named queries.data. Each contains the samples for the corresponding part of the code, so you can find bottlenecks within query execution without also seeing the samples produced by initially loading the database.
Essentially, this code starts perf record on the current process right before its body is executed and stops perf right after the body, giving you a performance counter sampling of exactly the part of the code you are interested in. It is implemented like this:
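One way to write such a wrapper is sketched below, assuming a POSIX system with perf available on the PATH. It forks a child that runs perf record -p on the parent's PID, executes the body, and then sends SIGINT to the child so that perf writes its output file. The signature and the details are assumptions for illustration, not necessarily the exact code behind the results described here:

```cpp
#include <cassert>
#include <csignal>
#include <fcntl.h>
#include <functional>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct System {
    // Runs body() while a `perf record` child samples this process and
    // writes the samples to "<name>.data".
    static void profile(const std::string& name,
                        const std::function<void()>& body) {
        assert(body);
        const std::string output = name + ".data";
        // Taken before fork(), so this is the PID of the process to profile.
        const std::string self = std::to_string(getpid());

        const pid_t profiler = fork();
        if (profiler == 0) {
            // Child: silence perf's own output, then become `perf record`.
            const int devnull = open("/dev/null", O_RDWR);
            if (devnull >= 0) {
                dup2(devnull, STDOUT_FILENO);
                dup2(devnull, STDERR_FILENO);
            }
            execlp("perf", "perf", "record", "-o", output.c_str(),
                   "-p", self.c_str(), static_cast<char*>(nullptr));
            _exit(127);  // only reached if perf could not be started
        }

        // Parent: run the code of interest. If fork() failed, the body
        // still runs, just without a profiler attached.
        body();

        if (profiler > 0) {
            // SIGINT makes perf flush its samples and exit, like Ctrl-C.
            kill(profiler, SIGINT);
            waitpid(profiler, nullptr, 0);
        }
    }
};
```

Note that perf needs a moment to start and attach, so with this approach the first instants of a body may go unsampled; for benchmark phases that run for seconds this should hardly matter.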
You can take a look at your results by executing either perf report -i loading.data or perf report -i queries.data, depending on which part you are interested in.
Of course, this only works and looks this good if you use C++11, but who would want to use anything less?