I want to profile a few minutes of Cycles rendering a benchmark file (I have already done so, but I am not sure I did it correctly). I am particularly interested in two things: finding CPU-time hotspots, and measuring cache hits and misses (general cache friendliness and boundness). (Is “boundness” a word? I am not a native English speaker, but I hope you get what I mean.) Now, should I profile the debug or the release configuration? Somebody suggested building the release configuration with debug symbols and profiling that; should I do that? How ‘realistic’ is the debug build? Is the memory layout somehow different, for example different padding? Can I expect cache-friendliness optimizations (and maybe also the removal of unnecessary processing) that I find while profiling the debug build to transfer over to the release build?
I know that’s a lot of questions, and I guess the answer to some of them is “it depends”, but I would still like to hear your opinions on them.