Didn’t really get the point of the post as it just presents something without a conclusion.
9X% of users do not care about a <1% drop in performance. I suspect we get the same variability just by going from one kernel version to another. The impact from all the Intel mitigations that are now enabled by default is much worse.
However I do care about nice profiles and stack traces without having to jump through hoops.
Asking people to recompile an _entire_ distribution just to get sane defaults is wrong. Those who care about the last drop should build their custom systems as they see fit, and they probably already do.
It does present a conclusion. Once the kernel supports .sframe it will be all-around superior to -fomit-frame-pointer, and a better default for distros to use.
This comparison is pretty misleading. An accessibility issue prevents someone from being able to use software effectively. Not having localized text would have a similar impact. A ~1% performance impact on the other hand is the minuscule downside of improving debugging, profiling and error reporting for an entire OS. And that's not just a minority of users, as tons of software will automatically gather stack traces for bug reports.
There's basically no downside to fixing accessibility issues or adding new language translations other than the work involved in doing so. (And yes, maintaining translations over time is hard, but most projects let them lag during development, so they don't directly hold anything back.) There is a rather glaring downside to this performance optimization, whose upside is sometimes entirely within run-to-run variance and can be blown away by almost any other performance tweak. It's clear the optimization has some upsides, but an extra register and saving some trivial loads/stores just isn't as big of a deal on modern processors that are loaded to the gills with huge caches and deep pipelines.
I guess I don't care that much about -fomit-frame-pointer in the grand scheme of things, but I think enabling it in distributions was ultimately a mistake. If some software packages benefited enough from it, it could've been enabled only for those packages. Doing it across the system is questionable at best...
But does what you care about matter enough to be the default?
Are you the majority?
Evaluate "majority" this way: For every/any random binary in a distro, out of all the currently running instances of that binary in the world at any given moment, how many of those need to be profiled?
There is no way the answer is "most of them".
You have a job where you profile things, and maybe even you profile almost everything you touch. Your whole world has a high quotient of profiling in it. So you want the whole system built for profiling by default. How convenient for you. But your whole world is not the whole world.
But it's not just you, there are, zomg thousands, tens of thousands, maybe even hundreds of thousands of developers and ops admins the same as you.
Yes and? Is even that most installed instances of any given executable?
No way.
Or maybe yes. It's possible. Can you show that somehow? But I will guess no way and not even close.
> Evaluate "majority" this way: For every/any random binary in a distro, out of all the currently running instances of that binary in the world at any given moment, how many of those need to be profiled?
> There is no way the answer is "most of them".
This is an absurd way to evaluate it. All it takes is one savvy user to report a performance problem that developers are able to root-cause using stack traces from the user's system. Suppose they're able to make a 5% performance improvement to the program. Now every user's copy of the program is 5% faster because of the frame pointers on this one user's system.
At this point people usually ask: but couldn't developers have done that on their own systems with debug code? But the performance of debug code is not the same as the performance of shipping code. And not all problems manifest the same on all systems. This is why you need shipping code to be debuggable (or instrumentable or profileable or whatever you want to call it).
I regularly have users run Sysprof and upload the capture to issues. It's immensely powerful to be able to see what is going on on systems that are having issues. I'd argue it's one of the major reasons GNOME performance has gotten so much better in the recent past.
You can't do that when step one is reinstall another distro and reproduce your problem.
Additionally, the performance-sensitive code where an overhead in the 1% range could matter (hint: there isn't much of it) rarely uses the system libraries in a way that would incur this overhead anyway. Such an app can be compiled with frame pointers disabled. And for code that does use system libraries (qsort, bsearch, strlen, etc.), the frame pointer cost is negligible compared to the work being performed. Your margin of error is way larger than the theoretical overhead.
Better analogy: you're paying 30% to apple, and over 50% in bad payday loans, and you're worried about the 3% visa/stripe overhead ... that's kinda crazy. But that's where we are in computer performance, there's 10x, 100x, and even greater inefficiencies everywhere, 1% for better backtraces is nothing.
Absolutely. We've gotten numerous double digit performance improvements across applications, libraries, and system daemons because of frame-pointers in Fedora (and that's just from me).
Performance problems matter to the people who have them, who often are in an inconvenient place. Having the ability for profiling to just work means that it's easy to help these people.
I think you are trying to make this out to be something that it isn’t.
Visibility at the “cost” of negligible impact is more important than raw performance. That’s it.
I’m a regular user of Linux with some performance sensitivity that does not go as far as “I _need_ that extra register!”. That’s what the majority of developers working on Linux are like. I think it’s up to _you_ to prove the contrary.
> Evaluate "majority" this way: For every/any random binary in a distro, out of all the currently running instances of that binary in the world at any given moment, how many of those need to be profiled?
Most systems need to generate useful crash reports. Even end user systems. What kind of system doesn't need them? How else are developers supposed to reliably address user complaints?
Theoretically, there are alternative ways to generate stacktraces without using frame pointers. The problem is, they're not nearly as ubiquitous and require more work to integrate them in existing applications and workflows. That makes them useless in practice for a large number of cases.
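For context on why the frame-pointer approach is so easy to integrate: unwinding with frame pointers is essentially a linked-list walk. The sketch below simulates it in Python under stated assumptions: `memory` is a hypothetical address-to-word map, and the layout (saved caller frame pointer at `fp`, return address at `fp + 8`) mirrors the usual x86-64 convention. The addresses are made up for illustration.

```python
def walk_frame_pointers(memory, fp, max_depth=64):
    """Collect return addresses by following the saved-frame-pointer chain.

    Assumed layout (x86-64 style): memory[fp] holds the caller's frame
    pointer, memory[fp + 8] holds the return address into the caller.
    A frame pointer of 0 marks the outermost frame.
    """
    trace = []
    while fp != 0 and len(trace) < max_depth:
        trace.append(memory[fp + 8])  # return address for this frame
        fp = memory[fp]               # hop to the caller's frame
    return trace


# Hypothetical three-frame stack; all addresses are invented.
memory = {
    0x1000: 0x2000, 0x1008: 0x401234,
    0x2000: 0x3000, 0x2008: 0x401500,
    0x3000: 0x0,    0x3008: 0x401900,
}
trace = walk_frame_pointers(memory, 0x1000)
print([hex(addr) for addr in trace])
```

The alternatives have to reconstruct the same chain from separate unwind metadata, which is where the extra integration work comes from.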
I think it's ridiculous to question that since obviously, yes, many people have decided exactly that. I see no point myself and I'm even in the field. And I am not in charge of all the distributions which disabled it by default.
So, "yes". In fact "yes, duh?" Talk about head in sand...
That strikes me as an insane take (not to mention blatantly inaccurate), but I take your point that this is a common one for distribution-maintainers to have.
> 9X% of users do not care about a <1% drop in performance.
Except Python got opted out of the frame pointer change due to benchmarks showing slowdowns of up to 10%. The discussion around that had the great idea of just adding a pragma to flat out override the build setting. So in the end that "1%" reduction claim only holds if everything even remotely affected silently ignores the flag.
Any link to the fix or documentation about it? I could find added perf support but did not see anything about improved performance related to frame pointer use.
https://pagure.io/fesco/issue/2817#comment-826636 will probably get you started into the relevant paths. Python 3.12 was going to include frame-pointers anyway for perf to boot. So they needed to fix this regardless.
No, that's not how it works. You don't stack 1% losses of the total application performance on top of each other just because your application uses 40 libraries.
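A quick back-of-the-envelope sketch of the arithmetic, with made-up numbers: assume a program splits its run time evenly across 40 libraries and each one is 1% slower in its own code. The overall slowdown is the time-weighted average of the per-library slowdowns, not their product.

```python
def total_slowdown(time_shares, slowdowns):
    """Overall slowdown as the time-weighted sum of per-component slowdowns."""
    return sum(share * slow for share, slow in zip(time_shares, slowdowns))


n_libs = 40                       # hypothetical library count
shares = [1.0 / n_libs] * n_libs  # assumed: time split evenly across libraries
per_lib = [0.01] * n_libs         # assumed: each library is 1% slower

weighted = total_slowdown(shares, per_lib)  # how the overhead actually combines
compounded = 1.01 ** n_libs - 1             # the mistaken "stacking" model

print(f"weighted:   {weighted:.2%}")
print(f"compounded: {compounded:.2%}")
```

Under these assumptions the weighted figure stays at about 1%, while the stacking model would claim nearly 49%.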