-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce risk of running fgprof in production #12
Comments
In reply to @sumerc's comment here: #11 (comment)
No worries, me too 🙈.
Yes, I think that's a valid concern. But it's probably an issue for later. There are many low hanging fruits that can be picked while completely ignoring this issue.
Yeah, that'd be cool! But I suspect we can get away with a simpler approach for the first iteration.
IMO this is a strong argument for stopping the profiling with an error if we detect that it's not going well. I'd rather produce no data than to risk having a very negative impact on the application and/or producing useless data. One potential approach: We accumlate the time fgpro spends in If we're worried that even a single
What I meant is that instead of trying to stay below our "profiling time budget", we could try to use as much of it as possible, i.e. increase our sampling interval dynamically if we notice we could do more samples per second. PID controllers are a good way to adjust an input variable (in our case sampling frequency) in response to a measured process variable (in our case "STW duration per second") to keep it near a given setpoint (in our case our STW duration per second budget). Does that make sense? No worries if not. PIDs might be complete overkill for this, we'd probably only use the proportional term which is trivial and not deserving a fancy description like PID controller : p. |
Agreed. My concern was to detect situations like these. I also think that stopping the profiler might as well be the best option for these situations.
Yes. This came to my mind as well. Agreed that it is still too early like you suggested. BTW, as operations done in
Yes. When you say PID I thought we will be reading some external process related counters or something like that :) :) Anyway, I will try working on this whenever I find any time, I don't expect it to be too much complicated AFAIU. However, the tricky part is finding any time :). Is there somehow any related work/issue on your end waiting on this item or such? Knowing beforehand might help me better utilize my time. |
Sounds good! About related work: I'm starting a new job on Monday (joining Datadog's profiler team), and there is a decent chance I'll work on related stuff a lot going forward. The details are still fuzzy, but one thing I was considering was to propose a patch to the Go core guys that would allow Anyway, not sure if I'll get a chance to work on this yet or not, I'll keep you posted! |
Congrats! That is awesome :)
Yes. It is a good idea. In any case, |
Yes. The API change will probably be the most controversial aspect of proposing this to the Go core. But let's hope they see the value in wallclock profiling. |
The goal for this issue is to figure out how to reduce the risk of running fgprof in production. People should feel safe turning it on, knowing that in the worst case it will just bail out or reduce the amount of data being gathered rather than having a negative impact on the application.
There has been some initial discussion here: #11 (comment)
The text was updated successfully, but these errors were encountered: