Lowering the overhead of Cyberbrain #58
Benchmark results (all tests are executed with Python 3.8) from:

hyperfine -m 5 "python3.8 -m examples.password.password examples/password/sonnets/* -s 1 --l33t"

Before optimization:

Benchmark #1: python -m examples.password.password examples/password/sonnets/* -s 1 --l33t
  Time (mean ± σ):     13.538 s ±  1.564 s    [User: 11.830 s, System: 0.504 s]
  Range (min … max):   11.626 s … 15.674 s    5 runs

After 6ee0e94:

After af9b0d4:
In my humble opinion, there is no need to save every invocation in the frame every time. Skipping that could save a lot of memory and time.
Hi @linw1995, thanks for the suggestions. Collecting less information is definitely one option we should consider. There are probably a few misunderstandings here, so I'd like to clarify and see what you think.

First, Cyberbrain currently does not step into each function call. So if you have a function

@trace
def f():
    g()
    h()

Cyberbrain will only know that g() and h() get called, but will not trace what happens inside g() or h(). I think what you have in mind is multi-frame tracing, which is the major feature planned for v2.

Second, let's say we want to apply this optimization to multi-frame tracing. How do we know whether a function is pure? Note that even if we can analyze the Python code, we don't know what happens at the C level. The analysis itself might also take a lot of time.

Third, this is not possible: Cyberbrain does not store the original value at each step, because you can't deepcopy unpicklable objects. Instead, we convert the value to JSON immediately and just store the JSON (which is the "JSON pickle" step mentioned in the main thread). Ultimately a JSON representation is what users see in the devtools console, so there's no need to store the original object either.
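To illustrate the "convert to JSON immediately" point, here is a minimal sketch using plain jsonpickle (not Cyberbrain's actual internals). The captured JSON string is immutable, so later mutation of the object cannot affect the stored snapshot:

import jsonpickle  # pip install jsonpickle

state = {"count": 1, "items": ["a"]}
snapshot = jsonpickle.encode(state)   # JSON string captured at this moment
state["items"].append("b")            # later mutation doesn't touch the snapshot
print(snapshot)                       # {"count": 1, "items": ["a"]}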
@laike9m thanks for the clarification. Yes, I am thinking about multi-frame tracing. In my vision, we can do this, or get close enough. One way to lower the overhead of Cyberbrain in multi-frame tracing is to cut some tracing branches. There are two types of branches that can be cut.

Pure function calls: every invocation produces a snapshot of the variables currently in use. If we collect snapshots before and after the invocation in the current frame, and the invocation contains no non-pure calls, the detailed events inside the deeper frame can be re-calculated.

About determining non-pure invocations in a frame: yes, we cannot do that automatically. Maybe let the user decide which calls are pure calculations, to cut some tracing branches:

def add(a, b):
    return a + b

def print_time():
    print(time.time())

@trace(pure=(add,), depth=2)
def multiply(a, b):
    answer = 0
    print_time()
    for _ in range(a):
        answer = add(answer, b)
    return answer

In the above example, we only need to save the arguments and the return value of the add calls. There is no need for deep copying everything. The snapshots need further discussion.
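As a rough illustration of this proposal (a minimal sketch only; record_pure_call is a hypothetical helper, not Cyberbrain's API), recording just the arguments and return value of a user-declared pure function could look like this:

import functools

def record_pure_call(fn, events):
    # Hypothetical helper: instead of tracing every event inside a function
    # the user has declared pure, record only its arguments and return value.
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        events.append({"func": fn.__name__, "args": args, "kwargs": kwargs, "return": result})
        return result
    return wrapper

def add(a, b):
    return a + b

events = []
add = record_pure_call(add, events)
print(add(1, 2))   # 3
print(events)      # [{'func': 'add', 'args': (1, 2), 'kwargs': {}, 'return': 3}]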
After 9789ab0 (replaced protobuf with msgpack):

py-spy result: message encoding is not a bottleneck anymore.
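For context, a minimal msgpack round-trip looks like the following (illustrative only; the event dict here is made up and is not Cyberbrain's actual message schema):

import msgpack  # pip install msgpack

event = {"type": "binding", "target": "answer", "value": 42}
packed = msgpack.packb(event)     # compact binary encoding
print(msgpack.unpackb(packed))    # {'type': 'binding', 'target': 'answer', 'value': 42}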
Cyberbrain adds a huge overhead to program execution, both in time spent and memory usage. This issue is for discussing possible improvements.
Time
Profiled one run with py-spy; here's the result: https://laike9m.github.io/images/cyberbrain_profile.svg
I only did a brief check. In summary, the overhead of sys.settrace is smaller than expected; it took up ~1/6 of the extra time.

Major time-consuming operations:

- parameters = inspect.signature(handler).parameters in value_stack.py. Kinda unexpected (a possible caching fix is sketched below).
- The log function in logger.py. This is also unexpected.

Apparently there are some low-hanging fruits, and we should fix them first.
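Here is a minimal sketch of one possible fix for the inspect.signature cost (my own assumption, not a committed change): since the set of handlers is fixed and functions are hashable, the signature lookup can be computed once per handler and cached:

import functools
import inspect

@functools.lru_cache(maxsize=None)
def handler_parameters(handler):
    # inspect.signature() is surprisingly expensive; compute it once per
    # handler and reuse the result on every subsequent event.
    return inspect.signature(handler).parameters

def example_handler(frame, value):
    pass

print(list(handler_parameters(example_handler)))  # ['frame', 'value']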
Ultimately, we need to rewrite part of Cyberbrain in C/C++. There are many options, but I'd like to automate it as much as I can, so I will first look into Nuitka and mypyc. If they don't work well, Cython is also a good option.
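For reference, compiling selected modules with mypyc can be wired up through setuptools roughly like this (a sketch; the module paths are hypothetical and only for illustration):

# setup.py
from setuptools import setup
from mypyc.build import mypycify

setup(
    name="cyberbrain",
    # Hypothetical list of hot modules; mypyc compiles them to C extensions
    # without requiring a manual rewrite.
    ext_modules=mypycify(["cyberbrain/value_stack.py", "cyberbrain/logger.py"]),
)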
Probably the only good news is that sys.settrace only contributes a small portion of the overhead, so in the short term I won't bother replacing it. Once we've optimized the other parts to the point where sys.settrace becomes the majority of the overhead, we'll come back to it.

Optimize JSON pickle
Cyberbrain uses the jsonpickle library to convert Python objects to JSON so that they can be displayed in the devtools console. jsonpickle is pure Python and really slow; it took ~23% of the total time, which makes it the biggest performance bottleneck.
The to-JSON conversion can't be parallelized, since we have to do it before the original object gets modified. Thus the only way left is to speed up the library. Some options I've considered or tried:

- [ ] Propose to use orjson as the preferred backend jsonpickle/jsonpickle#326
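One of the cheaper experiments (a sketch, under the assumption that raw JSON dumping rather than jsonpickle's own object flattening is where the time goes) is to point jsonpickle at a faster backend. The example uses ujson instead of orjson because ujson's dumps returns str, which keeps the swap trivial:

import jsonpickle  # pip install jsonpickle ujson

# Register ujson as a backend and prefer it for dumps/loads.
jsonpickle.load_backend("ujson", "dumps", "loads", ValueError)
jsonpickle.set_preferred_backend("ujson")

print(jsonpickle.encode({"answer": 42}))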
Memory
TBD