Benchmark results (kind of) for the Raspberry Pi4 #8
Comments
Please use the patch below; it may enable multithreading.
diff --git a/tflite_minimal/minimal.cc b/tflite_minimal/minimal.cc
interpreter->SetNumThreads(-1); // For more details refer to this link
Sets the number of threads used by the interpreter and available to CPU kernels. If not set, the interpreter will use an implementation-dependent default number of threads. Currently, only a subset of kernels, such as conv, support multi-threading. num_threads should be >= -1. Setting num_threads to 0 has the effect to disable multithreading, which is equivalent to setting num_threads to 1. If set to the value -1, the number of threads used will be implementation-defined and platform-dependent.
Ok, I will give -1 a go, and if it does not work I will use 4, as that works with the Python wrapper of tflite.
Btw, as I installed tflite as a system lib, I compiled it by adding a quick and dirty cmake file instead of pulling in the tflite sources again.
You may want to try stream on the Raspberry Pi4.
Will look into that next and report back to you.
With -1 as the multithread setting, it appears to still use 1 thread. So I recompiled with 4; below are the results using 4 threads.
Perhaps in the future it would be nice to add a "-t " flag, just as whispercpp does. I will check out the streaming binary and report back.
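A minimal sketch of what such a flag could look like, assuming the thread count is passed as the argument right after "-t". The helper name `parse_thread_count` and the fallback behaviour are hypothetical and not part of the repository; the only TFLite call involved is the `SetNumThreads` call from the patch above.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not in the repository): returns the integer that
// follows "-t" on the command line, or `fallback` when the flag is absent
// or its value is not a valid thread count.
int parse_thread_count(int argc, const char* const* argv, int fallback = -1) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "-t") == 0) {
            char* end = nullptr;
            long n = std::strtol(argv[i + 1], &end, 10);
            // Accept only a fully numeric value in the range [-1, INT_MAX];
            // the quoted documentation requires num_threads >= -1.
            bool numeric = (end != argv[i + 1]) && (*end == '\0');
            if (numeric && n >= -1 && n <= INT_MAX) {
                return static_cast<int>(n);
            }
            return fallback;
        }
    }
    return fallback;
}

// Assumed usage inside main(), once the interpreter has been built:
//   interpreter->SetNumThreads(parse_thread_count(argc, argv));
```

With no "-t" given, this falls back to -1 and leaves the choice to tflite; passing "-t 4" would pin it to four threads, which is the setting that was hardcoded for the results in this thread.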
Let me know how you go, as I have found that tflite doesn't scale that well, or at least didn't with the models I tried.
Took a better look at the -1 option for multithreading, and indeed it looks like it uses 2 threads; however, only one of them reaches 100% CPU usage. The other thread doesn't. So I guess setting it to -1 and letting the system figure it out itself appears to be the better way forward. I haven't built the streaming binary yet.
Re-ran test.wav with both -1 and 4 by timing the command. With -1 it sometimes uses two threads, sometimes uses one and then switches to another, and sometimes runs one at 100% and another at around 40%. Anyhow, results below:
Did the same with it set to 4, and then it just uses all four CPUs. Not always all at 100%, but definitely running four threads at the same time. It then takes around 9 seconds to transcribe. So yeah, using -1 and letting tflite figure it out does not always scale right, whereas hardcoding it to a number works better. And yes, hardcoding it to two takes ~10 seconds to encode, so using more threads does bring something, but anything above 2 does not bring more than it costs.
Just for completeness' sake, below are the same results for a Raspberry Pi 3B+.
Strange thing is, the other two WAV files get cut off at ~12 seconds?
Thanks @j1nx
I have compiled the latest version and am running it with Tensorflow-Lite 3.11 on a Raspberry Pi4.
Below are the results for the different sample WAV files.
Considering the first wav file is 11 seconds and the other two exactly 30 seconds, it looks like it is possible for realtime encoding on a RPI4. It also runs on 1 CPU, so not bad. Not bad at all!
Great work.
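One way to put a number on the "realtime" observation is the real-time factor: processing time divided by audio duration, where a value below 1.0 means the audio is handled faster than it plays. The sketch below is only an illustration under that definition; `time_seconds` and `real_time_factor` are hypothetical names, and the callable passed in stands in for whatever runs the model.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: wall-clock seconds taken by a single call to work().
double time_seconds(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Real-time factor: processing time divided by audio duration.
// A result below 1.0 means faster than real time.
double real_time_factor(double processing_seconds, double audio_seconds) {
    return processing_seconds / audio_seconds;
}

// Assumed usage, with run_inference as a placeholder for the model call:
//   double t = time_seconds([&] { run_inference(); });
//   double rtf = real_time_factor(t, 30.0);  // for a 30-second WAV file
```

Measuring inside the program this way excludes process start-up and model loading, so it may read lower than timing the whole command.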