BLE APIs are not thread safe #42
Comments
This is a pretty serious issue. Since it doesn't seem to be fixed in the firmware, what would be a "good enough" workaround on the application side? Wrap all |
Unfortunately due to the way the callback registration works there's no reliable workaround from application code. You cannot register callbacks before I managed to fry one of my boards by accident so I don't have a way to test this anymore. |
Appreciate the quick response. So you're saying there's not really a way to prevent race conditions? That's pretty unfortunate... |
I recompiled my firmware to solve this, but obviously that's not a solution for everyone. There were other problems which I submitted pull requests for (#40, #41), and the latest version of the firmware solves at least those bits. Then RedBear was acquired by Particle and this whole repository was scheduled to get merged into the official Particle firmware (particle-iot#1505). So far that hasn't happened. My hope was that the BLE API would get more attention as Particle is going to release their own Bluetooth products (https://www.particle.io/mesh/). |
Are your modifications by any chance open source? I’m already recompiling the firmware to get around other limitations (#45) so it wouldn’t be a huge leap for me. Edit: Also, I mirror your wishes for the future of Particle Bluetooth, hopefully it gets more attention! |
I am afraid I never pushed that branch to my GitHub, so it's probably lost for good :-/ I tried all three approaches above, but in the end I settled on implementing |
The BTstack APIs used to implement the BLE wiring/HAL API are supposed to be single-threaded. The HAL code, however, spawns a thread that handles the BTstack execution loop. This can be done, but only when certain rules are followed, which is not the case.
Currently, APIs such as "ble.startAdvertising()" are documented to be safe to execute on the main thread in a pattern like `ble.init()` followed by `ble.startAdvertising()`.

This, in turn, triggers the call to `hal_btstack_init` (for `ble.init`) for BTstack to be initialized, which spawns a thread at the end. By the time `hal_btstack_startAdvertising` (for `ble.startAdvertising`) is called, the thread is already running and continuing the Bluetooth initialization (often the key exchange as part of BTstack's `sm_run` function).

Since the APIs are not thread safe, one of the following things can occur:
1. `hal_btstack_startAdvertising` calls `gap_advertisements_enable(1)`. It sets the necessary flags and proceeds to call `hci_run`. `hci_run` immediately quits on the `if (!hci_can_send_command_packet_now()) return;` condition because the worker thread is busy sending key exchange packets. As a result, the advertisements are never enabled and no error is reported.
2. `hal_btstack_startAdvertising` calls `gap_advertisements_enable(1)`. It sets the necessary flags and proceeds to call `hci_run`. `hci_run` goes past the `if (!hci_can_send_command_packet_now()) return;` check. Now, depending on the thread interleaving, one of these things can occur:
   - The `sm_run` key exchange on the worker fails because it can't send a packet now.
   - Both threads pass their `hci_can_send_command_packet_now` checks and end up messing up the command buffers and sending gibberish to the HCI USART communication port.

In some cases, the thread interleaving happens to be lucky and everything succeeds in the right order.
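To make the first outcome concrete, here is a minimal single-threaded sketch of the "check, then return silently" shape described above. It is a toy model, not BTstack code: `FakeHci`, `command_in_flight`, and `try_enable_advertising` are hypothetical names invented for illustration, and the busy state is simulated by setting a flag rather than by a real worker thread.

```cpp
#include <cassert>

// Toy stand-in for the HCI command channel. All names here are hypothetical;
// this is not BTstack code.
struct FakeHci {
    bool command_in_flight = false;      // worker is waiting on a previous command
    bool advertising_requested = false;  // flag set by the enable call
    bool advertising_enabled = false;    // what the controller actually does

    bool can_send_command_now() const { return !command_in_flight; }

    // Same shape as the early return quoted above: bail out silently when busy.
    void run() {
        if (!can_send_command_now()) return;
        if (advertising_requested) {
            advertising_requested = false;
            advertising_enabled = true;
        }
    }

    // Fire-and-forget variant: the caller learns nothing about the outcome.
    void enable_advertising() {
        advertising_requested = true;
        run();
    }

    // Variant that reports whether the request took effect immediately.
    bool try_enable_advertising() {
        enable_advertising();
        return advertising_enabled;
    }
};

// Returns true if advertising ended up enabled while the channel was busy.
bool demo_silent_drop() {
    FakeHci hci;
    hci.command_in_flight = true;  // simulate the worker mid key exchange
    hci.enable_advertising();      // returns normally, no error surfaced
    return hci.advertising_enabled;
}
```

In this model the request flag stays set but nothing acts on it, which matches the "never enabled, no error reported" symptom. The second outcome (both threads getting past the check) cannot be reproduced deterministically in a sketch like this, since it depends on real thread interleaving.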
There are some ways to solve the problem, but each of them has its own drawbacks. These are the ones that I have come up with:
- A `ble.waitForEvents()` API that will be run in the application thread.
- A `ble.startWorker()` API. Allow all BLE API calls to be made on the application thread until the `startWorker` API is called. After that, the APIs could only be called in the BLE timers and callbacks, with the exception of `ble.deInit()` (or possibly `ble.stopWorker()`). It can be combined with the `ble.waitForEvents()` API above to offer an alternative and introduce more flexibility for the application writers.

Note that this is very much not a theoretical problem. It keeps happening in my initialization sequence in various different ways. I spent days analyzing what is going wrong and I will happily provide more information and relevant logs. The race conditions often manifest as "invalid packet type" or "packet timeout" errors in the HCI code in hci_transport_h4_wiced.c.
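As a rough illustration of the `ble.waitForEvents()` idea, the sketch below shows one way an application-thread event pump could look: other contexts only queue work, and everything that touches the stack runs on the thread that calls the pump. This is an assumption about how such an API might be structured, not the firmware's design. `EventPump`, `post`, and `processPending` are hypothetical names, and the pump here drains the queue without blocking, whereas a real `waitForEvents()` would presumably also wait.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical event pump; names are illustrative, not Particle or BTstack APIs.
class EventPump {
public:
    // May be called from any thread: only queues the work, never runs it.
    void post(std::function<void()> event) {
        std::lock_guard<std::mutex> guard(mutex_);
        pending_.push_back(std::move(event));
    }

    // Called only from the application thread, in the role of waitForEvents().
    // Runs every event queued so far, in order, and returns how many were handled.
    int processPending() {
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            batch.swap(pending_);  // take the queue, release the lock before running
        }
        for (auto& event : batch) event();
        return static_cast<int>(batch.size());
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> pending_;
};

// Two queued steps run strictly one after another on the calling thread.
int demo_pump() {
    EventPump pump;
    int state = 0;
    pump.post([&] { state += 1; });   // stands in for a key exchange step
    pump.post([&] { state *= 10; });  // stands in for enabling advertising
    int handled = pump.processPending();
    return handled * 100 + state;
}
```

Because the queued steps never overlap, a check such as `hci_can_send_command_packet_now()` and the send that follows it would not be interleaved with another caller. The trade-off is that the application has to call the pump regularly.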