Releases: theneolanders/resonite-voice-bridge
Bundle Blockly dependency
I was previously loading an unversioned build of Blockly from unpkg. A recent upstream update broke the UI, so I've bundled a static version of the dependency, which has the added benefit of eliminating any asset loading from third-party servers.
If you were having issues with the previous release due to Blockly not loading, this one will fix it for you.
Thanks @Sl4vP0weR!
Custom Command syntax bugfixes
This release introduces a few small changes to the new custom commands based on user feedback:
- Wake word can now be multiple words
- Punctuation in text inputs doesn't break matching
- Parameters will always end with a delimiter character for simpler parsing
- Loading with an uninitialized word replacement list no longer crashes
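As a rough illustration of the matching behaviour described above, the sketch below strips punctuation before comparing against a multi-word wake phrase and ends every parameter with a delimiter. It is not the bridge's actual code: the function names, the `|` delimiter, and the output format are assumptions made for this example.

```python
import string
from typing import List, Optional

# Hypothetical delimiter; the release notes do not say which character is used.
DELIMITER = "|"


def normalize(text: str) -> List[str]:
    """Lowercase the text, drop ASCII punctuation, and split it into words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def match_command(transcript: str, wake_phrase: str) -> Optional[str]:
    """Return the words after the wake phrase, each followed by DELIMITER.

    Returns None when the transcript does not start with the wake phrase.
    """
    words = normalize(transcript)
    wake = normalize(wake_phrase)
    if not wake or words[: len(wake)] != wake:
        return None
    params = words[len(wake):]
    # Every parameter ends with the delimiter, including the last one,
    # so a receiver can split on it without special-casing the tail.
    return "".join(word + DELIMITER for word in params)
```

Because every parameter is terminated rather than separated, the receiving side never has to treat the final parameter differently from the others.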
v2.0 Custom commands with visual editor
New Features
- This release adds a new visual editor, built on Google Blockly (the library that Scratch's block editor is based on), for creating custom commands. It greatly simplifies extracting meaning from your speech transcript by turning it into actionable commands and parameters to send to Resonite. This feature significantly lowers the barrier to creating voice assistants and voice-powered objects.
- The word replacement feature now includes an import/export button so you can easily back up and restore your dictionary.
- The UI has been reworked to use collapsible accordions, so you can quickly and easily get to the section you need.
Bug Fixes
- Added the missing events that are sent in response to websocket commands enabling or disabling word replacement.
Screenshot of the new command editor:
v1.85 Add streaming output toggle, fix speechEnded event
This release adds a new toggle option for streaming output. Disabling this will result in a single transcription being sent, rather than the partial transcriptions that Google produces for longer inputs.
This also fixes the speechEnded event firing every few seconds when no new speech was detected.
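To make the streaming toggle concrete, here is a minimal sketch of the filtering it implies. It assumes each recognition result arrives as a `(text, is_final)` pair, mirroring the interim/final distinction in the browser speech recognition API. The function name and data shape are hypothetical and not taken from the bridge's source.

```python
from typing import Iterable, List, Tuple


def messages_to_send(results: Iterable[Tuple[str, bool]], streaming: bool) -> List[str]:
    """Pick which recognition results would be forwarded over the websocket.

    With streaming enabled, every partial result is forwarded.
    With streaming disabled, only results flagged as final are forwarded.
    """
    sent = []
    for text, is_final in results:
        if streaming or is_final:
            sent.append(text)
    return sent


results = [
    ("turn", False),
    ("turn on", False),
    ("turn on the lights", True),
]
print(messages_to_send(results, streaming=True))
print(messages_to_send(results, streaming=False))
```

With streaming disabled, the receiver sees one message per utterance, which is easier to parse at the cost of some latency.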
v1.8 Punctuation removal, speechEnded event
This update implements a new punctuation removal feature, including associated events and commands.
It also adds the speechEnded event, for manually determining when the user has stopped speaking.
v1.7 Word replacement/uncensoring, persistent settings
This release includes the ability to replace words from the speech recognition engine. This allows you to uncensor profanity as well as correct incorrect detections due to accent or audio quality.
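A word replacement pass of this kind can be sketched as a simple dictionary lookup per word. The function name and the sample dictionary below are made up for illustration, and this version only handles single-word entries. The bridge's real matching rules may differ.

```python
from typing import Dict


def replace_words(transcript: str, replacements: Dict[str, str]) -> str:
    """Swap each whitespace-separated word that has an entry in `replacements`.

    Keys are compared in lowercase, so the dictionary should use lowercase keys.
    Words without an entry are passed through unchanged.
    """
    out = []
    for word in transcript.split():
        out.append(replacements.get(word.lower(), word))
    return " ".join(out)


# Hypothetical dictionary: one censored word and one misheard word.
dictionary = {"h***": "heck", "rezonite": "Resonite"}
print(replace_words("what the h***", dictionary))
print(replace_words("Open rezonite", dictionary))
```

Splitting on whitespace instead of using regex word boundaries keeps censored forms such as `h***` intact as lookup keys, since asterisks are not word characters.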
This update also includes settings persistence, using the browser's localStorage mechanism. Any changes to settings will persist across reloads as long as the browser cache is not cleared.
v1.6 Custom confidence thresholds + improved documentation
This release implements custom confidence thresholds. This allows you to set the lower limit for how confident the Speech Recognition API must be in order to append text to the transcript. This can be configured in the UI or via commands. There are also accompanying commands to monitor when changes are made.
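The threshold check amounts to a single comparison before text is appended. The sketch below assumes confidence is a value between 0 and 1, as the browser speech recognition API reports it, and treats the threshold as an inclusive lower limit. The function name is hypothetical.

```python
def append_if_confident(transcript: str, text: str, confidence: float, threshold: float) -> str:
    """Append `text` to `transcript` only when confidence meets the threshold."""
    if confidence < threshold:
        # Below the lower limit: leave the transcript untouched.
        return transcript
    return (transcript + " " + text).strip()


print(append_if_confident("hello", "world", confidence=0.9, threshold=0.5))
print(append_if_confident("hello", "word", confidence=0.3, threshold=0.5))
```

A higher threshold drops more misrecognitions but also discards more correct words, so the right value depends on your microphone and environment.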
The in-UI documentation has been overhauled to be clearer and more thorough.
v1.5 General improvements and fixes
This release includes the following new functionality:
- clear command that lets you manually clear the transcript before detection ends. This lets you more easily parse out wake words in protoflux. Includes an event to notify you when the transcript was cleared. This can be finicky since Google's STT sometimes changes words as it gains more confidence in its prediction. A future update will include custom timeouts and confidence thresholds to make this easier to work with.
- Debug mode - This shows you the confidence value for the current prediction. Includes both a UI element as well as a new event to report the confidence to flux. A future update will include requiring a custom confidence threshold to be met before appending to the transcript.
The syntax for commands and events has changed. Commands no longer require special characters on either side, and events are now wrapped in square brackets (e.g. [enabled]).
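On the receiving side, the bracket convention makes it straightforward to tell events apart from transcript text. The sketch below assumes an event message consists of nothing but a name in square brackets. Whether events can carry extra payload is not stated here, so treat the function and its return format as hypothetical.

```python
from typing import Tuple


def classify_message(message: str) -> Tuple[str, str]:
    """Return ("event", name) for bracketed messages, else ("transcript", text)."""
    stripped = message.strip()
    if len(stripped) > 2 and stripped.startswith("[") and stripped.endswith("]"):
        # Drop the surrounding brackets to get the event name.
        return ("event", stripped[1:-1])
    return ("transcript", message)


print(classify_message("[enabled]"))
print(classify_message("hello world"))
```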
Additionally, transcriptions should no longer send duplicate messages that contain no new text. A new transcription event will only be sent if the previous transcription was detected as complete (determined by Google automatically, for now), or if new text is present.
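The duplicate suppression rule can be sketched as a small piece of state that remembers the last text and whether it was complete. The class and method names are invented for this example, and the `is_final` flag stands in for whatever completion signal the recognizer provides.

```python
from typing import Optional


class TranscriptSender:
    """Decides whether a transcription should be sent, skipping duplicates."""

    def __init__(self) -> None:
        self.last_text: Optional[str] = None
        # Start as "complete" so the very first transcription is always sent.
        self.last_was_final = True

    def should_send(self, text: str, is_final: bool) -> bool:
        # Send if the previous transcription was complete, or if the text changed.
        send = self.last_was_final or text != self.last_text
        self.last_text = text
        self.last_was_final = is_final
        return send


sender = TranscriptSender()
for text, final in [("hello", False), ("hello", False), ("hello there", True)]:
    print(text, sender.should_send(text, final))
```

In this run the second "hello" is suppressed because the previous result was incomplete and the text did not change.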
v1.2 Multilingual Support
This release adds support for multiple languages with the Chrome Speech To Text API, supporting 16 languages total.
Languages can be switched via the UI or via a command to the websocket. A new event is sent over the websocket when the language is changed.
v1.1 Mic control
This update adds the ability to toggle, enable, and disable the mic using a button on the interface and also by sending messages to the websocket connection. The websocket server now also sends event messages when the microphone status changes.
Additionally, it includes better visual feedback on the page for the microphone status and permission request.