Capacitor plugin to transcribe speech into text.
- 🖥️ Cross-platform: Supports Android, iOS and Web.
- 🌐 Multiple Languages: Supports many different languages.
- 🛠 Permissions: Check and request permissions for recording audio.
- 🎙 Events: Listen for events like start, end, speechStart, speechEnd, error, partialResult, and result.
- 🔇 Silence Detection: Automatically detects silence to stop the recording.
- 📊 Silence Threshold: Define what's considered "silence" for your recordings.
- 🔁 Up-to-date: Always supports the latest Capacitor version.
- ⭐️ Support: First-class support from the Capawesome Team.
This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:
npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>
Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.
Next, install the package:
npm install @capawesome-team/capacitor-speech-recognition
npx cap sync
Add the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription keys to the ios/App/App/Info.plist file. These keys tell the user why your app is requesting access to speech recognition and the microphone:
<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to transcribe speech into text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone is used to record audio for speech recognition.</string>
No configuration required for this plugin.
import { SpeechRecognition } from '@capawesome-team/capacitor-speech-recognition';
// Start recognition in US English; stop automatically after 2000 ms of silence.
const startListening = async () => {
  await SpeechRecognition.startListening({
    language: 'en-US',
    silenceThreshold: 2000,
  });
};
const stopListening = async () => {
  await SpeechRecognition.stopListening();
};
// Returns the current permission state for recording audio.
const checkPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.checkPermissions();
  return recordAudio;
};
const requestPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.requestPermissions();
  return recordAudio;
};
const isAvailable = async () => {
  const result = await SpeechRecognition.isAvailable();
  return result.isAvailable;
};
const isListening = async () => {
  const result = await SpeechRecognition.isListening();
  return result.isListening;
};
// Note: on some Android devices this promise never resolves (see getLanguages()).
const getLanguages = async () => {
  const { languages } = await SpeechRecognition.getLanguages();
  return languages;
};
const addListeners = () => {
  SpeechRecognition.addListener('start', () => {
    console.log('Speech recognition started');
  });
  SpeechRecognition.addListener('end', () => {
    console.log('Speech recognition ended');
  });
  SpeechRecognition.addListener('error', (event) => {
    console.error('Speech recognition error:', event.message);
  });
  SpeechRecognition.addListener('partialResult', (event) => {
    console.log('Partial result:', event.result);
  });
  SpeechRecognition.addListener('result', (event) => {
    console.log('Final result:', event.result);
  });
  // speechStart and speechEnd are only emitted on Android and Web.
  SpeechRecognition.addListener('speechStart', () => {
    console.log('User started speaking');
  });
  SpeechRecognition.addListener('speechEnd', () => {
    console.log('User stopped speaking');
  });
};
const removeAllListeners = async () => {
  await SpeechRecognition.removeAllListeners();
};
- getLanguages()
- isAvailable()
- isListening()
- startListening(...)
- stopListening()
- checkPermissions()
- requestPermissions()
- addListener('end', ...)
- addListener('error', ...)
- addListener('partialResult', ...)
- addListener('result', ...)
- addListener('speechEnd', ...)
- addListener('speechStart', ...)
- addListener('start', ...)
- removeAllListeners()
- Interfaces
- Type Aliases
getLanguages() => Promise<GetLanguagesResult>
Get the available languages for speech recognition.
Attention: On Android, this method is unfortunately not supported by all devices. If the method is not supported, the promise will never resolve. It's recommended to set a timeout for the promise.
Only available on Android and iOS.
Returns: Promise<GetLanguagesResult>
Since: 6.0.0
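The timeout recommended above could be implemented roughly as follows. This is a minimal sketch: `withTimeout` is a hypothetical helper that is not part of the plugin, and the 5000 ms value in the usage note is an arbitrary assumption.

```typescript
// Hypothetical helper (not part of the plugin): rejects if `promise` has not
// settled within `ms` milliseconds, otherwise passes its outcome through.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`Timed out after ${ms} ms`));
    }, ms);
    promise.then(
      (value) => {
        clearTimeout(timer); // Settled in time: cancel the pending timeout.
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

With the plugin imported, a call might look like `await withTimeout(SpeechRecognition.getLanguages(), 5000)`, wrapped in a try/catch that falls back to a default language when the timeout fires.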
isAvailable() => Promise<IsAvailableResult>
Check if the speech recognizer is available on the device.
Returns: Promise<IsAvailableResult>
Since: 6.0.0
isListening() => Promise<IsListeningResult>
Check if the speech recognizer is currently listening.
Returns: Promise<IsListeningResult>
Since: 6.0.0
startListening(options?: StartListeningOptions | undefined) => Promise<void>
Start listening for speech.
Param | Type |
---|---|
options | StartListeningOptions |
Since: 6.0.0
stopListening() => Promise<void>
Stop listening for speech.
Since: 6.0.0
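For a single microphone button, isListening(), startListening(...) and stopListening() can be combined into a toggle. The sketch below is illustrative only: `toggleListening`, `ListeningApi` and `ListenOptions` are hypothetical names, and the option fields are assumed to be optional.

```typescript
// Local stand-in for the StartListeningOptions shape documented here
// (assumption: both fields are optional).
interface ListenOptions {
  language?: string;
  silenceThreshold?: number;
}

// Assumed subset of the plugin API, so the helper also works with a test stub.
interface ListeningApi {
  isListening(): Promise<{ isListening: boolean }>;
  startListening(options?: ListenOptions): Promise<void>;
  stopListening(): Promise<void>;
}

// Hypothetical helper: stops the recognizer if it is listening, otherwise
// starts it. Resolves to true if this call started the recognizer.
async function toggleListening(api: ListeningApi, options?: ListenOptions): Promise<boolean> {
  const { isListening } = await api.isListening();
  if (isListening) {
    await api.stopListening();
    return false;
  }
  await api.startListening(options);
  return true;
}
```

Because the helper only depends on the three methods it calls, the imported `SpeechRecognition` object should be usable as the `api` argument.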
checkPermissions() => Promise<PermissionStatus>
Check permissions for the plugin.
Returns: Promise<PermissionStatus>
Since: 6.0.0
requestPermissions() => Promise<PermissionStatus>
Request permissions for the plugin.
Returns: Promise<PermissionStatus>
Since: 6.0.0
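checkPermissions() and requestPermissions() are typically used together: check first, and only prompt when the permission has not been granted yet. A minimal sketch of that flow follows; `ensureRecordAudioPermission`, `PermissionApi`, `RecordAudioState` and `RecordAudioStatus` are hypothetical names introduced for illustration.

```typescript
// Local stand-ins for the PermissionState and PermissionStatus types documented
// here (renamed to avoid clashing with DOM globals of the same name).
type RecordAudioState = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';
interface RecordAudioStatus {
  recordAudio: RecordAudioState;
}

// Assumed subset of the plugin API, so the helper also works with a test stub.
interface PermissionApi {
  checkPermissions(): Promise<RecordAudioStatus>;
  requestPermissions(): Promise<RecordAudioStatus>;
}

// Hypothetical helper: resolves to true once recording audio is permitted.
async function ensureRecordAudioPermission(api: PermissionApi): Promise<boolean> {
  const current = await api.checkPermissions();
  if (current.recordAudio === 'granted') {
    return true; // Already granted, no prompt needed.
  }
  const requested = await api.requestPermissions();
  return requested.recordAudio === 'granted';
}
```

A caller would await this helper before startListening(...) and show its own explanation to the user when it resolves to false.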
addListener(eventName: 'end', listenerFunc: () => void) => Promise<PluginListenerHandle>
Called when the speech recognizer has stopped listening.
Param | Type |
---|---|
eventName | 'end' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>
Called when an error occurs.
Param | Type |
---|---|
eventName | 'error' |
listenerFunc | (event: ErrorEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener(eventName: 'partialResult', listenerFunc: (event: PartialResultEvent) => void) => Promise<PluginListenerHandle>
Called when a partial result is available.
Param | Type |
---|---|
eventName | 'partialResult' |
listenerFunc | (event: PartialResultEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener(eventName: 'result', listenerFunc: (event: ResultEvent) => void) => Promise<PluginListenerHandle>
Called when the final results are available.
Param | Type |
---|---|
eventName | 'result' |
listenerFunc | (event: ResultEvent) => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
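One way to combine the partialResult and result events is to show partial text while the user is speaking and replace it with the final text once it arrives. The sketch below assumes that each partial result carries the full text recognized so far for the current utterance, which this document does not state explicitly; `Transcript` and `TextEvent` are hypothetical names.

```typescript
// Local stand-in for the PartialResultEvent and ResultEvent shapes documented here.
interface TextEvent {
  result: string;
}

// Hypothetical view model: `committed` holds finished utterances,
// `pending` holds the latest partial result of the current utterance.
class Transcript {
  private committed: string[] = [];
  private pending = '';

  // Call from the 'partialResult' listener. Assumption: the new partial
  // replaces the previous one instead of being appended to it.
  onPartialResult(event: TextEvent): void {
    this.pending = event.result;
  }

  // Call from the 'result' listener: the final text supersedes the partial.
  onResult(event: TextEvent): void {
    const finalText = event.result.trim();
    if (finalText !== '') {
      this.committed.push(finalText);
    }
    this.pending = '';
  }

  // Text to display: finished utterances followed by the current partial.
  text(): string {
    return this.committed
      .concat(this.pending)
      .filter((part) => part !== '')
      .join(' ');
  }
}
```

The two methods can be passed to addListener('partialResult', ...) and addListener('result', ...) as arrow functions, for example `(event) => transcript.onResult(event)`.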
addListener(eventName: 'speechEnd', listenerFunc: () => void) => Promise<PluginListenerHandle>
Called when the user has stopped speaking.
Only available on Android and Web.
Param | Type |
---|---|
eventName | 'speechEnd' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener(eventName: 'speechStart', listenerFunc: () => void) => Promise<PluginListenerHandle>
Called when the user has started to speak.
Only available on Android and Web.
Param | Type |
---|---|
eventName | 'speechStart' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
addListener(eventName: 'start', listenerFunc: () => void) => Promise<PluginListenerHandle>
Called when the speech recognizer has started listening.
Param | Type |
---|---|
eventName | 'start' |
listenerFunc | () => void |
Returns: Promise<PluginListenerHandle>
Since: 6.0.0
removeAllListeners() => Promise<void>
Remove all listeners for this plugin.
Since: 6.0.0
Interfaces
GetLanguagesResult
Prop | Type | Description | Since |
---|---|---|---|
languages | string[] | The supported languages for speech recognition as BCP-47 language tags. | 6.0.0 |
IsAvailableResult
Prop | Type | Description | Since |
---|---|---|---|
isAvailable | boolean | Whether or not the speech recognizer is available on the device. | 6.0.0 |
IsListeningResult
Prop | Type | Description | Since |
---|---|---|---|
isListening | boolean | Whether or not the speech recognizer is currently listening. | 6.0.0 |
StartListeningOptions
Prop | Type | Description | Default | Since |
---|---|---|---|---|
language | string | The BCP-47 language tag for the language to use for speech recognition. | | 6.0.0 |
silenceThreshold | number | The number of milliseconds of silence before the speech recognition ends. Only available on Android (SDK 33+) and iOS. | 2000 | 6.0.0 |
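Since the language option expects a BCP-47 tag, it can help to match the user's preferred tag against the list returned by getLanguages() before starting. The following is a rough sketch: `pickLanguage` is a hypothetical helper, and the fallback to the primary language subtag is a design assumption, not plugin behavior.

```typescript
// Hypothetical helper: picks a value for the `language` option from the
// tags returned by getLanguages(). Returns undefined if nothing matches.
function pickLanguage(supported: string[], preferred: string): string | undefined {
  const wanted = preferred.toLowerCase();
  // 1. Exact match, ignoring case (BCP-47 tags are case-insensitive).
  const exact = supported.find((tag) => tag.toLowerCase() === wanted);
  if (exact !== undefined) {
    return exact;
  }
  // 2. Otherwise, the first tag with the same primary language subtag (e.g. "de").
  const primary = wanted.split('-')[0];
  return supported.find((tag) => tag.toLowerCase().split('-')[0] === primary);
}
```

When the helper returns undefined, a caller could omit the language option or ask the user to choose from the supported list.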
PermissionStatus
Prop | Type | Description | Since |
---|---|---|---|
recordAudio | PermissionState | Permission state for recording audio. | 6.0.0 |
PluginListenerHandle
Prop | Type |
---|---|
remove | () => Promise<void> |
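The handle returned by addListener(...) allows removing a single listener instead of calling removeAllListeners(). A small sketch of how handles might be collected and released together, for example when a screen is closed, is shown below; `ListenerGroup` and `ListenerHandle` are hypothetical names.

```typescript
// Local stand-in for the PluginListenerHandle shape documented here.
interface ListenerHandle {
  remove: () => Promise<void>;
}

// Hypothetical helper: collects listener handles and removes them together.
class ListenerGroup {
  private handles: ListenerHandle[] = [];

  // Accepts the promise returned by addListener(...) and stores its handle.
  async add(pending: Promise<ListenerHandle>): Promise<void> {
    this.handles.push(await pending);
  }

  // Removes only the listeners registered through this group.
  async removeAll(): Promise<void> {
    const handles = this.handles;
    this.handles = []; // Reset first so a repeated call does not remove twice.
    await Promise.all(handles.map((handle) => handle.remove()));
  }
}
```

Usage could look like `await group.add(SpeechRecognition.addListener('result', onResult))` followed later by `await group.removeAll()`, which leaves listeners registered elsewhere in the app untouched.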
ErrorEvent
Prop | Type | Description | Since |
---|---|---|---|
message | string | The error message. | 6.0.0 |
PartialResultEvent
Prop | Type | Description | Since |
---|---|---|---|
result | string | The partial result of the speech recognition. | 6.0.0 |
ResultEvent
Prop | Type | Description | Since |
---|---|---|---|
result | string | The final result of the speech recognition. | 6.0.0 |
Type Aliases
PermissionState
'prompt' | 'prompt-with-rationale' | 'granted' | 'denied'
See CHANGELOG.md.
See LICENSE.