@capawesome-team/capacitor-speech-recognition

Capacitor plugin to transcribe speech into text.

Features

  • 🖥️ Cross-platform: Supports Android, iOS and Web.
  • 🌐 Multiple Languages: Supports many different languages.
  • 🛠 Permissions: Check and request permissions for recording audio.
  • 🎙 Events: Listen for events like start, end, speechStart, speechEnd, error, partialResult, and result.
  • 🔇 Silence Detection: Automatically detects silence to stop the recording.
  • 📊 Silence Threshold: Define what's considered "silence" for your recordings.
  • 🔁 Up-to-date: Always supports the latest Capacitor version.
  • ⭐️ Support: First-class support from the Capawesome Team.

Installation

This plugin is only available to Capawesome Insiders. First, make sure you have the Capawesome npm registry set up. You can do this by running the following commands:

npm config set @capawesome-team:registry https://npm.registry.capawesome.io
npm config set //npm.registry.capawesome.io/:_authToken <YOUR_LICENSE_KEY>

Attention: Replace <YOUR_LICENSE_KEY> with the license key you received from Polar. If you don't have a license key yet, you can get one by becoming a Capawesome Insider.

Next, install the package:

npm install @capawesome-team/capacitor-speech-recognition
npx cap sync

iOS

Privacy Descriptions

Add the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription keys to the ios/App/App/Info.plist file. They tell the user why your app is requesting access to speech recognition and the microphone:

<key>NSSpeechRecognitionUsageDescription</key>
<string>Speech recognition is used to transcribe speech into text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>Microphone is used to record audio for speech recognition.</string>

Configuration

No configuration required for this plugin.

Usage

import { SpeechRecognition } from '@capawesome-team/capacitor-speech-recognition';

const startListening = async () => {
  await SpeechRecognition.startListening({
    language: 'en-US',
    silenceThreshold: 2000,
  });
};

const stopListening = async () => {
  await SpeechRecognition.stopListening();
};

const checkPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.checkPermissions();
  return recordAudio;
};

const requestPermissions = async () => {
  const { recordAudio } = await SpeechRecognition.requestPermissions();
  return recordAudio;
};

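const isAvailable = async () => {
  // The result property is named `isAvailable` (see IsAvailableResult).
  const result = await SpeechRecognition.isAvailable();
  return result.isAvailable;
};

const isListening = async () => {
  // The result property is named `isListening` (see IsListeningResult).
  const result = await SpeechRecognition.isListening();
  return result.isListening;
};

const getLanguages = async () => {
  // On some Android devices this promise never resolves; consider a timeout.
  const { languages } = await SpeechRecognition.getLanguages();
  return languages;
};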

const addListeners = () => {
  SpeechRecognition.addListener('start', () => {
    console.log('Speech recognition started');
  });
  SpeechRecognition.addListener('end', () => {
    console.log('Speech recognition ended');
  });
  SpeechRecognition.addListener('error', (event) => {
    console.error('Speech recognition error:', event.message);
  });
  SpeechRecognition.addListener('partialResult', (event) => {
    console.log('Partial result:', event.result);
  });
  SpeechRecognition.addListener('result', (event) => {
    console.log('Final result:', event.result);
  });
  SpeechRecognition.addListener('speechStart', () => {
    console.log('User started speaking');
  });
  SpeechRecognition.addListener('speechEnd', () => {
    console.log('User stopped speaking');
  });
};

const removeAllListeners = async () => {
  await SpeechRecognition.removeAllListeners();
};
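
The individual calls above are usually combined into one flow: check availability, check the permission, request it if needed, then start listening. The following is a minimal sketch of that flow, not the plugin's prescribed usage. `SpeechRecognitionLike` and `startIfPermitted` are hypothetical names introduced here; the interface only mirrors the subset of methods documented in the API section so the example stays self-contained.

```typescript
type PermissionState = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';

// Hypothetical structural type: only the plugin methods this sketch needs.
interface SpeechRecognitionLike {
  isAvailable(): Promise<{ isAvailable: boolean }>;
  checkPermissions(): Promise<{ recordAudio: PermissionState }>;
  requestPermissions(): Promise<{ recordAudio: PermissionState }>;
  startListening(options?: { language?: string; silenceThreshold?: number }): Promise<void>;
}

// Hypothetical helper: resolves to true only if listening was actually started.
async function startIfPermitted(
  plugin: SpeechRecognitionLike,
  language: string,
): Promise<boolean> {
  const { isAvailable } = await plugin.isAvailable();
  if (!isAvailable) return false;

  // Ask for the permission only when it has not been granted yet.
  let status = await plugin.checkPermissions();
  if (status.recordAudio !== 'granted') {
    status = await plugin.requestPermissions();
  }
  if (status.recordAudio !== 'granted') return false;

  await plugin.startListening({ language });
  return true;
}
```

In an app, the imported `SpeechRecognition` object would be passed as `plugin`, assuming its typings match the signatures listed in the API section.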

API

getLanguages()

getLanguages() => Promise<GetLanguagesResult>

Get the available languages for speech recognition.

Attention: On Android, this method is unfortunately not supported by all devices. If the method is not supported, the promise will never resolve. It's recommended to set a timeout for the promise.
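
One way to apply the timeout recommended above is to wrap the promise in a small helper. This is a sketch under assumptions: `withTimeout` is a hypothetical helper and not part of the plugin, and the timeout duration is up to the caller.

```typescript
// Hypothetical helper (not part of the plugin): rejects if `promise`
// has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms,
    );
    // Whichever happens first wins; the timer is cleared once the promise settles.
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A call such as `withTimeout(SpeechRecognition.getLanguages(), 5000)` would then reject after five seconds on devices where the method never resolves. The 5000 ms value is only an example.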

Only available on Android and iOS.

Returns: Promise<GetLanguagesResult>

Since: 6.0.0


isAvailable()

isAvailable() => Promise<IsAvailableResult>

Check if the speech recognizer is available on the device.

Returns: Promise<IsAvailableResult>

Since: 6.0.0


isListening()

isListening() => Promise<IsListeningResult>

Check if the speech recognizer is currently listening.

Returns: Promise<IsListeningResult>

Since: 6.0.0


startListening(...)

startListening(options?: StartListeningOptions | undefined) => Promise<void>

Start listening for speech.

Param Type
options StartListeningOptions

Since: 6.0.0


stopListening()

stopListening() => Promise<void>

Stop listening for speech.

Since: 6.0.0
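
The methods isListening(), startListening(...) and stopListening() can be combined into a single toggle, for example behind a microphone button. This is a minimal sketch: `ListeningControls` and `toggleListening` are hypothetical names, and the interface only mirrors the three signatures documented above.

```typescript
// Hypothetical structural type: only the plugin methods this sketch needs.
interface ListeningControls {
  isListening(): Promise<{ isListening: boolean }>;
  startListening(options?: { language?: string }): Promise<void>;
  stopListening(): Promise<void>;
}

// Hypothetical helper: resolves to true if the recognizer is listening afterwards.
async function toggleListening(
  plugin: ListeningControls,
  language: string,
): Promise<boolean> {
  const { isListening } = await plugin.isListening();
  if (isListening) {
    await plugin.stopListening();
    return false;
  }
  await plugin.startListening({ language });
  return true;
}
```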


checkPermissions()

checkPermissions() => Promise<PermissionStatus>

Check permissions for the plugin.

Returns: Promise<PermissionStatus>

Since: 6.0.0


requestPermissions()

requestPermissions() => Promise<PermissionStatus>

Request permissions for the plugin.

Returns: Promise<PermissionStatus>

Since: 6.0.0


addListener('end', ...)

addListener(eventName: 'end', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the speech recognizer has stopped listening.

Param Type
eventName 'end'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('error', ...)

addListener(eventName: 'error', listenerFunc: (event: ErrorEvent) => void) => Promise<PluginListenerHandle>

Called when an error occurs.

Param Type
eventName 'error'
listenerFunc (event: ErrorEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('partialResult', ...)

addListener(eventName: 'partialResult', listenerFunc: (event: PartialResultEvent) => void) => Promise<PluginListenerHandle>

Called when a partial result is available.

Param Type
eventName 'partialResult'
listenerFunc (event: PartialResultEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('result', ...)

addListener(eventName: 'result', listenerFunc: (event: ResultEvent) => void) => Promise<PluginListenerHandle>

Called when the final results are available.
The event carries a single final result in its result property (see ResultEvent).

Param Type
eventName 'result'
listenerFunc (event: ResultEvent) => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('speechEnd', ...)

addListener(eventName: 'speechEnd', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the user has stopped speaking.

Only available on Android and Web.

Param Type
eventName 'speechEnd'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('speechStart', ...)

addListener(eventName: 'speechStart', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the user has started to speak.

Only available on Android and Web.

Param Type
eventName 'speechStart'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


addListener('start', ...)

addListener(eventName: 'start', listenerFunc: () => void) => Promise<PluginListenerHandle>

Called when the speech recognizer has started listening.

Param Type
eventName 'start'
listenerFunc () => void

Returns: Promise<PluginListenerHandle>

Since: 6.0.0


removeAllListeners()

removeAllListeners() => Promise<void>

Remove all listeners for this plugin.

Since: 6.0.0


Interfaces

GetLanguagesResult

Prop Type Description Since
languages string[] The supported languages for speech recognition as BCP-47 language tags. 6.0.0

IsAvailableResult

Prop Type Description Since
isAvailable boolean Whether or not the speech recognizer is available on the device. 6.0.0

IsListeningResult

Prop Type Description Since
isListening boolean Whether or not the speech recognizer is currently listening. 6.0.0

StartListeningOptions

Prop Type Description Default Since
language string The BCP-47 language tag for the language to use for speech recognition. 6.0.0
silenceThreshold number The number of milliseconds of silence before the speech recognition ends. Only available on Android (SDK 33+) and iOS. 2000 6.0.0
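
Because `language` is a BCP-47 tag and getLanguages() returns the supported tags in the same format, an app may want to choose the closest supported tag before calling startListening(...). The sketch below is one simple strategy, not something the plugin does itself. `pickLanguage` is a hypothetical helper that tries an exact, case-insensitive match first and then falls back to any tag with the same primary language subtag.

```typescript
// Hypothetical helper: returns the best supported tag for `preferred`,
// or undefined if no tag shares its primary language subtag.
function pickLanguage(preferred: string, supported: string[]): string | undefined {
  const wanted = preferred.toLowerCase();

  // 1. Exact match, ignoring case (for example 'de-de' matches 'de-DE').
  const exact = supported.find((tag) => tag.toLowerCase() === wanted);
  if (exact !== undefined) return exact;

  // 2. Same primary subtag (for example 'en-GB' falls back to 'en-US').
  const primary = wanted.split('-')[0];
  return supported.find((tag) => tag.toLowerCase().split('-')[0] === primary);
}
```

If the helper returns undefined, the app can omit the `language` option or tell the user that the language is not supported.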

PermissionStatus

Prop Type Description Since
recordAudio PermissionState Permission state for recording audio. 6.0.0

PluginListenerHandle

Prop Type
remove () => Promise<void>
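
The handle returned by addListener(...) allows removing one specific listener, whereas removeAllListeners() removes every listener for this plugin. The following sketch registers the two transcript listeners and returns a single function that detaches exactly those two. `TranscriptSource` and `attachTranscriptListeners` are hypothetical names, and the interface only mirrors the `partialResult` and `result` overloads documented above.

```typescript
interface PluginListenerHandle {
  remove: () => Promise<void>;
}

// Hypothetical structural type: only the two overloads this sketch needs.
interface TranscriptSource {
  addListener(
    eventName: 'partialResult',
    listenerFunc: (event: { result: string }) => void,
  ): Promise<PluginListenerHandle>;
  addListener(
    eventName: 'result',
    listenerFunc: (event: { result: string }) => void,
  ): Promise<PluginListenerHandle>;
}

// Hypothetical helper: resolves to a function that removes only the
// listeners registered here, leaving any other listeners untouched.
async function attachTranscriptListeners(
  source: TranscriptSource,
  onText: (text: string, isFinal: boolean) => void,
): Promise<() => Promise<void>> {
  const handles = await Promise.all([
    source.addListener('partialResult', (event) => onText(event.result, false)),
    source.addListener('result', (event) => onText(event.result, true)),
  ]);
  return async () => {
    await Promise.all(handles.map((handle) => handle.remove()));
  };
}
```

Keeping the returned function around, for example in a component's teardown hook, avoids accidentally removing listeners that other parts of the app registered.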

ErrorEvent

Prop Type Description Since
message string The error message. 6.0.0

PartialResultEvent

Prop Type Description Since
result string The partial result of the speech recognition. 6.0.0

ResultEvent

Prop Type Description Since
result string The final result of the speech recognition. 6.0.0

Type Aliases

PermissionState

'prompt' | 'prompt-with-rationale' | 'granted' | 'denied'
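
An app typically reacts differently to each permission state. The sketch below maps every state to a next step with an exhaustive switch, so the compiler flags any state that is not handled. `NextStep` and `nextStepFor` are hypothetical names, and the mapping reflects a common interpretation of these states (show an explanation before asking again on `prompt-with-rationale`, send the user to the system settings on `denied`), not behavior defined by this plugin.

```typescript
type PermissionState = 'prompt' | 'prompt-with-rationale' | 'granted' | 'denied';

// Hypothetical labels for what the app should do next.
type NextStep = 'start' | 'request' | 'explain-then-request' | 'open-settings';

function nextStepFor(state: PermissionState): NextStep {
  switch (state) {
    case 'granted':
      return 'start';
    case 'prompt':
      return 'request';
    case 'prompt-with-rationale':
      // Assumption: explain why the microphone is needed before asking again.
      return 'explain-then-request';
    case 'denied':
      // Assumption: another request would not show a dialog, so guide the user to settings.
      return 'open-settings';
  }
}
```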

Changelog

See CHANGELOG.md.

License

See LICENSE.