Package net.idrnd.voicesdk.media
Class SpeechEndpointDetector
java.lang.Object
net.idrnd.voicesdk.common.VoiceSdkNativePeer
net.idrnd.voicesdk.media.SpeechEndpointDetector
- All Implemented Interfaces:
AutoCloseable
Detects the end of speech in an audio stream.
Supports streaming scenarios, where audio data arrives and is processed in consecutive buffers. The intended usage is as follows:
- initialize SpeechEndpointDetector
- call 1 or more addSamples(byte[]) methods with incoming audio data
- call isSpeechEnded()
  - if false, repeat
  - if true, stop processing and call VoiceSdkNativePeer.close()
This class serves as a gateway to the native Voice SDK implementation and allocates resources on the native heap.
To release the allocated memory, invoke AutoCloseable.close() when the instance is no longer needed.
Any method that delegates to a native call may throw VoiceSdkEngineException.
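The usage scenario above can be sketched as a simple loop. This is an illustration only: so that it compiles without the SDK on the classpath, it uses a hypothetical EndpointDetector interface and a toy CountingDetector, neither of which is part of the SDK. With the SDK available, a SpeechEndpointDetector built with the three documented int arguments would take the detector's place in the try-with-resources statement.

```java
import java.util.Iterator;

/** Hypothetical stand-in (not an SDK type) for the documented operations the loop needs. */
interface EndpointDetector extends AutoCloseable {
    void addSamples(short[] pcm16Samples);
    boolean isSpeechEnded();
    @Override
    void close(); // narrowed so that no checked exception has to be handled
}

final class EndpointLoop {
    private EndpointLoop() {}

    /**
     * Feeds chunks until the detector reports the end of speech or the
     * source is exhausted, and returns the number of chunks consumed.
     * try-with-resources closes the detector on every exit path.
     */
    static int feedUntilSpeechEnd(EndpointDetector detector, Iterator<short[]> chunks) {
        int consumed = 0;
        try (EndpointDetector d = detector) {
            while (chunks.hasNext()) {
                d.addSamples(chunks.next());
                consumed++;
                if (d.isSpeechEnded()) {
                    break; // speech end detected: stop processing
                }
            }
        }
        return consumed;
    }
}

/** Toy detector for demonstration only: reports the end after a fixed number of chunks. */
final class CountingDetector implements EndpointDetector {
    private final int endAfter;
    private int seen;
    boolean closed;

    CountingDetector(int endAfter) {
        this.endAfter = endAfter;
    }

    @Override
    public void addSamples(short[] pcm16Samples) {
        seen++;
    }

    @Override
    public boolean isSpeechEnded() {
        return seen >= endAfter;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

Using try-with-resources rather than an explicit close() call means the native resources are released even if one of the calls throws.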
-
Constructor Summary
Constructors
Constructor | Description
SpeechEndpointDetector(int minSpeechLengthMs, int maxSilenceLengthMs, int sampleRate) | Initializes speech endpoint detector.
-
Method Summary
Modifier and Type | Method | Description
void | addSamples(byte[] bytes) | Adds audio samples for processing in PCM16 format
void | addSamples(float[] floatSamples) | Adds audio samples for processing in normalized float format
void | addSamples(short[] pcm16Samples) | Adds audio samples for processing in PCM16 format
boolean | isSpeechEnded() | Checks if speech end is detected after the previous addSamples(byte[]) call
void | reset() | Resets detector, clearing all the accumulated statistics
Methods inherited from class net.idrnd.voicesdk.common.VoiceSdkNativePeer
close, equals, hashCode
-
Constructor Details
-
SpeechEndpointDetector
public SpeechEndpointDetector(int minSpeechLengthMs, int maxSilenceLengthMs, int sampleRate)
Initializes speech endpoint detector.
- Parameters:
minSpeechLengthMs - the threshold for required accumulated speech duration
maxSilenceLengthMs - the threshold for the duration of continuous silence that triggers the end detection if the required amount of speech is accumulated
sampleRate - sample rate of incoming audio data stream
- Throws:
VoiceSdkEngineException - wraps native exceptions
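When choosing buffer sizes, it can help to relate the millisecond thresholds to sample counts. The sketch below is plain arithmetic, not SDK behavior: the DurationMath helper and its method names are hypothetical, and it says nothing about how the detector counts speech or silence internally.

```java
final class DurationMath {
    private DurationMath() {}

    /** Number of samples that span the given duration at the given sample rate. */
    static long millisToSamples(int millis, int sampleRate) {
        // Widen to long before multiplying to avoid int overflow.
        return (long) millis * sampleRate / 1000L;
    }

    /** Number of fixed-size chunks needed to cover the duration (ceiling division). */
    static long chunksToCover(int millis, int sampleRate, int chunkSamples) {
        long samples = millisToSamples(millis, sampleRate);
        return (samples + chunkSamples - 1) / chunkSamples;
    }
}
```

For example, a 500 ms silence threshold on a 16000 Hz stream spans 8000 samples, which is 25 chunks of 320 samples each.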
-
-
Method Details
-
reset
public void reset()
Resets detector, clearing all the accumulated statistics
- Throws:
VoiceSdkEngineException - wraps native exceptions
-
addSamples
public void addSamples(byte[] bytes)
Adds audio samples for processing in PCM16 format
- Parameters:
bytes - Array of little-endian PCM16 audio bytes
- Throws:
VoiceSdkEngineException - wraps native exceptions
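If the audio is available as 16-bit samples but a byte array is needed, the little-endian layout can be produced with java.nio.ByteBuffer. This is a minimal sketch; the Pcm16Bytes helper is hypothetical and not part of the SDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class Pcm16Bytes {
    private Pcm16Bytes() {}

    /** Packs 16-bit samples into bytes, low byte first (little-endian). */
    static byte[] toLittleEndianBytes(short[] samples) {
        ByteBuffer buffer = ByteBuffer.allocate(samples.length * 2)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (short sample : samples) {
            buffer.putShort(sample);
        }
        return buffer.array();
    }
}
```

Setting the byte order explicitly matters because ByteBuffer defaults to big-endian, which would swap the two bytes of every sample.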
-
addSamples
public void addSamples(short[] pcm16Samples)
Adds audio samples for processing in PCM16 format
- Parameters:
pcm16Samples - Array of PCM16 audio samples
- Throws:
VoiceSdkEngineException - wraps native exceptions
-
addSamples
public void addSamples(float[] floatSamples)
Adds audio samples for processing, encoded in normalized float format
- Parameters:
floatSamples - Array of float audio samples (in [-1, 1] range)
- Throws:
VoiceSdkEngineException - wraps native exceptions
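One common way to obtain floats in the [-1, 1] range from PCM16 data is to divide each sample by 32768. The sketch below assumes that scaling convention, which this page does not specify; the Pcm16ToFloat helper is hypothetical and not part of the SDK.

```java
final class Pcm16ToFloat {
    private Pcm16ToFloat() {}

    /** Scales 16-bit samples to floats; the result lies in [-1.0, 1.0). */
    static float[] normalize(short[] pcm16Samples) {
        float[] normalized = new float[pcm16Samples.length];
        for (int i = 0; i < pcm16Samples.length; i++) {
            // Assumed convention: full scale is 32768, so -32768 maps to -1.0.
            normalized[i] = pcm16Samples[i] / 32768.0f;
        }
        return normalized;
    }
}
```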
-
isSpeechEnded
public boolean isSpeechEnded()
Checks if speech end is detected after the previous addSamples(byte[]) call
- Returns:
true if speech end is detected
-