Call Center SDK  1.11.3
voicesdk::diar::DiarizationEngine Class Reference (abstract)

Diarization engine class (interface), intended to perform speaker diarization.

#include <voicesdk/diarization/diarization.h>

Public Types

using Ptr = std::shared_ptr<DiarizationEngine>
 

Public Member Functions

virtual TimeStamps GetSegmentation (const float *float_samples, const size_t samples_num, const size_t sample_rate, const size_t num_speakers=0)=0
 Performs speaker diarization from the given float (in [-1; 1] range) audio samples.
 
virtual TimeStamps GetSegmentation (const int16_t *pcm16_samples, const size_t samples_num, const size_t sample_rate, const size_t num_speakers=0)=0
 Performs speaker diarization from the given PCM16 audio samples.
 
virtual TimeStamps GetSegmentation (const uint8_t *pcm16_bytes, const size_t bytes_num, const size_t sample_rate, const size_t num_speakers=0)=0
 Performs speaker diarization from the byte representation of PCM16 samples.
 
virtual TimeStamps GetSegmentation (const std::string &audio_path, const size_t num_speakers=0)=0
 Performs speaker diarization from the given audio file.
 

Static Public Member Functions

static DiarizationEngine::Ptr Create (const std::string &init_dir)
 Creates a DiarizationEngine instance.
 

Detailed Description

Diarization engine class (interface), intended to perform speaker diarization.

Member Function Documentation

◆ Create()

static DiarizationEngine::Ptr voicesdk::diar::DiarizationEngine::Create(const std::string &init_dir)
static

Creates a DiarizationEngine instance.

Parameters
init_dir: path to the initialization data directory
Returns
Smart pointer to the created DiarizationEngine instance
Exceptions
std::runtime_error: if a runtime error occurs
voicesdk::LicenseException: if a license error occurs
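
Because Create() reports failures by throwing, callers may want to turn those exceptions into an error message at the call site. The following is a minimal sketch, not part of the SDK: `TryCreate` is a hypothetical helper, and it is written as a template over any factory callable so that it compiles without the SDK headers. Since this page does not state the base class of voicesdk::LicenseException, the sketch falls back to a catch-all handler rather than assuming one.

```cpp
#include <exception>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical helper (not an SDK function): calls a factory that returns a
// smart pointer and converts any thrown exception into an error message.
// Returns a null pointer on failure and fills `error` with the reason.
template <typename Factory>
auto TryCreate(Factory factory, const std::string& init_dir, std::string& error)
    -> decltype(factory(init_dir)) {
    error.clear();
    try {
        return factory(init_dir);
    } catch (const std::exception& e) {
        // Covers std::runtime_error and anything else derived from std::exception.
        error = e.what();
    } catch (...) {
        // Assumption: the license exception type may not derive from std::exception.
        error = "unknown error";
    }
    return nullptr;
}
```

With the SDK header included, the address of the static factory, `&voicesdk::diar::DiarizationEngine::Create`, could presumably be passed as the `factory` argument.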

◆ GetSegmentation() [1/4]

virtual TimeStamps voicesdk::diar::DiarizationEngine::GetSegmentation(const float *float_samples, const size_t samples_num, const size_t sample_rate, const size_t num_speakers = 0)
pure virtual

Performs speaker diarization from the given float (in [-1; 1] range) audio samples.

Parameters
float_samples: pointer to the array of samples
samples_num: number of samples in the array
sample_rate: sample rate
num_speakers: optional; pass the number of speakers if it is known (e.g. 2 for a phone conversation) to increase diarization accuracy. If 0, the engine tries to determine the number of speakers itself, so accuracy can be lower.
Exceptions
std::runtime_error: if a runtime error occurs
voicesdk::LicenseException: if a license error occurs
Returns
timestamps of the speakers' utterances
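
The float overload expects samples in the [-1; 1] range, so 16-bit PCM data has to be rescaled before it is passed in. The sketch below shows one common way to do that. `Pcm16ToFloat` is a hypothetical helper, and dividing by 32768 is an assumption: this page states only the target range, not the exact scaling convention the engine expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an SDK function): converts signed 16-bit PCM
// samples to floats. Assumption: scaling by 1/32768, which maps the int16
// range [-32768, 32767] into [-1, 1).
std::vector<float> Pcm16ToFloat(const int16_t* pcm16_samples, std::size_t samples_num) {
    assert(pcm16_samples != nullptr || samples_num == 0);
    std::vector<float> out(samples_num);
    for (std::size_t i = 0; i < samples_num; ++i) {
        out[i] = static_cast<float>(pcm16_samples[i]) / 32768.0f;
    }
    return out;
}
```

The resulting vector's `data()` and `size()` would then serve as the `float_samples` and `samples_num` arguments.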

◆ GetSegmentation() [2/4]

virtual TimeStamps voicesdk::diar::DiarizationEngine::GetSegmentation(const int16_t *pcm16_samples, const size_t samples_num, const size_t sample_rate, const size_t num_speakers = 0)
pure virtual

Performs speaker diarization from the given PCM16 audio samples.

Parameters
pcm16_samples: pointer to the array of samples
samples_num: number of samples in the array
sample_rate: sample rate
num_speakers: optional; pass the number of speakers if it is known (e.g. 2 for a phone conversation) to increase diarization accuracy. If 0, the engine tries to determine the number of speakers itself, so accuracy can be lower.
Exceptions
std::runtime_error: if a runtime error occurs
voicesdk::LicenseException: if a license error occurs
Returns
timestamps of the speakers' utterances

◆ GetSegmentation() [3/4]

virtual TimeStamps voicesdk::diar::DiarizationEngine::GetSegmentation(const std::string &audio_path, const size_t num_speakers = 0)
pure virtual

Performs speaker diarization from the given audio file.

Parameters
audio_path: path to the audio file
num_speakers: optional; pass the number of speakers if it is known (e.g. 2 for a phone conversation) to increase diarization accuracy. If 0, the engine tries to determine the number of speakers itself, so accuracy can be lower.
Exceptions
std::runtime_error: if a runtime error occurs
voicesdk::LicenseException: if a license error occurs
Returns
timestamps of the speakers' utterances

◆ GetSegmentation() [4/4]

virtual TimeStamps voicesdk::diar::DiarizationEngine::GetSegmentation(const uint8_t *pcm16_bytes, const size_t bytes_num, const size_t sample_rate, const size_t num_speakers = 0)
pure virtual

Performs speaker diarization from the byte representation of PCM16 samples.

Parameters
pcm16_bytes: pointer to the array of bytes
bytes_num: number of bytes in the array
sample_rate: sample rate
num_speakers: optional; pass the number of speakers if it is known (e.g. 2 for a phone conversation) to increase diarization accuracy. If 0, the engine tries to determine the number of speakers itself, so accuracy can be lower.
Exceptions
std::runtime_error: if a runtime error occurs
voicesdk::LicenseException: if a license error occurs
Returns
timestamps of the speakers' utterances
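
For the byte-based overload, `bytes_num` counts bytes rather than samples, and each PCM16 sample occupies two bytes. The sketch below illustrates that relationship with two hypothetical helpers that are not part of the SDK. The little-endian byte order is an assumption (it is the layout used by typical WAV data); this page does not specify the byte order the engine expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an SDK function): packs int16 samples into bytes.
// Assumption: little-endian order, low byte first.
std::vector<uint8_t> Pcm16ToLittleEndianBytes(const std::vector<int16_t>& samples) {
    std::vector<uint8_t> bytes;
    bytes.reserve(samples.size() * 2);
    for (int16_t s : samples) {
        const uint16_t u = static_cast<uint16_t>(s);
        bytes.push_back(static_cast<uint8_t>(u & 0xFF));  // low byte
        bytes.push_back(static_cast<uint8_t>(u >> 8));    // high byte
    }
    return bytes;
}

// Hypothetical helper: number of 16-bit samples held in a byte buffer.
std::size_t SamplesInBytes(std::size_t bytes_num) {
    assert(bytes_num % 2 == 0);  // a PCM16 buffer holds whole 2-byte samples
    return bytes_num / 2;
}
```

The packed buffer's `data()` and `size()` would then correspond to the `pcm16_bytes` and `bytes_num` arguments.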