Algorithm configuration

IDVoice Server component set

IDVoice Server contains several components that solve different tasks, and some of them are rarely used together. You can choose which components the server loads, which reduces memory usage and removes unneeded API endpoints. Each component is enabled or disabled through a boolean environment variable of the form IDRND_VOICESDK_COMPONENTS_<COMPONENT>; an example follows the table below.

The table below lists the variables that configure the component set for the basic IDVoice Server distribution.

Variable name                         Description
IDRND_VOICESDK_COMPONENTS_LIVENESS    Voice liveness
IDRND_VOICESDK_COMPONENTS_VERIFY      Voice verification
IDRND_VOICESDK_COMPONENTS_MEDIA       Speech and signal processing utilities

Launching the server with a specific IDVoice SDK component set:

$ docker run -d --name vrss --publish 8080:8080 \
             -e IDRND_VOICESDK_COMPONENTS_<COMPONENT>=true|false \
             voicesdk-server:3.0
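
As a concrete illustration of the pattern above, the following sketch builds the -e flags for a chosen set of components. The helper name component_flags is hypothetical, and explicitly setting every unlisted component to false is an assumption made for clarity; how the server treats unset variables is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: prints "-e" flags that enable the components named
# in its arguments and disable the remaining ones from the table above.
component_flags() {
  flags=""
  for c in LIVENESS VERIFY MEDIA; do
    case " $* " in
      *" $c "*) value=true ;;
      *)        value=false ;;
    esac
    flags="$flags -e IDRND_VOICESDK_COMPONENTS_$c=$value"
  done
  # Drop the leading space before printing.
  printf '%s\n' "${flags# }"
}

# Example: a verification-only server. The command is echoed, not executed;
# remove "echo" to launch the container.
echo docker run -d --name vrss --publish 8080:8080 \
     $(component_flags VERIFY) \
     voicesdk-server:3.0
```

Echoing the command first makes it easy to review the resulting flags before starting a container.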

Voice verification component

Run multiple verification algorithms

The IDVoice voice verification component ships with a set of configurations covering the most common use cases and scenarios. Two environment variables control them: IDRND_VOICESDK_VERIFY_INIT_DATAS lists the configurations available for selection, and IDRND_VOICESDK_VERIFY_DEFAULT_INIT_DATA names the configuration used when a request does not supply the configuration query parameter. At runtime, a request selects a configuration through that query parameter, e.g. /voice_template_factory/create_voice_template_from_file?configuration=mic-v2.

IDVoice speaker verification component configuration example:

$ docker run -d --name vrss --publish 8080:8080 \
             -e IDRND_VOICESDK_VERIFY_INIT_DATAS="mic-v2,other-configuration" \
             -e IDRND_VOICESDK_VERIFY_DEFAULT_INIT_DATA="mic-v2" \
             voicesdk-server:3.0

Available configurations:

  • mic-v2 - init data for text-independent and text-dependent speaker verification in the microphone channel

The default configuration is text-independent speaker verification in the microphone channel (mic-v2).
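
Runtime selection through the configuration query parameter can be sketched as follows. The helper name endpoint_url is hypothetical, and the base URL assumes the server is reachable on localhost:8080 as published in the docker command above. The HTTP method and request body are not shown because this section does not specify them.

```shell
#!/bin/sh
# Hypothetical helper: builds a request URL that selects a configuration
# through the "configuration" query parameter.
# $1 = base URL, $2 = endpoint path, $3 = configuration name (optional)
endpoint_url() {
  if [ -n "${3:-}" ]; then
    printf '%s%s?configuration=%s\n' "$1" "$2" "$3"
  else
    # Without the parameter the server uses its default configuration.
    printf '%s%s\n' "$1" "$2"
  fi
}

# Assumption: the server listens on localhost:8080 (--publish 8080:8080).
endpoint_url "http://localhost:8080" \
  "/voice_template_factory/create_voice_template_from_file" "mic-v2"
```

Omitting the third argument yields the plain endpoint URL, which corresponds to a request served with the configuration named in IDRND_VOICESDK_VERIFY_DEFAULT_INIT_DATA.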

Enable voice template compression

Voice template compression produces smaller binary representations of voice templates. Enable it by setting the VOICESDK_USE_VOICE_TEMPLATE_COMPRESSION environment variable:

$ docker run -d --name vrss --publish 8080:8080 \
             -e VOICESDK_USE_VOICE_TEMPLATE_COMPRESSION=1 \
             voicesdk-server:3.0

Liveness component

Run multiple liveness algorithms

The voice liveness component ships with a set of configurations covering different attack vectors. Two environment variables control them: IDRND_VOICESDK_LIVENESS_INIT_DATAS lists the configurations available for selection, and IDRND_VOICESDK_LIVENESS_DEFAULT_INIT_DATA names the configuration used when a request does not supply the configuration query parameter. At runtime, a request selects a configuration through that query parameter, e.g. /liveness_engine/check_liveness_file?configuration=voice-clones.

IDVoice liveness component configuration example:

$ docker run -d --name vrss --publish 8080:8080 \
             -e IDRND_VOICESDK_LIVENESS_INIT_DATAS="replay,voice-clones" \
             -e IDRND_VOICESDK_LIVENESS_DEFAULT_INIT_DATA="replay" \
             voicesdk-server:3.13.0

Available configurations:

  • replay - an algorithm for replay attack detection
  • voice-clones - an algorithm for voice cloning attack detection

Both algorithms are enabled by default, and the default configuration is replay.
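
A small pre-flight check can catch a mismatch between the two liveness variables before the container starts. The sketch below assumes that the default configuration should be one of the listed configurations; this section does not state how the server behaves otherwise. The helper name default_is_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds only if the default configuration
# ($2) appears in the comma-separated list of enabled configurations ($1).
default_is_enabled() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

INIT_DATAS="replay,voice-clones"
DEFAULT_INIT_DATA="replay"

if default_is_enabled "$INIT_DATAS" "$DEFAULT_INIT_DATA"; then
  # Echoed rather than executed; remove "echo" to launch the container.
  echo docker run -d --name vrss --publish 8080:8080 \
       -e IDRND_VOICESDK_LIVENESS_INIT_DATAS="$INIT_DATAS" \
       -e IDRND_VOICESDK_LIVENESS_DEFAULT_INIT_DATA="$DEFAULT_INIT_DATA" \
       voicesdk-server:3.13.0
else
  echo "default configuration '$DEFAULT_INIT_DATA' is not in '$INIT_DATAS'" >&2
fi
```

Wrapping the list in commas before matching ensures that only whole configuration names match, so a partial name such as voice is not accepted for voice-clones.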