IDLive Voice (voice liveness)

Voice liveness scenarios

A typical scenario looks like this:

Voice liveness is meant to complement an existing automatic speaker verification (ASV) system. It is not suited to verifying that the speaker is the enrolled person; it only checks that the voice presented is genuine.

Voice liveness does not require any enrollment. It operates on any voice presented to it, which makes it well suited to run in parallel with an existing ASV system. The final decision is based on the scores from both voice liveness and ASV: if either score is low, the system should deny access.
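The decision rule described above can be sketched as follows. This is a minimal illustration and not part of the SDK: the `Thresholds` class, the `grant_access` function, and the 0.5 threshold values are hypothetical. It also assumes that both systems output scores where a higher value means "more likely genuine" (liveness) or "better match" (ASV).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical operating points; tune them on your own evaluation data.
    liveness: float = 0.5
    asv: float = 0.5


def grant_access(liveness_score: float, asv_score: float,
                 thresholds: Thresholds = Thresholds()) -> bool:
    """Grant access only if BOTH scores reach their thresholds."""
    return (liveness_score >= thresholds.liveness
            and asv_score >= thresholds.asv)


if __name__ == "__main__":
    print(grant_access(0.9, 0.8))  # both scores high: access granted
    print(grant_access(0.9, 0.2))  # ASV score low: access denied
    print(grant_access(0.1, 0.8))  # liveness score low: access denied
```

Keeping the two thresholds separate lets each system be tuned independently of the other.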

An ideal use case for this scenario is controlling physical access to a building or facility. In such circumstances the attacker cannot replace the microphone and has less time to tinker with the system and prepare a reliable voice conversion or replay attack.

Another good use case is TV parental controls that restrict the use of paid services and purchases.

A less ideal use case is a voice gateway in front of a telephone banking service to which remote customers connect using either a fixed (landline) or mobile (cellular) phone. In this case the microphone is not controlled by the authentication system, so an attacker can try adding noise or using lower frequencies and watch how the decision changes. With multiple phone numbers and a sample of the genuine voice of a real account holder, the attacker can turn the system into a test oracle.

False rejection and false acceptance

In voice biometrics there are two main types of error: "false acceptance" (FA) and "false rejection" (FR).

False acceptance occurs when a replayed or synthesized voice passes the liveness check. False rejection occurs when a valid speaker with a genuine voice fails the liveness check. Depending on the thresholds selected by the implementers, a system may operate with very low FA at the price of excessive FR, or with very low FR at the price of a large amount of FA.
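To make the trade-off concrete, here is a minimal sketch that computes FA and FR rates at several thresholds. The `error_rates` function and the score values are made up for illustration. It assumes that a higher score means "more likely genuine" and that a score at or above the threshold is accepted.

```python
from typing import Sequence, Tuple


def error_rates(genuine_scores: Sequence[float],
                spoof_scores: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FA rate, FR rate) when scores >= threshold are accepted."""
    if len(genuine_scores) == 0 or len(spoof_scores) == 0:
        raise ValueError("both score lists must be non-empty")
    # A spoofed sample that is accepted is a false acceptance.
    false_accepts = sum(s >= threshold for s in spoof_scores)
    # A genuine sample that is rejected is a false rejection.
    false_rejects = sum(s < threshold for s in genuine_scores)
    return (false_accepts / len(spoof_scores),
            false_rejects / len(genuine_scores))


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.7, 0.4]  # made-up scores of genuine voices
    spoof = [0.6, 0.3, 0.2, 0.1]    # made-up scores of replayed/synthesized voices
    for t in (0.25, 0.5, 0.75):
        fa, fr = error_rates(genuine, spoof, t)
        print(f"threshold={t:.2f}  FA={fa:.2f}  FR={fr:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.75 lowers the FA rate from 0.50 to 0.00 while raising the FR rate from 0.00 to 0.50.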

In many business scenarios the cost of an FA error is much higher than the cost of an FR error. For example, authentication that controls access to an important resource has a high FA cost, while an FR may simply prompt the user to try again or to use another means of gaining access.
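One possible way to turn this cost asymmetry into a threshold choice is to minimize a weighted sum of the two error rates on labeled evaluation data. The sketch below is only an illustration: the function names, the cost values, and the scores are hypothetical. It assumes that higher scores mean "more likely genuine" and it weights genuine and spoofed trials equally.

```python
from typing import Sequence


def expected_cost(genuine: Sequence[float], spoof: Sequence[float],
                  threshold: float, fa_cost: float, fr_cost: float) -> float:
    """Weighted sum of FA and FR rates when scores >= threshold are accepted."""
    fa_rate = sum(s >= threshold for s in spoof) / len(spoof)
    fr_rate = sum(s < threshold for s in genuine) / len(genuine)
    return fa_cost * fa_rate + fr_cost * fr_rate


def pick_threshold(genuine: Sequence[float], spoof: Sequence[float],
                   fa_cost: float, fr_cost: float) -> float:
    """Return the observed score that minimizes the expected cost.

    Ties are resolved in favor of the lowest candidate threshold.
    """
    candidates = sorted(set(genuine) | set(spoof))
    return min(candidates,
               key=lambda t: expected_cost(genuine, spoof, t, fa_cost, fr_cost))


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.7, 0.4]  # made-up scores of genuine voices
    spoof = [0.6, 0.3, 0.2, 0.1]    # made-up scores of spoofed voices
    # FA ten times as costly as FR: the chosen threshold is stricter.
    print(pick_threshold(genuine, spoof, fa_cost=10.0, fr_cost=1.0))
    # FR ten times as costly as FA: the chosen threshold is more lenient.
    print(pick_threshold(genuine, spoof, fa_cost=1.0, fr_cost=10.0))
```

On this toy data the first call selects 0.7 and the second selects 0.4, showing how a higher FA cost pushes the operating point toward fewer false acceptances.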

How to perform liveness check

Again, note that the best approach is to perform the liveness check in parallel with presenting the voice to the speaker verification system. If both run in the same process, use multithreading to launch the speaker verification and liveness checks, and make the decision after both of them finish.

Performing the liveness check and speaker verification sequentially can introduce unnecessary delays in the system.
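The parallel approach can be sketched in Python with a thread pool. The two check functions below are hypothetical stand-ins that only sleep and return a fixed score; in a real system they would call the liveness engine and the ASV system. Whether the real calls actually overlap when run in threads depends on whether the underlying native code releases Python's global interpreter lock, which is an assumption here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def check_liveness(wav_path: str) -> float:
    """Hypothetical stand-in for a liveness engine call."""
    time.sleep(0.3)  # simulate processing time
    return 0.9


def verify_speaker(wav_path: str) -> float:
    """Hypothetical stand-in for a speaker verification (ASV) call."""
    time.sleep(0.3)  # simulate processing time
    return 0.8


def run_checks_in_parallel(wav_path: str) -> Tuple[float, float]:
    """Start both checks at once and wait until both have finished."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        liveness_future = pool.submit(check_liveness, wav_path)
        asv_future = pool.submit(verify_speaker, wav_path)
        # result() blocks until the corresponding check completes.
        return liveness_future.result(), asv_future.result()


if __name__ == "__main__":
    start = time.perf_counter()
    liveness_score, asv_score = run_checks_in_parallel("/path/to/file.wav")
    elapsed = time.perf_counter() - start
    print(f"liveness={liveness_score} asv={asv_score} elapsed={elapsed:.2f}s")
```

With the simulated delays, the total time is close to the duration of the slower check rather than the sum of both.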

Please see the liveness check examples for various programming languages below:

C++

#include <iostream>

#include <voicesdk/liveness/liveness.h>

// ...

// Create liveness engine
voicesdk::LivenessEngine::Ptr engine = voicesdk::LivenessEngine::Create("/path/to/init_data/liveness/replay");

// Check liveness from the given wav file
std::cout << engine->CheckLiveness("/path/to/file.wav") << std::endl;

Java

import net.idrnd.voicesdk.liveness.LivenessEngine;

// ...

// Create liveness engine
LivenessEngine livenessEngine = new LivenessEngine("/path/to/init_data/liveness/replay");

// Check liveness from the given wav file
LivenessResult result = livenessEngine.checkLiveness("/path/to/file.wav");
System.out.println(String.format("* Liveness check result: %s", result));

Python

from voicesdk.liveness import LivenessEngine

# ...

# Create liveness engine
liveness_engine = LivenessEngine("/path/to/init_data/liveness/replay")

# Check liveness from the given wav file
res = liveness_engine.check_liveness_file("/path/to/file.wav")
print(res)