
Using IDLive Face Match Server

Start container

CPU Version

To run the application using CPU, start the Docker container:

docker run --name idfacematch-server -d -p 8080:8080 idfacematch-server-eval:#.##.#

GPU Version

To run the application using GPU acceleration, start the Docker container with GPU support:

docker run --name idfacematch-server-gpu -d -p 8080:8080 --gpus=all --volume idfm-cache:/app/cache idfacematch-server-eval:#.##.#-gpu

The --gpus=all flag enables GPU access for the container, and the volume mount is needed for storing compiled models.

The application is ready once it prints Ready to its log output. Alternatively, call the /info endpoint: when it returns 200 OK, the application has started.
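The readiness check can be scripted. The sketch below polls a URL until it answers 200 OK; the function name wait_for_ready, the attempt count, and the delay are illustrative choices and not part of the product.

```shell
# Poll a URL until it returns HTTP 200 or the attempts run out.
# Usage: wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
# (wait_for_ready and its defaults are illustrative, not part of the server.)
wait_for_ready() {
  url=$1
  attempts=${2:-60}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -o /dev/null drops the body; -w prints only the HTTP status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_ready http://localhost:8080/info 90 2 waits up to about three minutes, which leaves room for a first GPU start that still has to compile models.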

The OpenAPI specification is available at http://localhost:8080/openapi.json. You can explore it with Postman or Swagger UI; an online Swagger UI is available at https://petstore.swagger.io.

Model Cache

When you first run the GPU version of the server, it compiles its models with the TensorRT compiler. This takes some time (approximately 1-2 minutes per model on a Tesla T4). Once compiled, the models are saved to a cache, and subsequent startups are almost instant.

By default, the cache is stored in the /app/cache directory inside the container. This is why we recommend mounting a volume to this location using --volume idfm-cache:/app/cache when starting the container with GPU support.

A compiled model can only be used on the same platform it was compiled on. The cache keeps track of how each model was compiled and can contain several compiled versions for the same model. The same cache can be used by several IDFace Match Server instances, but make sure they all only read from it.

If you use IDFace Match Server with protection enabled, the cache will be encrypted. This cache can only be accessed with IDFace Match Server instances that use the same protection method.

Environment variables

The IDLive Face Match Server's configuration is managed via environment variables. With Docker, pass one --env parameter per environment variable to docker run:

docker run --env IDFM_SERVER_PORT=8081 ...

All environment variables are optional.

| Variable | Default | Description |
|---|---|---|
| IDFM_SERVER_PORT | 8080 | HTTP port for the server to listen on |
| IDFM_SERVER_QDRANT_HOST | | Qdrant host. If not set, the 1:N functionality won't be available |
| IDFM_SERVER_QDRANT_PORT | 6334 | Qdrant gRPC port |
| IDFM_SERVER_QDRANT_TLS | false | Connect to Qdrant using TLS |
| IDFM_SERVER_QDRANT_API_KEY | | Qdrant API key |
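
When several variables are set, docker run's --env-file option can replace the repeated --env flags. The sketch below writes an example file and wraps the start command in a function; the file name idfm.env, the function name start_server, and all values are placeholders chosen for illustration.

```shell
# Write the server settings to an env file (one NAME=value per line).
# The file name and all values below are placeholders for illustration.
cat > idfm.env <<'EOF'
IDFM_SERVER_PORT=8081
IDFM_SERVER_QDRANT_HOST=qdrant
IDFM_SERVER_QDRANT_PORT=6334
IDFM_SERVER_QDRANT_TLS=false
EOF

# Start the server with that file. The container side of -p must match
# IDFM_SERVER_PORT, hence 8081:8081. Wrapped in a function so that
# nothing starts until you call start_server.
start_server() {
  docker run --name idfacematch-server -d -p 8081:8081 \
    --env-file idfm.env \
    "idfacematch-server-eval:#.##.#"
}
```

Keeping secrets such as IDFM_SERVER_QDRANT_API_KEY in a file with restricted permissions also keeps them out of the shell history.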

Running Qdrant

Qdrant is a vector search engine that allows efficient similarity search. It is required for 1:N face search in IDLive Face Match Server. To start Qdrant using Docker, run the following command:

docker run --name qdrant -d -p 6334:6334 \
  -e QDRANT__STORAGE__HNSW_INDEX__M=64 \
  -e QDRANT__STORAGE__HNSW_INDEX__EF_CONSTRUCT=600 \
  -e QDRANT__SEARCH__HNSW_INDEX__EF=600 \
  qdrant/qdrant:latest

This command pulls and runs the latest Qdrant image, exposing it on port 6334.

The parameters QDRANT__STORAGE__HNSW_INDEX__M=64, QDRANT__STORAGE__HNSW_INDEX__EF_CONSTRUCT=600 and QDRANT__SEARCH__HNSW_INDEX__EF=600 provide an optimal balance between search quality and speed.

Warning

It is crucial to start Qdrant with the recommended parameters above to ensure the best search quality and performance.

Persistent Storage in Qdrant

To ensure that Qdrant retains data between restarts, you can use a persistent volume:

docker run --name qdrant -d -p 6334:6334 \
  -e QDRANT__STORAGE__HNSW_INDEX__M=64 \
  -e QDRANT__STORAGE__HNSW_INDEX__EF_CONSTRUCT=600 \
  -e QDRANT__SEARCH__HNSW_INDEX__EF=600 \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant:latest

For more information on running and configuring Qdrant, refer to the official Qdrant documentation.

Start server with 1:N search support

To enable 1:N search functionality using Qdrant, start the server with the IDFM_SERVER_QDRANT_HOST environment variable set to the Qdrant host:

docker run --name idfacematch-server -d -p 8080:8080 \
  -e IDFM_SERVER_QDRANT_HOST=QDRANT_HOST \
  idfacematch-server-eval:#.##.#

By default, the server expects Qdrant to be running on port 6334. If you use a different port, specify it explicitly:

docker run --name idfacematch-server -d -p 8080:8080 \
  -e IDFM_SERVER_QDRANT_HOST=QDRANT_HOST \
  -e IDFM_SERVER_QDRANT_PORT=QDRANT_PORT \
  idfacematch-server-eval:#.##.#

Info

If you are running both idfacematch-server and qdrant on your local machine, the two containers must be able to reach each other. One way to achieve this is to create a Docker network and attach both containers to it:

docker network create idfm-qdrant

docker run --name qdrant -d \
  --network idfm-qdrant \
  -e QDRANT__STORAGE__HNSW_INDEX__M=64 \
  -e QDRANT__STORAGE__HNSW_INDEX__EF_CONSTRUCT=600 \
  -e QDRANT__SEARCH__HNSW_INDEX__EF=600 \
  -v qdrant_storage:/qdrant/storage \
  qdrant/qdrant:latest

docker run --name idfacematch-server -d -p 8080:8080 \
  --network idfm-qdrant \
  -e IDFM_SERVER_QDRANT_HOST=qdrant \
  idfacematch-server-eval:#.##.#

The OpenAPI specification contains a list of endpoints related to 1:N search.

Generate a face template

To generate a face template, send an image to the /detect endpoint. The content type does not matter, but make sure it is not a multipart request:

curl -s --data-binary @image.jpg http://localhost:8080/detect

The image can be in JPEG or PNG format. The response contains information about all faces in the image, including a face box and a template for each:

{
  "faces": [
    {
      "box": {
        "x": 967,
        "y": 124,
        "width": 183,
        "height": 254
      },
      "template": "wRkUy8zdCs.."
    }
  ]
}

The coordinates start from the top-left corner of the image. The faces list will be empty if no faces were detected.
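
To use a template in a script, it has to be pulled out of the JSON response. A minimal sketch, assuming the jq command-line tool is installed; first_template is a hypothetical helper name, not part of the server.

```shell
# Print the template of the first detected face from a /detect response
# read on stdin. Exits non-zero when the faces list is empty.
# (first_template is an illustrative helper; it requires jq.)
first_template() {
  # -r prints the raw string; -e makes jq fail when nothing is produced.
  jq -er '.faces[0].template // empty'
}
```

It can be placed at the end of the request pipeline, for example: curl -s --data-binary @image.jpg http://localhost:8080/detect | first_template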

Info

If the input image is a JPG file that contains EXIF orientation metadata, the image will be automatically rotated according to this information before face detection is performed. As a result, face boxes in the response will correspond to the rotated image.

If you plan to overlay face boxes on the original image (e.g., for cropping faces), make sure to rotate the image according to its EXIF orientation to ensure proper alignment.
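
One way to do this is with ImageMagick, whose -auto-orient option applies the EXIF orientation before any cropping. The sketch assumes ImageMagick 7 (the magick command; ImageMagick 6 uses convert instead) is installed. crop_geometry and crop_face are hypothetical helper names.

```shell
# Turn a face box into an ImageMagick geometry string: WIDTHxHEIGHT+X+Y.
# Usage: crop_geometry X Y WIDTH HEIGHT
crop_geometry() {
  printf '%sx%s+%s+%s\n' "$3" "$4" "$1" "$2"
}

# Crop one face out of an image. -auto-orient rotates the pixels according
# to the EXIF orientation first, so the box refers to the same upright
# image that the face boxes in the response refer to.
# Usage: crop_face INPUT OUTPUT X Y WIDTH HEIGHT
crop_face() {
  magick "$1" -auto-orient \
    -crop "$(crop_geometry "$3" "$4" "$5" "$6")" +repage "$2"
}
```

With the box from the example response, crop_face image.jpg face.jpg 967 124 183 254 crops with the geometry 183x254+967+124.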

The /detect endpoint supports these optional query parameters:

| Name | Description |
|---|---|
| max-faces | Maximum number of faces that can be on the image. If there are more, an error will be returned. |

Match templates

To match two templates, prepare this JSON file:

{
  "source": "wRkUy8zdCs...",
  "target": "32k3vAHk6T..."
}

And send it to the /match endpoint:

curl -s --header "Content-Type: application/json" \
        --data-binary @templates.json http://localhost:8080/match

Example of a response:

{ "probability": 0.83 }

The matching result contains the probability field, which is the main factor in deciding whether two biometric templates belong to the same identity. It ranges from 0 (different identities) to 1 (same identity); the decision threshold is 0.5.
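
In a script, the decision reduces to a numeric comparison against the threshold. The helper below is a sketch using awk for the floating-point comparison; the name is_match is hypothetical, and counting a probability of exactly 0.5 as a match is an assumption of this example.

```shell
# Succeed (exit 0) when the probability reaches the threshold.
# Usage: is_match PROBABILITY [THRESHOLD]   (THRESHOLD defaults to 0.5)
# Treating exactly 0.5 as a match is an assumption of this sketch.
is_match() {
  awk -v p="$1" -v t="${2:-0.5}" 'BEGIN { exit !(p >= t) }'
}

if is_match 0.83; then
  echo "same identity"
else
  echo "different identities"
fi
```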

Additionally, you can select a calibration:

{
  "source": "wRkUy8zdCs...",
  "target": "32k3vAHk6T...",
  "calibration": "SOFT"
}

Calibration allows you to set the desired balance between the false non-match rate (FNMR) and the false match rate (FMR).

Available values for calibration:

  • REGULAR (the default) targets low FMR (0.00001).
  • SOFT achieves lower FNMR while still having acceptable FMR (0.0001).
  • HARD targets extra low FMR (0.000001) with possibly higher FNMR.
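
To put these rates in perspective: the expected number of false matches among N comparisons between different identities is roughly FMR × N. The helper below only performs this multiplication. It assumes the nominal target rates listed above hold for your data, which may not be the case, and the name expected_false_matches is hypothetical.

```shell
# Expected false matches = FMR * number of different-identity comparisons.
# Usage: expected_false_matches FMR COMPARISONS
expected_false_matches() {
  awk -v fmr="$1" -v n="$2" 'BEGIN { printf "%g\n", fmr * n }'
}

expected_false_matches 0.00001  1000000   # REGULAR: prints 10
expected_false_matches 0.0001   1000000   # SOFT:    prints 100
expected_false_matches 0.000001 1000000   # HARD:    prints 1
```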

Template Backend Consistency (CPU vs. GPU)

As of version 1.6.0, GPU-based template generation is supported. CPU- and GPU-generated templates can be used interchangeably for verification (1:1); differences in inference backend are not expected to drastically affect verification accuracy.

However, for identification (1:N) use cases, mixing templates from different inference backends is not recommended and may degrade identification accuracy. This applies both to combining CPU and GPU templates in a single gallery and to querying across backends.

Recommendations

  • Maintain separate 1:N galleries for CPU and GPU templates; do not mix backends within a single identification database.
  • When migrating from one backend to another, perform a full re-enrollment so that the target gallery is homogeneous with respect to the inference backend.
  • Record the source backend in each template's metadata (e.g., inference_backend: CPU or inference_backend: GPU) to enable detection or prevention of unintended mixing.
  • (Optional) Enforce runtime validation that flags or blocks identification queries when the probe and gallery backends differ.
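
The runtime validation in the last recommendation can be as simple as comparing two labels before a query is sent. The sketch below assumes your application already stores a backend label (CPU or GPU) for the probe and for the gallery; where that label lives is up to you, and same_backend is a hypothetical helper name.

```shell
# Refuse to proceed when probe and gallery templates come from different
# inference backends. This helper only compares two labels; storing and
# looking up the labels is left to the calling application.
# Usage: same_backend PROBE_BACKEND GALLERY_BACKEND
same_backend() {
  if [ "$1" != "$2" ]; then
    echo "backend mismatch: probe=$1 gallery=$2" >&2
    return 1
  fi
}

# Example: only search when both sides were produced on the GPU backend.
if same_backend GPU GPU; then
  echo "backends match, safe to run the 1:N search"
fi
```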

When Qdrant is enabled, you can perform 1:N searches. The relevant API endpoints are:

Create a collection

To create a face template collection:

curl -s -XPOST http://localhost:8080/database/collection-name

Add face templates to a collection

To add face templates to a collection:

curl -s --header "Content-Type: application/json" \
        --data-binary '{"templates": ["wRkUy8zdCs..."]}' \
        http://localhost:8080/database/collection-name/records
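
When enrolling many templates, the request body can be generated rather than typed by hand. A minimal sketch: records_payload is a hypothetical helper, and it assumes templates contain no characters that need JSON escaping (the examples in this document look Base64-like, but this is not confirmed).

```shell
# Build the JSON body for the records endpoint from any number of templates.
# Assumes the templates need no JSON escaping (no quotes or backslashes).
# Usage: records_payload TEMPLATE...
records_payload() {
  printf '{"templates": ['
  sep=''
  for t in "$@"; do
    printf '%s"%s"' "$sep" "$t"
    sep=', '
  done
  printf ']}\n'
}
```

The output can be piped straight into curl, which reads the body from stdin with --data-binary @-: records_payload "$t1" "$t2" | curl -s --header "Content-Type: application/json" --data-binary @- http://localhost:8080/database/collection-name/records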

Search for a face in a collection

To search for similar faces in the collection, you can provide either an image or a list of face templates. In the first case, all faces detected in the image (JPEG or PNG) will be used for the search.

curl -s --data-binary @image.jpg \
        "http://localhost:8080/search/collection-name?limit=5"

The response will contain a list of matched records, including their IDs and similarity probabilities.

Query parameters (all optional):

| Name | Description |
|---|---|
| max-faces | Maximum allowed number of faces in the image. If more faces are detected, an error will be returned. |
| limit | Number of matched records to return for each face detected. Default is 5. |
| calibration | Calibration setting for adjusting the probability based on FNMR and FMR. Available values: REGULAR (default), SOFT, HARD. |
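
Several query parameters are combined with & in the usual way, and the whole URL should be quoted so the shell does not interpret ? or &. The helper below is a sketch that assembles such a URL; search_url and the IDFM_URL variable are illustrative names, not part of the server.

```shell
# Build a search URL with optional query parameters.
# Usage: search_url COLLECTION [LIMIT] [CALIBRATION] [MAX_FACES]
# Empty arguments are skipped. IDFM_URL overrides the default base URL.
search_url() {
  base=${IDFM_URL:-http://localhost:8080}
  query=''
  if [ -n "${2:-}" ]; then query="${query}&limit=$2"; fi
  if [ -n "${3:-}" ]; then query="${query}&calibration=$3"; fi
  if [ -n "${4:-}" ]; then query="${query}&max-faces=$4"; fi
  # Swap the leading "&" for "?" when at least one parameter was given.
  printf '%s/search/%s%s\n' "$base" "$1" "${query:+?${query#&}}"
}
```

For example: curl -s --data-binary @image.jpg "$(search_url collection-name 5 SOFT)"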