Serving with docker and docker-compose

This tutorial shows how to serve the NOS server directly with docker or docker-compose.

Serving with docker

To run the NOS gRPC server with docker, run:

For CPU:

docker run --rm \
    -e NOS_HOME=/app/.nos \
    -v $HOME/.nos:/app/.nos \
    -v /dev/shm:/dev/shm \
    -p 50051:50051 \
    autonomi/nos:latest-cpu

To run the GPU server, install nvidia-docker and run:

docker run --rm \
    --name nos-grpc-server \
    --gpus all \
    -e NOS_HOME=/app/.nos \
    -v $HOME/.nos:/app/.nos \
    -v /dev/shm:/dev/shm \
    -p 50051:50051 \
    autonomi/nos:latest-gpu
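Once the container starts, the gRPC server listens on port 50051. As a quick readiness check, a small script can poll the port until it accepts connections. This is a generic sketch, not part of NOS; the host, port, and timeout values are illustrative:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll a TCP port until it accepts connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server socket is accepting.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Server not up yet; retry shortly.
            time.sleep(0.5)
    return False
```

After `docker run` starts the container, `wait_for_port("localhost", 50051)` returns True once the server is accepting connections.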

Serving with docker-compose

To run the NOS gRPC server and the HTTP gateway with docker-compose, run:

docker-compose -f docker-compose.yml up

You should now see logs from both the server and the gateway.

$ docker compose -f docker-compose.yml up
[+] Running 2/2
  Container 05-serving-with-docker-nos-grpc-server-1   Created    0.0s
  Container 05-serving-with-docker-nos-http-gateway-1  Recreated  0.0s
Attaching to 05-serving-with-docker-nos-grpc-server-1, 05-serving-with-docker-nos-http-gateway-1
05-serving-with-docker-nos-grpc-server-1   | Starting server with OMP_NUM_THREADS=64...
05-serving-with-docker-nos-http-gateway-1  | WARNING:  Current configuration will not reload as not all conditions are met, please refer to documentation.
05-serving-with-docker-nos-grpc-server-1   |   InferenceExecutor :: Connected to backend.
05-serving-with-docker-nos-grpc-server-1   |   Starting gRPC server on [::]:50051
05-serving-with-docker-nos-grpc-server-1   |   InferenceService :: Deployment complete (elapsed=0.0s)
05-serving-with-docker-nos-http-gateway-1  | INFO:     Started server process [1]
05-serving-with-docker-nos-http-gateway-1  | INFO:     Waiting for application startup.
05-serving-with-docker-nos-http-gateway-1  | INFO:     Application startup complete.
05-serving-with-docker-nos-http-gateway-1  | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

The HTTP gateway service (nos-http-gateway) in the docker-compose.yml file simply forwards the HTTP requests to the gRPC server (nos-grpc-server). This is especially useful when exposing the server via a REST API.
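To illustrate the forwarding pattern only (NOS's actual gateway translates HTTP requests into gRPC calls, which this sketch does not do), here is a minimal stdlib relay that passes GET requests through to a backend service; the backend address is a placeholder:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class GatewayHandler(BaseHTTPRequestHandler):
    """Relays incoming GET requests to a backend service unchanged."""

    backend = "http://localhost:50051"  # placeholder backend address

    def do_GET(self):
        try:
            with urllib.request.urlopen(self.backend + self.path, timeout=5) as resp:
                body = resp.read()
                self.send_response(resp.status)
                self.send_header(
                    "Content-Type",
                    resp.headers.get("Content-Type", "application/json"),
                )
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)
        except OSError:
            self.send_error(502, "backend unreachable")

    def log_message(self, *args):
        # Keep the example quiet.
        pass


def run_gateway(port: int, backend: str) -> HTTPServer:
    """Start the gateway on a background thread and return the server."""
    GatewayHandler.backend = backend
    server = HTTPServer(("127.0.0.1", port), GatewayHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The pattern is the same one the compose file wires up: clients talk to one front-facing HTTP port, and the gateway handles the hop to the backend.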

Here's the full docker-compose.yml file:

version: "3.8"

services:
  nos-http-gateway:
    image: autonomi/nos:latest-gpu
    command: nos-http-server --host 0.0.0.0 --port 8000 --workers 1
    environment:
      - NOS_HOME=/app/.nos
      - NOS_LOGGING_LEVEL=INFO
      - NOS_GRPC_HOST=nos-grpc-server
      - NOS_HTTP_ENV=prod
    volumes:
      - ~/.nosd:/app/.nos
      - /dev/shm:/dev/shm
    ports:
      - 8000:8000
    ipc: host
    depends_on:
      - nos-grpc-server

  nos-grpc-server:
    image: autonomi/nos:latest-gpu
    environment:
      - NOS_HOME=/app/.nos
      - NOS_GRPC_HOST=[::]
      - NOS_LOGGING_LEVEL=INFO
    volumes:
      - ~/.nosd:/app/.nos
      - /dev/shm:/dev/shm
    ports:
      - 50051:50051
    ipc: host
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
        limits:
          memory: 12G

Testing the server

To test the server's health, use curl:

curl -X "GET" "http://localhost:8000/v1/health" -H "accept: application/json"

You should see the following response:

{"status":"ok"}

You can now try one of the many requests showcased in the main README.
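The same health check can be scripted with only the Python standard library. The endpoint path comes from the curl example above; the default URL assumes the gateway port from the compose file:

```python
import json
import urllib.request


def check_health(url: str = "http://localhost:8000/v1/health") -> dict:
    """GET the health endpoint and return the parsed JSON response."""
    req = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Once the gateway is up, `check_health()` should return `{"status": "ok"}`; a connection error means the services are not reachable yet.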

Debugging the server

  • Running on CPUs: You can remove the deploy section from the docker-compose.yml file to run the server without GPU capabilities.
  • Running on GPUs: Make sure nvidia-docker and the latest NVIDIA drivers are installed on your machine. You can check the driver version by running nvidia-smi in your terminal. If you don't have nvidia-docker installed, you can follow NVIDIA's Container Toolkit installation guide.
  • Running on macOS: You can run the server on macOS by removing the deploy section from the docker-compose.yml file.
  • Enabling debug logs: You can enable debug logs for both docker services by setting the NOS_LOGGING_LEVEL environment variable to DEBUG in the docker-compose.yml file. This provides more information on what's happening under the hood.
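Rather than editing docker-compose.yml in place, the debug setting can live in a docker-compose.override.yml, which docker compose merges automatically on top of the base file. The fragment below is a sketch of that approach (environment entries merge by variable name, so only NOS_LOGGING_LEVEL changes):

```yaml
# docker-compose.override.yml -- picked up automatically by `docker compose up`
version: "3.8"

services:
  nos-grpc-server:
    environment:
      - NOS_LOGGING_LEVEL=DEBUG
  nos-http-gateway:
    environment:
      - NOS_LOGGING_LEVEL=DEBUG
```

Deleting the override file restores the INFO-level logging configured in the base docker-compose.yml.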