Serving with docker and docker-compose
This tutorial shows how to run the NOS server directly with docker or docker-compose.
Serving with docker
To run the NOS gRPC server with docker on a CPU-only machine, run:
docker run --rm \
    -e NOS_HOME=/app/.nos \
    -v $HOME/.nos:/app/.nos \
    -v /dev/shm:/dev/shm \
    -p 50051:50051 \
    autonomi/nos:latest-cpu
To run the GPU server, install nvidia-docker first, then run:
docker run --rm \
    --name nos-grpc-server \
    --gpus all \
    -e NOS_HOME=/app/.nos \
    -v $HOME/.nos:/app/.nos \
    -v /dev/shm:/dev/shm \
    -p 50051:50051 \
    autonomi/nos:latest-gpu
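The two commands above differ only in the image tag and the --gpus flag, so the choice can be scripted. The following is a minimal sketch, not part of NOS: the helper names nos_image and nos_run_cmd are hypothetical, and it assumes that a working nvidia-smi on the PATH means the GPU image is usable.

```shell
# Hypothetical helper: pick the image tag from GPU availability.
# Assumes a working nvidia-smi on PATH means the GPU image is usable.
nos_image() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo "autonomi/nos:latest-gpu"
    else
        echo "autonomi/nos:latest-cpu"
    fi
}

# Hypothetical helper: print the matching docker run command
# (same flags as the CPU and GPU commands in this tutorial).
nos_run_cmd() {
    image="$(nos_image)"
    gpu_flag=""
    case "$image" in
        *-gpu) gpu_flag="--gpus all " ;;
    esac
    echo "docker run --rm ${gpu_flag}-e NOS_HOME=/app/.nos -v ${HOME}/.nos:/app/.nos -v /dev/shm:/dev/shm -p 50051:50051 ${image}"
}
```

Calling nos_run_cmd only prints the command, so it can be inspected before it is run.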
Serving with docker-compose
To run the NOS gRPC server and the HTTP gateway with docker-compose, run docker compose from the tutorials/05-serving-with-docker directory. You should then see the logs from both the server and the gateway:
$ docker compose -f docker-compose.yml up
[+] Running 2/2
 ✔ Container 05-serving-with-docker-nos-grpc-server-1   Created                                                                                                                                                0.0s
 ✔ Container 05-serving-with-docker-nos-http-gateway-1  Recreated                                                                                                                                              0.0s
Attaching to 05-serving-with-docker-nos-grpc-server-1, 05-serving-with-docker-nos-http-gateway-1
05-serving-with-docker-nos-grpc-server-1   | Starting server with OMP_NUM_THREADS=64...
05-serving-with-docker-nos-http-gateway-1  | WARNING:  Current configuration will not reload as not all conditions are met, please refer to documentation.
05-serving-with-docker-nos-grpc-server-1   |  ✓ InferenceExecutor :: Connected to backend.
05-serving-with-docker-nos-grpc-server-1   |  ✓ Starting gRPC server on [::]:50051
05-serving-with-docker-nos-grpc-server-1   |  ✓ InferenceService :: Deployment complete (elapsed=0.0s)
05-serving-with-docker-nos-http-gateway-1  | INFO:     Started server process [1]
05-serving-with-docker-nos-http-gateway-1  | INFO:     Waiting for application startup.
05-serving-with-docker-nos-http-gateway-1  | INFO:     Application startup complete.
05-serving-with-docker-nos-http-gateway-1  | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
The HTTP gateway service (nos-http-gateway) in the docker-compose.yml file simply forwards the HTTP requests to the gRPC server (nos-grpc-server). This is especially useful when exposing the server via a REST API.
Here's the full docker-compose.yml file:
version: "3.8"
services:
  nos-http-gateway:
    image: autonomi/nos:latest-gpu
    command: nos-http-server --host 0.0.0.0 --port 8000 --workers 1
    environment:
      - NOS_HOME=/app/.nos
      - NOS_LOGGING_LEVEL=INFO
      - NOS_GRPC_HOST=nos-grpc-server
      - NOS_HTTP_ENV=prod
    volumes:
      - ~/.nosd:/app/.nos
      - /dev/shm:/dev/shm
    ports:
      - 8000:8000
    ipc: host
    depends_on:
      - nos-grpc-server
  nos-grpc-server:
    image: autonomi/nos:latest-gpu
    environment:
      - NOS_HOME=/app/.nos
      - NOS_GRPC_HOST=[::]
      - NOS_LOGGING_LEVEL=INFO
    volumes:
      - ~/.nosd:/app/.nos
      - /dev/shm:/dev/shm
    ports:
      - 50051:50051
    ipc: host
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
        limits:
          memory: 12G
Testing the server
To test the server's health, you can query the HTTP gateway (port 8000 in the docker-compose.yml file above) with curl and check that it returns a successful response.
You can now try one of the many requests showcased in the main README.
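A health check with curl can also be wrapped in a small polling loop, which is handy while the containers are still starting. This is a rough sketch under stated assumptions: port 8000 comes from the docker-compose.yml file, but the /v1/health path is a guess that this tutorial does not confirm, and the variable and function names are hypothetical.

```shell
# Assumption: the gateway listens on port 8000 (from docker-compose.yml).
# Assumption: the health route is /v1/health; adjust if your gateway differs.
NOS_GATEWAY_URL="${NOS_GATEWAY_URL:-http://localhost:8000}"
NOS_HEALTH_PATH="${NOS_HEALTH_PATH:-/v1/health}"

# Hypothetical helper: build the full health-check URL.
nos_health_url() {
    echo "${NOS_GATEWAY_URL}${NOS_HEALTH_PATH}"
}

# Hypothetical helper: poll once per second until the gateway answers
# with a non-error status, or give up after the given number of attempts.
nos_wait_healthy() {
    attempts="${1:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl fail on HTTP errors, -sS keeps it quiet except for errors
        if curl -fsS "$(nos_health_url)" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}
```

For example, nos_wait_healthy 60 && echo "gateway is up" waits up to about a minute before giving up.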
Debugging the server
- Running on CPUs: You can remove the deploy section from the docker-compose.yml file to run the server without GPU capabilities.
- Running on GPUs: Make sure you have nvidia-docker installed and that you have the latest NVIDIA drivers installed on your machine. You can check the NVIDIA driver version by running nvidia-smi in your terminal. If you don't have nvidia-docker installed, follow the nvidia-docker installation instructions.
- Running on macOS: You can run the server on macOS by removing the deploy section from the docker-compose.yml file.
- Enabling debug logs: You can enable debug logs on both docker services by setting the NOS_LOGGING_LEVEL environment variable to DEBUG in the docker-compose.yml file. This should give you more information about what's happening under the hood.
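The debug-log change above can be made without editing the original file by writing a modified copy. The sketch below is one possible way to do that; the helper name nos_set_log_level and the docker-compose.debug.yml file name are hypothetical, and it assumes the variable appears in the list form NOS_LOGGING_LEVEL=... used in the compose file in this tutorial.

```shell
# Hypothetical helper: copy a compose file, setting NOS_LOGGING_LEVEL
# to the given level on every line where the variable appears.
# Usage: nos_set_log_level LEVEL SOURCE_FILE DEST_FILE
nos_set_log_level() {
    level="$1"
    src="$2"
    dst="$3"
    sed "s/NOS_LOGGING_LEVEL=[A-Za-z]*/NOS_LOGGING_LEVEL=${level}/" "$src" > "$dst"
}
```

For example, nos_set_log_level DEBUG docker-compose.yml docker-compose.debug.yml followed by docker compose -f docker-compose.debug.yml up starts both services with debug logging while leaving the original file unchanged.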