Starting the server
The NOS gRPC server can be started in one of three ways:
- Via the NOS SDK using nos.init(...) (preferred for development)
- Via the NOS serve CLI
- Via Docker Compose (recommended for production deployments)

You can also start the server with the REST API proxy enabled, as shown in the CLI and Docker Compose examples below.
You can start the nos server via the NOS serve CLI:
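A minimal sketch of the CLI invocation; the up subcommand name is an assumption here and may differ in your NOS version:

```sh
# Start the NOS gRPC server via the serve CLI (subcommand assumed).
nos serve up
```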
Optionally, to use the REST API, you can start an HTTP gateway proxy alongside the gRPC server:
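A sketch of the same invocation with the HTTP gateway enabled; the --http flag mirrors the Docker entrypoint flag used in the Compose example below, and its exact CLI spelling is an assumption:

```sh
# Start the gRPC server together with the REST/HTTP gateway proxy.
nos serve up --http
```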
Navigate to examples/docker to see an example of the YAML specification. The following Docker Compose specification starts the server with the HTTP proxy enabled:
services:
  server:
    image: autonomi/nos:latest-gpu
    command: /app/entrypoint.sh --http
    environment:
      - NOS_HOME=/app/.nos
      - NOS_LOGGING_LEVEL=INFO
    volumes:
      - ~/.nosd:/app/.nos
      - /dev/shm:/dev/shm
    ports:
      - 8000:8000
    ipc: host
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
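Assuming the specification above is saved as docker-compose.yml in the current directory (the filename here is an assumption), you can then bring the server up with:

```sh
# Launch the NOS server (with the HTTP proxy) defined in the Compose file above.
docker compose up
```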
API Reference
nos.init
init(runtime: str = 'auto', port: int = DEFAULT_GRPC_PORT, utilization: float = 1.0, pull: bool = True, logging_level: Union[int, str] = logging.INFO, tag: Optional[str] = None) -> Container
Initialize the NOS inference server (as a docker daemon).
The method first checks that your system requirements are met, then pulls the NOS docker image from Docker Hub (if necessary) and starts the inference server as a docker daemon. You can also specify the runtime to use (e.g. "cpu", "gpu") and the port to use for the inference server.
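For example, a development session might start the server on the GPU runtime with verbose logging. This is a minimal sketch that only uses the parameters documented below:

```python
import nos

# Start the NOS inference server as a docker daemon on the GPU runtime,
# pulling the image from Docker Hub if it is not available locally.
container = nos.init(runtime="gpu", logging_level="DEBUG")
```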
Parameters:
- runtime (str, default: 'auto') – The runtime to use (e.g. "auto", "local", "cpu", "gpu"). Defaults to "auto". In "auto" mode, the runtime is automatically detected.
- port (int, default: DEFAULT_GRPC_PORT) – The port to use for the inference server. Defaults to DEFAULT_GRPC_PORT.
- utilization (float, default: 1.0) – The target CPU/memory utilization of the inference server. Defaults to 1.0.
- pull (bool, default: True) – Whether to pull the docker image before starting the inference server. Defaults to True.
- logging_level (Union[int, str], default: logging.INFO) – The logging level to use. Defaults to logging.INFO. Optionally, a string can be passed (e.g. "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL").
- tag (Optional[str], default: None) – The tag of the docker image to use (e.g. "latest"). Defaults to None, in which case the appropriate version is used.
Source code in nos/server/__init__.py
nos.shutdown
Shut down the inference server.
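For example, to tear down a server previously started with nos.init() (a minimal sketch, assuming no arguments are required):

```python
import nos

# Stop the NOS inference server daemon started earlier via nos.init().
nos.shutdown()
```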