Commit 14d9c08

docs: add CLI reference documentation
Add comprehensive CLI reference covering all 8 commands:
- start (head/worker)
- submit
- get-instance
- get-result
- list-workers
- get-endpoint
- cancel
- logs

Includes options, arguments, and usage examples.
1 parent 6565860 commit 14d9c08

1 file changed

Lines changed: 300 additions & 0 deletions

docs/cli_reference.md

@@ -0,0 +1,300 @@
1+
# PyLet CLI Reference
2+
3+
PyLet provides a command-line interface for managing a distributed instance execution cluster.
4+
5+
## Installation
6+
7+
```bash
8+
pip install pylet
9+
```
10+
11+
After installation, the `pylet` command is available.
12+
13+
---
14+
15+
## Commands
16+
17+
### `pylet start`
18+
19+
Start the head node (server) or a worker node.
20+
21+
```bash
22+
# Start head node (server) on port 8000
23+
pylet start
24+
25+
# Start worker node connected to head
26+
pylet start --head 192.168.1.10:8000
27+
28+
# Start worker with custom resources
29+
pylet start --head 192.168.1.10:8000 --gpu-units 4 --cpu-cores 8 --memory-mb 16384
30+
```
31+
32+
**Options:**
33+
34+
| Option | Default | Description |
35+
|--------|---------|-------------|
36+
| `--head <ip:port>` | None | Head node address. If omitted, starts as head node. |
37+
| `--cpu-cores <int>` | 4 | CPU cores to offer (worker only) |
38+
| `--gpu-units <int>` | 0 | GPU units to offer (worker only) |
39+
| `--memory-mb <int>` | 4096 | Memory in MB to offer (worker only) |
40+
41+
---
42+
43+
### `pylet submit`
44+
45+
Submit a new instance to the cluster.
46+
47+
```bash
48+
# Simple command
49+
pylet submit echo hello
50+
51+
# With resource requirements
52+
pylet submit python train.py --cpu-cores 4 --gpu-units 1 --memory-mb 8192
53+
54+
# Named instance (for service discovery)
55+
pylet submit "vllm serve model --port \$PORT" --name my-vllm --gpu-units 1
56+
57+
# Multi-word commands with quotes
58+
pylet submit "python -c 'print(\"hello\")'"
59+
```
60+
61+
**Arguments:**
62+
63+
| Argument | Required | Description |
64+
|----------|----------|-------------|
65+
| `COMMAND` | Yes | Shell command to execute (can be multiple words) |
66+
67+
**Options:**
68+
69+
| Option | Default | Description |
70+
|--------|---------|-------------|
71+
| `--cpu-cores <int>` | 1 | CPU cores required |
72+
| `--gpu-units <int>` | 0 | GPU units required |
73+
| `--memory-mb <int>` | 512 | Memory in MB required |
74+
| `--name <string>` | None | Instance name for service discovery |
75+
76+
**Output:**
77+
78+
```
79+
Instance submitted with ID: abc-123-def
80+
```
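
For scripting, the ID can be pulled out of this line. The following is a minimal sketch, assuming the output format shown above stays stable; `extract_instance_id` is a hypothetical helper and not part of PyLet.

```bash
# Hypothetical helper: read `pylet submit` output on stdin and print only the ID.
# Assumes the line format "Instance submitted with ID: <id>" shown above.
extract_instance_id() {
    sed -n 's/^Instance submitted with ID: //p'
}

# Usage (requires a running cluster):
#   INSTANCE_ID=$(pylet submit echo hello | extract_instance_id)
#   pylet logs "$INSTANCE_ID" -f
```

Capturing the ID at submit time avoids having to look it up again before calling `pylet logs` or `pylet cancel`.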
81+
82+
---
83+
84+
### `pylet get-instance`
85+
86+
Get instance details by ID or name.
87+
88+
```bash
89+
# By ID
90+
pylet get-instance --instance-id abc-123-def
91+
92+
# By name
93+
pylet get-instance --name my-vllm
94+
```
95+
96+
**Options:**
97+
98+
| Option | Description |
99+
|--------|-------------|
100+
| `--instance-id <string>` | Instance UUID |
101+
| `--name <string>` | Instance name |
102+
103+
One of `--instance-id` or `--name` is required.
104+
105+
---
106+
107+
### `pylet get-result`
108+
109+
Get the result of a completed instance.
110+
111+
```bash
112+
pylet get-result abc-123-def
113+
```
114+
115+
**Arguments:**
116+
117+
| Argument | Required | Description |
118+
|----------|----------|-------------|
119+
| `INSTANCE_ID` | Yes | Instance UUID |
120+
121+
---
122+
123+
### `pylet list-workers`
124+
125+
List all registered workers in the cluster.
126+
127+
```bash
128+
pylet list-workers
129+
```
130+
131+
**Output:**
132+
133+
```
134+
Worker abc-123 (192.168.1.5) - ONLINE - GPUs: 4
135+
Worker def-456 (192.168.1.6) - ONLINE - GPUs: 2
136+
Worker ghi-789 (192.168.1.7) - SUSPECT - GPUs: 1
137+
```
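
The line-oriented output lends itself to standard text tools. As a rough sketch, assuming every line follows the `Worker <id> (<ip>) - <STATUS> - GPUs: <n>` layout shown above, the hypothetical helper `online_gpu_total` sums the GPUs of workers reported as `ONLINE`:

```bash
# Hypothetical helper: sum the GPU counts of ONLINE workers.
# Reads `pylet list-workers` output on stdin; field 5 is the status and the
# last field is the GPU count, given the layout assumed above.
online_gpu_total() {
    awk '$5 == "ONLINE" { total += $NF } END { print total + 0 }'
}

# Usage (requires a running cluster):
#   pylet list-workers | online_gpu_total
```

For the sample output above this would count the first two workers and skip the `SUSPECT` one.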
138+
139+
---
140+
141+
### `pylet get-endpoint`
142+
143+
Get the endpoint (host:port) of a running instance. Useful for service discovery.
144+
145+
```bash
146+
# By ID
147+
pylet get-endpoint --instance-id abc-123-def
148+
149+
# By name
150+
pylet get-endpoint --name my-vllm
151+
```
152+
153+
**Options:**
154+
155+
| Option | Description |
156+
|--------|-------------|
157+
| `--instance-id <string>` | Instance UUID |
158+
| `--name <string>` | Instance name |
159+
160+
**Output:**
161+
162+
```
163+
192.168.1.5:15600
164+
```
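
When a client needs the host and port separately, the `host:port` string can be split with POSIX parameter expansion. This is a small sketch; `endpoint_host` and `endpoint_port` are hypothetical helper names, and the approach assumes an IPv4 address or hostname as in the output above.

```bash
# Hypothetical helpers: split a "host:port" endpoint string.
# ${1%:*} drops the shortest ":..." suffix; ${1##*:} drops everything up to the last ":".
endpoint_host() { printf '%s\n' "${1%:*}"; }
endpoint_port() { printf '%s\n' "${1##*:}"; }

# Usage (requires a running instance named my-vllm):
#   ENDPOINT=$(pylet get-endpoint --name my-vllm)
#   HOST=$(endpoint_host "$ENDPOINT")
#   PORT=$(endpoint_port "$ENDPOINT")
```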
165+
166+
---
167+
168+
### `pylet cancel`
169+
170+
Cancel a running instance. PyLet sends SIGTERM, waits for a grace period, then sends SIGKILL.
171+
172+
```bash
173+
pylet cancel abc-123-def
174+
```
175+
176+
**Arguments:**
177+
178+
| Argument | Required | Description |
179+
|----------|----------|-------------|
180+
| `INSTANCE_ID` | Yes | Instance UUID |
181+
182+
**Output:**
183+
184+
```
185+
Cancellation requested for instance abc-123-def
186+
```
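
Because cancellation delivers SIGTERM before SIGKILL, a submitted script can use the grace period to clean up. The sketch below assumes the signal reaches the script's own shell process, which this reference does not state; `graceful_job` is a hypothetical name and the 30-second `sleep` stands in for real work.

```bash
# Hypothetical job wrapper: run work in the background so the shell can
# react to SIGTERM immediately instead of waiting for the work to finish.
graceful_job() {
    cleanup() {
        echo "received SIGTERM, cleaning up"
        kill "$child" 2>/dev/null   # stop the background work
        exit 143                    # 128 + 15, the conventional SIGTERM exit status
    }
    trap cleanup TERM

    sleep 30 &                      # placeholder for the real workload
    child=$!
    wait "$child"                   # returns early when a trapped signal arrives
}
```

Running the workload in the background and calling `wait` matters here: a shell only runs a trap handler after the current foreground command returns, so a long foreground command could use up the whole grace period.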
187+
188+
---
189+
190+
### `pylet logs`
191+
192+
Get logs from an instance.
193+
194+
```bash
195+
# Get all logs
196+
pylet logs abc-123-def
197+
198+
# Get last 1000 bytes
199+
pylet logs abc-123-def --tail 1000
200+
201+
# Follow logs (like tail -f)
202+
pylet logs abc-123-def --follow
203+
pylet logs abc-123-def -f
204+
```
205+
206+
**Arguments:**
207+
208+
| Argument | Required | Description |
209+
|----------|----------|-------------|
210+
| `INSTANCE_ID` | Yes | Instance UUID |
211+
212+
**Options:**
213+
214+
| Option | Default | Description |
215+
|--------|---------|-------------|
216+
| `--tail <int>` | None | Return only the last N bytes of the log |
217+
| `--follow`, `-f` | False | Follow log output (poll for new content) |
218+
219+
---
220+
221+
## Environment
222+
223+
The CLI connects to the head node at `http://localhost:8000` by default. This is currently hardcoded in the client.
224+
225+
## Exit Codes
226+
227+
| Code | Meaning |
228+
|------|---------|
229+
| 0 | Success |
230+
| 1 | Error (connection failed, instance not found, etc.) |
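
Since every failure maps to exit code 1, a script cannot tell a transient connection error from a missing instance by the code alone. One simple way to cope is to retry a bounded number of times. This is a sketch only; `retry` is a hypothetical helper, not a PyLet feature.

```bash
# Hypothetical helper: rerun a command until it exits 0 or attempts run out.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage (requires a running cluster):
#   retry 5 2 pylet get-instance --name training
```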
231+
232+
---
233+
234+
## Examples
235+
236+
### Start a Local Cluster
237+
238+
```bash
239+
# Terminal 1: Start head node
240+
pylet start
241+
242+
# Terminal 2: Start worker with 2 GPUs
243+
pylet start --head localhost:8000 --gpu-units 2 --cpu-cores 8
244+
245+
# Terminal 3: Start another worker
246+
pylet start --head localhost:8000 --gpu-units 1 --cpu-cores 4
247+
```
248+
249+
### Submit and Monitor an Instance
250+
251+
```bash
252+
# Submit a long-running job
253+
pylet submit "python train.py --epochs 100" --name training --gpu-units 1
254+
255+
# Check status
256+
pylet get-instance --name training
257+
258+
# Follow logs
259+
pylet logs $(pylet get-instance --name training | grep -o 'instance_id.*' | cut -d"'" -f2) -f
260+
261+
# Cancel if needed
262+
pylet cancel <instance-id>
263+
```
264+
265+
### Run a vLLM Service
266+
267+
```bash
268+
# Submit vLLM server
269+
pylet submit "vllm serve Qwen/Qwen2.5-1.5B-Instruct --port \$PORT" \
270+
--name vllm-server \
271+
--gpu-units 1 \
272+
--memory-mb 8192
273+
274+
# Wait and get endpoint
275+
sleep 30
276+
ENDPOINT=$(pylet get-endpoint --name vllm-server)
277+
278+
# Use the service
279+
curl "http://$ENDPOINT/v1/completions" \
280+
-H "Content-Type: application/json" \
281+
-d '{"model": "Qwen/Qwen2.5-1.5B-Instruct", "prompt": "Hello", "max_tokens": 10}'
282+
283+
# Cleanup
284+
pylet cancel <instance-id>
285+
```
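
The fixed `sleep 30` above may be too short or too long depending on model load time. A polling loop is one alternative. The sketch below assumes that `pylet get-endpoint` exits non-zero (or prints nothing) until the instance has an endpoint, which this reference does not confirm; `wait_for_endpoint` is a hypothetical helper.

```bash
# Hypothetical helper: poll `pylet get-endpoint` once per second until it
# prints an endpoint or the timeout (default 120 seconds) expires.
# Usage: wait_for_endpoint <name> [timeout_seconds]
wait_for_endpoint() {
    name=$1
    timeout=${2:-120}
    waited=0
    until endpoint=$(pylet get-endpoint --name "$name" 2>/dev/null) && [ -n "$endpoint" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $name" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    printf '%s\n' "$endpoint"
}

# Usage (requires a running cluster):
#   ENDPOINT=$(wait_for_endpoint vllm-server 300) || exit 1
```

Note that an endpoint being assigned does not guarantee the server inside the instance is already accepting requests, so the first `curl` may still need a retry.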
286+
287+
---
288+
289+
## Command Summary
290+
291+
| Command | Purpose |
292+
|---------|---------|
293+
| `pylet start` | Start head or worker node |
294+
| `pylet submit <cmd>` | Submit instance |
295+
| `pylet get-instance` | Get instance details |
296+
| `pylet get-result <id>` | Get instance result |
297+
| `pylet list-workers` | List workers |
298+
| `pylet get-endpoint` | Get instance endpoint |
299+
| `pylet cancel <id>` | Cancel instance |
300+
| `pylet logs <id>` | Get instance logs |