# grid-orch

**Repository Path**: dev-linhu/grid-orch

## Basic Information

- **Project Name**: grid-orch
- **Description**: A real-time distributed compute scheduling system with OpenAI-compatible APIs, built on Ray with persistent WebSocket node channels; the execution side supports Ollama only.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-11
- **Last Updated**: 2026-04-11

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Grid Compute Orchestrator (MVP)

A production-oriented MVP for distributed model serving with:

- `GridControl`: centralized scheduler + control plane (FastAPI + Ray + SQLite)
- `GridNode`: seller-side execution agent (persistent WebSocket + Ollama runtime)

The system exposes OpenAI-compatible APIs while dispatching inference to registered nodes in real time.

## Highlights

- OpenAI-compatible endpoints:
  - `GET /v1/models`
  - `POST /v1/chat/completions`
- Persistent WebSocket channel (`/ws/node`) from `GridNode` to `GridControl`
- Real-time task push (no polling dispatch)
- Node liveness and runtime-aware routing:
  - immediate offline on disconnect
  - immediate offline on Ollama unavailability (`runtime_up=false`)
- Data sensitivity routing (`P0`/`P1`) with trust-level constraints
- Request/job/usage/audit persistence in SQLite

## Architecture

```text
Client -> GridControl API -> Scheduler -> Ray Dispatch Broker -> GridNode -> Ollama
   ^                                                                 |
   |--------------------- result over WebSocket ---------------------|
```

## Repository Layout

```text
.
├── docker-compose.yml
├── mvp_schema_sqlite.sql
├── gridcontrol/
│   ├── app.py
│   ├── scheduler.py
│   ├── ray_runtime.py
│   ├── db.py
│   ├── config.py
│   ├── requirements.txt
│   ├── Dockerfile
│   └── .env.example
├── gridnode/
│   ├── agent.py
│   ├── ws_client.py
│   ├── runtime.py
│   ├── config.py
│   ├── requirements.txt
│   ├── Dockerfile
│   └── .env.example
└── docs/
    └── quickstart.md
```

## Quick Start (Docker Compose)

1. Prepare env files

   ```bash
   cp gridcontrol/.env.example gridcontrol/.env
   cp gridnode/.env.example gridnode/.env
   ```

2. Build and run

   ```bash
   docker compose up -d --build
   ```

3. Verify

   ```bash
   curl http://127.0.0.1:8080/healthz
   curl http://127.0.0.1:8080/v1/models
   ```

4. Test a completion

   ```bash
   curl -X POST http://127.0.0.1:8080/v1/chat/completions \
     -H 'Content-Type: application/json' \
     -d '{
       "model": "qwen2.5:7b",
       "messages": [{"role": "user", "content": "hello"}],
       "stream": false,
       "metadata": {"data_class": "P0"}
     }'
   ```

For detailed run commands, see [docs/quickstart.md](./docs/quickstart.md).

## Runtime Behavior

- `GridNode` discovers installed models from `OLLAMA_BASE_URL/api/tags`.
- If Ollama is unreachable, the node sends `runtime_up=false` on its heartbeat.
- `GridControl` immediately marks the node offline and fails its in-flight jobs.

## Security Notes

- Node authentication via shared secret (`X-Node-Secret` / WS hello secret)
- `P0`/`P1` routing guardrails with trust-level filtering
- Sensitive payloads are not logged in plain text

## Current MVP Scope

Included:

- single control plane
- SQLite persistence
- Ollama-only node runtime
- non-streaming chat completion

Not included yet:

- billing/settlement
- multi-tenant auth hardening
- HA control plane
- streaming response path

## License

MIT License. See [LICENSE](./LICENSE).
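
## Example: Calling the API from Python

The Quick Start request can also be issued programmatically. The sketch below is illustrative rather than part of the repository: it assumes the Docker Compose defaults above (`http://127.0.0.1:8080`, model `qwen2.5:7b`) and the standard OpenAI-compatible response shape; the `chat` helper and its defaults are hypothetical.

```python
"""Minimal client for GridControl's OpenAI-compatible chat endpoint.

A sketch only: base URL, model, and the metadata.data_class field mirror
the Quick Start example; adjust them to your deployment.
"""
import requests

GRIDCONTROL_URL = "http://127.0.0.1:8080"  # assumes the Docker Compose default


def chat(prompt: str, model: str = "qwen2.5:7b", data_class: str = "P0") -> str:
    """Send a non-streaming chat completion and return the reply text."""
    resp = requests.post(
        f"{GRIDCONTROL_URL}/v1/chat/completions",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,  # streaming is out of scope for the MVP
            "metadata": {"data_class": data_class},  # P0/P1 sensitivity routing
        },
        timeout=120,
    )
    resp.raise_for_status()
    # OpenAI-compatible response shape: choices[0].message.content
    return resp.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("hello"))
```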
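
## Example: Node Liveness Sketch (Python)

To make the Runtime Behavior section concrete, here is a heavily simplified sketch of a `GridNode`-style liveness loop. Only the `/ws/node` path, the shared secret, the `runtime_up` flag, and the `OLLAMA_BASE_URL/api/tags` probe come from this README; the "hello" and "heartbeat" message shapes, the field names, and the heartbeat interval are assumptions — the real wire format lives in `gridnode/ws_client.py`.

```python
"""Sketch of a GridNode-style liveness loop over the /ws/node channel.

Illustrative only; message shapes below are assumptions, not the
project's actual protocol.
"""
import asyncio
import json

import requests
import websockets  # pip install websockets

CONTROL_WS_URL = "ws://127.0.0.1:8080/ws/node"  # GridControl WebSocket endpoint
OLLAMA_BASE_URL = "http://127.0.0.1:11434"      # local Ollama runtime
NODE_SECRET = "change-me"                       # shared secret from .env


def probe_ollama() -> tuple[bool, list[str]]:
    """Return (runtime_up, installed models) by querying Ollama's tag list."""
    try:
        r = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=5)
        r.raise_for_status()
        return True, [m["name"] for m in r.json().get("models", [])]
    except requests.RequestException:
        return False, []  # GridControl will mark the node offline


async def run_node() -> None:
    async with websockets.connect(CONTROL_WS_URL) as ws:
        # Authenticate the channel (hypothetical "hello" message shape).
        await ws.send(json.dumps({"type": "hello", "secret": NODE_SECRET}))
        while True:
            runtime_up, models = probe_ollama()
            await ws.send(json.dumps(
                {"type": "heartbeat", "runtime_up": runtime_up, "models": models}
            ))
            await asyncio.sleep(10)  # heartbeat interval is an assumption


if __name__ == "__main__":
    asyncio.run(run_node())
```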