# codex-relay

**Repository Path**: sunshinewithmoonlight/codex-relay

## Basic Information

- **Project Name**: codex-relay
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-12
- **Last Updated**: 2026-05-15

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

This project is now a purely local relay service written in Go. It no longer depends on Node, Cloudflare Workers, or Cloudflare Tunnel.

## Public endpoints

- `POST /v1/responses`
  - Business endpoint
  - Requires `Authorization: Bearer <token_profiles[].token>`
- `GET /v1/models`
  - Business endpoint
  - Requires `Authorization: Bearer <token_profiles[].token>`
- `GET /v1/healthz`
  - Operations endpoint
  - Requires `Authorization: Bearer <server.auth_healthz>`
  - Each request synchronously fetches usage for the `codex-oauth` providers before returning the healthz JSON

`/v1/healthz` is a sibling route of `/v1/models`, but it authenticates with its own, separate token.
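
The split between business tokens and the health token can be illustrated with a small check. This is a minimal sketch, not the relay's actual code: `bearerToken` and `authorize` are hypothetical helpers, and the constant-time comparison is an assumption about how such a check might reasonably be written.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// bearerToken extracts the token from an Authorization header value.
// It returns false when the header is not of the form "Bearer <token>".
func bearerToken(header string) (string, bool) {
	scheme, token, found := strings.Cut(header, " ")
	if !found || !strings.EqualFold(scheme, "Bearer") {
		return "", false
	}
	token = strings.TrimSpace(token)
	return token, token != ""
}

// authorize reports whether the header carries one of the allowed tokens.
// Every candidate is compared in constant time (hypothetical design choice).
func authorize(header string, allowed []string) bool {
	token, ok := bearerToken(header)
	if !ok {
		return false
	}
	match := false
	for _, want := range allowed {
		if subtle.ConstantTimeCompare([]byte(token), []byte(want)) == 1 {
			match = true
		}
	}
	return match
}

func main() {
	businessTokens := []string{"replace-with-client-token-a"} // token_profiles[].token
	healthTokens := []string{"replace-with-health-token"}     // server.auth_healthz

	header := "Bearer replace-with-client-token-a"
	fmt.Println("business endpoints:", authorize(header, businessTokens))
	fmt.Println("/v1/healthz:", authorize(header, healthTokens))
}
```

With the placeholder tokens above, the business token is accepted for the business endpoints and rejected for `/v1/healthz`, which matches the independent-token rule.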

## Configuration

YAML is the only configuration source. The repository ships a template:

- [config.example.yaml](config/config.example.yaml)

At runtime, the default configuration file path is `~/.config/codex-relay.yaml`.

Core structure:

```yaml
server:
  host: 127.0.0.1
  port: 8788
  auth_healthz: replace-with-health-token
  outbound_proxy_url: http://127.0.0.1:7897
  provider_cooldown_s: 180
  refresh_cooldown_s: 120

token_profiles:
  - token: replace-with-client-token-a
    remark: primary token for the MacBook Codex CLI
    provider_order: [1, 2]

providers:
  - index: 1
    type: codex-oauth
    auth_access_token: Bearer replace-with-access-token
    auth_refresh_token: replace-with-refresh-token
    force_model: gpt-5.4
    force_reasoning: xhigh
    timeout_s: 60

  - index: 2
    type: openai-compatible
    base_url: https://example.invalid/v1
    api_key: sk-replace-me
    force_model: gpt-5.4
    force_reasoning: medium
    timeout_s: 60
```

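Notes:

- `server.host: 127.0.0.1` means the service listens on the local machine only and is not exposed on `0.0.0.0`
- `token_profiles[].remark` is an operational note only and plays no part in authentication
- `provider_order` is an explicit priority queue of provider indexes
- `provider_cooldown_s` is the cooldown time applied after a provider fails
- `refresh_cooldown_s` is the interval of the automatic recovery check for providers in cooldown
- Whenever a provider does not return a parseable, normal Response, it is treated as failed and enters cooldown
- The process detects changes to the YAML file and hot-reloads them automatically; changes to `server.host` or `server.port` still require a manual restart
- For a `codex-oauth` provider, if the relay successfully refreshes the OAuth access token / refresh token while running, it atomically writes the latest tokens back to the current YAML configuration file, so that a restart does not fall back to a stale refresh token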

## Build and run

```bash
REPO=/Users/shine/codex-relay
mkdir -p /Users/shine/.runtime ~/.config
cp "$REPO/config/config.example.yaml" ~/.config/codex-relay.yaml
go test "$REPO"/...
go build -o /Users/shine/.runtime/codex-relay "$REPO"/cmd/codex-relay
/Users/shine/.runtime/codex-relay --check-config
/Users/shine/.runtime/codex-relay
```

To run in the background with all logs written under `/tmp/codex-relay`:

```bash
make -C /Users/shine/codex-relay run-bg
tail -f /tmp/codex-relay/codex-relay.stderr.log
```

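## Runtime behavior

- Different business tokens can map to different provider priority queues
- Provider cooldown is shared globally per provider index; it is not isolated per token profile
- `busy` / `occupied` means the provider is currently handling a request. It is local runtime state, independent of cooldown, and is never written back to the YAML file
- Providers in cooldown are skipped first
- For a given business token, the relay walks `provider_order` and picks the first provider that is not in cooldown and has `active_requests == 0`
- If every non-cooldown provider of that token is busy, the relay falls back to the first non-cooldown provider in that token's order, even though it is already busy
- If every provider of a token profile is in cooldown, the service still tries them in the original order as last-resort providers
- When `refresh_cooldown_s > 0`, a background task periodically checks whether providers in cooldown have recovered
- For `codex-oauth` providers, this recovery check first queries `/backend-api/wham/usage`
- If the quota of either the official 5-hour window or the 7-day window is exhausted, the real probe is skipped and the provider stays in cooldown
- If the `/wham` query fails, the check falls back to the real probe, so a usage outage is not mistaken for exhausted quota
- When a `codex-oauth` access token is close to expiry, the relay first attempts an OAuth refresh; on success it updates both the in-memory state and the YAML configuration file
- A non-streaming `POST /v1/responses` on a `codex-oauth` provider still sends an SSE request upstream; the relay aggregates the stream locally into a JSON response
- On this non-streaming path, do not assume the final text always appears in `response.completed.response.output`; when the upstream puts the final message in `response.output_item.done`, the relay backfills `response.output` from the completed items
- So when debugging "`status=completed` but `output=[]`", check the non-streaming SSE aggregation logic first instead of raising `timeout_s`
- `timeout_s` only guards connection setup, the first valid response, and non-streaming aggregation; once a streaming response is established, the relay no longer cuts off the SSE stream because of `timeout_s`
- `/v1/healthz` adds `usage` or `usage_error` to each `codex-oauth` provider
- Each provider node in `/v1/healthz` also returns `busy` and `active_requests`, which distinguishes "cooldown caused by an upstream failure" from "currently occupied by concurrent requests"
- `usage` currently contains `quota_5h`, `quota_7d`, `quota_5h_reset_at_ms`, and `quota_7d_reset_at_ms`
- `quota_5h` / `quota_7d` express the remaining percentage of the window by default, not the percentage used
- `quota_5h_reset_at_ms` / `quota_7d_reset_at_ms` are the official window reset times returned by the upstream usage endpoint, as millisecond timestamps
- `openai-compatible` providers do not return a usage field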
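
The backfill of `response.output` on the non-streaming `codex-oauth` path can be sketched as follows. This is an illustration under assumptions, not the relay's implementation: the `event` and `response` types, the `aggregate` function, and the payload shapes (an `item` field on `response.output_item.done`, a `response` field on `response.completed`) are hypothetical stand-ins for already-parsed SSE `data:` payloads.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// response holds the subset of the final response this sketch cares about.
type response struct {
	Status string            `json:"status"`
	Output []json.RawMessage `json:"output"`
}

// event is one decoded SSE data payload (assumed shape).
type event struct {
	Type     string          `json:"type"`
	Item     json.RawMessage `json:"item,omitempty"`
	Response *response       `json:"response,omitempty"`
}

// aggregate folds SSE payloads into one response. If the completed response
// carries an empty output, it is backfilled from the items that were reported
// as done earlier in the stream.
func aggregate(payloads []string) (*response, error) {
	var doneItems []json.RawMessage
	for _, p := range payloads {
		var ev event
		if err := json.Unmarshal([]byte(p), &ev); err != nil {
			return nil, err
		}
		switch ev.Type {
		case "response.output_item.done":
			if ev.Item != nil {
				doneItems = append(doneItems, ev.Item)
			}
		case "response.completed":
			if ev.Response == nil {
				return nil, errors.New("response.completed without a response")
			}
			if len(ev.Response.Output) == 0 {
				ev.Response.Output = doneItems
			}
			return ev.Response, nil
		}
	}
	return nil, errors.New("stream ended before response.completed")
}

func main() {
	payloads := []string{
		`{"type":"response.output_item.done","item":{"type":"message","id":"msg_1"}}`,
		`{"type":"response.completed","response":{"status":"completed","output":[]}}`,
	}
	resp, err := aggregate(payloads)
	if err != nil {
		panic(err)
	}
	fmt.Printf("status=%s output_items=%d\n", resp.Status, len(resp.Output))
}
```

In this sketch a non-empty `output` on the completed response is kept as-is, so the backfill only applies to the `output=[]` case described above.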

## Verification

```bash
curl -i http://127.0.0.1:8788/v1/healthz
curl -i http://127.0.0.1:8788/v1/healthz -H "authorization: Bearer replace-with-health-token"
curl -i http://127.0.0.1:8788/v1/models -H "authorization: Bearer replace-with-client-token-a"
curl -i http://127.0.0.1:8788/v1/responses \
  -H "authorization: Bearer replace-with-client-token-a" \
  -H "content-type: application/json" \
  -d '{"input":"hello","stream":false}'
```

Smoke test for concurrent occupancy:

```bash
BUSINESS_TOKEN='replace-with-client-token-a'
HEALTH_TOKEN='replace-with-health-token'

# Start two streaming requests in the background so they keep two providers busy.
# Stop them later with `kill %1 %2`.
curl -N http://127.0.0.1:8788/v1/responses \
  -H "authorization: Bearer ${BUSINESS_TOKEN}" \
  -H "content-type: application/json" \
  -d '{"input":"hold provider 1 open","stream":true}' &

curl -N http://127.0.0.1:8788/v1/responses \
  -H "authorization: Bearer ${BUSINESS_TOKEN}" \
  -H "content-type: application/json" \
  -d '{"input":"hold provider 2 open","stream":true}' &

curl -sS http://127.0.0.1:8788/v1/responses \
  -H "authorization: Bearer ${BUSINESS_TOKEN}" \
  -H "content-type: application/json" \
  -d '{"input":"all busy should still fallback to first non-cooldown provider","stream":false}'

curl -sS http://127.0.0.1:8788/v1/healthz \
  -H "authorization: Bearer ${HEALTH_TOKEN}"
```

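Expected:

- The second request moves on to the next idle provider in the same token's `provider_order`
- When both providers are busy, the third request still falls back to the token's first non-cooldown provider instead of returning `503` right away
- `/v1/healthz` shows `busy: true` and `active_requests` for the affected providers
- After the first two requests are closed, `/v1/healthz` shows `active_requests` back at `0`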
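
The expected routing follows from the selection rule described under runtime behavior. The sketch below models that rule in isolation. It is an assumption-laden illustration: the `provider` type and `pick` function are hypothetical, and it returns only the first candidate, whereas the relay is described as continuing through last-resort providers in order.

```go
package main

import "fmt"

// provider is the runtime state tracked per provider index (hypothetical type).
type provider struct {
	Index          int
	Cooldown       bool
	ActiveRequests int
}

// pick chooses a provider for one request:
//  1. the first provider in order that is not in cooldown and is idle;
//  2. otherwise the first provider that is not in cooldown, even if busy;
//  3. otherwise the first provider in order, as a last resort.
func pick(order []int, providers map[int]*provider) (int, bool) {
	firstNonCooldown, haveNonCooldown := 0, false
	firstAny, haveAny := 0, false
	for _, idx := range order {
		p, ok := providers[idx]
		if !ok {
			continue // unknown index in provider_order
		}
		if !haveAny {
			firstAny, haveAny = idx, true
		}
		if p.Cooldown {
			continue
		}
		if p.ActiveRequests == 0 {
			return idx, true
		}
		if !haveNonCooldown {
			firstNonCooldown, haveNonCooldown = idx, true
		}
	}
	if haveNonCooldown {
		return firstNonCooldown, true
	}
	return firstAny, haveAny
}

func main() {
	providers := map[int]*provider{1: {Index: 1}, 2: {Index: 2}}
	order := []int{1, 2}

	// Three overlapping requests, none of which has finished yet.
	for i := 1; i <= 3; i++ {
		idx, ok := pick(order, providers)
		if !ok {
			panic("no provider configured")
		}
		providers[idx].ActiveRequests++
		fmt.Printf("request %d -> provider %d\n", i, idx)
	}
}
```

Run as written, the three requests go to providers 1, 2, and 1, which mirrors the smoke test: the third request lands on the first non-cooldown provider even though it is busy.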

A typical provider fragment from healthz:

```json
{
  "index": 1,
  "type": "codex-oauth",
  "model": "gpt-5.4",
  "busy": false,
  "active_requests": 0,
  "usage": {
    "quota_5h": "75%",
    "quota_7d": "58%",
    "quota_5h_reset_at_ms": 1776091507000,
    "quota_7d_reset_at_ms": 1776385421000
  }
}
```

If fetching usage fails, the same provider node returns `usage_error` instead, but `/v1/healthz` as a whole still returns `200 OK`.