API · v0.0.3

UnifiedEngine API Reference

UnifiedEngine exposes an OpenAI-compatible HTTP API served entirely from your Mac. Any OpenAI SDK that supports a custom base URL can talk to it unchanged.

Base URL

http://127.0.0.1:38180

To expose the daemon to a trusted network, bind a non-loopback host and set an API key:

UE_API_KEY="replace-with-a-secret" \
cargo run -p ue-daemon -- \
  --host 0.0.0.0 \
  --port 38180 \
  --model /path/to/model.gguf \
  --cors-origin "https://example.com" \
  --max-request-bytes 1048576 \
  --max-concurrent-requests 4 \
  --requests-per-minute 60

Authentication

When the daemon binds to a non-loopback host, an API key is required. Send it as a bearer token on every /v1/* request:

Authorization: Bearer <api-key>
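As a rough illustration of the header rule above, the following Python sketch builds request headers and only attaches the bearer token when a key is supplied. The helper name `auth_headers` is hypothetical and not part of UnifiedEngine.

```python
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build headers for a /v1/* request.

    Hypothetical helper: the Authorization header is added only when a key
    is given, which matches a loopback-only daemon that has no key set.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```

Calling `auth_headers("replace-with-a-secret")` yields both headers; calling it with no argument yields only `Content-Type`.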

/health stays unauthenticated for local readiness checks. In the macOS app, keys are stored in Keychain and passed to the helper daemon through UE_API_KEY.

GET /health

Unauthenticated readiness probe.

{
  "status": "ok",
  "version": "0.0.3"
}
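A client that launches the daemon may want to wait for this probe before sending work. The sketch below polls /health with the Python standard library and treats `"status": "ok"` as ready. The function names, retry count, and delay are illustrative assumptions, not documented behavior.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://127.0.0.1:38180"  # default base URL from this reference


def is_healthy(body: bytes) -> bool:
    """Return True when a /health payload reports status "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:  # malformed JSON or undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def fetch_health() -> bytes:
    """Read the raw /health response body (no Authorization header needed)."""
    with urllib.request.urlopen(f"{BASE_URL}/health", timeout=2) as resp:
        return resp.read()


def wait_until_ready(
    fetch: Callable[[], bytes] = fetch_health,
    attempts: int = 10,   # assumed retry budget
    delay: float = 0.5,   # assumed pause between attempts, in seconds
) -> bool:
    """Poll until the probe reports ok or the attempts run out."""
    for attempt in range(attempts):
        if attempt:
            time.sleep(delay)
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            pass  # connection refused or timed out: daemon not listening yet
    return False
```

Passing `fetch` as a parameter keeps the polling loop testable without a running daemon.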

GET /v1/status

Returns runtime, backend, and active model details. Local model paths are redacted unless the daemon is started with --expose-model-path.

{
  "status": "running",
  "version": "0.0.3",
  "endpoint": "http://127.0.0.1:38180",
  "started_at": 1730000000,
  "runtime": {
    "runtime": "llama.cpp",
    "model_id": "local-gguf",
    "backend": "metal",
    "ready": true
  },
  "active_model": {
    "id": "local-gguf",
    "object": "model",
    "owned_by": "unifiedengine",
    "format": "gguf",
    "status": "valid"
  }
}
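To show how a client might consume this payload, here is a small Python sketch that condenses the fields from the sample response into one line. The function name `summarize_status` and the `"unknown"` fallbacks are illustrative assumptions.

```python
from typing import Any, Dict


def summarize_status(payload: Dict[str, Any]) -> str:
    """Condense a /v1/status payload into a one-line description.

    Only fields shown in the sample response are read; anything missing
    falls back to "unknown" (an assumption made for this sketch).
    """
    runtime = payload.get("runtime") or {}
    model = payload.get("active_model") or {}
    state = "ready" if runtime.get("ready") else "not ready"
    return "{} ({}) on {}/{}: {}".format(
        model.get("id", "unknown"),
        model.get("format", "unknown"),
        runtime.get("runtime", "unknown"),
        runtime.get("backend", "unknown"),
        state,
    )
```

For the sample response above, this returns `local-gguf (gguf) on llama.cpp/metal: ready`.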

GET /v1/models

Returns metadata for the active model.

POST /v1/chat/completions

Supported fields:

  • model
  • messages
  • temperature
  • max_tokens
  • stream

Unsupported OpenAI fields are ignored in v0.0.3.
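Because unsupported fields are ignored rather than rejected, a client may want to see what will actually take effect. The following Python sketch splits a payload into the five supported fields and everything else. The helper name `split_supported` is hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Fields listed as supported in v0.0.3.
SUPPORTED_FIELDS = ("model", "messages", "temperature", "max_tokens", "stream")


def split_supported(payload: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return (fields the daemon will use, sorted names of fields it will ignore)."""
    kept = {key: value for key, value in payload.items() if key in SUPPORTED_FIELDS}
    ignored = sorted(key for key in payload if key not in SUPPORTED_FIELDS)
    return kept, ignored
```

Logging the second return value is one way to notice that a setting such as `top_p` has no effect against this version.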

Non-streaming

curl http://127.0.0.1:38180/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key-if-configured>" \
  -d '{"model":"local-gguf","messages":[{"role":"user","content":"Hello"}],"stream":false}'
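The same non-streaming call can be sketched with the Python standard library. The helper names are hypothetical, and `extract_reply` assumes the response follows the OpenAI chat completion shape (`choices[0].message.content`), which this reference implies through "OpenAI-compatible" but does not show.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:38180"  # default base URL from this reference


def build_chat_request(
    prompt: str,
    api_key: Optional[str] = None,
    model: str = "local-gguf",
) -> urllib.request.Request:
    """Build the POST request that the curl example sends."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:  # only needed when the daemon was started with a key
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions", data=body, headers=headers, method="POST"
    )


def extract_reply(response_body: bytes) -> str:
    """Pull the assistant text out of a response (assumed OpenAI shape)."""
    payload = json.loads(response_body)
    return payload["choices"][0]["message"]["content"]


def chat(prompt: str, api_key: Optional[str] = None) -> str:
    """Send one prompt and return the reply; requires a running daemon."""
    with urllib.request.urlopen(build_chat_request(prompt, api_key), timeout=60) as resp:
        return extract_reply(resp.read())
```

With the daemon running, `chat("Hello")` sends the request and returns the reply text.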

Streaming (SSE)

curl -N http://127.0.0.1:38180/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <api-key-if-configured>" \
  -d '{"model":"local-gguf","messages":[{"role":"user","content":"Hello"}],"stream":true}'
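Consuming the stream means reading server-sent event lines. The sketch below assumes the OpenAI streaming convention: each event is a `data:` line holding a JSON chunk with `choices[0].delta.content`, and a final `data: [DONE]` line ends the stream. This reference does not show the chunk format, so treat those details as assumptions. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an SSE byte-line stream.

    Assumes OpenAI-style chunks and a "[DONE]" sentinel; neither is
    documented in this reference.
    """
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:  # role-only or empty deltas carry no text
            yield text
```

An `http.client.HTTPResponse` returned by `urllib.request.urlopen` can be iterated line by line, so it can be passed directly as `lines`.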

Limits & validation

The control plane validates every request and enforces the limits you configure at startup:

--max-request-bytes          Maximum request body size
--max-concurrent-requests    In-flight request ceiling
--requests-per-minute        Per-client rate limit
--cors-origin                Allowed cross-origin caller
x-request-id                 Returned on every response for tracing
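A client can check the body size locally before sending, which avoids a round trip for a request the daemon would refuse. The Python sketch below uses the 1048576-byte value from the startup example as its limit. The helper name is hypothetical, and how the daemon responds to an oversized body is not documented here.

```python
import json
from typing import Any, Dict

# Mirrors --max-request-bytes 1048576 from the startup example; adjust to
# whatever the daemon was actually started with.
MAX_REQUEST_BYTES = 1048576


def encode_within_limit(payload: Dict[str, Any], limit: int = MAX_REQUEST_BYTES) -> bytes:
    """Serialize a payload and refuse it locally if it exceeds the limit."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > limit:
        raise ValueError(f"request body is {len(body)} bytes; limit is {limit}")
    return body
```

When a request does fail, logging the `x-request-id` response header alongside the error makes it easier to match the failure to the daemon's own records.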