Relay SDK
Relay is a lightweight Python layer that smooths out the differences between OpenAI, Anthropic, and every provider you plug in next.
Key Capabilities
Single Predictable Client
Instantiate `relay.LLM` once and swap providers by changing the identifier—no rewrites, no bespoke client code.
Smart Defaults & Normalized Responses
Default models, provider-specific settings, and usage metrics are resolved and normalised for you, so downstream code always receives the same structure.
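As a rough illustration, that normalised payload could look like the sketch below; the `ChatResponse` name and any fields beyond `provider`, `model`, `content`, and the `Usage` metrics are assumptions, not Relay's confirmed API.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Usage:
    # Token counts normalised across providers.
    prompt_tokens: int
    completion_tokens: int

@dataclass
class ChatResponse:
    provider: str           # e.g. "openai"
    model: str              # e.g. "gpt-4o-mini"
    content: str            # the assistant's reply text
    usage: Optional[Usage]  # None if a provider omits usage metrics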
Extensible Provider Registry
Drop in new providers by subclassing `BaseProvider` and registering the subclass, without touching the rest of the SDK.
Why Relay?
Relay sits in front of provider SDKs and exposes a compact façade that always accepts standard chat messages and returns a unified payload. It maximises portability so teams can experiment with different models without rewriting integrations.
Design Principles
The internal design prioritises clarity: configuration errors surface early, HTTP failures are wrapped in typed exceptions, and every provider shares the same lifecycle. New contributors can trace requests end-to-end in just a few modules.
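As a sketch of that contract, calling code can branch on configuration problems versus transport failures; the exception types shown here are placeholders, not Relay's confirmed hierarchy.

from relay import LLM

try:
    client = LLM(name="openai:gpt-4o-mini", api_key="sk-...")
    reply = client.chat([{"role": "user", "content": "ping"}])
except ValueError as exc:
    # Placeholder: configuration errors (unknown slug, missing key)
    # surface before any network call is made.
    print(f"misconfigured client: {exc}")
except Exception as exc:
    # Placeholder: HTTP failures arrive wrapped in typed exceptions
    # rather than as raw provider errors; catch Relay's specific
    # exception types in real code.
    print(f"request failed: {exc}")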
Provider Support
OpenAI
Handles the Chat Completions API with curated parameter support, defaulting to `gpt-4o-mini` while letting teams tweak temperature, token limits, and output formats (a sketch follows the list below).
- Gracefully surfaces API errors with status codes and raw payloads.
- Normalises chat messages and usage metrics, including prompt and completion tokens.
- Supports organisational scoping, configurable base URLs, and request timeouts.
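A hedged sketch of those knobs; the keyword names (`organization`, `base_url`, `timeout`, `max_tokens`) are assumptions mirroring common OpenAI client options, not confirmed parameters.

from relay import LLM

# Keyword names below are illustrative; check the OpenAI provider's
# signature for the exact spelling.
client = LLM(
    name="openai",           # model omitted, falls back to gpt-4o-mini
    api_key="sk-...",
    organization="org-...",  # organisational scoping (assumed keyword)
    base_url="https://example.internal/v1",  # configurable base URL (assumed)
    timeout=30.0,            # request timeout in seconds (assumed)
)

response = client.chat(
    [{"role": "user", "content": "Summarise Relay in one line."}],
    temperature=0.2,
    max_tokens=128,          # token limit (assumed keyword)
)
print(response.usage)        # normalised prompt/completion token counts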
Anthropic
Bridges Anthropic’s Messages API, translating Relay messages into Claude-friendly payloads and extracting text from the content blocks in its responses (a sketch follows the list below).
- Combines multiple system prompts and enforces supported roles up front.
- Exposes knobs for temperature, top-p/top-k, and stop sequences.
- Maps Anthropic usage fields into Relay’s `Usage` dataclass.
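For example, a Claude call through the same façade might look like this; the model alias and the sampling keywords (`top_p`, `stop_sequences`) are assumptions mirroring Anthropic's own API.

from relay import LLM

client = LLM(name="anthropic:claude-3-5-haiku-latest", api_key="sk-ant-...")

# Both system messages are merged into a single Claude system prompt;
# unsupported roles are rejected before the request is sent.
response = client.chat(
    [
        {"role": "system", "content": "You are terse."},
        {"role": "system", "content": "Answer in one sentence."},
        {"role": "user", "content": "What does Relay do?"},
    ],
    temperature=0.5,
    top_p=0.9,                # assumed keyword, mirroring Anthropic's top_p
    stop_sequences=["\n\n"],  # assumed keyword, mirroring Anthropic's API
)
print(response.usage)  # Anthropic usage mapped into Relay's Usage dataclass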
Quick Start
The `LLM` façade unifies both chat and completion workflows. Provide a provider slug (optionally with a model) and an API key—Relay handles the rest.
from relay import LLM

client = LLM(name="openai:gpt-4o-mini", api_key="sk-...")

response = client.chat(
    [
        {"role": "system", "content": "You are Relay, a helpful SDK assistant."},
        {"role": "user", "content": "Give me a short fun fact about Python."},
    ],
    temperature=0.3,
)

print(f"[{response.provider}:{response.model}] {response.content}")
Architecture at a Glance
Flow
Calls go through the lightweight `LLM` façade, which parses the provider identifier, merges global defaults, and delegates to a provider implementation retrieved from the registry.
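Conceptually the dispatch reduces to a few steps, sketched below for orientation; this is explanatory Python, not Relay's actual source.

# Explanatory sketch of the dispatch path, not Relay's real internals.
REGISTRY: dict = {}                     # filled via register_provider
GLOBAL_DEFAULTS = {"temperature": 0.7}  # illustrative default only

def dispatch(identifier: str, api_key: str, messages: list, **options):
    slug, _, model = identifier.partition(":")  # "openai:gpt-4o-mini"
    provider = REGISTRY[slug](api_key=api_key, model=model or None)
    merged = {**GLOBAL_DEFAULTS, **options}     # call-site options win
    return provider.chat(messages, **merged)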
Extension Points
Every provider subclasses `BaseProvider`, implements `chat`, and inherits a shared `complete` fallback. The registry (`relay.providers`) keeps the surface area stable while exposing `register_provider` for custom integrations.
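A minimal custom integration could look like the following; `BaseProvider` and `register_provider` are named above, but the import path for `BaseProvider`, the constructor, and the return shape are assumptions.

from relay.providers import BaseProvider, register_provider

class EchoProvider(BaseProvider):
    """Toy provider that echoes the last user message back."""

    def chat(self, messages, **options):
        # Find the most recent user turn and return it unchanged.
        last_user = next(
            m["content"] for m in reversed(messages) if m["role"] == "user"
        )
        # Return shape is an assumption; mirror whatever Relay's real
        # providers return (provider, model, content, usage).
        return {"provider": "echo", "model": "echo-1", "content": last_user}

# Makes LLM(name="echo") resolve to the class above; the exact
# register_provider signature is an assumption.
register_provider("echo", EchoProvider)

After registration, `LLM(name="echo")` routes through `EchoProvider` just like the built-in providers.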
Roadmap & Contribution
Next Up
- Add more providers such as Mistral, Gemini, and local runtimes.
- Deliver streaming responses and tool/function calling helpers.
- Offer both sync and async flavours for production workloads.
Contributing
Follow the Developer Guidelines: set up a virtualenv, install in editable mode, keep changes typed and tested with `pytest`, and document new workflows in the README and examples.