# HiveForge (Full Documentation)

> HiveForge is a batteries-included SaaS platform that lets you deploy production-ready applications with built-in AI, billing, email, vector search, webhooks, and MCP integration.

This is the full version of llms.txt. For a summary, see https://docs.hiveforge.dev/llms.txt

---

# Authentication Overview

HiveForge supports four authentication mechanisms, each designed for a different use case. Choose the right one based on who or what is making the request.

## Authentication methods

| Method | Header | Use case |
|---|---|---|
| Supabase JWT | `Authorization: Bearer <token>` | End-users accessing the platform via browser or mobile app |
| API Keys | `Authorization: Bearer hf_live_...` | Programmatic access from your backend services |
| Deployment Credentials | `X-Deployment-ID` + `X-Deployment-Secret` | SDK and proxy calls from deployed customer apps |
| MCP Service Key | `X-MCP-Service-Key` | Inter-service calls for MCP tool execution |

## How it works

```
                        HiveForge API
                             |
         +-------------------+-------------------+
         |                   |                   |
    JWT Tokens          API Keys          Deployment Creds
    (end users)       (programmatic)      (SDK / proxy)
         |                   |                   |
   Supabase Auth      Key validation      Secret matching
   HS256 / RS256      Scope checking      Tier entitlements
```

### 1. Supabase JWT

The default authentication method for end-users. When a user signs in through the HiveForge web app (email/password or OAuth), Supabase issues a JWT that is sent as a Bearer token. The API verifies the token using either the JWT secret (HS256) or the JWKS endpoint (RS256).

**Best for:** Browser-based applications, mobile apps, any user-facing client.

### 2. API Keys

Prefixed keys (`hf_live_...` for production, `hf_test_...` for sandbox) with granular scope-based permissions. Keys are tied to an organization and created by admin or owner users. The API key middleware validates the key and attaches scopes to the request for downstream enforcement.

**Best for:** Server-to-server integrations, CI/CD pipelines, automation scripts.
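
The prefix convention makes it cheap to catch a sandbox key in a production configuration before any request is sent. The following is a minimal sketch: `apiKeyEnvironment` is a hypothetical helper, not part of the SDK, and it only inspects the documented prefixes (it does not validate the key itself).

```typescript
type ApiKeyEnvironment = 'production' | 'sandbox' | 'unknown';

// Hypothetical helper: classifies a key by its documented prefix only.
function apiKeyEnvironment(key: string): ApiKeyEnvironment {
  if (key.startsWith('hf_live_')) return 'production';
  if (key.startsWith('hf_test_')) return 'sandbox';
  return 'unknown';
}
```

A startup check could, for example, refuse to boot a production service when the configured key does not classify as `'production'`.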

### 3. Deployment Credentials

A pair of headers (`X-Deployment-ID` and `X-Deployment-Secret`) used by the HiveForge SDK when a deployed customer application communicates with the platform. These credentials identify the deployment and determine tier-based entitlements.

**Best for:** Customer SaaS apps deployed through HiveForge, SDK initialization.

### 4. MCP Service Key

A shared secret sent via the `X-MCP-Service-Key` header for Model Context Protocol inter-service calls. This authenticates tool invocations between the MCP server and the HiveForge API.

**Best for:** MCP tool servers, internal service-to-service communication.

All API requests must use HTTPS in production. The HiveForge API is served from `https://api.hiveforge.dev`, and versioned endpoints live under the `/api/v1` prefix.

## Choosing the right method

- **Building a web app?** Use JWT tokens via Supabase Auth.
- **Calling the API from a backend?** Use API keys with appropriate scopes.
- **Using the HiveForge SDK in a deployed app?** Use deployment credentials.
- **Connecting an MCP tool server?** Use the MCP service key.
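
As a rough illustration of how the four methods differ on the wire, the sketch below maps each one to the headers listed in the table above. The `AuthMethod` type and `buildAuthHeaders` function are hypothetical names introduced here; only the header names come from this document.

```typescript
// Hypothetical discriminated union covering the four documented methods.
type AuthMethod =
  | { kind: 'jwt'; token: string }
  | { kind: 'apiKey'; key: string }
  | { kind: 'deployment'; deploymentId: string; deploymentSecret: string }
  | { kind: 'mcpService'; serviceKey: string };

// Returns the request headers that carry the given credentials.
function buildAuthHeaders(auth: AuthMethod): Record<string, string> {
  switch (auth.kind) {
    case 'jwt':
      return { Authorization: `Bearer ${auth.token}` };
    case 'apiKey':
      return { Authorization: `Bearer ${auth.key}` };
    case 'deployment':
      return {
        'X-Deployment-ID': auth.deploymentId,
        'X-Deployment-Secret': auth.deploymentSecret,
      };
    case 'mcpService':
      return { 'X-MCP-Service-Key': auth.serviceKey };
  }
}
```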

---

# HiveForge SDK

The HiveForge TypeScript SDK provides a complete interface for hosted deployments to interact with HiveForge platform services, including entitlement management, AI proxy, billing, email, vector search, webhooks, and more.

## Installation

```bash
npm install @producthacker/hiveforge-sdk
```

## Environment Variables

Set these in your `.env` file or hosting environment:

```env
HIVEFORGE_DEPLOYMENT_ID=d9f2a1b4-7c3e-4f8a-b5d6-1e2f3a4b5c6d
HIVEFORGE_DEPLOYMENT_SECRET=sk_live_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
HIVEFORGE_API_URL=https://api.hiveforge.dev/api/v1   # Optional, this is the default
```

| Variable | Required | Description |
|---|---|---|
| `HIVEFORGE_DEPLOYMENT_ID` | Yes | Your deployment's unique identifier from the HiveForge dashboard |
| `HIVEFORGE_DEPLOYMENT_SECRET` | Yes | Secret key for authenticating API requests |
| `HIVEFORGE_API_URL` | No | API base URL. Defaults to `https://api.hiveforge.dev/api/v1` |
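
If you prefer to fail fast on a misconfigured environment, a small validation step can run before the client is constructed. This is a sketch under assumptions: `readDeploymentEnv` and `DeploymentEnvConfig` are hypothetical names, not SDK exports; the variable names and default URL are the ones from the table above.

```typescript
interface DeploymentEnvConfig {
  deploymentId: string;
  deploymentSecret: string;
  apiUrl: string;
}

const DEFAULT_API_URL = 'https://api.hiveforge.dev/api/v1';

// Hypothetical helper: checks the required variables and applies the default URL.
function readDeploymentEnv(env: Record<string, string | undefined>): DeploymentEnvConfig {
  const deploymentId = env.HIVEFORGE_DEPLOYMENT_ID;
  const deploymentSecret = env.HIVEFORGE_DEPLOYMENT_SECRET;

  const missing: string[] = [];
  if (!deploymentId) missing.push('HIVEFORGE_DEPLOYMENT_ID');
  if (!deploymentSecret) missing.push('HIVEFORGE_DEPLOYMENT_SECRET');
  if (!deploymentId || !deploymentSecret) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  return {
    deploymentId,
    deploymentSecret,
    apiUrl: env.HIVEFORGE_API_URL || DEFAULT_API_URL,
  };
}
```

In a Node.js process this would typically be called as `readDeploymentEnv(process.env)`.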

## Client Initialization

### Manual Configuration

```typescript
import { HiveForgeClient } from '@producthacker/hiveforge-sdk';

const hiveforge = new HiveForgeClient({
  deploymentId: process.env.HIVEFORGE_DEPLOYMENT_ID!,
  deploymentSecret: process.env.HIVEFORGE_DEPLOYMENT_SECRET!,
  apiUrl: process.env.HIVEFORGE_API_URL,  // optional
  debug: false,                            // optional, enables console logging
});

// Must be called before using any features
await hiveforge.initialize();
```

### From Environment Variables

```typescript
import { createHiveForgeClient } from '@producthacker/hiveforge-sdk';

// Reads HIVEFORGE_DEPLOYMENT_ID and HIVEFORGE_DEPLOYMENT_SECRET from env
const hiveforge = createHiveForgeClient({
  debug: process.env.NODE_ENV === 'development',
});

await hiveforge.initialize();
```

You must call `initialize()` before using any SDK features. It fetches your deployment's entitlements from the server and starts a background refresh of those entitlements.

## Configuration Options

The `HiveForgeConfig` interface accepts the following options:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `deploymentId` | `string` | Yes | -- | Deployment ID from HiveForge dashboard |
| `deploymentSecret` | `string` | Yes | -- | Deployment secret key |
| `apiUrl` | `string` | No | `https://api.hiveforge.dev/api/v1` | API base URL |
| `debug` | `boolean` | No | `false` | Enable debug logging to console |
| `fetch` | `typeof fetch` | No | `globalThis.fetch` | Custom fetch implementation (useful for testing) |

## API Modules

After initialization, the client exposes these proxy modules:

| Module | Access | Description |
|---|---|---|
| `ai` | `hiveforge.ai` | Chat completions, streaming, embeddings |
| `billing` | `hiveforge.billing` | Stripe checkout sessions, customer portal, prices |
| `credits` | `hiveforge.credits` | Credit balance, usage, packs, costs |
| `email` | `hiveforge.email` | Transactional email delivery |
| `vectors` | `hiveforge.vectors` | Semantic vector search and storage |
| `webhooks` | `hiveforge.webhooks` | Reliable outbound webhook delivery |

## Entitlement Methods

These methods are available directly on the `HiveForgeClient` instance after initialization.

### `isEnabled(feature)`

Check if a feature flag is enabled for the current deployment.

```typescript
if (hiveforge.isEnabled('ai_enabled')) {
  // AI features are available
}

if (hiveforge.isEnabled('billing_enabled')) {
  // Billing features are available
}
```

**Parameters:**

| Name | Type | Description |
|---|---|---|
| `feature` | `keyof FeatureFlags` | Feature flag name to check |

**Returns:** `boolean`

### `getStatus()`

Get the current subscription status.

```typescript
const status = hiveforge.getStatus();
// 'active' | 'trialing' | 'past_due' | 'grace_period' | 'suspended' | 'canceled'
```

**Returns:** `SubscriptionStatus`

### `getTier()`

Get the current subscription tier.

```typescript
const tier = hiveforge.getTier();
// 'sandbox' | 'trial' | 'launch' | 'growth' | 'enterprise'
```

**Returns:** `string`

### `getMessage()`

Get the user-facing message from entitlements (e.g., trial expiration warnings).

```typescript
const message = hiveforge.getMessage();
// "Trial expires in 3 days" or null
```

**Returns:** `string | null`

### `getEntitlements()`

Get the full entitlements object.

```typescript
const entitlements = hiveforge.getEntitlements();
if (entitlements) {
  console.log(entitlements.tier);
  console.log(entitlements.features.ai_monthly_limit);
  console.log(entitlements.quotas);
}
```

**Returns:** `Entitlements | null`

### `getQuota(resource)`

Get quota information for a specific resource.

```typescript
const aiQuota = hiveforge.getQuota('ai_tokens');
if (aiQuota) {
  console.log(`Used: ${aiQuota.used}, Limit: ${aiQuota.limit}, Remaining: ${aiQuota.remaining}`);
}
```

**Returns:** `QuotaInfo | null`

### `isSuspended()`

Check if the deployment is in a suspended state. Returns `boolean`.

### `isInGracePeriod()`

Check if the deployment is in a grace period. Returns `boolean`.

### `refreshEntitlements()`

Manually refresh entitlements from the server. Returns `Promise<Entitlements>`.

### `shutdown()`

Stop background entitlement refresh and clean up.

## Event System

Subscribe to SDK events for real-time updates.

```typescript
// Entitlements refreshed
const unsub1 = hiveforge.on('entitlements:updated', (entitlements) => {
  console.log('Tier:', entitlements.tier);
});

// Subscription status changed
const unsub2 = hiveforge.on('status:changed', ({ previous, current }) => {
  console.log(`Status changed: ${previous} -> ${current}`);
});

// Quota approaching limit (80% threshold)
const unsub3 = hiveforge.on('quota:warning', ({ resource, used, limit }) => {
  console.warn(`${resource} at ${Math.round((used / limit) * 100)}% usage`);
});

// Quota exceeded
const unsub4 = hiveforge.on('quota:exceeded', ({ resource, used, limit }) => {
  console.error(`${resource} quota exceeded: ${used}/${limit}`);
});

// Unsubscribe when done
unsub1();
```

### Event Reference

| Event | Payload | Description |
|---|---|---|
| `entitlements:updated` | `Entitlements` | Entitlements successfully refreshed |
| `entitlements:error` | `Error` | Entitlement refresh failed |
| `quota:warning` | `{ resource, used, limit }` | Resource usage above 80% |
| `quota:exceeded` | `{ resource, used, limit }` | Resource usage at or above limit |
| `status:changed` | `{ previous, current }` | Subscription status changed |
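
The two quota events correspond to two thresholds on the same `used` / `limit` pair. The sketch below restates those thresholds as a pure function, which can be handy for rendering a usage indicator from `getQuota()` data. `quotaLevel` is a hypothetical helper; the SDK's internal check may differ, and treating a `null` limit as unlimited is an assumption borrowed from the feature flag tables.

```typescript
type QuotaLevel = 'ok' | 'warning' | 'exceeded';

// Sketch of the documented thresholds; not the SDK's actual implementation.
function quotaLevel(used: number, limit: number | null): QuotaLevel {
  if (limit === null) return 'ok';          // assumption: null means unlimited
  if (used >= limit) return 'exceeded';     // "at or above limit"
  if (used > limit * 0.8) return 'warning'; // "above 80%"
  return 'ok';
}
```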

## Feature Flags Reference

| Flag | Type | Description |
|---|---|---|
| `ai_enabled` | `boolean` | Whether AI proxy is available |
| `ai_monthly_limit` | `number | null` | Monthly token limit (`null` = unlimited) |
| `billing_enabled` | `boolean` | Whether billing proxy is available |
| `custom_domain` | `boolean` | Custom domain support |
| `white_label` | `boolean` | White labeling enabled |
| `api_rate_limit` | `number | null` | API rate limit (`null` = unlimited) |
| `support_level` | `string` | Support tier: `community`, `email`, `priority`, or `dedicated` |

## Subscription Statuses

| Status | Description |
|---|---|
| `active` | Subscription is active and in good standing |
| `trialing` | Currently in trial period |
| `past_due` | Payment failed, system is retrying |
| `grace_period` | Subscription lapsed, features limited temporarily |
| `suspended` | Account fully suspended, features disabled |
| `canceled` | Subscription canceled |
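
Applications often collapse these statuses into a coarser access decision. The mapping below is an illustrative policy, not platform behavior: `AccessLevel` and `accessLevelFor` are hypothetical, and the choices for `past_due` and `canceled` are assumptions you should adjust to your product.

```typescript
type SubscriptionStatus =
  | 'active'
  | 'trialing'
  | 'past_due'
  | 'grace_period'
  | 'suspended'
  | 'canceled';

type AccessLevel = 'full' | 'limited' | 'none';

// Illustrative policy only; pick the mapping that matches your product.
function accessLevelFor(subscription: SubscriptionStatus): AccessLevel {
  switch (subscription) {
    case 'active':
    case 'trialing':
    case 'past_due': // assumption: keep full access while payment retries run
      return 'full';
    case 'grace_period': // table: "features limited temporarily"
      return 'limited';
    case 'suspended': // table: "features disabled"
    case 'canceled': // assumption: treat canceled like suspended
      return 'none';
  }
}
```

Because the `switch` is exhaustive over the union, adding a new status to the type produces a compile error until the policy handles it.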

## TypeScript Types

All types are exported for use in your application:

```typescript
import type {
  HiveForgeConfig,
  Entitlements,
  SubscriptionStatus,
  FeatureFlags,
  QuotaInfo,
  ChatMessage,
  AICompletionOptions,
  AICompletionResponse,
  AIStreamOptions,
  AIStreamChunk,
  AIEmbeddingOptions,
  AIEmbeddingResponse,
  AIQuotaResponse,
  CheckoutOptions,
  CheckoutResponse,
  PortalOptions,
  PortalResponse,
  HiveForgeEvent,
  HiveForgeEventData,
  HiveForgeEventHandler,
} from '@producthacker/hiveforge-sdk';
```

## Server-Side Usage (Next.js)

```typescript
// lib/hiveforge.ts
import { HiveForgeClient } from '@producthacker/hiveforge-sdk';

export const hiveforge = new HiveForgeClient({
  deploymentId: process.env.HIVEFORGE_DEPLOYMENT_ID!,
  deploymentSecret: process.env.HIVEFORGE_DEPLOYMENT_SECRET!,
});

// Initialize once on server start
if (typeof window === 'undefined') {
  hiveforge.initialize().catch(console.error);
}
```

```typescript
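// pages/api/ai.ts (Pages Router; App Router route handlers in app/api/ai/route.ts
// export named functions such as POST(request) instead of a default handler)
import type { NextApiRequest, NextApiResponse } from 'next';
import { hiveforge } from '@/lib/hiveforge';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {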
  const { message } = req.body;

  const response = await hiveforge.ai.complete({
    messages: [{ role: 'user', content: message }],
  });

  res.json({ content: response.content });
}
```
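
Because `initialize()` is started without being awaited in `lib/hiveforge.ts`, a request that arrives early could reach the handler before entitlements have loaded. One way to guard against that is a small gate that runs `initialize()` at most once and lets every caller await the same promise. This is a sketch: `createInitGate` and the `Initializable` interface are hypothetical and only rely on the client having an `initialize()` method.

```typescript
// Minimal shape of the client used here; the real HiveForgeClient has more members.
interface Initializable {
  initialize(): Promise<unknown>;
}

// Returns a function that runs initialize() at most once and shares the result.
function createInitGate(client: Initializable): () => Promise<void> {
  let pending: Promise<void> | null = null;

  return () => {
    if (pending !== null) return pending;

    const attempt: Promise<void> = client.initialize().then(
      () => undefined,
      (error: unknown) => {
        pending = null; // allow a retry after a failed initialization
        throw error;
      },
    );
    pending = attempt;
    return attempt;
  };
}
```

With `export const ready = createInitGate(hiveforge);` in `lib/hiveforge.ts`, a handler can call `await ready();` before using `hiveforge.ai`.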

---

# AI Proxy

The AI proxy lets your deployed application make OpenAI-compatible requests through HiveForge without embedding API keys in your app. Quota enforcement, model access, and usage tracking are handled automatically based on your deployment's tier.

Access the AI proxy via `hiveforge.ai`.

```typescript
import { HiveForgeClient } from '@producthacker/hiveforge-sdk';

const hiveforge = new HiveForgeClient({
  deploymentId: process.env.HIVEFORGE_DEPLOYMENT_ID!,
  deploymentSecret: process.env.HIVEFORGE_DEPLOYMENT_SECRET!,
});
await hiveforge.initialize();

// Check if AI is available for this tier
if (hiveforge.ai.isEnabled()) {
  const response = await hiveforge.ai.complete({
    messages: [{ role: 'user', content: 'Hello!' }],
  });
  console.log(response.content);
}
```

## Methods

### `complete(options)`

Create a chat completion. Sends messages to the AI model and returns the full response.

```typescript
const response = await hiveforge.ai.complete({
  messages: [
    { role: 'system', content: 'You are a helpful customer support agent.' },
    { role: 'user', content: 'How do I reset my password?' },
  ],
  model: 'gpt-4o-mini',
  temperature: 0.7,
  max_tokens: 500,
});

console.log(response.content);
console.log(`Tokens used: ${response.tokens_used}`);
console.log(`Model: ${response.model}`);
```

**Parameters (`AICompletionOptions`):**

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `messages` | `ChatMessage[]` | Yes | -- | Array of conversation messages |
| `model` | `string` | No | `'gpt-4o-mini'` | Model to use |
| `max_tokens` | `number` | No | -- | Maximum tokens in the response |
| `temperature` | `number` | No | `0.7` | Sampling temperature (0-2) |
| `top_p` | `number` | No | -- | Nucleus sampling parameter |
| `stream` | `boolean` | No | `false` | Whether to stream (use `stream()` method instead) |
| `stop` | `string | string[]` | No | -- | Stop sequences |
| `presence_penalty` | `number` | No | `0` | Presence penalty (-2 to 2) |
| `frequency_penalty` | `number` | No | `0` | Frequency penalty (-2 to 2) |
| `user` | `string` | No | -- | End-user identifier for abuse tracking |
| `metadata` | `Record<string, unknown>` | No | -- | Custom metadata for logging |

**`ChatMessage` type:**

| Field | Type | Required | Description |
|---|---|---|---|
| `role` | `'system' | 'user' | 'assistant' | 'function' | 'tool'` | Yes | Message role |
| `content` | `string | null` | Yes | Message content |
| `name` | `string` | No | Name of the function/tool |

**Returns (`AICompletionResponse`):**

| Field | Type | Description |
|---|---|---|
| `content` | `string` | The generated text response |
| `model` | `string` | Model that was used |
| `tokens_used` | `number` | Total tokens consumed |
| `tokens_input` | `number` | Input/prompt tokens |
| `tokens_output` | `number` | Output/completion tokens |
| `finish_reason` | `string | null` | Why generation stopped (`'stop'`, `'length'`, etc.) |
| `metadata` | `Record<string, unknown>` | Optional metadata |

**Throws:** `AIProxyException` if quota is exceeded or AI is not enabled.

Equivalent curl request:

```bash
curl -X POST https://api.hiveforge.dev/api/v1/proxy/ai/completions \
  -H "Content-Type: application/json" \
  -H "X-Deployment-ID: d9f2a1b4-7c3e-4f8a-b5d6-1e2f3a4b5c6d" \
  -H "X-Deployment-Secret: sk_live_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}],
    "model": "gpt-4o-mini",
    "temperature": 0.7
  }'
```

### `stream(options)`

Create a streaming chat completion. Returns an `AsyncGenerator` that yields content chunks as they arrive.

```typescript
let fullResponse = '';

for await (const chunk of hiveforge.ai.stream({
  messages: [
    { role: 'user', content: 'Write a short poem about TypeScript.' },
  ],
  model: 'gpt-4o-mini',
})) {
  process.stdout.write(chunk.content);
  fullResponse += chunk.content;

  if (chunk.done) {
    console.log('\n--- Stream complete ---');
  }
}
```

**With callbacks:**

```typescript
for await (const chunk of hiveforge.ai.stream({
  messages: [{ role: 'user', content: 'Tell me a story' }],
  onChunk: (chunk) => {
    process.stdout.write(chunk.content);
  },
  onComplete: (fullContent) => {
    console.log(`\nTotal length: ${fullContent.length} chars`);
  },
  onError: (error) => {
    console.error('Stream error:', error.message);
  },
})) {
  // You can also process chunks here
}
```

**Parameters (`AIStreamOptions`):**

Same as `AICompletionOptions` (minus `stream`), plus:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `onChunk` | `(chunk: AIStreamChunk) => void` | No | Callback for each chunk |
| `onComplete` | `(fullContent: string) => void` | No | Callback when streaming completes |
| `onError` | `(error: Error) => void` | No | Callback for stream errors |

**Yields (`AIStreamChunk`):**

| Field | Type | Description |
|---|---|---|
| `content` | `string` | Text content of this chunk |
| `done` | `boolean` | Whether this is the final chunk |

**Returns:** `AsyncGenerator<AIStreamChunk, string, unknown>` -- the return value is the full concatenated content.
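
Note that a `for await` loop discards a generator's return value, so reading the concatenated content from the generator itself means driving it with `next()`. `streamToString()` covers the common case; the sketch below only shows the mechanics. `collectStream` and `fakeStream` are hypothetical, and `fakeStream` stands in for `hiveforge.ai.stream(...)` so the example runs without the SDK.

```typescript
interface StreamChunk {
  content: string;
  done: boolean;
}

// Drives the generator by hand so the final return value is not lost.
async function collectStream(
  source: AsyncGenerator<StreamChunk, string, unknown>,
  onChunk?: (chunk: StreamChunk) => void,
): Promise<string> {
  while (true) {
    const step = await source.next();
    if (step.done) {
      return step.value; // the generator's return value (full content)
    }
    if (onChunk) onChunk(step.value);
  }
}

// Stand-in generator for illustration only.
async function* fakeStream(): AsyncGenerator<StreamChunk, string, unknown> {
  yield { content: 'Hello, ', done: false };
  yield { content: 'world', done: true };
  return 'Hello, world';
}
```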

### `streamToString(options)`

Convenience method that consumes the entire stream and returns the full response as a string.

```typescript
const fullResponse = await hiveforge.ai.streamToString({
  messages: [{ role: 'user', content: 'Summarize this document...' }],
  onChunk: (chunk) => process.stdout.write(chunk.content),
});
```

**Parameters:** Same as `stream()`.

**Returns:** `Promise<string>`

### `embed(options)`

Generate vector embeddings for text. Useful for semantic search, clustering, and similarity comparisons.

```typescript
const result = await hiveforge.ai.embed({
  text: 'How do I reset my password?',
  model: 'text-embedding-3-small',
});

console.log(`Dimensions: ${result.dimensions}`);
console.log(`Tokens used: ${result.tokens_used}`);
```

**Multiple texts:**

```typescript
const result = await hiveforge.ai.embed({
  text: [
    'How do I reset my password?',
    'Where can I update my billing info?',
    'How to enable two-factor authentication',
  ],
});
```

**Parameters (`AIEmbeddingOptions`):**

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `text` | `string | string[]` | Yes | -- | Text(s) to embed |
| `model` | `string` | No | `'text-embedding-3-small'` | Embedding model |
| `metadata` | `Record<string, unknown>` | No | -- | Custom metadata |

**Returns (`AIEmbeddingResponse`):**

| Field | Type | Description |
|---|---|---|
| `embeddings` | `number[][]` | Array of embedding vectors |
| `model` | `string` | Model that was used |
| `tokens_used` | `number` | Tokens consumed |
| `dimensions` | `number` | Dimensionality of each embedding |
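
For similarity comparisons, embedding vectors are commonly compared with cosine similarity. The sketch below works on plain `number[]` vectors such as the rows of `embeddings`; `cosineSimilarity` and `mostSimilar` are hypothetical helpers, not SDK methods, and returning `0` for a zero vector is a convention chosen here.

```typescript
// Cosine similarity between two equal-length vectors; result lies in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // convention for zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Index of the candidate most similar to the query, or -1 if there are none.
function mostSimilar(query: number[], candidates: number[][]): number {
  let bestIndex = -1;
  let bestScore = -Infinity;
  candidates.forEach((candidate, index) => {
    const score = cosineSimilarity(query, candidate);
    if (score > bestScore) {
      bestScore = score;
      bestIndex = index;
    }
  });
  return bestIndex;
}
```

For larger collections, the `vectors` module is the intended home for search; this kind of in-memory comparison is mainly useful for small sets or quick experiments.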

### `getQuota()`

Get the current AI usage quota for your deployment.

```typescript
const quota = await hiveforge.ai.getQuota();

console.log(`Used: ${quota.used} tokens`);
console.log(`Limit: ${quota.limit ?? 'unlimited'}`);
console.log(`Remaining: ${quota.remaining ?? 'unlimited'}`);
console.log(`Resets at: ${quota.resets_at}`);
console.log(`Tier: ${quota.tier}`);
```

**Returns (`AIQuotaResponse`):**

| Field | Type | Description |
|---|---|---|
| `used` | `number` | Tokens used in current period |
| `limit` | `number | null` | Token limit (`null` = unlimited) |
| `remaining` | `number | null` | Tokens remaining (`null` = unlimited) |
| `resets_at` | `string | null` | ISO timestamp when quota resets |
| `tier` | `string` | Current deployment tier |
| `model_limits` | `Record<string, boolean> | null` | Which models are accessible |

### `getModels()`

List available models for your deployment's tier.

```typescript
const models = await hiveforge.ai.getModels();
console.log('Available:', models.available_models);
```

### `isEnabled()`

Check if AI is enabled for the current tier without making an API call (reads from cached entitlements). Returns `boolean`.

### `getRemainingQuota()`

Get remaining AI token quota from cached entitlements (no API call). Returns `number | null` (`null` if unlimited or entitlements not loaded).

## Full Example: Chat Interface

```typescript
import { HiveForgeClient, AIProxyException } from '@producthacker/hiveforge-sdk';

const hiveforge = new HiveForgeClient({
  deploymentId: process.env.HIVEFORGE_DEPLOYMENT_ID!,
  deploymentSecret: process.env.HIVEFORGE_DEPLOYMENT_SECRET!,
});
await hiveforge.initialize();

async function chat(userMessage: string, history: Array<{ role: string; content: string }>) {
  if (!hiveforge.ai.isEnabled()) {
    throw new Error('AI features are not available on your current plan.');
  }

  const remaining = hiveforge.ai.getRemainingQuota();
  if (remaining !== null && remaining < 100) {
    throw new Error('AI token quota nearly exhausted. Please upgrade your plan.');
  }

  try {
    const messages = [
      { role: 'system' as const, content: 'You are a helpful assistant.' },
      ...history.map(m => ({ role: m.role as 'user' | 'assistant', content: m.content })),
      { role: 'user' as const, content: userMessage },
    ];

    let response = '';
    for await (const chunk of hiveforge.ai.stream({ messages })) {
      response += chunk.content;
    }
    return response;
  } catch (error) {
    if (error instanceof AIProxyException) {
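      if (error.isQuotaExceeded) {
        // This code runs server-side (it holds the deployment secret), so there is
        // no `window` to redirect. Surface the upgrade URL to the caller instead.
        throw new Error(`AI quota exceeded. Upgrade at ${error.upgradeUrl ?? '/pricing'}`);
      }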
    }
    throw error;
  }
}
```

---

# API Overview

The HiveForge API provides a unified interface for deployed applications to access AI, billing, email, vector search, and webhook services through managed proxy endpoints.

## Base URL

```
https://api.hiveforge.dev/api/v1
```

All API endpoints are prefixed with `/api/v1`. The current version is **v1**.

## Authentication

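Deployment-facing endpoints accept deployment credentials in one of two forms, depending on the endpoint (JWT, API key, and MCP service key authentication are covered in the Authentication Overview):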

| Method | Headers | Used By |
|--------|---------|---------|
| Deployment credentials | `X-Deployment-ID` + `X-Deployment-Secret` | Proxy endpoints, entitlements |
| Body credentials | `deployment_id` + `deployment_secret` in JSON body | Entitlement check, heartbeat |

## Request Format

All request bodies must be JSON with `Content-Type: application/json`.

## Response Format

All responses return JSON.

**Successful response:**

```json
{
  "content": "...",
  "model": "gpt-4o-mini",
  "tokens_used": 150
}
```

**Error response:**

```json
{
  "detail": "Invalid deployment credentials"
}
```

Proxy endpoints may return structured error objects:

```json
{
  "error": {
    "code": "QUOTA_EXCEEDED",
    "message": "Monthly AI quota exceeded for this tier",
    "quota_exceeded": true,
    "upgrade_url": "https://hiveforge.dev/upgrade"
  }
}
```
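
Since error bodies can arrive in either shape, client code that calls the HTTP API directly may want to normalize them first. The following is a sketch under assumptions: `normalizeApiError` and `NormalizedApiError` are hypothetical names, and only the field names (`detail`, `error.code`, `error.message`, `error.quota_exceeded`, `error.upgrade_url`) are taken from the examples above.

```typescript
interface NormalizedApiError {
  code: string | null;
  message: string;
  quotaExceeded: boolean;
  upgradeUrl: string | null;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

// Accepts either documented shape: { detail } or { error: { code, message, ... } }.
function normalizeApiError(body: unknown): NormalizedApiError {
  if (isRecord(body)) {
    const structured = body.error;
    if (isRecord(structured)) {
      const code = structured.code;
      const message = structured.message;
      const upgradeUrl = structured.upgrade_url;
      return {
        code: typeof code === 'string' ? code : null,
        message: typeof message === 'string' ? message : 'Unknown error',
        quotaExceeded: structured.quota_exceeded === true,
        upgradeUrl: typeof upgradeUrl === 'string' ? upgradeUrl : null,
      };
    }
    const detail = body.detail;
    if (typeof detail === 'string') {
      return { code: null, message: detail, quotaExceeded: false, upgradeUrl: null };
    }
  }
  return { code: null, message: 'Unknown error', quotaExceeded: false, upgradeUrl: null };
}
```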

## Error Codes

| HTTP Status | Meaning |
|-------------|---------|
| `400` | Bad Request -- Invalid parameters or request body |
| `401` | Unauthorized -- Missing or invalid deployment credentials |
| `402` | Payment Required -- Insufficient credits |
| `403` | Forbidden -- Feature not enabled for your tier or model unavailable |
| `404` | Not Found -- Resource does not exist |
| `429` | Too Many Requests -- Rate limit or quota exceeded |
| `500` | Internal Server Error -- Something went wrong on our end |
| `502` | Bad Gateway -- Upstream provider error (OpenAI, Stripe, etc.) |

## Rate Limiting

Rate limits vary by subscription tier:

| Tier | Requests per Minute |
|------|-------------------|
| Sandbox | 100 |
| Trial | 500 |
| Launch | 2,000 |
| Growth | 10,000 |
| Enterprise | Unlimited |

When rate limited, the API returns a `429` status code. Check the `Retry-After` header for how long to wait before retrying.
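
A retry loop needs to turn that header into a delay. The sketch below assumes `Retry-After` follows the standard HTTP forms (a number of seconds or an HTTP date) and falls back to capped exponential backoff when the header is absent or unparseable. `retryDelayMs`, the 500 ms base, and the 30 s cap are all illustrative choices, not values defined by the platform.

```typescript
// Converts a Retry-After header value into a delay in milliseconds.
function retryDelayMs(
  retryAfter: string | null,
  attempt: number,
  nowMs: number = Date.now(),
): number {
  if (retryAfter !== null) {
    const trimmed = retryAfter.trim();
    if (/^\d+$/.test(trimmed)) {
      return Number(trimmed) * 1000; // delta-seconds form
    }
    const dateMs = Date.parse(trimmed); // HTTP-date form
    if (!Number.isNaN(dateMs)) {
      return Math.max(0, dateMs - nowMs);
    }
  }
  const baseMs = 500; // assumption: illustrative base delay
  const capMs = 30000; // assumption: illustrative cap
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

With `fetch`, the header value is available as `response.headers.get('Retry-After')`, which returns `null` when the header is missing.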

## Pagination

List endpoints support pagination:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `page` | integer | 1 | Page number (1-indexed) |
| `per_page` | integer | 50 | Items per page (max 100) |
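
As a small illustration, the helper below builds the query string for a list request using the documented defaults and maximum. `paginationQuery` is a hypothetical helper, and clamping out-of-range values on the client is a choice made here; how the server treats out-of-range values is not specified in this document.

```typescript
// Builds "page=..&per_page=.." using the documented defaults (1, 50) and max (100).
function paginationQuery(page: number = 1, perPage: number = 50): string {
  const safePage = Math.max(1, Math.floor(page));
  const safePerPage = Math.min(100, Math.max(1, Math.floor(perPage)));
  return `page=${safePage}&per_page=${safePerPage}`;
}
```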

---

# AI Completions API

`POST /api/v1/proxy/ai/completions`

Proxies a chat completion request to OpenAI, enforcing tier-based model access and quota limits.

## Authentication

| Header | Required | Description |
|--------|----------|-------------|
| `X-Deployment-ID` | Yes | Your deployment UUID |
| `X-Deployment-Secret` | Yes | Your deployment secret |

## Request Body

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `messages` | ChatMessage[] | Yes | -- | Array of chat messages |
| `model` | string | No | `gpt-4o-mini` | Model to use |
| `max_tokens` | integer | No | `null` | Maximum tokens to generate (1--128,000) |
| `temperature` | float | No | `0.7` | Sampling temperature (0--2) |
| `top_p` | float | No | `null` | Nucleus sampling threshold (0--1) |
| `stream` | boolean | No | `false` | Enable streaming |
| `stop` | string or string[] | No | `null` | Stop sequences |
| `presence_penalty` | float | No | `0` | Presence penalty (-2 to 2) |
| `frequency_penalty` | float | No | `0` | Frequency penalty (-2 to 2) |
| `user` | string | No | `null` | End-user identifier for OpenAI abuse tracking |
| `metadata` | object | No | `null` | Custom tracking metadata |

### ChatMessage

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `role` | string | Yes | One of `"system"`, `"user"`, `"assistant"`, `"function"`, `"tool"` |
| `content` | string | No | Message content |
| `name` | string | No | Name of the function/tool |
| `function_call` | object | No | Function call data |
| `tool_calls` | object[] | No | Tool call data |
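
The numeric ranges in the request body table can be checked on the client before a request is sent, which avoids a round trip for obviously invalid input. This is a sketch: `validateCompletionParams` and `CompletionParams` are hypothetical, and the server remains the authority on what it accepts.

```typescript
interface CompletionParams {
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  presence_penalty?: number;
  frequency_penalty?: number;
}

// Returns a list of problems; an empty list means every provided value is in range.
function validateCompletionParams(params: CompletionParams): string[] {
  const problems: string[] = [];

  const check = (field: keyof CompletionParams, min: number, max: number, integer = false) => {
    const value = params[field];
    if (value === undefined) return; // optional parameters may be omitted
    if (Number.isNaN(value) || value < min || value > max) {
      problems.push(`${field} must be between ${min} and ${max}`);
    } else if (integer && !Number.isInteger(value)) {
      problems.push(`${field} must be an integer`);
    }
  };

  check('max_tokens', 1, 128000, true);
  check('temperature', 0, 2);
  check('top_p', 0, 1);
  check('presence_penalty', -2, 2);
  check('frequency_penalty', -2, 2);
  return problems;
}
```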

## Response

| Field | Type | Description |
|-------|------|-------------|
| `content` | string | The generated completion text |
| `model` | string | Model that was used |
| `tokens_used` | integer | Total tokens consumed |
| `tokens_input` | integer | Input/prompt tokens |
| `tokens_output` | integer | Output/completion tokens |
| `finish_reason` | string | Why generation stopped (`stop`, `length`, etc.) |
| `metadata` | object | Echo of your custom metadata |

## Examples

### curl

```bash
curl -X POST https://api.hiveforge.dev/api/v1/proxy/ai/completions \
  -H "Content-Type: application/json" \
  -H "X-Deployment-ID: d7a8f3e1-2b4c-5d6e-8f9a-0b1c2d3e4f5a" \
  -H "X-Deployment-Secret: sk_deploy_a1b2c3d4e5f6..." \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is HiveForge?"}
    ],
    "model": "gpt-4o-mini",
    "temperature": 0.7,
    "max_tokens": 500
  }'
```

### Python

```python
import httpx

response = httpx.post(
    "https://api.hiveforge.dev/api/v1/proxy/ai/completions",
    headers={
        "Content-Type": "application/json",
        "X-Deployment-ID": "d7a8f3e1-2b4c-5d6e-8f9a-0b1c2d3e4f5a",
        "X-Deployment-Secret": "sk_deploy_a1b2c3d4e5f6...",
    },
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is HiveForge?"},
        ],
        "model": "gpt-4o-mini",
        "temperature": 0.7,
        "max_tokens": 500,
    },
)

data = response.json()
print(data["content"])
```

## Example Response

```json
{
  "content": "HiveForge is a platform that deploys and manages SaaS applications...",
  "model": "gpt-4o-mini",
  "tokens_used": 150,
  "tokens_input": 35,
  "tokens_output": 115,
  "finish_reason": "stop",
  "metadata": null
}
```

## Error Codes

| Status | Code | Description |
|--------|------|-------------|
| `401` | `INVALID_CREDENTIALS` | Missing or invalid `X-Deployment-ID` / `X-Deployment-Secret` |
| `403` | `AI_NOT_ENABLED` | AI is not enabled for your subscription tier |
| `403` | `MODEL_NOT_AVAILABLE` | The requested model is not available for your tier |
| `429` | `QUOTA_EXCEEDED` | Monthly AI quota has been exceeded |
| `502` | `OPENAI_ERROR` | OpenAI returned an error |

---

# MCP Integration

The **Model Context Protocol (MCP)** is an open standard for connecting AI agents to external tools and data sources. HiveForge provides first-class MCP integration, allowing your deployed applications to meter, gate, and monetize MCP tool usage through the platform's credit system.

## Why MCP Matters

MCP enables AI agents (such as Claude Code, custom LLM pipelines, or hosted assistants) to call external tools in a standardized way. Without a metering layer, these tool calls run untracked and ungated. HiveForge sits between the agent and the tools to provide:

- **Entitlement checking** -- gate tool access by subscription tier
- **Credit-based metering** -- charge per tool invocation based on action type
- **Usage recording** -- full audit trail of every tool call
- **Graceful degradation** -- agents receive structured denial reasons they can act on

## Two Integration Modes

| Mode | Use Case | Auth Method |
|------|----------|-------------|
| **Stdio** | Local agents (Claude Code, CLI tools) | API key (`Authorization: Bearer hf_live_...`) |
| **Service-to-Service** | Hosted MCP servers | Shared secret (`X-MCP-Service-Key` header) |

### Stdio Mode

For MCP servers running locally via stdio transport (e.g., Claude Code connecting to your tools on the user's machine). The MCP server authenticates with the customer's HiveForge API key and calls the stdio metering endpoints directly.

### Service-to-Service Mode

For hosted MCP servers that run alongside your application. The MCP server authenticates with a shared service key and has access to the full resolution, entitlement, and usage API.

## Architecture

```
                         +----------------------+
                         |     AI Agent          |
                         |  (Claude Code, etc.)  |
                         +----------+-----------+
                                    |
                                    v
                         +----------------------+
                         |     MCP Server        |
                         |  (stdio or hosted)    |
                         +----------+-----------+
                                    |
                    +---------------+---------------+
                    v               v               v
            +--------------+ +--------------+ +--------------+
            |  1. Check    | |  2. Execute  | |  3. Record   |
            |  Entitlement | |  Tool        | |  Usage       |
            +------+-------+ +--------------+ +------+-------+
                   |                                  |
                   v                                  v
            +---------------------------------------------+
            |              HiveForge API                   |
            |         https://api.hiveforge.dev            |
            +---------------------------------------------+
            |  Entitlements  |  Credits  |  Usage Tracking |
            +---------------------------------------------+
```
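
The check, execute, record sequence in the diagram can be sketched as a wrapper around any tool function. Everything named here is hypothetical (`MeteringApi`, `runMeteredTool`, `ToolOutcome`); the `EntitlementResult` fields mirror the check-entitlement response documented for stdio mode, and the two API calls are injected so the flow can be exercised without network access.

```typescript
interface EntitlementResult {
  allowed: boolean;
  credit_cost: number;
  tier: string | null;
  reason: string | null;
}

// Hypothetical adapter over the metering endpoints.
interface MeteringApi {
  checkEntitlement(toolName: string): Promise<EntitlementResult>;
  recordUsage(toolName: string): Promise<void>;
}

type ToolOutcome<T> =
  | { ok: true; result: T; creditCost: number }
  | { ok: false; reason: string };

// Check -> execute -> record. Usage is recorded only after the tool succeeds.
async function runMeteredTool<T>(
  api: MeteringApi,
  toolName: string,
  execute: () => Promise<T>,
): Promise<ToolOutcome<T>> {
  const entitlement = await api.checkEntitlement(toolName);
  if (!entitlement.allowed) {
    return { ok: false, reason: entitlement.reason ?? 'denied' };
  }
  const result = await execute(); // if this throws, recordUsage is skipped
  await api.recordUsage(toolName);
  return { ok: true, result, creditCost: entitlement.credit_cost };
}
```

Returning the denial as data rather than throwing lets the MCP server pass the structured reason back to the agent.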

## Metered Tool Families

| Family | Tools | Example |
|--------|-------|---------|
| **TaskCrush** | 6 tools | `taskcrush_create_task`, `taskcrush_chat` |
| **HornetHive** | 4 tools | `hornethive_execute_crew`, `hornethive_rag_search` |
| **Tao-Data** | 4 tools | `taodata_log_trace`, `taodata_evaluate` |
| **HiveForge Platform** | 9 tools | `hiveforge_list_templates`, `hiveforge_create_instance` |

MCP integration requires the **Trial** tier or above. Sandbox-tier deployments receive a structured `sandbox_tier` denial when attempting MCP tool calls.

---

# Stdio Mode

Stdio mode is for MCP servers running locally via stdio transport -- the standard integration pattern for Claude Code, local CLI agents, and other tools that run on the user's machine.

## Authentication

Stdio mode uses the customer's HiveForge API key for authentication. Pass it as a Bearer token in the `Authorization` header.

```
Authorization: Bearer hf_live_a1b2c3d4e5f6g7h8i9j0...
```

The API key is tied to an organization. The platform automatically resolves the organization to its active deployment and entitlement tier.

Stdio endpoints do not require the `X-MCP-Service-Key` header. They are designed for direct use by customer-facing MCP servers.
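
As a minimal sketch of the header rules above, the hypothetical helper below builds the request headers for a stdio call. It assumes keys use the `hf_live_` or `hf_test_` prefixes described in the API Keys section; whether sandbox (`hf_test_`) keys are accepted by the stdio endpoints is not confirmed here.

```python
def stdio_headers(api_key: str) -> dict:
    """Build headers for stdio-mode calls: Bearer auth only, no X-MCP-Service-Key."""
    # Assumption: valid keys start with one of the documented prefixes.
    if not api_key.startswith(("hf_live_", "hf_test_")):
        raise ValueError("expected an API key starting with hf_live_ or hf_test_")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Checking the prefix locally only catches obvious mistakes, such as passing a Supabase JWT; the platform remains the authority on whether a key is valid.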

## Endpoints

All endpoints use the base URL: `https://api.hiveforge.dev/api/v1/mcp`

### Check Entitlement

Verify whether the caller is entitled to use a specific MCP tool before executing it.

```
POST /api/v1/mcp/stdio/check-entitlement
```

**Request Body**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `tool_name` | `string` | Yes | MCP tool name (e.g., `hornethive_execute_crew`) |

**Response**

| Field | Type | Description |
|-------|------|-------------|
| `allowed` | `boolean` | Whether the tool call is permitted |
| `credit_cost` | `integer` | Credits that will be deducted |
| `tier` | `string|null` | Deployment's entitlement tier |
| `reason` | `string|null` | Denial reason if `allowed` is `false` |

**Reason codes:** `sandbox_tier`, `insufficient_credits`, `no_deployment_found`, `deployment_not_found`, `mcp_disabled`

**Example curl:**

```bash
curl -X POST https://api.hiveforge.dev/api/v1/mcp/stdio/check-entitlement \
  -H "Authorization: Bearer hf_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "tool_name": "hornethive_execute_crew"
  }'
```

**Example response (allowed):**

```json
{
  "allowed": true,
  "credit_cost": 5,
  "tier": "launch",
  "reason": null
}
```

**Example response (denied):**

```json
{
  "allowed": false,
  "credit_cost": 0,
  "tier": "sandbox",
  "reason": "sandbox_tier"
}
```
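
The response fields and reason codes above can be handled along these lines. This is a sketch, not an SDK type: `Entitlement`, `parse_entitlement`, and `denial_message` are hypothetical names, and the message wording (including the reading of each reason code) is an assumption; only the field names and the codes themselves come from this page.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed user-facing wording; only the reason code keys are documented.
REASON_MESSAGES = {
    "sandbox_tier": "MCP tools require the Trial tier or above.",
    "insufficient_credits": "Not enough credits for this tool call.",
    "no_deployment_found": "No deployment was found for this organization.",
    "deployment_not_found": "The deployment could not be found.",
    "mcp_disabled": "MCP is disabled for this deployment.",
}


@dataclass(frozen=True)
class Entitlement:
    allowed: bool
    credit_cost: int
    tier: Optional[str]
    reason: Optional[str]


def parse_entitlement(body: dict) -> Entitlement:
    """Parse a check-entitlement response body, treating missing fields as a denial."""
    return Entitlement(
        allowed=bool(body.get("allowed", False)),
        credit_cost=int(body.get("credit_cost", 0)),
        tier=body.get("tier"),
        reason=body.get("reason"),
    )


def denial_message(ent: Entitlement) -> Optional[str]:
    """Return a readable denial message, or None when the call is allowed."""
    if ent.allowed:
        return None
    return REASON_MESSAGES.get(ent.reason, f"Tool call denied ({ent.reason})")
```

Defaulting `allowed` to `False` means an unexpected body fails closed, so the tool is not executed on a malformed response.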

### Record Usage

Record a completed MCP tool invocation and deduct credits. Call this after the tool executes successfully.

```
POST /api/v1/mcp/stdio/record-usage
```

**Request Body**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `tool_name` | `string` | Yes | MCP tool name that was executed |
| `metadata` | `object|null` | No | Arbitrary metadata for the audit trail |

**Response**

| Field | Type | Description |
|-------|------|-------------|
| `success` | `boolean` | Whether credits were deducted |
| `credits_used` | `integer` | Number of credits deducted |
| `credits_remaining` | `integer|null` | Remaining credit balance |

**Example curl:**

```bash
curl -X POST https://api.hiveforge.dev/api/v1/mcp/stdio/record-usage \
  -H "Authorization: Bearer hf_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "tool_name": "hornethive_execute_crew",
    "metadata": {
      "crew_id": "content-pipeline",
      "duration_ms": 4520
    }
  }'
```

**Example response:**

```json
{
  "success": true,
  "credits_used": 5,
  "credits_remaining": 9450
}
```
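
Because `metadata` is sent as JSON, values such as timestamps or paths need converting first. The hypothetical helper below sketches one way to build the request body; stringifying non-serializable values is a design choice made here for illustration, not a platform requirement.

```python
import json
from typing import Optional


def build_usage_payload(tool_name: str, metadata: Optional[dict] = None) -> dict:
    """Build a record-usage request body with JSON-safe metadata."""
    payload = {"tool_name": tool_name}
    if metadata is not None:
        # Round-trip through json with default=str so values that are not
        # JSON-serializable are stored as their string representation.
        payload["metadata"] = json.loads(json.dumps(metadata, default=str))
    return payload
```

Omitting `metadata` entirely when none is supplied relies on the field being optional, as the request table states.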

## Full Integration Example

```python
import httpx

API_URL = "https://api.hiveforge.dev/api/v1/mcp"
API_KEY = "hf_live_your_key_here"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}


async def metered_tool_call(tool_name: str, execute_fn, **kwargs):
    """Execute an MCP tool with entitlement checking and usage recording."""
    async with httpx.AsyncClient() as client:
        # Step 1: Check entitlement
        check = await client.post(
            f"{API_URL}/stdio/check-entitlement",
            headers=HEADERS,
            json={"tool_name": tool_name},
        )
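        check_data = check.json()

        # Fail closed: treat an error body without "allowed" as a denial
        if not check_data.get("allowed"):
            return {
                "error": f"Tool denied: {check_data.get('reason')}",
                "tier": check_data.get("tier"),
            }

        # Step 2: Execute the tool
        result = await execute_fn(**kwargs)

        # Step 3: Record usage (kwargs must be JSON-serializable to be sent as metadata)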
        await client.post(
            f"{API_URL}/stdio/record-usage",
            headers=HEADERS,
            json={"tool_name": tool_name, "metadata": {"args": kwargs}},
        )

        return result
```

Always check entitlement before executing the tool. If you record usage without checking first, the credit deduction may fail for depleted accounts, but the tool will have already run.

---

# Tier Comparison

HiveForge offers five subscription tiers. Each tier unlocks additional features, higher quotas, and more capabilities for your deployed application.

## Feature Matrix

| Feature | Sandbox | Trial | Launch | Growth | Enterprise |
|---------|---------|-------|--------|--------|------------|
| **AI Enabled** | No | Yes | Yes | Yes | Yes |
| **AI Monthly Limit (tokens)** | 0 | 1,000 | 10,000 | 50,000 | Unlimited |
| **Billing Enabled** | No | Yes | Yes | Yes | Yes |
| **Custom Domain** | No | No | Yes | Yes | Yes |
| **White Label** | No | No | No | Yes | Yes |
| **API Rate Limit** | 100/min | 500/min | 2,000/min | 10,000/min | Unlimited |
| **Support Level** | Community | Email | Email | Priority | Dedicated |
| **MCP Enabled** | No | Yes | Yes | Yes | Yes |
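
The boolean rows of the matrix can be mirrored locally as a lookup, for example to hide UI for features a tier does not include. This is a sketch under assumptions: the lowercase tier strings follow the `"sandbox"` and `"launch"` values seen in API responses, and the feature keys other than `ai_enabled` are hypothetical names. The platform's own entitlement check remains the source of truth.

```python
# Tiers in ascending order, as listed in the matrix columns.
TIERS = ["sandbox", "trial", "launch", "growth", "enterprise"]

# Lowest tier at which each feature is marked "Yes" in the matrix.
# Keys other than "ai_enabled" are assumed names for illustration.
FEATURE_MIN_TIER = {
    "ai_enabled": "trial",
    "billing_enabled": "trial",
    "mcp_enabled": "trial",
    "custom_domain": "launch",
    "white_label": "growth",
}

# API rate limits in requests per minute; None stands for "Unlimited".
API_RATE_LIMIT_PER_MIN = {
    "sandbox": 100,
    "trial": 500,
    "launch": 2000,
    "growth": 10000,
    "enterprise": None,
}


def tier_has_feature(tier: str, feature: str) -> bool:
    """Return True if the matrix marks the feature as available on the tier."""
    return TIERS.index(tier) >= TIERS.index(FEATURE_MIN_TIER[feature])
```

Encoding each feature as a minimum tier works because every boolean row in the matrix switches from No to Yes exactly once as tiers increase.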

## Tier Details

### Sandbox

The free tier for exploration and development. No AI, billing, or MCP access.

- No time limit -- stays on Sandbox indefinitely
- Suitable for prototyping UI and basic app structure
- API rate limit of 100 requests/minute
- Community support only (forums, documentation)

Sandbox is the default tier for new deployments. Upgrade to Trial to unlock AI and MCP features.

### Trial

A time-limited evaluation tier with access to AI, billing, and MCP features at reduced limits. Custom domains and white-label branding are not included.

- Full AI proxy access (1,000 tokens/month)
- Billing proxy for Stripe integration testing
- MCP tool metering enabled
- API rate limit of 500 requests/minute
- Email support

Trial deployments revert to Sandbox when the trial period expires unless upgraded to a paid tier.

### Launch

The first paid tier, designed for early-stage products going to production.

- AI proxy with 10,000 tokens/month
- Custom domain support
- MCP tool metering enabled
- API rate limit of 2,000 requests/minute
- Email support

Best for: Solo founders and small teams launching their first SaaS product.

### Growth

For scaling products that need higher limits and brand customization.

- AI proxy with 50,000 tokens/month
- Custom domain support
- White-label branding (remove HiveForge branding)
- MCP tool metering enabled
- API rate limit of 10,000 requests/minute
- Priority support with faster response times

Best for: Growing startups with paying customers and increasing usage.

### Enterprise

Unlimited usage with dedicated support and custom terms.

- Unlimited AI tokens
- Custom domain and white-label support
- MCP tool metering enabled
- No API rate limit
- Dedicated support with SLA
- Custom feature flags and quota overrides available

Best for: Established companies with high-volume usage or compliance requirements.

Enterprise tier includes custom onboarding. Contact sales at sales@hiveforge.dev for pricing and setup.

## Upgrading and Downgrading

Tier changes take effect within 5 minutes due to entitlement caching. When upgrading:

- New feature flags become available as soon as the cached entitlements expire (up to 5 minutes)
- Quotas are adjusted to the new tier's limits
- Monthly credit allocation changes at the next billing period

When downgrading:

- Features above the new tier are disabled after cache expiry
- Quota limits are reduced (usage may already exceed the new limit)
- Purchased credits are retained regardless of tier
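
To illustrate why a tier change can lag by up to 5 minutes, here is a minimal time-to-live cache sketch. It is not the platform's implementation: `EntitlementCache` is a hypothetical class, and the 300-second default is taken from the 5-minute figure above.

```python
import time
from typing import Any, Callable, Optional


class EntitlementCache:
    """Cache one fetched value and refresh it once the TTL has elapsed."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 300.0,  # assumed from the 5-minute window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Refresh on first use or once the cached value has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

Until the entry expires, `get()` keeps returning the old tier even if the underlying value has changed, which matches the delay described above.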

## Programmatic Tier Check

Use the SDK or API to check the current tier:

```typescript
import { HiveForgeClient } from "@producthacker/hiveforge-sdk";

const client = new HiveForgeClient();
const entitlements = await client.entitlements.check();

console.log(entitlements.tier);        // "launch"
console.log(entitlements.features);    // { ai_enabled: true, ... }
```