TypeScript SDK
AI Proxy

The AI proxy lets your deployed application make OpenAI-compatible requests through HiveForge without embedding API keys in your app. Quota enforcement, model access, and usage tracking are handled automatically based on your deployment's tier.

Access the AI proxy via hiveforge.ai.

import { HiveForgeClient } from '@producthacker/hiveforge-sdk';
 
const hiveforge = new HiveForgeClient({
  deploymentId: process.env.HIVEFORGE_DEPLOYMENT_ID!,
  deploymentSecret: process.env.HIVEFORGE_DEPLOYMENT_SECRET!,
});
await hiveforge.initialize();
 
// Check if AI is available for this tier
if (hiveforge.ai.isEnabled()) {
  const response = await hiveforge.ai.complete({
    messages: [{ role: 'user', content: 'Hello!' }],
  });
  console.log(response.content);
}

Methods


complete(options)

Create a chat completion. Sends messages to the AI model and returns the full response.

const response = await hiveforge.ai.complete({
  messages: [
    { role: 'system', content: 'You are a helpful customer support agent.' },
    { role: 'user', content: 'How do I reset my password?' },
  ],
  model: 'gpt-4o-mini',
  temperature: 0.7,
  max_tokens: 500,
});
 
console.log(response.content);
console.log(`Tokens used: ${response.tokens_used}`);
console.log(`Model: ${response.model}`);

Parameters (AICompletionOptions):

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| messages | ChatMessage[] | Yes | -- | Array of conversation messages |
| model | string | No | 'gpt-4o-mini' | Model to use |
| max_tokens | number | No | -- | Maximum tokens in the response |
| temperature | number | No | 0.7 | Sampling temperature (0-2) |
| top_p | number | No | -- | Nucleus sampling parameter |
| stream | boolean | No | false | Whether to stream (use the stream() method instead) |
| stop | string \| string[] | No | -- | Stop sequences |
| presence_penalty | number | No | 0 | Presence penalty (-2 to 2) |
| frequency_penalty | number | No | 0 | Frequency penalty (-2 to 2) |
| user | string | No | -- | End-user identifier for abuse tracking |
| metadata | Record<string, unknown> | No | -- | Custom metadata for logging |

ChatMessage type:

| Field | Type | Required | Description |
|---|---|---|---|
| role | 'system' \| 'user' \| 'assistant' \| 'function' \| 'tool' | Yes | Message role |
| content | string \| null | Yes | Message content |
| name | string | No | Name of the function/tool |

Returns (AICompletionResponse):

| Field | Type | Description |
|---|---|---|
| content | string | The generated text response |
| model | string | Model that was used |
| tokens_used | number | Total tokens consumed |
| tokens_input | number | Input/prompt tokens |
| tokens_output | number | Output/completion tokens |
| finish_reason | string \| null | Why generation stopped ('stop', 'length', etc.) |
| metadata | Record<string, unknown> | Optional metadata |

Throws: AIProxyException if quota is exceeded or AI is not enabled.

Equivalent curl request
curl -X POST https://api.hiveforge.dev/api/v1/proxy/ai/completions \
  -H "Content-Type: application/json" \
  -H "X-Deployment-ID: d9f2a1b4-7c3e-4f8a-b5d6-1e2f3a4b5c6d" \
  -H "X-Deployment-Secret: sk_live_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}],
    "model": "gpt-4o-mini",
    "temperature": 0.7
  }'

stream(options)

Create a streaming chat completion. Returns an AsyncGenerator that yields content chunks as they arrive.

let fullResponse = '';
 
for await (const chunk of hiveforge.ai.stream({
  messages: [
    { role: 'user', content: 'Write a short poem about TypeScript.' },
  ],
  model: 'gpt-4o-mini',
})) {
  process.stdout.write(chunk.content);
  fullResponse += chunk.content;
 
  if (chunk.done) {
    console.log('\n--- Stream complete ---');
  }
}

With callbacks:

for await (const chunk of hiveforge.ai.stream({
  messages: [{ role: 'user', content: 'Tell me a story' }],
  onChunk: (chunk) => {
    // Called for each chunk
    process.stdout.write(chunk.content);
  },
  onComplete: (fullContent) => {
    // Called when streaming finishes
    console.log(`\nTotal length: ${fullContent.length} chars`);
  },
  onError: (error) => {
    console.error('Stream error:', error.message);
  },
})) {
  // You can also process chunks here
}

Parameters (AIStreamOptions):

Same as AICompletionOptions (minus stream), plus:

| Parameter | Type | Required | Description |
|---|---|---|---|
| onChunk | (chunk: AIStreamChunk) => void | No | Callback for each chunk |
| onComplete | (fullContent: string) => void | No | Callback when streaming completes |
| onError | (error: Error) => void | No | Callback for stream errors |

Yields (AIStreamChunk):

| Field | Type | Description |
|---|---|---|
| content | string | Text content of this chunk |
| done | boolean | Whether this is the final chunk |

Returns: AsyncGenerator<AIStreamChunk, string, unknown> -- the return value is the full concatenated content.


streamToString(options)

Convenience method that consumes the entire stream and returns the full response as a string.

const fullResponse = await hiveforge.ai.streamToString({
  messages: [{ role: 'user', content: 'Summarize this document...' }],
  onChunk: (chunk) => process.stdout.write(chunk.content),
});
 
console.log('Final response:', fullResponse);

Parameters: Same as stream().

Returns: Promise<string>


embed(options)

Generate vector embeddings for text. Useful for semantic search, clustering, and similarity comparisons.

const result = await hiveforge.ai.embed({
  text: 'How do I reset my password?',
  model: 'text-embedding-3-small',
});
 
console.log(`Dimensions: ${result.dimensions}`);
console.log(`Tokens used: ${result.tokens_used}`);
console.log(`Embedding length: ${result.embeddings[0].length}`);

Multiple texts:

const result = await hiveforge.ai.embed({
  text: [
    'How do I reset my password?',
    'Where can I update my billing info?',
    'How to enable two-factor authentication',
  ],
});
 
// result.embeddings is an array of embedding arrays
for (const embedding of result.embeddings) {
  console.log(`Vector with ${embedding.length} dimensions`);
}

Parameters (AIEmbeddingOptions):

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string \| string[] | Yes | -- | Text(s) to embed |
| model | string | No | 'text-embedding-3-small' | Embedding model |
| metadata | Record<string, unknown> | No | -- | Custom metadata |

Returns (AIEmbeddingResponse):

| Field | Type | Description |
|---|---|---|
| embeddings | number[][] | Array of embedding vectors |
| model | string | Model that was used |
| tokens_used | number | Tokens consumed |
| dimensions | number | Dimensionality of each embedding |

Equivalent curl request
curl -X POST https://api.hiveforge.dev/api/v1/proxy/ai/embeddings \
  -H "Content-Type: application/json" \
  -H "X-Deployment-ID: d9f2a1b4-7c3e-4f8a-b5d6-1e2f3a4b5c6d" \
  -H "X-Deployment-Secret: sk_live_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6" \
  -d '{
    "text": "How do I reset my password?",
    "model": "text-embedding-3-small"
  }'
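Embedding vectors are typically compared with cosine similarity for semantic search. The helpers below are plain implementations, not part of the SDK; you would apply them to pairs from result.embeddings (at larger scale, a vector database does this for you):

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a|·|b|).
// Returns a value in [-1, 1]; higher means more semantically similar.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank candidate embeddings against a query embedding, most similar first;
// returns the candidate indices in ranked order.
function rankBySimilarity(query: number[], candidates: number[][]): number[] {
  return candidates
    .map((vec, index) => ({ index, score: cosineSimilarity(query, vec) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.index);
}
```

For example, embed a user query and the candidate texts in one embed() call, then rank candidates against the query vector.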

getQuota()

Get the current AI usage quota for your deployment.

const quota = await hiveforge.ai.getQuota();
 
console.log(`Used: ${quota.used} tokens`);
console.log(`Limit: ${quota.limit ?? 'unlimited'}`);
console.log(`Remaining: ${quota.remaining ?? 'unlimited'}`);
console.log(`Resets at: ${quota.resets_at}`);
console.log(`Tier: ${quota.tier}`);

Returns (AIQuotaResponse):

| Field | Type | Description |
|---|---|---|
| used | number | Tokens used in current period |
| limit | number \| null | Token limit (null = unlimited) |
| remaining | number \| null | Tokens remaining (null = unlimited) |
| resets_at | string \| null | ISO timestamp when quota resets |
| tier | string | Current deployment tier |
| model_limits | Record<string, boolean> \| null | Which models are accessible |
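Before a large batch job, you may want a rough pre-flight check against quota.remaining. The sketch below uses the common ~4-characters-per-token heuristic; that ratio is an assumption, not an SDK guarantee, so keep a safety margin:

```typescript
// Rough pre-flight check: estimate token usage for a batch of texts and
// compare it to the remaining quota (null means unlimited).
// The 4-chars-per-token ratio is a heuristic, not an exact tokenizer.
function fitsRemainingQuota(texts: string[], remaining: number | null): boolean {
  if (remaining === null) return true; // unlimited
  const estimatedTokens = texts.reduce(
    (sum, text) => sum + Math.ceil(text.length / 4),
    0,
  );
  return estimatedTokens <= remaining;
}
```

You would feed it quota.remaining from getQuota() (or the cached value from getRemainingQuota()) and skip or defer the batch when it returns false.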

getModels()

List available models for your deployment's tier.

const models = await hiveforge.ai.getModels();
 
console.log(`Tier: ${models.tier}`);
console.log('Available:', models.available_models);
console.log('All models:', models.all_models);

Returns:

| Field | Type | Description |
|---|---|---|
| tier | string | Current tier |
| available_models | string[] | Models accessible at your tier |
| all_models | string[] | All models across all tiers |
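Because available models vary by tier, a small fallback helper can pick the best model the deployment is entitled to. This is an illustrative pattern, not an SDK method, and the model names are examples:

```typescript
// Return the first preferred model the current tier allows, or undefined
// if none of the preferences are available.
function pickModel(preferred: string[], available: string[]): string | undefined {
  return preferred.find((model) => available.includes(model));
}

// Usage sketch against getModels():
// const { available_models } = await hiveforge.ai.getModels();
// const model = pickModel(['gpt-4o', 'gpt-4o-mini'], available_models) ?? 'gpt-4o-mini';
```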

isEnabled()

Check if AI is enabled for the current tier without making an API call (reads from cached entitlements).

if (hiveforge.ai.isEnabled()) {
  // Safe to make AI calls
}

Returns: boolean


getRemainingQuota()

Get remaining AI token quota from cached entitlements (no API call).

const remaining = hiveforge.ai.getRemainingQuota();
if (remaining !== null && remaining < 1000) {
  console.warn('Running low on AI tokens');
}

Returns: number | null -- null if unlimited or entitlements not loaded.

Full Example: Chat Interface

import { HiveForgeClient, AIProxyException } from '@producthacker/hiveforge-sdk';
 
const hiveforge = new HiveForgeClient({
  deploymentId: process.env.HIVEFORGE_DEPLOYMENT_ID!,
  deploymentSecret: process.env.HIVEFORGE_DEPLOYMENT_SECRET!,
});
await hiveforge.initialize();
 
async function chat(userMessage: string, history: Array<{ role: string; content: string }>) {
  if (!hiveforge.ai.isEnabled()) {
    throw new Error('AI features are not available on your current plan.');
  }
 
  const remaining = hiveforge.ai.getRemainingQuota();
  if (remaining !== null && remaining < 100) {
    throw new Error('AI token quota nearly exhausted. Please upgrade your plan.');
  }
 
  try {
    const messages = [
      { role: 'system' as const, content: 'You are a helpful assistant.' },
      ...history.map(m => ({ role: m.role as 'user' | 'assistant', content: m.content })),
      { role: 'user' as const, content: userMessage },
    ];
 
    let response = '';
    for await (const chunk of hiveforge.ai.stream({ messages })) {
      response += chunk.content;
      // Update UI with chunk.content
    }
    return response;
  } catch (error) {
    if (error instanceof AIProxyException && error.isQuotaExceeded) {
      // In the browser, redirect to the upgrade page; in server code,
      // surface error.upgradeUrl to the client instead of touching window.
      window.location.href = error.upgradeUrl ?? '/pricing';
    }
    throw error;
  }
}