Developer Documentation

Agentics API v1

A unified, OpenAI-compatible interface to 18 language models, 27 image models, 71 voices, embeddings, agentic tool execution, the Neural Resonance Network, and a real-time voice Orb. One API key, every modality.

Quickstart #quickstart

The API is OpenAI-compatible, so existing OpenAI SDKs work with minimal changes. The steps below get you a working response in under two minutes.

Base URL: https://api.agentics.co.za/v1 · all responses are JSON unless explicitly streamed via SSE or WebSocket.

Installation

bash
npm install agentics-sdk

Basic Usage

js
import Agentics from 'agentics-sdk';

const client = new Agentics({
  apiKey: process.env.AGENTICS_API_KEY
});

const response = await client.inference.chat({
  model: 'Sigma',
  messages: [
    { role: 'user', content: 'Hello, how are you?' }
  ]
});

console.log(response.choices[0].message.content);

Quick Examples

js
// Text-to-Speech
const audio = await client.audio.textToSpeech({
  text: 'Hello world!',
  voice: 'female2',
  outputBase64: true
});

// Speech-to-Text (base64Audio holds your base64-encoded audio)
const transcript = await client.audio.speechToText({
  audioData: base64Audio
});

// Image Generation
const image = await client.images.generate({
  prompt: 'A sunset over mountains',
  model: 'flux'
});

// Web Search
const results = await client.webgrep.search('latest AI news');

// Embeddings
const embedding = await client.embeddings.generateFromText('Hello world');

// Agent Task
const task = await client.agents.runTask('Search for AI news');

SDK Modules

Module | Description
client.inference | Chat completions and text generation
client.audio | Text-to-speech and speech-to-text
client.live | Live WebSocket transcription
client.images | Image generation
client.embeddings | Vector embeddings
client.agents | Agent tasks with tools
client.conversations | Conversation management
client.memory | Key-value memory storage
client.webgrep | Web search and scraping
client.mcp | MCP server management

Using with OpenAI SDK

js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.AGENTICS_API_KEY,
  baseURL: 'https://api.agentics.co.za/v1'
});

const response = await client.chat.completions.create({
  model: 'Sigma',
  messages: [{ role: 'user', content: 'Hello!' }]
});

Authentication #authentication

All API requests require authentication using an API key passed in the Authorization header. Tokens come in three flavours.

Token type | Prefix | Purpose
Live key | ak_live_ | Server side, full access to your account scopes
Test key | ak_test_ | Sandboxed, billed against the test ledger
Page token | pt_ | Browser scoped, ephemeral, signed by Shield

Never ship a live key to the browser. Use Shield for any client-side execution: it provisions short-lived page tokens automatically.
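
One way to enforce this rule is a small guard that inspects the key prefix before a client is constructed. The sketch below is illustrative only: `classifyKey` and `assertSafeForBrowser` are hypothetical helper names, not part of the Agentics SDK. The only facts taken from the documentation are the three prefixes.

```javascript
// Hypothetical helpers (not part of agentics-sdk): classify a credential by
// the prefixes listed in the token table.
const TOKEN_PREFIXES = [
  { prefix: 'ak_live_', type: 'live' },
  { prefix: 'ak_test_', type: 'test' },
  { prefix: 'pt_', type: 'page' },
];

function classifyKey(key) {
  if (typeof key !== 'string') return 'unknown';
  const match = TOKEN_PREFIXES.find(({ prefix }) => key.startsWith(prefix));
  return match ? match.type : 'unknown';
}

// Throw if anything other than a page token is about to be used client-side.
function assertSafeForBrowser(key) {
  const type = classifyKey(key);
  if (type !== 'page') {
    throw new Error(`Refusing to use a ${type} credential in the browser`);
  }
  return key;
}
```

Calling `assertSafeForBrowser` at the point where browser code builds its client turns an accidental live-key leak into an immediate, visible failure.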

API Key Authentication

bash
curl https://api.agentics.co.za/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "Sigma", "messages": [{"role": "user", "content": "Hello!"}]}'

JWT Authentication

For user sessions, JWTs are issued at login and expire after 7 days. Refresh tokens last 30 days.

Hardware Authentication

Device-bound authentication is available for enhanced security. Contact support to enable hardware ID binding.

Rate Limits #rate-limits

Soft limits scale with subscription tier. Hard limits protect upstream infrastructure. Both reset every 60 seconds. Track your quota through the X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset response headers.

Tier | Requests/min | Tokens/day
Free | 20 | 10,000
Pro | 100 | 500,000
Enterprise | Unlimited | Unlimited
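
A client can use the rate limit headers to pause instead of running into 429 responses. The sketch below rests on an assumption: the documentation does not define the format of X-RateLimit-Reset, so the code treats it as the number of seconds until the window resets and caps the wait at the documented 60-second window. `msUntilNextRequest` is a hypothetical helper name.

```javascript
// Sketch: decide how long to pause before the next request.
// ASSUMPTION: X-RateLimit-Reset is the number of seconds until the window
// resets. Its format is not documented, so verify before relying on this.
const WINDOW_MS = 60000; // limits reset every 60 seconds per the docs

function readNumber(headers, name) {
  const raw = headers.get(name);
  if (raw == null || String(raw).trim() === '') return NaN;
  return Number(raw);
}

function msUntilNextRequest(headers) {
  const remaining = readNumber(headers, 'X-RateLimit-Remaining');
  // Header missing or quota left: no need to wait.
  if (Number.isNaN(remaining) || remaining > 0) return 0;
  const resetSeconds = readNumber(headers, 'X-RateLimit-Reset');
  // Unknown reset time: fall back to a full window.
  if (!Number.isFinite(resetSeconds) || resetSeconds < 0) return WINDOW_MS;
  return Math.min(resetSeconds * 1000, WINDOW_MS);
}
```

The function accepts any object with a `get(name)` method, so a fetch `response.headers` object can be passed in directly.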

Base URL #base-url

All API requests should be made to:

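url
https://api.agentics.co.za

The API follows REST conventions and returns JSON responses. Endpoint paths in this document already include the /v1 prefix; SDKs that take a versioned base URL, such as the OpenAI SDK, should use https://api.agentics.co.za/v1.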

Chat Completions #chat

POST /v1/chat/completions

The primary inference endpoint. Generate text responses using our AI models. Compatible with OpenAI's chat completions API, with one extra field: tools_async for fire-and-forget background tool calls.

Request Body

json
{
  "model": "Sigma",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "maxTokens": 1024,
  "temperature": 0.7
}

Parameters

Field | Type | Notes
model | string | ALM, Sigma, Syntactic, Polaris, Nitro, Magma, Orion, AgentX, Neo, Omega, Delta, Amiga.
messages | array | Standard OpenAI shape with optional name and tool_call_id fields.
stream | boolean | Defaults to false. Set true for SSE deltas.
tools | array | OpenAI function calling schema. Blocking, returns results in stream.
tools_async | array | Fire and forget tools. Model continues responding while the action runs.
temperature | number | 0 to 2. Default 0.7.
maxTokens | integer | Maximum tokens to generate
top_p | float | Nucleus sampling parameter
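
To show how tools and tools_async can sit side by side, here is a sketch of a request body that declares one blocking tool and one fire-and-forget tool. The tools entry follows the OpenAI function calling schema named in the table. That tools_async entries use the same shape is an assumption, and the tool names `lookup_order` and `send_receipt_email` are made up for illustration.

```javascript
// Sketch of a request body mixing blocking and fire-and-forget tools.
// ASSUMPTION: tools_async entries use the same OpenAI function schema as tools.
function functionTool(name, description, properties, required = []) {
  return {
    type: 'function',
    function: {
      name,
      description,
      parameters: { type: 'object', properties, required },
    },
  };
}

function buildChatRequest(userText) {
  return {
    model: 'Sigma',
    messages: [{ role: 'user', content: userText }],
    stream: true,
    // Blocking: the model waits for the result before it continues.
    tools: [
      functionTool('lookup_order', 'Fetch the status of an order',
        { order_id: { type: 'string' } }, ['order_id']),
    ],
    // Fire and forget: the model keeps responding while this runs.
    tools_async: [
      functionTool('send_receipt_email', 'Email a receipt to the customer',
        { order_id: { type: 'string' }, email: { type: 'string' } },
        ['order_id', 'email']),
    ],
  };
}

const body = buildChatRequest('Where is order 1042? Email me the receipt.');
console.log(JSON.stringify(body, null, 2));
```

The split mirrors the guidance in this document: lookups the model must reason with go in tools, while side effects such as email go in tools_async.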

Streaming Response

Streaming is enabled on every text and audio endpoint, including the Realtime Orb WebSocket. Image generation streams progress frames every 200ms when stream: true is set.

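js
const response = await fetch('https://api.agentics.co.za/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'Sigma',
    messages: [{ role: 'user', content: 'Write a poem' }],
    stream: true
  })
});

// Fail early on HTTP errors instead of trying to read an error body as a stream.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

// Read the SSE stream chunk by chunk and print the raw event text.
const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact across chunk boundaries.
  process.stdout.write(decoder.decode(value, { stream: true }));
}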
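
The loop above prints the raw SSE text. To extract only the generated text, the stream has to be split into events and each payload parsed. The sketch below assumes the events follow the OpenAI streaming shape (`data:` lines carrying JSON with `choices[0].delta.content`, terminated by `data: [DONE]`). This document states only that the endpoint is OpenAI-compatible and emits SSE deltas, so treat the field names as assumptions. `createSseParser` is a hypothetical helper.

```javascript
// Incremental SSE parser: feed it decoded text chunks, get back content deltas.
// ASSUMPTION: events follow the OpenAI streaming shape, for example
//   data: {"choices":[{"delta":{"content":"Hi"}}]}
// followed by a final
//   data: [DONE]
function createSseParser() {
  let buffer = '';
  let finished = false;
  return {
    // Returns the text deltas completed by this chunk.
    push(chunk) {
      const deltas = [];
      buffer += chunk;
      const lines = buffer.split(/\r?\n/);
      buffer = lines.pop(); // keep the trailing partial line for the next chunk
      for (const line of lines) {
        if (finished || !line.startsWith('data:')) continue;
        const payload = line.slice(5).trim();
        if (payload === '') continue;
        if (payload === '[DONE]') {
          finished = true;
          continue;
        }
        const content = JSON.parse(payload).choices?.[0]?.delta?.content;
        if (typeof content === 'string') deltas.push(content);
      }
      return deltas;
    },
    get done() {
      return finished;
    },
  };
}
```

Inside the read loop, pass each decoded chunk to `parser.push(...)` and write the returned deltas. Buffering the trailing partial line matters because a network chunk can end in the middle of an event.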

Models #models

List available AI models and their capabilities.

List Models

http
GET /v1/models

Get Model Info

http
GET /v1/models/:model

Available Models

Model | Description | Multimodal
ALM | Flagship reasoning with vision, tool calling, multimodal (default) | Yes
Sigma | Frontier multimodal model | Yes
Orion | Frontier reasoning and analysis | No
Syntactic | Precision code generation | No
Nitro | Deep reasoning and analysis | No
Polaris | Versatile general-purpose model | No
Magma | Premium-tier creative generation | Yes
AgentX | Specialist agentic execution model | Yes
Neo | Long-context premium model | Yes
Omega | Compact general purpose | No
Delta | Conversational and fast | No
Amiga | Friendly companion model | No

Image Generation #images

POST /v1/images/generations

27 image models behind one endpoint. Flux, SDXL, AgenPics, Arta, Juggernaut, Dreamshaper and more. Set model in the request body to pick one.

Request Body

json
{
  "prompt": "A futuristic cityscape at sunset",
  "model": "flux",
  "size": "1024x1024",
  "n": 1
}

Parameters

Parameter | Type | Description
prompt | string | Text description of the image
model | string | flux, sdxl, arta, agenpics, juggernaut, dreamshaper, etc.
size | string | Image dimensions (1024x1024, 512x512, etc.)
n | integer | Number of images to generate

Audio Processing #audio

Text-to-speech and speech-to-text capabilities powered by local ONNX models. 71 voices, streaming PCM at 44.1 kHz, sub-200ms first audio.

Text-to-Speech

POST /v1/audio/tts

json
{
  "text": "Hello, welcome to Agentics!",
  "voice": "female2",
  "speed": 1.2,
  "quality": 20,
  "outputBase64": true
}

Parameters

Parameter | Type | Description
text | string | Text to convert to speech (required)
voice | string | Voice ID (default: female2 / justine)
speed | float | Speech speed multiplier (default: 1.2)
quality | integer | Audio quality (default: 20)
outputBase64 | boolean | Return base64 audio in JSON response

Available Voices

Four signature voices: Justine (default, warm), Emma (clinical, fast), Lucas (deep, calm), Felix (playful, conversational). Plus 67 community voices behind voice=community/<handle>.

http
GET /v1/audio/voices

Voice ID | Aliases | Gender
male1 | m1, lucas | Male
male2 | m2, felix | Male
female1 | f1, emma | Female
female2 | f2, justine | Female (default)

Speech-to-Text

POST /v1/audio/stt
POST /v1/audio/transcriptions

JSON Request

json
{
  "audioData": "base64_encoded_audio",
  "language": "en"
}

Multipart Form Request

bash
curl https://api.agentics.co.za/v1/audio/stt \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file="@audio.wav" \
  -F language="en"

Live Transcription (WebSocket)

WS /v1/audio/live/ws

Stream audio in real-time for continuous transcription with VAD (Voice Activity Detection) and EOT (End-of-Turn) detection.

Query Parameters

Parameter | Default | Description
language | auto | Language code (en, es, fr, etc.)
vad_threshold | 0.5 | VAD sensitivity (0-1)
min_silence | 500 | Min silence timeout in ms
max_silence | 6000 | Max silence timeout in ms
eot | true | Enable End-of-Turn detection
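
The query parameters above can be assembled with the standard URL API. In this sketch the defaults come from the table, while the wss:// scheme on the API host is an assumption (the documentation gives only the path). `buildLiveUrl` is a hypothetical helper, and how the socket is authenticated is not covered here.

```javascript
// Build the live transcription WebSocket URL from the documented parameters.
// ASSUMPTION: the socket is served from the API host over wss://.
const LIVE_DEFAULTS = {
  language: 'auto',
  vad_threshold: 0.5,
  min_silence: 500,
  max_silence: 6000,
  eot: true,
};

function buildLiveUrl(overrides = {}) {
  const params = { ...LIVE_DEFAULTS, ...overrides };
  if (params.vad_threshold < 0 || params.vad_threshold > 1) {
    throw new RangeError('vad_threshold must be between 0 and 1');
  }
  if (params.min_silence > params.max_silence) {
    throw new RangeError('min_silence must not exceed max_silence');
  }
  const url = new URL('wss://api.agentics.co.za/v1/audio/live/ws');
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

console.log(buildLiveUrl({ language: 'en', vad_threshold: 0.6 }));
```

Validating the ranges locally gives a clearer error than a rejected handshake would.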

Embeddings #embeddings

POST /v1/embeddings

384-dimensional GTE-Small embeddings. Up to 1,024 inputs per request. The same backend powers NRN. Returns the standard OpenAI shape, with optional cosine similarity scoring against a stored corpus via compare_to.
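
Cosine similarity is what compare_to computes on the server. For reference, the same calculation can be sketched client-side. The example uses toy 3-dimensional vectors in place of real 384-dimensional embeddings, and `cosineSimilarity` and `rankBySimilarity` are illustrative names, not SDK functions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Rank stored vectors against a query, best match first.
function rankBySimilarity(query, corpus, limit = 5) {
  return corpus
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const corpus = [
  { id: 'auth', vector: [1, 0, 0] },
  { id: 'billing', vector: [0, 1, 0] },
  { id: 'auth-faq', vector: [0.9, 0.1, 0] },
];
console.log(rankBySimilarity([1, 0, 0], corpus, 2));
```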

Generate Embedding

http
POST /v1/embeddings/generate

json
{
  "text": "The quick brown fox"
}

Add to Group

http
POST /v1/embeddings/add

json
{
  "group": "documentation",
  "content": "How to use the API",
  "metadata": "optional metadata string"
}

Search Embeddings

http
POST /v1/embeddings/search

json
{
  "query": "How do I use the API?",
  "group": "documentation",
  "limit": 5
}

List Groups

http
GET /v1/embeddings/groups

Delete Group

http
DELETE /v1/embeddings/groups/:name

Agent Tools #agent

Execute agentic tools for advanced AI capabilities including code execution, web search, and more. Agentics is the first inference API with first-class async tools.

Synchronous tools

The model pauses, calls your function, waits for the JSON response, and continues. Best for lookups and database queries the model must reason with.

Asynchronous tools

The model fires the call into a queue and continues responding. Perfect for sending emails, queuing jobs, writing audit logs.

Available Tools

Tool | Description
bash | Execute shell commands (sandboxed)
python | Run Python code (sandboxed)
web_search | Search the web via WebGrep
fetch_url | Fetch and parse web content
generate_image | Generate images from prompts
str_replace_editor | Edit files with search/replace
memory | Persistent key-value storage
todo | Task management

Endpoint

http
POST /v1/agent/execute

Realtime Voice (Orb) #realtime

Live voice interface with bidirectional audio streaming. Send 16 kHz PCM frames, receive PCM frames at 44.1 kHz. Includes voice activity detection, interruption handling, and a structured event channel for tool calls.

WS /v1/realtime
WS /v1/orb
WS /orb/realtime?mode=voice&model=AgentX&voice=female2

The Orb provides a voice-powered AI interaction with:

  • Real-time speech recognition
  • Natural language understanding
  • Text-to-speech responses
  • Interrupt handling
  • Tool calling over the structured event channel

Conversations #conversations

Manage persistent conversation history.

Endpoints

http
GET    /v1/conversations           # List conversations
POST   /v1/conversations           # Create conversation
GET    /v1/conversations/:id       # Get conversation
DELETE /v1/conversations/:id       # Delete conversation
POST   /v1/conversations/:id/messages  # Add message

Memory Storage #memory

Key-value storage for persistent agent memory.

Endpoints

http
GET    /v1/memory/:key    # Get value
POST   /v1/memory         # Set value
DELETE /v1/memory/:key    # Delete value

NRN - Neural Resonance Network #nrn

A biologically-inspired memory system written in C that learns, reinforces, decays, and consolidates knowledge over time. NRN is the core long-term memory engine powering Agentics AI agents, designed to solve the fundamental problem of information retention across unbounded time spans.

How It Works

NRN uses a 384-dimensional embedding space (GTE-Small ONNX, compiled directly into the binary) combined with Hebbian learning, temporal decay, and a learned neural controller. Knowledge that is accessed frequently becomes permanent; unused knowledge decays naturally. Semantic clusters form automatically, and cross-cluster associations strengthen through co-retrieval.

Endpoints

POST /v1/nrn/store

Push a memory into the Neural Resonance Network. NRN handles deduplication, association building, and decay automatically.

POST /v1/nrn/recall

Query NRN with a natural language prompt. Returns ranked, weighted memories with adaptive retrieval that improves with use.

Architecture

Component | Detail
Embedding Model | GTE-Small (384-dim) ONNX, compiled into the binary via objcopy
Storage | SQLite per-user databases with embeddings, clusters, and association tables
Neural Controller | 2-layer network (392 to 64 to 4) with ReLU + Softmax, trained via REINFORCE
Clustering | 32 semantic clusters with centroid tracking and auto-assignment
Hebbian Learning | Co-retrieval strengthens association weights; global decay prevents saturation
Lifecycle | Insert, Access tracking, Remembrance (3+ accesses), Decay (0.95/cycle), Pruning
Consolidation | Duplicate merging (>0.95 similarity), LLM bridge summaries, decay cycle
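
To make the controller's shape concrete, here is a forward pass through a 392 to 64 to 4 network with ReLU and Softmax. It is a sketch in JavaScript with random weights, not the trained controller, which lives in the C binary. The documentation does not say what the 8 inputs beyond the 384-dimensional embedding are, and picking the most probable strategy (rather than sampling) is a simplification.

```javascript
// Forward pass of a 392 -> 64 -> 4 network with ReLU and Softmax.
// ASSUMPTION: weights here are random placeholders; the real controller is
// trained via REINFORCE and its input features are not documented.
const SIZES = { input: 392, hidden: 64, output: 4 };

function randomLayer(rows, cols, scale = 0.05) {
  return {
    weights: Array.from({ length: rows }, () =>
      Float64Array.from({ length: cols }, () => (Math.random() * 2 - 1) * scale)),
    bias: new Float64Array(rows),
  };
}

// Compute weights * input + bias for one layer.
function affine(layer, input) {
  return layer.weights.map((row, r) => {
    let sum = layer.bias[r];
    for (let i = 0; i < row.length; i++) sum += row[i] * input[i];
    return sum;
  });
}

// Numerically stable softmax: subtract the max before exponentiating.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / total);
}

function selectStrategy(controller, features) {
  if (features.length !== SIZES.input) {
    throw new Error(`Expected ${SIZES.input} features, got ${features.length}`);
  }
  const hidden = affine(controller.hidden, features).map((v) => Math.max(0, v));
  const probabilities = softmax(affine(controller.output, hidden));
  const strategy = probabilities.indexOf(Math.max(...probabilities));
  return { strategy, probabilities };
}

const controller = {
  hidden: randomLayer(SIZES.hidden, SIZES.input),
  output: randomLayer(SIZES.output, SIZES.hidden),
};
console.log(selectStrategy(controller, new Float64Array(SIZES.input).fill(0.1)));
```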

Retrieval Strategies

The neural controller selects from 4 strategies per query:

ID | Strategy | Description
0 | Dense K=3 | Top 3 by raw cosine similarity - precise, focused queries
1 | Dense K=7 | Top 7 by cosine similarity - broader topic recall
2 | Decay-Rerank K=5 | Top 5 re-ranked by (similarity * decay_weight) - recency-biased
3 | Hebbian-Boost K=5 | Top 5 with Hebbian association boosting - cross-topic relational
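
The four strategies differ only in how candidates are scored and how many are returned. The sketch below is one reading of the table, not the NRN implementation. It assumes Decay-Rerank ranks every candidate by the product before taking the top 5, and that the Hebbian boost is additive (similarity plus boost factor times association strength). The field names `decayWeight` and `associationStrength` are hypothetical.

```javascript
// One scoring rule and result size per strategy ID from the table.
// ASSUMPTION: the Hebbian boost is additive and scaled by hebbianBoostFactor.
const STRATEGIES = {
  0: { k: 3, score: (c) => c.similarity },
  1: { k: 7, score: (c) => c.similarity },
  2: { k: 5, score: (c) => c.similarity * c.decayWeight },
  3: { k: 5, score: (c, cfg) => c.similarity + cfg.hebbianBoostFactor * c.associationStrength },
};

function retrieve(candidates, strategyId, cfg = { hebbianBoostFactor: 0.15 }) {
  const strategy = STRATEGIES[strategyId];
  if (!strategy) throw new Error(`Unknown strategy ${strategyId}`);
  return candidates
    .map((c) => ({ id: c.id, score: strategy.score(c, cfg) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, strategy.k);
}

const memories = [
  { id: 'a', similarity: 0.9, decayWeight: 0.2, associationStrength: 0 },
  { id: 'b', similarity: 0.8, decayWeight: 1.0, associationStrength: 0 },
  { id: 'c', similarity: 0.7, decayWeight: 0.9, associationStrength: 1 },
];

for (const id of [0, 2, 3]) {
  console.log(id, retrieve(memories, id).map((m) => m.id).join(','));
}
```

With this toy data, the stale but similar memory `a` leads under the dense strategy and drops to last under Decay-Rerank, while the strongly associated memory `c` moves ahead of `b` under Hebbian-Boost.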

Configuration

Parameter | Type | Default | Description
remembrance_threshold | int | 3 | Access count before a memory becomes permanent
decay_rate | float | 0.95 | Multiplicative decay factor per cycle
reinforce_rate | float | 0.02 | Rate at which retrieved embeddings shift toward queries
hebbian_learn_rate | float | 0.1 | How quickly associations strengthen on co-retrieval
controller_learn_rate | float | 0.01 | REINFORCE learning rate for strategy controller
merge_similarity_threshold | float | 0.95 | Cosine similarity above which embeddings merge during consolidation
hebbian_boost_factor | float | 0.15 | Score boost from Hebbian associations in strategy 3
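
The interaction between decay_rate and remembrance_threshold is easiest to see with numbers. The sketch below applies the documented defaults and assumes that memories at or above the access threshold are exempt from decay (the documentation calls them permanent). The prune threshold of 0.1 is a made-up value for illustration, since the actual cutoff is not documented.

```javascript
// Documented defaults for the two lifecycle parameters used here.
const config = { remembranceThreshold: 3, decayRate: 0.95 };

// Apply one decay cycle.
// ASSUMPTION: memories accessed at least remembranceThreshold times are
// permanent and skip decay; all others are multiplied by decayRate.
function decayCycle(memories, cfg = config) {
  return memories.map((m) =>
    m.accessCount >= cfg.remembranceThreshold
      ? m
      : { ...m, weight: m.weight * cfg.decayRate });
}

// Cycles needed for a weight of 1.0 to drop to pruneThreshold or lower.
function cyclesUntilPruned(pruneThreshold, cfg = config) {
  return Math.ceil(Math.log(pruneThreshold) / Math.log(cfg.decayRate));
}

let store = [
  { id: 'rarely-used', weight: 1, accessCount: 1 },
  { id: 'well-known', weight: 1, accessCount: 3 },
];
for (let cycle = 0; cycle < 10; cycle++) store = decayCycle(store);

console.log(store);
console.log(cyclesUntilPruned(0.1)); // 0.1 is a hypothetical prune threshold
```

At the default rate, an untouched memory loses about 40% of its weight in 10 cycles (0.95^10 is roughly 0.599) and would reach the hypothetical 0.1 cutoff after 45 cycles.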

CLI Usage

bash
# Prepare documents for ingestion
nrn prep ./docs/ prepared.txt

# Add embeddings to a user's group
nrn add main docs prepared.txt

# Search with neuroplastic retrieval
nrn search main docs "how does authentication work" 5

# Run consolidation (merge duplicates, decay, bridge)
nrn consolidate main docs

# Get memory stats
nrn stats main docs

# Give reward feedback (improves controller)
nrn reward main docs 412 0.9

# Prune decayed memories below threshold
nrn prune main docs

Why NRN

Traditional RAG systems treat every retrieval identically - no learning, no adaptation, no memory lifecycle. NRN introduces neuroplastic principles:

  • Memories that matter are reinforced and become permanent
  • Unused knowledge gracefully decays, keeping the index clean
  • Semantic clustering enables cross-topic associative recall
  • The neural controller learns which retrieval strategy works best per query type
  • Consolidation merges duplicates and generates bridging knowledge between clusters
  • Zero runtime dependencies - the ONNX model is embedded in the compiled C binary

MCP Protocol #mcp

Model Context Protocol integration for connecting AI agents to external tools and data sources.

Endpoints

http
GET    /v1/mcp/servers    # List MCP servers
POST   /v1/mcp/connect    # Connect to server
POST   /v1/mcp/execute    # Execute MCP tool

WebGrep Search #webgrep

Lightning-fast web search and content extraction.

Endpoints

http
POST /v1/webgrep/search    # Web search
POST /v1/webgrep/fetch     # Fetch URL content
POST /v1/webgrep/rag       # RAG-mode extraction

Search Engines

  • DuckDuckGo (default)
  • Google
  • Brave
  • Kagi
  • Startpage
  • Ecosia

Workflow Engine #workflow

Agentic Flow - orchestrate complex multi-step workflows.

Endpoints

http
POST   /v1/workflow/create    # Create workflow
POST   /v1/workflow/execute   # Execute workflow
GET    /v1/workflow/:id       # Get workflow status

Agentics CLI #cli

Full-featured AI agent CLI with live voice chat.

Installation

bash
npm install -g agentics

Usage

bash
# Start chat
agentics

# With specific model
agentics --model Sigma

# Voice mode
agentics --voice

# With tools enabled
agentics --tools

Features

  • Live voice chat with WebSocket streaming
  • Silero VAD with smart end-of-turn detection
  • Local ONNX-powered text-to-speech
  • Image generation with Flux and Arta
  • 10+ integrated tools
  • 18 text models + 27 image models

Taskman #taskman

AI-powered task management with MCP server integration.

Installation

bash
npm install -g @agentics/taskman

Modes

bash
# Terminal UI
taskman tui

# Background daemon
taskman daemon

# MCP server
taskman mcp

# AI reminders
taskman reminder

MCP Tools

18 integrated tools for task operations accessible via Agentics CLI and other MCP clients.

Agentics Studio #studio

Web-based development platform at ur1s.xyz.

Features

  • Custom subdomain on ur1s.xyz
  • Built-in HTML/CSS/JS editor
  • Real-time visit analytics
  • Instant deployment and updates

VPN Access #vpn

On-demand VPN credentials for secure browsing and access.

Endpoints

http
GET    /v1/vpn/servers       # List available servers
POST   /v1/vpn/credentials   # Get connection credentials
GET    /v1/vpn/config        # Download configuration file

Supported Protocols

  • OpenVPN
  • WireGuard
  • IKEv2

Error Handling #errors

The API uses standard HTTP status codes and returns errors in a consistent JSON format.

Error Response Format

json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}

Common Error Codes

Status | Type | Description
400 | invalid_request | Malformed request or missing parameters
401 | authentication_error | Invalid or missing API key
403 | permission_denied | Insufficient permissions
429 | rate_limit_exceeded | Too many requests
500 | server_error | Internal server error
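
Because every error shares one JSON shape, a client can wrap it in a single error type. The sketch below is illustrative: `AgenticsApiError` is a hypothetical class, not something the SDK exports. Treating 429 and 5xx responses as retryable is a common convention and an assumption here, not documented Agentics behaviour.

```javascript
// Wrap the documented error body in a typed Error.
// ASSUMPTION: 429 and 5xx responses are worth retrying; other 4xx are not.
class AgenticsApiError extends Error {
  constructor(status, body) {
    const detail = body && body.error ? body.error : {};
    super(detail.message || `Request failed with status ${status}`);
    this.name = 'AgenticsApiError';
    this.status = status;
    this.type = detail.type || 'unknown_error';
    this.code = detail.code || null;
    this.retryable = status === 429 || status >= 500;
  }
}

const example = new AgenticsApiError(401, {
  error: {
    message: 'Invalid API key provided',
    type: 'authentication_error',
    code: 'invalid_api_key',
  },
});
console.log(example.name, example.status, example.type, example.retryable);
```

In a fetch-based client, construct the error from `response.status` and the parsed JSON body whenever `response.ok` is false, then branch on `retryable` to decide whether to back off and try again.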