Agentics API v1
A unified, OpenAI-compatible interface to 18 language models, 27 image models, 71 voices, embeddings, agentic tool execution, the Neural Resonance Network, and a real-time voice Orb. One API key, every modality.
Quickstart #quickstart
Get started with the Agentics API in minutes. The API is OpenAI-compatible, so you can use existing OpenAI SDKs with minimal changes. The examples below request a response from Sigma; swap in ALM, the flagship model, or any other model ID.
Base URL: https://api.agentics.co.za/v1 · all responses are JSON unless explicitly streamed via SSE or WebSocket.
Installation
```bash
npm install agentics-sdk
```
Basic Usage
```js
import Agentics from 'agentics-sdk';

const client = new Agentics({
  apiKey: process.env.AGENTICS_API_KEY
});

const response = await client.inference.chat({
  model: 'Sigma',
  messages: [
    { role: 'user', content: 'Hello, how are you?' }
  ]
});

console.log(response.choices[0].message.content);
```
Quick Examples
```js
// Text-to-Speech
const audio = await client.audio.textToSpeech({
  text: 'Hello world!',
  voice: 'female2',
  outputBase64: true
});

// Speech-to-Text
const transcript = await client.audio.speechToText({
  audioData: base64Audio
});

// Image Generation
const image = await client.images.generate({
  prompt: 'A sunset over mountains',
  model: 'flux'
});

// Web Search
const results = await client.webgrep.search('latest AI news');

// Embeddings
const embedding = await client.embeddings.generateFromText('Hello world');

// Agent Task
const task = await client.agents.runTask('Search for AI news');
```
SDK Modules
| Module | Description |
|---|---|
| client.inference | Chat completions and text generation |
| client.audio | Text-to-speech and speech-to-text |
| client.live | Live WebSocket transcription |
| client.images | Image generation |
| client.embeddings | Vector embeddings |
| client.agents | Agent tasks with tools |
| client.conversations | Conversation management |
| client.memory | Key-value memory storage |
| client.webgrep | Web search and scraping |
| client.mcp | MCP server management |
Using with OpenAI SDK
```js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.AGENTICS_API_KEY,
  baseURL: 'https://api.agentics.co.za/v1'
});

const response = await client.chat.completions.create({
  model: 'Sigma',
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
Authentication #authentication
All API requests require authentication using an API key passed in the Authorization header. Tokens come in three flavours.
| Token type | Prefix | Purpose |
|---|---|---|
| Live key | ak_live_ | Server-side, full access to your account scopes |
| Test key | ak_test_ | Sandboxed, billed against the test ledger |
| Page token | pt_ | Browser-scoped, ephemeral, signed by Shield |
Never ship a live key to the browser. Use Shield for any client-side execution: it provisions short-lived page tokens automatically.
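The prefix table above lends itself to a small guard that catches a live key before it reaches browser code. This is a minimal sketch, not part of the SDK: `classifyToken` and `assertSafeForBrowser` are hypothetical helper names, and the check relies only on the documented prefixes.

```javascript
// Hypothetical helpers: classify a credential by its documented prefix.
const TOKEN_PREFIXES = [
  ['ak_live_', 'live'],
  ['ak_test_', 'test'],
  ['pt_', 'page'],
];

function classifyToken(token) {
  for (const [prefix, type] of TOKEN_PREFIXES) {
    if (token.startsWith(prefix)) return type;
  }
  return 'unknown';
}

// Throw if a live key is about to be used in client-side code.
function assertSafeForBrowser(token) {
  if (classifyToken(token) === 'live') {
    throw new Error('Live keys (ak_live_) must never be shipped to the browser');
  }
  return token;
}
```

Calling `assertSafeForBrowser` in a build step or at client start-up is one way to fail fast; where the check runs is a design choice left to the integrator.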
API Key Authentication
```bash
curl https://api.agentics.co.za/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "Sigma", "messages": [{"role": "user", "content": "Hello!"}]}'
```
JWT Authentication
For user sessions, JWTs are issued on login and expire after 7 days. Refresh tokens last 30 days.
Hardware Authentication
Device-bound authentication is available for enhanced security. Contact support to enable hardware ID binding.
Rate Limits #rate-limits
Soft limits scale with subscription tier; hard limits protect upstream infrastructure. Both reset every 60 seconds. Inspect the X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers to track your quota.
| Tier | Requests/min | Tokens/day |
|---|---|---|
| Free | 20 | 10,000 |
| Pro | 100 | 500,000 |
| Enterprise | Unlimited | Unlimited |
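A client can use the rate-limit headers to decide whether to pause before the next request. The sketch below is illustrative: `msUntilNextRequest` is a hypothetical helper, and it assumes X-RateLimit-Reset carries the number of seconds until the window resets, which this reference does not specify.

```javascript
// Sketch: how long to wait before sending the next request.
// Assumption: X-RateLimit-Reset is "seconds until the window resets".
// `headers` is anything with a get(name) method (fetch Headers, Map, ...).
function msUntilNextRequest(headers) {
  const remainingRaw = headers.get('X-RateLimit-Remaining');
  const resetRaw = headers.get('X-RateLimit-Reset');
  // Missing headers: nothing to act on, so do not wait.
  if (remainingRaw == null || resetRaw == null) return 0;

  const remaining = Number(remainingRaw);
  const resetSeconds = Number(resetRaw);
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSeconds)) return 0;

  // Requests are still available in this window: send immediately.
  if (remaining > 0) return 0;

  // Clamp to the documented 60-second window.
  return Math.min(Math.max(resetSeconds, 0), 60) * 1000;
}
```

If the header turns out to hold an absolute timestamp instead, only the final line needs to change.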
Base URL #base-url
All API requests should be made to the host below. Endpoint paths in this reference include the /v1 prefix.
```
https://api.agentics.co.za
```
The API follows REST conventions and returns JSON responses.
Chat Completions #chat
The primary inference endpoint. Generate text responses using our AI models. Compatible with OpenAI's chat completions API, with one extra field: tools_async for fire-and-forget background tool calls.
Request Body
```json
{
  "model": "Sigma",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "maxTokens": 1024,
  "temperature": 0.7
}
```
Parameters
| Field | Type | Notes |
|---|---|---|
| model | string | ALM, Sigma, Syntactic, Polaris, Nitro, Magma, Orion, AgentX, Neo, Omega, Delta, Amiga. |
| messages | array | Standard OpenAI shape with optional name and tool_call_id fields. |
| stream | boolean | Defaults to false. Set true for SSE deltas. |
| tools | array | OpenAI function calling schema. Blocking, returns results in stream. |
| tools_async | array | Fire and forget tools. Model continues responding while the action runs. |
| temperature | number | 0 to 2. Default 0.7. |
| maxTokens | integer | Maximum tokens to generate |
| top_p | float | Nucleus sampling parameter |
Streaming Response
Streaming is enabled on every text and audio endpoint, including the Realtime Orb WebSocket. Image generation streams progress frames every 200ms when stream: true is set.
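```js
// Request a streamed completion; the body arrives as Server-Sent Events (SSE).
const response = await fetch('https://api.agentics.co.za/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'Sigma',
    messages: [{ role: 'user', content: 'Write a poem' }],
    stream: true
  })
});

// Fail early instead of trying to stream an error payload.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();

// Print the raw SSE frames as they arrive.
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // { stream: true } keeps multi-byte characters intact across chunk boundaries.
  process.stdout.write(decoder.decode(value, { stream: true }));
}
```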
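The loop above prints raw SSE frames. To extract the generated text, the frames have to be split into events and parsed. The sketch below shows one way to do that incrementally. It assumes the stream follows the OpenAI streaming shape (`data: {json}` lines with the text under `choices[0].delta.content`, terminated by `data: [DONE]`), which is implied by the compatibility claim but not spelled out in this reference; `createSseParser` is a hypothetical helper.

```javascript
// Sketch of an incremental SSE parser for chat deltas.
// Assumption: OpenAI-style events ("data: {json}" lines, "data: [DONE]" at the end).
function createSseParser(onDelta) {
  let buffer = '';
  let finished = false;

  // Feed decoded text chunks; returns true once [DONE] has been seen.
  return function feed(textChunk) {
    buffer += textChunk;
    const lines = buffer.split('\n');
    // The last element may be an incomplete line; keep it for the next chunk.
    buffer = lines.pop();

    for (const rawLine of lines) {
      const line = rawLine.trim();
      if (finished || !line.startsWith('data:')) continue;

      const payload = line.slice('data:'.length).trim();
      if (payload === '[DONE]') {
        finished = true;
        continue;
      }

      const event = JSON.parse(payload);
      const delta = event.choices?.[0]?.delta?.content;
      if (delta) onDelta(delta);
    }
    return finished;
  };
}
```

In the streaming loop, `feed(decoder.decode(value, { stream: true }))` would replace the direct write to stdout.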
Models #models
List available AI models and their capabilities.
List Models
```http
GET /v1/models
```
Get Model Info
```http
GET /v1/models/:model
```
Available Models
| Model | Description | Multimodal |
|---|---|---|
| ALM | Flagship reasoning with vision, tool calling, multimodal (default) | Yes |
| Sigma | Frontier multimodal model | Yes |
| Orion | Frontier reasoning and analysis | No |
| Syntactic | Precision code generation | No |
| Nitro | Deep reasoning and analysis | No |
| Polaris | Versatile general-purpose model | No |
| Magma | Premium-tier creative generation | Yes |
| AgentX | Specialist agentic execution model | Yes |
| Neo | Long-context premium model | Yes |
| Omega | Compact general purpose | No |
| Delta | Conversational and fast | No |
| Amiga | Friendly companion model | No |
Image Generation #images
27 image models behind one endpoint. Flux, SDXL, AgenPics, Arta, Juggernaut, Dreamshaper and more. Set model in the request body to pick one.
Request Body
```json
{
  "prompt": "A futuristic cityscape at sunset",
  "model": "flux",
  "size": "1024x1024",
  "n": 1
}
```
Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | Text description of the image |
| model | string | flux, sdxl, arta, agenpics, juggernaut, dreamshaper, etc. |
| size | string | Image dimensions (1024x1024, 512x512, etc.) |
| n | integer | Number of images to generate |
Audio Processing #audio
Text-to-speech and speech-to-text capabilities powered by local ONNX models. 71 voices, streaming PCM at 44.1 kHz, sub-200ms first audio.
Text-to-Speech
```json
{
  "text": "Hello, welcome to Agentics!",
  "voice": "female2",
  "speed": 1.2,
  "quality": 20,
  "outputBase64": true
}
```
Parameters
| Parameter | Type | Description |
|---|---|---|
| text | string | Text to convert to speech (required) |
| voice | string | Voice ID (default: female2 / justine) |
| speed | float | Speech speed multiplier (default: 1.2) |
| quality | integer | Audio quality (default: 20) |
| outputBase64 | boolean | Return base64 audio in JSON response |
Available Voices
Four signature voices: Justine (default, warm), Emma (clinical, fast), Lucas (deep, calm), Felix (playful, conversational). Plus 67 community voices behind voice=community/<handle>.
```http
GET /v1/audio/voices
```
| Voice ID | Aliases | Gender |
|---|---|---|
| male1 | m1, lucas | Male |
| male2 | m2, felix | Male |
| female1 | f1, emma | Female |
| female2 | f2, justine | Female (default) |
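Client code that accepts a voice name from user input may want to normalise it to a canonical ID before sending the request. The sketch below encodes the alias table above; `resolveVoice` is a hypothetical helper, and it assumes community voices should be passed through unchanged in the documented `community/<handle>` form.

```javascript
// Alias table copied from the voices reference above.
const VOICE_ALIASES = {
  male1: ['m1', 'lucas'],
  male2: ['m2', 'felix'],
  female1: ['f1', 'emma'],
  female2: ['f2', 'justine'],
};

// Returns the canonical voice ID, the untouched community voice, or null if unknown.
function resolveVoice(name = 'female2') {
  if (name.startsWith('community/')) return name;

  const key = name.toLowerCase();
  for (const [voiceId, aliases] of Object.entries(VOICE_ALIASES)) {
    if (key === voiceId || aliases.includes(key)) return voiceId;
  }
  return null;
}
```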
Speech-to-Text
JSON Request
```json
{
  "audioData": "base64_encoded_audio",
  "language": "en"
}
```
Multipart Form Request
```bash
curl https://api.agentics.co.za/v1/audio/stt \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file="@audio.wav" \
  -F language="en"
```
Live Transcription (WebSocket)
Stream audio in real-time for continuous transcription with VAD (Voice Activity Detection) and EOT (End-of-Turn) detection.
Query Parameters
| Parameter | Default | Description |
|---|---|---|
| language | auto | Language code (en, es, fr, etc.) |
| vad_threshold | 0.5 | VAD sensitivity (0-1) |
| min_silence | 500 | Min silence timeout in ms |
| max_silence | 6000 | Max silence timeout in ms |
| eot | true | Enable End-of-Turn detection |
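The query parameters above can be assembled into a connection URL with the standard `URL` API. This sketch is illustrative only: `buildLiveUrl` is a hypothetical helper, the WebSocket endpoint is passed in by the caller because its path is not given in this reference, and the min/max silence ordering check is an added sanity check rather than documented behaviour.

```javascript
// Defaults copied from the query-parameter table above.
const LIVE_DEFAULTS = {
  language: 'auto',
  vad_threshold: 0.5,
  min_silence: 500,
  max_silence: 6000,
  eot: true,
};

// endpoint: the live transcription WebSocket URL (supplied by the caller).
function buildLiveUrl(endpoint, overrides = {}) {
  const options = { ...LIVE_DEFAULTS, ...overrides };

  if (options.vad_threshold < 0 || options.vad_threshold > 1) {
    throw new RangeError('vad_threshold must be between 0 and 1');
  }
  // Assumption: a minimum larger than the maximum is a caller mistake.
  if (options.min_silence > options.max_silence) {
    throw new RangeError('min_silence must not exceed max_silence');
  }

  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(options)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}
```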
Embeddings #embeddings
384-dimensional GTE-Small embeddings. Up to 1,024 inputs per request. The same backend powers NRN. Returns the standard OpenAI shape, with optional cosine similarity scoring against a stored corpus via compare_to.
Generate Embedding
```http
POST /v1/embeddings/generate
```
```json
{
  "text": "The quick brown fox"
}
```
Add to Group
```http
POST /v1/embeddings/add
```
```json
{
  "group": "documentation",
  "content": "How to use the API",
  "metadata": "optional metadata string"
}
```
Search Embeddings
```http
POST /v1/embeddings/search
```
```json
{
  "query": "How do I use the API?",
  "group": "documentation",
  "limit": 5
}
```
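Conceptually, a search ranks stored vectors by cosine similarity to the query vector and returns the best `limit` matches. The sketch below illustrates that idea in memory with toy vectors. It is not the server implementation: `cosineSimilarity` and `topK` are hypothetical helpers, and real embeddings would have 384 dimensions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Rank entries ({ content, vector }) against a query vector and keep the best `limit`.
function topK(queryVector, entries, limit = 5) {
  return entries
    .map((entry) => ({ ...entry, score: cosineSimilarity(queryVector, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```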
List Groups
```http
GET /v1/embeddings/groups
```
Delete Group
```http
DELETE /v1/embeddings/groups/:name
```
Agent Tools #agent
Execute agentic tools for advanced AI capabilities including code execution, web search, and more. Agentics is the first inference API with first-class async tools.
Synchronous tools
The model pauses, calls your function, waits for the JSON response, and continues. Best for lookups and database queries the model must reason with.
Asynchronous tools
The model fires the call into a queue and continues responding. Perfect for sending emails, queuing jobs, writing audit logs.
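A single chat request can mix both kinds of tool. The sketch below builds such a request body. The two tool definitions (`lookup_order`, `write_audit_log`) are invented for illustration, and the sketch assumes `tools_async` entries use the same function-calling schema as `tools`, which this reference does not state explicitly.

```javascript
// Hypothetical tool definitions in the OpenAI function-calling schema.
const lookupOrder = {
  type: 'function',
  function: {
    name: 'lookup_order',
    description: 'Fetch an order the model needs to reason about',
    parameters: {
      type: 'object',
      properties: { order_id: { type: 'string' } },
      required: ['order_id'],
    },
  },
};

const writeAuditLog = {
  type: 'function',
  function: {
    name: 'write_audit_log',
    description: 'Record that the order was viewed',
    parameters: {
      type: 'object',
      properties: { entry: { type: 'string' } },
      required: ['entry'],
    },
  },
};

// Build a chat completions body that uses both tool modes.
function buildChatRequest(userMessage) {
  return {
    model: 'Sigma',
    messages: [{ role: 'user', content: userMessage }],
    tools: [lookupOrder],          // synchronous: the model waits for the result
    tools_async: [writeAuditLog],  // asynchronous: the model keeps responding
  };
}
```

The rule of thumb from the descriptions above: put a tool in `tools` when the model needs its result to answer, and in `tools_async` when the call is a side effect.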
Available Tools
| Tool | Description |
|---|---|
| bash | Execute shell commands (sandboxed) |
| python | Run Python code (sandboxed) |
| web_search | Search the web via WebGrep |
| fetch_url | Fetch and parse web content |
| generate_image | Generate images from prompts |
| str_replace_editor | Edit files with search/replace |
| memory | Persistent key-value storage |
| todo | Task management |
Endpoint
```http
POST /v1/agent/execute
```
Realtime Voice (Orb) #realtime
Live voice interface with bidirectional audio streaming. Send 16 kHz PCM frames, receive PCM frames at 44.1 kHz. Includes voice activity detection, interruption handling, and a structured event channel for tool calls.
The Orb provides voice-powered AI interaction with:
- Real-time speech recognition
- Natural language understanding
- Text-to-speech responses
- Interrupt handling
- Tool calling over the structured event channel
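Browser and Node audio pipelines usually produce floating-point samples, so they need converting before being sent as PCM frames. The sketch below shows a common conversion. It assumes 16-bit signed little-endian mono samples, which is a typical PCM layout but is not confirmed by this reference; `floatTo16BitPcm` and `samplesPerFrame` are hypothetical helpers.

```javascript
// Convert float samples in [-1, 1] to 16-bit signed little-endian PCM.
// Assumption: the Orb expects 16-bit mono PCM; the bit depth is not documented here.
function floatTo16BitPcm(samples) {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(value), true);
  }
  return buffer;
}

// Number of samples in a frame of the given duration at the input sample rate.
function samplesPerFrame(frameMs, sampleRate = 16000) {
  return Math.round((sampleRate * frameMs) / 1000);
}
```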
Conversations #conversations
Manage persistent conversation history.
Endpoints
```http
GET /v1/conversations                 # List conversations
POST /v1/conversations                # Create conversation
GET /v1/conversations/:id             # Get conversation
DELETE /v1/conversations/:id          # Delete conversation
POST /v1/conversations/:id/messages   # Add message
```
Memory Storage #memory
Key-value storage for persistent agent memory.
Endpoints
```http
GET /v1/memory/:key      # Get value
POST /v1/memory          # Set value
DELETE /v1/memory/:key   # Delete value
```
NRN - Neural Resonance Network #nrn
A biologically-inspired memory system written in C that learns, reinforces, decays, and consolidates knowledge over time. NRN is the core long-term memory engine powering Agentics AI agents, designed to solve the fundamental problem of information retention across unbounded time spans.
How It Works
NRN uses a 384-dimensional embedding space (GTE-Small ONNX, compiled directly into the binary) combined with Hebbian learning, temporal decay, and a learned neural controller. Knowledge that is accessed frequently becomes permanent; unused knowledge decays naturally. Semantic clusters form automatically, and cross-cluster associations strengthen through co-retrieval.
Endpoints
Push a memory into the Neural Resonance Network. NRN handles deduplication, association building, and decay automatically.
Query NRN with a natural language prompt. Returns ranked, weighted memories with adaptive retrieval that improves with use.
Architecture
| Component | Detail |
|---|---|
| Embedding Model | GTE-Small (384-dim) ONNX, compiled into the binary via objcopy |
| Storage | SQLite per-user databases with embeddings, clusters, and association tables |
| Neural Controller | 2-layer network (392 to 64 to 4) with ReLU + Softmax, trained via REINFORCE |
| Clustering | 32 semantic clusters with centroid tracking and auto-assignment |
| Hebbian Learning | Co-retrieval strengthens association weights; global decay prevents saturation |
| Lifecycle | Insert, Access tracking, Remembrance (3+ accesses), Decay (0.95/cycle), Pruning |
| Consolidation | Duplicate merging (>0.95 similarity), LLM bridge summaries, decay cycle |
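The controller row describes a small feed-forward network: 392 inputs, a 64-unit ReLU layer, and a 4-way softmax over retrieval strategies. The sketch below shows a forward pass with that shape. It is a JavaScript illustration, not the C implementation: the weights are placeholders supplied by the caller, the helper names are hypothetical, and the reference does not say what the 8 inputs beyond the 384-dimensional embedding contain.

```javascript
// One dense layer: weights is [outputs][inputs], bias is [outputs].
function dense(input, weights, bias) {
  return weights.map((row, j) =>
    row.reduce((sum, w, i) => sum + w * input[i], bias[j])
  );
}

const relu = (values) => values.map((x) => Math.max(0, x));

// Numerically stable softmax: subtract the max before exponentiating.
function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map((x) => Math.exp(x - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// 392 features -> 64 hidden (ReLU) -> 4 strategy probabilities (softmax).
function strategyProbabilities(features, params) {
  if (features.length !== 392) throw new Error('Expected 392 input features');
  const hidden = relu(dense(features, params.w1, params.b1));
  return softmax(dense(hidden, params.w2, params.b2));
}
```

During training with REINFORCE a strategy would be sampled from these probabilities; at inference time, taking the most probable strategy is one possible choice.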
Retrieval Strategies
The neural controller selects from 4 strategies per query:
| ID | Strategy | Description |
|---|---|---|
| 0 | Dense K=3 | Top 3 by raw cosine similarity - precise, focused queries |
| 1 | Dense K=7 | Top 7 by cosine similarity - broader topic recall |
| 2 | Decay-Rerank K=5 | Top 5 re-ranked by (similarity * decay_weight) - recency-biased |
| 3 | Hebbian-Boost K=5 | Top 5 with Hebbian association boosting - cross-topic relational |
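The four strategies differ only in how candidates are scored and how many are kept. The sketch below expresses that over precomputed per-candidate values. Several details are assumptions: the candidate fields (`similarity`, `decayWeight`, `association`) are hypothetical names, every candidate is scored before the top K are kept, and the Hebbian boost is modelled as additive (`similarity + 0.15 * association`), since the reference gives the boost factor but not the exact formula.

```javascript
// Score every candidate, sort descending, keep the top k.
function rank(candidates, scoreFn, k) {
  return candidates
    .map((c) => ({ id: c.id, score: scoreFn(c) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Index in this array = strategy ID from the table above.
const STRATEGIES = [
  (c) => rank(c, (m) => m.similarity, 3),                  // 0: Dense K=3
  (c) => rank(c, (m) => m.similarity, 7),                  // 1: Dense K=7
  (c) => rank(c, (m) => m.similarity * m.decayWeight, 5),  // 2: Decay-Rerank K=5
  // 3: Hebbian-Boost K=5 (additive boost is an assumption)
  (c, boost = 0.15) => rank(c, (m) => m.similarity + boost * m.association, 5),
];

function retrieve(strategyId, candidates) {
  const strategy = STRATEGIES[strategyId];
  if (!strategy) throw new RangeError(`Unknown strategy ID: ${strategyId}`);
  return strategy(candidates);
}
```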
Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| remembrance_threshold | int | 3 | Access count before a memory becomes permanent |
| decay_rate | float | 0.95 | Multiplicative decay factor per cycle |
| reinforce_rate | float | 0.02 | Rate at which retrieved embeddings shift toward queries |
| hebbian_learn_rate | float | 0.1 | How quickly associations strengthen on co-retrieval |
| controller_learn_rate | float | 0.01 | REINFORCE learning rate for strategy controller |
| merge_similarity_threshold | float | 0.95 | Cosine similarity above which embeddings merge during consolidation |
| hebbian_boost_factor | float | 0.15 | Score boost from Hebbian associations in strategy 3 |
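The interaction between `remembrance_threshold` and `decay_rate` is easiest to see in a worked example: with the default rate, an unaccessed memory of weight 1 has weight 0.95^n after n cycles. The sketch below models one decay cycle. It is a simplified illustration, not the NRN implementation: the memory fields (`weight`, `accessCount`) and the `pruneBelow` threshold are hypothetical, since the reference does not give a prune threshold.

```javascript
// Defaults copied from the configuration table above.
const NRN_DEFAULTS = { remembrance_threshold: 3, decay_rate: 0.95 };

// Apply one decay cycle and drop memories whose weight falls below pruneBelow.
// Memories accessed at least remembrance_threshold times are treated as permanent.
function runDecayCycle(memories, pruneBelow, config = NRN_DEFAULTS) {
  const kept = [];
  for (const memory of memories) {
    const permanent = memory.accessCount >= config.remembrance_threshold;
    const weight = permanent ? memory.weight : memory.weight * config.decay_rate;
    if (permanent || weight >= pruneBelow) kept.push({ ...memory, weight });
  }
  return kept;
}

// Number of cycles for an unaccessed memory of weight 1 to decay to pruneBelow or less.
function cyclesUntilPruned(pruneBelow, decayRate = NRN_DEFAULTS.decay_rate) {
  return Math.ceil(Math.log(pruneBelow) / Math.log(decayRate));
}
```

For example, under these assumptions a memory that is never retrieved drops below half its initial weight after 14 cycles, because 0.95^13 is about 0.513 and 0.95^14 is about 0.488.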
CLI Usage
```bash
# Prepare documents for ingestion
nrn prep ./docs/ prepared.txt

# Add embeddings to a user's group
nrn add main docs prepared.txt

# Search with neuroplastic retrieval
nrn search main docs "how does authentication work" 5

# Run consolidation (merge duplicates, decay, bridge)
nrn consolidate main docs

# Get memory stats
nrn stats main docs

# Give reward feedback (improves controller)
nrn reward main docs 412 0.9

# Prune decayed memories below threshold
nrn prune main docs
```
Why NRN
Traditional RAG systems treat every retrieval identically - no learning, no adaptation, no memory lifecycle. NRN introduces neuroplastic principles:
- Memories that matter are reinforced and become permanent
- Unused knowledge gracefully decays, keeping the index clean
- Semantic clustering enables cross-topic associative recall
- The neural controller learns which retrieval strategy works best per query type
- Consolidation merges duplicates and generates bridging knowledge between clusters
- Zero runtime dependencies - the ONNX model is embedded in the compiled C binary
MCP Protocol #mcp
Model Context Protocol integration for connecting AI agents to external tools and data sources.
Endpoints
```http
GET /v1/mcp/servers    # List MCP servers
POST /v1/mcp/connect   # Connect to server
POST /v1/mcp/execute   # Execute MCP tool
```
WebGrep Search #webgrep
Lightning-fast web search and content extraction.
Endpoints
```http
POST /v1/webgrep/search   # Web search
POST /v1/webgrep/fetch    # Fetch URL content
POST /v1/webgrep/rag      # RAG-mode extraction
```
Search Engines
- DuckDuckGo (default)
- Brave
- Kagi
- Startpage
- Ecosia
Workflow Engine #workflow
Agentic Flow - orchestrate complex multi-step workflows.
Endpoints
```http
POST /v1/workflow/create    # Create workflow
POST /v1/workflow/execute   # Execute workflow
GET /v1/workflow/:id        # Get workflow status
```
Agentics CLI #cli
Full-featured AI agent CLI with live voice chat.
Installation
```bash
npm install -g agentics
```
Usage
```bash
# Start chat
agentics

# With specific model
agentics --model Sigma

# Voice mode
agentics --voice

# With tools enabled
agentics --tools
```
Features
- Live voice chat with WebSocket streaming
- Silero VAD with smart end-of-turn detection
- Local ONNX-powered text-to-speech
- Image generation with Flux and Arta
- 10+ integrated tools
- 18 text models + 27 image models
Taskman #taskman
AI-powered task management with MCP server integration.
Installation
```bash
npm install -g @agentics/taskman
```
Modes
```bash
# Terminal UI
taskman tui

# Background daemon
taskman daemon

# MCP server
taskman mcp

# AI reminders
taskman reminder
```
MCP Tools
18 integrated tools for task operations accessible via Agentics CLI and other MCP clients.
Agentics Studio #studio
Web-based development platform at ur1s.xyz.
Features
- Custom subdomain on ur1s.xyz
- Built-in HTML/CSS/JS editor
- Real-time visit analytics
- Instant deployment and updates
VPN Access #vpn
On-demand VPN credentials for secure browsing and access.
Endpoints
```http
GET /v1/vpn/servers        # List available servers
POST /v1/vpn/credentials   # Get connection credentials
GET /v1/vpn/config         # Download configuration file
```
Supported Protocols
- OpenVPN
- WireGuard
- IKEv2
Error Handling #errors
The API uses standard HTTP status codes and returns errors in a consistent JSON format.
Error Response Format
```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "authentication_error",
    "code": "invalid_api_key"
  }
}
```
Common Error Codes
| Status | Type | Description |
|---|---|---|
| 400 | invalid_request | Malformed request or missing parameters |
| 401 | authentication_error | Invalid or missing API key |
| 403 | permission_denied | Insufficient permissions |
| 429 | rate_limit_exceeded | Too many requests |
| 500 | server_error | Internal server error |
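Because every error shares the same JSON envelope, client code can wrap it in a single error type and decide centrally whether a retry makes sense. The sketch below is one possible approach. `AgenticsApiError` and `backoffMs` are hypothetical names, and treating 429 and 5xx responses as retryable with exponential backoff is a common convention rather than documented API guidance.

```javascript
// Wraps the documented error envelope: { error: { message, type, code } }.
class AgenticsApiError extends Error {
  constructor(status, body) {
    const detail = body?.error ?? {};
    super(detail.message ?? `HTTP ${status}`);
    this.name = 'AgenticsApiError';
    this.status = status;
    this.type = detail.type ?? 'unknown';
    this.code = detail.code ?? null;
  }

  // Assumption: rate limits and server errors are transient; other errors are not.
  get retryable() {
    return this.status === 429 || this.status >= 500;
  }
}

// Exponential backoff: baseMs, 2x baseMs, 4x baseMs, ... capped at maxMs.
function backoffMs(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

A caller would typically construct the error from `response.status` and the parsed response body, then retry after `backoffMs(attempt)` only while `error.retryable` is true.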