Developer Documentation
Everything you need to integrate Tagmatic's AI annotation API into your application. All endpoints are available at https://tagmatic.app/api.
Authentication #
Tagmatic uses two authentication methods depending on the context: API keys for server-to-server integration, and JWT tokens for the browser dashboard.
API Keys #
API keys are the standard way to authenticate programmatic access to the annotation API. They're prefixed with tmk_ and are created from your dashboard.
Go to Dashboard → API Keys and click Create New Key. Each key is shown only once; copy it immediately.
Send your API key in the X-API-Key header on every request:
```http
POST https://tagmatic.app/api/annotate
X-API-Key: tmk_your_api_key_here
Content-Type: application/json
```
You can also use the Authorization header as a Bearer token:
```http
Authorization: Bearer tmk_your_api_key_here
```
Never expose API keys in client-side code or commit them to version control. Use environment variables.
JWT Sessions #
JWT tokens authenticate the dashboard and any browser-based tools you build. A token is returned on login and expires after 30 days.
```http
POST /api/auth/login
Content-Type: application/json

{
  "email": "you@example.com",
  "password": "your-password"
}
```

```json
// Response
{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "user": { "id": 42, "email": "you@example.com", "plan": "starter" }
}
```
For server-to-server integrations, always use an API key — not a JWT token.
Schema Reference #
The schema field defines what annotations to extract. It's an array of label definitions — each with a name, type, and description. Tagmatic supports four annotation types.
All annotation requests follow this structure: POST /api/annotate with text, schema, and optional guidelines.
Classification #
Classify text into one of a fixed set of options. Use for sentiment, topic, priority, intent, or any categorical label.
{
"text": "This product is absolutely fantastic, exceeded all my expectations!",
"schema": [
{
"name": "sentiment",
"type": "classification",
"description": "Overall emotional tone of the review",
"options": ["positive", "neutral", "negative"]
},
{
"name": "intent",
"type": "classification",
"description": "Primary intent of the message",
"options": ["purchase", "support", "complaint", "general"]
}
],
"guidelines": "Rate the tone based on explicit language only. Ignore punctuation."
}
{
"job_id": "ann_01jq8bx9f3",
"annotations": {
"sentiment": "positive",
"intent": "purchase"
},
"confidence": { "sentiment": 0.97, "intent": 0.88 },
"processing_ms": 412
}
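A common use of the confidence field is routing uncertain results to human review. Below is a minimal sketch; the helper name and the 0.9 threshold are our illustration, not part of the API.

```python
def low_confidence_labels(annotations, confidence, threshold=0.9):
    """Return label names whose confidence falls below the threshold.

    `annotations` and `confidence` mirror the response fields above;
    labels without a confidence score are skipped.
    """
    return [
        name for name in annotations
        if name in confidence and confidence[name] < threshold
    ]

response = {
    "annotations": {"sentiment": "positive", "intent": "purchase"},
    "confidence": {"sentiment": 0.97, "intent": 0.88},
}
flagged = low_confidence_labels(response["annotations"], response["confidence"])
# intent (0.88) falls below the 0.9 threshold
```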
Span / NER #
Extract named entities or specific text spans with character offsets. Returns all matching occurrences in the text. Use for names, dates, locations, product mentions, or any substring-level annotation.
{
"text": "Book a flight from New York to Paris for Alice Johnson on March 5th.",
"schema": [
{
"name": "location",
"type": "span",
"description": "Geographic places mentioned in the text",
"multi": true
},
{
"name": "person",
"type": "span",
"description": "Full names of people",
"multi": true
}
]
}
{
"job_id": "ann_01jq8c0d2m",
"annotations": {
"location": [
{ "text": "New York", "start": 19, "end": 27 },
{ "text": "Paris", "start": 31, "end": 36 }
],
"person": [
{ "text": "Alice Johnson", "start": 41, "end": 54 }
]
},
"processing_ms": 508
}
The start and end fields are zero-indexed character offsets into the original text string, where end is exclusive (text.slice(start, end)).
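Because end is exclusive, span annotations can be verified or highlighted with ordinary Python slicing. A self-contained sketch (the helper name is ours):

```python
def span_texts(text, spans):
    """Resolve span annotations back to substrings via their offsets.

    Offsets are zero-indexed character positions; `end` is exclusive,
    so text[start:end] reproduces the annotated substring exactly.
    """
    return [text[s["start"]:s["end"]] for s in spans]

text = "Book a flight from New York to Paris for Alice Johnson on March 5th."
locations = [
    {"text": "New York", "start": 19, "end": 27},
    {"text": "Paris", "start": 31, "end": 36},
]
assert span_texts(text, locations) == ["New York", "Paris"]
```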
Extraction #
Extract structured key-value data from unstructured text. The value can be any string. Use for prices, dates, model numbers, addresses, or any free-form field you need to pull out.
{
"text": "Invoice #INV-2024-089 dated 2024-03-14 for $1,250.00 due in 30 days.",
"schema": [
{
"name": "invoice_number",
"type": "extraction",
"description": "The invoice identifier"
},
{
"name": "amount",
"type": "extraction",
"description": "Total invoice amount including currency symbol"
},
{
"name": "due_date",
"type": "extraction",
"description": "Payment due date derived from the invoice date and terms"
}
]
}
{
"job_id": "ann_01jq8c4p7k",
"annotations": {
"invoice_number": "INV-2024-089",
"amount": "$1,250.00",
"due_date": "2024-04-13"
},
"processing_ms": 391
}
If a field can't be extracted (not present in the text), the annotation value will be null.
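Downstream code should treat null (None in Python) as "not found" rather than a value. A small sketch of that split; the function is illustrative, not an SDK helper:

```python
def present_fields(annotations):
    """Separate extraction results into found fields and missing ones.

    Fields the model could not extract come back as null (None in
    Python), so filter on that before using the values downstream.
    """
    found = {k: v for k, v in annotations.items() if v is not None}
    missing = [k for k, v in annotations.items() if v is None]
    return found, missing

found, missing = present_fields({
    "invoice_number": "INV-2024-089",
    "amount": "$1,250.00",
    "due_date": None,  # not stated in the source text
})
```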
Boolean #
Make yes/no decisions about text. Use for spam detection, content moderation, eligibility checks, or any binary condition.
{
"text": "BUY NOW!!! Earn $5000 weekly from home. Click here for FREE money!!!",
"schema": [
{
"name": "is_spam",
"type": "boolean",
"description": "True if the text appears to be spam or unsolicited marketing"
},
{
"name": "contains_pii",
"type": "boolean",
"description": "True if the text contains personally identifiable information"
}
]
}
{
"job_id": "ann_01jq8c7r9n",
"annotations": {
"is_spam": true,
"contains_pii": false
},
"confidence": { "is_spam": 0.99, "contains_pii": 0.95 },
"processing_ms": 302
}
Mixing label types
A single schema can mix all four types. Tagmatic resolves them in one API call.
{
"text": "Call me at 555-0100. I'm very unhappy with my order #7821.",
"schema": [
{ "name": "sentiment", "type": "classification", "options": ["positive", "neutral", "negative"] },
{ "name": "order_number", "type": "extraction", "description": "Order number mentioned" },
{ "name": "phone", "type": "span", "description": "Phone numbers in the text" },
{ "name": "needs_callback", "type": "boolean", "description": "Customer explicitly asks to be called back" }
]
}
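In a mixed schema, each annotation's shape depends on its label type: a string for classification and extraction, a list of offset objects for spans, a bool for booleans. One way to dispatch on that is to index the schema by name. A sketch with illustrative data (the helper is ours, not part of the API):

```python
def group_by_type(schema, annotations):
    """Bucket annotation results by label type, using the schema as
    the source of truth for each label's declared type."""
    type_of = {label["name"]: label["type"] for label in schema}
    grouped = {}
    for name, value in annotations.items():
        grouped.setdefault(type_of[name], {})[name] = value
    return grouped

schema = [
    {"name": "sentiment", "type": "classification"},
    {"name": "order_number", "type": "extraction"},
    {"name": "needs_callback", "type": "boolean"},
]
annotations = {
    "sentiment": "negative",
    "order_number": "7821",
    "needs_callback": False,
}
grouped = group_by_type(schema, annotations)
# grouped["classification"] == {"sentiment": "negative"}
```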
Guidelines Best Practices #
The optional guidelines field is the single highest-leverage input for annotation accuracy. Well-written guidelines reduce ambiguous cases by giving the model a consistent decision framework.
What makes guidelines effective
- State the decision rule, not the definition. "Mark as positive if the customer expresses satisfaction with the product, outcome, or service" is better than "positive = good feeling".
- Cover edge cases explicitly. If sarcasm should count as negative, say so. If neutral includes mixed sentiment, say so.
- Use concrete examples. "For example, 'not bad' should be classified as neutral, not positive."
- Be specific about scope. "Classify based on the product only — ignore tone about shipping or customer service."
- Define what to do when unsure. "When in doubt between positive and neutral, prefer neutral."
Good:
Classify sentiment based on explicit language about the product. Mark positive if the customer expresses clear satisfaction or recommends the product. Mark negative for complaints, disappointment, or requests for refunds. Mark neutral for factual descriptions, mixed opinions, or sarcasm. Ignore delivery comments.
Too vague:
Classify if the review is positive, negative, or neutral based on how the customer feels.
Good:
Extract the invoice number exactly as it appears in the text (e.g., "INV-001", "Invoice #42"). If multiple invoice numbers appear, extract only the first one. If none is present, return null.
Too vague:
Extract the invoice number from the text.
Guidelines for span / NER labels
- Specify whether to include articles and prepositions ("the UK" vs "UK").
- Clarify ambiguous cases: "Include the country name but not the city."
- For multi-occurrence spans, state whether overlapping spans are allowed.
Guidelines for boolean labels
- Define the exact condition that makes it true. Don't rely on the model's intuition.
- For moderation, list specific categories: "True if the text contains profanity, threats, or sexually explicit content."
- State the default: "Default to false if uncertain."
Adding 3–5 labeled examples via the Projects API can improve accuracy more than any guideline tweak. Examples show the model exactly what you want — in your own data.
Error Code Reference #
All errors return a JSON body with an error string and an optional code for programmatic handling.
{
"error": "Human-readable description",
"code": "machine_readable_code"
}
| Status | Code | Message | How to fix |
|---|---|---|---|
| 400 | invalid_schema | Schema must be a non-empty array | Ensure schema is an array with at least one label definition. |
| 400 | invalid_label_type | Unsupported label type: "xyz" | Use one of: classification, span, extraction, boolean. |
| 400 | missing_options | Classification labels require an options array | Add an options array to each classification label. |
| 400 | text_too_long | Text exceeds maximum length (32,000 characters) | Split long documents into smaller chunks before annotating. |
| 400 | batch_too_large | Batch requests are limited to 25 texts | Split your batch into groups of ≤25 and send multiple requests. |
| 401 | missing_api_key | API key required | Add the X-API-Key header with your tmk_ key. |
| 401 | invalid_api_key | Invalid or revoked API key | Check the key in your dashboard. Generate a new one if revoked. |
| 429 | rate_limited | Monthly request limit reached | Upgrade your plan, or wait until the next billing period. Check X-RateLimit-* headers. |
| 500 | annotation_timeout | Annotation timed out after 60s | Reduce schema complexity, shorten the text, or retry. Very large schemas with many labels may time out on long texts. |
| 500 | internal_error | An unexpected error occurred | Retry with exponential backoff. If it persists, contact support. |
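The code field is what clients should branch on; messages may change, codes are stable. A sketch of a client-side retry policy (the retryable/non-retryable split is this guide's reading of the table above, not official SDK behavior):

```python
# Transient conditions worth retrying automatically; 4xx validation
# and auth errors will fail the same way on every retry.
RETRYABLE = {"rate_limited", "annotation_timeout", "internal_error"}

def should_retry(status, code):
    """Decide whether a failed request is worth retrying."""
    return status in (429, 500) and code in RETRYABLE

assert should_retry(429, "rate_limited")
assert not should_retry(400, "invalid_schema")
```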
Rate limit headers
Every response includes these headers so you can monitor usage:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Total requests allowed in the current billing period |
| X-RateLimit-Remaining | Requests remaining before you hit the limit |
| X-RateLimit-Reset | Unix timestamp when the limit resets (start of next billing period) |
Retrying on errors
```python
import time
import requests

def annotate_with_retry(payload, max_retries=3):
    for attempt in range(max_retries):
        r = requests.post(
            "https://tagmatic.app/api/annotate",
            headers={"X-API-Key": API_KEY},
            json=payload,
        )
        if r.status_code == 429:
            time.sleep(2 ** attempt)  # exponential backoff
            continue
        r.raise_for_status()
        return r.json()
    raise Exception("Max retries exceeded")
```
Rate Limits by Plan #
Limits are tracked per API key and reset at the start of each billing period. Batch requests count as one request per text item in the batch.
| Plan | Monthly Requests | Batch Size | File Upload | Team Members |
|---|---|---|---|---|
| Free | 500 | Up to 10 | — | 1 |
| Starter | 10,000 | Up to 25 | Up to 50 MB | 3 |
| Pro | 100,000 | Up to 25 | Up to 50 MB | 10 |
| Enterprise | Custom | Custom | Custom | Unlimited |
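Since each text in a batch counts as one request, you can estimate how much of a monthly quota a workload will consume before sending it. A small sketch; the plan numbers are copied from the table above, and the helper is illustrative:

```python
# (monthly requests, max batch size) per plan, from the table above.
PLAN_LIMITS = {"free": (500, 10), "starter": (10_000, 25), "pro": (100_000, 25)}

def estimate_usage(batches, plan):
    """Return (requests consumed, quota remaining) for a list of batches.

    `batches` is a list of text lists; each text counts as one request
    against the monthly quota. Raises if any batch exceeds the plan's
    batch-size limit.
    """
    monthly, max_batch = PLAN_LIMITS[plan]
    for batch in batches:
        if len(batch) > max_batch:
            raise ValueError(f"batch of {len(batch)} exceeds plan limit {max_batch}")
    used = sum(len(batch) for batch in batches)
    return used, monthly - used

used, remaining = estimate_usage([["a"] * 25, ["b"] * 10], "starter")
# 35 requests consumed, 9,965 remaining
```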
Logged-in users can use the annotation playground at tagmatic.app without an API key. Playground requests are counted separately from API key usage.
Handling 429 responses
When you receive a 429, check the X-RateLimit-Reset header for when the limit resets. Use exponential backoff for transient failures:
```python
# Python example
response = requests.post(url, headers=headers, json=body)
if response.status_code == 429:
    reset_at = response.headers.get("X-RateLimit-Reset")
    remaining = response.headers.get("X-RateLimit-Remaining")
    print(f"Rate limited. {remaining} requests left. Resets at {reset_at}")
```
Webhook Setup Guide #
Webhooks let you receive real-time notifications when annotation events occur. Configure them in the Webhooks dashboard or via the API.
Event types
| Event | Trigger |
|---|---|
| annotation.completed | A single annotation request finished successfully |
| batch.completed | A batch annotation job finished (all items processed) |
| batch.failed | A batch job failed — some or all items could not be annotated |
| file.completed | A file upload job (CSV/JSONL) finished processing |
| project.drift_alert | Label distribution in a project shifted significantly |
| review.submitted | A human reviewer submitted a correction |
Registering a webhook
```http
POST /api/webhooks
Authorization: Bearer <jwt_token>

{
  "url": "https://yourapp.com/webhooks/tagmatic",
  "events": ["annotation.completed", "batch.completed"],
  "secret": "your-webhook-signing-secret"
}
```
Payload structure
{
"event": "annotation.completed",
"timestamp": "2024-03-14T10:23:45.000Z",
"data": {
"job_id": "ann_01jq8bx9f3",
"text": "This product is fantastic...",
"annotations": { "sentiment": "positive" },
"processing_ms": 412
}
}
HMAC verification
Every webhook request includes an X-Tagmatic-Signature header. Verify it to confirm the request came from Tagmatic:
```python
import hmac, hashlib

def verify_webhook(payload_bytes, signature_header, secret):
    expected = hmac.new(
        secret.encode(), payload_bytes, hashlib.sha256
    ).hexdigest()
    received = signature_header.replace("sha256=", "")
    return hmac.compare_digest(expected, received)

# In your Flask/FastAPI handler:
sig = request.headers.get("X-Tagmatic-Signature", "")
if not verify_webhook(request.data, sig, WEBHOOK_SECRET):
    return "Forbidden", 403
```
Retry behavior
- Tagmatic retries failed deliveries up to 5 times with exponential backoff (1 min, 5 min, 30 min, 2 hrs, 8 hrs).
- A delivery is considered successful if your endpoint returns a 2xx status within 10 seconds.
- After 5 failed attempts, the webhook is marked as failed and you'll see it in the Deliveries log.
- You can trigger a manual retry from the Webhooks dashboard.
Your webhook handler should be idempotent — retries can deliver the same event multiple times. Use job_id as a deduplication key.
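A minimal in-memory sketch of that deduplication follows. A production handler would use a shared store (Redis, or a database unique constraint) instead of a process-local set, and the function name is ours:

```python
_seen_jobs = set()

def handle_event(event):
    """Process a webhook event at most once, keyed on job_id.

    Returns True if the event was processed, False if it was a
    duplicate delivery (retries can resend the same event).
    """
    job_id = event["data"]["job_id"]
    if job_id in _seen_jobs:
        return False
    _seen_jobs.add(job_id)
    # ... real processing goes here ...
    return True

event = {"event": "annotation.completed", "data": {"job_id": "ann_01jq8bx9f3"}}
assert handle_event(event) is True
assert handle_event(event) is False  # retried delivery is ignored
```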
Python Quickstart #
Call the REST API directly using requests — no SDK required.
```python
import requests

API_KEY = "tmk_your_key_here"
BASE_URL = "https://tagmatic.app/api"

response = requests.post(
    f"{BASE_URL}/annotate",
    headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Your text here",
        "schema": [{
            "name": "sentiment",
            "type": "classification",
            "options": ["positive", "neutral", "negative"]
        }]
    },
)
print(response.json()["annotations"])
```
The full API is documented below. Use any HTTP client in any language. Questions? Email tagmatic@polsia.app.