Developer Documentation

Everything you need to integrate Tagmatic's AI annotation API into your application. All endpoints are available at https://tagmatic.app/api.

Authentication #

Tagmatic uses two authentication methods depending on the context: API keys for server-to-server integration, and JWT tokens for the browser dashboard.

API Keys #

API keys are the standard way to authenticate programmatic access to the annotation API. They're prefixed with tmk_ and are created from your dashboard.

💡
Where to get your API key

Go to Dashboard → API Keys and click Create New Key. Each key is shown only once, so copy it immediately.

Send your API key in the X-API-Key header on every request:

HTTP
POST https://tagmatic.app/api/annotate
X-API-Key: tmk_your_api_key_here
Content-Type: application/json

You can also use the Authorization header as a Bearer token:

HTTP
Authorization: Bearer tmk_your_api_key_here
⚠️
Keep API keys secret

Never expose API keys in client-side code or commit them to version control. Use environment variables.
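One way to follow this advice: read the key from an environment variable and build the required headers once. A minimal sketch (the TAGMATIC_API_KEY variable name and the auth_headers helper are illustrative, not part of the API):

```python
import os

# Read the key from the environment instead of hardcoding it.
API_KEY = os.environ.get("TAGMATIC_API_KEY", "")

def auth_headers(key: str = API_KEY) -> dict:
    """Headers Tagmatic expects on every annotation request."""
    return {"X-API-Key": key, "Content-Type": "application/json"}
```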

JWT Sessions #

JWT tokens authenticate the dashboard and any browser-based tools you build. A token is returned after login and expires after 30 days.

HTTP
POST /api/auth/login
Content-Type: application/json

{
  "email": "you@example.com",
  "password": "your-password"
}

// Response
{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "user": { "id": 42, "email": "you@example.com", "plan": "starter" }
}

For server-to-server integrations, always use an API key — not a JWT token.

Schema Reference #

The schema field defines what annotations to extract. It's an array of label definitions — each with a name, type, and description. Tagmatic supports four annotation types.

📌
Base request shape

All annotation requests follow this structure: POST /api/annotate with text, schema, and optional guidelines.

Classification #

Classify text into one of a fixed set of options. Use for sentiment, topic, priority, intent, or any categorical label.

Request
{
  "text": "This product is absolutely fantastic, exceeded all my expectations!",
  "schema": [
    {
      "name": "sentiment",
      "type": "classification",
      "description": "Overall emotional tone of the review",
      "options": ["positive", "neutral", "negative"]
    },
    {
      "name": "intent",
      "type": "classification",
      "description": "Primary intent of the message",
      "options": ["purchase", "support", "complaint", "general"]
    }
  ],
  "guidelines": "Rate the tone based on explicit language only. Ignore punctuation."
}
Response
{
  "job_id": "ann_01jq8bx9f3",
  "annotations": {
    "sentiment": "positive",
    "intent": "purchase"
  },
  "confidence": { "sentiment": 0.97, "intent": 0.88 },
  "processing_ms": 412
}

Span / NER #

Extract named entities or specific text spans with character offsets. Returns all matching occurrences in the text. Use for names, dates, locations, product mentions, or any substring-level annotation.

Request
{
  "text": "Book a flight from New York to Paris for Alice Johnson on March 5th.",
  "schema": [
    {
      "name": "location",
      "type": "span",
      "description": "Geographic places mentioned in the text",
      "multi": true
    },
    {
      "name": "person",
      "type": "span",
      "description": "Full names of people",
      "multi": true
    }
  ]
}
Response
{
  "job_id": "ann_01jq8c0d2m",
  "annotations": {
    "location": [
      { "text": "New York", "start": 21, "end": 29 },
      { "text": "Paris", "start": 33, "end": 38 }
    ],
    "person": [
      { "text": "Alice Johnson", "start": 43, "end": 56 }
    ]
  },
  "processing_ms": 508
}

The start and end fields are zero-indexed character offsets into the original text string, where end is exclusive (text.slice(start, end)).
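Because end is exclusive, Python's slicing recovers the span directly. A quick check with a made-up sentence (offsets here are hand-counted for this example, not API output):

```python
text = "Fly from Oslo to Lima."

# Zero-indexed, end-exclusive offsets: text[start:end] yields the span text.
span = {"text": "Oslo", "start": 9, "end": 13}
extracted = text[span["start"]:span["end"]]
```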

Extraction #

Extract structured key-value data from unstructured text. The value can be any string. Use for prices, dates, model numbers, addresses, or any free-form field you need to pull out.

Request
{
  "text": "Invoice #INV-2024-089 dated 2024-03-14 for $1,250.00 due in 30 days.",
  "schema": [
    {
      "name": "invoice_number",
      "type": "extraction",
      "description": "The invoice identifier"
    },
    {
      "name": "amount",
      "type": "extraction",
      "description": "Total invoice amount including currency symbol"
    },
    {
      "name": "due_date",
      "type": "extraction",
      "description": "Payment due date derived from the invoice date and terms"
    }
  ]
}
Response
{
  "job_id": "ann_01jq8c4p7k",
  "annotations": {
    "invoice_number": "INV-2024-089",
    "amount": "$1,250.00",
    "due_date": "2024-04-13"
  },
  "processing_ms": 391
}

If a field can't be extracted (not present in the text), the annotation value will be null.
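In Python, a JSON null deserializes to None, so guard extraction fields before using them. A minimal sketch:

```python
annotations = {
    "invoice_number": "INV-2024-089",
    "amount": "$1,250.00",
    "due_date": None,  # field was not present in the text
}

# Keep only the fields the model actually extracted.
found = {k: v for k, v in annotations.items() if v is not None}
```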

Boolean #

Make yes/no decisions about text. Use for spam detection, content moderation, eligibility checks, or any binary condition.

Request
{
  "text": "BUY NOW!!! Earn $5000 weekly from home. Click here for FREE money!!!",
  "schema": [
    {
      "name": "is_spam",
      "type": "boolean",
      "description": "True if the text appears to be spam or unsolicited marketing"
    },
    {
      "name": "contains_pii",
      "type": "boolean",
      "description": "True if the text contains personally identifiable information"
    }
  ]
}
Response
{
  "job_id": "ann_01jq8c7r9n",
  "annotations": {
    "is_spam": true,
    "contains_pii": false
  },
  "confidence": { "is_spam": 0.99, "contains_pii": 0.95 },
  "processing_ms": 302
}

Mixing label types

A single schema can mix all four types. Tagmatic resolves them in one API call.

Request — mixed schema
{
  "text": "Call me at 555-0100. I'm very unhappy with my order #7821.",
  "schema": [
    { "name": "sentiment", "type": "classification", "options": ["positive", "neutral", "negative"] },
    { "name": "order_number", "type": "extraction", "description": "Order number mentioned" },
    { "name": "phone", "type": "span", "description": "Phone numbers in the text" },
    { "name": "needs_callback", "type": "boolean", "description": "Customer explicitly asks to be called back" }
  ]
}

Guidelines Best Practices #

The optional guidelines field is the single highest-leverage input for annotation accuracy. Well-written guidelines reduce ambiguous cases by giving the model a consistent decision framework.

What makes guidelines effective

  • State the decision rule, not the definition. "Mark as positive if the customer expresses satisfaction with the product, outcome, or service" is better than "positive = good feeling".
  • Cover edge cases explicitly. If sarcasm should count as negative, say so. If neutral includes mixed sentiment, say so.
  • Use concrete examples. "For example, 'not bad' should be classified as neutral, not positive."
  • Be specific about scope. "Classify based on the product only — ignore tone about shipping or customer service."
  • Define what to do when unsure. "When in doubt between positive and neutral, prefer neutral."
✓ Good

Classify sentiment based on explicit language about the product. Mark positive if the customer expresses clear satisfaction or recommends the product. Mark negative for complaints, disappointment, or requests for refunds. Mark neutral for factual descriptions, mixed opinions, or sarcasm. Ignore delivery comments.

✗ Bad

Classify if the review is positive, negative, or neutral based on how the customer feels.

✓ Good

Extract the invoice number exactly as it appears in the text (e.g., "INV-001", "Invoice #42"). If multiple invoice numbers appear, extract only the first one. If none is present, return null.

✗ Bad

Extract the invoice number from the text.

Guidelines for span / NER labels

  • Specify whether to include articles and prepositions ("the UK" vs "UK").
  • Clarify ambiguous cases: "Include the country name but not the city."
  • For multi-occurrence spans, state whether overlapping spans are allowed.

Guidelines for boolean labels

  • Define the exact condition that makes it true. Don't rely on the model's intuition.
  • For moderation, list specific categories: "True if the text contains profanity, threats, or sexually explicit content."
  • State the default: "Default to false if uncertain."
🚀
Pro tip: use few-shot examples

Adding 3–5 labeled examples via the Projects API can improve accuracy more than any guideline tweak. Examples show the model exactly what you want — in your own data.

Error Code Reference #

All errors return a JSON body with an error string and an optional code for programmatic handling.

Error shape
{
  "error": "Human-readable description",
  "code": "machine_readable_code"
}
| Status | Code | Message | How to fix |
|---|---|---|---|
| 400 | invalid_schema | Schema must be a non-empty array | Ensure schema is an array with at least one label definition. |
| 400 | invalid_label_type | Unsupported label type: "xyz" | Use one of: classification, span, extraction, boolean. |
| 400 | missing_options | Classification labels require an options array | Add an options array to each classification label. |
| 400 | text_too_long | Text exceeds maximum length (32,000 characters) | Split long documents into smaller chunks before annotating. |
| 400 | batch_too_large | Batch requests are limited to 25 texts | Split your batch into groups of ≤25 and send multiple requests. |
| 401 | missing_api_key | API key required | Add the X-API-Key header with your tmk_ key. |
| 401 | invalid_api_key | Invalid or revoked API key | Check the key in your dashboard. Generate a new one if revoked. |
| 429 | rate_limited | Monthly request limit reached | Upgrade your plan, or wait until the next billing period. Check X-RateLimit-* headers. |
| 500 | annotation_timeout | Annotation timed out after 60s | Reduce schema complexity, shorten the text, or retry. Very large schemas with many labels may time out on long texts. |
| 500 | internal_error | An unexpected error occurred | Retry with exponential backoff. If it persists, contact support. |
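The machine-readable code makes retry decisions straightforward. A minimal dispatch sketch (the RETRYABLE set and should_retry helper are my grouping of these codes, not part of the API):

```python
# Codes where a retry or a wait can plausibly succeed.
RETRYABLE = {"rate_limited", "annotation_timeout", "internal_error"}

def should_retry(error_body: dict) -> bool:
    """Decide from the error code whether retrying could help."""
    return error_body.get("code") in RETRYABLE
```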

Rate limit headers

Every response includes these headers so you can monitor usage:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Total requests allowed in the current billing period |
| X-RateLimit-Remaining | Requests remaining before you hit the limit |
| X-RateLimit-Reset | Unix timestamp when the limit resets (start of next billing period) |

Retrying on errors

Python — retry with backoff
import time

import requests

API_KEY = "tmk_your_key_here"  # load from an environment variable in practice

def annotate_with_retry(payload, max_retries=3):
    for attempt in range(max_retries):
        r = requests.post(
            "https://tagmatic.app/api/annotate",
            headers={"X-API-Key": API_KEY},
            json=payload,
        )
        # Retry on rate limits (429) and transient server errors (5xx)
        if r.status_code == 429 or r.status_code >= 500:
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
            continue
        r.raise_for_status()  # surface 4xx client errors immediately
        return r.json()
    raise RuntimeError("Max retries exceeded")

Rate Limits by Plan #

Limits are tracked per API key and reset at the start of each billing period. Batch requests count as one request per text item in the batch.

| Plan | Monthly Requests | Batch Size | File Upload | Team Members |
|---|---|---|---|---|
| Free | 500 | Up to 10 | Not available | 1 |
| Starter | 10,000 | Up to 25 | Up to 50 MB | 3 |
| Pro | 100,000 | Up to 25 | Up to 50 MB | 10 |
| Enterprise | Custom | Custom | Custom | Unlimited |
💡
Playground mode

Logged-in users can use the annotation panel at tagmatic.app without an API key. Playground requests are counted separately from API key usage.

Handling 429 responses

When you receive a 429, check the X-RateLimit-Reset header for when the limit resets. Use exponential backoff for transient failures:

Reading rate limit headers
# Python example (requests, url, headers, and body defined elsewhere)
response = requests.post(url, headers=headers, json=body)

if response.status_code == 429:
    reset_at = response.headers.get("X-RateLimit-Reset")
    remaining = response.headers.get("X-RateLimit-Remaining")
    print(f"Rate limited. {remaining} requests left. Resets at {reset_at}")

Webhook Setup Guide #

Webhooks let you receive real-time notifications when annotation events occur. Configure them in the Webhooks dashboard or via the API.

Event types

| Event | Trigger |
|---|---|
| annotation.completed | A single annotation request finished successfully |
| batch.completed | A batch annotation job finished (all items processed) |
| batch.failed | A batch job failed — some or all items could not be annotated |
| file.completed | A file upload job (CSV/JSONL) finished processing |
| project.drift_alert | Label distribution in a project shifted significantly |
| review.submitted | A human reviewer submitted a correction |

Registering a webhook

HTTP
POST /api/webhooks
Authorization: Bearer <jwt_token>

{
  "url": "https://yourapp.com/webhooks/tagmatic",
  "events": ["annotation.completed", "batch.completed"],
  "secret": "your-webhook-signing-secret"
}

Payload structure

annotation.completed payload
{
  "event": "annotation.completed",
  "timestamp": "2024-03-14T10:23:45.000Z",
  "data": {
    "job_id": "ann_01jq8bx9f3",
    "text": "This product is fantastic...",
    "annotations": { "sentiment": "positive" },
    "processing_ms": 412
  }
}

HMAC verification

Every webhook request includes an X-Tagmatic-Signature header. Verify it to confirm the request came from Tagmatic:

Python — verify signature
import hmac, hashlib

def verify_webhook(payload_bytes, signature_header, secret):
    expected = hmac.new(
        secret.encode(),
        payload_bytes,
        hashlib.sha256
    ).hexdigest()
    received = signature_header.replace("sha256=", "")
    return hmac.compare_digest(expected, received)

# In your Flask/FastAPI handler:
sig = request.headers.get("X-Tagmatic-Signature", "")
if not verify_webhook(request.data, sig, WEBHOOK_SECRET):
    return "Forbidden", 403

Retry behavior

  • Tagmatic retries failed deliveries up to 5 times with exponential backoff (1 min, 5 min, 30 min, 2 hrs, 8 hrs).
  • A delivery is considered successful if your endpoint returns a 2xx status within 10 seconds.
  • After 5 failed attempts, the webhook is marked as failed and you'll see it in the Deliveries log.
  • You can trigger a manual retry from the Webhooks dashboard.
⚠️
Idempotency

Your webhook handler should be idempotent — retries can deliver the same event multiple times. Use job_id as a deduplication key.
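A minimal idempotent handler can be sketched with an in-memory seen-set; in production, use a durable store such as a database table or Redis. The handle_event function is illustrative:

```python
processed = set()  # swap for a durable store in production

def handle_event(event: dict) -> bool:
    """Process a webhook event once; return False for duplicate deliveries."""
    job_id = event["data"]["job_id"]
    if job_id in processed:
        return False  # a retry delivered the same event again; skip it
    processed.add(job_id)
    # ... your actual processing goes here ...
    return True
```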

Python Quickstart #

Call the REST API directly using requests — no SDK required.

Python — requests
import requests

API_KEY = "tmk_your_key_here"
BASE_URL = "https://tagmatic.app/api"

response = requests.post(
    f"{BASE_URL}/annotate",
    headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Your text here",
        "schema": [{
            "name": "sentiment",
            "type": "classification",
            "options": ["positive", "neutral", "negative"]
        }]
    }
)
print(response.json()["annotations"])
📦
REST API Reference

The full API is documented below. Use any HTTP client in any language. Questions? Email tagmatic@polsia.app.