Developer Documentation

Everything you need to integrate Tagmatic's AI annotation API into your application. All endpoints are available at https://tagmatic.app/api.

Authentication #

Tagmatic uses two authentication methods depending on the context: API keys for server-to-server integration, and JWT tokens for the browser dashboard.

API Keys #

API keys are the standard way to authenticate programmatic access to the annotation API. They're prefixed with tmk_ and are created from your dashboard.

💡
Where to get your API key

Go to Dashboard → API Keys and click Create New Key. Each key is shown only once, so copy it immediately.

Send your API key in the X-API-Key header on every request:

HTTP
POST https://tagmatic.app/api/annotate
X-API-Key: tmk_your_api_key_here
Content-Type: application/json

You can also use the Authorization header as a Bearer token:

HTTP
Authorization: Bearer tmk_your_api_key_here
⚠️
Keep API keys secret

Never expose API keys in client-side code or commit them to version control. Use environment variables.
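One way to follow this advice: read the key from an environment variable and build the required headers once. A minimal sketch (the TAGMATIC_API_KEY variable name and the auth_headers helper are illustrative, not part of the API):

```python
import os

# Read the key from the environment instead of hardcoding it.
API_KEY = os.environ.get("TAGMATIC_API_KEY", "")

def auth_headers(key: str = API_KEY) -> dict:
    """Headers Tagmatic expects on every annotation request."""
    return {"X-API-Key": key, "Content-Type": "application/json"}
```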

JWT Sessions #

JWT tokens authenticate the dashboard and any browser-based tools you build. A token is returned after login and expires after 30 days.

HTTP
POST /api/auth/login
Content-Type: application/json

{
  "email": "you@example.com",
  "password": "your-password"
}

// Response
{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "user": { "id": 42, "email": "you@example.com", "plan": "starter" }
}

For server-to-server integrations, always use an API key — not a JWT token.

Schema Reference #

The schema field defines what annotations to extract. It's an array of label definitions — each with a name, type, and description. Tagmatic supports four annotation types.

📌
Base request shape

All annotation requests follow this structure: POST /api/annotate with text, schema, and optional guidelines.

Classification #

Classify text into one of a fixed set of options. Use for sentiment, topic, priority, intent, or any categorical label.

Request
{
  "text": "This product is absolutely fantastic, exceeded all my expectations!",
  "schema": [
    {
      "name": "sentiment",
      "type": "classification",
      "description": "Overall emotional tone of the review",
      "options": ["positive", "neutral", "negative"]
    },
    {
      "name": "intent",
      "type": "classification",
      "description": "Primary intent of the message",
      "options": ["purchase", "support", "complaint", "general"]
    }
  ],
  "guidelines": "Rate the tone based on explicit language only. Ignore punctuation."
}
Response
{
  "job_id": "ann_01jq8bx9f3",
  "annotations": {
    "sentiment": "positive",
    "intent": "purchase"
  },
  "confidence": { "sentiment": 0.97, "intent": 0.88 },
  "processing_ms": 412
}

Span / NER #

Extract named entities or specific text spans with character offsets. Returns all matching occurrences in the text. Use for names, dates, locations, product mentions, or any substring-level annotation.

Request
{
  "text": "Book a flight from New York to Paris for Alice Johnson on March 5th.",
  "schema": [
    {
      "name": "location",
      "type": "span",
      "description": "Geographic places mentioned in the text",
      "multi": true
    },
    {
      "name": "person",
      "type": "span",
      "description": "Full names of people",
      "multi": true
    }
  ]
}
Response
{
  "job_id": "ann_01jq8c0d2m",
  "annotations": {
    "location": [
      { "text": "New York", "start": 21, "end": 29 },
      { "text": "Paris", "start": 33, "end": 38 }
    ],
    "person": [
      { "text": "Alice Johnson", "start": 43, "end": 56 }
    ]
  },
  "processing_ms": 508
}

The start and end fields are zero-indexed character offsets into the original text string, where end is exclusive (text.slice(start, end)).
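Because end is exclusive, Python's slicing recovers the span directly. A quick check with a made-up sentence (offsets here are hand-counted for this example, not API output):

```python
text = "Fly from Oslo to Lima."

# Zero-indexed, end-exclusive offsets: text[start:end] yields the span text.
span = {"text": "Oslo", "start": 9, "end": 13}
extracted = text[span["start"]:span["end"]]
```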

Extraction #

Extract structured key-value data from unstructured text. The value can be any string. Use for prices, dates, model numbers, addresses, or any free-form field you need to pull out.

Request
{
  "text": "Invoice #INV-2024-089 dated 2024-03-14 for $1,250.00 due in 30 days.",
  "schema": [
    {
      "name": "invoice_number",
      "type": "extraction",
      "description": "The invoice identifier"
    },
    {
      "name": "amount",
      "type": "extraction",
      "description": "Total invoice amount including currency symbol"
    },
    {
      "name": "due_date",
      "type": "extraction",
      "description": "Payment due date derived from the invoice date and terms"
    }
  ]
}
Response
{
  "job_id": "ann_01jq8c4p7k",
  "annotations": {
    "invoice_number": "INV-2024-089",
    "amount": "$1,250.00",
    "due_date": "2024-04-13"
  },
  "processing_ms": 391
}

If a field can't be extracted (not present in the text), the annotation value will be null.
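In Python, a JSON null deserializes to None, so guard extraction fields before using them. A minimal sketch:

```python
annotations = {
    "invoice_number": "INV-2024-089",
    "amount": "$1,250.00",
    "due_date": None,  # field was not present in the text
}

# Keep only the fields the model actually extracted.
found = {k: v for k, v in annotations.items() if v is not None}
```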

Boolean #

Make yes/no decisions about text. Use for spam detection, content moderation, eligibility checks, or any binary condition.

Request
{
  "text": "BUY NOW!!! Earn $5000 weekly from home. Click here for FREE money!!!",
  "schema": [
    {
      "name": "is_spam",
      "type": "boolean",
      "description": "True if the text appears to be spam or unsolicited marketing"
    },
    {
      "name": "contains_pii",
      "type": "boolean",
      "description": "True if the text contains personally identifiable information"
    }
  ]
}
Response
{
  "job_id": "ann_01jq8c7r9n",
  "annotations": {
    "is_spam": true,
    "contains_pii": false
  },
  "confidence": { "is_spam": 0.99, "contains_pii": 0.95 },
  "processing_ms": 302
}

Mixing label types

A single schema can mix all four types. Tagmatic resolves them in one API call.

Request — mixed schema
{
  "text": "Call me at 555-0100. I'm very unhappy with my order #7821.",
  "schema": [
    { "name": "sentiment", "type": "classification", "options": ["positive", "neutral", "negative"] },
    { "name": "order_number", "type": "extraction", "description": "Order number mentioned" },
    { "name": "phone", "type": "span", "description": "Phone numbers in the text" },
    { "name": "needs_callback", "type": "boolean", "description": "Customer explicitly asks to be called back" }
  ]
}

Guidelines Best Practices #

The optional guidelines field is the single highest-leverage input for annotation accuracy. Well-written guidelines reduce ambiguous cases by giving the model a consistent decision framework.

What makes guidelines effective

  • State the decision rule, not the definition. "Mark as positive if the customer expresses satisfaction with the product, outcome, or service" is better than "positive = good feeling".
  • Cover edge cases explicitly. If sarcasm should count as negative, say so. If neutral includes mixed sentiment, say so.
  • Use concrete examples. "For example, 'not bad' should be classified as neutral, not positive."
  • Be specific about scope. "Classify based on the product only — ignore tone about shipping or customer service."
  • Define what to do when unsure. "When in doubt between positive and neutral, prefer neutral."
✓ Good

Classify sentiment based on explicit language about the product. Mark positive if the customer expresses clear satisfaction or recommends the product. Mark negative for complaints, disappointment, or requests for refunds. Mark neutral for factual descriptions, mixed opinions, or sarcasm. Ignore delivery comments.

✗ Bad

Classify if the review is positive, negative, or neutral based on how the customer feels.

✓ Good

Extract the invoice number exactly as it appears in the text (e.g., "INV-001", "Invoice #42"). If multiple invoice numbers appear, extract only the first one. If none is present, return null.

✗ Bad

Extract the invoice number from the text.

Guidelines for span / NER labels

  • Specify whether to include articles and prepositions ("the UK" vs "UK").
  • Clarify ambiguous cases: "Include the country name but not the city."
  • For multi-occurrence spans, state whether overlapping spans are allowed.

Guidelines for boolean labels

  • Define the exact condition that makes it true. Don't rely on the model's intuition.
  • For moderation, list specific categories: "True if the text contains profanity, threats, or sexually explicit content."
  • State the default: "Default to false if uncertain."
🚀
Pro tip: use few-shot examples

Adding 3–5 labeled examples via the Projects API can improve accuracy more than any guideline tweak. Examples show the model exactly what you want — in your own data.

Error Code Reference #

All errors return a JSON body with an error string and an optional code for programmatic handling.

Error shape
{
  "error": "Human-readable description",
  "code": "machine_readable_code"
}
| Status | Code | Message | How to fix |
|---|---|---|---|
| 400 | invalid_schema | Schema must be a non-empty array | Ensure schema is an array with at least one label definition. |
| 400 | invalid_label_type | Unsupported label type: "xyz" | Use one of: classification, span, extraction, boolean. |
| 400 | missing_options | Classification labels require an options array | Add an options array to each classification label. |
| 400 | text_too_long | Text exceeds maximum length (32,000 characters) | Split long documents into smaller chunks before annotating. |
| 400 | batch_too_large | Batch requests are limited to 25 texts | Split your batch into groups of ≤25 and send multiple requests. |
| 401 | missing_api_key | API key required | Add the X-API-Key header with your tmk_ key. |
| 401 | invalid_api_key | Invalid or revoked API key | Check the key in your dashboard. Generate a new one if revoked. |
| 429 | rate_limited | Monthly request limit reached | Upgrade your plan, or wait until the next billing period. Check X-RateLimit-* headers. |
| 500 | annotation_timeout | Annotation timed out after 60s | Reduce schema complexity, shorten the text, or retry. Very large schemas with many labels may time out on long texts. |
| 500 | internal_error | An unexpected error occurred | Retry with exponential backoff. If it persists, contact support. |
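The machine-readable code makes retry decisions straightforward. A minimal dispatch sketch (the RETRYABLE set and should_retry helper are my grouping of these codes, not part of the API):

```python
# Codes where a retry or a wait can plausibly succeed.
RETRYABLE = {"rate_limited", "annotation_timeout", "internal_error"}

def should_retry(error_body: dict) -> bool:
    """Decide from the error code whether retrying could help."""
    return error_body.get("code") in RETRYABLE
```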

Rate limit headers

Every response includes these headers so you can monitor usage:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Total requests allowed in the current billing period |
| X-RateLimit-Remaining | Requests remaining before you hit the limit |
| X-RateLimit-Reset | Unix timestamp when the limit resets (start of next billing period) |

Retrying on errors

Python — retry with backoff
import time

import requests

API_KEY = "tmk_your_key_here"  # load from an environment variable in practice

def annotate_with_retry(payload, max_retries=3):
    for attempt in range(max_retries):
        r = requests.post(
            "https://tagmatic.app/api/annotate",
            headers={"X-API-Key": API_KEY},
            json=payload,
        )
        # Retry on rate limits (429) and transient server errors (5xx)
        if r.status_code == 429 or r.status_code >= 500:
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
            continue
        r.raise_for_status()  # surface 4xx client errors immediately
        return r.json()
    raise RuntimeError("Max retries exceeded")

Rate Limits by Plan #

Limits are tracked per API key and reset at the start of each billing period. Batch requests count as one request per text item in the batch.

| Plan | Monthly Requests | Batch Size | File Upload | Team Members |
|---|---|---|---|---|
| Free | 500 | Up to 10 | Not available | 1 |
| Starter | 10,000 | Up to 25 | Up to 50 MB | 3 |
| Pro | 100,000 | Up to 25 | Up to 50 MB | 10 |
| Enterprise | Custom | Custom | Custom | Unlimited |
💡
Playground mode

Logged-in users can use the annotation panel at tagmatic.app without an API key. Playground requests are counted separately from API key usage.

Handling 429 responses

When you receive a 429, check the X-RateLimit-Reset header for when the limit resets. Use exponential backoff for transient failures:

Reading rate limit headers
# Python example (requests, url, headers, and body defined elsewhere)
response = requests.post(url, headers=headers, json=body)

if response.status_code == 429:
    reset_at = response.headers.get("X-RateLimit-Reset")
    remaining = response.headers.get("X-RateLimit-Remaining")
    print(f"Rate limited. {remaining} requests left. Resets at {reset_at}")

Webhook Setup Guide #

Webhooks let you receive real-time notifications when annotation events occur. Configure them in the Webhooks dashboard or via the API.

Event types

| Event | Trigger |
|---|---|
| annotation.completed | A single annotation request finished successfully |
| batch.completed | A batch annotation job finished (all items processed) |
| batch.failed | A batch job failed — some or all items could not be annotated |
| file.completed | A file upload job (CSV/JSONL) finished processing |
| project.drift_alert | Label distribution in a project shifted significantly |
| review.submitted | A human reviewer submitted a correction |

Registering a webhook

HTTP
POST /api/webhooks
Authorization: Bearer <jwt_token>

{
  "url": "https://yourapp.com/webhooks/tagmatic",
  "events": ["annotation.completed", "batch.completed"],
  "secret": "your-webhook-signing-secret"
}

Payload structure

annotation.completed payload
{
  "event": "annotation.completed",
  "timestamp": "2024-03-14T10:23:45.000Z",
  "data": {
    "job_id": "ann_01jq8bx9f3",
    "text": "This product is fantastic...",
    "annotations": { "sentiment": "positive" },
    "processing_ms": 412
  }
}

HMAC verification

Every webhook request includes an X-Tagmatic-Signature header. Verify it to confirm the request came from Tagmatic:

Python — verify signature
import hmac, hashlib

def verify_webhook(payload_bytes, signature_header, secret):
    expected = hmac.new(
        secret.encode(),
        payload_bytes,
        hashlib.sha256
    ).hexdigest()
    received = signature_header.replace("sha256=", "")
    return hmac.compare_digest(expected, received)

# In your Flask/FastAPI handler:
sig = request.headers.get("X-Tagmatic-Signature", "")
if not verify_webhook(request.data, sig, WEBHOOK_SECRET):
    return "Forbidden", 403

Retry behavior

  • Tagmatic retries failed deliveries up to 5 times with exponential backoff (1 min, 5 min, 30 min, 2 hrs, 8 hrs).
  • A delivery is considered successful if your endpoint returns a 2xx status within 10 seconds.
  • After 5 failed attempts, the webhook is marked as failed and you'll see it in the Deliveries log.
  • You can trigger a manual retry from the Webhooks dashboard.
⚠️
Idempotency

Your webhook handler should be idempotent — retries can deliver the same event multiple times. Use job_id as a deduplication key.
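A minimal idempotent handler can be sketched with an in-memory seen-set; in production, use a durable store such as a database table or Redis. The handle_event function is illustrative:

```python
processed = set()  # swap for a durable store in production

def handle_event(event: dict) -> bool:
    """Process a webhook event once; return False for duplicate deliveries."""
    job_id = event["data"]["job_id"]
    if job_id in processed:
        return False  # a retry delivered the same event again; skip it
    processed.add(job_id)
    # ... your actual processing goes here ...
    return True
```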

Python Quickstart #

Call the REST API directly using requests — no SDK required.

Python — requests
import requests

API_KEY = "tmk_your_key_here"
BASE_URL = "https://tagmatic.app/api"

response = requests.post(
    f"{BASE_URL}/annotate",
    headers={"X-API-Key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Your text here",
        "schema": [{
            "name": "sentiment",
            "type": "classification",
            "options": ["positive", "neutral", "negative"]
        }]
    }
)
print(response.json()["annotations"])
📦
REST API Reference

The full API is documented below. Use any HTTP client in any language. Questions? Email tagmatic@polsia.app.