PDF Tools & API for AI Agents — PDF MCP Server for LLMs

Why AI agents need PDF tools

LLMs are great at reasoning over text — but most real-world knowledge is locked in PDFs: invoices, contracts, reports, manuals and scanned forms. Raw PDF bytes are not something a model can read, and dumping a whole document into a prompt wastes context and money.

PDF tools for AI agents solve this by turning messy documents into structured, LLM-ready text your agent can actually use. PDF4 handles three jobs that show up in almost every agentic app:

RAG ingestion

Convert PDFs to Markdown or JSON, chunk them, and embed for retrieval. Headings, lists and tables are preserved so your retrieval stays accurate.

Tool-use / function calling

Expose PDF operations as callable tools. The agent decides when to OCR, extract or summarize a document, then acts on the result.

Document automation

Build invoice, contract and knowledge-base agents that pull fields, route documents and generate outputs end to end.

Call PDF4 from your agent: the MCP server

The fastest way to wire PDFs into an LLM app is the PDF MCP server. The Model Context Protocol is the open standard that lets AI clients discover and call external tools, and PDF4 implements it — so Claude, Cursor, Windsurf and any MCP-compatible agent can list and invoke PDF4 tools directly, no custom integration code required.

Connect over the SSE transport and authenticate with an MCP API key as a Bearer token. The agent sees PDF operations as native tools it can call during a task.

Connect an MCP client (config snippet)

# Add the PDF4 MCP server to your agent / IDE
{
  "mcpServers": {
    "pdf4": {
      "url": "https://pdf4.net/mcp/sse",
      "headers": { "Authorization": "Bearer YOUR_MCP_API_KEY" }
    }
  }
}
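
If you would rather not paste the key into a config file by hand, the same structure can be generated from code. The sketch below is only an illustration: the `mcpServers`, `url` and `headers` keys and the endpoint URL are copied from the snippet above, while the function names and the `PDF4_MCP_API_KEY` environment variable are hypothetical. Where the resulting JSON has to live, and whether a given client accepts exactly these keys, depends on the MCP client you use.

```python
import json
from typing import Any, Dict


def build_mcp_config(api_key: str, url: str = "https://pdf4.net/mcp/sse") -> Dict[str, Any]:
    """Return an MCP client config dict mirroring the snippet above."""
    if not api_key:
        raise ValueError("MCP API key is empty")
    return {
        "mcpServers": {
            "pdf4": {
                "url": url,
                # The MCP API key is sent as a Bearer token.
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def render_mcp_config(api_key: str) -> str:
    """Serialize the config as pretty-printed JSON, ready to write to a file."""
    return json.dumps(build_mcp_config(api_key), indent=2)
```

One possible use is `render_mcp_config(os.environ["PDF4_MCP_API_KEY"])`, which keeps the key out of version control.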

Prefer plain HTTP? Every tool is also a REST endpoint. The PDF API docs list them all and include an MCP section covering /mcp/sse.

PDF to Markdown & JSON for LLMs and RAG

For retrieval-augmented generation you want LLM-ready text, not a wall of broken characters. PDF4 produces layout-aware output that keeps document structure intact:

PDF to Markdown — clean Markdown with headings, lists, tables and links, ideal for chunking and embedding.
PDF to JSON — content and structure as JSON your pipeline can iterate over programmatically.
PDF to Text — fast plain-text extraction when you just need the words.
PDF to CSV — pull tabular data straight into rows for analysis or function calls.

Convert a PDF to Markdown (curl)

# Returns LLM-ready Markdown, ready to chunk + embed for RAG
# (document.pdf is a placeholder for the path to your own file)
curl -X POST "https://pdf4.net/api/pdf/to-markdown" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"
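
Once you have Markdown back, the next RAG step is chunking. The sketch below shows one simple approach, assuming the output uses ATX-style headings (`#`, `##`, ...): split at headings, keep the heading as metadata, and cap each chunk at a character budget. The function name and the `max_chars` default are illustrative choices, not part of PDF4.

```python
import re
from typing import Dict, List

# Matches ATX headings such as "# Title" or "### Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(markdown: str, max_chars: int = 1200) -> List[Dict[str, str]]:
    """Split Markdown into heading-scoped chunks of roughly max_chars each."""
    chunks: List[Dict[str, str]] = []
    heading = ""
    buffer: List[str] = []

    def flush() -> None:
        # Emit the buffered lines as one chunk under the current heading.
        text = "\n".join(buffer).strip()
        if text:
            chunks.append({"heading": heading, "text": text})
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before switching heading
            heading = match.group(2).strip()
            continue  # the heading is kept as metadata, not body text
        # Start a new chunk when adding this line would exceed the budget.
        if buffer and sum(len(item) + 1 for item in buffer) + len(line) > max_chars:
            flush()
        buffer.append(line)
    flush()
    return chunks
```

Each chunk carries its section heading, which can be stored alongside the embedding to give retrieved passages some context. Note that this simple version would also treat a `#` comment line inside a fenced code block as a heading.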

OCR, extraction, summarization & translation for agents

Beyond conversion, PDF4 gives agents the AI document operations they reach for most:

OCR PDF — make scanned or image-only PDFs searchable so the text layer is usable by your model.
Extract Data from PDF — pull structured fields (invoice numbers, totals, dates, parties) as JSON, with an optional schema.
Summarize PDF — condense long documents server-side so your agent stays inside its context budget.
Translate PDF — translate documents for multilingual ingestion and global workflows.
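
When you pass a schema to extraction, it helps to check the response against what you asked for before the agent acts on it. The sketch below assumes, purely for illustration, that the schema is a JSON object mapping field names to short descriptions and that the response is a flat JSON object keyed by those names; the real shapes are defined in the API docs. Both helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def build_schema(fields: Dict[str, str]) -> str:
    """Serialize the wanted fields as a JSON string (schema format is assumed)."""
    return json.dumps(fields, sort_keys=True)


def missing_fields(extracted: Dict[str, Any], fields: Dict[str, str]) -> List[str]:
    """Return requested field names that are absent or empty in the result."""
    empty_values = (None, "", [], {})
    return [name for name in fields if extracted.get(name) in empty_values]
```

An agent can use the list returned by `missing_fields` to decide whether to retry, run OCR first, or ask a human to fill the gap.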

Define a PDF tool for function calling (Python)

Here's a realistic pattern: register PDF4 as a tool your agent can call, then run it. The function-calling schema below is illustrative; the HTTP call uses PDF4's real endpoint shape (Bearer token + multipart file upload).

# 1) Tool/function definition you hand to the LLM
extract_pdf_data = {
    "name": "extract_pdf_data",
    "description": "Extract structured fields from a PDF (invoices, contracts, forms) as JSON.",
    "input_schema": {
        "type": "object",
        "properties": {
            "file_path": {"type": "string", "description": "Local path to the PDF"},
            "schema": {"type": "string", "description": "Optional JSON of fields to extract"}
        },
        "required": ["file_path"]
    }
}

# 2) When the agent calls the tool, hit the PDF4 API
import requests

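def extract_pdf_data_impl(file_path, schema=None):
    """Upload a PDF to the extract-data endpoint and return the parsed JSON response."""
    with open(file_path, "rb") as f:
        resp = requests.post(
            "https://pdf4.net/api/pdf/extract-data",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            files={"file": f},  # multipart upload under the "file" form field
            data={"schema": schema} if schema else None,  # optional JSON string of fields
            timeout=60,  # avoid stalling the agent loop on a hung request
        )
    resp.raise_for_status()  # surface HTTP 4xx/5xx responses as exceptions
    return resp.json()   # structured data, ready to return to the model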
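
When the model emits a tool call such as `extract_pdf_data`, something has to route it to the matching implementation and hand the result back. The sketch below shows one minimal way to do that with a name-to-function registry. The registry, the function names and the `{"result": ...}` / `{"error": ...}` envelope are all illustrative assumptions; your agent framework may already provide its own dispatch.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry: tool name -> Python callable that implements it.
TOOLS: Dict[str, Callable[..., Any]] = {}


def register_tool(name: str, func: Callable[..., Any]) -> None:
    """Make a Python function callable by the tool name the model sees."""
    TOOLS[name] = func


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the requested tool and return a JSON string to send back to the model."""
    func = TOOLS.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(arguments_json or "{}")
        result = func(**arguments)
    except Exception as exc:
        # Report the failure to the model instead of crashing the agent loop.
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})
```

With the code above, you would call `register_tool("extract_pdf_data", extract_pdf_data_impl)` once at startup. Returning errors as data lets the model see what went wrong and try a different argument or tool.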

Long-running operations follow the standard async flow: the API returns an operation_id; poll /api/pdf/status/{operation_id} and fetch the result from /api/pdf/download/{operation_id}. See the API docs for exact request/response shapes.
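
The polling half of that flow can be sketched as a small loop. The status fetcher is injected as a callable so the loop itself stays independent of HTTP details. The `status` field and its `completed` / `failed` values are assumptions made for illustration, not documented PDF4 response shapes; check the API docs and adjust the names accordingly.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    operation_id: str,
    get_status: Callable[[str], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_status(operation_id) until the operation finishes or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(operation_id)
        state = status.get("status")  # assumed field name
        if state == "completed":  # assumed terminal value
            return status
        if state == "failed":  # assumed terminal value
            raise RuntimeError(f"operation {operation_id} failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {operation_id} did not finish in {timeout}s")
        sleep(interval)
```

In practice `get_status` would wrap a GET request to /api/pdf/status/{operation_id} with the Bearer header, and a successful return would be followed by a request to /api/pdf/download/{operation_id}.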

Use cases for LLM & agent builders

Invoice & document agents

Drop in a PDF invoice, extract line items and totals as JSON, and let the agent reconcile or post to your ledger.

Knowledge-base ingestion

Batch-convert PDFs to Markdown, chunk and embed them, and power a RAG assistant grounded in your real docs.

Contract & report analysis

Summarize long contracts, extract key clauses and dates, and surface answers with citations back to the source.

Scanned-document workflows

OCR image-only PDFs first, then extract or summarize — turning paper scans into structured agent input.

Frequently asked questions

What are PDF tools for AI agents?

PDF tools for AI agents are document operations — convert, OCR, extract, parse, summarize and translate — exposed as an API or MCP server so an LLM agent can call them as functions. Instead of stuffing raw PDF bytes into a prompt, your agent sends a PDF to PDF4 and gets back clean Markdown, JSON or plain text it can reason over, embed for RAG, or feed into the next tool.

What is a PDF MCP server?

A PDF MCP server implements the Model Context Protocol so AI clients like Claude, Cursor and Windsurf can discover and call PDF tools directly. PDF4 ships an MCP server at /mcp/sse: connect with your MCP API key as a Bearer token and the agent can list available PDF tools and invoke them — converting, extracting and summarizing PDFs without any custom glue code.

How do I parse a PDF for an LLM or RAG pipeline?

Send the PDF to PDF4's PDF to Markdown or PDF to JSON endpoint. You get layout-aware, LLM-ready text with headings, lists and tables preserved — ideal for chunking and embedding into a vector store for retrieval-augmented generation. For scanned documents, run OCR first so the text layer is searchable.

Can AI agents extract structured data from PDFs?

Yes. The Extract Data endpoint uses AI to pull structured fields — invoice numbers, totals, dates, parties, line items — from PDFs and return them as JSON. Pass an optional schema describing the fields you want, or let it auto-detect common documents like invoices, contracts and forms.

What is the best AI for summarizing a PDF?

PDF4's summarize endpoint gives agents a one-call way to condense a long PDF into a concise summary. Because it runs server-side and returns plain text, your agent stays inside its context budget instead of loading the whole document — useful for triage, knowledge-base ingestion and document automation.

Do I authenticate the PDF API with an API key?

Yes. REST endpoints use a Bearer token (JWT) in the Authorization header and work on the free plan; the MCP server uses an MCP API key as a Bearer token and requires an active paid plan (see pdf4.net/pricing). Create an account, pick a plan if you want MCP, and point your agent or HTTP client at the documented endpoints.

Build your PDF-powered AI agent

Get a free REST API key — or pick a paid plan to connect the MCP server — and start parsing PDFs into Markdown and JSON your LLM can use.
