PDF API · MCP Server · LLM-ready output

PDF Tools & API for AI Agents

Give your AI agents a reliable way to read documents. PDF4 is the PDF API and MCP server that LLM apps call to convert, OCR, extract, parse and summarize PDFs — returning clean Markdown and JSON built for RAG and tool-use.

Markdown & JSON output · MCP server for Claude, Cursor & Windsurf · OCR for scanned docs · Bearer-token auth

Why AI agents need PDF tools

LLMs are great at reasoning over text — but most real-world knowledge is locked in PDFs: invoices, contracts, reports, manuals and scanned forms. Raw PDF bytes are not something a model can read, and dumping a whole document into a prompt wastes context and money.

PDF tools for AI agents solve this by turning messy documents into structured, LLM-ready text your agent can actually use. PDF4 handles three jobs that show up in almost every agentic app:

RAG ingestion

Convert PDFs to Markdown or JSON, chunk them, and embed for retrieval. Headings, lists and tables are preserved so your retrieval stays accurate.

Tool-use / function calling

Expose PDF operations as callable tools. The agent decides when to OCR, extract or summarize a document, then acts on the result.

Document automation

Build invoice, contract and knowledge-base agents that pull fields, route documents and generate outputs end to end.

Call PDF4 from your agent: the MCP server

The fastest way to wire PDFs into an LLM app is the PDF MCP server. The Model Context Protocol is the open standard that lets AI clients discover and call external tools, and PDF4 implements it — so Claude, Cursor, Windsurf and any MCP-compatible agent can list and invoke PDF4 tools directly, no custom integration code required.

Connect over the SSE transport and authenticate with an MCP API key as a Bearer token. The agent sees PDF operations as native tools it can call during a task.

Connect an MCP client (config snippet)
# Add the PDF4 MCP server to your agent / IDE
{
  "mcpServers": {
    "pdf4": {
      "url": "https://pdf4.net/mcp/sse",
      "headers": {
        "Authorization": "Bearer YOUR_MCP_API_KEY"
      }
    }
  }
}
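If you would rather not paste the key into a config file by hand, the same structure can be generated from an environment variable. This is a minimal sketch: the helper names and the `PDF4_MCP_API_KEY` variable are hypothetical, and where the resulting JSON has to be saved depends on your MCP client.

```python
import json
import os


def build_mcp_config(api_key: str, url: str = "https://pdf4.net/mcp/sse") -> dict:
    """Return an mcpServers entry with the same shape as the config snippet."""
    if not api_key:
        raise ValueError("an MCP API key is required")
    return {
        "mcpServers": {
            "pdf4": {
                "url": url,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def config_from_env(env=os.environ) -> str:
    """Read the key from the environment and return the config as JSON text.

    PDF4_MCP_API_KEY is an assumed variable name, not one PDF4 documents.
    """
    return json.dumps(build_mcp_config(env.get("PDF4_MCP_API_KEY", "")), indent=2)
```

Calling `print(config_from_env())` with the variable set prints JSON you can paste into your client's MCP settings.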

Prefer plain HTTP? Every tool is also a REST endpoint — see the PDF API docs for the full list and the MCP section under /mcp/sse.

PDF to Markdown & JSON for LLMs and RAG

For retrieval-augmented generation you want LLM-ready text, not a wall of broken characters. PDF4 produces layout-aware output that keeps document structure intact, with headings, lists and tables preserved.

Convert a PDF to Markdown (curl)
# Returns LLM-ready Markdown, ready to chunk + embed for RAG
curl -X POST "https://pdf4.net/api/pdf/to-markdown" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@YOUR_FILE.pdf"
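Once you have the Markdown, the "chunk" step can be as simple as splitting on headings. The sketch below is one possible approach, not PDF4's own chunker: `chunk_markdown` and its `max_chars` parameter are hypothetical, and it deliberately ignores edge cases such as `#` lines inside fenced code or tables that straddle a size boundary.

```python
import re

# Matches ATX-style Markdown headings such as "# Title" or "### Section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(markdown: str, max_chars: int = 1500) -> list[dict]:
    """Split Markdown into chunks that start at headings and stay near max_chars."""
    chunks = []
    heading = ""   # most recent heading text, attached to every chunk under it
    buffer = []    # lines collected for the chunk currently being built

    def flush():
        text = "\n".join(buffer).strip()
        if text:
            chunks.append({"heading": heading, "text": text})
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before starting a new one
            heading = match.group(2).strip()
            buffer.append(line)
            continue
        # Start a new chunk when adding this line would exceed the size budget.
        if buffer and sum(len(l) + 1 for l in buffer) + len(line) > max_chars:
            flush()
        buffer.append(line)
    flush()
    return chunks
```

Each chunk keeps the heading it falls under, which you can store as metadata next to the embedding so retrieved passages can be traced back to their section.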

OCR, extraction, summarization & translation for agents

Beyond conversion, PDF4 gives agents the AI document operations they reach for most: OCR, data extraction, summarization and translation.

Define a PDF tool for function calling (Python)

Here's a realistic pattern: register PDF4 as a tool your agent can call, then run it. The function-calling schema below is illustrative; the HTTP call uses PDF4's real endpoint shape (Bearer token + multipart file upload).

# 1) Tool/function definition you hand to the LLM
extract_pdf_data = {
    "name": "extract_pdf_data",
    "description": "Extract structured fields from a PDF (invoices, contracts, forms) as JSON.",
    "input_schema": {
        "type": "object",
        "properties": {
            "file_path": {"type": "string", "description": "Local path to the PDF"},
            "schema": {"type": "string", "description": "Optional JSON of fields to extract"}
        },
        "required": ["file_path"]
    }
}

# 2) When the agent calls the tool, hit the PDF4 API
import requests

def extract_pdf_data_impl(file_path, schema=None):
    with open(file_path, "rb") as f:
        resp = requests.post(
            "https://pdf4.net/api/pdf/extract-data",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            files={"file": f},
            data={"schema": schema} if schema else None,
        )
    resp.raise_for_status()
    return resp.json()  # structured data, ready to return to the model
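The remaining piece is the glue between the model's tool call and the implementation. A minimal, vendor-neutral sketch follows; `dispatch_tool_call` and the registry dictionary are hypothetical, and how the tool name and arguments arrive depends on the LLM SDK you use.

```python
import json
from typing import Any, Callable


def dispatch_tool_call(
    registry: dict[str, Callable[..., Any]],
    name: str,
    arguments: dict[str, Any],
) -> str:
    """Run the implementation registered under `name`; return JSON text for the model."""
    impl = registry.get(name)
    if impl is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = impl(**arguments)
    except Exception as exc:
        # Report the failure to the model instead of crashing the agent loop.
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    return json.dumps({"result": result}, default=str)


# Example wiring, assuming the implementation function from the example above:
# registry = {"extract_pdf_data": extract_pdf_data_impl}
# reply = dispatch_tool_call(registry, "extract_pdf_data", {"file_path": "invoice.pdf"})
```

Returning errors as data lets the model decide whether to retry, try OCR first, or ask the user for a different file.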

Long-running operations follow the standard async flow: the API returns an operation_id; poll /api/pdf/status/{operation_id} and fetch the result from /api/pdf/download/{operation_id}. See the API docs for exact request/response shapes.
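The polling half of that flow can be sketched as a small loop. The status response shape is not given here, so the sketch takes the "is it done?" and "did it fail?" checks as callables rather than guessing field names; `wait_for_operation` and all of its parameters are hypothetical.

```python
import time
from typing import Any, Callable


def wait_for_operation(
    fetch_status: Callable[[], dict[str, Any]],
    is_done: Callable[[dict[str, Any]], bool],
    is_failed: Callable[[dict[str, Any]], bool],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Poll fetch_status() until is_done() is true, is_failed() is true, or time runs out."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if is_failed(status):
            raise RuntimeError(f"operation failed: {status}")
        if is_done(status):
            return status
        if clock() >= deadline:
            raise TimeoutError("operation did not finish in time")
        sleep(interval)


# Example wiring with requests. The "status" field and its values are
# assumptions; check the API docs for the real response shape.
# url = f"https://pdf4.net/api/pdf/status/{operation_id}"
# fetch = lambda: requests.get(url, headers={"Authorization": "Bearer YOUR_API_KEY"}).json()
# wait_for_operation(fetch, lambda s: s.get("status") == "completed",
#                    lambda s: s.get("status") == "failed")
```

After the loop returns, fetch the result from `/api/pdf/download/{operation_id}` with the same Authorization header.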

Use cases for LLM & agent builders

Invoice & document agents

Drop in a PDF invoice, extract line items and totals as JSON, and let the agent reconcile or post to your ledger.

Knowledge-base ingestion

Batch-convert PDFs to Markdown, chunk and embed them, and power a RAG assistant grounded in your real docs.

Contract & report analysis

Summarize long contracts, extract key clauses and dates, and surface answers with citations back to the source.

Scanned-document workflows

OCR image-only PDFs first, then extract or summarize — turning paper scans into structured agent input.
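For the invoice use case above, the "reconcile" step might start with a sanity check that the extracted line items add up to the extracted total. This is a sketch under stated assumptions: the `line_items`, `amount` and `total` field names are hypothetical rather than PDF4's documented output, and real invoices often add tax or discounts on top of the line items.

```python
from decimal import Decimal


def reconcile_invoice(invoice: dict, tolerance: Decimal = Decimal("0.01")) -> dict:
    """Compare the sum of line-item amounts with the stated total.

    Assumes a hypothetical shape: {"line_items": [{"amount": ...}], "total": ...}.
    Amounts are converted through str() so floats do not leak binary rounding.
    """
    line_sum = sum(
        (Decimal(str(item["amount"])) for item in invoice["line_items"]),
        Decimal("0"),
    )
    total = Decimal(str(invoice["total"]))
    difference = total - line_sum
    return {
        "line_sum": line_sum,
        "total": total,
        "difference": difference,
        "balanced": abs(difference) <= tolerance,
    }
```

An agent can post balanced invoices to the ledger automatically and route the rest to a human, passing along the `difference` for context.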

Frequently asked questions

What are PDF tools for AI agents?

PDF tools for AI agents are document operations — convert, OCR, extract, parse, summarize and translate — exposed as an API or MCP server so an LLM agent can call them as functions. Instead of stuffing raw PDF bytes into a prompt, your agent sends a PDF to PDF4 and gets back clean Markdown, JSON or plain text it can reason over, embed for RAG, or feed into the next tool.

What is a PDF MCP server?

A PDF MCP server implements the Model Context Protocol so AI clients like Claude, Cursor and Windsurf can discover and call PDF tools directly. PDF4 ships an MCP server at /mcp/sse: connect with your MCP API key as a Bearer token and the agent can list available PDF tools and invoke them — converting, extracting and summarizing PDFs without any custom glue code.

How do I parse a PDF for an LLM or RAG pipeline?

Send the PDF to PDF4's PDF to Markdown or PDF to JSON endpoint. You get layout-aware, LLM-ready text with headings, lists and tables preserved — ideal for chunking and embedding into a vector store for retrieval-augmented generation. For scanned documents, run OCR first so the text layer is searchable.

Can AI agents extract structured data from PDFs?

Yes. The Extract Data endpoint uses AI to pull structured fields — invoice numbers, totals, dates, parties, line items — from PDFs and return them as JSON. Pass an optional schema describing the fields you want, or let it auto-detect common documents like invoices, contracts and forms.

What is the best AI for summarizing a PDF?

PDF4's summarize endpoint gives agents a one-call way to condense a long PDF into a concise summary. Because it runs server-side and returns plain text, your agent stays inside its context budget instead of loading the whole document — useful for triage, knowledge-base ingestion and document automation.

Do I authenticate the PDF API with an API key?

Yes. REST endpoints use a Bearer token (JWT) in the Authorization header; the MCP server uses an MCP API key as a Bearer token. Create a free account to get credentials, then point your agent or HTTP client at the documented endpoints.


Build your PDF-powered AI agent

Get a free API key, connect the MCP server, and start parsing PDFs into Markdown and JSON your LLM can use.