Document Operations MCP Server

Language / 语言: English | 中文

Document Operations MCP Server - A universal MCP server for document processing, conversion, and automation. Handle PDF, DOCX, HTML, Markdown, and more through a unified API and toolset.

Demo

Video

https://github.com/user-attachments/assets/43dfeeec-8097-413e-8519-a7de98e31136

In this demo, we showcase how to:

Configure doc-ops-mcp in MCP clients
Convert DOCX documents to PDF format
Add default watermarks to converted PDF files

Quick Start
System Architecture
Optional Integration
Features
Open Source Licenses
Future Roadmap
Docker Deployment
Development Guide
Troubleshooting
Contributing

1. Quick Start

First, add the Document Operations MCP server to your MCP client.

Standard config works in most MCP clients:

{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory"
      }
    }
  }
}

Claude Desktop

Follow the MCP install guide and use the standard config above.

VS Code

Follow the MCP install guide and use the standard config above.

Cursor

Go to Cursor Settings -> MCP -> Add new MCP Server. Name it as you like, and use the command type with the command npx -y doc-ops-mcp.

Other MCP Clients

For other MCP clients, use the standard config above and refer to your client's documentation for MCP server installation.

Configuration

The Document Operations MCP server supports configuration through environment variables. These can be provided in the MCP client configuration as part of the "env" object:

{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory",
        "WATERMARK_IMAGE": "/path/to/watermark.png",
        "QR_CODE_IMAGE": "/path/to/qrcode.png"
      }
    }
  }
}

Supported Document Operations

Format	Convert to PDF	Convert to DOCX	Convert to HTML	Convert to Markdown	Content Rewriting	Watermark/QR Code
PDF	✅	❌	❌	❌	❌	✅
DOCX	✅	✅	✅	✅	✅	❌
HTML	✅	❌	✅	✅	✅	❌
Markdown	✅	✅	✅	✅	✅	❌

Rewriting Features:

Content Replacement: Support batch text replacement and regular expression replacement
Format Adjustment: Modify document structure, heading levels, and style formatting
Smart Rewriting: Content optimization while preserving original document format
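
As a rough illustration of batch text and regular expression replacement, the sketch below applies an ordered list of rules to a string. The ReplacementRule shape and the applyReplacements function are hypothetical names chosen for this example; they are not the server's actual rewriting API.

```typescript
// Hypothetical rule shape; the server's real rewriting API may differ.
type ReplacementRule =
  | { kind: "text"; find: string; replace: string }
  | { kind: "regex"; pattern: string; flags?: string; replace: string };

// Escape regex metacharacters so a literal string can be matched globally.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Apply every rule in order; each rule replaces all occurrences.
function applyReplacements(content: string, rules: ReplacementRule[]): string {
  return rules.reduce((text, rule) => {
    if (rule.kind === "text") {
      // A replacer function keeps "$" in the replacement text literal.
      return text.replace(new RegExp(escapeRegExp(rule.find), "g"), () => rule.replace);
    }
    const base = rule.flags ?? "";
    const flags = base.includes("g") ? base : base + "g";
    return text.replace(new RegExp(rule.pattern, flags), rule.replace);
  }, content);
}

const updated = applyReplacements("Acme Corp v1.2 (Acme Corp)", [
  { kind: "text", find: "Acme Corp", replace: "Globex Inc." },
  { kind: "regex", pattern: "v(\\d+)\\.(\\d+)", replace: "version $1.$2" },
]);
console.log(updated); // prints "Globex Inc. version 1.2 (Globex Inc.)"
```

Rules run in the order given, so a later rule sees the output of earlier ones. That matters when two rules could match overlapping text.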

Usage Examples

Format Conversion:

Convert /Users/docs/report.docx to PDF
Convert /Users/docs/article.md to HTML
Convert /Users/docs/presentation.html to DOCX
Convert /Users/docs/readme.md to PDF (with theme styling)

Document Rewriting:

Rewrite company names in /Users/docs/contract.md
Batch replace terminology in /Users/docs/manual.docx
Adjust heading levels in /Users/docs/article.html
Update dates and version numbers in /Users/docs/policy.md

PDF Enhancement:

Add watermark to /Users/docs/document.pdf
Add QR code to /Users/docs/report.pdf
Add company logo watermark to /Users/docs/invoice.pdf

Environment Variables

The server supports environment variables for controlling output paths and PDF enhancement features:

Core Directories

OUTPUT_DIR: Controls where all generated files are saved (default: ~/Documents)
CACHE_DIR: Directory for temporary and cache files (default: ~/.cache/doc-ops-mcp)

PDF Enhancement Features

WATERMARK_IMAGE: Default watermark image path for PDF files
- Automatically added to all PDF conversions
- Supported formats: PNG, JPG
- If not set, default text watermark "doc-ops-mcp" will be used
QR_CODE_IMAGE: Default QR code image path for PDF files
- Added to PDFs only when explicitly requested (addQrCode=true)
- Supported formats: PNG, JPG
- If not set, QR code functionality will be unavailable
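
The defaults described above can be summarized in a small sketch. The DocOpsConfig interface and loadConfig function are illustrative names, not the server's internals; only the variable names and default values come from this section.

```typescript
// Resolved settings; field names are illustrative, not the server's internals.
interface DocOpsConfig {
  outputDir: string;
  cacheDir: string;
  watermark: { kind: "image"; path: string } | { kind: "text"; text: string };
  qrCodeImage: string | null; // null means QR code features are unavailable
}

type Env = Record<string, string | undefined>;

// Build the config from an environment map, applying the documented defaults.
function loadConfig(env: Env, homeDir: string): DocOpsConfig {
  const watermarkImage = env["WATERMARK_IMAGE"]?.trim();
  const qrCodeImage = env["QR_CODE_IMAGE"]?.trim();
  return {
    outputDir: env["OUTPUT_DIR"]?.trim() || `${homeDir}/Documents`,
    cacheDir: env["CACHE_DIR"]?.trim() || `${homeDir}/.cache/doc-ops-mcp`,
    // Without WATERMARK_IMAGE the text watermark "doc-ops-mcp" is used.
    watermark: watermarkImage
      ? { kind: "image", path: watermarkImage }
      : { kind: "text", text: "doc-ops-mcp" },
    qrCodeImage: qrCodeImage || null,
  };
}

const config = loadConfig({ OUTPUT_DIR: "/srv/docs/out" }, "/home/alice");
console.log(config);
```

In a Node.js process the two arguments would typically be process.env and os.homedir(); they are passed in explicitly here so the function stays easy to test.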

Output Path Rules:

If outputPath is not provided → files saved to OUTPUT_DIR with auto-generated names
If outputPath is relative → resolved relative to OUTPUT_DIR
If outputPath is absolute → used as-is, ignoring OUTPUT_DIR

See OUTPUT_PATH_CONTROL.md for detailed documentation.
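
The three output path rules can be sketched as a single function. This is a dependency-free approximation under stated assumptions: resolveOutputPath, joinPath, and the autoName parameter are hypothetical, and ".." segments are not normalized. A real implementation on Node.js would more likely rely on path.isAbsolute and path.resolve.

```typescript
// True for POSIX absolute paths ("/x") and Windows drive paths ("C:\x" or "C:/x").
function isAbsolutePath(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}

// Join a directory and a relative path with a single "/" between them.
function joinPath(dir: string, relative: string): string {
  return `${dir.replace(/[\\/]+$/, "")}/${relative.replace(/^\.[\\/]/, "")}`;
}

// autoName stands in for the auto-generated file name (hypothetical parameter).
function resolveOutputPath(outputDir: string, autoName: string, outputPath?: string): string {
  if (!outputPath) return joinPath(outputDir, autoName); // not provided: OUTPUT_DIR + auto name
  if (isAbsolutePath(outputPath)) return outputPath;     // absolute: used as-is
  return joinPath(outputDir, outputPath);                // relative: resolved against OUTPUT_DIR
}

console.log(resolveOutputPath("/srv/out", "report.pdf"));                   // prints "/srv/out/report.pdf"
console.log(resolveOutputPath("/srv/out", "report.pdf", "pdf/final.pdf"));  // prints "/srv/out/pdf/final.pdf"
console.log(resolveOutputPath("/srv/out", "report.pdf", "/tmp/final.pdf")); // prints "/tmp/final.pdf"
```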

2. System Architecture

Document Operations MCP Server is implemented entirely in JavaScript and handles document reading, conversion, and enhancement without external system tools:

┌─────────────────────────────────────────────────────────────┐
│                    MCP Client Layer                         │
│           (Claude Desktop, Cursor, VS Code, etc.)           │
└─────────────────────┬───────────────────────────────────────┘
                      │ JSON-RPC 2.0
┌─────────────────────┴───────────────────────────────────────┐
│                 Doc-Ops-MCP Server                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────┐  │
│  │   Tool Router   │  │  Request        │  │  Response   │  │
│  │   & Handler     │  │  Validator      │  │  Formatter  │  │
│  └────────┬────────┘  └────────┬────────┘  └──────┬──────┘  │
│           │                    │                  │         │
│  ┌────────┴────────────────────┴──────────────────┴──────┐  │
│  │                Document Processing Engine             │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │  │
│  │  │  Document   │  │   Format    │  │   Style     │    │  │
│  │  │   Reader    │  │  Converter  │  │  Processor  │    │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘    │  │
│  │                                                       │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │  │
│  │  │    PDF      │  │  Watermark/ │  │ Conversion  │    │  │
│  │  │ Enhancement │  │   QR Code   │  │  Planner    │    │  │
│  │  └─────────────┘  └─────────────┘  └─────────────┘    │  │
│  └───────────────────────────────────────────────────────┘  │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────┴─────────────────────────────────┐
│                    Core Dependencies Layer                  │
│  ┌─────────────┐  ┌──────────────┐  ┌─────────────┐         │
│  │   pdf-lib   │  │word-extractor│  │   marked    │         │
│  │ (PDF Tools) │  │(DOCX Reader) │  │ (Markdown)  │         │
│  └─────────────┘  └──────────────┘  └─────────────┘         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │   cheerio   │  │    jszip    │  │    docx     │          │
│  │(HTML Parser)│  │(ZIP Handler)│  │(DOCX Gen.)  │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │   xml2js    │  │Custom OOXML │                           │
│  │(XML Parser) │  │   Parser    │                           │
│  └─────────────┘  └─────────────┘                           │
└─────────────────────────────────────────────────────────────┘

Architecture Overview

Core Features:

Pure JavaScript implementation with no external system dependencies
Complete document reading, conversion, and style processing capabilities
Built-in PDF watermark and QR code addition functionality
Intelligent conversion planning and path optimization

Conversion Flow:

Direct Conversion: Supports direct conversion between most formats
Multi-step Conversion: Complex conversions achieved through intermediate formats
Style Preservation: Uses OOXML parser to ensure complete style integrity
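
One way to picture direct versus multi-step conversion is a shortest-path search over a graph of direct converters. The sketch below uses breadth-first search; the directConversions table is a hypothetical edge set chosen for illustration and does not claim to match the server's actual converters or the logic inside plan_conversion.

```typescript
type Format = "pdf" | "docx" | "html" | "markdown";

// Hypothetical direct converters; the real server's edge set may differ.
const directConversions: Record<Format, Format[]> = {
  docx: ["html"],
  markdown: ["html", "docx"],
  html: ["pdf", "markdown"],
  pdf: [],
};

// Breadth-first search returns the shortest chain of formats, or null if none exists.
function planPath(source: Format, target: Format): Format[] | null {
  const queue: { format: Format; chain: Format[] }[] = [{ format: source, chain: [source] }];
  const seen = new Set<Format>([source]);
  while (queue.length > 0) {
    const { format, chain } = queue.shift()!;
    if (format === target) return chain;
    for (const next of directConversions[format]) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push({ format: next, chain: [...chain, next] });
      }
    }
  }
  return null;
}

console.log(planPath("markdown", "docx")); // prints [ 'markdown', 'docx' ] (direct)
console.log(planPath("docx", "pdf"));      // prints [ 'docx', 'html', 'pdf' ] (via HTML)
console.log(planPath("pdf", "docx"));      // prints null (no route in this table)
```

Breadth-first search is used because it finds the route with the fewest intermediate formats, and each extra hop is a chance to lose styling.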

3. Optional Integration

This server can work with playwright-mcp for enhanced PDF conversion capabilities. Please refer to the official playwright-mcp documentation for detailed configuration.

🔧 PDF Conversion Workflow

PDF conversion runs through the following steps:

Document Parsing: Use OOXML parser to ensure complete style preservation
Format Conversion: Convert documents to high-quality HTML format
PDF Generation: Built-in converter or optionally work with playwright-mcp
Enhancement Processing: Automatically add watermarks and QR codes (if configured)

How It Works

This server uses intelligent conversion architecture:

Smart Planning: plan_conversion analyzes conversion requirements and selects optimal paths
Format Conversion: Use specialized converters to handle various document formats
Style Preservation: Ensure style integrity through OOXML parser
Enhancement Processing: Automatically add watermarks, QR codes and other enhancements
Optional Integration: Support working with playwright-mcp for enhanced capabilities

4. Features

MCP Tools

Core Document Tools

Tool Name	Description	Input Parameters	External Dependencies
`read_document`	Read document content	`filePath`: Document path `extractMetadata`: Extract metadata `preserveFormatting`: Preserve formatting	None
`write_document`	Write document content	`content`: Document content `outputPath`: Output file path `encoding`: File encoding	None
`convert_document`	Smart document conversion	`inputPath`: Input file path `outputPath`: Output file path `preserveFormatting`: Preserve formatting	None
`plan_conversion`	Conversion planner	`sourceFormat`: Source format `targetFormat`: Target format `preserveStyles`: Preserve styles `quality`: Conversion quality	None

read_document

Read various document formats including PDF, DOCX, DOC, HTML, MD, and more.

Parameters:

filePath (string, required) - Document path to read
extractMetadata (boolean, optional) - Extract document metadata, defaults to false
preserveFormatting (boolean, optional) - Preserve formatting (HTML output), defaults to false

write_document

Write content to document files in specified formats.

Parameters:

content (string, required) - Content to write
outputPath (string, optional) - Output file path (auto-generated if not provided)
encoding (string, optional) - File encoding, defaults to utf-8

convert_document

Convert documents between formats with enhanced style preservation.

Parameters:

inputPath (string, required) - Input file path
outputPath (string, optional) - Output file path (auto-generated if not provided)
preserveFormatting (boolean, optional) - Preserve formatting, defaults to true
useInternalPlaywright (boolean, optional) - Use built-in Playwright for PDF conversion, defaults to false

convert_docx_to_pdf

Convert DOCX to PDF with automatic watermark addition (if configured).

Parameters:

docxPath (string, required) - DOCX file path
outputPath (string, optional) - Output PDF path (auto-generated if not provided)
addQrCode (boolean, optional) - Whether to add QR code, defaults to false
preserveFormatting (boolean, optional) - Preserve original formatting, defaults to true
chineseFont (string, optional) - Chinese font, defaults to Microsoft YaHei
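
MCP clients normally build tool calls for you, but it can help to see what travels over JSON-RPC 2.0. The sketch below assembles a "tools/call" request for convert_docx_to_pdf using the parameters listed above. The buildToolCall helper and the interface names are illustrative only.

```typescript
// Arguments accepted by convert_docx_to_pdf, as listed in this section.
interface ConvertDocxToPdfArgs {
  docxPath: string;
  outputPath?: string;
  addQrCode?: boolean;
  preserveFormatting?: boolean;
  chineseFont?: string;
}

// Shape of an MCP "tools/call" request carried over JSON-RPC 2.0.
interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Hypothetical helper that wraps tool arguments in the request envelope.
function buildToolCall<A>(id: number, name: string, args: A): ToolCallRequest<A> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall<ConvertDocxToPdfArgs>(1, "convert_docx_to_pdf", {
  docxPath: "/Users/docs/report.docx",
  addQrCode: true,
});
console.log(JSON.stringify(request, null, 2));
```

Omitted optional fields (outputPath, preserveFormatting, chineseFont) fall back to the defaults documented above.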

convert_markdown_to_pdf

Convert Markdown to PDF with automatic watermark addition (if configured).

Parameters:

markdownPath (string, required) - Markdown file path
outputPath (string, optional) - Output PDF path (auto-generated if not provided)
theme (string, optional) - Theme style, defaults to "github"
includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
addQrCode (boolean, optional) - Whether to add QR code, defaults to false

convert_markdown_to_html

Convert Markdown to HTML.

Parameters:

markdownPath (string, required) - Markdown file path
outputPath (string, optional) - Output HTML path (auto-generated if not provided)
theme (string, optional) - Theme style, defaults to "github"
includeTableOfContents (boolean, optional) - Include table of contents, defaults to false

convert_markdown_to_docx

Convert Markdown to DOCX.

Parameters:

markdownPath (string, required) - Markdown file path
outputPath (string, optional) - Output DOCX path (auto-generated if not provided)

convert_html_to_markdown

Convert HTML to Markdown.

Parameters:

htmlPath (string, required) - HTML file path
outputPath (string, optional) - Output Markdown path (auto-generated if not provided)

plan_conversion

🎯 Smart Conversion Planner - Analyze conversion requirements and generate optimal conversion plans.

Parameters:

sourceFormat (string, required) - Source file format (pdf, docx, html, markdown, md, txt, doc)
targetFormat (string, required) - Target file format (pdf, docx, html, markdown, md, txt, doc)
sourceFile (string, optional) - Source file path (for generating specific conversion parameters)
preserveStyles (boolean, optional) - Whether to preserve style formatting, defaults to true
includeImages (boolean, optional) - Whether to include images, defaults to true
theme (string, optional) - Conversion theme, defaults to github
quality (string, optional) - Conversion quality requirement (fast, balanced, high), defaults to balanced
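
The parameter list above maps naturally onto a typed argument object. The following sketch shows one possible typing together with the documented defaults; PlanConversionArgs, isPlanFormat, and withPlanDefaults are hypothetical names, not exports of doc-ops-mcp.

```typescript
const PLAN_FORMATS = ["pdf", "docx", "html", "markdown", "md", "txt", "doc"] as const;
type PlanFormat = (typeof PLAN_FORMATS)[number];
type Quality = "fast" | "balanced" | "high";

interface PlanConversionArgs {
  sourceFormat: PlanFormat;
  targetFormat: PlanFormat;
  sourceFile?: string;
  preserveStyles?: boolean;
  includeImages?: boolean;
  theme?: string;
  quality?: Quality;
}

// Type guard for untrusted strings such as a file extension.
function isPlanFormat(value: string): value is PlanFormat {
  return (PLAN_FORMATS as readonly string[]).includes(value);
}

// Fill in the defaults documented for plan_conversion.
function withPlanDefaults(args: PlanConversionArgs) {
  return {
    sourceFormat: args.sourceFormat,
    targetFormat: args.targetFormat,
    sourceFile: args.sourceFile,
    preserveStyles: args.preserveStyles ?? true,
    includeImages: args.includeImages ?? true,
    theme: args.theme ?? "github",
    quality: args.quality ?? ("balanced" as Quality),
  };
}

const plan = withPlanDefaults({ sourceFormat: "docx", targetFormat: "pdf", quality: "high" });
console.log(plan);
```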

process_pdf_post_conversion

Post-process a PDF generated by Playwright, optionally adding a watermark and a QR code.

Parameters:

playwrightPdfPath (string, required) - Generated PDF file path
targetPath (string, optional) - Target PDF file path (auto-generated if not provided)
addWatermark (boolean, optional) - Whether to add watermark, defaults to false
addQrCode (boolean, optional) - Whether to add QR code, defaults to false
watermarkImage (string, optional) - Watermark image path
qrCodePath (string, optional) - QR code image path

PDF Enhancement Tools

add_watermark

🎨 PDF Watermark Addition Tool - Add image or text watermarks to PDF documents.

Parameters:

pdfPath (string, required) - PDF file path
watermarkImage (string, optional) - Watermark image path (PNG/JPG)
watermarkText (string, optional) - Watermark text content
watermarkImageScale (number, optional) - Image scale ratio, defaults to 0.25
watermarkImageOpacity (number, optional) - Image opacity, defaults to 0.6
watermarkImagePosition (string, optional) - Image position, defaults to fullscreen

add_qrcode

📱 PDF QR Code Addition Tool - Add QR codes to PDF documents.

Parameters:

pdfPath (string, required) - PDF file path
qrCodePath (string, optional) - QR code image path
qrScale (number, optional) - QR code scale ratio, defaults to 0.15
qrOpacity (number, optional) - QR code opacity, defaults to 1.0
qrPosition (string, optional) - QR code position, defaults to bottom-center
addText (boolean, optional) - Whether to add explanatory text, defaults to true
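
To make the scale and position parameters more concrete, here is a small geometry sketch for the default bottom-center placement. It rests on assumptions not confirmed by this document: that qrScale is a fraction of the page width, that the aspect ratio is preserved, and that a fixed bottom margin (the hypothetical margin parameter) is applied. PDF coordinates start at the bottom-left corner of the page.

```typescript
interface Size { width: number; height: number }
interface Rect extends Size { x: number; y: number }

// Assumption: scale is a fraction of the page width; aspect ratio is kept.
function placeBottomCenter(page: Size, image: Size, scale = 0.15, margin = 20): Rect {
  const width = page.width * scale;
  const height = width * (image.height / image.width);
  // Center horizontally; y is measured upward from the bottom edge.
  return { x: (page.width - width) / 2, y: margin, width, height };
}

// A4 page in PDF points with a square 300x300 QR image.
const rect = placeBottomCenter({ width: 595, height: 842 }, { width: 300, height: 300 });
console.log(rect);
```

The resulting rectangle could then be handed to a drawing call such as pdf-lib's page.drawImage, which accepts x, y, width, height, and opacity options.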

System Requirements

Node.js ≥ 18.0.0
Zero external system dependencies - All processing via npm packages
Optional Integration: playwright-mcp for enhanced PDF conversion

Core Technology Stack

pdf-lib - PDF operations and enhancement
word-extractor - DOCX document text extraction
marked - Markdown parsing and rendering
cheerio - HTML parsing and manipulation
docx - DOCX document generation
jszip - ZIP file processing
xml2js - XML parsing and conversion
Custom OOXML Parser - Advanced DOCX style preservation

Installation

# Global installation
npm install -g doc-ops-mcp

# Or using pnpm
pnpm add -g doc-ops-mcp

# Or using bun
bun add -g doc-ops-mcp

Architecture Components

MCP Server Core: Handles JSON-RPC 2.0 communication and tool registration
Smart Router: Routes requests to optimal processing modules
Conversion Engine: Contains specialized converters for different document types
Style Processor: Ensures style preservation during format conversion
Security Module: Provides path validation and content security handling

5. Open Source Licenses

Project License

This Project: MIT License
Compatibility: Available for commercial and non-commercial use

Third-Party Dependencies

Library	Version	License	Purpose
pdf-lib	^1.17.1	MIT	PDF document manipulation
word-extractor	^1.0.4	MIT	DOCX document text extraction
marked	^15.0.12	MIT	Markdown parsing and rendering
cheerio	^1.0.0-rc.12	MIT	HTML parsing and manipulation
docx	^9.5.1	Apache-2.0	DOCX document generation
jszip	^3.10.1	MIT	ZIP file processing
xml2js	^0.6.2	MIT	XML parsing and conversion

License Compatibility

✅ Commercial Use: All dependencies support commercial use
✅ Distribution: Free to distribute and modify
✅ Patent Protection: Apache-2.0 provides patent protection
⚠️ Notice: Original license notices must be retained

6. Future Roadmap

Core Features

🔄 Enhanced Conversion Quality: Improve style preservation for complex documents
📊 Excel Support: Complete Excel read/write and conversion functionality
🎨 Template System: Support for custom document templates
🔍 OCR Integration: Image text recognition capabilities

System Improvements

🌐 Multi-language Support: Internationalization and localization
🔐 Security Enhancements: Document encryption and access control
⚡ Performance Optimization: Large file handling and memory optimization
🔌 Plugin System: Extensible processor architecture

Version Roadmap

v2.0: Complete Excel support and template system
v3.0: OCR integration and multi-language support
v4.0: Advanced security features and plugin system

7. Docker Deployment

Quick Start

Using Pre-built Image

# Pull the latest image
docker pull docops/doc-ops-mcp:latest

# Run with default configuration
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  docops/doc-ops-mcp:latest

Building from Source

# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp

# Build the Docker image
docker build -t doc-ops-mcp .

# Run the container
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  -v $(pwd)/documents:/app/documents \
  doc-ops-mcp

Docker Compose Deployment

Create a docker-compose.yml file:

version: '3.8'

services:
  doc-ops-mcp:
    image: docops/doc-ops-mcp:latest
    container_name: doc-ops-mcp
    ports:
      - "3000:3000"
    volumes:
      - ./documents:/app/documents
      - ./config:/app/config
    environment:
      - NODE_ENV=production
      - PORT=3000
    restart: unless-stopped
    
  # Optional: Add Nginx for reverse proxy
  nginx:
    image: nginx:alpine
    container_name: doc-ops-nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - doc-ops-mcp
    restart: unless-stopped

Environment Variables

Variable	Description	Default
`PORT`	Server port	`3000`
`NODE_ENV`	Environment mode	`production`
`LOG_LEVEL`	Logging level	`info`
`MAX_FILE_SIZE`	Maximum file size (MB)	`50`

Volume Mounts

Mount local directories for persistent storage:

# Documents directory for file processing
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  -v $(pwd)/documents:/app/documents \
  -v $(pwd)/output:/app/output \
  doc-ops-mcp

Docker Configuration Examples

Production Deployment

# Production setup with Docker Swarm
docker swarm init
docker stack deploy -c docker-compose.yml doc-ops

# Scale the service
docker service scale doc-ops_doc-ops-mcp=3

Health Checks

The container includes built-in health checks:

# Check container health (see the STATUS column)
docker ps

# Show the current health status
docker inspect --format='{{.State.Health.Status}}' doc-ops-mcp

# Manual health check
docker exec doc-ops-mcp curl -f http://localhost:3000/health

8. Development Guide

Local Development

# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp

# Install dependencies
npm install

# Run in development mode
npm run dev

# Build the project
npm run build

# Run tests
npm test

Project Structure

src/
├── index.ts          # MCP server entry point
├── tools/            # Tool implementations
│   ├── documentConverter.ts
│   ├── pdfTools.ts
│   └── ...
├── types/            # Type definitions
└── utils/            # Utility functions

Adding New Tools

Create a new tool file in src/tools/
Implement the tool logic
Register the tool in src/index.ts
Add test cases
Update documentation
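
The steps above can be pictured with a dependency-free sketch of a tool registry. It only mirrors the general shape of an MCP tool (a name, a description, and a JSON Schema for the input); registerTool, callTool, and the count_words tool are hypothetical and do not reflect how src/index.ts actually registers tools.

```typescript
// Minimal stand-ins for an MCP tool definition and its handler.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required?: string[];
  };
}
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

const tools = new Map<string, { definition: ToolDefinition; handler: ToolHandler }>();

// Register a tool once; duplicate names are rejected.
function registerTool(definition: ToolDefinition, handler: ToolHandler): void {
  if (tools.has(definition.name)) throw new Error(`Duplicate tool: ${definition.name}`);
  tools.set(definition.name, { definition, handler });
}

// Look up a tool, check required parameters, then run its handler.
async function callTool(name: string, args: Record<string, unknown>): Promise<string> {
  const tool = tools.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  for (const key of tool.definition.inputSchema.required ?? []) {
    if (!(key in args)) throw new Error(`Missing required parameter: ${key}`);
  }
  return tool.handler(args);
}

// Hypothetical example tool: counts whitespace-separated words.
registerTool(
  {
    name: "count_words",
    description: "Count the words in a piece of text",
    inputSchema: { type: "object", properties: { text: { type: "string" } }, required: ["text"] },
  },
  async (args) => String(String(args["text"]).trim().split(/\s+/).filter(Boolean).length),
);

callTool("count_words", { text: "convert this document" }).then((result) => console.log(result)); // prints "3"
```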

9. Troubleshooting

Common Issues

Port conflicts: Change the host port in docker-compose.yml
Permission issues: Ensure volume mounts have correct permissions
Memory issues: Increase Docker memory allocation

Debug Mode

# Run with debug logging
docker run -d \
  --name doc-ops-mcp \
  -p 3000:3000 \
  -e LOG_LEVEL=debug \
  doc-ops-mcp

# View logs
docker logs -f doc-ops-mcp

10. Contributing

How to Contribute

Fork the Project
Create a Feature Branch (git checkout -b feature/AmazingFeature)
Commit Your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

Intellectual Property License

By submitting a Pull Request, you agree that all contributions submitted through Pull Requests will be licensed under the MIT License. This means:

You grant the project maintainers and users the right to use, modify, and distribute your contributions under the MIT License
You confirm that you have the right to make these contributions
You understand that your contributions will become part of the open source project
You waive any claims to exclusive ownership of the contributed code

If you cannot agree to these terms, please do not submit a Pull Request.

Code Standards

Use TypeScript
Follow ESLint configuration
Add appropriate tests
Update relevant documentation

Reporting Issues

Use GitHub Issues
Provide detailed error information and reproduction steps
Include system environment information

License

This project is licensed under the MIT License - see the LICENSE file for details.
