Web Crawler

STDIO

MCP server implementation for web crawling with configurable depth and concurrent requests.

Web Crawler MCP Server Deployment Guide

Prerequisites

  • Node.js (v18+)
  • npm (v9+)

Installation

  1. Clone the repository:

    git clone https://github.com/jitsmaster/web-crawler-mcp.git
    cd web-crawler-mcp
  2. Install dependencies:

    npm install
  3. Build the project:

    npm run build

Configuration

Create a .env file with the following environment variables:

    CRAWL_LINKS=false
    MAX_DEPTH=3
    REQUEST_DELAY=1000
    TIMEOUT=5000
    MAX_CONCURRENT=5
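As an illustration of how these variables could be read at startup, here is a minimal TypeScript sketch. The `CrawlerConfig` shape and the `loadConfig` and `intFromEnv` helpers are hypothetical names, not taken from the project source. Only the variable names and defaults come from this guide.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface CrawlerConfig {
  crawlLinks: boolean;
  maxDepth: number;
  requestDelayMs: number;
  timeoutMs: number;
  maxConcurrent: number;
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer, falling back to the default
// when the variable is missing, empty, or not a valid integer.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  const valid = isFinite(parsed) && Math.floor(parsed) === parsed && parsed >= 0;
  return valid ? parsed : fallback;
}

// Build a config object from an environment map, using the defaults documented in this guide.
function loadConfig(env: Env): CrawlerConfig {
  return {
    crawlLinks: (env["CRAWL_LINKS"] ?? "false").toLowerCase() === "true",
    maxDepth: intFromEnv(env, "MAX_DEPTH", 3),
    requestDelayMs: intFromEnv(env, "REQUEST_DELAY", 1000),
    timeoutMs: intFromEnv(env, "TIMEOUT", 5000),
    maxConcurrent: intFromEnv(env, "MAX_CONCURRENT", 5),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.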

Running the Server

Start the MCP server:

npm start

MCP Configuration

Add the following to your MCP settings file:

    {
      "mcpServers": {
        "web-crawler": {
          "command": "node",
          "args": ["/path/to/web-crawler/build/index.js"],
          "env": {
            "CRAWL_LINKS": "false",
            "MAX_DEPTH": "3",
            "REQUEST_DELAY": "1000",
            "TIMEOUT": "5000",
            "MAX_CONCURRENT": "5"
          }
        }
      }
    }
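If the settings file already lists other servers, the entry has to be merged in rather than pasted over them. The sketch below shows one way to do that in TypeScript. The `McpSettings` and `ServerEntry` types and the `addWebCrawler` helper are hypothetical and only mirror the JSON structure shown above. Environment values are converted to strings because every value in the example entry is a string.

```typescript
// Types mirroring the JSON structure of the settings entry; names are illustrative.
interface ServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpSettings {
  mcpServers: Record<string, ServerEntry>;
}

// Return a copy of the settings with a "web-crawler" entry added or replaced.
// Existing servers are preserved and the input object is not mutated.
function addWebCrawler(
  settings: McpSettings,
  scriptPath: string,
  env: Record<string, string | number | boolean>
): McpSettings {
  const stringEnv: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    stringEnv[key] = String(env[key]);
  }
  return {
    ...settings,
    mcpServers: {
      ...settings.mcpServers,
      "web-crawler": { command: "node", args: [scriptPath], env: stringEnv },
    },
  };
}

// Usage: merge into settings that already contain another (made-up) server.
const merged = addWebCrawler(
  { mcpServers: { other: { command: "node", args: ["other.js"] } } },
  "/path/to/web-crawler/build/index.js",
  { CRAWL_LINKS: false, MAX_DEPTH: 3 }
);
console.log(JSON.stringify(merged, null, 2));
```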

Usage

The server exposes a crawl tool through MCP. Example arguments for a call to the tool:

{ "url": "https://example.com", "depth": 1 }
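MCP clients send tool calls as JSON-RPC 2.0 requests with the method `tools/call`. The TypeScript sketch below wraps the arguments above in such a request. The `buildCrawlRequest` helper and the validation rules are assumptions for illustration, and treating `depth` as optional is also an assumption. The server may validate its input differently.

```typescript
// Arguments for the crawl tool, as shown in the example above.
interface CrawlArguments {
  url: string;
  depth?: number; // assumed optional; the guide only shows it being set
}

// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: CrawlArguments };
}

// Hypothetical helper: validate the arguments, then wrap them in a tools/call request.
function buildCrawlRequest(id: number, args: CrawlArguments): ToolCallRequest {
  const protocol = new URL(args.url).protocol; // throws on a malformed URL
  if (protocol !== "http:" && protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${protocol}`);
  }
  if (args.depth !== undefined) {
    const d = args.depth;
    if (!isFinite(d) || Math.floor(d) !== d || d < 0) {
      throw new Error("depth must be a non-negative integer");
    }
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "crawl", arguments: args },
  };
}

// Usage: serialize the request before writing it to the server's stdin.
console.log(JSON.stringify(buildCrawlRequest(1, { url: "https://example.com", depth: 1 })));
```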

Configuration Options

Environment Variable   Default   Description
CRAWL_LINKS            false     Whether to follow links
MAX_DEPTH              3         Maximum crawl depth
REQUEST_DELAY          1000      Delay between requests (ms)
TIMEOUT                5000      Request timeout (ms)
MAX_CONCURRENT         5         Maximum concurrent requests
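To show how a depth limit, a concurrency limit, and a request delay can interact, here is a minimal breadth-first sketch in TypeScript. It is not the server's actual implementation. The `crawl` function, the `FetchLinks` callback, and the convention that depth 0 means "the start page only" are all assumptions. Page fetching is passed in as a callback so that the traversal logic stays self-contained.

```typescript
// Callback that fetches a page and returns the links found on it (supplied by the caller).
type FetchLinks = (url: string) => Promise<string[]>;

interface CrawlOptions {
  maxDepth: number;       // 0 = start page only (assumed convention)
  maxConcurrent: number;  // upper bound on requests in flight at once
  requestDelayMs: number; // pause after each batch of requests
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Breadth-first crawl: each level is fetched in batches of at most maxConcurrent pages.
// Returns the URLs that were actually fetched, in the order they were processed.
async function crawl(start: string, fetchLinks: FetchLinks, opts: CrawlOptions): Promise<string[]> {
  const batchSize = Math.max(1, opts.maxConcurrent); // guard against a zero batch size
  const seen = new Set<string>([start]);
  const crawled: string[] = [];
  let frontier: string[] = [start];

  for (let depth = 0; depth <= opts.maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (let i = 0; i < frontier.length; i += batchSize) {
      const batch = frontier.slice(i, i + batchSize);
      // A failed page is treated as having no links instead of aborting the crawl.
      const results = await Promise.all(
        batch.map((url) => fetchLinks(url).catch(() => [] as string[]))
      );
      crawled.push(...batch);
      if (depth < opts.maxDepth) {
        for (const links of results) {
          for (const link of links) {
            if (!seen.has(link)) {
              seen.add(link);
              next.push(link);
            }
          }
        }
      }
      if (opts.requestDelayMs > 0) await sleep(opts.requestDelayMs);
    }
    frontier = next;
  }
  return crawled;
}
```

Batching is the simplest way to cap concurrency, at the cost of waiting for the slowest request in each batch. A worker-pool design would keep all slots busy but needs more bookkeeping.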
