Website Downloader
Simple MCP server for downloading documentation websites and preparing them for RAG indexing.
Fork and download the repository, then cd into it.
uv venv
.venv/Scripts/activate
pip install -e .
Put this in your claude_desktop_config.json with your own paths:
"mcp-windows-website-downloader": {
  "command": "uv",
  "args": [
    "--directory",
    "F:/GithubRepos/mcp-windows-website-downloader",
    "run",
    "mcp-windows-website-downloader",
    "--library",
    "F:/GithubRepos/mcp-windows-website-downloader/website_library"
  ]
},
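If you would rather generate the entry than edit it by hand, here is a minimal sketch. It assumes the entry belongs under the top-level "mcpServers" key of claude_desktop_config.json; the helper names build_server_entry and add_server are hypothetical and not part of this project.

```python
import json

SERVER_NAME = "mcp-windows-website-downloader"


def build_server_entry(repo_dir: str, library_dir: str) -> dict:
    """Build the config entry shown above, using your own paths."""
    return {
        "command": "uv",
        "args": [
            "--directory", repo_dir,
            "run", SERVER_NAME,
            "--library", library_dir,
        ],
    }


def add_server(config: dict, repo_dir: str, library_dir: str) -> dict:
    """Return a copy of config with the server entry added.

    Assumption: server entries live under the "mcpServers" key.
    Other entries already present are left untouched.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[SERVER_NAME] = build_server_entry(repo_dir, library_dir)
    updated["mcpServers"] = servers
    return updated


def to_json(config: dict) -> str:
    """Serialize the config for writing back to disk."""
    return json.dumps(config, indent=2)
```

Forward slashes in the paths, as in the entry above, avoid having to escape backslashes in JSON.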
python -m mcp_windows_website_downloader.server --library docs_library
result = await server.call_tool("download", { "url": "https://docs.example.com" })
docs_library/
  domain_name/
    index.html
    about.html
    docs/
      getting-started.html
      ...
    assets/
      css/
      js/
      images/
      fonts/
    rag_index.json
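To work with this layout from a script, something like the following may help. It is a sketch only: it assumes the per-site folder is named after the URL's host (as the "domain" value in rag_index.json suggests), and the function names are hypothetical, not part of the server.

```python
from pathlib import Path
from urllib.parse import urlparse


def expected_site_dir(library: Path, url: str) -> Path:
    """Return the folder where a downloaded site is expected to live.

    Assumption: the folder name equals the URL's host name.
    """
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"URL has no host: {url!r}")
    return library / host


def list_html_pages(site_dir: Path) -> list[Path]:
    """Collect every .html file under a site folder, sorted for stable output."""
    return sorted(site_dir.rglob("*.html"))
```

For example, expected_site_dir(Path("docs_library"), "https://docs.example.com") points at docs_library/docs.example.com under this assumption.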
The server follows standard MCP architecture:
src/
  mcp_windows_website_downloader/
    __init__.py
    server.py    # MCP server implementation
    core.py      # Core downloader functionality
    utils.py     # Helper utilities
server.py: Main MCP server implementation that handles tool registration and requests
core.py: Core website downloading functionality with proper asset handling
utils.py: Helper utilities for file handling and URL processing
Single Responsibility
Clean Structure
Robust Operation
The rag_index.json file contains:
{ "url": "https://docs.example.com", "domain": "docs.example.com", "pages": 42, "path": "/path/to/site" }
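A RAG pipeline can pick up downloaded sites by reading these index files. The sketch below assumes only the four fields shown above; the SiteIndex class and both helper functions are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SiteIndex:
    url: str
    domain: str
    pages: int
    path: str


def load_site_index(index_file: Path) -> SiteIndex:
    """Parse one rag_index.json file, checking that all four fields exist."""
    data = json.loads(index_file.read_text(encoding="utf-8"))
    missing = {"url", "domain", "pages", "path"} - data.keys()
    if missing:
        raise ValueError(f"{index_file} is missing fields: {sorted(missing)}")
    return SiteIndex(data["url"], data["domain"], int(data["pages"]), data["path"])


def find_site_indexes(library: Path) -> list[SiteIndex]:
    """Load every rag_index.json found anywhere under the library folder."""
    return [load_site_index(f) for f in sorted(library.rglob("rag_index.json"))]
```

Calling find_site_indexes(Path("docs_library")) would then return one SiteIndex per downloaded site.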
MIT License - See LICENSE file
The server handles common issues:
Error responses follow the format:
{ "status": "error", "error": "Detailed error message" }
Success responses:
{ "status": "success", "path": "/path/to/downloaded/site", "pages": 42 }
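On the client side, the two formats can be told apart by the "status" field. This is a minimal sketch that assumes the response reaches you as a JSON string; a real MCP client may wrap it in content objects that need unpacking first. DownloadError and parse_download_response are hypothetical names.

```python
import json


class DownloadError(Exception):
    """Raised when the server reports a failed download."""


def parse_download_response(raw: str) -> tuple[str, int]:
    """Return (path, pages) on success, raise DownloadError on error."""
    payload = json.loads(raw)
    status = payload.get("status")
    if status == "success":
        return payload["path"], int(payload["pages"])
    if status == "error":
        raise DownloadError(payload.get("error", "unknown error"))
    # Anything else does not match either documented format.
    raise ValueError(f"Unexpected status: {status!r}")
```

Raising an exception for the error case keeps the success path free of status checks at each call site.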