
Firecrawl
Web scraping server for extracting website content and structured data using Firecrawl APIs.
This is a simple MCP server that provides tools to scrape websites and extract structured data using Firecrawl's APIs.
Install dependencies:
npm install
Create a .env file in the root directory with the following variables:
FIRECRAWL_API_TOKEN=your_token_here
SENTRY_DSN=your_sentry_dsn_here
FIRECRAWL_API_TOKEN (required): Your Firecrawl API token
SENTRY_DSN (optional): Sentry DSN for error tracking and performance monitoring
Start the server:
npm start
Alternatively, you can set environment variables directly when running the server:
FIRECRAWL_API_TOKEN=your_token_here npm start
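As a rough illustration of the required/optional split described above, the following sketch validates the two variables at startup. The `ServerConfig` type and `loadConfig` function are hypothetical names chosen for this example; they are not taken from the server's source.

```typescript
// Hypothetical config loader. Only the two environment variable names
// come from this README; everything else is illustrative.
interface ServerConfig {
  firecrawlApiToken: string;
  sentryDsn?: string;
}

function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const token = (env.FIRECRAWL_API_TOKEN || "").trim();
  if (token.length === 0) {
    // The token is required, so fail fast with a clear message.
    throw new Error("FIRECRAWL_API_TOKEN is required");
  }
  const dsn = (env.SENTRY_DSN || "").trim();
  // SENTRY_DSN is optional: leave it out of the config when unset or blank.
  return dsn.length > 0
    ? { firecrawlApiToken: token, sentryDsn: dsn }
    : { firecrawlApiToken: token };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)` before the server starts handling requests.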
The server exposes two tools:
scrape-website: Basic website scraping with multiple format options
extract-data: Structured data extraction based on prompts and schemas
The scrape-website tool scrapes a website and returns its content in the requested formats.
Parameters:
url (string, required): The URL of the website to scrape
formats (array of strings, optional): Array of desired output formats. Supported formats are:
"markdown" (default)
"html"
"text"
Example usage with MCP Inspector:
# Basic usage (defaults to markdown)
mcp-inspector --tool scrape-website --args '{ "url": "https://example.com" }'
# Multiple formats
mcp-inspector --tool scrape-website --args '{ "url": "https://example.com", "formats": ["markdown", "html", "text"] }'
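The parameter rules for scrape-website (a required url, an optional formats array that defaults to markdown) can be sketched as a small validation step. This is only an illustration under assumptions: `normalizeScrapeArgs` is a hypothetical helper, and treating an empty formats array like an omitted one is a choice made for this example, not documented behavior.

```typescript
type ScrapeFormat = "markdown" | "html" | "text";

// The three formats listed in this README.
const SUPPORTED_FORMATS: ScrapeFormat[] = ["markdown", "html", "text"];

interface ScrapeArgs {
  url: string;
  formats: ScrapeFormat[];
}

// Hypothetical helper: checks raw tool arguments and applies the default.
function normalizeScrapeArgs(input: { url?: unknown; formats?: unknown }): ScrapeArgs {
  if (typeof input.url !== "string" || input.url.length === 0) {
    throw new Error("url is required and must be a non-empty string");
  }
  if (input.formats === undefined) {
    return { url: input.url, formats: ["markdown"] };
  }
  if (!Array.isArray(input.formats)) {
    throw new Error("formats must be an array of strings");
  }
  const formats: ScrapeFormat[] = [];
  for (const candidate of input.formats) {
    if (SUPPORTED_FORMATS.indexOf(candidate as ScrapeFormat) === -1) {
      throw new Error("unsupported format: " + String(candidate));
    }
    // Drop duplicates while preserving the caller's order.
    if (formats.indexOf(candidate as ScrapeFormat) === -1) {
      formats.push(candidate as ScrapeFormat);
    }
  }
  // Assumption: an empty array falls back to the default format.
  return { url: input.url, formats: formats.length > 0 ? formats : ["markdown"] };
}
```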
The extract-data tool extracts structured data from websites based on a provided prompt and schema.
Parameters:
urls (array of strings, required): Array of URLs to extract data from
prompt (string, required): The prompt describing what data to extract
schema (object, required): Schema definition for the data to extract
The schema definition should be an object where keys are field names and values are types. Supported types are:
"string": For text fields
"boolean": For true/false fields
"number": For numeric fields
["type"]: For arrays, where type is one of the above
Example usage with MCP Inspector:
# Basic example extracting company information
mcp-inspector --tool extract-data --args '{
  "urls": ["https://example.com"],
  "prompt": "Extract the company mission, whether it supports SSO, and whether it is open source.",
  "schema": {
    "company_mission": "string",
    "supports_sso": "boolean",
    "is_open_source": "boolean"
  }
}'
# Complex example with nested data
mcp-inspector --tool extract-data --args '{
  "urls": ["https://example.com/products", "https://example.com/pricing"],
  "prompt": "Extract product information including name, price, and features.",
  "schema": {
    "products": [{
      "name": "string",
      "price": "number",
      "features": ["string"]
    }]
  }
}'
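To make the shorthand schema notation more concrete, here is one way it could be expanded into a JSON Schema style object. This is a sketch under assumptions: `toJsonSchema` and both type names are hypothetical, and the README does not say how the server actually translates the schema before calling Firecrawl. Nested objects are handled because the complex example above uses them.

```typescript
// Shorthand notation from this README: a type name, a one-element array,
// or an object mapping field names to further shorthand types.
type SimpleSchema =
  | "string"
  | "boolean"
  | "number"
  | SimpleSchema[]
  | { [field: string]: SimpleSchema };

interface JsonSchema {
  type: string;
  items?: JsonSchema;
  properties?: { [field: string]: JsonSchema };
}

// Hypothetical converter, not the server's documented behavior.
function toJsonSchema(schema: SimpleSchema): JsonSchema {
  if (typeof schema === "string") {
    return { type: schema };
  }
  if (Array.isArray(schema)) {
    if (schema.length !== 1) {
      throw new Error('array types need exactly one element, e.g. ["string"]');
    }
    return { type: "array", items: toJsonSchema(schema[0]) };
  }
  const properties: { [field: string]: JsonSchema } = {};
  for (const field of Object.keys(schema)) {
    properties[field] = toJsonSchema(schema[field]);
  }
  return { type: "object", properties };
}
```

Applied to the nested example, `products` becomes an array whose items are objects with `name`, `price`, and `features` properties.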
Both tools return an error message if scraping or extraction fails, and automatically log the error to Sentry when SENTRY_DSN is configured.
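The error behavior described above could look roughly like the wrapper below. It is a minimal sketch, not the server's code: `runTool`, `ToolResult`, and `ErrorReporter` are hypothetical names, and the reporter is passed in as a plain callback so the example stays self-contained. In a real setup that callback would presumably forward to the Sentry SDK, for instance via `Sentry.captureException`.

```typescript
// Result shape modeled on MCP tool results: text content plus an error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical hook; only supplied when error tracking is configured.
type ErrorReporter = (error: unknown) => void;

function runTool(task: () => Promise<string>, report?: ErrorReporter): Promise<ToolResult> {
  return task().then(
    (text): ToolResult => ({ content: [{ type: "text", text }] }),
    (error: unknown): ToolResult => {
      // Report first, so the error is tracked even though it is not rethrown.
      if (report) {
        report(error);
      }
      const message = error instanceof Error ? error.message : String(error);
      return { content: [{ type: "text", text: "Error: " + message }], isError: true };
    }
  );
}
```

Returning the failure as a result, rather than rethrowing, lets the client see a readable message while the optional reporter handles tracking.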
If you encounter issues: