浏览器自动化代理
STDIO智能网页自动化采集工具
智能网页自动化采集工具
A powerful browser automation tool built with MCP (Model Controlled Program) that combines web scraping capabilities with LLM-powered intelligence. This agent can search Google, navigate to webpages, and intelligently scrape content from various websites including GitHub, Stack Overflow, and documentation sites.
This project uses a client-server architecture powered by MCP:
git clone https://github.com/yourusername/browser-automation-agent.git cd browser-automation-agent
pip install -r requirements.txt
playwright install
.env file in the project root and add your Mistral AI API key:MISTRAL_API_KEY=your_api_key_here
python main.py
python client.py
Once both the server and client are running:
get_top_google_url🔍 Searches Google and returns the top result URL for a given query.
browse_and_scrape🌐 Navigates to a URL and scrapes content based on the website type.
scrape_github📂 Specializes in extracting README content and code blocks from GitHub repositories.
scrape_stackoverflow💬 Extracts questions, answers, comments, and code blocks from Stack Overflow pages.
scrape_documentation📚 Optimized for extracting documentation content and code examples.
scrape_generic🌐 Extracts paragraph text and code blocks from generic websites.
browser-automation-agent/
├── main.py            # MCP server implementation
├── client.py          # Mistral AI client implementation
├── requirements.txt   # Project dependencies
├── .env               # Environment variables (API keys)
└── README.md          # Project documentation
The agent generates two types of output files with timestamps:
final_page_YYYYMMDD_HHMMSS.png: Screenshot of the final page statescraped_content_YYYYMMDD_HHMMSS.txt: Extracted text content from the pageYou can modify the following parameters in the code:
width and height in browse_and_scrapeheadless=True for invisible browser operationnum_results in get_top_google_urlplaywright install.env filemain.py in client.py if neededContributions are welcome! Please feel free to submit a Pull Request.
Built with 🧩 MCP, 🎭 Playwright, and 🧠 Mistral AI