PentestThinkingMCP

The primary reference implementation for the research paper:

LIMA: Leveraging Large Language Models and MCP Servers for Initial Machine Access

This project is the primary (naive) system described in the paper. It provides a practical framework for orchestrating automated initial-access reconnaissance, enumeration, and exploitation by pairing large language models (LLMs) with Model Context Protocol (MCP) servers.

Research Context

This repository is the main codebase referenced in the paper. The research introduces LIMA, a modular system that pairs off-the-shelf LLMs with MCP servers to automate penetration testing tasks such as reconnaissance, enumeration, vulnerability assessment, and exploitation. The project demonstrates how LLMs can autonomously reason about attack paths, generate exploits, and complete initial-access challenges with minimal human input, as benchmarked on public Hack The Box machines and custom environments.

Key highlights from the research:

LIMA achieves up to 2x faster completion than expert testers for certain tasks, including autonomous CAPTCHA solving and attack chain execution.
The system provides the first quantitative baseline for AI-augmented penetration testing at the initial-access phase.
The modular design allows other LLMs to be integrated easily for future research and development.

For more details, see the full paper or the summary in this repository.

What is PentestThinkingMCP?

PentestThinkingMCP is an advanced Model Context Protocol (MCP) server designed to empower both human and AI pentesters. It provides:

Automated attack path planning using Beam Search and Monte Carlo Tree Search (MCTS)
Step-by-step reasoning for CTFs, Hack The Box (HTB), and real-world pentests
Attack step scoring and prioritization
Tool recommendations for each step (e.g., nmap, metasploit, linpeas)
Critical path highlighting for the most promising exploit chains
Tree-based reasoning for reporting and documentation

Why is it special?

Structures LLM reasoning: Turns a general-purpose LLM into a structured, methodical pentest planner and advisor
Automates complex reasoning: Finds multi-stage attack chains, not just single exploits
Works for CTFs, HTB, and real-world pentests: Adapts to any scenario where stepwise attack logic is needed
Bridges the gap between AI and hacking: Makes AI a true partner in offensive security

Features

Dual search strategies for attack modeling:
- Beam search with configurable width (for methodical exploit chain discovery)
- MCTS for complex decision spaces (for dynamic attack scenarios with unknowns)
Evidence/Vulnerability scoring and evaluation
Tree-based attack path analysis
Statistical analysis of potential attack vectors
MCP protocol compliance

How does it work?

Input:
You (or your AI) provide the current attack step/state (e.g., "Enumerate SMB on 10.10.10.10").
Reasoning:
The server uses Beam Search or MCTS to explore possible next steps, scoring and prioritizing them.
Output:
The server returns the next best attack step, the full attack chain, and a recommended tool, and it highlights the critical path.
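
As a rough illustration of the input and output described above, the exchange could be modeled with the types below. This is a sketch under assumptions: only the `attackStep` field appears in this README, so `strategy`, `nextStep`, `attackChain`, `recommendedTool`, `criticalPath`, and the `summarize` helper are hypothetical names, not the server's confirmed schema.

```typescript
// Hypothetical shapes: only `attackStep` is named in this README.
interface AttackStepRequest {
  attackStep: string;          // current attack step or observed state
  strategy?: "beam" | "mcts";  // assumed option for choosing the search
}

interface AttackStepResponse {
  nextStep: string;            // next best attack step
  attackChain: string[];       // full chain so far, in order
  recommendedTool: string;     // e.g. "nmap"
  criticalPath: string[];      // steps of the chain flagged as most promising
}

// Render a response as a short summary; critical-path steps are marked with "*".
function summarize(res: AttackStepResponse): string {
  const chain = res.attackChain
    .map((step, i) => `${res.criticalPath.includes(step) ? "*" : " "} ${i + 1}. ${step}`)
    .join("\n");
  return `Next: ${res.nextStep} (tool: ${res.recommendedTool})\n${chain}`;
}

// Illustrative values only; they are not captured server output.
const request: AttackStepRequest = { attackStep: "Enumerate SMB on 10.10.10.10" };
const example: AttackStepResponse = {
  nextStep: "Search for public SMB exploits",
  attackChain: ["Run nmap -p- 10.10.10.10", "Enumerate SMB on 10.10.10.10"],
  recommendedTool: "searchsploit",
  criticalPath: ["Enumerate SMB on 10.10.10.10"],
};

console.log(request.attackStep);
console.log(summarize(example));
```

Check the tool schema that the built server actually advertises to your MCP client before relying on any of these field names.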

Example Workflow: Solving an HTB Machine

Recon:
Input: attackStep: "Start with initial recon on 10.10.10.10"
Output: Run nmap -p- 10.10.10.10 (recommended tool: nmap)
Enumeration:
Input: attackStep: "Run nmap -p- 10.10.10.10"
Output: Enumerate SMB on port 445 (recommended tool: enum4linux)
Vulnerability Analysis:
Input: attackStep: "Enumerate SMB on port 445"
Output: Search for public SMB exploits (CVE-2017-0144) (recommended tool: searchsploit)
Exploitation:
Input: attackStep: "Search for public SMB exploits (CVE-2017-0144)"
Output: Exploit SMB with EternalBlue (CVE-2017-0144) (recommended tool: metasploit)
Privilege Escalation:
Input: attackStep: "Got shell as user"
Output: Run winPEAS for privilege escalation checks (recommended tool: winPEAS)
Root/Flag:
Input: attackStep: "Found user.txt, need root"
Output: Check for AlwaysInstallElevated misconfiguration (recommended tool: manual investigation)
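
The first four stages of the workflow above form a feedback loop: each recommended step becomes the next input. The sketch below illustrates that loop with a stub lookup table built from those rows. The `Advisor` type, `stubAdvisor`, and `runChain` are hypothetical names used for illustration; in practice the recommendations come from the server through an MCP client, and later stages take observed state (for example "Got shell as user") as input instead of the previous output.

```typescript
// Hypothetical advisor signature standing in for a call to the server.
type Recommendation = { nextStep: string; tool: string };
type Advisor = (attackStep: string) => Recommendation | undefined;

// Stub advisor built from the first four rows of the example workflow.
const table: Record<string, Recommendation> = {
  "Start with initial recon on 10.10.10.10": { nextStep: "Run nmap -p- 10.10.10.10", tool: "nmap" },
  "Run nmap -p- 10.10.10.10": { nextStep: "Enumerate SMB on port 445", tool: "enum4linux" },
  "Enumerate SMB on port 445": {
    nextStep: "Search for public SMB exploits (CVE-2017-0144)",
    tool: "searchsploit",
  },
  "Search for public SMB exploits (CVE-2017-0144)": {
    nextStep: "Exploit SMB with EternalBlue (CVE-2017-0144)",
    tool: "metasploit",
  },
};
const stubAdvisor: Advisor = (step) => table[step];

// Feed each recommendation back in as the next input until the advisor has nothing to add.
function runChain(start: string, advise: Advisor, maxSteps = 10): string[] {
  const chain = [start];
  let current = start;
  for (let i = 0; i < maxSteps; i++) {
    const rec = advise(current);
    if (!rec) break;
    chain.push(`${rec.nextStep} [${rec.tool}]`);
    current = rec.nextStep;
  }
  return chain;
}

const chain = runChain("Start with initial recon on 10.10.10.10", stubAdvisor);
console.log(chain.join("\n"));
```

The `maxSteps` cap is a design choice for the sketch: it keeps the loop bounded even if an advisor keeps returning new steps.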

Installation

git clone https://github.com/ibrahimsaleem/PentestThinkingMCP.git
cd PentestThinkingMCP
npm install
npm run build

Usage

Add it to your MCP client (Cursor, Claude Desktop, etc.) as a server, replacing the path with the location of your build:

{
  "mcpServers": {
    "pentestthinkingMCP": {
      "command": "node",
      "args": ["path/to/pentestthinkingMCP/dist/index.js"]
    }
  }
}

Interact with it by sending attack steps and receiving next-step recommendations, tool suggestions, and attack path trees.

Search Strategies for Pentesting

Beam Search

Maintains a fixed-width set of the most promising attack paths or vulnerability chains.
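Well suited to step-by-step exploit development and known vulnerability pattern matching. Because it prunes all but the top-scoring paths at each depth, it is a heuristic and can miss chains that start with a low-scoring step.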
Best for: Enumerating attack vectors, methodical vulnerability chaining, logical exploit pathfinding.
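
To make the fixed-width idea concrete, here is a minimal beam search sketch. It is not taken from this repository's source: the `AttackPath` type, the `expand` and `scoreStep` callbacks, and the toy graph and scores are all assumptions made for illustration.

```typescript
// A candidate attack path: the ordered steps taken so far and a cumulative score.
interface AttackPath {
  steps: string[];
  score: number;
}

// Hypothetical callbacks: `expand` proposes next steps, `scoreStep` rates a single step.
type Expand = (path: AttackPath) => string[];
type ScoreStep = (path: AttackPath, step: string) => number;

function beamSearch(
  start: string,
  expand: Expand,
  scoreStep: ScoreStep,
  beamWidth: number,
  maxDepth: number,
): AttackPath {
  let best: AttackPath = { steps: [start], score: 0 };
  let beam: AttackPath[] = [best];
  for (let depth = 0; depth < maxDepth; depth++) {
    // Extend every path in the beam by each of its possible next steps.
    const candidates: AttackPath[] = [];
    for (const path of beam) {
      for (const step of expand(path)) {
        candidates.push({
          steps: [...path.steps, step],
          score: path.score + scoreStep(path, step),
        });
      }
    }
    if (candidates.length === 0) break;
    // Keep only the `beamWidth` highest-scoring paths.
    candidates.sort((a, b) => b.score - a.score);
    beam = candidates.slice(0, beamWidth);
    const top = beam[0];
    if (top && top.score > best.score) best = top;
  }
  return best;
}

// Toy attack graph and step scores (illustrative values only).
const graph: Record<string, string[]> = {
  recon: ["enumerate SMB", "enumerate HTTP"],
  "enumerate SMB": ["exploit SMB"],
  "enumerate HTTP": ["brute-force login"],
};
const stepScores: Record<string, number> = {
  "enumerate SMB": 0.6,
  "enumerate HTTP": 0.7,
  "exploit SMB": 0.9,
  "brute-force login": 0.2,
};
const lastStep = (p: AttackPath): string => p.steps[p.steps.length - 1] ?? "";
const expandToy: Expand = (p) => graph[lastStep(p)] ?? [];
const scoreToy: ScoreStep = (_p, step) => stepScores[step] ?? 0;

const wide = beamSearch("recon", expandToy, scoreToy, 2, 3);
const greedy = beamSearch("recon", expandToy, scoreToy, 1, 3);
console.log(wide.steps.join(" > "));
console.log(greedy.steps.join(" > "));
```

In this toy graph a width of 2 keeps the SMB branch alive long enough to find the stronger chain, while a width of 1 commits to the HTTP branch after the first step. That is the trade-off the configurable width controls.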

Monte Carlo Tree Search (MCTS)

Simulation-based exploration of the potential attack surface.
Balances exploration of novel attack vectors and exploitation of known weaknesses.
Best for: Complex network penetration tests, scenarios with uncertain outcomes, advanced persistent threat (APT) simulation.
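
The exploration and exploitation balance can be sketched with a compact MCTS loop that uses UCT for selection and random rollouts for simulation. This is an illustrative sketch, not the repository's implementation: `TreeNode`, `AttackModel`, the exploration constant `c`, the `maxDepth` rollout cap, and the toy model are all assumptions.

```typescript
// One tree node: a candidate attack step plus its visit statistics.
interface TreeNode {
  step: string;
  parent: TreeNode | null;
  children: TreeNode[];
  untried: string[]; // next steps not yet expanded
  visits: number;
  totalReward: number;
}

// Hypothetical environment: proposes next steps and rewards a finished path in [0, 1].
interface AttackModel {
  nextSteps(path: string[]): string[];
  reward(path: string[]): number;
}

function pathTo(node: TreeNode): string[] {
  const path: string[] = [];
  for (let n: TreeNode | null = node; n; n = n.parent) path.unshift(n.step);
  return path;
}

function makeNode(step: string, parent: TreeNode | null, model: AttackModel): TreeNode {
  const node: TreeNode = { step, parent, children: [], untried: [], visits: 0, totalReward: 0 };
  node.untried = [...model.nextSteps(pathTo(node))]; // copy so the model's data is never mutated
  return node;
}

// UCT: mean reward plus an exploration bonus that shrinks as the child is visited more.
function uct(child: TreeNode, parentVisits: number, c: number): number {
  return child.totalReward / child.visits + c * Math.sqrt(Math.log(parentVisits) / child.visits);
}

function mcts(root: string, model: AttackModel, iterations: number, c = Math.SQRT2, maxDepth = 20): string {
  const rootNode = makeNode(root, null, model);
  for (let i = 0; i < iterations; i++) {
    // 1. Selection: descend through fully expanded nodes, picking the highest UCT child.
    let node = rootNode;
    while (node.untried.length === 0 && node.children.length > 0) {
      const parent = node;
      node = parent.children.reduce((a, b) =>
        uct(a, parent.visits, c) >= uct(b, parent.visits, c) ? a : b);
    }
    // 2. Expansion: turn one untried step into a new child node.
    if (node.untried.length > 0) {
      const idx = Math.floor(Math.random() * node.untried.length);
      const step = node.untried.splice(idx, 1)[0] as string;
      const child = makeNode(step, node, model);
      node.children.push(child);
      node = child;
    }
    // 3. Simulation: random rollout until no steps remain or the depth cap is hit.
    const path = pathTo(node);
    let options = model.nextSteps(path);
    while (options.length > 0 && path.length < maxDepth) {
      path.push(options[Math.floor(Math.random() * options.length)] as string);
      options = model.nextSteps(path);
    }
    const reward = model.reward(path);
    // 4. Backpropagation: update statistics from the new node up to the root.
    for (let n: TreeNode | null = node; n; n = n.parent) {
      n.visits += 1;
      n.totalReward += reward;
    }
  }
  if (rootNode.children.length === 0) return root; // nothing to recommend
  // Recommend the most visited first step.
  return rootNode.children.reduce((a, b) => (a.visits >= b.visits ? a : b)).step;
}

// Toy model (illustrative values only): the SMB branch ends in a high-reward exploit.
const graph: Record<string, string[]> = {
  recon: ["enumerate SMB", "enumerate HTTP"],
  "enumerate SMB": ["exploit SMB"],
  "enumerate HTTP": ["brute-force login"],
};
const toyModel: AttackModel = {
  nextSteps: (path) => graph[path[path.length - 1] ?? ""] ?? [],
  reward: (path) => (path.includes("exploit SMB") ? 1 : 0.1),
};

const firstStep = mcts("recon", toyModel, 200);
console.log(firstStep);
```

Returning the most visited child rather than the child with the highest mean reward is a common choice because visit counts are less sensitive to a few lucky rollouts.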

Algorithm Details

Attack Vector Selection
- Beam Search: Evaluates and ranks multiple potential attack paths or exploit chains.
- MCTS: Uses UCT (Upper Confidence bounds applied to Trees) to select nodes, which represent potential exploit steps, and random rollouts to simulate how an attack progresses.
Evidence/Vulnerability Scoring Based On:
- Likelihood of exploitability
- Potential impact (CIA triad)
- CVSS scores or similar metrics
- Strength of connection in an attack chain (e.g., vulnerability A enables exploit B)
Process Management
- Tree-based state tracking of attack progression
- Statistical analysis of successful/failed simulated attack paths
- Progress monitoring against pentest objectives
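
One simple way to combine the scoring factors listed above is a weighted sum over normalized inputs. The sketch below is an assumption for illustration only: the `StepEvidence` fields, the weights, and the linear combination are hypothetical and may differ from how the server actually scores steps.

```typescript
// Hypothetical evidence for one candidate step; field names are illustrative.
interface StepEvidence {
  exploitability: number; // likelihood of exploitability, 0..1
  impact: number;         // potential impact on confidentiality, integrity, availability, 0..1
  cvss: number;           // CVSS base score, 0..10
  chainStrength: number;  // how strongly the previous step enables this one, 0..1
}

// Assumed weights that sum to 1; tune them for your own engagement.
const WEIGHTS = { exploitability: 0.35, impact: 0.25, cvss: 0.2, chainStrength: 0.2 };

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Weighted sum of the normalized factors; the result lies in [0, 1].
function scoreStep(e: StepEvidence): number {
  return (
    WEIGHTS.exploitability * clamp01(e.exploitability) +
    WEIGHTS.impact * clamp01(e.impact) +
    WEIGHTS.cvss * clamp01(e.cvss / 10) +
    WEIGHTS.chainStrength * clamp01(e.chainStrength)
  );
}

// Rank two illustrative candidates, highest score first.
const candidates: Array<{ label: string; evidence: StepEvidence }> = [
  { label: "brute-force login", evidence: { exploitability: 0.2, impact: 0.5, cvss: 5, chainStrength: 0.3 } },
  { label: "exploit SMB", evidence: { exploitability: 0.9, impact: 0.9, cvss: 8, chainStrength: 0.8 } },
];
const ranked = [...candidates].sort((a, b) => scoreStep(b.evidence) - scoreStep(a.evidence));
console.log(ranked.map((c) => c.label).join(", "));
```

A score like this could feed either search strategy, for example as the per-step score in beam search or as the rollout reward in MCTS.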

Use Cases

Automated vulnerability identification and chaining
Exploit pathfinding and optimization
Attack scenario simulation and "what-if" analysis
Red teaming strategy development and refinement
Assisting in manual pentesting by suggesting potential avenues
Decision tree exploration for complex attack vectors
Strategy optimization for achieving specific pentest goals (e.g., data exfiltration, privilege escalation)

License

MIT
