Doris MCP服务器
STDIO用于Apache Doris数据库交互的MCP服务
用于Apache Doris数据库交互的MCP服务
Doris MCP (Model Control Panel) Server is a backend service built with Python and FastAPI. It implements the MCP (Model Control Panel) protocol, allowing clients to interact with it through defined "Tools". It's primarily designed to connect to Apache Doris databases, potentially leveraging Large Language Models (LLMs) for tasks like converting natural language queries to SQL (NL2SQL), executing queries, and performing metadata management and analysis.
/sse
(initialization) and /mcp/messages
(communication) endpoints (src/sse_server.py
)./mcp
endpoint, supporting request/response and streaming (src/streamable_server.py
).src/stdio_server.py
), requires specific startup configuration.mcp_doris_exec_query
)mcp_doris_get_db_list
, mcp_doris_get_db_table_list
)mcp_doris_get_table_schema
, mcp_doris_get_table_comment
, mcp_doris_get_table_column_comments
, mcp_doris_get_table_indexes
)mcp_doris_get_recent_audit_logs
)
Note: Current tools primarily focus on direct DB operations.src/utils/db.py
)..env
file, supporting settings for database connections, LLM providers/models, API keys, logging levels, etc.src/utils/schema_extractor.py
).# Replace with the actual repository URL if different git clone https://github.com/apache/doris-mcp-server.git cd doris-mcp-server
pip install -r requirements.txt
Copy the .env.example
file to .env
and modify the settings according to your environment:
cp env.example .env
Key Environment Variables:
DB_HOST
: Database hostnameDB_PORT
: Database port (default 9030)DB_USER
: Database usernameDB_PASSWORD
: Database passwordDB_DATABASE
: Default database nameSERVER_HOST
: Host address the server listens on (default 0.0.0.0
)SERVER_PORT
: Port the server listens on (default 3000
)ALLOWED_ORIGINS
: CORS allowed origins (comma-separated, *
allows all)MCP_ALLOW_CREDENTIALS
: Whether to allow CORS credentials (default false
)LOG_DIR
: Directory for log files (default ./logs
)LOG_LEVEL
: Log level (e.g., INFO
, DEBUG
, WARNING
, ERROR
, default INFO
)CONSOLE_LOGGING
: Whether to output logs to the console (default false
)The following table lists the main tools currently available for invocation via an MCP client:
Tool Name | Description | Parameters | Status |
---|---|---|---|
mcp_doris_get_db_list | Get a list of all database names on the server. | random_string (string, Required) | ✅ Active |
mcp_doris_get_db_table_list | Get a list of all table names in the specified database. | random_string (string, Required), db_name (string, Optional, defaults to current db) | ✅ Active |
mcp_doris_get_table_schema | Get detailed structure of the specified table. | random_string (string, Required), table_name (string, Required), db_name (string, Optional) | ✅ Active |
mcp_doris_get_table_comment | Get the comment for the specified table. | random_string (string, Required), table_name (string, Required), db_name (string, Optional) | ✅ Active |
mcp_doris_get_table_column_comments | Get comments for all columns in the specified table. | random_string (string, Required), table_name (string, Required), db_name (string, Optional) | ✅ Active |
mcp_doris_get_table_indexes | Get index information for the specified table. | random_string (string, Required), table_name (string, Required), db_name (string, Optional) | ✅ Active |
mcp_doris_exec_query | Execute SQL query and return result command. | random_string (string, Required), sql (string, Required), db_name (string, Optional), max_rows (integer, Optional, default 100), timeout (integer, Optional, default 30) | ✅ Active |
mcp_doris_get_recent_audit_logs | Get audit log records for a recent period. | random_string (string, Required), days (integer, Optional, default 7), limit (integer, Optional, default 100) | ✅ Active |
Note: All tools require a random_string
parameter as a call identifier, typically handled automatically by the MCP client. "Optional" and "Required" refer to the tool's internal logic; the client might need to provide values for all parameters depending on its implementation. The tool names listed here are the base names; clients might see them prefixed (e.g., mcp_doris_stdio3_get_db_list
) depending on the connection mode.
If you use SSE mode, execute the following command:
./start_server.sh
This command starts the FastAPI application, providing both SSE and Streamable HTTP MCP services by default.
Service Endpoints:
http://<host>:<port>/sse
http://<host>:<port>/mcp/messages
(POST)http://<host>:<port>/mcp
(Supports GET, POST, DELETE, OPTIONS)http://<host>:<port>/health
http://<host>:<port>/status
(Confirm if implemented in main.py
)Interaction with the Doris MCP Server requires an MCP Client. The client connects to the server's SSE or Streamable HTTP endpoints and sends requests (like tool_call
) according to the MCP specification to invoke the server's tools.
Main Interaction Flow:
/sse
(SSE) or send an initialize
method call to /mcp
(Streamable).mcp/listTools
or mcp/listOfferings
to get the list of supported tools, their descriptions, and parameter schemas.tool_call
message/request, specifying the tool_name
and arguments
.
tool_name
: mcp_doris_get_table_schema
(or the mode-specific name)arguments
: Include random_string
, table_name
, db_name
.result
or error
.tools/progress
notifications, followed by a final response containing the result
or error
.Specific tool names and parameters should be referenced from the src/tools/
code or obtained via MCP discovery mechanisms.
You can connect Cursor to this MCP server using either Stdio or SSE mode.
Stdio mode allows Cursor to manage the server process directly. Configuration is done within Cursor's MCP Server settings file (typically ~/.cursor/mcp.json
or similar).
If you use stdio mode, please execute the following command to download and build the environment dependency package, but please note that you need to change the project path to the correct path address:
uv --project /your/path/doris-mcp-server run doris-mcp
Configure Cursor: Add an entry like the following to your Cursor MCP configuration:
{ "mcpServers": { "doris-stdio": { "command": "uv", "args": ["--project", "/path/to/your/doris-mcp-server", "run", "doris-mcp"], "env": { "DB_HOST": "127.0.0.1", "DB_PORT": "9030", "DB_USER": "root", "DB_PASSWORD": "your_db_password", "DB_DATABASE": "your_default_db" } }, // ... other server configurations ... } }
Key Points:
/path/to/your/doris-mcp
with the actual absolute path to the project's root directory on your system. The --project
argument is crucial for uv
to find the pyproject.toml
and run the correct command.command
is set to uv
(assuming you use uv
for package management as indicated by uv.lock
). The args
include --project
, the path, run
, and mcp-doris
(which should correspond to a script defined in your pyproject.toml
).DB_HOST
, DB_PORT
, DB_USER
, DB_PASSWORD
, DB_DATABASE
) are set directly in the env
block within the configuration file. Cursor will pass these to the server process. No .env
file is needed for this mode when configured via Cursor.SSE mode requires you to run the MCP server independently first, and then tell Cursor how to connect to it.
Configure .env
: Ensure your database credentials and any other necessary settings (like SERVER_PORT
if not using the default 3000) are correctly configured in the .env
file within the project directory.
Start the Server: Run the server from your terminal in the project's root directory:
./start_server.sh
This script typically reads the .env
file and starts the FastAPI server in SSE mode (check the script and sse_server.py
/ main.py
for specifics). Note the host and port the server is listening on (default is 0.0.0.0:3000
).
Configure Cursor: Add an entry like the following to your Cursor MCP configuration, pointing to the running server's SSE endpoint:
{ "mcpServers": { "doris-sse": { "url": "http://127.0.0.1:3000/sse" // Adjust host/port if your server runs elsewhere }, // ... other server configurations ... } }
Note: The example uses the default port 3000
. If your server is configured to run on a different port (like 3010
in the user example), adjust the URL accordingly.
After configuring either mode in Cursor, you should be able to select the server (e.g., doris-stdio
or doris-sse
) and use its tools.
doris-mcp-server/
├── doris_mcp_server/ # Source code for the MCP server
│ ├── main.py # Main entry point, FastAPI app definition
│ ├── mcp_core.py # Core MCP tool registration and Stdio handling
│ ├── sse_server.py # SSE server implementation
│ ├── streamable_server.py # Streamable HTTP server implementation
│ ├── config.py # Configuration loading
│ ├── tools/ # MCP tool definitions
│ │ ├── mcp_doris_tools.py # Main Doris-related MCP tools
│ │ ├── tool_initializer.py # Tool registration helper (used by mcp_core.py)
│ │ └── __init__.py
│ ├── utils/ # Utility classes and helper functions
│ │ ├── db.py # Database connection and operations
│ │ ├── logger.py # Logging configuration
│ │ ├── schema_extractor.py # Doris metadata/schema extraction logic
│ │ ├── sql_executor_tools.py # SQL execution helper (might be legacy)
│ │ └── __init__.py
│ └── __init__.py
├── logs/ # Log file directory (if file logging enabled)
├── README.md # This file
├── .env.example # Example environment variable file
├── requirements.txt # Python dependencies for pip
├── pyproject.toml # Project metadata and build system configuration (PEP 518)
├── uv.lock # Lock file for 'uv' package manager (alternative to pip)
├── start_server.sh # Script to start the server
└── restart_server.sh # Script to restart the server
This section outlines the process for adding new MCP tools to the Doris MCP Server, considering the current project structure.
Before writing new database interaction logic from scratch, check the existing utility modules:
doris_mcp_server/utils/db.py
: Provides basic functions for getting database connections (get_db_connection
) and executing raw queries (execute_query
, execute_query_df
).doris_mcp_server/utils/schema_extractor.py
(MetadataExtractor
class): Offers high-level methods to retrieve database metadata, such as listing databases/tables (get_all_databases
, get_database_tables
), getting table schemas/comments/indexes (get_table_schema
, get_table_comment
, get_column_comments
, get_table_indexes
), and accessing audit logs (get_recent_audit_logs
). It includes caching mechanisms.doris_mcp_server/utils/sql_executor_tools.py
(execute_sql_query
function): Provides a wrapper around db.execute_query
that includes security checks (optional, controlled by ENABLE_SQL_SECURITY_CHECK
env var), adds automatic LIMIT
to SELECT queries, handles result serialization (dates, decimals), and formats the output into the standard MCP success/error structure. It's recommended to use this for executing user-provided or generated SQL.You can import and combine functionalities from these modules to build your new tool.
Implement the core logic for your new tool as an async
function within doris_mcp_server/tools/mcp_doris_tools.py
. This keeps the primary tool implementations centralized. Ensure your function returns data in a format that can be easily wrapped into the standard MCP response structure (see _format_response
in the same file for reference).
Example: Let's create a simple tool get_server_time
.
# In doris_mcp_server/tools/mcp_doris_tools.py import datetime # ... other imports ... from doris_mcp_server.tools.mcp_doris_tools import _format_response # Reuse formatter # ... existing tools ... async def mcp_doris_get_server_time() -> Dict[str, Any]: """Gets the current server time.""" logger.info(f"MCP Tool Call: mcp_doris_get_server_time") try: current_time = datetime.datetime.now().isoformat() # Use the existing formatter for consistency return _format_response(success=True, result={"server_time": current_time}) except Exception as e: logger.error(f"MCP tool execution failed mcp_doris_get_server_time: {str(e)}", exc_info=True) return _format_response(success=False, error=str(e), message="Error getting server time")
Due to the separate handling of SSE/Streamable and Stdio modes, you need to register the tool in two places:
A. SSE/Streamable Registration (tool_initializer.py
)
mcp_doris_tools.py
.register_mcp_tools
function, add a new wrapper function decorated with @mcp.tool()
.random_string
parameter description for client compatibility, even if your wrapper doesn't explicitly use it.Example (tool_initializer.py
):
# In doris_mcp_server/tools/tool_initializer.py # ... other imports ... from doris_mcp_server.tools.mcp_doris_tools import ( # ... existing tool imports ... mcp_doris_get_server_time # <-- Import the new tool ) async def register_mcp_tools(mcp): # ... existing tool registrations ... # Register Tool: Get Server Time @mcp.tool("get_server_time", description="""[Function Description]: Get the current time of the MCP server.\n [Parameter Content]:\n - random_string (string) [Required] - Unique identifier for the tool call\n""") async def get_server_time_tool() -> Dict[str, Any]: """Wrapper: Get server time""" # Note: No parameters needed for the core function call here return await mcp_doris_get_server_time() # ... logging registration count ...
B. Stdio Registration (mcp_core.py
)
@stdio_mcp.tool()
.mcp_doris_get_server_time
) inside the wrapper function (delayed import pattern used in this file).async def
depending on how FastMCP
handles tools in Stdio mode, even if the underlying function is simple (as seen in the current file structure). Ensure the call matches (e.g., use await
if calling an async function).Example (mcp_core.py
):
# In doris_mcp_server/mcp_core.py # ... other imports and setup ... # ... existing Stdio tool registrations ... # Register Tool: Get Server Time (for Stdio) @stdio_mcp.tool("get_server_time", description="""[Function Description]: Get the current time of the MCP server.\n [Parameter Content]:\n - random_string (string) [Required] - Unique identifier for the tool call\n""") async def get_server_time_tool_stdio() -> Dict[str, Any]: # Using a slightly different wrapper name for clarity if needed """Wrapper: Get server time (Stdio)""" from doris_mcp_server.tools.mcp_doris_tools import mcp_doris_get_server_time # <-- Delayed import # Assuming the Stdio runner handles async wrappers correctly return await mcp_doris_get_server_time() # --- Register Tools --- (Or wherever the registrations are finalized)
After implementing and registering the tool in both files, restart the MCP server (both SSE mode via ./start_server.sh
and ensure the Stdio command used by Cursor is updated if necessary) and test the new tool using your MCP client (like Cursor) in both connection modes.
Contributions are welcome via Issues or Pull Requests.
This project is licensed under the Apache 2.0 License. See the LICENSE file (if it exists) for details.