
Data Product
MCP server for discovering data products and executing queries with governance in Data Mesh Manager.
A Model Context Protocol (MCP) server for discovering data products and requesting access in Data Mesh Manager, and executing queries on the data platform to access business data.
https://github.com/user-attachments/assets/8c8cd04d-33f6-4e33-856f-6141a41af2bb
Idea: Enable AI agents to find and access any data product for semantic business context while enforcing data governance policies.
or, if you prefer:
Enable AI to answer any business question.
Data Products are managed, high-quality business data sets shared with other teams within an organization and specified by data contracts. Data contracts describe the structure, semantics, quality, and terms of use. Data products provide the semantic context AI needs to understand not just what data exists, but what it means and how to use it correctly. We use Data Mesh Manager as a data product marketplace to search for available data products and evaluate whether they are relevant for the task by analyzing their metadata.
Once a data product is identified, data governance plays a crucial role: it ensures that access to data products is controlled, that queries are in line with the data contract's terms of use, and that global organizational policies are complied with. If necessary, the AI agent can request access to the data product's output port, which may require manual approval from the data product owner.
Finally, the LLM can generate SQL queries based on the data contract's data model descriptions and semantics. The SQL queries are executed, with security guardrails in place to ensure that no sensitive data is misused and that attack vectors (such as prompt injection) are mitigated. The results are returned to the AI agent, which can then use them to answer the original business question.
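To illustrate the kind of security guardrail described above, here is a minimal sketch of a read-only SQL check. This is not the actual dataproduct_mcp implementation, just an example of rejecting stacked statements and write operations before execution:

```python
import re

# Illustrative guardrail, not the project's real implementation:
# allow only a single, read-only SELECT statement per request.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|grant|truncate|merge)\b",
    re.IGNORECASE,
)

def is_query_allowed(sql: str) -> bool:
    """Return True only for a single read-only SELECT statement."""
    statements = [s for s in sql.split(";") if s.strip()]
    if len(statements) != 1:
        return False  # stacked statements are a common injection vector
    stmt = statements[0].strip()
    if not stmt.lower().startswith("select"):
        return False
    return not FORBIDDEN.search(stmt)

print(is_query_allowed("SELECT customer_id, revenue FROM orders"))  # True
print(is_query_allowed("SELECT 1; DROP TABLE orders"))              # False
```

A real guardrail would parse the SQL dialect properly rather than match keywords, but the principle is the same: validate the generated query before it ever reaches the data platform.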
Steps:
1. Discovery: Search Data Mesh Manager for relevant data products and evaluate their metadata.
2. Governance: Check the data contract's terms of use and, if necessary, request access to the data product's output port.
3. Query: Generate SQL from the data contract and execute it on the data platform, with security guardrails in place.
Data Mesh Manager serves as the central data product marketplace and governance layer, providing metadata, access controls, and data contracts for all data products in your organization.
Data Platforms (Snowflake, Databricks, etc.) host the actual data and execute queries. The MCP server connects to these platforms to run SQL queries against the data products you have access to.
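A typical agent session chains the server's tools in order. The sketch below shows the flow; `call_tool` is a stand-in for whatever your MCP client uses, and all IDs and the SQL are illustrative only:

```python
# Sketch of a typical agent workflow across the four tools.
# call_tool() is a placeholder, not the real MCP client API.
def call_tool(name: str, **arguments) -> dict:
    # A real client would forward this call to the MCP server.
    return {"tool": name, "arguments": arguments}

# 1. Discover candidate data products.
hits = call_tool("dataproduct_search", search_term="orders revenue")

# 2. Inspect one product's metadata and data contract.
product = call_tool("dataproduct_get", data_product_id="orders")

# 3. Request access to an output port, stating the business purpose.
grant = call_tool(
    "dataproduct_request_access",
    data_product_id="orders",
    output_port_id="snowflake",
    purpose="Monthly revenue reporting for the finance team",
)

# 4. Execute a SQL query once access is granted.
result = call_tool(
    "dataproduct_query",
    data_product_id="orders",
    output_port_id="snowflake",
    query="SELECT order_month, SUM(revenue) FROM orders GROUP BY order_month",
)
print(result["tool"])  # dataproduct_query
```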
Tools:

dataproduct_search
- search_term (string): Search term to filter data products. Searches in the id, title, and description. Multiple search terms are supported, separated by space.

dataproduct_get
- data_product_id (string): The data product ID.

dataproduct_request_access
- data_product_id (string): The data product ID.
- output_port_id (string): The output port ID.
- purpose (string): The specific purpose of what the user is doing with the data and the reason why they need access. If the access request needs to be approved by the data owner, the purpose is used by the data owner to decide whether the access is eligible from a business, technical, and governance point of view.

dataproduct_query
- data_product_id (string): The data product ID.
- output_port_id (string): The output port ID.
- query (string): The SQL query to execute.

Add this entry to your MCP client configuration:
```json
{
  "mcpServers": {
    "dataproduct": {
      "command": "uvx",
      "args": ["dataproduct_mcp"],
      "env": {
        "DATAMESH_MANAGER_API_KEY": "dmm_live_user_...",
        "SNOWFLAKE_USER": "",
        "SNOWFLAKE_PASSWORD": "",
        "SNOWFLAKE_ROLE": "",
        "SNOWFLAKE_WAREHOUSE": "COMPUTE_WH",
        "DATABRICKS_HOST": "adb-xxx.azuredatabricks.net",
        "DATABRICKS_HTTP_PATH": "/sql/1.0/warehouses/xxx",
        "DATABRICKS_CLIENT_ID": "",
        "DATABRICKS_CLIENT_SECRET": ""
      }
    }
  }
}
```
This is the format for Claude Desktop (`~/Library/Application Support/Claude/claude_desktop_config.json`); other MCP clients have similar config options.
In Data Mesh Manager, create an API Key with scope "User (personal access token)".
Add the properties for Snowflake, Databricks, etc. as needed.
(Yes, we will work on OAuth2-based authentication to get rid of these access tokens.)
The `dataproduct_query` tool supports executing queries on data products. The MCP client formulates SQL queries based on the data contract's data model structure and semantics.
The following server types are currently supported out-of-the-box:
| Server Type | Status | Notes |
|---|---|---|
| Snowflake | ✅ | Requires `SNOWFLAKE_USER`, `SNOWFLAKE_PASSWORD`, `SNOWFLAKE_WAREHOUSE`, `SNOWFLAKE_ROLE` environment variables |
| Databricks | ✅ | Requires `DATABRICKS_HOST`, `DATABRICKS_HTTP_PATH`, `DATABRICKS_CLIENT_ID`, `DATABRICKS_CLIENT_SECRET` environment variables |
| S3 | Coming soon | Implemented through DuckDB client |
| BigQuery | Coming soon | |
| Fabric | Coming soon | |
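The environment variables in the table can be sanity-checked before starting the server. This is a hypothetical helper, not part of dataproduct_mcp's API, mapping each supported server type to the variables it requires:

```python
import os

# Hypothetical configuration check -- not part of dataproduct_mcp's API.
# Maps each supported server type to its required environment variables.
REQUIRED_ENV = {
    "snowflake": ["SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD",
                  "SNOWFLAKE_WAREHOUSE", "SNOWFLAKE_ROLE"],
    "databricks": ["DATABRICKS_HOST", "DATABRICKS_HTTP_PATH",
                   "DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET"],
}

def missing_env(server_type: str) -> list[str]:
    """Return the required variables that are not set for a server type."""
    return [v for v in REQUIRED_ENV.get(server_type, []) if not os.getenv(v)]

print(missing_env("snowflake"))  # lists any unset Snowflake variables
```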
Note: Use additional platform-specific MCP servers for other data platform types (e.g., BigQuery, Redshift, PostgreSQL) by adding them to your MCP client.
See CONTRIBUTING.md for development setup and contribution guidelines.
Maintained by Simon Harrer, André Deuerling, and Jochen Christ.