Elasticsearch Semantic Search
A semantic search service for blog posts, built on Elasticsearch
Demo repo for: https://j.blaszyk.me/tech-blog/mcp-server-elasticsearch-semantic-search/
This repository provides a Python implementation of an MCP server for semantic search through Search Labs blog posts indexed in Elasticsearch.
It assumes you've crawled the blog posts and stored them in the search-labs-posts
index using Elastic Open Crawler.
Add ES_URL and ES_API_KEY to the .env file (see the API key step further down for generating a key with minimum permissions).
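Before starting the server, it can help to confirm that the .env file actually defines both settings. The following is a minimal sketch using only the standard library; the helper names parse_env and missing_keys are hypothetical, and the variable names ES_URL and ES_API_KEY are assumed to match your .env file.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: dict[str, str], required: tuple[str, ...]) -> list[str]:
    """Return the required names that are absent or empty."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    # Assumed variable names; adjust them if your .env uses different ones.
    print("Missing settings:", missing_keys(env, ("ES_URL", "ES_API_KEY")))
```

An empty list means both settings are present; it does not verify that the URL is reachable or that the key is valid.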
Start the server in MCP Inspector:
make dev
Once running, access the MCP Inspector at: http://localhost:5173
To add the MCP server to Claude Desktop:
make install-claude-config
This updates claude_desktop_config.json
in your home directory. On the next restart, the Claude app will detect the server and load the declared tool.
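Claude Desktop reads server definitions from an mcpServers object in claude_desktop_config.json. The sketch below shows how such an entry could be merged into an existing config without clobbering other servers. It is not the Makefile's actual logic; the server name, command, and arguments in the usage example are hypothetical placeholders.

```python
import copy
import json


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of the config with one server entry added or replaced."""
    updated = copy.deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    return updated


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["index.js"]}}}
    # Hypothetical name, command, and arguments, for illustration only.
    merged = add_mcp_server(existing, "search-labs", "uv", ["run", "server.py"])
    print(json.dumps(merged, indent=2))
```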
To check if the Elastic Open Crawler works, run:
docker run --rm \
  --entrypoint /bin/bash \
  -v "$(pwd)/crawler-config:/app/config" \
  --network host \
  docker.elastic.co/integrations/crawler:latest \
  -c "bin/crawler crawl config/test-crawler.yml"
This should print crawled content from a single page.
Set up Elasticsearch URL and API Key.
Generate an API key with minimum crawler permissions:
POST /_security/api_key
{
  "name": "crawler-search-labs",
  "role_descriptors": {
    "crawler-search-labs-role": {
      "cluster": ["monitor"],
      "indices": [
        {
          "names": ["search-labs-posts"],
          "privileges": ["all"]
        }
      ]
    }
  },
  "metadata": {
    "application": "crawler"
  }
}
Copy the encoded value from the response and set it as API_KEY.
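The encoded field in the API key response is the Base64 encoding of the key's id and api_key joined by a colon, and it is sent to Elasticsearch in an Authorization header with the ApiKey scheme. The sketch below illustrates this relationship; the function names are hypothetical, and the id and secret in the usage example are made-up placeholders.

```python
import base64


def encode_api_key(key_id: str, api_key: str) -> str:
    """Base64-encode "id:api_key", the same form as the `encoded` response field."""
    return base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")


def auth_header(encoded: str) -> dict[str, str]:
    """Build the HTTP header Elasticsearch expects for API key authentication."""
    return {"Authorization": f"ApiKey {encoded}"}


if __name__ == "__main__":
    # Placeholder values; use the id and api_key from your own response.
    encoded = encode_api_key("example-id", "example-secret")
    print(auth_header(encoded))
```

In practice you can copy encoded directly from the response; computing it yourself is only needed if you stored the id and api_key separately.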
Ensure the search-labs-posts
index exists. If not, create it:
PUT search-labs-posts
Update the mapping to enable semantic search:
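PUT search-labs-posts/_mappings
{
  "properties": {
    "body": {
      "type": "text",
      "copy_to": "semantic_body"
    },
    "semantic_body": {
      "type": "semantic_text",
      "inference_id": ".elser-2-elasticsearch"
    }
  }
}
The body field stays a regular text field, and its content is copied into semantic_body, a semantic_text field that is embedded with Elasticsearch's ELSER model through the .elser-2-elasticsearch inference endpoint.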
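Once the mapping is in place, the semantic_body field can be searched with Elasticsearch's semantic query. The sketch below builds such a request body and sends it with the standard library. It is an illustration rather than the repository's server code: the function names and the default result size are assumptions.

```python
import json
import urllib.request


def build_semantic_query(text: str, size: int = 5) -> dict:
    """Build a _search body that runs a semantic query against semantic_body."""
    return {
        "size": size,
        "query": {"semantic": {"field": "semantic_body", "query": text}},
    }


def search(es_url: str, encoded_api_key: str, text: str,
           index: str = "search-labs-posts") -> dict:
    """POST the query to <es_url>/<index>/_search and return the parsed JSON."""
    request = urllib.request.Request(
        url=f"{es_url.rstrip('/')}/{index}/_search",
        data=json.dumps(build_semantic_query(text)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {encoded_api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling search requires a reachable cluster and a key with read access to the index; build_semantic_query can be inspected on its own to see the request shape.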
Run the crawler to populate the index:
docker run --rm \
  --entrypoint /bin/bash \
  -v "$(pwd)/crawler-config:/app/config" \
  --network host \
  docker.elastic.co/integrations/crawler:latest \
  -c "bin/crawler crawl config/elastic-search-labs-crawler.yml"
Tip: If you are using a fresh Elasticsearch cluster, wait for the ELSER model to start before indexing.
Check if the documents were indexed:
GET search-labs-posts/_count
This will return the total document count in the index. You can also verify in Kibana.
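The same check can be scripted. The sketch below calls the _count endpoint with the standard library and extracts the count field from the JSON response; the function names are hypothetical, and the sample payload in the usage example is illustrative, not real output.

```python
import json
import urllib.request


def parse_count(response_body: str) -> int:
    """Extract the document count from a _count API response body."""
    return int(json.loads(response_body)["count"])


def count_documents(es_url: str, encoded_api_key: str,
                    index: str = "search-labs-posts") -> int:
    """GET <es_url>/<index>/_count and return the number of documents."""
    request = urllib.request.Request(
        url=f"{es_url.rstrip('/')}/{index}/_count",
        headers={"Authorization": f"ApiKey {encoded_api_key}"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return parse_count(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Illustrative payload in the shape the _count API returns.
    sample = '{"count": 0, "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0}}'
    print(parse_count(sample))
```

A count of zero after a crawl usually means the crawler did not write to the index, so the crawler output is the first place to look.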
Done! You can now perform semantic searches on Search Labs blog posts.