icon for mcp server

OpenAlex作者消歧

STDIO

基于OpenAlex的作者消歧与学术研究MCP服务器

OpenAlex MCP Server

OpenAlex Author Disambiguation MCP Server

MCP Python OpenAlex License Optimized

A streamlined Model Context Protocol (MCP) server for author disambiguation and academic research using the OpenAlex.org API. Specifically designed for AI agents with optimized data structures and enhanced functionality.


🎯 Key Features

🔍 Core Capabilities

  • Advanced Author Disambiguation: Handles complex career transitions and name variations
  • Institution Resolution: Current and past affiliations with transition tracking
  • Academic Work Retrieval: Journal articles, letters, and research papers
  • Citation Analysis: H-index, citation counts, and impact metrics
  • ORCID Integration: Highest accuracy matching with ORCID identifiers

🚀 AI Agent Optimized

  • Streamlined Data: Focused on essential information for disambiguation
  • Fast Processing: Optimized data structures for rapid analysis
  • Smart Filtering: Enhanced filtering options for targeted queries
  • Clean Output: Structured responses optimized for AI reasoning

🤖 Agent Integration

  • Multiple Candidates: Ranked results for automated decision-making
  • Structured Responses: Clean, parseable output optimized for LLMs
  • Error Handling: Graceful degradation with informative messages
  • Enhanced Filtering: Journal-only, citation thresholds, and temporal filters

🏛️ Professional Grade

  • MCP Best Practices: Built with FastMCP following official guidelines
  • Tool Annotations: Proper MCP tool annotations for optimal client integration
  • Resource Management: Efficient HTTP client management and cleanup
  • Rate Limiting: Respectful API usage with proper delays

🚀 Quick Start

Prerequisites

  • Python 3.10 or higher
  • MCP-compatible client (e.g., Claude Desktop)
  • Email address (for OpenAlex API courtesy)

Installation

For detailed installation instructions, see INSTALL.md.

  1. Clone the repository:

    git clone https://github.com/drAbreu/alex-mcp.git cd alex-mcp
  2. Create a virtual environment:

    python3 -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
  3. Install the package:

    pip install -e .
  4. Configure environment:

    export OPENALEX_MAILTO=[email protected]
  5. Run the server:

    ./run_alex_mcp.sh # Or, if installed as a CLI tool: alex-mcp

⚙️ MCP Configuration

Claude Desktop Configuration

Add to your Claude Desktop configuration file:

{ "mcpServers": { "alex-mcp": { "command": "/path/to/alex-mcp/run_alex_mcp.sh", "env": { "OPENALEX_MAILTO": "[email protected]" } } } }

Replace /path/to/alex-mcp with the actual path to the repository on your system.


🤖 Using with AI Agents

OpenAI Agents Integration

You can load this MCP server in your OpenAI agent workflow using the agents.mcp.MCPServerStdio interface:

from agents.mcp import MCPServerStdio async with MCPServerStdio( name="OpenAlex MCP For Author disambiguation and works", cache_tools_list=True, params={ "command": "uvx", "args": [ "--from", "git+https://github.com/drAbreu/[email protected]", "alex-mcp" ], "env": { "OPENALEX_MAILTO": "[email protected]" } }, client_session_timeout_seconds=10 ) as alex_mcp: await alex_mcp.connect() tools = await alex_mcp.list_tools() print(f"Available tools: {[tool.name for tool in tools]}")

Academic Research Agent Integration

This MCP server is specifically optimized for academic research workflows:

# Optimized for academic research workflows from alex_agent import run_author_research # Enhanced functionality with streamlined data result = await run_author_research( "Find J. Abreu at EMBO with recent publications" ) # Clean, structured output for AI processing print(f"Success: {result['workflow_metadata']['success']}") print(f"Quality: {result['research_result']['metadata']['result_analysis']['quality_score']}/100")

Direct Launch with uvx

# Standard launch uvx --from git+https://github.com/drAbreu/[email protected] alex-mcp # With environment variables OPENALEX_MAILTO=[email protected] uvx --from git+https://github.com/drAbreu/[email protected] alex-mcp

🛠️ Available Tools

1. autocomplete_authors ⭐ NEW

Get multiple author candidates using OpenAlex autocomplete API for intelligent disambiguation.

Parameters:

  • name (required): Author name to search (e.g., "James Briscoe", "M. Ralser")
  • context (optional): Context for disambiguation (e.g., "Francis Crick Institute developmental biology")
  • limit (optional): Maximum candidates (1-10, default: 5)

Key Features:

  • Fast: ~200ms response time
  • 🎯 Smart: Multiple candidates with institutional hints
  • 🧠 AI-Ready: Perfect for context-based selection
  • 📊 Rich: Works count, citations, institution info

Streamlined Output:

{ "query": "James Briscoe", "context": "Francis Crick Institute", "total_candidates": 3, "candidates": [ { "openalex_id": "https://openalex.org/A5019391436", "display_name": "James Briscoe", "institution_hint": "The Francis Crick Institute, UK", "works_count": 415, "cited_by_count": 24623, "external_id": "https://orcid.org/0000-0002-1020-5240" } ] }

Usage Pattern:

# Get multiple candidates for disambiguation candidates = await autocomplete_authors( "James Briscoe", context="Francis Crick Institute developmental biology" ) # AI selects best match based on institutional context # Much more accurate than single search result!

2. search_authors

Search for authors with streamlined output for AI agents.

Parameters:

  • name (required): Author name to search
  • institution (optional): Institution name filter
  • topic (optional): Research topic filter
  • country_code (optional): Country code filter (e.g., "US", "DE")
  • limit (optional): Maximum results (1-25, default: 20)

Streamlined Output:

{ "query": "J. Abreu", "total_count": 3, "results": [ { "id": "https://openalex.org/A123456789", "display_name": "Jorge Abreu-Vicente", "orcid": "https://orcid.org/0000-0000-0000-0000", "display_name_alternatives": ["J. Abreu-Vicente", "Jorge Abreu Vicente"], "affiliations": [ { "institution": { "display_name": "European Molecular Biology Organization", "country_code": "DE" }, "years": [2023, 2024, 2025] } ], "cited_by_count": 316, "works_count": 25, "summary_stats": { "h_index": 9, "i10_index": 5 }, "x_concepts": [ { "display_name": "Astrophysics", "score": 0.8 }, { "display_name": "Machine Learning", "score": 0.6 } ] } ] }

Features: Clean structure optimized for AI reasoning and disambiguation


2. retrieve_author_works

Retrieve works for a given author with enhanced filtering capabilities.

Parameters:

  • author_id (required): OpenAlex author ID
  • limit (optional): Maximum results (1-50, default: 20)
  • order_by (optional): "date" or "citations" (default: "date")
  • publication_year (optional): Filter by specific year
  • type (optional): Work type filter (e.g., "journal-article")
  • authorships_institutions_id (optional): Filter by institution
  • is_retracted (optional): Filter retracted works
  • open_access_is_oa (optional): Filter by open access status

Enhanced Output:

{ "author_id": "https://openalex.org/A123456789", "total_count": 25, "results": [ { "id": "https://openalex.org/W123456789", "title": "A platform for the biomedical application of large language models", "doi": "10.1038/s41587-024-02534-3", "publication_year": 2025, "type": "journal-article", "cited_by_count": 42, "authorships": [ { "author": { "display_name": "Jorge Abreu-Vicente" }, "institutions": [ { "display_name": "European Molecular Biology Organization" } ] } ], "locations": [ { "source": { "display_name": "Nature Biotechnology", "type": "journal" } } ], "open_access": { "is_oa": true }, "primary_topic": { "display_name": "Biomedical Engineering" } } ] }

Features: Comprehensive work data with flexible filtering for targeted queries


📊 Data Optimization

Focused Information Architecture

This MCP server provides focused, structured data specifically designed for AI agent consumption:

Author Data Features

  • Identity Resolution: Names, ORCID, alternatives for disambiguation
  • Affiliation Tracking: Current and historical institutional connections
  • Impact Metrics: Citation counts, h-index, and scholarly impact
  • Research Context: Fields, concepts, and domain expertise
  • Career Analysis: Temporal affiliation changes and transitions

Work Data Features

  • Publication Metadata: Title, DOI, venue, and publication details
  • Impact Assessment: Citation counts and scholarly influence
  • Access Information: Open access status and availability
  • Authorship Details: Complete author lists and institutional affiliations
  • Research Classification: Topics, concepts, and domain categorization

Enhanced Filtering

# Target high-impact journal articles works = await retrieve_author_works( author_id="https://openalex.org/A123456789", type="journal-article", # Focus on journal publications open_access_is_oa=True, # Open access only order_by="citations", # Most cited first limit=15 ) # Career transition analysis authors = await search_authors( name="J. Abreu", institution="EMBO", # Current institution topic="Machine Learning", # Research focus limit=10 )

🧪 Example Usage

Author Disambiguation

from alex_mcp.server import search_authors_core # Comprehensive author search results = search_authors_core( name="J Abreu Vicente", institution="EMBO", topic="Machine Learning", limit=20 ) print(f"Found {results.total_count} candidates") for author in results.results: print(f"- {author.display_name}") if author.affiliations: current_inst = author.affiliations[0].institution.display_name print(f" Institution: {current_inst}") print(f" Metrics: {author.cited_by_count} citations, h-index {author.summary_stats.h_index}") if author.x_concepts: fields = [c.display_name for c in author.x_concepts[:3]] print(f" Research: {', '.join(fields)}")

Academic Work Analysis

from alex_mcp.server import retrieve_author_works_core # Comprehensive work retrieval works = retrieve_author_works_core( author_id="https://openalex.org/A5058921480", type="journal-article", # Academic focus order_by="citations", # Impact-based ordering limit=20 ) print(f"Found {works.total_count} publications") for work in works.results: print(f"- {work.title}") if work.locations: journal = work.locations[0].source.display_name print(f" Published in: {journal} ({work.publication_year})") print(f" Impact: {work.cited_by_count} citations") if work.open_access and work.open_access.is_oa: print(" ✓ Open Access")

Institution and Field Analysis

# Analyze career transitions def analyze_career_path(author_result): affiliations = author_result.affiliations if len(affiliations) > 1: print("Career path:") for aff in sorted(affiliations, key=lambda x: min(x.years)): years = f"{min(aff.years)}-{max(aff.years)}" print(f" {years}: {aff.institution.display_name}") # Research evolution if author_result.x_concepts: print("Research areas:") for concept in author_result.x_concepts[:5]: print(f" {concept.display_name} (score: {concept.score:.2f})") # Usage results = search_authors_core("Jorge Abreu Vicente") if results.results: analyze_career_path(results.results[0])

🔧 Configuration Options

Environment Variables

# Required export OPENALEX_MAILTO=[email protected] # Optional settings export OPENALEX_MAX_AUTHORS=100 # Maximum authors per query export OPENALEX_USER_AGENT=research-agent-v1.0 export ALEX_MCP_VERSION=4.1.0 # Rate limiting (respectful usage) export OPENALEX_RATE_PER_SEC=10 export OPENALEX_RATE_PER_DAY=100000

Performance Tuning

# For comprehensive research applications config = { "max_authors_per_query": 25, # Detailed author analysis "max_works_per_author": 50, # Complete publication history "enable_all_filters": True, # Full filtering capabilities "detailed_affiliations": True, # Complete institutional data "research_concepts": True # Detailed concept analysis }

🧑‍💻 Development & Testing

Project Structure

alex-mcp/
├── src/alex_mcp/
│   ├── server.py              # Main MCP server
│   ├── data_objects.py        # Data models and structures
│   └── utils.py               # Utility functions
├── examples/
│   ├── basic_usage.py         # Simple examples
│   ├── advanced_queries.py    # Complex query examples
│   └── integration_demo.py    # AI agent integration
├── tests/
│   ├── test_server.py         # Server functionality tests
│   └── test_integration.py    # Integration tests
└── docs/
    └── api_reference.md       # Detailed API documentation

Running Tests

# Install test dependencies pip install -e ".[test]" # Run functionality tests pytest tests/test_server.py -v # Test with real queries python examples/basic_usage.py # Test AI agent integration python examples/integration_demo.py

Development Examples

# Test author disambiguation python examples/basic_usage.py --query "J. Abreu" --institution "EMBO" # Test work retrieval python examples/advanced_queries.py --author-id "A123456789" --type "journal-article" # Test integration patterns python examples/integration_demo.py --workflow "career-analysis"

📈 Integration Examples

Academic Research Workflows

Perfect integration with AI-powered research analysis:

# Enhanced academic research agent from alex_agent import AcademicResearchAgent agent = AcademicResearchAgent( mcp_servers=[alex_mcp], # Streamlined data processing model="gpt-4.1-2025-04-14" ) # Complex research queries with structured data result = await agent.research_author( "Find J. Abreu at EMBO with machine learning publications" ) # Rich, structured output for AI reasoning print(f"Quality Score: {result.quality_score}/100") print(f"Author disambiguation: {result.confidence}") print(f"Research fields: {result.research_domains}")

Multi-Agent Systems

# Collaborative research analysis async def research_collaboration_network(seed_author): # Find primary author authors = await alex_mcp.search_authors(seed_author) primary = authors['results'][0] # Get their works works = await alex_mcp.retrieve_author_works( primary['id'], type="journal-article" ) # Analyze co-authors and build network collaborators = set() for work in works['results']: for authorship in work.get('authorships', []): collaborators.add(authorship['author']['display_name']) return { 'primary_author': primary, 'publication_count': len(works['results']), 'collaborator_network': list(collaborators), 'research_impact': sum(w['cited_by_count'] for w in works['results']) }

🤝 Contributing

We welcome contributions to improve functionality and add new features:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/enhanced-filtering
  3. Add tests: Ensure your changes maintain data quality and structure
  4. Submit a pull request: Include examples and documentation

Development Priorities

  • Enhanced filtering capabilities
  • Additional data enrichment
  • Performance optimizations
  • Integration examples
  • Documentation improvements

📄 License

This project is licensed under the MIT License. See LICENSE for details.


🌐 Links

MCP Now 重磅来袭,抢先一步体验