
Open Census
MCP server providing natural language access to U.S. Census demographic data and statistics
This is an independent, open-source experiment.
It is not affiliated with, endorsed by, or sponsored by the U.S. Census Bureau or the Department of Commerce.
Data retrieved through this project remains subject to the terms of the original data providers (e.g., Census API Terms of Service).
Container build 2.0 is decent, but there are still issues with the quality of results.
Turn any AI assistant into your personal Census data expert. Ask questions in plain English, get accurate demographic data with proper interpretation and context.
Before: "I need ACS Table B19013 for FIPS code 24510 with margin of error calculations..."
After: "What's the median income in Baltimore compared to Maryland?"
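To make the "Before" request concrete, here is a minimal sketch of what that manual lookup involves when done by hand against the public Census API. It is not this project's code: the helper names (`split_county_fips`, `build_acs_url`) are hypothetical and the 2022 vintage is an assumption. It only builds the request URL and does not call the network.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

# ACS 5-year endpoint; the default year used below is an assumption for illustration.
ACS5_BASE = "https://api.census.gov/data/{year}/acs/acs5"


def split_county_fips(fips: str) -> Tuple[str, str]:
    """Split a 5-digit county FIPS code into (state, county) parts."""
    if len(fips) != 5 or not fips.isdigit():
        raise ValueError(f"expected a 5-digit county FIPS code, got {fips!r}")
    return fips[:2], fips[2:]


def build_acs_url(variable: str, county_fips: str, year: int = 2022,
                  api_key: Optional[str] = None) -> str:
    """Build a query URL for one ACS variable (estimate and margin of error)."""
    state, county = split_county_fips(county_fips)
    params = {
        # The "E" suffix is the estimate, the "M" suffix is its margin of error.
        "get": f"NAME,{variable}E,{variable}M",
        "for": f"county:{county}",
        "in": f"state:{state}",
    }
    if api_key:
        params["key"] = api_key
    # Keep commas and colons readable instead of percent-encoding them.
    return ACS5_BASE.format(year=year) + "?" + urlencode(params, safe=",:")


# B19013_001 is median household income; 24510 is Baltimore city, Maryland.
print(build_acs_url("B19013_001", "24510"))
```

The "After" question hides all of this: picking the table, splitting the FIPS code, and remembering to ask for the margin of error alongside the estimate.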
# Run the Census MCP server (one command - everything included!)
docker run -e CENSUS_API_KEY=your_key ghcr.io/brockwebb/census-mcp-server:latest

# Or without API key (still works, just slower rate limits)
docker run ghcr.io/brockwebb/census-mcp-server:latest
Add to your claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "census-mcp": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "CENSUS_API_KEY=your_census_key",
        "ghcr.io/brockwebb/census-mcp-server:latest"
      ]
    }
  }
}
That's it! Restart Claude Desktop and you'll see a 🔨 hammer icon indicating the Census tools are available.
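If your claude_desktop_config.json already lists other MCP servers, the census-mcp entry needs to be merged in rather than pasted over the file. The sketch below shows one way to do that on an in-memory dictionary. The function name `add_census_server` and the `other-tool` entry are hypothetical; reading and writing the actual file is left out.

```python
import json
from copy import deepcopy


def add_census_server(config: dict, api_key: str = "your_census_key") -> dict:
    """Return a copy of a Claude Desktop config with the census-mcp entry added."""
    updated = deepcopy(config)  # leave the caller's dictionary untouched
    servers = updated.setdefault("mcpServers", {})
    servers["census-mcp"] = {
        "command": "docker",
        "args": [
            "run", "--rm", "-i",
            "-e", f"CENSUS_API_KEY={api_key}",
            "ghcr.io/brockwebb/census-mcp-server:latest",
        ],
    }
    return updated


# Hypothetical existing config with one unrelated server already registered.
existing = {"mcpServers": {"other-tool": {"command": "node", "args": ["server.js"]}}}
merged = add_census_server(existing)
print(json.dumps(merged, indent=2))
```

Working on a copy keeps the original settings intact if something goes wrong before the file is written back.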
Ask Claude questions like:
U.S. Census data is incredibly valuable but has a steep learning curve for non-specialists. Even experienced researchers struggle with geographic hierarchies, variable naming conventions, margin of error calculations, and knowing which data combinations actually work. In our experience, the biggest impediment to demographic analysis is often just figuring out how to get the right data in the first place.
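As an illustration of the margin of error work mentioned above, here is a minimal sketch of the standard approximation for comparing two ACS estimates. It is not taken from this project's code. It assumes the published margins of error are at the 90% confidence level (the ACS default), and the income figures are made up for the example.

```python
import math

Z_90 = 1.645  # z-score for the 90% confidence level used by published ACS margins of error


def moe_of_difference(moe_a: float, moe_b: float) -> float:
    """Approximate margin of error of (estimate A - estimate B)."""
    return math.sqrt(moe_a ** 2 + moe_b ** 2)


def is_significant(est_a: float, moe_a: float, est_b: float, moe_b: float) -> bool:
    """Test whether two estimates differ at the 90% confidence level."""
    se_a = moe_a / Z_90  # convert each margin of error to a standard error
    se_b = moe_b / Z_90
    z = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(z) > Z_90


# Made-up figures: a city at 58,000 +/- 1,500 and a state at 90,000 +/- 800.
city_est, city_moe = 58_000, 1_500
state_est, state_moe = 90_000, 800

print(moe_of_difference(city_moe, state_moe))
print(is_significant(city_est, city_moe, state_est, state_moe))
```

A gap smaller than the combined margin of error should not be reported as a real difference, which is the kind of caveat the server is meant to surface automatically.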
Today: Census data influences billions in government spending and policy decisions, but accessing it effectively requires specialized knowledge that creates barriers for many potential users.
Opportunity: The semantic matching from natural language to Census variable codes is genuinely novel and could be transformative for researchers who currently have to navigate thousands of cryptic variable names manually. Everyone should have access to high-quality Census data with a world-class AI assistant.
Tomorrow: City council members fact-check claims in real-time during meetings. Journalists get demographic context while writing stories. Nonprofits understand their communities without hiring statisticians. Researchers spend time analyzing instead of wrestling with APIs.
The Goal: Make America's most valuable public dataset as easy to use as asking a question.
graph LR
    A["User Question: Poverty rate in rural counties?"] --> B["AI Assistant (Claude, ChatGPT, etc.)"]
    B --> C["Census MCP Server (Domain Expertise Layer)"]
    C --> D["tidycensus R Package (Geography & Variable Resolution)"]
    C --> H["Knowledge Base (R Documentation + Census Methodology)"]
    D --> E["Census Bureau API (Official Data Source)"]
    H --> C
    E --> D
    D --> C
    C --> F["Interpreted Results + Context + Caveats"]
    F --> B
    B --> G["User gets accurate answer with proper interpretation"]
    style C fill:#e1f5fe
    style F fill:#f3e5f5
🏗️ Complete Self-Contained System:
📚 Built-in Intelligence:
🔄 No Setup Required:
The system consists of five main layers:
Each layer handles its specialized function, creating a maintainable system that can evolve as both AI tools and Census data infrastructure change.
What Works Now:
Future Expansion: Additional surveys, geographic visualizations, multi-agency integration
Built on: Model Context Protocol (MCP) for AI tool integration
Core Engine: tidycensus R package by Kyle Walker
Knowledge Base: ChromaDB vector database with sentence transformers
Container: ~4GB Docker image with all dependencies
Rate Limiting: Built-in throttling and caching strategies
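The README does not describe how the throttling and caching are implemented. As a rough idea of what such a layer can look like, here is a generic sketch that caches results by key and enforces a minimum interval between upstream calls. The class name `ThrottledCache`, its parameters, and the 0.5 second default are all assumptions made for illustration, not the project's actual design.

```python
import time
from typing import Any, Callable, Dict, Optional


class ThrottledCache:
    """Cache results by key and space out calls to a slow or rate-limited fetcher."""

    def __init__(self, fetch: Callable[[str], Any], min_interval: float = 0.5,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock      # injectable so the behavior can be tested without waiting
        self._sleep = sleep
        self._cache: Dict[str, Any] = {}
        self._last_call: Optional[float] = None

    def get(self, key: str) -> Any:
        # Cached answers are returned immediately and do not count against the limit.
        if key in self._cache:
            return self._cache[key]
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        value = self._fetch(key)
        self._cache[key] = value
        return value


# Stand-in fetcher; a real one would query the Census API.
cache = ThrottledCache(lambda key: f"result for {key}", min_interval=0.1)
print(cache.get("B19013_001"))  # fetched
print(cache.get("B19013_001"))  # served from cache
```

Repeated questions about the same variable and geography are common in a chat session, so even a simple in-memory cache like this avoids many redundant API calls.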
If you want to modify or extend the system:
git clone https://github.com/brockwebb/census-mcp-server.git
cd census-mcp-server

# Build from source (requires R, Python, and substantial setup)
./build.sh

# Or use the pre-built container and modify as needed
docker run -v $(pwd):/workspace ghcr.io/brockwebb/census-mcp-server:latest
This project builds on exceptional work by:
Special thanks to Kyle Walker whose tidycensus documentation and methodology formed the foundation of our knowledge base. This project essentially wraps tidycensus with natural language intelligence - all the hard statistical and geographic work was solved by the tidycensus team.
This project aims to democratize access to public data. We welcome contributions in:
RPC      DCE        RDF
 ↓        ↓          ↓
CORBA    DCOM       OWL
 ↓        ↓          ↓
 └─→ SOAP ←┘       SPARQL
      ↓              ↓
     REST      Knowledge Graphs
      ↓              ↓
   GraphQL           ↓
      ↓              ↓
     MCP ←─────────────→ LLMs
"The patterns never really die, they just get better UX"