Wraps the Alembica structured extraction library as an MCP server, with tools to validate extraction schemas, estimate token costs across LLM providers, run semantic extraction jobs on unstructured text, and query the available schemas. Built for research workflows that turn documents, articles, or corpora into structured datasets using OpenAI, Anthropic, Google, Cohere, DeepSeek, or self-hosted models. The cost estimation tool is handy for budgeting before a large extraction job, though it only works with public API providers. Runs over stdio transport from the Go binary or the GHCR container. Reach for this when you're building data pipelines that need repeatable LLM extraction with cost visibility.
claude mcp add --transport stdio io.github.open-and-sustainable-alembica-mcp uvx alembica-mcp
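
To make the budgeting step concrete, here is a minimal sketch of the arithmetic behind a pre-run cost estimate. It is not Alembica's API or the server's tool interface: the function names, the `example-model` entry, its prices, and the four-characters-per-token heuristic are all assumptions chosen for illustration.

```python
import math

# Hypothetical prices in USD per million tokens; NOT real provider rates.
PRICES = {"example-model": {"input": 1.00, "output": 4.00}}


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count from character length (a heuristic, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def estimate_cost(docs: list[str], model: str, output_tokens_per_doc: int) -> float:
    """Estimate the USD cost of one extraction pass over docs."""
    price = PRICES[model]
    input_tokens = sum(estimate_tokens(doc) for doc in docs)
    output_tokens = output_tokens_per_doc * len(docs)
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


if __name__ == "__main__":
    # 1,000 documents of 4,000 characters each, 500 output tokens per document.
    corpus = ["x" * 4000] * 1000
    print(f"Estimated cost: ${estimate_cost(corpus, 'example-model', 500):.2f}")
```

A real estimate would use the provider's own tokenizer and current price list. The sketch only shows why input volume and expected output size both need to be known before a large job is launched.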