Connects Claude to any HiveServer2-compatible system (Spark, EMR, Hive, Impala) over the Thrift protocol. Exposes four tools for read-only SQL operations: list_databases, list_tables, describe_table, and execute_query. Enforces safety by blocking non-SELECT statements and automatically limiting unbounded queries. Supports multiple auth methods, including LDAP, Kerberos, and NOSASL. Built for AWS EMR workflows with SSH tunnel support, though it works with any Spark cluster exposing a HiveServer2 Thrift endpoint (port 10000 by default). Useful when you need Claude to explore schemas and run analytics queries against production data lakes without write access. Credentials stay local via environment variables. Ships with a Docker Compose setup and sample data for local testing.
claude mcp add --transport stdio aidancorrell-spark-sql-mcp-server -- uvx spark-sql-mcp-server
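
To illustrate the safety behavior described above (rejecting non-SELECT statements and limiting unbounded queries), here is a minimal Python sketch. It is not taken from the server's source: the names `is_select_only`, `guard_query`, and `DEFAULT_LIMIT`, the cap of 1000 rows, and the choice to enforce the limit by appending a LIMIT clause are all assumptions made for illustration.

```python
import re

# Hypothetical names and values for illustration only; they are not
# taken from spark-sql-mcp-server itself.
DEFAULT_LIMIT = 1000  # assumed row cap for queries without a LIMIT

_SELECT_RE = re.compile(r"select\b", re.IGNORECASE)
_TRAILING_LIMIT_RE = re.compile(r"\blimit\s+\d+\s*$", re.IGNORECASE)


def _normalize(sql: str) -> str:
    """Trim whitespace and one or more trailing semicolons."""
    return sql.strip().rstrip(";").strip()


def is_select_only(sql: str) -> bool:
    """Return True if sql is a single statement that starts with SELECT.

    Deliberately conservative: any remaining semicolon (even one inside
    a string literal) and any statement not starting with SELECT
    (including WITH ... SELECT) is rejected.
    """
    stmt = _normalize(sql)
    if not stmt or ";" in stmt:
        return False
    return _SELECT_RE.match(stmt) is not None


def guard_query(sql: str, max_rows: int = DEFAULT_LIMIT) -> str:
    """Reject non-SELECT input and append a LIMIT if none is present."""
    if not is_select_only(sql):
        raise ValueError("only single SELECT statements are allowed")
    stmt = _normalize(sql)
    if _TRAILING_LIMIT_RE.search(stmt):
        return stmt  # caller already bounded the result set
    return f"{stmt} LIMIT {max_rows}"
```

Under these assumptions, `guard_query("SELECT * FROM sales")` returns the query with `LIMIT 1000` appended, while `guard_query("DROP TABLE sales")` raises `ValueError`. A string-level check like this is only a first line of defense; pairing it with a database account that has no write privileges is the more robust safeguard.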