This skill covers local LLM integration through llama.cpp and Ollama, focused on running models securely without cloud dependencies. It provides patterns for prompt injection prevention, resource limits to avoid denial of service (DoS), and model loading with integrity checks. The implementation leans heavily on input sanitization and output filtering, since it processes untrusted prompts. It also pins specific version requirements tied to CVE fixes (older llama-cpp-python versions have a template injection vulnerability). The patterns were built for a voice assistant called JARVIS, but the security boundaries and performance optimizations apply to any local LLM deployment that needs sub-500ms latency and runs under memory constraints.
npx skills add https://github.com/martinholovsky/claude-skills-generator --skill llm-integration
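
To make two of the patterns named above more concrete (model loading with an integrity check, and input sanitization with a size limit), here is a minimal sketch using only the Python standard library. It is not taken from the skill itself: the function names, the `MAX_PROMPT_CHARS` value, and the choice of SHA-256 are all assumptions for illustration, and stripping control characters is only one small part of prompt injection prevention.

```python
import hashlib
import unicodedata
from pathlib import Path

# Hypothetical limit; tune it to the model's context window and latency budget.
MAX_PROMPT_CHARS = 4096


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file's SHA-256 differs from the pinned value."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"checksum mismatch for {path.name}")


def sanitize_prompt(text: str, max_chars: int = MAX_PROMPT_CHARS) -> str:
    """Drop Unicode control/format characters (keeping newline and tab),
    then truncate to max_chars to bound the work handed to the model."""
    cleaned = "".join(
        ch for ch in text
        if ch in "\n\t" or not unicodedata.category(ch).startswith("C")
    )
    return cleaned[:max_chars]
```

In this sketch, `verify_model` would run before the model path is handed to whichever loader is in use, and `sanitize_prompt` would run on every untrusted input before it reaches the model. Note that dropping the whole Unicode "C" category also removes format characters such as zero-width joiners, which may be too aggressive for some languages or emoji sequences.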