CLAUDE CODE MARKETPLACES


Built for the Claude Code community with Claude Code by @mertduzgun

Independent project, not affiliated with Anthropic

Flash Moe Inference

Editor's Note

This is a pure C/Metal inference engine that runs Qwen3.5-397B (397B parameters, 209GB on disk) on a MacBook Pro with 48GB of RAM at 4.4 tokens per second. It streams expert weights from SSD on demand using parallel pread calls, with no Python or ML frameworks at runtime. The repo includes hand-tuned Metal shaders with kernels that fuse dequantization with FMA, and it uses Accelerate BLAS for the GatedDeltaNet attention layers. You'll need an M3 Max or similar, a 1TB SSD, and patience for the weight extraction and repacking process. An optional 2-bit quantization path is faster but breaks tool calling. This is what happens when someone decides PyTorch is too slow and writes 7000 lines of Objective-C instead.
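
To make the "parallel pread calls" idea concrete, here is a minimal sketch in C of loading a handful of selected experts concurrently from one weights file. It is not taken from the repo: the file layout (fixed-size experts stored back to back), the names read_job, read_worker, load_experts, and the MAX_PARALLEL limit are all assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical limit on experts fetched per call (not from the repo). */
#define MAX_PARALLEL 16

/* One read request. Assumed layout: expert i starts at i * expert_bytes. */
typedef struct {
    int fd;        /* shared read-only file descriptor */
    off_t offset;  /* byte offset of this expert in the file */
    size_t len;    /* number of bytes to read */
    uint8_t *dst;  /* destination buffer for this expert */
    int ok;        /* set to 1 only after a complete read */
} read_job;

/* Worker thread: pread takes an explicit offset and does not move the
 * shared file position, so several threads can safely share one fd. */
static void *read_worker(void *arg) {
    read_job *job = arg;
    size_t done = 0;
    while (done < job->len) {
        ssize_t n = pread(job->fd, job->dst + done, job->len - done,
                          job->offset + (off_t)done);
        if (n <= 0) {
            return NULL; /* error or unexpected EOF: ok stays 0 */
        }
        done += (size_t)n;
    }
    job->ok = 1;
    return NULL;
}

/* Read n experts concurrently into dst (n * expert_bytes bytes).
 * Returns 0 if every read completed, -1 otherwise. */
static int load_experts(int fd, const int *expert_ids, int n,
                        size_t expert_bytes, uint8_t *dst) {
    assert(n > 0 && n <= MAX_PARALLEL);
    pthread_t threads[MAX_PARALLEL];
    read_job jobs[MAX_PARALLEL];
    int started = 0;
    int rc = 0;

    for (int i = 0; i < n; i++) {
        jobs[i].fd = fd;
        jobs[i].offset = (off_t)expert_ids[i] * (off_t)expert_bytes;
        jobs[i].len = expert_bytes;
        jobs[i].dst = dst + (size_t)i * expert_bytes;
        jobs[i].ok = 0;
        if (pthread_create(&threads[i], NULL, read_worker, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Wait for every thread that was started, then check its result. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if (!jobs[i].ok) {
            rc = -1;
        }
    }
    return rc;
}
```

A caller would pass the expert indices chosen by the router for the current token, for example load_experts(fd, ids, k, expert_bytes, buffer), and hand the filled buffer to the compute kernels. A real engine would more likely reuse a thread pool than create threads per call; this sketch spawns threads each time only to stay short.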

Install

npx skills add https://github.com/aradotso/trending-skills --skill flash-moe-inference
Votes: 0
Installs: 977
GitHub Stars: 7
Categories: Data Science & ML, Automation & Workflows, Python, Go, Finance & Trading, Game Development
First Seen: Apr 16, 2026

Related Data Science & ML Skills

  • azure-kusto (microsoft/azure-skills): Execute KQL queries and analyze data in Azure Data Explorer for log analytics, telemetry, and time series insights. Votes: 0, installs: 319.8k, GitHub stars: 964.
  • azure-observability (microsoft/azure-skills): Query metrics, logs, and traces across Azure Monitor, Application Insights, and Log Analytics. Votes: 0, installs: 98.1k, GitHub stars: 964.
  • analytics-tracking (coreyhaines31/marketingskills): Set up, audit, and improve analytics tracking to measure marketing and product decisions. Votes: 0, installs: 57.9k, GitHub stars: 28.8k.
  • customer-research (coreyhaines31/marketingskills): Customer research. Votes: 0, installs: 30.9k, GitHub stars: 28.8k.
  • expo-cicd-workflows (expo/skills): Write and validate EAS CI/CD workflow YAML files for Expo projects. Votes: 0, installs: 26.5k, GitHub stars: 1.9k.
  • baoyu-post-to-wechat (jimliu/baoyu-skills): Publish articles and image-text posts to WeChat Official Accounts via API or browser automation. Votes: 0, installs: 24.3k, GitHub stars: 18.4k.