Built for the Claude Code community with Claude Code by @mertduzgun

Independent project, not affiliated with Anthropic
Huggingface Tokenizers

Editor's Note

This is the Rust-based tokenization library that powers Hugging Face Transformers under the hood. It tokenizes a gigabyte of text in under 20 seconds on CPU, which matters when you're training custom tokenizers or building production pipelines. You get all three major algorithms: BPE for GPT-style models, WordPiece for BERT, and Unigram for multilingual work. Alignment tracking lets you map each token back to its character span in the original text. If you're just loading a pretrained tokenizer for inference, transformers.AutoTokenizer is easier. But if you need to train from scratch, handle massive corpora, or squeeze more performance out of tokenization, this is the tool. The Python bindings hide the Rust complexity while keeping the speed.
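
To make the training and alignment points concrete, here is a minimal sketch using the library's Python bindings. It is not taken from the skill itself: the toy corpus, the vocabulary size of 100, and the sample sentence are all hypothetical choices, and a real run would stream a much larger dataset. It trains a small BPE tokenizer from an in-memory iterator, then uses the offsets on the encoding to recover each token's character span.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Hypothetical toy corpus, for illustration only.
corpus = [
    "tokenizers split text into tokens",
    "fast tokenizers keep track of character offsets",
    "training a tokenizer from scratch takes a corpus",
]

# BPE model with an explicit unknown token; split on whitespace before merging.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# vocab_size=100 is an arbitrary small value chosen for this sketch.
trainer = BpeTrainer(vocab_size=100, special_tokens=["[UNK]"], show_progress=False)
tokenizer.train_from_iterator(corpus, trainer=trainer)

text = "fast tokenizers track offsets"
encoding = tokenizer.encode(text)

# Each offset is a (start, end) pair indexing into the original string.
spans = [text[start:end] for start, end in encoding.offsets]
for token, span in zip(encoding.tokens, spans):
    print(f"{token!r} -> {span!r}")
```

Swapping `BPE`/`BpeTrainer` for `WordPiece`/`WordPieceTrainer` or `Unigram`/`UnigramTrainer` follows the same pattern, though each model takes its own constructor arguments.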

Install

npx skills add https://github.com/orchestra-research/ai-research-skills --skill huggingface-tokenizers

Votes: 0
Installs: 247
GitHub Stars: 9.2k
Categories: AI & Agent Building, Data Science & ML
First Seen: Jun 3, 2026

Related AI & Agent Building Skills

  • agentica-prompts (parcadei/continuous-claude-v3): 0 votes, 398 installs, 3.8k stars
  • llm-application-dev-langchain-agent (sickn33/antigravity-awesome-skills): 0 votes, 306 installs, 39.4k stars
  • agentic-eval (github/awesome-copilot): 0 votes, 9.4k installs, 34.3k stars. Iterative evaluation and refinement patterns for improving AI agent outputs through self-critique loops.
  • ai-prompt-engineering-safety-review (github/awesome-copilot): 0 votes, 9.4k installs, 34.3k stars. Comprehensive safety analysis and improvement framework for AI prompts with detailed assessment methodologies.
  • emblem-ai-prompt-examples (emblemcompany/agent-skills): 0 votes, 8.7k installs, 10 stars
  • finalize-agent-prompt (github/awesome-copilot): 0 votes, 8.6k installs, 34.3k stars. Polish and refine agent prompt files against proven best practices.