
apm::packages

@orchestra-research/moe-training

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.

★ 5,030·MIT

Orchestra-Research/development·4,130 tokens

@orchestra-research/evaluating-llms-harness

skill

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.

★ 5,030·MIT

Orchestra-Research/development·3,474 tokens

@orchestra-research/transformer-lens-interpretability

skill

Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.

★ 5,030·MIT

Orchestra-Research/development·3,056 tokens

@orchestra-research/sentence-transformers

skill

Framework for state-of-the-art sentence, text, and image embeddings. Provides 5000+ pre-trained models for semantic similarity, clustering, and retrieval. Supports multilingual, domain-specific, and multimodal models. Use for generating embeddings for RAG, semantic search, or similarity tasks. Best for production embedding generation.

★ 5,030·MIT

Orchestra-Research/development·1,574 tokens

@orchestra-research/ray-train

skill

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from laptop to 1000s of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

★ 5,030·MIT

Orchestra-Research/development·2,537 tokens

@orchestra-research/nemo-evaluator-sdk

skill

Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NVIDIA's enterprise-grade platform with container-first architecture for reproducible benchmarking.

★ 5,030·MIT

Orchestra-Research/development·3,310 tokens

@orchestra-research/instructor

skill

Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor, a battle-tested structured output library.

★ 5,030·MIT

Orchestra-Research/development·4,250 tokens

@orchestra-research/weights-and-biases

skill

Track ML experiments with automatic logging, visualize training in real time, optimize hyperparameters with sweeps, and manage a model registry with W&B, a collaborative MLOps platform.

★ 5,030·MIT

Orchestra-Research/development·3,272 tokens

@orchestra-research/sentencepiece

skill

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT, XLNet, mBART. Train on raw text without pre-tokenization. Use when you need multilingual support, CJK languages, or reproducible tokenization.

★ 5,030·MIT

Orchestra-Research/development·1,572 tokens

@orchestra-research/llama-cpp

skill

Runs LLM inference on CPU, Apple Silicon, and consumer GPUs without NVIDIA hardware. Use for edge deployment, M1/M2/M3 Macs, AMD/Intel GPUs, or when CUDA is unavailable. Supports GGUF quantization (1.5-8 bit) for reduced memory and 4-10× speedup vs PyTorch on CPU.

★ 5,030·MIT

Orchestra-Research/development·1,928 tokens

@orchestra-research/model-merging

skill

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat), improving performance beyond single models, or experimenting rapidly with model variants. Covers SLERP, TIES-Merging, DARE, Task Arithmetic, linear merging, and production deployment strategies.

★ 5,030·MIT

Orchestra-Research/development·3,685 tokens

@orchestra-research/ray-data

skill

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.

★ 5,030·MIT

Orchestra-Research/development·1,917 tokens

@orchestra-research/blip-2-vision-language

skill

Vision-language pre-training framework bridging frozen image encoders and LLMs. Use when you need image captioning, visual question answering, image-text retrieval, or multimodal chat with state-of-the-art zero-shot performance.

★ 5,030·MIT

Orchestra-Research/development·4,283 tokens

@orchestra-research/implementing-llms-litgpt

skill

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations with no abstraction layers.

★ 5,030·MIT

Orchestra-Research/development·3,217 tokens

@orchestra-research/pytorch-fsdp2

skill

Adds PyTorch FSDP2 (fully_shard) to training scripts with correct init, sharding, mixed precision/offload config, and distributed checkpointing. Use when models exceed single-GPU memory or when you need DTensor-based sharding with DeviceMesh.

★ 5,030·MIT

Orchestra-Research/development·2,675 tokens

@orchestra-research/gptq

skill

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.

★ 5,030·MIT

Orchestra-Research/development·3,462 tokens

@orchestra-research/mamba-architecture

skill

State-space model with O(n) complexity vs Transformers' O(n²). 5× faster inference, million-token sequences, no KV cache. Selective SSM with hardware-aware design. Mamba-1 (d_state=16) and Mamba-2 (d_state=128, multi-head). Models 130M-2.8B on HuggingFace.

★ 5,030·MIT

Orchestra-Research/development·2,097 tokens

@microsoft/trigger-pipelines-for-copilot-pr

✓ skill

Trigger ADO pipelines for a Copilot-created PR by posting /azp run comments. Use when the user asks to trigger CI pipelines for a specific PR.

★ 4,918·MIT

microsoft/development·433 tokens·git·testing·api-design

@vercel/streamdown

✓ skill

Implement, configure, and customize Streamdown — a streaming-optimized React Markdown renderer with syntax highlighting, Mermaid diagrams, math rendering, and CJK support. Use when working with Streamdown setup, configuration, plugins, styling, security, or integration with AI streaming (e.g., Vercel AI SDK). Triggers on: (1) Installing or setting up Streamdown, (2) Configuring plugins (code, mermaid, math, cjk), (3) Styling or theming Streamdown output, (4) Integrating with AI chat/streaming, (5) Configuring security, link safety, or custom HTML tags, (6) Using carets, static mode, or custom components, (7) Troubleshooting Tailwind, Shiki, or Vite issues.

★ 4,735

vercel/development·1,655 tokens·react·typescript·javascript·+1

@cloudflare/eli5

✓ skill

Transform technical jargon into clear explanations using before/after comparisons, metaphors, and practical context

★ 4,480·CC-BY-4.0

cloudflare/development·4,440 tokens·development·opencode
