apm::packages

@orchestra-research/hqq-quantization

Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.

★ 5,030 · MIT

Orchestra-Research/development·3,222 tokens

@orchestra-research/peft-fine-tuning

skill

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library, integrated with the transformers ecosystem.

★ 5,030 · MIT

Orchestra-Research/development·3,463 tokens

@orchestra-research/ml-paper-writing

skill

Write publication-ready ML/AI/Systems papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM, OSDI, NSDI, ASPLOS, SOSP. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verification workflows.

★ 5,030 · MIT

Orchestra-Research/development·9,418 tokens

@orchestra-research/simpo-training

skill

Simple Preference Optimization for LLM alignment. Reference-free alternative to DPO with better performance (up to +6.4 points on AlpacaEval 2.0). No reference model needed, more efficient than DPO. Use for preference alignment when you want simpler, faster training than DPO/PPO.

★ 5,030 · MIT

Orchestra-Research/development·1,661 tokens

@orchestra-research/speculative-decoding

skill

Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.

★ 5,030 · MIT

Orchestra-Research/development·3,753 tokens

@orchestra-research/pytorch-lightning

skill

High-level PyTorch framework with Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with the same code. Use when you want clean training loops with built-in best practices.

★ 5,030 · MIT

Orchestra-Research/development·2,254 tokens

@orchestra-research/evaluating-code-models

skill

Evaluates code generation models across HumanEval, MBPP, MultiPL-E, and 15+ benchmarks with pass@k metrics. Use when benchmarking code models, comparing coding abilities, testing multi-language support, or measuring code generation quality. Industry standard from BigCode Project used by HuggingFace leaderboards.

★ 5,030 · MIT

Orchestra-Research/development·3,244 tokens

@orchestra-research/pytorch-fsdp2

skill

Adds PyTorch FSDP2 (fully_shard) to training scripts with correct init, sharding, mixed precision/offload config, and distributed checkpointing. Use when models exceed single-GPU memory or when you need DTensor-based sharding with DeviceMesh.

★ 5,030 · MIT

Orchestra-Research/development·2,675 tokens

@orchestra-research/pyvene-interventions

skill

Provides guidance for performing causal interventions on PyTorch models using pyvene's declarative intervention framework. Use when conducting causal tracing, activation patching, interchange intervention training, or testing causal hypotheses about model behavior.

★ 5,030 · MIT

Orchestra-Research/development·3,358 tokens

@orchestra-research/model-merging

skill

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat), improving performance beyond single models, or experimenting rapidly with model variants. Covers SLERP, TIES-Merging, DARE, Task Arithmetic, linear merging, and production deployment strategies.

★ 5,030 · MIT

Orchestra-Research/development·3,685 tokens

@orchestra-research/sparse-autoencoder-training

skill

Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.

★ 5,030 · MIT

Orchestra-Research/development·3,272 tokens

@orchestra-research/evaluating-llms-harness

skill

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.

★ 5,030 · MIT

Orchestra-Research/development·3,474 tokens

@orchestra-research/faiss

skill

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications.

★ 5,030 · MIT

Orchestra-Research/development·1,386 tokens

@orchestra-research/tensorboard

skill

Visualize training metrics, debug models with histograms, compare experiments, visualize model graphs, and profile performance with TensorBoard - Google's ML visualization toolkit

★ 5,030 · MIT

Orchestra-Research/development·3,808 tokens

@orchestra-research/moe-training

skill

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.

★ 5,030 · MIT

Orchestra-Research/development·4,130 tokens

@orchestra-research/nnsight-remote-interpretability

skill

Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.

★ 5,030 · MIT

Orchestra-Research/development·3,347 tokens

@orchestra-research/clip

skill

OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.

★ 5,030 · MIT

Orchestra-Research/development·1,752 tokens

@orchestra-research/optimizing-attention-flash

skill

Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.

★ 5,030 · MIT

Orchestra-Research/development·2,901 tokens

@orchestra-research/dspy

skill

Build complex AI systems with declarative programming, optimize prompts automatically, create modular RAG systems and agents with DSPy - Stanford NLP's framework for systematic LM programming

★ 5,030 · MIT

Orchestra-Research/development·3,735 tokens

@orchestra-research/instructor

skill

Extract structured data from LLM responses with Pydantic validation, retry failed extractions automatically, parse complex JSON with type safety, and stream partial results with Instructor - battle-tested structured output library

★ 5,030 · MIT

Orchestra-Research/development·4,250 tokens
