apm::packages

@orchestra-research/axolotl

Expert guidance for fine-tuning LLMs with Axolotl: YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, and multimodal support.

★ 5,030 · MIT

Orchestra-Research/development·1,144 tokens

@orchestra-research/transformer-lens-interpretability

skill

Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.

★ 5,030 · MIT

Orchestra-Research/development·3,056 tokens

@orchestra-research/hqq-quantization

skill

Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.

★ 5,030 · MIT

Orchestra-Research/development·3,222 tokens

@orchestra-research/awq-quantization

skill

Activation-aware weight quantization for 4-bit LLM compression with 3× speedup and minimal accuracy loss. Use when deploying large models (7B-70B) on limited GPU memory, when you need faster inference than GPTQ with better accuracy preservation, or for instruction-tuned and multimodal models. MLSys 2024 Best Paper Award winner.

★ 5,030 · MIT

Orchestra-Research/development·2,482 tokens

@orchestra-research/ml-paper-writing

skill

Write publication-ready ML/AI/Systems papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM, OSDI, NSDI, ASPLOS, SOSP. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verification workflows.

★ 5,030 · MIT

Orchestra-Research/development·9,418 tokens

@orchestra-research/peft-fine-tuning

skill

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library integrated with transformers ecosystem.

★ 5,030 · MIT

Orchestra-Research/development·3,463 tokens

@orchestra-research/knowledge-distillation

skill

Compress large language models using knowledge distillation from teacher to student models. Use when deploying smaller models with retained performance, transferring GPT-4 capabilities to open-source models, or reducing inference costs. Covers temperature scaling, soft targets, reverse KLD, logit distillation, and MiniLLM training strategies.

★ 5,030 · MIT

Orchestra-Research/development·3,411 tokens

@orchestra-research/gptq

skill

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.

★ 5,030 · MIT

Orchestra-Research/development·3,462 tokens

@orchestra-research/evaluating-code-models

skill

Evaluates code generation models across HumanEval, MBPP, MultiPL-E, and 15+ benchmarks with pass@k metrics. Use when benchmarking code models, comparing coding abilities, testing multi-language support, or measuring code generation quality. Industry standard from BigCode Project used by HuggingFace leaderboards.

★ 5,030 · MIT

Orchestra-Research/development·3,244 tokens

@orchestra-research/pytorch-lightning

skill

High-level PyTorch framework with a Trainer class, automatic distributed training (DDP/FSDP/DeepSpeed), a callbacks system, and minimal boilerplate. Scales from laptop to supercomputer with the same code. Use when you want clean training loops with built-in best practices.

★ 5,030 · MIT

Orchestra-Research/development·2,254 tokens

@orchestra-research/speculative-decoding

skill

Accelerate LLM inference using speculative decoding, Medusa multiple heads, and lookahead decoding techniques. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.

★ 5,030 · MIT

Orchestra-Research/development·3,753 tokens

@orchestra-research/clip

skill

OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.

★ 5,030 · MIT

Orchestra-Research/development·1,752 tokens

@orchestra-research/pytorch-fsdp2

skill

Adds PyTorch FSDP2 (fully_shard) to training scripts with correct init, sharding, mixed precision/offload config, and distributed checkpointing. Use when models exceed single-GPU memory or when you need DTensor-based sharding with DeviceMesh.

★ 5,030 · MIT

Orchestra-Research/development·2,675 tokens

@orchestra-research/pyvene-interventions

skill

Provides guidance for performing causal interventions on PyTorch models using pyvene's declarative intervention framework. Use when conducting causal tracing, activation patching, interchange intervention training, or testing causal hypotheses about model behavior.

★ 5,030 · MIT

Orchestra-Research/development·3,358 tokens

@orchestra-research/moe-training

skill

Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.

★ 5,030 · MIT

Orchestra-Research/development·4,130 tokens

@orchestra-research/evaluating-llms-harness

skill

Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.

★ 5,030 · MIT

Orchestra-Research/development·3,474 tokens

@orchestra-research/model-merging

skill

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat), improving performance beyond single models, or experimenting rapidly with model variants. Covers SLERP, TIES-Merging, DARE, Task Arithmetic, linear merging, and production deployment strategies.

★ 5,030 · MIT

Orchestra-Research/development·3,685 tokens

@orchestra-research/sparse-autoencoder-training

skill

Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.

★ 5,030 · MIT

Orchestra-Research/development·3,272 tokens

@orchestra-research/distributed-llm-pretraining-torchtitan

skill

Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.

★ 5,030 · MIT

Orchestra-Research/development·2,634 tokens

@orchestra-research/ray-train

skill

Distributed training orchestration across clusters. Scales PyTorch/TensorFlow/HuggingFace from a laptop to thousands of nodes. Built-in hyperparameter tuning with Ray Tune, fault tolerance, and elastic scaling. Use when training massive models across multiple machines or running distributed hyperparameter sweeps.

★ 5,030 · MIT

Orchestra-Research/development·2,537 tokens
