development agent skills

apm::packages

@orchestra-research/deepspeed

Expert guidance for distributed training with DeepSpeed - ZeRO optimization stages, pipeline parallelism, FP16/BF16/FP8, 1-bit Adam, sparse attention

★ 5,030·MIT

Orchestra-Research/development·33,311 tokens

@orchestra-research/nemo-guardrails

skill

NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, and toxicity detection. Uses the Colang 2.0 DSL for programmable rails. Production-ready, runs on a T4 GPU.

★ 5,030·MIT

Orchestra-Research/development·1,898 tokens

@orchestra-research/nemo-curator

skill

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.

★ 5,030·MIT

Orchestra-Research/development·2,513 tokens

@orchestra-research/fine-tuning-with-trl

skill

Fine-tune LLMs using reinforcement learning with TRL - SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or need to train from human feedback. Works with HuggingFace Transformers.

★ 5,030·MIT

Orchestra-Research/development·3,079 tokens

@orchestra-research/long-context

skill

Extend context windows of transformer models using RoPE, YaRN, ALiBi, and position interpolation techniques. Use when processing long documents (32k-128k+ tokens), extending pre-trained models beyond original context limits, or implementing efficient positional encodings. Covers rotary embeddings, attention biases, interpolation methods, and extrapolation strategies for LLMs.

★ 5,030·MIT

Orchestra-Research/development·4,249 tokens

@orchestra-research/serving-llms-vllm

skill

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.

★ 5,030·MIT

Orchestra-Research/development·2,495 tokens

@orchestra-research/chroma

skill

Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best for local development and open-source projects.

★ 5,030·MIT

Orchestra-Research/development·2,281 tokens

@orchestra-research/unsloth

skill

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

★ 5,030·MIT

Orchestra-Research/development·492 tokens

@orchestra-research/ray-data

skill

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.

★ 5,030·MIT

Orchestra-Research/development·1,917 tokens

@orchestra-research/llamaguard

skill

Meta's 7-8B specialized moderation model for LLM input/output filtering. 6 safety categories - violence/hate, sexual content, weapons, substances, self-harm, criminal planning. 94-95% accuracy. Deploy with vLLM, HuggingFace, or SageMaker. Integrates with NeMo Guardrails.

★ 5,030·MIT

Orchestra-Research/development·2,491 tokens

@orchestra-research/autogpt-agents

skill

Autonomous AI agent platform for building and deploying continuous agents. Use when creating visual workflow agents, deploying persistent autonomous agents, or building complex multi-step AI automation systems.

★ 5,030·MIT

Orchestra-Research/development·2,233 tokens

@orchestra-research/slime-rl-training

skill

Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.

★ 5,030·MIT

Orchestra-Research/development·2,969 tokens

@orchestra-research/audiocraft-audio-generation

skill

PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.

★ 5,030·MIT

Orchestra-Research/development·3,758 tokens

@orchestra-research/faiss

skill

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types (Flat, IVF, HNSW). Use for fast k-NN search, large-scale vector retrieval, or when you need pure similarity search without metadata. Best for high-performance applications.

★ 5,030·MIT

Orchestra-Research/development·1,386 tokens

@orchestra-research/model-merging

skill

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat), improving performance beyond single models, or experimenting rapidly with model variants. Covers SLERP, TIES-Merging, DARE, Task Arithmetic, linear merging, and production deployment strategies.

★ 5,030·MIT

Orchestra-Research/development·3,685 tokens

@orchestra-research/qdrant-vector-search

skill

High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance.

★ 5,030·MIT

Orchestra-Research/development·3,295 tokens

@orchestra-research/implementing-llms-litgpt

skill

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

★ 5,030·MIT

Orchestra-Research/development·3,217 tokens

@microsoft/trigger-pipelines-for-copilot-pr

✓ skill

Trigger ADO pipelines for a Copilot-created PR by posting /azp run comments. Use when the user asks to trigger CI pipelines for a specific PR.

★ 4,918·MIT

microsoft/development·433 tokens·git·testing·api-design

@microsoft/issue-triage-report

✓ skill

Generate comprehensive GitHub Feature Area Status reports for the Windows App SDK repository. Use when asked to create triage reports, identify high-priority issues, analyze feature area health, find issues needing attention, or generate status dashboards. Triggers on requests involving issue triage, area status, priority analysis, bug tracking reports, or engineering team focus areas.

★ 4,457·MIT

microsoft/development·2,368 tokens·git

@microsoft/worktree-manager

✓ skill

Create and manage Git worktrees for parallel development workflows. Use when multiple self-contained issues should NOT be fixed in a single branch, when human-Copilot iteration requires isolated environments with separate chat history and commits, or when parallel work items need independent build/test results. Triggers on requests involving branch isolation, work item separation, parallel development, or avoiding messy branch switching.

★ 4,457·MIT

microsoft/development·2,048 tokens·git·testing
