llm-infra (5 posts)
- SGLang + LoRA Deep Dive — Qwen3-30B-A3B-Instruct-2507
- Efficient Forward Pass for Agent RL: Solving Multi-Turn Context Consistency (Part 2)
- Efficient Forward Pass for Agent RL: Solving Multi-Turn Context Consistency (Part 1)
- LangGraph Rollout: Evolving VeRL's Multi-Turn Capabilities for Agent RL
- When Reasoning Models Break Tokenization: The Hidden Complexity of Multiturn Training