
LMSYS Blog
lmsys.org/
How to support new VLMs into SGLang: A Case Study with NVILA
The world of LLMs is evolving at a remarkable pace, with Visual Language Models (VLMs) at the forefront of this revolution. These models power application...

Cost Effective Deployment of DeepSeek R1 with Intel® Xeon® 6 CPU on SGLang
The impressive performance of DeepSeek R1 marked the rise of giant Mixture-of-Experts (MoE) models among large language models (LLMs). However, its massive mode...

slime: An SGLang-Native Post-Training Framework for RL Scaling
<h2><a id="vision-that-drives-slime" class="anchor" href="#vision-that-drives-slime" aria-hidden="true"><svg aria-hidden="true" class="octicon octicon-link" ...

OME: Revolutionizing LLM Infrastructure with Model-Driven Architecture
<h2><a id="the-tale-of-two-teams-why-model-serving-is-broken" class="anchor" href="#the-tale-of-two-teams-why-model-serving-is-broken" aria-hidden="true"><sv...

Deploying DeepSeek on GB200 NVL72 with PD and Large Scale EP (Part I): 2.7x Higher Decoding Throughput
The GB200 NVL72 is the world's most advanced hardware for AI training and inference. In this blog post, we're excited to share early results from running ...

Deploying DeepSeek with PD Disaggregation and Large-Scale Expert Parallelism on 96 H100 GPUs
DeepSeek is a popular open-source large language model (LLM) praised for its strong performance. However, its large size and unique architecture, which us...