LMSYS Blog

lmsys.org/
10 Articles
Last updated: July 31, 16:02
GLM-4.5 Meets SGLang: Reasoning, Coding, and Agentic Abilities

Today, we are excited to introduce our latest flagship models GLM-4.5 (https://huggingface.co/zai-org/GLM-4.5) and ...

library tool
SpecForge: Accelerating Speculative Decoding Training for SGLang

Speculative decoding is a powerful technique for accelerating Large Language Model (LLM) inference. In this blog post, we are excited to announce the open...

framework tool
Deploying Kimi K2 with PD Disaggregation and Large-Scale Expert Parallelism on 128 H200 GPUs

1️⃣ Introduction: Deploying the Most Advanced Open-Source MoE Model ...

framework tool
Accelerating SGLang with Multiple Token Prediction

library tool
How to support new VLMs into SGLang: A Case Study with NVILA

The world of LLMs is evolving at a remarkable pace, with Visual Language Models (VLMs) at the forefront of this revolution. These models power application...

api cloud tool
Cost Effective Deployment of DeepSeek R1 with Intel® Xeon® 6 CPU on SGLang

The impressive performance of DeepSeek R1 marked the rise of giant Mixture of Experts (MoE) models among Large Language Models (LLMs). However, its massive mode...

library tool