.. Commit 82d9d8e (parent 3b27a76): Create index.rst — 1 file changed, 141 additions, 0 deletions.

SGLang Documentation
====================

.. raw:: html

   <a class="github-button" href="https://github.com/sgl-project/sglang" data-size="large" data-show-count="true" aria-label="Star sgl-project/sglang on GitHub">Star</a>
   <a class="github-button" href="https://github.com/sgl-project/sglang/fork" data-icon="octicon-repo-forked" data-size="large" data-show-count="true" aria-label="Fork sgl-project/sglang on GitHub">Fork</a>
   <script async defer src="https://buttons.github.io/buttons.js"></script>
   <br>

SGLang is a high-performance serving framework for large language models and multimodal models.
It is designed to deliver low-latency, high-throughput inference across a wide range of setups, from a single GPU to large distributed clusters.
Its core features include:

- **Fast Runtime**: Provides efficient serving with RadixAttention for prefix caching, a zero-overhead CPU scheduler, prefill-decode disaggregation, speculative decoding, continuous batching, paged attention, tensor/pipeline/expert/data parallelism, structured outputs, chunked prefill, quantization (FP4/FP8/INT4/AWQ/GPTQ), and multi-LoRA batching.
- **Broad Model Support**: Supports a wide range of language models (Llama, Qwen, DeepSeek, Kimi, GLM, GPT, Gemma, Mistral, etc.), embedding models (e5-mistral, gte, mcdse), reward models (Skywork), and diffusion models (WAN, Qwen-Image), with easy extensibility for adding new models. Compatible with most Hugging Face models and with OpenAI APIs.
- **Extensive Hardware Support**: Runs on NVIDIA GPUs (GB200/B300/H100/A100/Spark), AMD GPUs (MI355/MI300), Intel Xeon CPUs, Google TPUs, Ascend NPUs, and more.
- **Active Community**: SGLang is open source and backed by a vibrant community with widespread industry adoption, powering over 400,000 GPUs worldwide.
- **RL & Post-Training Backbone**: SGLang is a proven rollout backend with native RL integrations, adopted by well-known post-training frameworks such as AReaL, Miles, slime, Tunix, verl, and more.
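
Because the runtime exposes an OpenAI-compatible API, a request to a locally running SGLang server can be sketched as a plain chat-completions payload. This is a minimal sketch only: the port ``30000``, the model name ``"default"``, and the prompt are illustrative assumptions, not values fixed by SGLang; the actual network call is omitted so the snippet stays self-contained.

```python
import json

# Hypothetical endpoint of a locally launched SGLang server (port is an assumption).
url = "http://localhost:30000/v1/chat/completions"

# An OpenAI-style chat-completions request body; "default" is a placeholder
# for whatever model the server was started with.
payload = {
    "model": "default",
    "messages": [{"role": "user", "content": "What is prefix caching?"}],
    "max_tokens": 64,
    "temperature": 0.0,
}

# Serialize the body exactly as an HTTP client would before POSTing it to `url`.
body = json.dumps(payload).encode("utf-8")
```

With a server actually running, the same payload could be sent with any HTTP client (e.g. ``curl`` or ``requests``), or through the official ``openai`` Python client by overriding its ``base_url``.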

.. toctree::
   :maxdepth: 1
   :caption: Get Started

   get_started/install.md

.. toctree::
   :maxdepth: 1
   :caption: Basic Usage

   basic_usage/send_request.ipynb
   basic_usage/openai_api.rst
   basic_usage/ollama_api.md
   basic_usage/offline_engine_api.ipynb
   basic_usage/native_api.ipynb
   basic_usage/sampling_params.md
   basic_usage/popular_model_usage.rst

.. toctree::
   :maxdepth: 1
   :caption: Advanced Features

   advanced_features/server_arguments.md
   advanced_features/hyperparameter_tuning.md
   advanced_features/attention_backend.md
   advanced_features/speculative_decoding.ipynb
   advanced_features/structured_outputs.ipynb
   advanced_features/structured_outputs_for_reasoning_models.ipynb
   advanced_features/tool_parser.ipynb
   advanced_features/separate_reasoning.ipynb
   advanced_features/quantization.md
   advanced_features/quantized_kv_cache.md
   advanced_features/expert_parallelism.md
   advanced_features/dp_dpa_smg_guide.md
   advanced_features/lora.ipynb
   advanced_features/pd_disaggregation.md
   advanced_features/epd_disaggregation.md
   advanced_features/pipeline_parallelism.md
   advanced_features/hicache.rst
   advanced_features/pd_multiplexing.md
   advanced_features/vlm_query.ipynb
   advanced_features/dp_for_multi_modal_encoder.md
   advanced_features/cuda_graph_for_multi_modal_encoder.md
   advanced_features/piecewise_cuda_graph.md
   advanced_features/sgl_model_gateway.md
   advanced_features/deterministic_inference.md
   advanced_features/observability.md
   advanced_features/checkpoint_engine.md
   advanced_features/sglang_for_rl.md

.. toctree::
   :maxdepth: 2
   :caption: Supported Models

   supported_models/text_generation/index
   supported_models/retrieval_ranking/index
   supported_models/specialized/index
   supported_models/extending/index

.. toctree::
   :maxdepth: 2
   :caption: SGLang Diffusion

   diffusion/index
   diffusion/installation
   diffusion/compatibility_matrix
   diffusion/api/cli
   diffusion/api/openai_api
   diffusion/performance/index
   diffusion/performance/attention_backends
   diffusion/performance/profiling
   diffusion/performance/cache/index
   diffusion/performance/cache/cache_dit
   diffusion/performance/cache/teacache
   diffusion/support_new_models
   diffusion/contributing
   diffusion/ci_perf
   diffusion/environment_variables

.. toctree::
   :maxdepth: 1
   :caption: Hardware Platforms

   platforms/amd_gpu.md
   platforms/cpu_server.md
   platforms/tpu.md
   platforms/nvidia_jetson.md
   platforms/ascend_npu_support.rst
   platforms/xpu.md

.. toctree::
   :maxdepth: 1
   :caption: Developer Guide

   developer_guide/contribution_guide.md
   developer_guide/development_guide_using_docker.md
   developer_guide/development_jit_kernel_guide.md
   developer_guide/benchmark_and_profiling.md
   developer_guide/bench_serving.md
   developer_guide/evaluating_new_models.md

.. toctree::
   :maxdepth: 1
   :caption: References

   references/faq.md
   references/environment_variables.md
   references/production_metrics.md
   references/production_request_trace.md
   references/multi_node_deployment/multi_node_index.rst
   references/custom_chat_template.md
   references/frontend/frontend_index.rst
   references/post_training_integration.md
   references/release_lookup
   references/learn_more.md

.. toctree::
   :maxdepth: 1
   :caption: Security Acknowledgement

   security/acknowledgements.md
