Organizations

@go-sun-team @Higress-RAG

sjmshsh/README.md

Hi there 👋

  • Open-source experience
    • Apache Dubbo community (Apache Dubbo Committer)
      • Founder and maintainer of dubbo-kubernetes. Currently working on cross-cluster support and LLM-related rework
    • Kuma
    • cuLA
    • vllm
    • sglang
    • sglang-omni
  • Work
    • Took an inference platform on a thousand-GPU cluster from zero to production
    • Tencent: experience with agents and LLM post-training

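Always open to new job opportunities. WeChat: Misaka_-_-_Mikoto QQ: 1550693033

Currently learning: vllm, RL infra, RL algorithms

Currently contributing to: vllm, cuLA, AReal. Also competing in a contest on KV cache optimization for long sequences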

Pinned

  1. sgl-project/sglang-omni (Public)

    SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models

    Python · 462 stars · 193 forks

  2. sgl-project/sglang (Public)

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Python · 28.9k stars · 6.4k forks

  3. vllm-project/vllm (Public)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 82.2k stars · 17.8k forks

  4. apache/dubbo-kubernetes (Public)

    Build a native mesh using gRPC and xDS.

    Go · 197 stars · 88 forks

  5. inclusionAI/cuLA (Public)

    CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

    Python · 519 stars · 63 forks

  6. RL-Align/RL-Kernel (Public)

    Modern RL post-training infrastructure, optimized for NVIDIA/AMD GPUs with a focus on vLLM integration, Triton kernels, and transparent hardware-aware scaling.

    Python · 80 stars · 13 forks