vLLM large-scale serving: DeepSeek 2.2k tok/s/H200 with wide-EP

(blog.vllm.ai)

55 points | by robertnishihara 12 hours ago

5 comments