Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM

(arxiv.org)

16 points | by dryarzeg 4 hours ago

2 comments