Towards Greater Leverage: Scaling Laws for Efficient MoE Language Models

(arxiv.org)

4 points | by Anon84 a day ago

No comments yet.