10 points | by i386 5 hours ago
3 comments
You lost me on "spare GPU". I don't have any capable GPUs, let alone spare ones :)
> MoE models via expert sharding with zero cross-node inference traffic
This makes the whole project questionable
This is very promising, definitely looks more user friendly than exo. Can't wait to try it out.