Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change

(andreaborio.substack.com)

6 points | by andreaborio 10 hours ago

1 comment

andreaborio 10 hours ago

[dead]