Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change

(andreaborio.substack.com)

6 points | by andreaborio 10 hours ago

1 comment