FastVLM: Efficient vision encoding for vision language models

(github.com)

272 points | by nhod 9 hours ago

48 comments