3 points | by deevus 17 hours ago
2 comments
I also use self-hosted LLMs. You can make three GTX 1080s run a 7B model competently at limited context through Ollama. Get a little bolder with LM Studio and you can actually get a coherent and fairly reliable model.
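For reference, here's roughly what that looks like from the Ollama side. This is just a sketch, not necessarily the parent's exact setup: the model tag and context size are illustrative assumptions, and it assumes the official `ollama` Python package talking to a local Ollama server.

    # Sketch only: model tag and num_ctx are illustrative assumptions.
    # Requires `pip install ollama` and a running local Ollama server
    # with the model already pulled (e.g. `ollama pull mistral:7b`).
    import ollama

    response = ollama.chat(
        model="mistral:7b",  # any 7B model you've pulled locally
        messages=[{"role": "user", "content": "Explain RAID 5 briefly."}],
        # Cap the context window so the weights plus KV cache fit
        # across three 8GB GTX 1080s instead of spilling to CPU.
        options={"num_ctx": 4096},
    )
    print(response["message"]["content"])

The `num_ctx` cap is the "limited context" part: a smaller window keeps VRAM usage down, which is what makes older 8GB cards workable.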
On macOS, if you opted for 32GB of RAM, you can run a gpt-oss model with LM Studio really easily.
It's "good enough" for a lot of questions, and it doesn't go up and down like a yo-yo the way hosted services do (the OpenAI status dashboard lies).
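If anyone wants to script against that LM Studio setup: LM Studio can expose an OpenAI-compatible server locally (default port 1234), so the stock OpenAI Python client works against it. Sketch below; the model identifier is an assumption and depends on what you've actually loaded.

    # Sketch only: assumes LM Studio's local server is running
    # (Developer tab -> Start Server, default http://localhost:1234)
    # with a gpt-oss build loaded. The model name is illustrative.
    from openai import OpenAI

    # The api_key value is ignored by the local server, but the
    # client constructor requires something to be passed.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    resp = client.chat.completions.create(
        model="openai/gpt-oss-20b",
        messages=[{"role": "user", "content": "Summarize mmap in one paragraph."}],
    )
    print(resp.choices[0].message.content)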