Steering interpretable language models with concept algebra

(guidelabs.ai)

26 points | by luulinh90s 21 hours ago

1 comment