CVE-Bench: testing LLM agents on real-world vulnerability patches

(giovannigatti.github.io)

8 points | by logickkk1 2 hours ago

1 comment