“But it appears 1 or more organizations have successfully jail-broken Fable 5”
This is hardly true, or else it’s true of all frontier models and was only magnified by Fable’s capabilities. The point is that you can hand Fable 5 vulnerable code and ask it to fix it and return a patch plus test cases proving the fix, and the exploit-relevant detail falls out as a byproduct of legitimate secure code review work.
I challenge anyone to provide a fix for this “exploit” without compromising Fable’s ability to patch insecure code.
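To make that concrete, here is a minimal sketch of my own (not anything Fable produced): a patched path-handling helper plus the regression test that proves the fix. `safe_join` and `test_rejects_traversal` are hypothetical names, and path traversal is just a stand-in bug class. Note that the test's input is, for practical purposes, the exploit payload.

```python
import os
import tempfile


def safe_join(base_dir: str, user_path: str) -> str:
    """Hypothetical patched helper: resolve user_path under base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # The fix: the fully resolved path must still live inside base.
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes base directory")
    return target


def test_rejects_traversal() -> bool:
    """Regression test for the fix; the input below doubles as the attack string."""
    with tempfile.TemporaryDirectory() as base:
        try:
            safe_join(base, "../../etc/passwd")
        except ValueError:
            return True
        return False
```

You can't really ask for the second function without getting the attack string along with it.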
As with so much (LLM) security work, the devil is in the details: "~25 security issues per codebase" means nothing without a grounding in the codebase's actual security model, the capabilities exposed to an attacker, etc. I haven't used Aikido's product, but my experience with similar tools is that they tend not to find actual security issues until a proper security model is introduced for grounding.
(I say this as someone who is, broadly, extremely impressed by and interested in the use of LLMs for security research.)
> logic based vulnerabilities like a ReDoS pattern identified from source without live exploitation, or an admin-only route that's never been exercised
The two classes of vulnerability given as examples are exactly the kind of issue I probably don’t care about, and they are not grounded in an actual security model.
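As a rough illustration of what I mean by grounding (a sketch under my own assumptions, not how Aikido or any other tool works): a finding only matters if a role the threat model treats as untrusted can reach it with input they control. The `Finding` fields, the role names, and the sample findings below are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    required_role: str             # least-privileged role that can reach the code path
    attacker_controls_input: bool  # does untrusted data actually flow into it?


# Hypothetical security model: roles the threat model treats as untrusted.
UNTRUSTED_ROLES = {"anonymous", "user"}


def in_scope(finding: Finding) -> bool:
    """A finding counts only if an untrusted role can reach it with their own input."""
    return finding.required_role in UNTRUSTED_ROLES and finding.attacker_controls_input


findings = [
    Finding("ReDoS in config-file regex", "admin", False),
    Finding("ReDoS in search query regex", "anonymous", True),
    Finding("Missing check on admin-only route", "admin", True),
]

# Keep only the findings the security model says an attacker can exercise.
actionable = [f.title for f in findings if in_scope(f)]
```

Under this toy model only the search-query ReDoS survives triage; the other two are noise until the model says admins are untrusted.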
This looks promising, but I find it a little odd to bury the bulk of plan limitations under "fair-usage limits". When the limitations are specifically coupled to plans, it feels less like an FUP and more like plan-specific caps that should be surfaced more directly.
We’ve been using Aikido’s code scanning and pen test tools and have been pretty impressed. Will have to take a look at this.
I'm building a competing product and am curious if you'd be up for a conversation about what you've enjoyed best about Aikido and, importantly, what gaps are still not covered.
Looks like a solid bridge between SAST and manual review. Will check it out.
This is marketed as a defensive tool, but how do you prove that what you're checking is actually "your" source code?