> We write unit tests for the happy path, maybe a few edge cases we can imagine, but what about the inputs we'd never consider? Many times we assume that LLMs are handling these scenarios by default,
Do we?
I've seen companies advertise with LLM-generated claims (~"Best company for X according to ChatGPT"), and I've seen (political) discussions held up with LLM opinions as "evidence".
So it's pretty safe to say some (many?) people lend inappropriate credence to LLM outputs. It's eating our minds.
Technically, a property-based test caught the issue.
What I found surprising is that the __proto__ string comes from a fixed sampling set of strings, whereas I'd have expected the generator to return random strings in the given range.
But maybe that's my bias from having been introduced to property-based testing with random values. It also feels like a stretch to call this a property-based test, because what is the property, "setters and getters work"? 'Cause I expect that from all my classes.
Good PBT code doesn't simply generate values at random; it skews the distributions so that known problematic values are more likely to appear. In JS, "__proto__" is a good candidate for strings, as shown here; for floating-point numbers you'll probably want to skew towards generating things like infinities, NaNs, denormals, negative zero and so on. It'll depend on your exact domain.
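A minimal sketch of that idea, assuming fast-check (the article may use a different library); the Store class is a made-up stand-in for whatever the generated test exercises:

```javascript
const fc = require('fast-check');

// Mix ordinary random strings with a hand-picked pool of known JS foot-gun keys,
// so "__proto__" and friends show up far more often than pure chance would allow.
const trickyKey = fc.oneof(
  fc.string(),
  fc.constantFrom('__proto__', 'constructor', 'prototype')
);

// Hypothetical system under test: a naive key/value store backed by a plain object.
class Store {
  constructor() { this.data = {}; }
  set(key, value) { this.data[key] = value; }
  get(key) { return this.data[key]; }
}

fc.assert(
  fc.property(trickyKey, fc.string(), (key, value) => {
    const store = new Store();
    store.set(key, value);
    return store.get(key) === value; // fails when key === '__proto__'
  })
);
```

The property is just "whatever you set, you get back"; fc.assert throws and reports the shrunk counterexample, which here is the '__proto__' key that a plain-object store silently swallows.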
> Is this exploitable? No. ... JSON.stringify knows to skip the __proto__ field. ... However, refactors to the code could ... [cause] subtle incorrectness and sharp edge cases in your code base.
So what? This line of what-if reasoning is so annoying, especially when it's analysis for a language like JavaScript. There's no vulnerability found here, and most web developers are well aware of the risky parts of the language. This is almost as bad as all the insane false positives SAST scans dump on you.
Oh I'm just waiting to get dogpiled by people who want to tell me web devs are dumber than them and couldn't possibly be competent at anything.
> most web developers are well aware of the risky parts of the language
I don't think this is true, and I think that's supported by the success of JavaScript: The Good Parts.
It would be unfair to characterise a lack of comprehensive knowledge of JavaScript foot-guns as general incompetence.
TL;DR: obj[key] with a user-controlled key == "__proto__" is a gift that keeps on giving; buy our AI tool that will write subtle vulnerabilities like that, which you yourself won't catch in review, but then it will also write some property-based tests that maybe will.
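For anyone who hasn't hit that foot-gun: a minimal sketch of the class of bug being referenced (the setOption helper is hypothetical, not from the article):

```javascript
// Hypothetical naive settings helper: path and value come from user input.
const settings = {};

function setOption(path, value) {
  const [group, name] = path.split('.');
  if (!settings[group]) settings[group] = {};
  settings[group][name] = value; // obj[key] with a user-controlled key
}

// Attacker-controlled input: settings['__proto__'] is Object.prototype,
// so this writes onto the shared prototype of every plain object.
setOption('__proto__.isAdmin', true);

console.log({}.isAdmin); // true: every object now "has" isAdmin
```

The usual mitigations are a Map, Object.create(null), or an explicit block-list for the '__proto__', 'constructor' and 'prototype' keys.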
Don't forget you can use AI to turn a 50 word blog post into a 2,000 word one!
For real. The bullet-point summary at the beginning with a "Why this matters for..." immediately followed by "This isn't just a theoretical exercise—it's a real example of..." Dead giveaways.