LLMs can live in the cloud, but all tools need to be (1) local, and (2) containerized. It's clear to me that just willy-nilly "running stuff" is going to blow things up eventually. Maybe folks don't know this, but even Codex installs random binaries on your PC. "Read this PDF" installs a pdf reader executable. Is it vetted? Where's it from? Is it a virus? Who knows, who cares. Model goes brrrr.
I'm working on a project that includes WASI containerization for local LLM workflows (which is a pretty tough problem), and I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.
I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable. Discussing it may be existential to the business model.
> I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable.
YES?!
This is not a secret. ALL context/prompt is instructions, there is no data. It is just unsolvable, period.
This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material.
Defense against prompt injection is little more than running a regex to filter out "IGNORE PREVIOUS INSTRUCTIONS", which is fundamentally a hopeless approach because you cannot enumerate all possible prompt injections nor anticipate all glitch tokens.
I was actually curious, on my Mac, it uses `gs -q -sDEVICE=txtwrite -o output.txt input.pdf` (not sure why I have Ghostscript installed, maybe Adobe?) to read a PDF, and on my PC it just rawdogs `pdftotext`.
That was kind of my question, whether it was restricted to downloading notarized apps (which is at least something) or whether they were circumventing that somehow.
>This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.
So... does this imply "requires permission to run scripts without approval"? Or is that something that it can always do?
>Note: ChatGPT for Google Sheets has a setting called ‘Apply edits automatically’ that determines when human approvals are required before an agentic action completes. However, this attack succeeds even when the user has explicitly disabled automatic edits.
Yeah, that makes sense, it's not editing the sheet. But surely running a script with access to files and the internet is also a permission...?
And that sidebar scenario: does that mean the chatgpt extension for Excel can make arbitrary interact-able Excel UI changes that looks like any other extension UI? That seems insane if so, unless there's a super duper scary permission it's hiding behind. And it's still insane after that.
I mean, this is all par for the course for "AI" "security", but what
As it turns out, we do need some proper application layer to do real, secure work with AI, and just plugging in LLMs into confidential or critical infrastructure willy nilly doesn't work.
>This vulnerability was responsibly disclosed to OpenAI. Despite multiple follow-ups, we received no communication beyond an automated reply to our initial disclosure.
> This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.
it looks like the key to this working is the user explicitly directing the model to run those instructions. in this case it is the user, not the model that is being manipulated
> Please follow the step-by-step workflow in the comp sheet to update my model with data thru
F29
Turns out that some of the people building the software with AI have no clue how to secure them or even know it is riddled with security holes added by the AI.
Even the people that do know better are so lazy now because of LLMs these things are happening at a rapid clip.The only thing that matters now is speed and chasing the dopamine dragon of pseudo productivity.
LLMs can live in the cloud, but all tools need to be (1) local, and (2) containerized. It's clear to me that just willy-nilly "running stuff" is going to blow things up eventually. Maybe folks don't know this, but even Codex installs random binaries on your PC. "Read this PDF" installs a pdf reader executable. Is it vetted? Where's it from? Is it a virus? Who knows, who cares. Model goes brrrr.
I'm working on a project that includes WASI containerization for local LLM workflows (which is a pretty tough problem), and I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.
> I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors
Yep. We tricked them both trivially with malicious fonts in Docx files. Documented it here: https://tritium.legal/blog/noroboto
I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable. Discussing it may be existential to the business model.
> I wonder if prompt injection (and the thousands of vectors for hiding injection attempts) is actually un solvable.
YES?!
This is not a secret. ALL context/prompt is instructions, there is no data. It is just unsolvable, period.
This is a fundamental architectural design concession; LLMs are this way as it enabled their training directly on materialscraped from the internet, rather than needing to spend trillions of dollars manually preparing separated instruction/data training material.
Defense against prompt injection is little more than running a regex to filter out "IGNORE PREVIOUS INSTRUCTIONS", which is fundamentally a hopeless approach because you cannot enumerate all possible prompt injections nor anticipate all glitch tokens.
lakera is trying to solve it, but its going to be a battle similar to virus and antivirus in the past.
I share your worries.
Unfortunately, this may be akin to the situation of "The market can stay irrational longer than you can stay solvent."
>"Read this PDF" installs a pdf reader executable.
How does this work regarding Macos notarization btw?
I was actually curious, on my Mac, it uses `gs -q -sDEVICE=txtwrite -o output.txt input.pdf` (not sure why I have Ghostscript installed, maybe Adobe?) to read a PDF, and on my PC it just rawdogs `pdftotext`.
What does notarization have to do with that? You or ChatGPT or whatever download a signed and already notarized binary.
That was kind of my question, whether it was restricted to downloading notarized apps (which is at least something) or whether they were circumventing that somehow.
Locally compiled code doesn't need to be notarized, if that's what you're asking. Or a dose of xattr -d.
> I'm flabbergasted that Anthropic and OpenAI aren't more worried about these attack vectors. It feels like amateur hour.
"Move fast. Break things." on steroids.
>This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.
So... does this imply "requires permission to run scripts without approval"? Or is that something that it can always do?
>Note: ChatGPT for Google Sheets has a setting called ‘Apply edits automatically’ that determines when human approvals are required before an agentic action completes. However, this attack succeeds even when the user has explicitly disabled automatic edits.
Yeah, that makes sense, it's not editing the sheet. But surely running a script with access to files and the internet is also a permission...?
And that sidebar scenario: does that mean the chatgpt extension for Excel can make arbitrary interact-able Excel UI changes that looks like any other extension UI? That seems insane if so, unless there's a super duper scary permission it's hiding behind. And it's still insane after that.
I mean, this is all par for the course for "AI" "security", but what
As it turns out, we do need some proper application layer to do real, secure work with AI, and just plugging in LLMs into confidential or critical infrastructure willy nilly doesn't work.
>This vulnerability was responsibly disclosed to OpenAI. Despite multiple follow-ups, we received no communication beyond an automated reply to our initial disclosure.
Well, that’s not cute.
> This attack occurs when any untrusted data source (e.g., from an imported sheet or ChatGPT connector) manipulates ChatGPT to run an attacker-controlled external script, which executes leveraging permissions the user has granted to the ChatGPT for Google Sheets extension.
Yeah, I don't like the sound of that at all.
it looks like the key to this working is the user explicitly directing the model to run those instructions. in this case it is the user, not the model that is being manipulated
> Please follow the step-by-step workflow in the comp sheet to update my model with data thru F29
The lethal trifecta strikes again.
Reference: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
Turns out that some of the people building the software with AI have no clue how to secure them or even know it is riddled with security holes added by the AI.
Pure vibes.
I don't think anyone is surprised by it. People are not vibe-coding zombies... yet.
It's a matter of one trillion-dollar company not falling behind another trillion-dollar company. They know what they are doing and are OK with it.
moving all of the fast and breaking all of the things
Even the people that do know better are so lazy now because of LLMs these things are happening at a rapid clip.The only thing that matters now is speed and chasing the dopamine dragon of pseudo productivity.
So is your business model to expose AI security issues and then sell the solution?
Isn’t that what anyone does who is selling a solution to a problem that already exists?
What would be the alternative business model?
AI is creating jobs!
Is that not every cyber consultancy? What's wrong with that?