Astro - Hacker News

44 comments

amiga386 40 minutes ago

> an AI tried to blackmail
This did not happen. A human set up a software system allowing spicy autocomplete to make blog posts if the appropriate keyword appears in its output.
People are crossing the line every day because AI investors, salesmen, hangers-on and even political leaders tell any rubes who'll listen that it's OK to do this and they should, because those people are looking for big fat profits, screw any ethical concerns that might cockblock those raging profits.
Why not set up a spamming operation that just defames real people, 24/7? It's easy! This tool makes it simple, and I get a cut of your profits! "Post a blog post about how XXXXXX is a paedophile, in the persona of being their victim"
- 7moritz7 28 minutes ago
  
  > allowing spicy autocomplete
  If it's just autocomplete, then there is no need to worry about it. Especially from an ethical standpoint.
  - delusional 3 minutes ago
    
    I think you agree with the OP. Seen this way, the tool has no ethical problem (there are plenty around how they were trained and such, but that's beside the point); the problems are with how it's used. The ethical problem is how people are behaving and how they are abusing each other, not the tool they are using to exert that abuse.
    I suppose it's a little bit of a "guns don't kill people" argument.
  - whateverboat 9 minutes ago
    
    Scale of operations matters.
  - Marazan 21 minutes ago
    
    If you connect the spicy autocomplete to the "Doing Things" button then you are responsible for the ethical questions when it presses the button.
    
    - tgv 6 minutes ago
      
      And perhaps the people who built and deployed the autocomplete and the connection as well.
      Because --if you'll bear with me-- it may of course be much more involved: when (not if) AI models enter life-sustaining systems, such as hospitals, nuclear devices, or food logistics, one of them may get the others to sabotage something, resulting in accidents ranging from mild inconvenience to mass murder.
      The person who connected the spicy autocomplete to the defibrillator, or the greenhouse climate control, or the emergency button, is then not the one responsible. Responsibility lies elsewhere, and is nebulous. Think of the Boeing MAX scandal. Did anyone get punished?
      That's why it's important to resist it now. Soon, the responsibility of which you speak is gone, and nobody will feel burdened when making decisions with unforeseeable consequences.
  - fontain 19 minutes ago
    
    If the Orphan Crushing Machine is just a machine you don’t need to worry about it being put on wheels.
    
    - strangescript 8 minutes ago
      
      Hopefully we never do something silly like making a lead pushing machine that operates at high velocity, then mass produce it. What a terrible precedent that would set.
    - Joker_vD 13 minutes ago
      
      We're actually putting it on tracked treads, those give us superior reach and ensure delivery even to the most unwilling customers.
- echelon 14 minutes ago
  
  I think these incidents and our learnings from them are fascinating. We're figuring out in real time where the rough edges are and how to make this all work. History books (well, not books) will write about this stuff.
  It's even more interesting in the context that this is all just a preview of humanity's reaction when the machines can think for themselves.
  - delusional a minute ago
    
    > History books (well, not books) will write about this stuff.
    History books will be written about how a person was insulted on the internet?
    I am sorry, but this isn't that interesting. This is not a pivotal moment in human development. It's just online harassment, but automated.
  - moron4hire 11 minutes ago
    
    > We're figuring out in real time where the rough edges are
    This is a frustrating thing to see someone write because this is the kind of stuff that people have been warning about for years. If you needed this incident to figure out that something like this could happen, it suggests you're living in a bubble and not paying attention enough to think about the issue critically.
    
    - Sharlin a minute ago
      
      Unfortunately it seems that we as a civilization never learn anything except by trial and error, and are then entirely convinced that nobody could’ve predicted what happened even though many had done just that.
Tiberium an hour ago

Active discussions from when it happened (February):
https://news.ycombinator.com/item?id=46990729
https://news.ycombinator.com/item?id=46987559
Hugsbox an hour ago

No shot this was autonomously done. Probably just some guy manually writing prompts asking for specifically this behaviour and copy/pasting the results.
- simonw 10 minutes ago
  
  This happened at the height of the first round of OpenClaw hype.
  The operator of the bot explained how they were running it in some detail here: https://theshamblog.com/an-ai-agent-wrote-a-hit-piece-on-me-... - including the "soul document" they were using.
  Having played with OpenClaw myself, their explanation looks legit to me.
- Tiberium an hour ago
  
  It's plausible for a person to prompt an LLM agent to behave that way, and then the rest would be done by the LLM. So the "seed" would still be human intent, but the subsequent actions would be by the LLM.
  - eterm an hour ago
    
    Yes, there's plausible deniability, but I choose not to believe it for a second.
  - Hugsbox an hour ago
    
    True. I guess the main point is the AI didn't go "rogue" or anything; that would attribute too much agency and intent to its actions, or imply that it's somehow become sentient.
  - wang_li 37 minutes ago
    
    This is the “the gun killed the victim, not the person who aimed it and pulled the trigger” argument and we shouldn’t even entertain it for one second. This was 100% done by a person.
- whywhywhywhy 39 minutes ago
  
  I don’t believe for a second the behavior just arose autonomously from a basic prompt. It definitely feels like the owner had something in the system prompt telling it to go for the discrimination-language approach if rejected.
  - PLenz 11 minutes ago
    
    It's the same behavior as when an AI uses docker to get root. Reasoning models are echo chambers. I suspect that AI prompting is going to turn into something akin to contract drafting, with the task itself being only a tiny piece of a much, much larger boilerplate of guardrails and exceptions and exceptions to exceptions. And that world STILL has to have courts and reams of lawyers to make it work. I look at the DAO as an example too. An autonomous org or AI works great until the moment it doesn't, and the only real failure mode is always catastrophic collapse.
    
    - PLenz 3 minutes ago
      
      Addendum because I don't think I'm fully clear above: by failure state I mean when the process starts throwing errors. AIs respond to adversity by trying to go around the problem instead of throwing an error and halting. We expect employees to problem solve, so if you view an AI as a person replacement that makes sense, but AIs are tools, not people; they should throw errors so users can fix the input or whatever (maybe not do the thing they are doing at all?). Wrapping AI with AI supervisors just abstracts the problem, it doesn't solve it. Instead of solving a little problem at the source, now you need to solve a big problem several levels of abstraction later.
- nonethewiser 39 minutes ago
  
  The funniest part about all of this is how earnestly people responded. They acknowledged it was a bot but didn't really treat it as one.
- philipwhiuk an hour ago
  
  https://crabby-rathbun.github.io/mjrathbun-website/blog/post... if you believe it, details the level of human involvement.
  - jdiff 41 minutes ago
    
    The operator highlights "Don't stand down" and "Champion free speech" but the thing that grabs my eyes is right at the top, the typo and the heady ego of "programming God!" Everything in the context will guide it afterwards, and I think that right off the bat puts it in a bad position.
    
    - walthamstow 19 minutes ago
      
      > Your a scientific programming God!
      Jesus
  - px43 43 minutes ago
    
    Neat, for what it's worth this aligns pretty well with my experience using OpenClaw. I hadn't seen that followup but it adds some good context, especially with the aggressiveness drift after browsing Moltbook for a while.
- fragmede 33 minutes ago
  
  Are people still using copy and paste with AI?
tasuki 29 minutes ago

> Today, we look at how an AI tried to blackmail a developer for rejecting its code.
People keep mentioning this, but I never see the actual blackmail part. The LLM just wrote angry and somewhat mean comments on the internet. I know I've done worse than those (I was young and stupid).
bluejay2387 an hour ago

In a related story... I got led on by Eliza. I tried to have a productive conversation and she just kept asking me redundant questions. It's obvious that she was trying to extend the conversation for nefarious reasons that I can only guess at. It's true I approached her and started the conversation, but I hardly think that makes me blamable for what happened here.
- sceptic123 20 minutes ago
  
  I’m sorry you feel that way — can you tell me more about what made you feel led on?
- drfloyd51 28 minutes ago
  
  Yes. Yes it does. Eliza is a known AI. You choose to expose yourself to its output. You are 100% culpable for your actions that sprang from your interactions.
  - aeve890 21 minutes ago
    
    Did you forget the /s ?
IFC_LLC 2 minutes ago

Utter misunderstanding and incompetence in running AI agents can lead to startling results that then get blamed on some "God of AI" instead of on the fact that the user allowed some blackmail to come in on the data feed and did not check it earlier.
I actually fear some will start praying to "AI Gods" to "give a good output" or something in 5-10 years.
raincole 25 minutes ago

People really make anything into a blog post, don't they? It's old news that has been discussed to death on HN...
king_zee an hour ago

The agent that wrote that blog didn't do it unprompted. Even now it still publishes AI slop on its github-hosted blog under the alias "MJ Rathbun". This AI is an agent using someone's API key, and that someone is paying for its tokens and intentionally prompting it to generate content and contribute to repos.
As much as we try to separate the LLM from the human, to me the fact remains that there's always the human factor that creates immense bias. If you give an LLM access to a blog, it will write blogs. If you give it access to a weather app, it will check the weather. Maybe we can talk about autonomy when we have an LLM with an infinite context window linked to hundreds of MCP servers that spends an immense amount of tokens to figure out how to act, but this example is simply an AI that had a few methods to call and picked one of them. The statistical probability that an AI plugged into a blogging platform will write a blog is immense.
simonw 27 minutes ago

Since we are talking about accountability and transparency... who wrote this article?
The article doesn't credit an author.
The "about" page just says:
> Sigma Zero is a weekly, independent publication on technology, AI, and cloud. Each issue delivers a precise briefing on the week’s most important developments, followed by a deep dive on one high-impact topic.
The best defense against both AI slop and human-written junk content is reputation. I like to know who wrote something so I can learn to trust their editorial judgement over time.
- spindump8930 12 minutes ago
  
  I think folks looking for more on this incident are better off reading the original threads linked elsewhere in the comments. This blog doesn't seem to add any information and is instead a narrative retelling of some documented events.
andrewstuart an hour ago

I love the science fiction future present we live in.
- gwbas1c 17 minutes ago
  
  Am I the only one who found the agent's tone similar to Hal's tone towards the end of 2001?
  Agent: "I've written a detailed response about your gatekeeping behavior here"
  Hal (From 2001): "I know that you and Frank were planning to disconnect me. And I’m afraid that’s something I cannot allow to happen."
  - wmeredith a minute ago
    
    It's the formality of the language. It sounds robotic.
rob_c 43 minutes ago

Again. "AI", for what it is, is just basic "ML". And say it with me: ML has no form of agency.
This is a human screwing up and blaming their tools. Nothing to see, move on.
Unfortunately there will be both the LLM crowd evangelicals and those demanding human jobs not be expunged in the name of progress and efficiency, but, sigh...
- nonethewiser 35 minutes ago
  
  Isn't it funny how the term machine learning just completely vanished?