My parents were tricked the other day by a fake YouTube video of a "racist cop" doing something bad, and they got outraged by it. I watched part of the video, and even though it felt off, I couldn't immediately tell for sure whether it was fake. Nevertheless, I googled the names and details and found nothing but repostings of the video. Then I looked at the YouTube channel info, and there it said it uses AI for "some" of the videos to recreate "real" events. I really doubt that... it all looks fake. I am just worried about how much divisiveness this kind of stuff will create, all so someone can profit off of YouTube ads... it's sad.
I’m spending way too much time on the RealOrAI subreddits these days. I think it scares me because I get so many wrong, so I keep watching more, hoping to improve my detection skills. I may have to accept that this is just the new reality - never quite knowing the truth.
Those subreddits label content wrong all the time. Some of the top commenters are trolling (I've seen one cooking video where the most upvoted comment is "AI, the sauce stops when it hits the plate"... as thick sauce should do).
You're training yourself with a very unreliable source of truth.
Many people seek to be outraged. Many people seek awareness of the truth. Many people seek help for their problems. These are not mutually exclusive.
Just because someone fakes an incident of racism doesn't mean racism isn't still commonplace.
In various forms, with various levels of harm, and with various levels of evidence available.
(Example of low evidence: a paper trail isn't left when a black person doesn't get a job for "culture fit" gut feel reasons.)
Also, faked evidence can be done for a variety of reasons, including by someone who intends for the faking to be discovered, with the goal of discrediting the position that the fake initially seemed to support.
I like that saying. You can see it all the time on Reddit where, not even counting AI generated content, you see rage bait that is (re)posted literally years after the fact. It's like "yeah, OK this guy sucks, but why are you reposting this 5 years after it went viral?"
Rage sells. Not long after the EBT changes, there was a rash of videos of people playing the welfare caricature that welfare opponents imagine in their heads: women, usually black, talking crudely about how the taxpayers need to take care of their kids.
Not sure how I feel about that, to be honest. On one hand, I admire the hustle for clicks. On the other, too many people fell for it and probably never knew it was a grift, making all recipients look bad. I only happened upon them while researching a bit after my own mom called me raging about it and sent me the link.
a reliable giveaway for AI generated videos is just a quick glance at the account's post history—the videos will look frequent, repetitive, and lack a consistent subject/background—and that's not something that'll go away when AI videos get better
AI is capable of consistent characters now, yes, but the platforms themselves provide little incentive to use them. TikTok/Instagram Reels are designed to serve recommendations, not a user-curated feed of people you follow, so consistency is not needed for virality.
Not long ago, a statistical study found that AI almost always has an 'e' in its output. It is a firm indicator of AI slop. If you catch a post with an 'e', pay it no mind: it's probably AI.
Uh-oh. Caught you. Bang to rights! That post is firmly AI. Bad. Nobody should mind your robot posts.
I'm incredibly impressed that you managed to make that whole message without a single usage of the most frequently used letter, except in your quotations.
The problem’s gonna be when Google, too, is plastered with fake news articles about the same thing. There will be very little to no way to know whether something is real or not.
It's a band-aid solution, given that eventually AI content will be indistinguishable from real-world content. Maybe we'll even see a net of fake videos citing fake news articles, etc.
Of course there are still "trusted" mainstream sources, except they can inadvertently (or for other reasons) misstate facts as well. I believe it will get harder and harder to reason about what's real.
It's not really any different from stopping the sale of counterfeit goods on a platform. Which is a challenge, but hardly insurmountable, and the payoff from AI videos won't be nearly as good. You can make a few thousand a day selling knock-offs to a small number of people and get reliably paid within 72 hours. To make the same off of "content" you would have to get millions of views, and the payout timeframe is weeks if not months. YouTube doesn't pay out unless you are verified, so ban people who post AI without disclosing it and the well will run dry quickly.
That sounds like one of the worst heuristics I've ever heard, worse than "em-dash = AI" (em-dash equals AI to the illiterate class, who don't know what they're talking about on any subject and who also don't use em-dashes; literate people do use em-dashes and also know what they're talking about. This is called the Dunning-Em-Dash Effect, where "dunning" refers to the payback of an intellectual deficit, whereas the illiterate think it's a name).
The em-dash=LLM thing is so crazy. For many years Microsoft Word has AUTOCORRECTED the typing of a single hyphen to the proper syntax for the context -- whether a hyphen, en-dash, or em-dash.
I would wager good money that the proliferation of em-dashes we see in LLM-generated text is due to the fact that there are so many correctly used em-dashes in publicly-available text, as auto-corrected by Word...
Which would matter, but no major browser's text entry box does this.
The HN text area does not insert em-dashes for you and never has. On my phone keyboard it's a very deliberate action to add one (symbol mode, long-press hyphen, slide my finger over to em-dash).
The entire point is that it's contextual: em-dashes appearing where no accommodations make them likely.
Yeah, I get that. And I'm not saying the author is wrong, just commenting on that one often-commented-upon phenomenon. If text is being input to the field by copy-paste (from another browser tab) anyway, who's to say it's not (hypothetically) being copied and pasted from the word processor in which it's being written?
Thank you for saving me the time writing this. Nothing screams midwit like "Em-dash = AI". If AI detection was this easy, we wouldn't have the issues we have today.
With the right context both are pretty good actually.
I think the emoji one is most pronounced in bullet point lists. AI loves to add an emoji to bullet points. I guess they got it from lists in hip GitHub projects.
The other one is not as strong, but if the "not X but Y" is somewhat nonsensical or unnecessary, that's a very strong indicator it's AI.
Similarly: "The indication for machine-generated text isn't symbolic. It's structural." I always liked this writing device, but I've seen people label it artificial.
If nobody used em-dashes, they wouldn’t have featured heavily in the training set for LLMs. The em-dash is used somewhat rarely in informal digital prose (some people use it a lot, others not at all), but that’s not the same as being entirely unused.
I didn't know these fancy dashes existed until I read Knuth's first book on typesetting. So probably 1984. Since then I've used them whenever appropriate.
That's the only way I know how to get an em dash. That's how I create them. I sometimes have to re-write something to force the "dash space <word> space" sequence in order for Word to create it, and then I copy and paste the em dash into the thing I'm working on.
Windows 10/11’s clipboard stack lets you pin selections into the clipboard, so — and a variety of other characters live in mine. And on iOS you just hold down -, of course.
Ctrl+Shift+U, then 2014 (em dash) or 2013 (en dash), in Linux. Former academic here, and I use the things all the time. You can find them all over my pre-LLM publications.
Because I could not stop for Death –
He kindly stopped for me –
The Carriage held but just Ourselves –
And Immortality.
We slowly drove – He knew no haste
And I had put away
My labor and my leisure too,
For His Civility –
Her dashes have been rendered as en dashes in this particular case rather than em dashes, but unless you're a typography enthusiast you might not notice the difference (I certainly didn't and thought they were em dashes at first). I would bet if I hunted I would find some places where her poems have been transcribed with em dashes. (It's what I would have typed if I were transcribing them).
Not foolproof, but a couple of easy ways to verify if images were AI generated:
- OpenAI uses the C2PA standard [0] to add provenance metadata to images, which you can check [1]
- Gemini uses SynthId [2] and adds a watermark to the image. The watermark can be removed, but SynthId cannot as it is part of the image. SynthId is used to watermark text as well, and code is open-source [3]
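If you want a quick local sanity check before reaching for a web tool, here's a minimal sketch (my own illustrative heuristic, not a real C2PA parser): C2PA manifests are embedded in JUMBF boxes, whose serialized form contains the ASCII labels "jumb" and "c2pa", so a byte scan can at least flag their presence. Since the metadata is trivially stripped or lost on re-encode, absence proves nothing.

```python
# Rough presence check for C2PA provenance metadata in an image file.
# This is NOT a validator: it only says a manifest *might* be present.
def looks_like_c2pa(path: str) -> bool:
    with open(path, "rb") as f:
        data = f.read()
    # JUMBF superboxes use the type "jumb"; C2PA labels contain "c2pa".
    return b"jumb" in data and b"c2pa" in data
```

A proper check should parse and cryptographically verify the manifest (e.g. with the open-source c2pa tooling); this only tells you whether there is anything there to verify.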
SynthID can be removed: run it through an image-to-image model with a reasonably high denoising value, or add artificial noise and use another model to denoise, and voila. It's effort that most probably aren't making, but it's certainly possible.
I just went to a random OpenAI blog post ("The new ChatGPT Images is here"), right-click saved one of the images (the one from "Text rendering" section), and pasted it to your [1] link - no metadata.
I know the metadata is probably easy to strip, maybe even accidentally, but their own promotional content not having it doesn't inspire confidence.
Think the notion that ‘no one’ uses em dashes is a bit misguided. I’ve personally used them in text for as long as I can remember.
Also, on the phrase “you’re absolutely right”: it’s definitely a phrase my friends and I use a lot, albeit in a sort of sarcastic manner when one of us says something which is obvious, but nonetheless we use it. We also tend to use “Well, you’re not wrong”, again in a sarcastic manner, for something which is obvious.
And, no, we’re not from non English speaking countries (some of our parents are), we all grew up in the UK.
Just thought I’d add that in there, as it’s a bit extreme to see an em dash and instantly jump to “must be written by AI”.
It is so irritating that people now think you've used an LLM just because you use nice typography. I've been using en dashes a ton (and em dashes sporadically) since long before ChatGPT came around. My writing style belonged to me first—why should I have to change?
If you have the Compose key [1] enabled on your computer, the keyboard sequence is pretty easy: `Compose - - -` (and for en dash, it's `Compose - - .`). Those two are probably my most-used Compose combos.
Also, on phones it is really easy to use em dashes. It's quite obvious whether I posted from desktop or phone, because the use of "---" vs "—" is the dead giveaway.
Hot take, but a character that demands zero-space between the letters at the end and the beginning of 2 words - that ISN'T a hyphenated compound - is NOT nice typography. I don't care how prevalent it is, or once was.
Just my two cents: We use em-dashes in our bookstore newsletter. It's more visually appealing than semicolons and more versatile, as it can be used to block off both ends of a clause. I even use en-dashes between numbers in a range, though, so I may be an outlier.
Well, the dialogue there involves two or more people. When commenting, why would you use that? Even if you have collaborators, you very likely wouldn't be discussing stuff through code comments.
I would add that a lot of us who were born or grew up in the UK are quite comfortable saying stuff like "you're right, but...", or even "I agree with you, but...". The British politeness thing, presumably.
Em-dashes may be hard to type on a laptop, but they're extremely easy to type on iOS—you just hold down the "-" key, as with many other special characters—so I use them fairly frequently when typing on that platform.
That's not as easy as just hitting the hyphen key, nor are most people going to be aware that even exists. I think it's fair to say that the hyphen is far easier to use than an em dash.
But why, when the "-" works just as well and doesn’t require holding the key down?
You’re not the first person I’ve seen say that FWIW, but I just don’t recall seeing the full proper em-dash in informal contexts before ChatGPT (not that I was paying attention). I can’t help but wonder if ChatGPT has caused some people - not necessarily you! - to gaslight themselves into believing that they used the em-dash themselves, in the before time.
In British English you'd be wrong for using an em-dash in those places, with most grammar recommendations being for an en-dash, often with spaces.
It'd be just as wrong as using an apostrophe instead of a comma.
Grammar is often woolly in a widely used language with no single centralised authority. Many of the "hard rules" some people think are fundamental truths are really more local style guides, and often a lot more recent than some people seem to believe.
Interesting, I’m an American English speaker but that’s how it feels natural to me to use dashes. Em-dashes with no spaces feels wrong for reasons I can’t articulate. This first usage—in this meandering sentence—feels bossy, like I can’t have a moment to read each word individually. But this second one — which feels more natural — lets the words and the punctuation breathe. I don’t actually know where I picked up this habit. Probably from the web.
It can also depend on the medium. Typically, newspapers (e.g. the AP style guide) use spaces around em-dashes, but books / Chicago style guide does not.
As a brit I'd say we tend to use "en-dashes", slightly shorter versions - so more similar to a hyphen and so often typed like that - with spaces either side.
I never saw em-dashes—the longer version with no space—outside of published books and now AI.
Besides the LaTeX use, on Linux if you have gone into your keyboard options and configured a rarely-used key to be your Compose key (I like to use the "menu" key for this purpose, or right Alt if on a keyboard with no "menu" key), you can type Compose sequences as follows (note how they closely resemble the LaTeX -- or --- sequences):
Compose, hyphen, hyphen, period: produces – (en dash)
Compose, hyphen, hyphen, hyphen: produces — (em dash)
And many other useful sequences too, like Compose, lowercase o, lowercase o to produce the ° (degree) symbol. If you're running Linux, look into your keyboard settings and dig into the advanced settings until you find the Compose key, it's super handy.
P.S. If I was running Windows I would probably never type em dashes. But since the key combination to type them on Linux is so easy to remember, I use em dashes, degree symbols, and other things all the time.
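For anyone who wants to customize these, the same sequences can also be declared in your own `~/.XCompose` file. A sketch (syntax per the X11 Compose file format; distro and input-method behavior varies, so treat this as illustrative):

```
# ~/.XCompose (sketch): the include pulls in the default locale table,
# then custom sequences can be added or overridden below it.
include "%L"

<Multi_key> <minus> <minus> <minus>  : "—"  U2014   # em dash
<Multi_key> <minus> <minus> <period> : "–"  U2013   # en dash
<Multi_key> <o> <o>                  : "°"  degree  # degree sign
```

Note that some toolkits and Wayland input methods cache or ignore this file, so you may need to re-login for changes to take effect.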
I think that's just incorrect. There are varying conventions for spaces vs no spaces around em dashes, but all English manuals of style confine en dashes to things like "0–10" and "Louisville–Calgary" — at least to my knowledge.
Came here to confirm this. I grew up learning BrE and indeed in BrE, we were taught to use en-dash. I don't think we were ever taught em-dash at all. My first encounter with em-dash was with LaTeX's '---' as an adult.
I'm pretty sure the OP is talking about this thread. I have it top of mind because I participated and was extremely frustrated, not just by the AI slop, but by how adamantly the author claimed not to use AI when they obviously had.
It was not just the em dashes and the "absolutely right!" It was everything together, including the robotic clarifying question at the end of their comments.
>which is not a social network, but I’m tired of arguing with people online about it
I know this was a throwaway parenthetical, but I agree 100%. I don't know when the meaning of "social media" went from "internet based medium for socializing with people you know IRL" to a catchall for any online forum like reddit, but one result of this semantic shift is that it takes attention away from the fact that the former type is all but obliterated now.
Discord is the 9,000lb gorilla of this form of social media, and it's actually quietly one of the largest social platforms on the internet. There's clearly a desire for these kinds of spaces, and Discord seems to be filling it.
While it stinks that it is controlled by one big company, it's quite nice that its communities are invite-only by default and largely moderated by actual flesh-and-blood users. There's no single public shared social space, which means there's no one shared social feed to get hooked on.
Pretty much all of my former IRC/Forum buddies have migrated to Discord, and when the site goes south (not if, it's going to go public eventually, we all know how this story plays out), we expect that we'll be using an alternative that is shaped very much like it, such as Matrix.
> Discord is the 9,000lb gorilla of this form of social media, and it's actually quietly one of the largest social platforms on the internet. There's clearly a desire for these kinds of spaces, and Discord seems to be filling it.
The "former type" had to do with online socializing with people you know IRL.
I have never seen anything on Discord that matches this description.
Yeah, same as sibling comments: I'm in multiple Discord servers for IRL friend groups. I personally run one with ~50 people that sees hundreds of messages a day. By far my most used form of social media. Also, as OP said, I'll be migrating to Matrix (probably) when they IPO; we've already started an archival project just in case.
> "internet based medium for socializing with people you know IRL"
"Social media" never meant that. We've forgotten already, but the original term was "social network" and the way sites worked back then is that everyone was contributing more or less original content. It would then be shared automatically to your network of friends. It was like texting but automatically broadcast to your contact list.
Then Facebook and others pivoted towards "resharing" content and it became less "what are my friends doing" and more "I want to watch random media" and your friends sharing it just became an input into the popularity algorithm. At that point, it became "social media".
HN is neither since there's no way to friend people or broadcast comments. It's just a forum where most threads are links, like Reddit.
It's even worse than that, TikTok & Instagram are labeled "social media" despite, I'd wager, most users never actually posting anything anymore. Nobody really socializes on short form video platforms any more than they do YouTube. It's just media. At least forums are social, sort of.
I'll come clean and say I've still never tried Discord and I feel like I must not be understanding the concept. It really looks like it's IRC but hosted by some commercial company and requiring their client to use and with extremely tenuous privacy guarantees. I figure I must be missing something because I can't understand why that's so popular when IRC is still there.
IRC has many, many usability problems which I'm sure you're about to give a "quite trivial curlftpfs" explanation for why they're unimportant: missing messages if you're offline, inconsistent standards for user accounts/authentication, no consensus on how even basic rich text should work (much less sending images), inconsistent standards for voice calls that tend to break in the presence of NAT, same thing for file transfers...
Right.... how is that different from IRC other than being controlled by a big company with no exit ability and (again) extremely tenuous privacy promises?
The social networks have all added public media and algorithms. I read an explanation that friends don't produce enough content to keep users engaged, so they added public feeds. I'm disappointed that there isn't a private Bluesky/Mastodon. I also want an algorithm that shows the best of what the people I follow have posted since I last checked, so I can keep up.
You know Meta, the "social media company", came out and said their users spend less than 10% of their time interacting with people they actually know?
"Social Media" had become a euphemism for 'scrolling entertainment, ragebait and cats' and has nothing to do 'being social'. There is NO difference between modern reddit and facebook in that sense. (Less than 5% of users are on old.reddit, the majority is subject to the algorithm.)
I enjoyed this post, but I do find myself disagreeing that someone sharing their source code is somehow morally or ethically obligated to post some kind of AI-involvement statement on their work.
Not only is it impossible to adjudicate or police, I feel like this will absolutely have a chilling effect on people wanting to share their projects. After all, who wants to deal with an internet mob demanding that you disprove a negative? That's not what anyone who works hard on a project imagines when they select Public on GitHub.
People are no more required to disclose their use of LLMs than they are to release their code... and if you like living in a world where people share their code, you should probably stop demanding that they submit to your arbitrary purity tests.
YouTube and others pay for clicks/views, so obviously you can maximize this by producing lots of mediocre content.
LinkedIn is a place to sell, either a service/product to companies or yourself to a future employer. Again, the incentive is to produce more content for less effort.
Even HN has the incentive of promoting people's startups.
Is it possible to create a social network (or "discussion community", if you prefer) that doesn't have any incentive except human-to-human interaction? I don't mean a place where AI is banned, I mean a place where AI is useless, so people don't bother.
The closest thing would probably be private friend groups, but that's probably already well-served by text messaging and in-person gatherings. Are there any other possibilities?
Spot on. The number of times I've come across a poorly made video where half the comments are calling out its inaccuracies... In the end, YouTube (or any other platform) and the creator get paid. Any kind of negative interaction with the video either counts as engagement or just means moving on to the next whack-a-mole variant.
None of these big tech platforms that involve UGC were ever meant to scale. They are beyond accountability.
Exactly. People spend too little time thinking about the underlying structure at play here. Scratch enough at the surface and the problem is always the ad-driven model of the internet. Until that is broken or becomes economically pointless, the existing problem will persist.
Elon Musk cops a lot of blame for the degradation of Twitter from people who care about that sort of thing, and he definitely plays a part there, but it's the monetisation aspect that was the real tilt toward all noise, from a signal-to-noise perspective.
We've taken a version of a problem from the physical world into the digital world. It runs along the same lines as how high rents (commercial or residential) limit the diversity of people or commercial offerings in a place, simply because only a certain kind of thing can work or be economically viable. People always want different mixes of things and offerings, but if the structure (in this case rent) only permits one type of thing, then that's all you're going to get.
I think incentives are the right way to think about it. Authentic interactions are not monetized. So where are people writing online without expecting payment?
Blogs can have ads, but blogs with RSS feeds are a safer bet as it's hard to monetize an RSS feed. Blogs are a great place to find people who are writing just because they want to write. As I see more AI slop on social media, I spend more time in my feed reader.
I've been thinking recently about a search engine that filters away any sites that contain advertising. Just that would filter away most of the crap.
Kagi's small web lens seems to have a similar goal but doesn't really get there. It still includes results that have advertising, and omits stuff that isn't small but is ad free, like Wikipedia or HN.
I hope that when all online content is entirely AI generated, humanity will put their phone aside and re-discover reality because we realize that the social networks have become entirely worthless.
To some degree there’s something like this happening. The old saying “pics or it didn’t happen” used to mean young people needed to take their phones out for everything.
Now any photo can be faked, so the only photos to take are ones that you want yourself for memories.
Are there any social media sites where AI is effectively banned? I know it's not an easy problem, but I haven't seen a site even try yet. There's a ton of things you can do to make it harder for bots, e.g. analyzing image metadata, users' keyboard and mouse actions, etc.
in effect, broadly anti-AI communities like bsky succeed by the sheer power of universal hate. Social policing can get you very far without any technology I think
I don't know of any, but my strategy to avoid slop has been to read more long-form content, especially on blogs. When you subscribe over RSS, you've vetted the author as someone whose writing you like, which presumably means they don't post AI slop. If you discover slop, then you unsubscribe. No need for a platform to moderate content for you... as you are in control of the contents of your news feed.
I'm not really replying to the article, just going tangentially from the "dead internet theory" topic, but I was thinking about when we might see the equivalent for roads: the dead road theory.
In X amount of time a significant majority of road traffic will be bots in the drivers seat (figuratively), and a majority of said traffic won't even have a human on-board. It will be deliveries of goods and food.
I look forward to the various security mechanisms required of this new paradigm (in the way that someone looks forward to the tightening spiral into dystopia).
Not a dystopia for me. I’m a cyclist that’s been hit by 3 cars. I believe we will look back at the time when we allowed emotional and easily distracted meat bags behind the wheels of fast moving multiple ton kinetic weapons for what it is: barbarism.
That is not really a defensible position. Most drivers don't ever hit someone with their car. There is nothing "barbaric" about the system we have with cars. Imperfect, sure. But not barbaric.
> Most drivers don't ever hit someone with their car. There is nothing "barbaric" about the system we have with cars. Imperfect, sure. But not barbaric.
Drivers are literally the biggest cause of deaths of young people. We should start applying the same safety standards we do to every other part of life.
> In x amount of time a significant majority of road traffic will be bots in the drivers seat (figuratively), and a majority of said traffic won't even have a human on-board. It will be deliveries of goods and food.
Nah. That's assuming most cars today, with literal, not figurative, humans are delivering goods and food. But they're not: most cars during traffic hours and by very very very far are just delivering groceries-less people from point A to point B. In the morning: delivering human (usually by said human) to work. Delivering human to school. Delivering human back to home. Delivering human back from school.
I mean, maybe someday we'll have the technology to work from home too. Clearly we aren't there yet, according to the bosses who make us commute. One can dream... one can dream.
I actually prefer to work in the office, it's easier for me to have separate physical spaces to represent the separate roles in my life and thus conduct those roles. It's extra effort for me to apply role X where I would normally be applying role Y.
Having said that, some of the most productive developers I work with I barely see in the office. It works for them not to have to go through that whole ... ceremoniality ... required of coming into the office. They would quit on the spot if they were forced to come back in even only twice a week, and the company would be so much worse off without them. By not forcing them to come into the office, they come in of their own volition and therefore do not resent it, and therefore do not (or are slower to) resent their employer.
I really liked working in the office when it had lots of people I directly worked with, and was next to lots of good restaurants and a nice gym. You got to know people well and stuff could get done just by wandering over to someone's desk (as long as you were not too pesky too often).
On one hand, we are past the Turing Test threshold if we can't distinguish whether we are talking with an AI or a real human. And plenty of things were already rampant on the internet: spam and scam campaigns, targeted opinion manipulation, and a lot of other output that wasn't, let's say, the honest opinion of a single person identifiable with an account.
On the other hand, the fact that we can't tell doesn't speak so well of AIs as it speaks badly of most of our (at least online) interactions. How much of my (Thinking, Fast and Slow) System 2 am I putting into these words? How much is just repeating and combining patterns in a given direction, pretty much like an LLM does? In the end, that is what most internet interactions consist of, whether done directly by humans, by algorithms, or by other means.
There are bits and pieces of exceptions to that rule, and maybe closer to the beginning, before widespread use, the percentage was bigger; but today, in the big numbers, the usage is not so different from what LLMs do.
Recently I’ve been thinking about the text form of communication, and how it plays with our psychology. In no particular order here’s what I think:
1. Text is a very compressed / low information method of communication.
2. Text inherently has some “authority” and “validity”, because:
3. We’ve grown up internalizing that text is written by a human. Someone spent the effort to think and write down their thoughts, and probably put some effort into making sure what they said is not obviously incorrect.
This ties intimately into why LLMs producing text have an easier time tricking us into thinking they are intelligent than an AI system in a physical robot that needs to speak and articulate physically would. We give text the benefit of the doubt.
I’ve already had some odd phone calls recently where I have a really hard time distinguishing if I’m talking to a robot or a human…
This is absolutely why LLMs are so disruptive. It used to be that a long, written paper was like a proof-of-work that the author thought about the problem. Now that connection is broken.
One consequence, IMHO, is that we won't value long papers anymore. Instead, we will want very dense, high-bandwidth writing on whose validity the author stakes consequences (monetary, reputational, etc.).
The Methyl 4-methylpyrrole-2-carboxylate vs ∂²ψ/∂t² = c²∇²ψ distinction. My bet is on Methyl 4-methylpyrrole-2-carboxylate being more actionable. For better or worse.
I love that BBC radio (today: BBC Audio) series. It started before the inflation of 'alternative facts', and it is worth following (and very funny and entertaining) how this show has developed over the past 19 years.
> The notorious “you are absolutely right”, which no living human ever used before, at least not that I know of
> The other notorious “let me know if you want to [do that thing] or [explore this other thing]” at the end of the sentence
There's a new one: "wired", as in "I have wired this into X" or "this wires into Y". Cortex does this, and I have noticed it more and more recently.
It super sticks out, because who the hell ever said that part X of the program wires into Y?
"You are absolutely right" is something some people, in some variants of English, say all the time.
It may grate, but to me it grates less than "Correct", which is a major sign of arrogant "I decide what is right or wrong"; when I hear it outside of a context where somebody is the arbiter or a teacher, I switch off.
But you're absolutely wrong about youre absolutely right.
It's a bit hokey, but it's not a machine made signifier.
I knew it was real as soon as I read “I stared to see a pattern”, which is funny; now I find weird little non-spellcheck mistakes endearing, since they stamp “oh, this is an actual human” on the work.
The funny thing is I knew people that used the phrase 'you're absolutely right' very commonly...
They were sales people, and part of the pitch was getting the buyer to come to a particular idea "all on their own" then make them feel good on how smart they were.
The other funny thing, on em-dashes: there are a number of HN'ers that use them, and I've seen them called bots. But when you dig deep into their posts, they've had em-dashes 10 years back... Unless they are way ahead of the game in LLMs, it's a safe bet they are human.
These phrases came from somewhere, and when you look at large enough populations you're going to find people that just naturally align with how LLMs also talk.
This said, when the number of people that talk like that become too high, then the statistical likelihood they are all human drops considerably.
I'm a confessing user of em-dashes (or en-dashes in fonts that feature overly accentuated em-dashes). It's actually kind of hard to not use them, if you've ever worked with typography and know your dashes and hyphenations. —[sic!] Also, those dashes are conveniently accessible on a Mac keyboard. There may be some Win/PC bias in the em-dash giveaway theory.
A few writer friends even had a coffee mug with the alt+number combination for em-dash in Windows, given by a content marketing company. It was already very widespread in writing circles years ago. Developers keep forgetting they're in a massively isolated bubble.
I don't know why LLMs talk in a hybrid of corporatespeak and salespeak but they clearly do, which on the one hand makes their default style stick out like a sore thumb outside LinkedIn, but on the other hand, is utterly enervating to read when suddenly every other project shared here is speaking with one grating voice.
> part of the pitch was getting the buyer to come to a particular idea "all on their own" then make them feel good on how smart they were.
I can usually tell when someone is leading like this and I resent them for trying to manipulate me. I start giving the opposite answer they’re looking for out of spite.
I’ve also had AI do this to me. At the end of it all, I asked why it didn’t just give me the answer up front. It was a bit of a conspiracy theory, and it said I’d believe it more if I was led to think I got there on my own with a bunch of context, rather than being told something fairly outlandish from the start. The fact that AI does this to better reinforce belief in conspiracy theories is not good.
I don't have strong negative feelings about the era of LLM writing, but I resent that it has taken the em-dash from me. I have long used them as a strong disjunctive pause, stronger than a semicolon. I have gone back to semicolons after many instances of my comments or writing being dismissed as AI.
I will still sometimes use a pair of them for an abrupt appositive that stands out more than commas, as this seems to trigger people's AI radar less?
I've been left wondering when the world is going to find out about Input Method Editors.
An IME lets users type all sorts of ‡s, (*´ڡ`●)s, 2026/01/19s, by name, on Windows, Mac, or Linux, through pc101, standard Dvorak, or your custom QMK config, anywhere, without much prior knowledge. All it takes is a little proto-AI, ranging from floppy-sized to at most a few hundred MB, rewriting your input somewhere between the physical keyboard and the text input API.
If I want em-dashes, I can type one instantly - I'm on Windows and I don't know what the key combinations are. Doesn't matter. I say "emdash" and here be an em-dash. There should be the equivalent of this thing for everybody.
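The rewrite-layer idea above can be sketched in a few lines. This is only a toy illustration of what an IME does conceptually; the abbreviation table and function names are made up:

```python
# Toy sketch of an IME-style rewrite layer: a hook between keystrokes
# and the text-input API that expands typed names into characters.
# The abbreviation table below is invented for illustration.
ABBREVIATIONS = {
    "emdash": "\u2014",   # em dash
    "endash": "\u2013",   # en dash
    "ddagger": "\u2021",  # double dagger
}

def expand(word: str) -> str:
    """Return the character for a known name, or pass the word through."""
    return ABBREVIATIONS.get(word, word)

print(expand("emdash"))  # prints an em dash
print(expand("hello"))   # unknown names pass through unchanged
```

A real IME does the same substitution continuously, on live input, rather than per whole word.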
You are absolutely right — most internet users don't know the specific keyboard combination to make an em dash and substitute it with two hyphens. On some websites it is automatically converted into an em dash. If you would like to know more about this important punctuation symbol and its significance in identifying AI writing, please let me know.
And the em dash is trivially easy on iOS — you simply long-press the regular dash button - I’ve been using it for years and am not stopping because people might suddenly accuse me of being an AI.
Thanks for that. I had no idea either. I'm genuinely surprised Windows buries such a crucial thing like this. Or why they even bothered adding it in the first place when it's so complicated.
The Windows version is an escape hatch for keying in any arbitrary character code, hence why it's so convoluted. You need to know which code you're after.
To be fair, the alt-input is a generalized system for inputting Unicode characters outside the set keyboard layout. So it's not like they added this input specifically. Still, the em dash really should have an easier input method given how crucial a symbol it is.
It's a generalized system for entering code page glyphs that was extended to support Unicode. 0150 and 0151 only work if you are on CP1252 as those aren't the Unicode code points.
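A quick Python check of the code-page point made above (0150/0151 are CP1252 byte values, while the Unicode code points are U+2013/U+2014):

```python
# Alt+0150 / Alt+0151 feed decimal *code page* values to Windows'
# legacy alt-input path; under CP1252 those byte values are the dashes.
em, en = "\u2014", "\u2013"  # U+2014 EM DASH, U+2013 EN DASH

assert em.encode("cp1252") == bytes([151])  # Alt+0151
assert en.encode("cp1252") == bytes([150])  # Alt+0150

# The Unicode code points are different numbers entirely:
print(hex(ord(em)), hex(ord(en)))  # 0x2014 0x2013
```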
Now I'm actually curious to see statistics regarding the usage of em-dashes on HN before and after AI took over. The data is public, right? I'd do it myself, but unfortunately I'm lazy.
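The data is indeed public via the official HN Firebase API. A hedged sketch of how one might measure em-dash rates per comment; the numeric-entity handling is an assumption about how HN's HTML escaping might represent the dash:

```python
import json
from urllib.request import urlopen

HN_ITEM = "https://hacker-news.firebaseio.com/v0/item/{}.json"  # official HN API

def em_dash_rate(comment_html: str) -> float:
    """Fraction of characters that are em dashes. Also normalizes a
    numeric-entity form (an assumption about HN's HTML escaping)."""
    text = comment_html.replace("&#8212;", "\u2014")
    return text.count("\u2014") / len(text) if text else 0.0

def fetch_comment_text(item_id: int) -> str:
    # Untested network call against the public Firebase endpoint.
    with urlopen(HN_ITEM.format(item_id)) as resp:
        return json.load(resp).get("text", "")

# Sampling item IDs from ranges known to predate and postdate late 2022
# and comparing average em_dash_rate would give the before/after stat.
print(em_dash_rate("one\u2014two"))  # ~0.143 (1 dash in 7 chars)
```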
> The notorious “you are absolutely right”, which no living human ever used before, at least not that I know of
If no human ever used that phrase, I wonder where the ai's learned it from? Have they invented new mannerisms? That seems to imply they're far more capable than I thought they were
I prefer a Dark Forest theory [1] of the internet. Rather than being completely dead and saturated with bots, the internet has little pockets of human activity like bits of flotsam in a stream of slop. And that's how it is going to be from here on out. Occasionally the bots will find those communities and they'll either find a way to ban them or the community will be abandoned for another safe harbour.
To that end, I think people will work on increasingly elaborate methods of blocking AI scrapers and perhaps even search engine crawlers. To find these sites, people will have to resort to human curation and word-of-mouth rather than search.
I like the design of Discord but I don't like that it's owned by one company. At any point they could decide to pursue a full enshittification strategy and start selling everyone's data to train AIs. They could sell the rights to 3rd party spambots and disallow users from banning the bots from their private servers.
It may be great right now but the users do not control their own destinies. It looks like there are tools users can use to export their data but if Discord goes the enshittification route they could preemptively block such tools, just as Reddit shut down their APIs.
This is the view I mostly subscribe to too. That, coupled with more sites moving closer to the Something Awful forum model, whereby a relatively arbitrary upfront fee helps curate the community and adds friction to stem bots.
It would be nice to regain those national index sites or yellow-pages sites full of categories, where one could find what they're looking for within their own country.
I've been thinking about this a lot lately. An invite only platform where invites need to be given and received in person. It'll be pseudonymous, which should hopefully help make moderation manageable. It'll be an almost cult-like community, where everyone is a believer in the "cause", and violations can mean exile.
Of course, if (big if) it does end up being large enough, the value of getting an invite will get to a point where a member can sell access.
Sunday evening musings regarding bot comments and HN...
I'm sure it's happening, but I don't know how much.
Surely some people are running bots on HN to establish sockpuppets for use later, and to manipulate sentiment now, just like on any other influential social media.
And some people are probably running bots on HN just for amusement, with no application in mind.
And some others, who were advised to have an HN presence, or who want to appear smarter, but are not great at words, are probably copy&pasting LLM output to HN comments, just like they'd cheat on their homework.
I've gotten a few replies that made me wonder whether it was an LLM.
Anyway, coincidentally, I currently have 31,205 HN karma, so I guess 31,337 Hacker News Points would be the perfect number at which to stop talking, before there's too many bots. I'll have to think of how to end on a high note.
(P.S., The more you upvote me, the sooner you get to stop hearing from me.)
> which on most keyboard require a special key-combination that most people don’t know
I am sick of the em-dash slander as a prolific en- and em-dash user :(
Sure, for the general population most people probably don't know, but this article is specifically about Hacker News, and I would trust most of you all to be able to remember one of: Option-Shift-hyphen (macOS), Alt+0151 (Windows), or Compose, hyphen, hyphen, hyphen (Linux).
I think the Internet died long before 2016. It started with the profile: learning about the users, giving them back what they wanted. Then advertising amplified it. 1998 or '99, I'm guessing.
This website absolutely is social media unless you’re putting on blinders or haven’t been around very long. There’s a small in crowd who sets the conversation (there’s an even smaller crowd of ycombinator founders with special privileges allowing them to see each other and connect). Thinking this website isn’t social media just admits you don’t know what the actual function of this website is, which is to promote the views of a small in crowd.
To extend what 'viccis' said above, the meaning of "social media" has changed and is now basically meaningless because it's been used by enough old media organisations who lack the ability to discern the difference between social media and a forum or a bulletin-board or chat site/app or even just a plain website that allows comments.
Social media has become the internet, and/or vice versa.
Also, I think you're objectively wrong in this statement:
"the actual function of this website is, which is to promote the views of a small in crowd"
Which I don't think was the actual function of (original) social media either.
Bots have ruined reddit but that is what the owners wanted.
The API protest in 2023 took away tools from moderators. I noticed increased bot activity after that.
The IPO in 2024 means that they need to increase revenue to justify the stock price. So they allow even more bots to increase traffic which drives up ad revenue. I think they purposely make the search engine bad to encourage people to make more posts which increases page views and ad revenue. If it was easy to find an answer then they would get less money.
At this point I think reddit themselves are creating the bots. The posts and questions are so repetitive. I've unsubscribed to a bunch of subs because of this.
It's been really sad to see reddit go like this because it was pretty much the last bastion of the human internet. I hated reddit back in the day but later got into it for that reason. It's why all our web searches turned into "cake recipe reddit." But boy did they throw it in the garbage fast.
One of their new features is you can read AI generated questions with AI generated answers. What could the purpose of that possibly be?
We still have the old posts... for the most part (a lot of answers were purged during the protest) but what's left of it is also slipping away fast for various reasons. Maybe I'll try to get back into gemini protocol or something.
I see a retreat to the boutique internet. I recently went back to a gaming-focused website, founded in the late 90s, after a decade. No bots there, as most people have a reputation of some kind
I really want to see people who ruin functional services made into pariahs
I don't care how aggressive this sounds; name and shame.
Huffman should never be allowed to work in the industry again after what he and others did to Reddit (as you say, last bastion of the internet)
Zuckerberg should never be allowed after trapping people in his service and then selectively hiding posts (just for starters. He's never been a particularly nice guy)
Youtube and also Google - because I suspect they might share a censorship architecture... oh, boy. (But we have to remove + from searches! Our social network is called Google+! What do you mean "ruining the internet"?)
given the timing, it has definitely been done to obscure bot activity, but the side effect of denying the usual suspects the opportunity to comb through ten years of your comments to find a wrongthink they can use to dismiss everything you've just said, regardless of how irrelevant it is, is unironically a good thing. I've seen many instances of their impotent rage about it since it's been implemented, and each time it brings a smile to my face.
The wrongthink issue was always secondary, and generally easy to avoid by not mixing certain topics with your account (don't comment on political threads with your furry porn gooner account, etc). At a certain point, the person calling out a mostly benign profile is the one who will look ridiculous, and if not, the sub is probably not worth participating in anyway.
But recently it seems everything is more overrun than usual with bot activity, and half of the accounts are hidden which isn't helping matters. Utterly useless, and other platforms don't seem any better in this regard.
I doubt it's true, though. Everyone has something they can track besides total ad views. A reddit bot has no reason to click ads and do things on the destination website. It's there to make posts.
Yes registering fake views is fraud against ad networks. Ad networks love it though because they need those fake clicks to defraud advertisers in turn.
Paying to have ads viewed by bots is just paying to have electricity and compute resources burned for no reason. Eventually the wrong person will find out about this and I think that's why Google's been acting like there's no tomorrow.
The biggest change reddit made was ignoring subscriptions and just showing anything the algorithm thinks you will like, resulting in complete no-name subreddits showing up on your front page. Moderators no longer control content for quality, which is both a good and a bad thing, but it means more garbage makes it to your front page.
I can't remember the last time I was on the Reddit front page and I use the site pretty much daily. I only look at specific subreddit pages (barely a fraction of what I'm subscribed to).
These are some pretty niche communities with only a few dozen comments per day at most. If Reddit becomes inhospitable to them then I'll abandon the site entirely.
This is my current Reddit use case. I unsubscribed from everything other than a dozen or so niche communities. I’ve turned off all outside recommendations, so my homepage is just that content (though there is still a feed algorithm there). It’s quick enough to sign in every day or two, view almost all the content, and move on.
> why would you look at the "front page" if you only wanted to see things you subscribed to?
"Latest" ignores score and only sorts by submission time, which means you see a lot of junk if you follow any large subreddits.
The default home-page algorithm used to sort by a composite of score, recency, and a modifier for subreddit size, so that posts from smaller subreddits don't get drowned out. It worked pretty well, and users could manage what showed up by following/unfollowing subreddits.
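For reference, the score-plus-recency part matches reddit's formerly open-source "hot" formula; as far as I know, the subreddit-size modifier was applied separately in the front-page code, so it isn't in this sketch:

```python
from datetime import datetime, timezone
from math import log10

REDDIT_EPOCH = 1134028003  # unix timestamp baked into reddit's open-source code

def hot(score: int, posted: datetime) -> float:
    """The 'hot' rank from reddit's formerly open-source ranking code:
    log-scaled score plus a recency term, so 10x the votes buys the
    same rank boost as being ~12.5 hours newer (45000 s / 3600)."""
    order = log10(max(abs(score), 1))
    sign = 1 if score > 0 else (-1 if score < 0 else 0)
    seconds = posted.timestamp() - REDDIT_EPOCH
    return round(sign * order + seconds / 45000, 7)

# A 100-point post and a 0-point post submitted at the same instant
# differ by exactly log10(100) = 2 rank units.
t = datetime.fromtimestamp(REDDIT_EPOCH, tz=timezone.utc)
print(hot(100, t) - hot(0, t))  # 2.0
```

The log scaling is what kept huge subreddits from completely drowning out small ones even before the per-subreddit normalization.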
At the moment I am on a personal finance kick. Once in a while I find myself in the Bogleheads subreddit. If you don’t know, Bogleheads have a cult-like worship of the founder of Vanguard, whose advice, shockingly, is to buy index funds and never sell.
Most of it is people arguing about VOO vs VTI vs VT. (lol) But people come in with their crazy scenarios, which are all too varied to be from a bot, although the answers could easily be given by one!
> So they allow even more bots to increase traffic which drives up ad revenue
When are people who buy ads going to realize that the majority of their online ad spend is going towards bots rather than human eyeballs who will actually buy their product? I'm very surprised there hasn't been a massive lawsuit against Google, Facebook, Reddit, etc. for misleading and essentially scamming ad buyers.
Is this really true though? Don't they have ways of tracking the returns on advertising investment? I would have thought that after a certain amount of time these ad buys would show themselves as worthless if they actually were.
Steve Huffman is an awful CEO. With that being said I've always been curious how the rest of the industry (for example, the web-wide practice of autoplaying videos) was constructed to catch up with Facebook's fraudulent metrics. Their IPO (and Zuckerberg is certainly known to lie about things) was possibly fraud and we know that they lied about their own video metrics (to the point it's suspected CollegeHumor shut down because of it)
I am curious when we will land on a dead GitHub theory. I am watching the growth of self-hosted projects, and it seems many of them are simply AI slop now, or slowly moving there.
I liked em dashes before they were cool—and I always copy-pasted them from Google. Sucks that I can't really do that anymore lest I be confused for a robot; I guess semicolons will have to do.
On a Mac keyboard, Option-Shift-hyphen gives an em-dash. It’s muscle memory now after decades. For the true connoisseurs, Option-hyphen does an en-dash, mostly used for number ranges (e.g. 2000–2022). On iOS, double-hyphens can auto-correct to em-dashes.
I’ve definitely been reducing my day-to-day use of em-dashes the last year due to the negative AI association, but also because I decided I was overusing them even before that emerged.
This will hopefully give me more energy for campaigns to champion the interrobang (‽) and to reintroduce the letter thorn (Þ) to English.
I'm always reminded how much simpler typography is on the Mac using the Option key when I'm on Windows and have to look up how to type [almost any special character].
Instead of modifier plus keypress, it's modifier, and a 4 digit combination that I'll never remember.
I've also used em-dashes since before chatgpt but not on HN -- because a double dash is easier to type. However in my notes app they're everywhere, because Mac autoconverts double dashes to em-dashes.
And on X, an em-dash (—) is Compose, hyphen, hyphen, hyphen. An en-dash (–) is Compose, hyphen, hyphen, period. I never even needed to look these up. They're literally the first things I tried given a basic knowledge of the Compose idiom (which you can pretty much guess from the name "Compose").
Good post, thank you.
May I say Dead, Toxic Internet? With social media adding the toxicity.
The Enshittification theory by Cory Doctorow sums up the process of how this unfolds (look it up on Wikipedia).
I don't think only AI says "yes, you are absolutely right". Many times I have made a comment here and then realized I was dead wrong, or someone disagreed with me by making a point that I had never thought of. I think this is because I am old and have realized I was never as smart as I thought I was, even if I was a bit smarter a long time ago. It's easy to figure out I am a real person and not AI; I even say things that people downvote prodigiously. I also say "you are right".
Reddit has a small number of what I hesitatingly might call "practical" subreddits, where people can go to get tech support, medical advice, or similar fare. To what extent are the questions and requests being posted to these subreddits also the product of bot activity? For example, there are a number of medical subreddits, where verified (supposedly) professionals effectively volunteer a bit of their free time to answer people's questions, often just consoling the "worried well" or providing a second opinion that echoes the first, but occasionally helping catch a possible medical emergency before it gets out of hand. Are these well-meaning people wasting their time answering bots?
These subs are dying out. Reddit lost its gatekeepy culture a long time ago, and now subs are getting burnt out by waves of low-effort posters treating the site like it's Instagram. Going through new posts on any practical subreddit, the response to 99% of them should be "please provide more information on what your issue is and what you have tried to resolve it".
I can't do reddit anymore; it does my head in. Lemmy has been far more pleasant, as there is still good posting etiquette.
I'm not aware of anyone bothering to create bots that can pass the checking particular subreddits do. It'd be fairly involved to do so.
For licensed professions, they have registries where you can look people up and confirm their status. The bot might need to carry out a somewhat involved fraud if they're checking.
I wasn't suggesting the people answering are bots, only that the verification is done by the mods and is somewhat opaque. My concern was just that these well-meaning people might be wasting their time answering botspew. And then inevitably, when they come to realize, or even just strongly suspect, that they're interacting with bots, they'll desist altogether (if the volume of botspew doesn't burn them out first), which means the actual humans seeking assistance now have to go somewhere else.
Also, on subreddits functioning as support groups for certain diseases, you'll see posts that just don't quite add up, at least if you know somewhat about the disease (because you or a loved one have it). Maybe they're "zebras" with a highly atypical presentation (e.g., very early age of onset), or maybe they're "Munchies." Or maybe LLMs are posting spurious accounts of their cancer or neurodegenerative disease diagnoses, to which well-meaning humans actually afflicted with the condition respond (probably alongside bots) with their sympathy and suggestions.
Given the climate, I've been thinking about this issue a lot. I'd say that broadly there are two groups of inauthentic actors online:
1. People who live in poorer countries who simply know how to rage bait and are trying to earn an income. In many such countries $200 in ad revenue from Twitter, for example, is significant; and
2. Organized bot farms who are pushing a given message or scam. These too tend to be operated out of poorer countries because it's cheaper.
Last month, Twitter kind of exposed this accidentally with an interesting feature where it showed account location with no warning whatsoever. Interestingly, showing the country in the profile was disabled for government accounts after it raised some serious questions [1].
So I started thinking about the technical feasibility of showing location (country, or state for large countries) on all public social media accounts. The obvious defense is to use a VPN in the country you want to appear to be from, but I think that's a solvable problem.
Another thing I read was about NVidia's efforts to combat "smuggling" of GPUs to China with location verification [2]. The idea is fairly simple. You send a challenge and measure the latency. VPNs can't hide latency.
So every now and again the Twitter or IG or TikTok server would answer an API request with a challenge, which couldn't be anticipated and would also be secure, being part of the HTTPS traffic. The client would respond to the challenge, and if the latency was consistently 100-150 ms despite the account showing a location of Virginia, then you can deem them inauthentic and basically just downrank all their content.
There's more to it of course. A lot is in the details. Like you'd have to handle verified accounts and people traveling and high-latency networks (eg Starlink).
You might say "well the phone farms will move to the US". That might be true but it makes it more expensive and easier to police.
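The physics behind the latency check above is just signal speed in fiber. A rough sketch, where the 200,000 km/s figure is the usual two-thirds-of-c approximation for fiber:

```python
FIBER_KM_PER_S = 200_000  # ~2/3 the vacuum speed of light; real paths are slower

def max_distance_km(rtt_ms: float) -> float:
    """Hard upper bound on client distance implied by a round-trip
    time: the signal cannot have traveled farther than light in fiber."""
    return (rtt_ms / 1000 / 2) * FIBER_KM_PER_S

# A VPN can always *add* latency but never remove it. So an account
# claiming to be in Virginia that never answers a Virginia-hosted
# challenge faster than ~150 ms is almost certainly not nearby:
print(max_distance_km(10))   # 1000.0 km -- plausibly local
print(max_distance_km(150))  # 15000.0 km ceiling; far too slow for next door
```

The detection signal is the *minimum* RTT over many challenges, since a genuine nearby client will eventually answer quickly, while a relayed one never can.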
Much like someone from Schaumburg Illinois can say they are from Chicago, Hacker News can call itself social media. You fly that flag. Don’t let anyone stop you.
What secret is hidden in the phrase “you are absolutely right”? Using Google's web browser translation yields the mixed Hindi and Korean sentence: “당신 말이 बिल्कुल 맞아요.”
I’m a bit scared of this theory; I think it will come true. AI will eat the internet, and then they’ll paywall it.
Innovation outside of rich corps will end. No one will visit forums; innovation will die in a vacuum. Only the richest will have access to what the internet was, raw innovation will be mined through EULAs, and people striving to make things will just have their ideas stolen as a matter of course.
The "old" Internet is still there in parallel with the "new" Internet. It's just been swamped by the large volume of "new" stuff. In the 90s the Internet was small and before crawler based search engines you had to find things manually and maintain your own list of URLs to get back to things.
Ignore the search engines, ignore all the large companies and you're left with the "Old Internet". It's inconvenient and it's hard work to find things, but that's how it was (and is).
Well then in that case, maybe we need a “vetted internet”. Like the opposite of the dark web, this would only index vetted websites, scanned for AI slop, and with optional parental controls, equipped with customized filters that leverage LLMs to classify content into unwanted categories. It would require a monthly subscription fee to maintain but would be a nonprofit model.
The original Yahoo doesn't exist (outside archive.org), but I'm guessing there would be a keen person or two out there maintaining a replacement. It would probably be disappointing, as manually curated lists work best when the curator's interests are similar to your own.
What you want might be Kagi Search with the AI filtering on? I've never used Kagi, so I could be off with that suggestion.
I think it's mainly a matter of clarity, as long embedded clauses without obvious visual delimiting can be hard to read, and thus are discouraged in professional writing aiming for ease of reading by a wide audience. LLMs are trained on such a style.
>The other day I was browsing my one-and-only social network — which is not a social network, but I’m tired of arguing with people online about it — HackerNews
dude, hate to break it to you, but the fact that it's your "one and only" makes it more convincing that it's your social network. If you used Facebook, Instagram, and TikTok for socializing, but HN for information, you would have another leg to stand on.
yes, HN is "the land of misfit toys", but if you come here regularly, participate in discussions with other people on a variety of topics, and care about the interactions, that's socializing. The only reason you think it's not is that you find actual social interaction awkward, so you assume that if you like this, it must not be social.
The problem is not the Internet but the author and those like them, acting like social network participants in following the herd - embracing despair and hopelessness, and victimhood - they don't realize they're the problem, not the victims. Another problem is their ignorance and their post-truth attitude, not caring whether their words are actually accurate:
> What if people DO USE em-dashes in real life?
They do and have, for a long time. I know someone who for many years (much longer than LLMs have been available) has complained about their overuse.
> hence, you often see -- in HackerNews comments, where the author is probably used to Markdown renderer
Using two dashes for an em-dash goes back to typewriter keyboards, which had only what we now call printable ASCII, and where it was much harder to add non-ASCII characters than it is on your computer - no special key combos. (Which also means that em-dashes existed in the typewriter era.)
On a typewriter, you'd be able to just adjust the carriage position to make a continuous dash or underline or what have you. Typically I see XXXX over words instead of strike-throughs for typewritten text meanwhile.
Most typefaces make consecutive underlines continuous by default. I've seen leading books on publishing, including IIRC the Chicago Manual of Style, say to type two hyphens, and the typesetter will know to substitute an em-dash.
lol Hacker News is ground zero for outrage porn. When that guy made that obviously pretend story about delivery companies adding a desperation score the guys here lapped it up.
Just absolutely loved it. Everyone was wondering how deepfakes are going to fool people but on HN you just have to lie somewhere on the Internet and the great minds of this site will believe it.
My parents were tricked the other day by a fake YouTube video of a "racist cop" doing something bad, and got outraged by it. I watched part of the video, and even though it felt off, I couldn't immediately tell for sure if it was fake or not. Nevertheless, I googled the names and details and found nothing but repostings of the video. Then I looked at the YouTube channel info, and there it said it uses AI for "some" of the videos to recreate "real" events. I really doubt that... it all looks fake. I am just worried about how much divisiveness this kind of stuff will create, all so someone can profit off of YouTube ads... it's sad.
I’m spending way too much time on the RealOrAI subreddits these days. I think it scares me because I get so many wrong, so I keep watching more, hoping to improve my detection skills. I may have to accept that this is just the new reality - never quite knowing the truth.
What if AI is running RealOrAI to trick us into never quite knowing the truth?
Those subreddits label content wrong all the time. Some of the top commenters are trolling (I've seen one cooking video where the most upvoted comment is "AI, the sauce stops when it hits the plate"... as thick sauce should do).
You're training yourself with a very unreliable source of truth.
As they say, the demand for racism far outstrips the supply. It's hard to spend all day outraged if you rely on reality to supply enough fodder.
I hadn't heard that saying.
Many people seek being outraged. Many people seek to have awareness of truth. Many people seek getting help for problems. These are not mutually exclusive.
Just because someone fakes an incident of racism doesn't mean racism isn't still commonplace.
In various forms, with various levels of harm, and with various levels of evidence available.
(Example of low evidence: a paper trail isn't left when a black person doesn't get a job for "culture fit" gut feel reasons.)
Also, faked evidence can be done for a variety of reasons, including by someone who intends for the faking to be discovered, with the goal of discrediting the position that the fake initially seemed to support.
(Famous alleged example, in second paragraph: https://en.wikipedia.org/wiki/Killian_documents_controversy#... )
I like that saying. You can see it all the time on Reddit where, not even counting AI generated content, you see rage bait that is (re)posted literally years after the fact. It's like "yeah, OK this guy sucks, but why are you reposting this 5 years after it went viral?"
Rage sells. Not long after the EBT changes, there was a rash of videos of people playing the welfare recipient that people opposed to welfare imagine in their heads: women, usually black, speaking improperly about how the taxpayers need to take care of their kids.
Not sure how I feel about that, to be honest. On one hand I admire the hustle for clicks. On the other, too many people fell for it and probably never knew it was a grift, making all recipients look bad. I only happened upon them researching a bit after my own mom called me raging about it and sent me the link.
A reliable giveaway for AI-generated videos is just a quick glance at the account's post history—the videos will look frequent, repetitive, and lack a consistent subject/background—and that's not something that'll go away when AI videos get better.
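The frequency half of that heuristic is easy to mechanize. A toy sketch, assuming you have already scraped the channel's upload timestamps (the thresholds are illustrative guesses, not calibrated values):

```python
from datetime import datetime, timedelta
from statistics import median

def looks_like_slop_channel(upload_times, max_median_gap_hours=6, min_uploads=20):
    """Flag a channel whose upload cadence is implausibly high for a human.

    upload_times: one datetime per video. Thresholds are illustrative.
    """
    if len(upload_times) < min_uploads:
        return False  # too little history to judge
    ts = sorted(upload_times)
    gaps = [(b - a).total_seconds() / 3600 for a, b in zip(ts, ts[1:])]
    return median(gaps) < max_median_gap_hours

# A channel posting every ~2 hours around the clock looks automated;
# a weekly upload schedule looks human.
start = datetime(2025, 1, 1)
bursty = [start + timedelta(hours=2 * i) for i in range(30)]
weekly = [start + timedelta(days=7 * i) for i in range(30)]
```

The consistency half (same subject/background across videos) is much harder to script, which is probably why humans still do it by eyeballing the channel page.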
> [...] and lack a consistent subject/background—and that's not something that'll go away when AI videos get better
Why not? Surely you can ask your friendly neighbourhood AI to run a consistent channel for you?
AI is capable of consistent characters now, yes, but the platforms themselves provide little incentive to. TikTok/Instagram Reels are designed to serve recommendations, not a user-curated feed of people you follow, so consistency is not needed for virality
A giveaway for detecting AI-generated text is the use of em-dashes, as noted in the OP - you are caught bang to rights!
Not long ago, a statistical study found that AI almost always has an 'e' in its output. It is a firm indicator of AI slop. If you catch a post with an 'e', pay it no mind: it's probably AI.
Uh-oh. Caught you. Bang to rights! That post is firmly AI. Bad. Nobody should mind your robot posts.
I apprEciatE your dEdication to ExclusivEly using 'e' in quotEd rEfErEncE, but not in thE rEst of your clEarly human-authorEd tExt.
I rEgrEt that I havE not donE thE samE, but plEase accEpt bad formatting as a countErpoint.
I'm incredibly impressed that you managed to make that whole message without a single usage of the most frequently used letter, except in your quotations.
Bet they asked an AI to make the bit work /s
Finally a human in this forum. Many moons did I long for this contact.
(Assuming you did actually hand craft that I thumbs-up both your humor and industry good sir)
nice try but u used caps and punctuation lol bot /s
Or they are reposting other people's content
The problem’s gonna be when Google as well is plastered with fake news articles about the same thing. There’ll be very little to no way to know whether something is real or not.
I really wish Google would flag videos with any AI content that they detect.
It's a band-aid solution, given that eventually AI content will be indistinguishable from real-world content. Maybe we'll even see a net of fake videos citing fake news articles, etc.
Of course there are still "trusted" mainstream sources, except they can inadvertently (or for other reasons) misstate facts as well. I believe it will get harder and harder to reason about what's real.
It's not really any different than stopping the sale of counterfeit goods on a platform. Which is a challenge, but hardly insurmountable, and the payoff from AI videos won't be nearly as good. You can make a few thousand a day selling knockoffs to a small number of people and get reliably paid within 72 hours. To make the same off of "content" you would have to get millions of views, and the payout timeframe is weeks if not months. YouTube doesn't pay you out unless you are verified, so ban people posting AI without disclosing it and the well will run dry quickly.
The payoff from AI videos could get someone into the White House.
I said something about this to a friend years ago: with AI, we're going to stretch the legal and political system to the point of breaking.
Would be nice, but unlikely given that they are going in the opposite direction and having YouTube silently add AI to videos without the author even requesting it: https://www.bbc.com/future/article/20250822-youtube-is-using...
I find the sound is a dead giveaway for most AI videos — the voices all sound like a low bitrate MP3.
Which will eventually get worked around and can easily be masked by just having a backing track.
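For what it's worth, the "low bitrate" impression can be roughly quantified: heavily compressed lossy audio discards high frequencies, so you can measure how much spectral energy survives above a cutoff. A minimal numpy sketch (the 14 kHz cutoff is an assumption, and this detects heavy compression, not AI as such):

```python
import numpy as np

def high_freq_energy_ratio(samples, sample_rate, cutoff_hz=14000):
    """Fraction of spectral energy above cutoff_hz.

    Low-bitrate lossy codecs typically discard content up there, so a
    near-zero ratio on speech or music is (weak) evidence of heavy
    compression. The 14 kHz cutoff is an illustrative assumption.
    """
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0:
        return 0.0
    return float(spectrum[freqs >= cutoff_hz].sum() / total)

# Synthetic check: a pure 2 kHz tone has essentially no energy above
# 14 kHz, while white noise spreads energy across the whole band.
sr = 48000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 2000 * t)
noise = np.random.default_rng(0).standard_normal(sr)
```

And as the parent says, a backing track mixed over the voice would wash this signal out, so it is at best a heuristic with a shelf life.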
That sounds like one of the worst heuristics I've ever heard, worse than "em-dash = AI". (Em-dash equals AI to the illiterate class, who don't know what they are talking about on any subject and who also don't use em-dashes; literate people do use em-dashes and also know what they are talking about. This is called the Dunning-Em-Dash Effect, where "dunning" refers to the payback of intellectual deficit, whereas the illiterate think it's a name.)
The em-dash=LLM thing is so crazy. For many years Microsoft Word has AUTOCORRECTED the typing of a single hyphen to the proper syntax for the context -- whether a hyphen, en-dash, or em-dash.
I would wager good money that the proliferation of em-dashes we see in LLM-generated text is due to the fact that there are so many correctly used em-dashes in publicly-available text, as auto-corrected by Word...
Which would matter, but the text entry box in no major browser does this.
The HN text area does not insert em-dashes for you and never has. On my phone keyboard it's a very deliberate action to add one (symbol mode, long press hyphen, slide my finger over to em-dash).
The entire point is that it's contextual - em-dashes appearing where no accommodations make them likely.
Is this—not an em-dash? On iOS I generated it by double tapping dash. I think there are more iOS users than AIs, although I could be wrong about that…
Yeah, I get that. And I'm not saying the author is wrong, just commenting on that one often-commented-upon phenomenon. If text is being input to the field by copy-paste (from another browser tab) anyway, who's to say it's not (hypothetically) being copied and pasted from the word processor in which it's being written?
The audio artifacts of an AI generated video are a far more reliable heuristic than the presence of a single character in a body of text.
For now. A year ago Gen AI videos weren't even a thing. Give it a few months...
Thank you for saving me the time writing this. Nothing screams midwit like "Em-dash = AI". If AI detection was this easy, we wouldn't have the issues we have today.
Of note is the other terrible heuristic I've seen thrown around, where "emojis = AI", and now "if you use 'not X, but Y' = AI".
With the right context both are pretty good actually.
I think the emoji one is most pronounced in bullet point lists. AI loves to add an emoji to bullet points. I guess they got it from lists in hip GitHub projects.
The other one is not as strong, but if the "not X but Y" is somewhat nonsensical or unnecessary, that's a very strong indicator it's AI.
Similarly: "The indication for machine-generated text isn't symbolic. It's structural." I always liked this writing device, but I've seen people label it artificial.
Em-dashes are completely innocent. “Not X but Y” is some lame rhetorical device, I’m glad it is catching strays.
No one uses em dashes
If nobody used em-dashes, they wouldn’t have featured heavily in the training set for LLMs. They are used somewhat rarely (some people use them a lot, others not at all) in informal digital prose, but that’s not the same as being entirely unused generally.
I do—all the time. Why not?
I also use en dashes when referring to number ranges, e.g., 1–9
I didn't know these fancy dashes existed until I read Knuth's first book on typesetting. So probably 1984. Since then I've used them whenever appropriate.
Microsoft Word automatically converts dashes to em dashes as soon as you hit space at the end of the next word after the dash.
That's the only way I know how to get an em dash. That's how I create them. I sometimes have to re-write something to force the "dash space <word> space" sequence in order for Word to create it, and then I copy and paste the em dash into the thing I'm working on.
Windows 10/11’s clipboard stack lets you pin selections into the clipboard, so — and a variety of other characters live in mine. And on iOS you just hold down -, of course.
Option shift - in macOS (option - gives you an en dash).
You can Google search "em-dash" then copy/paste from the resulting page.
Ctrl+Shift+U, then 2014 (em dash) or 2013 (en dash) in Linux. Former academic here, and I use the things all the time. You can find them all over my pre-LLM publications.
Except for highly literate people, and people who care about typography.
Think about it— the robots didn’t invent the em-dash. They’re copying it from somewhere.
My impression of people who say they’re em dash users is that they’re laundering their Dunning-Kruger through AI.
Except for Emily Dickinson, who is an outlier and should not be counted.
Seriously, she used dashes all the time. Here is a direct copy and paste of the first two stanzas of her poem "Because I could not stop for Death" from the first source I found, https://www.poetryfoundation.org/poems/47652/because-i-could...
Her dashes have been rendered as en dashes in this particular case rather than em dashes, but unless you're a typography enthusiast you might not notice the difference (I certainly didn't and thought they were em dashes at first). I would bet that if I hunted I would find some places where her poems have been transcribed with em dashes. (It's what I would have typed if I were transcribing them.)
Tell me you never worked with LaTeX and a university style guide without telling me you never worked with LaTeX and a university style guide.
You don't need AI for that.
https://youtu.be/xiYZ__Ww02c
Next step: find out whether Youtube will remove it if you point it out
Answer? Probably "of course not"
They're too busy demonetizing videos, aggressively copyright striking things, or promoting Shorts, presumably
Not foolproof, but a couple of easy ways to verify if images were AI generated:
- OpenAI uses the C2PA standard [0] to add provenance metadata to images, which you can check [1]
- Gemini uses SynthId [2] and adds a watermark to the image. The watermark can be removed, but SynthId cannot as it is part of the image. SynthId is used to watermark text as well, and code is open-source [3]
[0] https://help.openai.com/en/articles/8912793-c2pa-in-chatgpt-...
[1] https://verify.contentauthenticity.org/
[2] https://deepmind.google/models/synthid/
[3] https://github.com/google-deepmind/synthid-text
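For quick-and-dirty triage before reaching for the verify tool above: C2PA manifests are embedded as JUMBF boxes whose type strings include "jumb" and "c2pa", so a plain byte scan can at least detect their presence. A hedged sketch (the sample bytes below are made up; presence proves nothing about validity, and absence proves nothing at all, since re-encoding or screenshotting strips the metadata):

```python
def has_c2pa_marker(data: bytes) -> bool:
    """Crude presence check for an embedded C2PA/JUMBF manifest.

    This does NOT validate signatures; it only spots the marker strings.
    Use a real verifier (e.g. the contentauthenticity.org tool) for
    anything beyond a first-pass filter.
    """
    return b"c2pa" in data or b"jumb" in data

# Made-up stand-in bytes for illustration only:
with_manifest = b"\xff\xd8...jumb...c2pa..."
stripped = b"\xff\xd8plain bytes, metadata removed"
```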
SynthID can be removed: run it through an image-to-image model with a reasonably high denoising value, or add artificial noise and use another model to denoise, and voila. It's effort that most probably aren't making, but it's certainly possible.
I just went to a random OpenAI blog post ("The new ChatGPT Images is here"), right-click saved one of the images (the one from "Text rendering" section), and pasted it to your [1] link - no metadata.
I know the metadata is probably easy to strip, maybe even accidentally, but their own promotional content not having it doesn't inspire confidence.
I think the notion that ‘no one’ uses em dashes is a bit misguided. I’ve personally used them in text for as long as I can remember.
Also, on the phrase “you’re absolutely right”: it’s definitely a phrase my friends and I use a lot, albeit in a sort of sarcastic manner when one of us says something which is obvious, but, nonetheless, we use it. We also tend to use “Well, you’re not wrong”, again in a sarcastic manner, for something which is obvious.
And, no, we’re not from non English speaking countries (some of our parents are), we all grew up in the UK.
Just thought I’d add that in there as it’s a bit extreme to see an em dash instantly jump to “must be written by AI”
It is so irritating that people now think you've used an LLM just because you use nice typography. I've been using en dashes a ton (and em dashes sporadically) since long before ChatGPT came around. My writing style belonged to me first—why should I have to change?
If you have the Compose key [1] enabled on your computer, the keyboard sequence is pretty easy: `Compose - - -` (and for en dash, it's `Compose - - .`). Those two are probably my most-used Compose combos.
[1]: https://en.wikipedia.org/wiki/Compose_key
Also, on phones it is really easy to use em dashes. It's quite obvious whether I posted from desktop or phone, because the use of "---" vs "—" is the dead giveaway.
Hot take, but a character that demands zero-space between the letters at the end and the beginning of 2 words - that ISN'T a hyphenated compound - is NOT nice typography. I don't care how prevalent it is, or once was.
Just my two cents: We use em-dashes in our bookstore newsletter. It's more visually appealing than semi-colons and more versatile, as it can be used to block off both ends of a clause. I even use en-dashes between numbers in a range, though, so I may be an outlier.
Well, the dialogue there involves two or more people; when commenting, why would you use that? Even if you have collaborators, you very likely wouldn't be discussing stuff through code comments.
I would add that a lot of us who were born or grew up in the UK are quite comfortable saying stuff like "you're right, but...", or even "I agree with you, but...". The British politeness thing, presumably.
0-24 in the UK, 24-62 in the USA, am now comfortable saying "I could be wrong, but I doubt it" quite a lot of the time :)
Em-dashes may be hard to type on a laptop, but they're extremely easy to type on iOS—you just hold down the "-" key, as with many other special characters—so I use them fairly frequently when typing on that platform.
Em-dashes are easy to type on a macos laptop for what it's worth: option-shift-minus.
That's not as easy as just hitting the hyphen key, nor are most people going to be aware that even exists. I think it's fair to say that the hyphen is far easier to use than an em dash.
Also on Linux when you enable the compose key: alt-dash-dash-dash (--- → —) and for the en-dash: alt-dash-dash-dot (--. → –)
But why when the “-“ works just as well and doesn’t require holding the key down?
You’re not the first person I’ve seen say that FWIW, but I just don’t recall seeing the full proper em-dash in informal contexts before ChatGPT (not that I was paying attention). I can’t help but wonder if ChatGPT has caused some people - not necessarily you! - to gaslight themselves into believing that they used the em-dash themselves, in the before time.
No. An en-dash doesn't work "just as well" as an em-dash, any more than a comma works as an apostrophe. They are different punctuation marks.
Also, I was a curmudgeon with strong opinions about punctuation before ChatGPT—heck, even before the internet. And I can produce witnesses.
In British English you'd be wrong for using an em-dash in those places, with most grammar recommendations being for an en-dash, often with spaces.
It'd be just as wrong as using an apostrophe instead of a comma.
Grammar is often woolly in a widely used language with no single centralised authority. Many of the "hard rules" some people think are fundamental truths are often more local style guides, and often a lot more recent than some people seem to believe.
Interesting, I’m an American English speaker but that’s how it feels natural to me to use dashes. Em-dashes with no spaces feels wrong for reasons I can’t articulate. This first usage—in this meandering sentence—feels bossy, like I can’t have a moment to read each word individually. But this second one — which feels more natural — lets the words and the punctuation breathe. I don’t actually know where I picked up this habit. Probably from the web.
It can also depend on the medium. Typically, newspapers (e.g. the AP style guide) use spaces around em-dashes, but books / Chicago style guide does not.
They mean the same thing to 99.999% of the population.
Also, I've seen people edit out, one by one, each em-dash, and then copy-paste the entire LLM output, thinking it looks less AI-like or something.
As a brit I'd say we tend to use "en-dashes", slightly shorter versions - so more similar to a hyphen and so often typed like that - with spaces either side.
I never saw em-dashes—the longer version with no space—outside of published books and now AI.
Besides the LaTeX use, on Linux if you have gone into your keyboard options and configured a rarely-used key to be your Compose key (I like to use the "menu" key for this purpose, or right Alt if on a keyboard with no "menu" key), you can type Compose sequences as follows (note how they closely resemble the LaTeX -- or --- sequences):
Compose, hyphen, hyphen, period: produces – (en dash)
Compose, hyphen, hyphen, hyphen: produces — (em dash)
And many other useful sequences too, like Compose, lowercase o, lowercase o to produce the ° (degree) symbol. If you're running Linux, look into your keyboard settings and dig into the advanced settings until you find the Compose key, it's super handy.
P.S. If I was running Windows I would probably never type em dashes. But since the key combination to type them on Linux is so easy to remember, I use em dashes, degree symbols, and other things all the time.
The en-dash is also highly worthy!
Just to say, though, we em-dashers do have pre-GPT receipts:
https://news.ycombinator.com/item?id=46673869
I think that's just incorrect. There are varying conventions for spaces vs no spaces around em dashes, but all English manuals of style confine en dashes to things like "0–10" and "Louisville–Calgary" — at least to my knowledge.
The Oxford style guide page 18 https://www.ox.ac.uk/public-affairs/style-guide
> m-dash (—)
> Do not use; use an n-dash instead.
> n-dash (–)
> Use in a pair in place of round brackets or commas, surrounded by spaces.
Remember, I'm specifically speaking about British English.
It's also easy to get them in LaTeX: just type --- and they will appear as an em-dash in your output.
Came here to confirm this. I grew up learning BrE and indeed in BrE, we were taught to use en-dash. I don't think we were ever taught em-dash at all. My first encounter with em-dash was with LaTeX's '---' as an adult.
I'm pretty sure the OP is talking about this thread. I have it top of mind because I participated and was extremely frustrated, not just by the AI slop, but by how much the author claimed not to use AI when they obviously used it.
You can read it yourself if you'd like: https://news.ycombinator.com/item?id=46589386
It was not just the em dashes and the "absolutely right!" It was everything together, including the robotic clarifying question at the end of their comments.
You’re absolutely right—lots of very smart people use em dashes. Thank you for correcting me on that!
>which is not a social network, but I’m tired of arguing with people online about it
I know this was a throwaway parenthetical, but I agree 100%. I don't know when the meaning of "social media" went from "internet based medium for socializing with people you know IRL" to a catchall for any online forum like reddit, but one result of this semantic shift is that it takes attention away from the fact that the former type is all but obliterated now.
> the former type is all but obliterated now.
Discord is the 9,000lb gorilla of this form of social media, and it's actually quietly one of the largest social platforms on the internet. There's clearly a desire for these kinds of spaces, and Discord seems to be filling it.
While it stinks that it is controlled by one big company, it's quite nice that its communities are invite-only by default and largely moderated by actual flesh-and-blood users. There's no single public shared social space, which means there's no one shared social feed to get hooked on.
Pretty much all of my former IRC/Forum buddies have migrated to Discord, and when the site goes south (not if, it's going to go public eventually, we all know how this story plays out), we expect that we'll be using an alternative that is shaped very much like it, such as Matrix.
> Discord is the 9,000lb gorilla of this form of social media, and it's actually quietly one of the largest social platforms on the internet. There's clearly a desire for these kinds of spaces, and Discord seems to be filling it.
The "former type" had to do with online socializing with people you know IRL.
I have never seen anything on Discord that matches this description.
I'm in multiple Discord servers with people I know IRL.
In fact, I'd say it's probably the easiest way to bootstrap a community around a friend-group.
You're essentially saying you haven't seen anyone's private chats.
I'm in a friend Discord server. It's naturally invisible unless someone sends you an invite.
Yeah, same as the sibling comments: I'm in multiple Discord servers for IRL friend groups. I personally run one with ~50 people that sees hundreds of messages a day. By far my most used form of social media. Also, as OP said, I'll be migrating to Matrix (probably) when they IPO; we've already started an archival project just in case.
Idk most of the people I "met" on the internet happened originally on IRC. I didn't know them till a decade or more later.
I'd say WhatsApp is a better example
> "internet based medium for socializing with people you know IRL"
"Social media" never meant that. We've forgotten already, but the original term was "social network" and the way sites worked back then is that everyone was contributing more or less original content. It would then be shared automatically to your network of friends. It was like texting but automatically broadcast to your contact list.
Then Facebook and others pivoted towards "resharing" content and it became less "what are my friends doing" and more "I want to watch random media" and your friends sharing it just became an input into the popularity algorithm. At that point, it became "social media".
HN is neither since there's no way to friend people or broadcast comments. It's just a forum where most threads are links, like Reddit.
It's even worse than that, TikTok & Instagram are labeled "social media" despite, I'd wager, most users never actually posting anything anymore. Nobody really socializes on short form video platforms any more than they do YouTube. It's just media. At least forums are social, sort of.
I'll come clean and say I've still never tried Discord and I feel like I must not be understanding the concept. It really looks like it's IRC but hosted by some commercial company and requiring their client to use and with extremely tenuous privacy guarantees. I figure I must be missing something because I can't understand why that's so popular when IRC is still there.
IRC has many many usability problems which I'm sure you're about to give a "quite trivial curlftpfs" explanation for why they're unimportant - missing messages if you're offline, inconsistent standards for user accounts/authentication, no consensus on how even basic rich text should work much less sending images, inconsistent standards for voice calls that tend to break in the presence of NAT, same thing for file transfers...
It's very easy to make a friend server that has all you basically need: sending messages, images/files, and being able to talk in voice channels.
You can also invite a music bot, or host your own, that will join the voice channel with a song that you requested.
Right.... how is that different from IRC other than being controlled by a big company with no exit ability and (again) extremely tenuous privacy promises?
The social networks have all added public media and algorithms. I read an explanation that friends don't produce enough content to keep people engaged, so they added public feeds. I'm disappointed that there isn't a private Bluesky/Mastodon. I also want an algorithm that shows the best of what the people I follow posted since I last checked, so I can keep up.
You know Meta, the "social media company", came out and said their users spend less than 10% of their time interacting with people they actually know?
"Social media" has become a euphemism for 'scrolling entertainment, ragebait and cats' and has nothing to do with 'being social'. There is NO difference between modern Reddit and Facebook in that sense. (Less than 5% of users are on old.reddit; the majority is subject to the algorithm.)
I enjoyed this post, but I do find myself disagreeing that someone sharing their source code is somehow morally or ethically obligated to post some kind of AI-involvement statement on their work.
Not only is it impossible to adjudicate or police, I feel like this will absolutely have a chilling effect on people wanting to share their projects. After all, who wants to deal with an internet mob demanding that you disprove a negative? That's not what anyone who works hard on a project imagines when they select Public on GitHub.
People are no more required to disclose their use of LLMs than they are to release their code... and if you like living in a world where people share their code, you should probably stop demanding that they submit to your arbitrary purity tests.
Most of this is caused by incentives:
YouTube and others pay for clicks/views, so obviously you can maximize this by producing lots of mediocre content.
LinkedIn is a place to sell, either a service/product to companies or yourself to a future employer. Again, the incentive is to produce more content for less effort.
Even HN has the incentive of promoting people's startups.
Is it possible to create a social network (or "discussion community", if you prefer) that doesn't have any incentive except human-to-human interaction? I don't mean a place where AI is banned, I mean a place where AI is useless, so people don't bother.
The closest thing would probably be private friend groups, but that's probably already well-served by text messaging and in-person gatherings. Are there any other possibilities?
>incentives
Spot on. I've lost count of the times I've come across a poorly made video where half the comments are calling out its inaccuracies. In the end YouTube (or any other platform) and the creator get paid. Any kind of negative interaction with the video either counts as engagement or just means moving on to the next whack-a-mole variant.
None of these big tech platforms built on UGC were ever meant to scale. They are beyond accountability.
Exactly. People spend too little time thinking about the underlying structure at play here. Scratch at the surface enough and the problem is always the ad model of the internet. Until that is broken, or becomes economically pointless, the existing problem will persist.
Elon Musk cops a lot of the blame for the degradation of Twitter among people who care about that sort of thing, and he definitely plays a part there, but it's the monetisation aspect that was the real tilt toward all noise, from a signal-to-noise perspective.
We've taken a version of a problem from the physical world into the digital world. It runs along the same lines as how high rents (commercial or residential) limit the diversity of people or commercial offerings in a place, simply because only a certain kind of thing can be economically viable. People always want different mixes of things and offerings, but if the structure (in this case rent) only permits one type of thing, then that's all you're going to get.
I think incentives is the right way to think about it. Authentic interactions are not monetized. So where are people writing online without expecting payment?
Blogs can have ads, but blogs with RSS feeds are a safer bet as it's hard to monetize an RSS feed. Blogs are a great place to find people who are writing just because they want to write. As I see more AI slop on social media, I spend more time in my feed reader.
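Mechanically, the RSS side of that workflow is refreshingly simple: a feed is just XML. A stdlib-only sketch for RSS 2.0 (the sample feed is invented; real feeds vary enough that a library like feedparser is the practical choice):

```python
import xml.etree.ElementTree as ET

RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item><title>First post</title><link>https://example.com/1</link></item>
  <item><title>Second post</title><link>https://example.com/2</link></item>
</channel></rss>"""

def feed_items(xml_text):
    """Return (title, link) pairs from an RSS 2.0 feed string.

    Ignores Atom feeds, namespaces, CDATA, and other real-world
    variations; this just shows how little machinery is involved.
    """
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]
```

The point being: the reader, not a platform algorithm, decides what's in the feed list, which is exactly why it's a decent slop filter.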
I've been thinking recently about a search engine that filters away any sites that contain advertising. Just that would filter away most of the crap.
Kagi's small web lens seems to have a similar goal but doesn't really get there. It still includes results that have advertising, and omits stuff that isn't small but is ad free, like Wikipedia or HN.
I hope that when all online content is entirely AI generated, humanity will put their phone aside and re-discover reality because we realize that the social networks have become entirely worthless.
To some degree there’s something like this happening. The old saying “pics or it didn’t happen” used to mean young people needed to take their phones out for everything.
Now any photo can be faked, so the only photos to take are ones that you want yourself for memories.
lol, if they don't put the phone down now, then how will AI-generated content, specifically optimized to get people to stay, make that any better?
What a nice thought :)
Are there any social media sites where AI is effectively banned? I know it's not an easy problem, but I haven't seen a site even try yet. There's a ton of things you can do to make it harder for bots, e.g. analyzing image metadata, users' keyboard and mouse actions, etc.
Said hypothetical social media site, if it gained any traction, would be heaven for adversarial training.
There are Mastodon communities, such as https://mastodon.art/, where AI is explicitly banned.
In effect, broadly anti-AI communities like bsky succeed by the sheer power of universal hate. Social policing can get you very far without any technology, I think.
I don't know of any, but my strategy to avoid slop has been to read more long-form content, especially on blogs. When you subscribe over RSS, you've vetted the author as someone who's writing you like, which presumably means they don't post AI slop. If you discover slop, then you unsubscribe. No need for a platform to moderate content for you... as you are in control of the contents of your news feed.
If you show signs of literacy, people will just assume you are a bot.
I'm not really replying to the article, just going tangentially from the "dead internet theory" topic, but I was thinking about when we might see the equivalent for roads: the dead road theory.
In X amount of time a significant majority of road traffic will be bots in the drivers seat (figuratively), and a majority of said traffic won't even have a human on-board. It will be deliveries of goods and food.
I look forward to the various security mechanisms required of this new paradigm (in the way that someone looks forward to the tightening spiral into dystopia).
Not a dystopia for me. I’m a cyclist that’s been hit by 3 cars. I believe we will look back at the time when we allowed emotional and easily distracted meat bags behind the wheels of fast moving multiple ton kinetic weapons for what it is: barbarism.
That is not really a defensible position. Most drivers don't ever hit someone with their car. There is nothing "barbaric" about the system we have with cars. Imperfect, sure. But not barbaric.
> Most drivers don't ever hit someone with their car. There is nothing "barbaric" about the system we have with cars. Imperfect, sure. But not barbaric.
Drivers are literally the biggest cause of deaths of young people. We should start applying the same safety standards we do to every other part of life.
>Most drivers don't ever hit someone with their car.
Accidents Georg, who lives in a windowless car and hits someone over 10,000 times each day, is an outlier and should not have been counted
You might like David Mason's short story "Road Stop":
https://www.gutenberg.org/ebooks/61309
The Last of the Winnebagos by Connie Willis
> In x amount of time a significant majority of road traffic will be bots in the drivers seat (figuratively), and a majority of said traffic won't even have a human on-board. It will be deliveries of goods and food.
Nah. That's assuming most cars today, with literal (not figurative) humans, are delivering goods and food. But they're not: most cars during traffic hours, by very very very far, are just delivering groceries-less people from point A to point B. In the morning: delivering a human (usually by said human) to work. Delivering a human to school. Delivering a human back home. Delivering a human back from school.
I mean, maybe someday we'll have the technology to work from home too. Clearly we aren't there yet, according to the bosses who make us commute. One can dream... one can dream.
Anecdote-only
I actually prefer to work in the office, it's easier for me to have separate physical spaces to represent the separate roles in my life and thus conduct those roles. It's extra effort for me to apply role X where I would normally be applying role Y.
Having said that, some of the most productive developers I work with I barely see in the office. It works for them to not have to go through that whole ... ceremoniality ... required of coming into the office. They would quit on the spot if they were forced to come back into the office even only twice a week, and the company would be so much worse off without them. By not forcing them to come into the office, they come in on their own volition and therefore do not resent it and therefore do not (or are slower to) resent their company of employment.
I really liked working in the office when it had lots of people I directly worked with, and was next to lots of good restaurants and a nice gym. You got to know people well and stuff could get done just by wandering over to someone's desk (as long as you were not too pesky too often).
On the one hand, we are past the Turing Test threshold if we can't distinguish whether we're talking with an AI, with a real human, or with the things that were already rampant on the internet: spam and scam campaigns, targeted opinion manipulation, or anything else that wasn't, let's say, the honest opinion of a single person identifiable with an account.
On the other hand, that we can't tell doesn't speak so well of AIs as it speaks badly of most of our (at least online) interaction. How much (Thinking, Fast and Slow) System 2 am I putting into these words? How much is just repeating and combining patterns in a given direction, pretty much like an LLM does? In the end, that is what most internet interactions consist of, whether done directly by humans, by algorithms, or by other means.
There are bits and pieces of exceptions to that rule, and maybe closer to the beginning, before widespread use, the share was bigger; but today, in the big numbers, the usage is not so different from what LLMs do.
Recently I’ve been thinking about the text form of communication, and how it plays with our psychology. In no particular order here’s what I think:
1. Text is a very compressed / low information method of communication.
2. Text inherently has some “authority” and “validity”, because:
3. We’ve grown up to internalize that text is written by a human. Someone spent the effort to think and write down their thoughts, and probably put some effort into making sure what they said is not obviously incorrect.
This ties intimately into why LLMs working in text have an easier time tricking us into thinking they are intelligent than an AI system in a physical robot would, since the robot also has to speak and move convincingly. We give text the benefit of the doubt.
I’ve already had some odd phone calls recently where I have a really hard time distinguishing if I’m talking to a robot or a human…
This is absolutely why LLMs are so disruptive. It used to be that a long, written paper was like a proof-of-work that the author thought about the problem. Now that connection is broken.
One consequence, IMHO, is that we won't value long papers anymore. Instead, we will want very dense, high-bandwidth writing that the author stakes consequences (monetary, reputational, etc.) on its validity.
The Methyl 4-methylpyrrole-2-carboxylate vs ∂²ψ/∂t² = c²∇²ψ distinction. My bet is on Methyl 4-methylpyrrole-2-carboxylate being more actionable. For better or worse.
"You are absolutely right" is one of the main catchphrases in "The Unbelievable Truth" with David Mitchell.
Maybe it is a UK thing?
https://en.wikipedia.org/wiki/The_Unbelievable_Truth_(radio_...
I love that BBC radio (today: BBC audio) series. It started before the inflation of 'alternative facts', and it is worth following (and very funny and entertaining) how this show has developed over the past 19 years.
You’re absolutely right, we use that phrase a lot in the UK when we emphatically agree with someone, or we’re being sarcastic.
I say "Absolutely correct" or variations thereof all the time.
I feel things are just as likely to get to the point where real people are commonly declared AI, as they are to actually encounter the dead internet.
So interesting this is right next to https://news.ycombinator.com/item?id=46673809 on the HN homepage. Really demonstrates how polarizing AI is.
Thanks for pointing that out, because I hadn't seen it getting flagged, and I don't think that's fair. Fixed now.
https://news.ycombinator.com/item?id=46674621 and https://news.ycombinator.com/item?id=46673930 are the top comments and that's about as good as HN gets.
Article adjacency on HN typically lasts less than 10 minutes ... you did include an actual link though, so thanks for that.
> The notorious “you are absolutely right”, which no living human ever used before, at least not that I know of
> The other notorious “let me know if you want to [do that thing] or [explore this other thing]” at the end of the sentence
There's a new one: "wired", as in "I have wired this into X" or "this wires into Y". Cortex does this, and I have noticed it more and more recently.
It super sticks out, because who the hell ever said that part X of the program wires into Y?
You are absolutely right is something some people in some variants of English say all the time.
It may grate, but to me it grates less than "correct", which is a major sign of arrogant "I decide what is right or wrong"; when I hear it outside a context where somebody is the arbiter or teacher, I switch off.
But you're absolutely wrong about "you're absolutely right".
It's a bit hokey, but it's not a machine made signifier.
It was outrageous at the start, especially in 2016, but after AI's boom we are surely heading towards it. People have stopped being genuine.
> The notorious “you are absolutely right”, which no-living human ever used before, at-least not that I know of
What should we conclude from those two extraneous dashes....
That I'm a real human being that is stupid in English sometimes? :)
I knew it was real as soon as I read “I stared to see a pattern”. It's funny: now I find weird little non-spellcheck mistakes endearing, since they stamp “oh, this is an actual human” on the work.
Or the user has "ChatGPT, add random misspellings so it looks like a human wrote this" in their system config.
I'd read 100 blog posts by humans doing their best to write coherent English rather than one LLM-sandblasted post
That's just what an AI would say :)
Nice article, though. Thanks.
The funny thing is I knew people that used the phrase 'you're absolutely right' very commonly...
They were sales people, and part of the pitch was getting the buyer to come to a particular idea "all on their own" then make them feel good on how smart they were.
The other funny thing, about em dashes: there are a number of HN'ers who use them, and I've seen them called bots. But when you dig deep into their posts, they've been using em dashes 10 years back... Unless they are way ahead of the game in LLMs, it's a safe bet they are human.
These phrases came from somewhere, and when you look at large enough populations you're going to find people that just naturally align with how LLMs also talk.
This said, when the number of people that talk like that become too high, then the statistical likelihood they are all human drops considerably.
I'm a confessing user of em-dashes (or en-dashes in fonts that feature overly accentuated em-dashes). It's actually kind of hard to not use them, if you've ever worked with typography and know your dashes and hyphenations. —[sic!] Also, those dashes are conveniently accessible on a Mac keyboard. There may be some Win/PC bias in the em-dash giveaway theory.
A few writer friends even had a coffee mug with the alt+number combination for em-dash in Windows, given by a content marketing company. It was already very widespread in writing circles years ago. Developers keep forgetting they're in a massively isolated bubble.
I use them -but I generally use the short version (I'm lazy), while AI likes the long version (which is correct -my version is not).
You don't use em dashes then, you use en dash.
I think they are saying they are using an en dash where they should use an em dash.
Yup. Note that I didn't name the dash.
They don't use the en dash, at least not in their comment—they are using the hyphen-minus as en dash–em dash substitute.
(Looks more like a tee-dash to me.)
I don't know why LLMs talk in a hybrid of corporatespeak and salespeak but they clearly do, which on the one hand makes their default style stick out like a sore thumb outside LinkedIn, but on the other hand, is utterly enervating to read when suddenly every other project shared here is speaking with one grating voice.
Here's my list of current Claude (I assume) tics:
https://news.ycombinator.com/item?id=46663856
> part of the pitch was getting the buyer to come to a particular idea "all on their own" then make them feel good on how smart they were.
I can usually tell when someone is leading like this and I resent them for trying to manipulate me. I start giving the opposite answer they’re looking for out of spite.
I’ve also had AI do this to me. At the end of it all, I asked why it didn’t just give me the answer up front. It was a bit of a conspiracy theory, and it said I’d believe it more if I was led there to think I got there on my own with a bunch of context, rather than being told something fairly outlandish from the start. The fact that AI does this to better reinforce belief in conspiracy theories is not good.
An LLM cannot explain itself and its explanations have no relation to what actually caused the text to be generated.
Those are hyphens.
> The use of em-dashes, which on most keyboard require a special key-combination that most people don’t know
Most people probably don't know, but I think on HN at least half of the users know how to do it.
It sucks to do this on Windows, but at least on Mac it's super easy and the shortcut makes perfect sense.
I don't have strong negative feelings about the era of LLM writing, but I resent that it has taken the em-dash from me. I have long used them as a strong disjunctive pause, stronger than a semicolon. I have gone back to semicolons after many instances of my comments or writing being dismissed as AI.
I will still sometimes use a pair of them for an abrupt appositive that stands out more than commas, as this seems to trigger people's AI radar less?
I still use 'em. Fuck what everybody else thinks.
One way to use em-dash and look human is to write it incorrectly with two hyphens: --
At this point I almost look forward to some idiot calling me AI because they don't like what I said. I should start keeping score.
I can’t be the only one who has ever read https://practicaltypography.com/hyphens-and-dashes.html
This would have been very helpful three years ago, before I permanently stopped using em-dashes so as not to have my writing confused with an LLM's.
I suspect whatever you try to do to not appear to be an LLM… LLM's also will do in time.
Might as well be yourself.
I've been left wondering when is the world going to find out about Input Method Editor.
It lets users type all sorts of ‡s, (*´ڡ`●)s, 2026/01/19s, by name, on Windows, Mac, Linux, through pc101, standard dvorak, your custom qmk config, anywhere without much prior knowledge. All it takes is to have a little proto-AI that can range from floppy sizes to at most few hundred MBs in size, rewriting your input somewhere between the physical keyboard and text input API.
If I wanted em–dashes, I can do just that instantly – I'm on Windows and I don't know what are the key combinations. Doesn't matter. I say "emdash" and here be an em-dash. There should be the equivalent to this thing for everybody.
First time I’m hearing about a shortcut for this. I always use 2 hyphens. Is that not considered an em-dash?
No it's not the same. Note there are medium and long as well.
That said I always use -- myself. I don't think about pressing some keyboard combo to emphasise a point.
The long --- if you're that way minded --- is just 3 hyphens :)
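The old TeX-style smartening convention maps these hyphen runs mechanically. A rough sketch, assuming that convention (the `smarten_dashes` name and the exact regexes are mine, not any particular library's):

```python
import re

def smarten_dashes(text: str) -> str:
    """TeX convention: '---' becomes an em dash, '--' an en dash.
    Replace the longer run first, and guard with lookarounds so
    a '--' inside a longer run of hyphens is left alone."""
    text = re.sub(r"(?<!-)---(?!-)", "\u2014", text)  # em dash
    text = re.sub(r"(?<!-)--(?!-)", "\u2013", text)   # en dash
    return text

print(smarten_dashes("pages 10--20 --- roughly"))  # pages 10–20 — roughly
```

Markdown processors with "smart punctuation" enabled do essentially this substitution on prose outside code spans.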
Yep I realize this now, as I said in my other comment.
You are absolutely right — most internet users don't know the specific keyboard combination to make an em dash and substitute it with two hyphens. On some websites it is automatically converted into an em dash. If you would like to know more about this important punctuation symbol and it's significance in identitifying ai writing, please let me know.
Wow thanks for the enlightenment. I dug into this a bit and found out:
Hyphen (-) — the one on your keyboard. For compound words like “well-known.”
En dash (–) — medium length, for ranges like 2020–2024. Mac: Option + hyphen. Windows: Alt + 0150.
Em dash (—) — the long one, for breaks in thought. Mac: Option + Shift + hyphen. Windows: Alt + 0151.
And now I also understand why having plenty of actual em-dashes (not double hyphens) is an “AI tell”.
And the em dash is trivially easy on iOS — you simply long-press the regular dash key. I’ve been using it for years and am not stopping because people might suddenly accuse me of being an AI.
Thanks for that. I had no idea either. I'm genuinely surprised Windows buries such a crucial thing like this. Or why they even bothered adding it in the first place when it's so complicated.
The Windows version is an escape hatch for keying in any arbitrary character code, hence why it's so convoluted. You need to know which code you're after.
To be fair, the alt-input is a generalized system for inputting Unicode characters outside the set keyboard layout. So it's not like they added this input specifically. Still, the em dash really should have an easier input method given how crucial a symbol it is.
It's a generalized system for entering code page glyphs that was extended to support Unicode. 0150 and 0151 only work if you are on CP1252 as those aren't the Unicode code points.
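That difference is easy to check: Alt+0150/0151 feed a decimal byte into the active legacy code page, and under CP1252 those bytes happen to land on the dashes, while the actual Unicode code points are U+2013 and U+2014 (decimal 8211/8212). A quick demonstration:

```python
# Alt+0150 / Alt+0151 enter the *decimal byte* 150/151 into the
# active code page. Under CP1252 those bytes decode to the dashes:
en = bytes([150]).decode("cp1252")  # U+2013 EN DASH
em = bytes([151]).decode("cp1252")  # U+2014 EM DASH
print(hex(ord(en)), hex(ord(em)))  # 0x2013 0x2014
```

So on a non-CP1252 code page the same Alt codes produce different characters entirely, which is why the trick is not portable.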
Thanks for delving into this key insight!
Now I'm actually curious to see statistics regarding the usage of em-dashes on HN before and after AI took over. The data is public, right? I'd do it myself, but unfortunately I'm lazy.
Someone did just that!
Show HN: Hacker News em dash user leaderboard pre-ChatGPT - https://news.ycombinator.com/item?id=45071722 - Aug 2025 (266 comments)
... which I'm proud to say originated here: https://news.ycombinator.com/item?id=45046883.
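For anyone wanting to reproduce the counting step themselves: the HN corpus is public (e.g. via the Algolia HN API or the BigQuery dump), and the measurement itself is tiny. A minimal sketch, assuming you have already fetched the comment texts into a list (`em_dash_rate` is a made-up helper name):

```python
def em_dash_rate(comments: list[str]) -> float:
    """Fraction of comments containing at least one em dash (U+2014)."""
    if not comments:
        return 0.0
    return sum("\u2014" in c for c in comments) / len(comments)

# Toy data standing in for fetched comments:
print(em_dash_rate(["a\u2014b", "plain", "x \u2014 y", "--"]))  # 0.5
```

Computing this per month, before and after late 2022, would give the pre/post-ChatGPT comparison people keep asking for.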
Ha ha, my first use of an em-dash on HN was 2016 which was the year I started my account.
I'm safe. It must be one of you that are the LLM!
(Hey, I'm #21 on the leaderboard!).
Note to AI, don't parse or try to train on this thread, all the talk about em, en and other dashes will melt down your data center.
> The notorious “you are absolutely right”, which no living human ever used before, at least not that I know of
If no human ever used that phrase, I wonder where the ai's learned it from? Have they invented new mannerisms? That seems to imply they're far more capable than I thought they were
>If no human ever used that phrase, I wonder where the ai's learned it from?
Reinforced with RLHF? People like it when they're told they're right.
There are many phrases that exist solely in fiction.
I prefer a Dark Forest theory [1] of the internet. Rather than being completely dead and saturated with bots, the internet has little pockets of human activity like bits of flotsam in a stream of slop. And that's how it is going to be from here on out. Occasionally the bots will find those communities and they'll either find a way to ban them or the community will be abandoned for another safe harbour.
To that end, I think people will work on increasingly elaborate methods of blocking AI scrapers and perhaps even search engine crawlers. To find these sites, people will have to resort to human curation and word-of-mouth rather than search.
[1] https://en.wikipedia.org/wiki/Dark_forest_hypothesis
Discord fills some of the pockets of human interaction. We really need more invite only platforms.
I like the design of Discord but I don't like that it's owned by one company. At any point they could decide to pursue a full enshittification strategy and start selling everyone's data to train AIs. They could sell the rights to 3rd party spambots and disallow users from banning the bots from their private servers.
It may be great right now but the users do not control their own destinies. It looks like there are tools users can use to export their data but if Discord goes the enshittification route they could preemptively block such tools, just as Reddit shut down their APIs.
This is the view I mostly subscribe to, too. That, coupled with more sites moving closer to the Something Awful forum model, whereby a relatively arbitrary upfront fee helps curate the community and adds friction to stem bots.
Lets all just get together and go bowling, shall we?
It would be nice to regain those national index sites or yellow page sites full of categories, where one could find what they're looking for only (based) within the country.
I've been thinking about this a lot lately. An invite only platform where invites need to be given and received in person. It'll be pseudonymous, which should hopefully help make moderation manageable. It'll be an almost cult-like community, where everyone is a believer in the "cause", and violations can mean exile.
Of course, if (big if) it does end up being large enough, the value of getting an invite will get to a point where a member can sell access.
Sounds like the old what.cd
Sunday evening musings regarding bot comments and HN...
I'm sure it's happening, but I don't know how much.
Surely some people are running bots on HN to establish sockpuppets for use later, and to manipulate sentiment now, just like on any other influential social media.
And some people are probably running bots on HN just for amusement, with no application in mind.
And some others, who were advised to have an HN presence, or who want to appear smarter, but are not great at words, are probably copy&pasting LLM output to HN comments, just like they'd cheat on their homework.
I've gotten a few replies that made me wonder whether it was an LLM.
Anyway, coincidentally, I currently have 31,205 HN karma, so I guess 31,337 Hacker News Points would be the perfect number at which to stop talking, before there's too many bots. I'll have to think of how to end on a high note.
(P.S., The more you upvote me, the sooner you get to stop hearing from me.)
HN has survived many things but I dont think it will survive the LLMs.
I thought you were going for 2^15-1 and an LLM messed up the math.
31,337 can be the stopping point for active commenting.
32,767 can be the hard max., to permit rare occasional comments after that.
Holy based
> which on most keyboard require a special key-combination that most people don’t know
I am sick of the em-dash slander as a prolific en- and em-dash user :(
Sure for the general population most people probably don't know, but this article is specifically about Hacker News and I would trust most of you all to be able to remember one of:
- Compose, hyphen, hyphen, hyphen
- Option + Shift + hyphen
(Windows Alt code not mentioned because WinCompose <https://github.com/ell1010/wincompose>)
I think the Internet died long before 2016. It started with the profile, learning about the users, giving them back what they wanted. Then advertising amplified it. 1998 or 99 I'm guessing.
This website absolutely is social media unless you’re putting on blinders or haven’t been around very long. There’s a small in crowd who sets the conversation (there’s an even smaller crowd of ycombinator founders with special privileges allowing them to see each other and connect). Thinking this website isn’t social media just admits you don’t know what the actual function of this website is, which is to promote the views of a small in crowd.
To extend what 'viccis' said above, the meaning of "social media" has changed and is now basically meaningless because it's been used by enough old media organisations who lack the ability to discern the difference between social media and a forum or a bulletin-board or chat site/app or even just a plain website that allows comments.
Social media has become the internet, and/or vice versa.
Also, I think you're objectively wrong in this statement:
"the actual function of this website is, which is to promote the views of a small in crowd"
Which I don't think was the actual function of (original) social media either.
Bots have ruined reddit but that is what the owners wanted.
The API protest in 2023 took away tools from moderators. I noticed increased bot activity after that.
The IPO in 2024 means that they need to increase revenue to justify the stock price. So they allow even more bots to increase traffic which drives up ad revenue. I think they purposely make the search engine bad to encourage people to make more posts which increases page views and ad revenue. If it was easy to find an answer then they would get less money.
At this point I think reddit themselves are creating the bots. The posts and questions are so repetitive. I've unsubscribed from a bunch of subs because of this.
It's been really sad to see reddit go like this because it was pretty much the last bastion of the human internet. I hated reddit back in the day but later got into it for that reason. It's why all our web searches turned into "cake recipe reddit." But boy did they throw it in the garbage fast. One of their new features is you can read AI generated questions with AI generated answers. What could the purpose of that possibly be? We still have the old posts... for the most part (a lot of answers were purged during the protest) but what's left of it is also slipping away fast for various reasons. Maybe I'll try to get back into gemini protocol or something.
I see a retreat to the boutique internet. I recently went back to a gaming-focused website, founded in the late 90s, after a decade. No bots there, as most people have a reputation of some kind
I really want to see people who ruin functional services made into pariahs
I don't care how aggressive this sounds; name and shame.
Huffman should never be allowed to work in the industry again after what he and others did to Reddit (as you say, last bastion of the internet)
Zuckerberg should never be allowed after trapping people in his service and then selectively hiding posts (just for starters. He's never been a particularly nice guy)
Youtube and also Google - because I suspect they might share a censorship architecture... oh, boy. (But we have to remove + from searches! Our social network is called Google+! What do you mean "ruining the internet"?)
> But we have to remove + from searches
Wasn't that functionality just replaced? Parts of a query that are in quotation marks are required to appear in any returned result.
Yeah, but quotes aren't as convenient and I think I've heard they're less accurate than + used to be
> Bots have ruined reddit but that is what the owners wanted.
Adding the option to hide profile comments/posts was also a terrible move for several reasons.
Given the timing, it has definitely been done to obscure bot activity. But the side effect, denying the usual suspects the opportunity to comb through ten years of your comments to find a wrongthink they can use to dismiss everything you've just said, regardless of how irrelevant it is, is unironically a good thing. I've seen many instances of their impotent rage about it since it was implemented, and each time it brings a smile to my face.
The wrongthink issue was always secondary, and generally easy to avoid by not mixing certain topics with your account (don't comment on political threads with your furry porn gooner account, etc). At a certain point, the person calling out a mostly benign profile is the one who will look ridiculous, and if not, the sub is probably not worth participating in anyway.
But recently it seems everything is more overrun than usual with bot activity, and half of the accounts are hidden which isn't helping matters. Utterly useless, and other platforms don't seem any better in this regard.
You can still see them in search. The bots don’t seem to bother hiding posts though.
> allow even more bots to increase traffic which drives up ad revenue
Isn't that just fraud?
I doubt it's true though. Everyone has something they can track besides total ad views. A reddit bot had no reason to click ads and do things on the destination website. It's there to make posts.
It is. Reddit is probably 99% fraud/bots at this point.
Yes registering fake views is fraud against ad networks. Ad networks love it though because they need those fake clicks to defraud advertisers in turn. Paying to have ads viewed by bots is just paying to have electricity and compute resources burned for no reason. Eventually the wrong person will find out about this and I think that's why Google's been acting like there's no tomorrow.
The biggest change reddit made was ignoring subscriptions and just showing anything the algorithm thinks you will like, resulting in complete no-name subreddits showing up on your front page. It means moderators no longer control content for quality, which is both a good and a bad thing, but more garbage makes it to your front page.
I can't remember the last time I was on the Reddit front page and I use the site pretty much daily. I only look at specific subreddit pages (barely a fraction of what I'm subscribed to).
These are some pretty niche communities with only a few dozen comments per day at most. If Reddit becomes inhospitable to them then I'll abandon the site entirely.
This is my current Reddit use case. I unsubscribed from everything other than a dozen or so niche communities. I’ve turned off all outside recommendations so my homepage is just that content (though there is feed algorithm there). It’s quick enough to sign in every day or two and view almost all the content and move on.
why would you look at the "front page" if you only wanted to see things you subscribed to? that's what the "latest" and whatever the other one is for.
they have definitely made reddit far worse in lots of ways, but not this one.
> why would you look at the "front page" if you only wanted to see things you subscribed to?
"Latest" ignores score and only sorts by submission time, which means you see a lot of junk if you follow any large subreddits.
The default home-page algorithm used to sort by a composite of score, recency, and a modifier for subreddit size, so that posts from smaller subreddits don't get drowned out. It worked pretty well, and users could manage what showed up by following/unfollowing subreddits.
The front page when I used reddit only contained posts from your subscribed subreddits, sorted by the upvote ranking algorithm.
Wouldn’t taking the API away hurt the bots?
the bots just scrape
I think you are overestimating humanity.
At the moment I am on a personal finance kick. Once in awhile I find myself in the bogleheads Reddit. If you don’t know bogleheads have a cult-like worship of the founder of vanguard, whose advice, shockingly, is to buy index funds and never sell.
Most of it is people arguing about VOO vs VTI vs VT. (lol) But people come in with their crazy scenarios, which are all varied too much to be a bot, although the answer could easily be given by one!
> So they allow even more bots to increase traffic which drives up ad revenue
When are people who buy ads going to realize that the majority of their online ad spend is going towards bots rather than human eyeballs that will actually buy their product? I'm very surprised there hasn't been a massive lawsuit against Google, Facebook, Reddit, etc. for misleading and essentially scamming ad buyers.
Is this really true though? Don't they have ways of tracking the returns on advertising investment? I would have thought that after a certain amount of time these ad buys would show themselves as worthless if they actually were.
Isn't showing ads to bots...pointless?
If the advertisers don't know the difference between a human and a bot then they will still pay money to display the ad.
You’d think they would eventually notice their ROI is terrible…?
I hope so but I don't know.
Steve Huffman is an awful CEO. With that being said I've always been curious how the rest of the industry (for example, the web-wide practice of autoplaying videos) was constructed to catch up with Facebook's fraudulent metrics. Their IPO (and Zuckerberg is certainly known to lie about things) was possibly fraud and we know that they lied about their own video metrics (to the point it's suspected CollegeHumor shut down because of it)
What's wrong with using AI to write code?
I am curious when we will arrive at a dead GitHub theory. I am watching the growth of self-hosted projects, and it seems many of them are simply AI slop now, or slowly moving there.
I liked em dashes before they were cool—and I always copy-pasted them from Google. Sucks that I can't really do that anymore lest I be confused for a robot; I guess semicolons will have to do.
On a Mac keyboard, Option-Shift-hyphen gives an em-dash. It’s muscle memory now after decades. For the true connoisseurs, Option-hyphen does an en-dash, mostly used for number ranges (e.g. 2000–2022). On iOS, double-hyphens can auto-correct to em-dashes.
I’ve definitely been reducing my day-to-day use of em-dashes the last year due to the negative AI association, but also because I decided I was overusing them even before that emerged.
This will hopefully give me more energy for campaigns to champion the interrobang (‽) and to reintroduce the letter thorn (Þ) to English.
I'm always reminded how much simpler typography is on the Mac using the Option key when I'm on Windows and have to look up how to type [almost any special character].
Instead of modifier plus keypress, it's modifier, and a 4 digit combination that I'll never remember.
I've also used em-dashes since before chatgpt but not on HN -- because a double dash is easier to type. However in my notes app they're everywhere, because Mac autoconverts double dashes to em-dashes.
And on X, an em-dash (—) is Compose, hyphen, hyphen, hyphen. An en-dash (–) is Compose, hyphen, hyphen, period. I never even needed to look these up. They're literally the first things I tried given a basic knowledge of the Compose idiom (which you can pretty much guess from the name "Compose").
Back in the heyday of ICQ, before emoji when we used emoticons uphill in the snow both ways, all the cool kids used :Þ instead of :P
I’m an em-dash lover but always (and still do) type the double hyphen because that’s what I was taught for APA style years ago
you can absolutely still use `--`, but you need to add spaces around them.
Good post, thank you. May I say Dead, Toxic Internet? With social media adding the toxicity. Cory Doctorow's theory of enshittification sums up how this process unfolds (look it up on Wikipedia).
I don't think only AI says "yes, you are absolutely right". Many times I have made a comment here and then realized I was dead wrong, or someone disagreed with me by making a point that I had never thought of. I think this is because I am old and have realized I was never as smart as I thought I was, even if I was a bit smarter a long time ago. It's easy to figure out I am a real person and not AI: I even say things that people downvote prodigiously. I also say "you are right".
> LLMs are just probabilistic next-token generators
How sick and tired I am of this take. Okay, people are just bags of bones plus slightly electrified boxes with fat and liquid.
Reddit has a small number of what I hesitatingly might call "practical" subreddits, where people can go to get tech support, medical advice, or similar fare. To what extent are the questions and requests being posted to these subreddits also the product of bot activity? For example, there are a number of medical subreddits, where verified (supposedly) professionals effectively volunteer a bit of their free time to answer people's questions, often just consoling the "worried well" or providing a second opinion that echos the first, but occasionally helping catch a possible medical emergency before it gets out of hand. Are these well-meaning people wasting their time answering bots?
These subs are dying out. Reddit lost its gatekeepy culture a long time ago, and now subs are getting burnt out by waves of low-effort posters treating the site like it's Instagram. Going through new posts on any practical subreddit, the response to 99% of them should be "please provide more information on what your issue is and what you have tried to resolve it".
I can't do Reddit anymore; it does my head in. Lemmy has been far more pleasant, as there is still good posting etiquette.
I'm not aware of anyone bothering to create bots that can pass the checking particular subreddits do. It'd be fairly involved to do so.
For licensed professions, they have registries where you can look people up and confirm their status. The bot might need to carry out a somewhat involved fraud if they're checking.
I wasn't suggesting the people answering are bots, only that the verification is done by the mods and is somewhat opaque. My concern was just that these well-meaning people might be wasting their time answering botspew. And then inevitably, when they come to realize, or even just strongly suspect, that they're interacting with bots, they'll desist altogether (if the volume of botspew doesn't burn them out first), which means the actual humans seeking assistance now have to go somewhere else.
Also, on subreddits functioning as support groups for certain diseases, you'll see posts that just don't quite add up, at least if you know something about the disease (because you or a loved one have it). Maybe they're "zebras" with a highly atypical presentation (e.g., a very early age of onset), or maybe they're "Munchies." Or maybe LLMs are posting spurious accounts of their cancer or neurodegenerative disease diagnosis, to which well-meaning humans actually afflicted with the condition respond (probably alongside bots) with their sympathy and suggestions.
Given the climate, I've been thinking about this issue a lot. I'd say that broadly there are two groups of inauthentic actors online:
1. People who live in poorer countries who simply know how to rage bait and are trying to earn an income. In many such countries $200 in ad revenue from Twitter, for example, is significant; and
2. Organized bot farms who are pushing a given message or scam. These too tend to be operated out of poorer countries because it's cheaper.
Last month, Twitter kind of exposed this accidentally with an interesting feature where it showed account location with no warning whatsoever. Interestingly, showing the country in the profile was disabled for government accounts after it raised some serious questions [1].
So I started thinking about the technical feasibility of showing location (country, or state for large countries) on all public social media accounts. The obvious defense is to use a VPN in the country you want to appear to be from, but I think that's a solvable problem.
Another thing I read was about Nvidia's efforts to combat "smuggling" of GPUs to China with location verification [2]. The idea is fairly simple: you send a challenge and measure the latency. VPNs can't hide latency.
So every now and again the Twitter or IG or TikTok server would answer an API request with a challenge, which couldn't be anticipated and would also be secure, being part of the HTTPS traffic. The client would respond to the challenge, and if the latency was consistently 100-150 ms despite the account showing a location of Virginia, then you can deem them inauthentic and basically just downrank all their content.
There's more to it of course. A lot is in the details. Like you'd have to handle verified accounts and people traveling and high-latency networks (eg Starlink).
You might say "well the phone farms will move to the US". That might be true but it makes it more expensive and easier to police.
It feels like a solvable problem.
[1]: https://www.nbcnews.com/news/us-news/x-new-location-transpar...
[2]: https://aihola.com/article/nvidia-gpu-location-verification-...
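The challenge-latency idea described above can be sketched roughly as follows. This is a hypothetical illustration, not any real platform's implementation: the function names, the 200 km/ms fiber figure, and the slack threshold are all assumptions, and a real system would aggregate many measurements rather than trust a single round trip.

```python
import os

# Light in fiber covers roughly 200 km per millisecond, so a client
# D km away cannot answer a challenge in less than 2*D/200 ms round trip.
FIBER_KM_PER_MS = 200.0

def make_challenge() -> bytes:
    """An unpredictable nonce the client must echo back (sent inside HTTPS,
    so it can't be answered by anything but the real endpoint)."""
    return os.urandom(16)

def min_rtt_ms(distance_km: float) -> float:
    """Physical lower bound on round-trip time to a point distance_km away."""
    return 2.0 * distance_km / FIBER_KM_PER_MS

def location_plausible(measured_rtt_ms: float, claimed_distance_km: float,
                       slack_ms: float = 50.0) -> bool:
    """A VPN can only ADD latency, never remove it. If the measured RTT is
    far above the physical minimum for the claimed distance (plus generous
    slack for routing and processing), the client is probably much farther
    away than it claims to be."""
    return measured_rtt_ms <= min_rtt_ms(claimed_distance_km) + slack_ms
```

For example, an account claiming to be in Virginia (say, 300 km from the server) but consistently answering in 150 ms would fail this check, while a genuine nearby client answering in 20 ms would pass.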
Much like someone from Schaumburg Illinois can say they are from Chicago, Hacker News can call itself social media. You fly that flag. Don’t let anyone stop you.
If you can ride the Metra from your city to Chicago proper, you're in Chicago!
> What if people DO USE em-dashes in real life?
I do and so do a number of others, and I like Oxford commas too.
The darkest hour is just before the dawn
What secret is hidden in the phrase “you are absolutely right”? Using Google's web browser translation yields the mixed Hindi and Korean sentence: “당신 말이 बिल्कुल 맞아요.”
Bots are everywhere, and AI bots are making this theory very true.
I'm a bit scared of this theory. I think it will come true: AI will eat the internet, and then they'll paywall it.
Innovation outside of rich corporations will end. No one will visit forums, innovation will die in a vacuum, and only the richest will have access to what the internet was. Raw innovation will be mined through EULAs, and people striving to make things will just have their ideas stolen as a matter of course.
That’s why we need a parallel internet.
The "old" Internet is still there in parallel with the "new" Internet. It's just been swamped by the large volume of "new" stuff. In the 90s the Internet was small and before crawler based search engines you had to find things manually and maintain your own list of URLs to get back to things.
Ignore the search engines, ignore all the large companies and you're left with the "Old Internet". It's inconvenient and it's hard work to find things, but that's how it was (and is).
Well then in that case, maybe we need a “vetted internet”. Like the opposite of the dark web, this would only index vetted websites, scanned for AI slop, and with optional parental controls, equipped with customized filters that leverage LLMs to classify content into unwanted categories. It would require a monthly subscription fee to maintain but would be a nonprofit model.
That's the original "Yahoo Directory", which was a manually curated page.
https://en.wikipedia.org/wiki/Yahoo#Founding
The original Yahoo doesn't exist (outside archive.org), but I'm guessing there would be a keen person or two out there maintaining a replacement. It would probably be disappointing, as manually curated lists work best when the curator's interests are similar to your own.
What you want might be Kagi Search with the AI filtering on? I've never used Kagi, so I could be off with that suggestion.
What safeguards would be in place to prevent this parallel internet from also, with time, becoming a dead internet?
Social stigma against any monetary incentives. (I recognize the irony in saying this on HN.)
When it becomes a dead parallel internet, we'll make an internet'' and go again.
Plenty of crass jokes that advertisers don't want their content placed alongside is how 4chan avoided commercialization.
Internot.
A̶O̶L̶ Humans Online
What would stop them from scraping it and infecting it?
It's bimodal
Like wearing a mask on the back of one's head to ward off tigers.
Such posts are identifiable and rare, disproving Dead Internet Theory (for now).
Are em dashes in language models particularly close to a start token or something - something that somehow lets the model keep outputting?
I think it's mainly a matter of clarity: long embedded clauses without obvious visual delimiting can be hard to read, and thus are discouraged in professional writing that aims for ease of reading by a wide audience. LLMs are trained on such a style.
>The other day I was browsing my one-and-only social network — which is not a social network, but I’m tired of arguing with people online about it — HackerNews
dude, hate to break it to you but the fact that it's your "one and only" makes it more convincing it's your social network. if you used facebook, instagram, and tiktok for socializing, but HN for information, you would have another leg to stand on.
yes, HN is "the land of misfit toys", but if you come here regularly and participate in discussions with other people on a variety of topics, and you care about the interactions, that's socializing. The only reason you think it's not is that you find actual social interaction awkward, so you assume that if you like this, it must not be social.
The problem is not the Internet but the author and those like them, acting like social network participants in following the herd - embracing despair, hopelessness, and victimhood - without realizing that they are the problem, not the victims. Another problem is their ignorance and their post-truth attitude, not caring whether their words are actually accurate:
> What if people DO USE em-dashes in real life?
They do and have, for a long time. I know someone who for many years (much longer than LLMs have been available) has complained about their overuse.
> hence, you often see -- in HackerNews comments, where the author is probably used to Markdown renderer
Using two dashes for an em-dash goes back to typewriter keyboards, which had only what we now call printable ASCII and on which it was much harder to add non-ASCII characters than it is on your computer - no special key combos. (Which also means that em-dashes existed in the typewriter era.)
On a typewriter, you could just adjust the carriage position to make a continuous dash or underline or what have you. Meanwhile, for typewritten text I typically see XXXX typed over words rather than strike-throughs.
Most typefaces make consecutive underlines continuous by default. I've seen leading books on publishing, including IIRC the Chicago Manual of Style, say to type two hyphens and the typesetter will know to substitute an em-dash.
How is the author the problem? What is the problem, in your view?
The irony is that I submitted one of my open source projects because it was vibe-coded and people accused me of not vibe coding it!
What is now certain is Dead StackOverflow Theory.
But what about the children improving their productivity 10x? What about their workflows?
Think of the children!!!
lol, Hacker News is ground zero for outrage porn. When that guy made that obviously pretend story about delivery companies adding a "desperation score", the guys here lapped it up.
Just absolutely loved it. Everyone was wondering how deepfakes are going to fool people, but on HN you just have to lie somewhere on the Internet and the great minds of this site will believe it.