> The LLM generated writing obviously felt significantly better than my own writing.
A general pattern for LLMs is that they look really good at things you are bad at. What that means is that if you find yourself thinking of its output as significantly better than yours in a particular domain, there's a high chance that you are not equipped to judge that quality effectively.
I don't disagree about the probability, but the current frontier models are not completely useless for writing even in areas where I have significant knowledge. I would not have said that a year ago. You have to watch them like a hawk -- they are good at spitting out plausible sounding nonsense that is hard even for an expert to discern. But the dice roll going on behind the scenes is continually more biased towards being correct/useful than not.
On factual things, potentially. But if I want to read your writing, wouldn't I be trying to pick your brain? Otherwise why don't I read Wikipedia or usage documentation?
Honestly, I can't fathom thinking that LLM writing is even remotely passable. People that think this should honestly read more. One book a month is hardly an aspirational goal. You don't even have to read Melville or Hemingway or Chaucer or Shakespeare, just pick up any popular NYT best seller, and it'll be significantly better than anything an LLM can generate.
> I can't fathom thinking that LLM writing is even remotely passable. People that think this should honestly read more.
This makes me think you're only exposing yourself to high quality writing online and from an intelligent circle of friends and coworkers. The average person's reading and writing abilities are _atrocious_ and only getting worse. We're almost at the point where kids are communicating through abbreviations and emojis exclusively. LLM prose is significantly better than what the average person can produce.
Are we also saying it's acceptable to feed people junk because it's better than what they would cook?
At some point you're just making bad excuses for false scarcity.
I think it's both true that most LLM writing ("writing") sucks and that it's better than what a lot of people can produce unassisted. Which to me doesn't mean that we should roll over and accept LLM output as a lesser evil... it just means that the bar is so low it might as well be in hell, and rapidly getting lower :')
It’s acceptable for someone to buy a ready meal or takeout if it’s better than what they can cook. Why wouldn’t it be? Is that the greatest choice for their personal development? Probably not, but life is complex and folk have limited capability and bandwidth for acquiring skills.
Tell me your thoughts on the quality of LLM-generated code. I've never understood this attitude where people are absolutely disgusted by the slightest whiff of AI prose but will happily slurp up AI-generated code by the bucketful and proudly proclaim that it's OK because it's better than the average developer can produce.
Really hard to take your comment seriously when the only post on dvt.name is a hello world page, because at least OP is trying to publish and you lack the moral high ground to judge him for thinking LLM writing is good.
Oh if I had a nickel for every web domain I bought and put a hello-world.html into s3 and never checked again ...
FWIW, I'm with GP. It's quite easy to get just mind-numbingly tired reading beyond the first two sentences of a typical LLM output, let alone on something I'm familiar with.
Lol my blog was hacked recently and I've been lazy about moving my backed-up MySQL DB to the new WP installation. Not sure where moral high ground enters the picture. If I really wanted to be an asshole, I'd cite a book I co-wrote and another I edited.
I dabble in drawing and I find LLM images (and maybe some non-LLM ones) abhorrent. As for why, what I can think of is the lack of consistency (perspective, small details, and color theory) and too much detail, which turns it into visual noise. In most paintings, the artist will have a subject that is most detailed (to draw the eyes) and from there, the loss of detail will follow some kind of logic. This is how you pinpoint what the artist is most interested in. LLM output looks like a filter applied to a montage of pictures.
It's like a gross-looking slice of pizza. It's mind-bending because at first it looks good (after all, it's pizza), but something in it makes it really disgusting.
Mnemonic: geLL-Mann amnesia effect
Scrolling down a LinkedIn feed is hilarious at the moment.
My favourite one from today:
“The tax isn't the problem. The mindset is.”
> ⏺ Smoking gun.
> "belt and suspenders"
- “(The) honest caveat:” (or “genuine caveat:”, both with the colon)
- “(The) honest answer:” (again, with colon)
- “The thing to internalize:”
- “The smoking gun:”
(really, sentences that start with “The <tag suggesting the next clause is the key point>:” are a strong tell, but those four are the most prolific)
- “load bearing” (when not talking about architecture)
- “blast radius” (when not talking about actual explosives, but rather the effect of an event/action)
- “smoke test” (esp. when “sanity check” is more apropos)
- Lists of three clauses/adjectives where the third is really just a combination of the first two
- Referring to the “shape” of things figuratively
- Social media posts that end with “Curious if anyone…”
- Stories or anecdotes using “Oh. Oh.” (where the second “oh” is italicized)
Edit: Yes, some of those last ones are terms that we often use as devs...but I would argue about the actual frequency of their use. Plus, these tells live on in prose generated by the latest models.
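For anyone who wants to count these rather than eyeball them, here is a minimal sketch of a phrase flagger. The pattern table and the `find_tells` name are my own assumptions for illustration, the list only covers the literal phrases above, and plain string matching cannot tell a figurative "load bearing" from one that really is about architecture, so treat any hit as a prompt to reread, not as proof.

```python
import re

# Hypothetical pattern table built from the phrases listed above.
# It is illustrative only and far from exhaustive.
TELL_PATTERNS = {
    "honest/genuine caveat:": r"\b(?:honest|genuine) caveat:",
    "honest answer:": r"\bhonest answer:",
    "the thing to internalize:": r"\bthe thing to internalize:",
    "the smoking gun:": r"\bthe smoking gun:",
    "load bearing": r"\bload[- ]bearing\b",
    "blast radius": r"\bblast radius\b",
    "smoke test": r"\bsmoke test\b",
    "curious if anyone": r"\bcurious if anyone\b",
}


def find_tells(text):
    """Return {label: count} for every pattern that matches at least once."""
    counts = {}
    for label, pattern in TELL_PATTERNS.items():
        # Case-insensitive so sentence-initial capitals still match.
        hits = re.findall(pattern, text, flags=re.IGNORECASE)
        if hits:
            counts[label] = len(hits)
    return counts


if __name__ == "__main__":
    sample = "The honest answer: this cache is load-bearing. Load bearing!"
    print(find_tells(sample))
```

The structural tells (lists of three, figurative "shape", the italicized second "oh") are not covered here, since they need more than a regex.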
> I would argue about the actual frequency of their use
Assuming you mean load bearing & blast radius, I'd see those used and use them myself very frequently pre-LLM, mostly in online discussions though, so it's telling where they got their training data. Load bearing itself is/was a pretty normal phrase in the ops world in daily discussion.
Smoke test though, I can't say I've ever seen irl usage.
These LLM idioms are constantly being consumed every day and are bound to make it into the next, if not current, generation's vernacular. It's going to be unbearable.
Honest, straight, genuine, actual, real are all words that paper over a weak claim to me. I'm thinking about a hook that injects a fact-checking subagent in an "are you sure?" style here because it's so bad.
Also, the false "not X, it's Y" is used in a similar way for faux distinctions, like a sov cit claiming "it's not driving, it's traveling in a car".
Jab, jab, thrust is how I think about that pattern. Or tap tap whack, if you prefer. And it shows up for positives too:
"Smooth. Effortless. A perfect fit for your needs".
In any style of informal or persuasive writing this shows up, as if it has to drive the point in.
I kind of wish we'd stop talking openly about what the tells are. It's nice to be able to determine with fair accuracy - but it couldn't last forever.
Abusing the words "canonical" and "normalized".
The LLM doesn't smell like authentic writing but it does a great job for fast and cheap words. We've gained something similar to fast food. Words made very cheap, very fast, easily digestible, but they have no emotion. In short stints it does have a place in the world.
> The "JetBrains Mono" font
Thought for sure we'd get a critique of Inter overuse. JetBrains Mono is a lovely font, though.
It's my daily driver, so I kind of twitched a bit seeing that listed in here. I never noticed because I was using it anyway, I guess.
It's kind of interesting how genuinely hard it is to get models to deviate from basically all of these tropes. You can straight up tell it "I hate that card design, do something different, get creative!" and it'll do something either (a) ugly as sin (clearly just essentially a random walk through parameters) or (b) some same-y derivation of that card.
In coding, I've noticed a few tropes as well: everything is a "contract" or an "artifact" (clearly trained on like three decades of Java lol), everything is constantly "backwards-compatible" or "versioned" (even if working on a brand new greenfield project), and a few others.
If Claude says "load bearing" once more, I think I'll vomit.
That's a funny one. I don't use LLMs at all but "load bearing" is such a common/over-used internet joke for DIY building projects and stuff like "load bearing caulk". Have never heard it in a software sense really, so am slightly perplexed.
Hah, ChatGPT constantly says "that's real" or "less about X, more about Y."
You are right to push back.
All of those are included in the bulk of the documents passing my work input these days. It is infuriating. Out of principle I maintain 100% me in all my writing but I don't know if it matters. Well maybe it does... an interviewee recently complimented me on the "nicest and most human resume" they saw recently. That felt good
Do you send your resume to people before you interview them?
Those cards, so familiar! Exactly what Opus produced for me.
Did Anthropic and/or OpenAI deliberately train their models to produce websites with a specific design language, or did these stylistic preferences emerge naturally as some kind of LLM-selected optimum?
It's not the base model, it's the system prompt in dev tools.
To give an example I'm personally frequently annoyed by, Google's Antigravity will consistently use the word "anthropomorphic" while "thinking" and the end result will consistently have obnoxiously large border radius (kind of like Android's design language).
Codex on the other hand likes to make websites with blue elements on a black background and likes to use emojis for icons for some reason, which is a terrible idea accessibility-wise.
AI has no taste, so I suspect the labs just gave it a bunch of decent looking boilerplate as preferred style.
When you bring your own ideas you can get AI to dev pretty nice looking non-generic stuff.
Welcome to the future of fast-food software. Taste of deep frying and preservatives.
KPI cards, purple gradients
What I find amazing is how HARD it is to make the LLM produce a piece of text that does not sound like slop. I have had dozens of sessions where I tried to make it write like a human would, and yet it still uses those tired phrases. I don't understand why neither OpenAI nor Anthropic is able to do anything to make it better, and in some cases it feels like we are actually going backwards.