It's the 'authenticity' issue of generative AI that troubles me rather than the content of the viewpoints.
If these same ideas were expressed by Vtubers (virtual YouTubers, anime-like filters for people who want to do to-camera video but are shy or protective of their privacy), it would not be troubling, as everyone understands that fictionalized characters are a form of puppetry and can focus on the content of the argument.
But using generative video to simulate ordinary people expressing those ideas is a way of hijacking people's neural responses. Just pick the demographic you wish to micro-target (young/middle/old, working/middle/upper class, etc. etc. etc.) and generate attractive-looking exemplars for messages you want to promote and ugly-looking exemplars for those you wish to discredit.
I didn't have "I generated myself as the chad and you as the soyjak" turning out to be a valid psyop strat in my AI tech doomerism bingo.
I wonder what recipients would think if I sent this sentence 10 years back via time machine.
The basic framing of the sentence works fine if you know what those terms are. And while "soyjak" is a bit newer, "chad", "soy boy" and "wojak" were well established at that point, so I think it would be easy enough to figure out.
Send it back a century and it’d even be “what’s a bingo card?”
Realistic AI video generation will put us firmly in a post-truth era. If those videos were 25% more realistic, they would be indistinguishable from a real interview.
The speed with which you’ll be able to create and disseminate propaganda is mind blowing. Imagine what a 3 letter agency could do with their level of resources.
Content authenticity follows from the source's credibility. This is why chains of evidence are crucial. If squinting at pixels were the harbinger of a post-truth reality, then it's already many decades late.
Good luck telling that to aging grandparents and people in third world countries who are just getting acquainted with the internet.
even in first world countries grandparents are reading the print version of The Epoch Times lol
> speed with which you’ll be able to create and disseminate propaganda is mind blowing
The problem isn’t the fakery, it’s this speed of dissemination on algorithmic social media. It’s increasingly looking like the modern West’s Roman lead pipes.
Yet people on this very website are 200% hyped about GenAI because it makes it easier to generate slop frontend code or whatever.
>> Realistic AI video generation will put us firmly in a post-truth era.
Verifiable authenticity might just be the next big thing.
It’s already a thing in some cameras. New Leicas will embed a cryptographic signature in the photo.
People don’t want authenticity. They want confirmation of what they already think.
People want authentic confirmation of what they already think, which makes it easy to give them the confirmation and lie to them about the authenticity.
If they can't have both, some people prefer confirmation to authenticity. But that's far from a universal preference.
Text has been mass-fakeable since Gutenberg, and photographs for decades at least.
What makes you think fake videos will have an outsized impact?
Text - “we have no idea if they actually said that”
Picture - “this could be out of context” (this is used constantly in politics and people fall for it anyway)
Video removes the question of context and of whether the person actually did it. So now, instead of printing a story about it or showing an awkward picture from an unflattering angle, I can generate a video of your favorite politician taking upskirt photos on a city bus.
As the tech gets more and more realistic, we’re increasingly straining the average person’s ability to maintain presence of mind and ask questions.
Yes! And:
Video - “There’s no way to tell whether this is AI-faked or not.”
Ha, I’d love to share the optimism that that’s what most people would say but what we’ve seen is the opposite. The more convincing the media (as in media format) is the more people are willing to believe it.
Don’t get me wrong. I hope this level of fake media causes people to stop taking things at face value and dig into the facts but unfortunately it seems we’re just getting worse at it.
A picture is worth a thousand words.
I don't agree. 3 letter agencies have been able to fake videos since the inception of video. Even more so with CGI.
It has always been about trust in the authors.
The main difference is that petty fakes become cheap. E.g. my wife could be shown a fake portraying me, for whatever malicious reason.
You're being obtuse. There's an obvious difference between "state-level actors can produce misleading films" and "anyone with an internet connection and 5 minutes can make anything they want".
The post I responded to wrote "Imagine what a 3 letter agency could do with their level of resources" and I don't think much changed in that regard.
Not in terms of effect. This might have been a gamechanger 20 years ago but nowadays people already trust TikTok memes more than they trust CNN. The bar for credibility is so low that this sort of thing is almost trying too hard.
“People” aren’t a monolith. Certain people are definitely falling for low effort TikTok trash but now more people will fall for these more “credible” fakes.
I think it will be like those "X celebrity is dead" fake articles that went viral on Facebook sometime in the 201Xs. People, as in enough people to make gossip, will only get fooled 3 or 4 times.
The Philippines has a history of foreign influence on its local politics. It wouldn't be crazy to expect this is just the latest chapter of a 3 letter agency using it as a laboratory.
To be honest, I consider myself pretty savvy when it comes to identifying fakes, and the schoolboy ones would have fooled me. The videos were nearly flawless, and the accent and lip sync were spot on. I don’t even think you truly need that extra 25% for the casual observer.
No, it won't.
Expected reaction: every camera manufacturer will embed chips that hold a private key used to sign and/or watermark photos and videos, thus attesting that the raw footage came from a real camera.
Now it only remains to solve the analog hole problem.
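For concreteness, a minimal sketch of that sign-and-verify flow, assuming an Ed25519 device key and Python's `cryptography` package. In a real camera the private key would live in a tamper-resistant secure element rather than application memory, and the byte strings here are placeholders:

```python
# Hypothetical sketch of per-device footage signing (not any vendor's actual scheme).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()    # provisioned at the factory, per device
public_key = device_key.public_key()         # published/attested by the manufacturer

raw_footage = b"...sensor readout bytes..."  # placeholder for real image data
signature = device_key.sign(raw_footage)     # shipped alongside the file as metadata

try:
    public_key.verify(signature, raw_footage)  # anyone can check the attestation
    print("raw footage attested by this camera")
except InvalidSignature:
    print("footage altered, or not from this camera")
```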
I don't think that's as trivial and invulnerable as you think it is. You are talking about a key that exists on a device, but cannot be extracted from the device. It can be used to sign a high volume of data in a unique way such that the signature cannot be transferred to another video.
Now you have another problem -- a signature is unique to a device, presumably, so you've got the "I can authenticate that this camera took this photo" problem, which is great if you are validating a press agency's photographs but TERRIBLE if you are covering a protest or human rights violation.
- Attach a screen to the camera. Bonus points for bothering to calibrate that contraption.
- Watermarking is nearly useless as a way of conveying that information, either visibly distorting the image or being sensitive to all manner of normal alterations like cropping, lightness adjustments, and screenshotting.
- New file formats are hard to push to a wide enough audience for this to have the desired effect. If half the real images you see aren't signed, ignoring the signature becomes second-nature.
- Hardware keys can always be extracted in O(N) for an N-bit key. The constant factor is large, but not large enough to deter a well-funded adversary. The ability to convincingly fake, e.g., video proof that you weren't at a crime scene would become valuable in a hurry. I don't know the limits, but it's worth more than the $2-10 million you'd need to extract a key.
- You mentioned the analog hole problem, and that's also very real. If the sensor is engineered as a separate unit, it's trivial to sign whatever data you want. That's hard to work around because camera sensors are big and crude, so integrating a non-removable crypto enclave onto one is already a nontrivial engineering challenge.
- If this doesn't work something like TLS, with certificate transparency logs and chains of trust, then one compromised key from any manufacturer kills the whole thing. Would the US even trust Chinese-signed images? Vice versa? The government you obey has a lot of power to steal that secret without the outside world knowing.
- Even if you do have CT logs and trust the company publishing to them to not publish compromised certs, a breach is much worse than for something like TLS. People's devices are effectively just bricked (going back to that 3rd point -- if all the images you personally take aren't appropriately signed, will a lack of signing seem like a big deal?). If you can update the secure enclave then an adversary can too, and if updating tries to protect itself by, e.g., only sending signed bytecode then you still have the problem that the upstream key is (potentially) compromised.
- Everyone's current devices are immediately obsolete, which will kill adoption. If you grandfather the idea in, there's still a period of years where people get used to real images not being signed, and you still have a ton of wasted money and resources that'll get pushback.
Etc. It's really not an easy problem.
Persistent media watermarking through the analog hole is a solved problem and has been for years. It's standard practice on films.
What does it even mean that hardware keys are extractable in O(N) time? If there's some reasonable multiple of N where you can figure out a key, your cryptosystem is broken, physical or not.
It's also very straightforward to attach metadata to media and wouldn't take a format change.
The problem would be spurious watermarks, not vanishing ones. Create fake video, point camera at screen, re-record it. Now it's fake and authenticated as a genuine camera recording.
> Persistent media watermarking through the analog hole is a solved problem and has been for years. It's standard practice on films.
Can you expand on that a bit? Wikipedia's coverage on this seems mostly historical and copy protection focused.
The basic idea is that you apply a very, very large amount of error correction to the tag and inject it into the media so that enough survives the severe geometric, color, and luminance distortions of a camcorder to recover the data out the other end. You then download the pirated cams and sue the theater.
There's a fair bit of public information out there on theoretical techniques (e.g. https://www.intechopen.com/chapters/71851), but I'm not deeply familiar with what's actually used in industry, for example by imatag.
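For intuition only, here is a toy version of that redundancy trick using numpy: each payload bit is spread as a pseudo-random ±1 pattern over every pixel, and detection is a correlation that survives heavy noise and brightness shifts. Real systems add proper error-correcting codes, perceptual masking, and geometric re-registration; every name and constant below is made up for the sketch.

```python
import numpy as np

SEED = 1234  # shared secret between embedder and detector

def carriers(n_bits: int, shape: tuple) -> np.ndarray:
    """One pseudo-random +/-1 pattern per payload bit, derived from the secret."""
    rng = np.random.default_rng(SEED)
    return rng.choice([-1.0, 1.0], size=(n_bits, *shape))

def embed(luma: np.ndarray, bits: list, strength: float = 1.5) -> np.ndarray:
    """Spread each bit across the whole frame as a faint +/- pattern."""
    marked = luma.astype(float)
    for bit, pattern in zip(bits, carriers(len(bits), luma.shape)):
        marked += strength * (1.0 if bit else -1.0) * pattern
    return marked

def detect(luma: np.ndarray, n_bits: int) -> list:
    """Correlate against each carrier; massive redundancy makes the sign robust."""
    return [int((luma * pattern).mean() > 0)
            for pattern in carriers(n_bits, luma.shape)]

# Simulate a crude analog hole: heavy sensor noise plus a brightness shift.
frame = np.random.default_rng(0).uniform(0, 255, (480, 640))
marked = embed(frame, [1, 0, 1, 1, 0, 1, 0, 0])
attacked = marked + np.random.default_rng(1).normal(0, 20, marked.shape) + 10
print(detect(attacked, 8))  # recovers [1, 0, 1, 1, 0, 1, 0, 0] with high probability
```

A real camcorder rip also warps the image geometrically, which is why production schemes need synchronization and registration stages this toy omits.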
Interesting, and the paper is surprisingly accessible to read as well, thanks.
One critique I can lodge against this is that to me it reads like the security model in this scheme trusts the venue to not tamper with the projection equipment. This may not map well to everyday camera recording situations, where the camera owner / operator may have a vested interest and capability in tampering with the camera itself.
There's a fair bit of protection for the projection equipment. They're actually always-connected devices that get streaming permissions from a remote server before starting a show, with device roots of trust, tamper detection systems, and so on. The movie file is essentially encrypted at rest until showtime. Plus the techs are generally not that technically advanced, and theaters face the threat of lawsuits, or never being allowed to show movies again, if they breach their contractual obligations by reverse engineering the equipment. It's generally effective.
The point isn't to completely close the analog hole here (and at least one piracy group seems to have leaks of raw movie data despite all of this security), but it's effective at making compliance the least costly option for almost everyone involved. It's one of the major reasons we've seen a shift by pirate groups to preproduction leaks or alternative methods.
a capable university student with their lab equipment could easily extract that key
I don't think the CIA will have any problems
I hope you’re right
Does anyone else feel like at some point reality turned into an unpublished Neal Stephenson novel?
Well, we could use AI to write one.
Or a published PKD novel, for that matter.
Was chatting to someone last week who was using AI to help teach their kids. A Young Lady's Illustrated Primer.
This seems pretty... mild? It's short clips of AI-generated schoolboys raising a pretty reasonable if obviously still politically motivated argument.
From the headline, I was expecting VP candidates slapping each other in the face with a glove and then facing off at dawn with loaded pistols.
Lol, students fake stuff. These actorless statements have got to go, for humanity's sake.
IMO all digital content is going to have to be signed so the provenance trail can be crawled by an AI across devices.
https://aditya-advani.medium.com/how-to-defeat-fake-news-wit...
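One plausible shape for such a crawlable provenance trail, sketched as a simple hash chain in Python. All field names and actors here are hypothetical, and a real system would also sign each record with a key like the camera scheme discussed above:

```python
import hashlib
import json

def append_record(chain: list, actor: str, action: str, payload: bytes) -> list:
    """Append a tamper-evident provenance record linking back to the previous one."""
    prev = (hashlib.sha256(json.dumps(chain[-1], sort_keys=True).encode()).hexdigest()
            if chain else "")
    chain.append({
        "actor": actor,        # e.g. camera serial, editing app, publisher
        "action": action,      # e.g. "capture", "crop", "transcode"
        "content_hash": hashlib.sha256(payload).hexdigest(),
        "prev": prev,          # links the records into a chain
    })
    return chain

trail = append_record([], "camera-123", "capture", b"raw capture bytes")
trail = append_record(trail, "editor-app", "crop", b"cropped bytes")
# A crawler (human or AI) can walk the `prev` links back to the original
# capture and flag any record whose hashes no longer line up.
```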
AI fakes, and AI in general, will push more and more people to interact with each other in real life. I, for one, can't wait for that. Sometimes, the more things change, the more they stay the same.
I don't know that this is true. A lot of people are getting sucked into "this is my AI friend/girlfriend/boyfriend/waifu/husbando" territory.
In real life, other humans are not machines you can put kindness tokens in and get sex out of. AI, on the other hand, you can put any tokens at all into and get sex out of. I'm worried that people will stop interacting with humans because it's harder.
Sure, the results from a human relationship are 10,000x higher quality, but they require you to be able to communicate. AI will do what it's told, and you can tell it to love you and it will.
for some values of "will".
That problem will naturally sort itself out through the magic of evolution: genetic and cultural traits that increase the chance of pairing with AI will be bred out (as such people won't have children), and traits that reduce it will be selected for.
Implying that this is all somehow a genetic trait?
Which gene do you think encodes for having the hots for AI models?
You remind me of a report I saw on Taiwanese schoolchildren's career goals. Most said they were aiming for the semiconductor industry. Crazy how the local gene pool works, what a coincidence.
Sounds like a plan for a healthy social order
It could be a plot line in a Black Mirror episode
Once again the Philippines is leading the way with how these things will play out for other countries in the future[1]
[1]: https://www.bbc.com/news/blogs-trending-38173842
Welcome to the future, y’all. It’s gonna be a wild decade.