Apparently this is officially documented at https://www.notion.com/help/public-pages-and-web-publishing#... buried in a note:
> When you publish a Notion page to the web, the webpage’s metadata may include the names, profile photos, and email addresses associated with any Notion users that have contributed to the page.
That's just ... absurd.
The flaw itself is absurd but then just accepting it as "by design" makes it even worse.
As a Notion user with public pages, I find this beyond stupid.
It has been an issue for at least 5 years. I remember one dude from HN deanonymized me around 5 years ago by looking at my notion page.
Looks like we're gonna have to go full CIA mode and shift into maximum OPSEC if we want any semblance of privacy. Gotta compartmentalize everything...
Notion’s macOS app is some of the worst software I’ve ever used. If there is a platform design idiom, they likely break it without a second thought.
Well, that's because it isn't really a macOS app; it's just the web app.
Big companies need to start caring more about the security and privacy of their users and employees.
I think we’ll start seeing consulting agencies advertise how many vulnerabilities they can resolve per million tokens, and engineering teams feeling pressure to merge this generated code.
We’ll also see more token-heavy services like Dependabot, SonarQube, etc. that specialize in providing security-related PR reviews and codebase audits.
This is one of the spaces where a small team could build something that quickly pulls great ARR numbers.
The same vertical-specialist logic applies in legal tech. Law firms are drowning in contract review (NDAs, MSAs, leases) and generic AI gives them vague answers with no accountability. The teams winning there aren't building 'AI for lawyers', they're building AI that cites every answer to a specific clause and pins professional liability to the output. That's a very different product than a chatbot.
What's needed there are custom harnesses that don’t let the LLM decide what to do when. Use their power of pattern matching on data, not on making decisions.
Does SonarQube use LLMs these days? It always seemed like a bloated, Goodhart's-law-inviting waste of time, so hearing that doesn't surprise me at all.
Nah. They care about profits only, the sooner the better, so everyone can cash out and move to their next “venture”
I don’t think ”caring about profits” applies to any company in 2026?
The problem is that they don't "need" to. There are no consequences for not caring, and no incentive to care.
We need laws and a competent government to force these companies to care by levying significant fines or jail time for executives, depending on severity. Not fines like 0.00002 cents per exposed customer, but existential fines like 1% of annual revenue for each exposed customer. If you fuck up badly enough, your company burns to the ground and your CEO goes to jail: that kind of consequence.
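Back-of-envelope, assuming the proposed 1%-per-customer fine is additive (a sketch of the arithmetic, not an actual legal proposal):

```python
# Hypothetical illustration: an additive fine of 1% of annual revenue
# per exposed customer reaches "existential" size almost immediately.
def fine_fraction(exposed_customers: int, pct_per_customer: float = 0.01) -> float:
    """Fraction of annual revenue owed, capped at 100%."""
    return min(1.0, exposed_customers * pct_per_customer)

# A breach exposing just 100 customers already costs a full year's revenue;
# typical breaches expose millions.
assert fine_fraction(25) == 0.25
assert fine_fraction(100) == 1.0
assert fine_fraction(10_000_000) == 1.0
```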
This kind of response went out of fashion after Enron. Burning an entire company to the ground (in that case Arthur Andersen) and putting thousands out of work because of the misdeeds of a few - even if they were due to companywide culture problems - turned out to be disproportionate, wasteful, and cruel.
the answer to that is a functional social safety net for the innocent employees to land in, not allowing companies to violate the law with impunity.
You’re describing a system where taxpayers foot the bill for data breaches.
That's exactly backwards. In the current regime, it's precisely the billions of people who are affected by data breaches (and who happen to be taxpayers!) who are footing the bill.
Not at all. Make the guilty corporation pay for all of it.
This. Severe consequences are the best way to prevent crime.
If we also make the penalty for every crime the death penalty we'll have no more crime. Very simple solution no one has thought of.
If the government wants me to take copyright and IP laws seriously, then they need to take my personal information seriously too.
Any self hosted solution?
The tweet is only a few words, you really need an LLM to write that for you???
I've been toying with an architecture that stores each user's data with that user and only materializes it on demand, so that many data leaks would yield little, since the server doesn't actually store most of the user data. I mention this since these sorts of leaks are inevitable as long as people are fallible. I feel the correct solution is to not store user data to begin with.
Some problems I've identified:
1. Suppose you have x users and y groups, each of which requires some subset of x. Joining the data on demand can become expensive, O(x*y).
2. The main usefulness of such an architecture is if the data itself is stored with the user, but as group sizes y increase, a single user's data being offline makes aggregate use cases more difficult. This would lend itself to replicating the data server-side, but that would defeat the purpose.
3. Assuming the previous two are solved, which is very difficult to say the least, how do you secure each user's data such that someone who knows about this architecture can't just go to the clients and trivially scrape all of the data (per user)?
4. How do you allow for these features without allowing people to modify their data in ways you don't want to allow? Encryption?
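For problem 4, one standard option is not encryption but signing: the server signs each record before handing it to the client, then verifies the signature when the record comes back, so tampering is detectable without the server storing anything. A minimal sketch with Python's stdlib `hmac`; the secret and record fields are illustrative:

```python
# Sketch: server-signed records let clients hold their own data while the
# server can still reject any record modified outside the allowed paths.
import hashlib
import hmac
import json

SERVER_SECRET = b"example-only-secret"  # assumption: kept server-side

def sign(record: dict) -> str:
    # Canonical JSON so the same record always produces the same tag.
    payload = json.dumps(record, sort_keys=True).encode()
    return hmac.new(SERVER_SECRET, payload, hashlib.sha256).hexdigest()

def verify(record: dict, tag: str) -> bool:
    return hmac.compare_digest(sign(record), tag)

rec = {"user": "alice", "post": "hello"}
tag = sign(rec)
assert verify(rec, tag)                                   # untouched: accepted
assert not verify({"user": "alice", "post": "edited"}, tag)  # tampered: rejected
```

The trade-off is that this only proves the record is one the server once blessed; preventing replay of old-but-valid records would need versioning on top.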
A concrete example of this would be if HN had each user keep a SQLite database storing all of that user's posts. The HN server would then fetch the data from each poster to render the regular page. Presumably, if a given user's data is inaccessible, it would simply be omitted.
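The HN example above can be sketched in a few lines; here each "user-hosted" store is just an in-memory SQLite database, and all names are illustrative:

```python
# Toy sketch: each user holds their own SQLite DB of posts; the server
# materializes a page by querying each store, skipping unreachable users.
import sqlite3

def make_user_store(posts):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (body TEXT)")
    db.executemany("INSERT INTO posts VALUES (?)", [(p,) for p in posts])
    return db

stores = {
    "alice": make_user_store(["first!", "nice article"]),
    "bob": make_user_store(["meh"]),
    "carol": None,  # offline: her posts are simply omitted from the page
}

def render_page(stores):
    page = []
    for user, db in stores.items():
        if db is None:
            continue  # user's data is inaccessible right now
        for (body,) in db.execute("SELECT body FROM posts"):
            page.append((user, body))
    return page

print(render_page(stores))
# [('alice', 'first!'), ('alice', 'nice article'), ('bob', 'meh')]
```

This also makes problem 1 visible: rendering one page is already a query per contributing user, which is where the O(x*y) cost comes from.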
I’ve always liked this idea, but I think it eventually ends up back at essentially our current system. Users have multiple devices, so you quickly get to needing a sync service. Once that gets complex enough, people will outsource it to a third party, and then we're back in a FB/Google/Apple sign-in and data-management world.