As a software engineer I have some intuition for what the risks are of letting agents do some tasks vs others.
I don't have a similar intuition calibrated for what could go wrong when asking AI to draft a legal document. Some things seem harmless, i.e. drafting a will, but I don't really know- our legal system is notoriously rife with footguns.
I think this is probably true for most skilled professions. AI is best used in the hands of folks already knowledgeable in the skills/professions they are using it for.
I liken it to me googling things as a sysadmin vs. Jane from accounting doing it. The non-tech end user is far more likely to make the problem worse, or install something sketchy from the ad riddled results than I am, or one of my help desk employees are.
I wouldn't trust myself to draft an important legal document using AI without the advice of a lawyer, much like I wouldn't really want to rely on my lawyer to use AI to write code for me.
Vice versa there is also a lot of irresponsible programmers doing stupid things with ai. Irresponsible people stay irresponsible, AI just make them more productive at being irresponsible.
It's like that in engineering, for sure. My background is in aerospace and there are lots of things that a reasonably technically-inclined random can probably do passably. It takes an engineer to know which tasks those are, though.
I would imagine it's similar in law, in that it takes a lawyer or judge to know where the foot guns lie.
I wouldn't consider drafting a will to be harmless. If its done poorly the next of kin could have to deal with a huge headache and potentially months or years of probate proceedings.
I would think that LLMs would be better at avoiding foot-guns. That’s a situation where you have a list of well known rules and potential pit falls, and the work of the lawyer is to apply those to a fact pattern. That’s something that has been hard to automate programmatically, because the fact patterns are similar but different. LLMs, however, seem to excel at applying general principles to differing fact patterns.
I would categorize this in the "expertise that people internalize but never figure out how to verbalize" department, and that is a department we have no way to teach an LLM because if nobody is writing out those unspoken, subconscious rules then the LLM has nothing to read about them in its training data.
But can an LLM come up with questions like what the definition of is is? Seems to me there's a lot of "depends on how you read it" type of stuff that lawyers excel at finding novel interpretations. So what coders thinking of as rules are much less straight forward to understand when it comes to laws
I imagine it's really hard to spot a comma in the wrong place, or a missing sentence in a 10 page contract unless you wrote it yourself, or you assembled it from some battle tested templates.
My best guess is that Gemini was trained on the textbooks that the questions are meant to test against, thus they are probably better at explicit recall of those questions or related questions.
This is a pretty limited introductory course based on what it says in the methods of the paper itself.
Yeah this could be interesting. A lot of the spotlight has been on “law firm stuff” like demand letters and writing contracts…
But imagine if a dev team didn’t have to go engineer -> product manager -> legal team to get a question answered on local data retention requirements. You could ship that much faster.
If the only purpose of asking a lawyer is transferring risk (aka cover your ass) while getting the same advice as an LLM, that’s slowing down delivery for purely bureaucratic reasons.
I’ve seen that mentality at big companies where everyone is scared to stick their neck out and be accountable for a decision. And nothing gets done. Drives me crazy.
But the people who move up are the people who take ownership and get shit done (and are right a lot).
(BTW, I have been at companies that were sued by regulators. They never really punish the individual(s) who were in the room when the decision is made. So your worry is kind of misplaced.)
Incredible that the common people will be able to wrestle the right to rule of law away from the bloated legal caste, who have built themselves quite the moat.
The inaccessibility of justice is a huge driver of inequality. Any tools which bridge this gap will help make a more just society.
I think there will be a market for firms that aggressively market themselves as non-AI, and then as more people turn towards that human connection we'll go full circle
Nobody wants to pay their lawyers more than they have to. There will be a huge market for firms that can use AI to avoid charging clients for $1,000/hour junior associates.
If you want human connection the legal system is not where you are going to find it, period.
I don't think there will be any such market for "non ai" law. If I'm involved with the legal system I just want out as quick as possible as cheap as possible.
Bad legal advice will keep you dealing with the legal system for much longer and at much greater cost. Something being cheap and quick upfront doesn't mean it will be cheap and quick by the end of the process.
Maybe, although I would be extremely hesitant to extrapolate from this one study and trust my legal life to an LLM. One thing that's worth noting, though, is that regardless of the quality of objective legal advice in the abstract, for a lot of smaller scale stuff the human connection actually is literally what is important. There are ambiguities in the law, which are not resolved deterministically but rather at the individual discretion of judges. Your lawyer, if they're any good at their job, knows the local judges and how they're likely to rule for given circumstances, which can influence their legal advice to you specifically.
> In a blind evaluation of nearly 3,000 anonymized comparisons, professors rated AI responses significantly higher than answers written by other professors, with AI winning 75% of head-to-head matchups.
I do wish they'd used some more objective criteria. Simply being preferable one of the things LLMs have trained for since the beginning, hence its sycophantic nature.
The arguments need to be based on actual law, and any cited reference cases need to be real.
There's been a lot of news stories about lawyers using AI, and then getting in trouble for citing hallucinated laws or cases. It doesn't matter if the AI response is "preferred" over the human one if it gets thrown out when put under the scrutiny of a real case.
But did they? Or did they just go off what answer felt better? Did they put in any work to actually confirm the answer? Or did the busy law professors just click through and move on with their life?
Well, they had the data around if the answer would be harmful to the students learning. AI was scored at 3.5% harmful answers and 12% of law professor answers were considered harmful.
Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.
I'm getting more convinced. I mean, sure it makes dumb mistakes sometimes but its a particular set of self serving mistakes, commenting out tests in order to pass. We obv don't want this behavior but I wouldn't say it's dumb.
It'll be like the Turing test, which we just blew past years ago and no one cared. After all the hand-wringing about sentience and rights of the AI if it passes the Turing test, and now we just have AI bots running 24/7 writing slop.
> Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.
He stands to make billions if enough people believe him — unless you also do, consider that you’re the mark. For example, if that was true, it would have to mean that AI companies either aren’t letting customers use the good models or are instructing them to frequently make errors which reveal a fundamental lack of reasoning ability.
Consider also that his wealth means he hasn’t had to defend an idea stringently since the 90s. I wouldn’t be surprised if he does think LLMs give deep answers because it often looks that way until you critically review the response and ask questions like what’s missing which require you to have a decent understanding of the problem domain.
Knowing the question is half of the answer. LLMs are great at scoping your context and answering precisely what you asked; it's also why they go off the rails when they misunderstand a part of your question. Incidentally, they're great at "knowing" and reaching for knowledge.
Humans have the advantage of perspective. We always lack some knowledge and answer broadly. This is bad if you have a particular goal in mind, but better if you're just generally learning, because you see more and learn to discriminate the correct from the wrong. And most importantly, being wrong is part of human ingenuity - because sometimes we turn something "obviously" wrong into something right.
Marc Andreessen has a strong financial incentive to feel this way and to convince others to feel this way.
I also think it’s easy to think that AI gives good answers if you don’t know the field well. In fields where I know the material, the answers are pretty variable and can be quite bad.
> Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.
Investor with vested interest in AI companies makes claim of reaching "AGI".
He is one of the last people to listen to about AGI. Unless the term "AGI" means something entirely different to him vs to independent researchers vs to CEOs, since the term has become entirely meaningless.
Personally I think this is very good. One of the hardest things out there is maintaining a society in the face of changing times and it's because law is dense and slow.
As a software engineer I have some intuition for what the risks are of letting agents do some tasks vs others.
I don't have a similar intuition calibrated for what could go wrong when asking AI to draft a legal document. Some things seem harmless, i.e. drafting a will, but I don't really know- our legal system is notoriously rife with footguns.
I think this is probably true for most skilled professions. AI is best used in the hands of folks already knowledgeable in the skills/professions they are using it for.
I liken it to me googling things as a sysadmin vs. Jane from accounting doing it. The non-tech end user is far more likely to make the problem worse, or install something sketchy from the ad riddled results than I am, or one of my help desk employees are.
I wouldn't trust myself to draft an important legal document using AI without the advice of a lawyer, much like I wouldn't really want to rely on my lawyer to use AI to write code for me.
> I wouldn't really want to rely on my lawyer to use AI to write code for me.
Yet that is exactly what a lot of C-Suiters (many of whom are lawyers), are doing.
Vice versa there is also a lot of irresponsible programmers doing stupid things with ai. Irresponsible people stay irresponsible, AI just make them more productive at being irresponsible.
im not so sure
i think devs overestimate their own role and underestimate others
i am seeing lawyers and doctors roll out their own software with AI
but we dont have their training and experience
[delayed]
It's like that in engineering, for sure. My background is in aerospace and there are lots of things that a reasonably technically-inclined random can probably do passably. It takes an engineer to know which tasks those are, though.
I would imagine it's similar in law, in that it takes a lawyer or judge to know where the foot guns lie.
I wouldn't consider drafting a will to be harmless. If its done poorly the next of kin could have to deal with a huge headache and potentially months or years of probate proceedings.
I would think that LLMs would be better at avoiding foot-guns. That’s a situation where you have a list of well known rules and potential pit falls, and the work of the lawyer is to apply those to a fact pattern. That’s something that has been hard to automate programmatically, because the fact patterns are similar but different. LLMs, however, seem to excel at applying general principles to differing fact patterns.
I would categorize this in the "expertise that people internalize but never figure out how to verbalize" department, and that is a department we have no way to teach an LLM because if nobody is writing out those unspoken, subconscious rules then the LLM has nothing to read about them in its training data.
But can an LLM come up with questions like what the definition of is is? Seems to me there's a lot of "depends on how you read it" type of stuff that lawyers excel at finding novel interpretations. So what coders thinking of as rules are much less straight forward to understand when it comes to laws
[delayed]
I'm afraid since claude cheats in benches, what will it do with law?
there’s really no limit to how many times and ways you can review something with AI, except dollars.
cannot IMAGINE letting ai write my will rn.
I imagine it's really hard to spot a comma in the wrong place, or a missing sentence in a 10 page contract unless you wrote it yourself, or you assembled it from some battle tested templates.
My best guess is that Gemini was trained on the textbooks that the questions are meant to test against, thus they are probably better at explicit recall of those questions or related questions.
This is a pretty limited introductory course based on what it says in the methods of the paper itself.
That and the research if done by Stanford’s HAI institute with an obvious bias and the paper is curiously missing a conflict of interest statement.
Oh, a "Human-Cented" study by AI lover:
Julian Nyarko
LOL!Yeah this could be interesting. A lot of the spotlight has been on “law firm stuff” like demand letters and writing contracts…
But imagine if a dev team didn’t have to go engineer -> product manager -> legal team to get a question answered on local data retention requirements. You could ship that much faster.
Would you take responsibility for missing details about local data retention requirements?
Yes.
If the only purpose of asking a lawyer is transferring risk (aka cover your ass) while getting the same advice as an LLM, that’s slowing down delivery for purely bureaucratic reasons.
I’ve seen that mentality at big companies where everyone is scared to stick their neck out and be accountable for a decision. And nothing gets done. Drives me crazy.
But the people who move up are the people who take ownership and get shit done (and are right a lot).
(BTW, I have been at companies that were sued by regulators. They never really punish the individual(s) who were in the room when the decision is made. So your worry is kind of misplaced.)
honestly if you just avoid EU and China
you can get away with anything
California too.
And with those three places listed you've ruled out literally 40% of the world economy. Great, you can ship your product in bumfuck Nebraska.
More great news from the prestigious university where 40% of students claim they are disabled
https://fortune.com/article/rise-in-elite-students-seeking-a...
and where they wanted to ban words such as "chief", "stupid", "karen" and "American"
https://reason.com/2022/12/21/stanford-elimination-harmful-l...
Incredible that the common people will be able to wrestle the right to rule of law away from the bloated legal caste, who have built themselves quite the moat.
The inaccessibility of justice is a huge driver of inequality. Any tools which bridge this gap will help make a more just society.
Yes, LLMs are great at search. That's not news.
I think there will be a market for firms that aggressively market themselves as non-AI, and then as more people turn towards that human connection we'll go full circle
Nobody wants to pay their lawyers more than they have to. There will be a huge market for firms that can use AI to avoid charging clients for $1,000/hour junior associates.
that worked out for artists and translators right ?
If you want human connection the legal system is not where you are going to find it, period.
I don't think there will be any such market for "non ai" law. If I'm involved with the legal system I just want out as quick as possible as cheap as possible.
Bad legal advice will keep you dealing with the legal system for much longer and at much greater cost. Something being cheap and quick upfront doesn't mean it will be cheap and quick by the end of the process.
But isn’t this study saying that the legal advice could actually be better with AI?
A bit of extrapolation from the study, but not a crazy stretch.
Maybe, although I would be extremely hesitant to extrapolate from this one study and trust my legal life to an LLM. One thing that's worth noting, though, is that regardless of the quality of objective legal advice in the abstract, for a lot of smaller scale stuff the human connection actually is literally what is important. There are ambiguities in the law, which are not resolved deterministically but rather at the individual discretion of judges. Your lawyer, if they're any good at their job, knows the local judges and how they're likely to rule for given circumstances, which can influence their legal advice to you specifically.
Fair.
But I could also see a world where that, too, is fed to models for hyper-local results.
Could be a way off, but I could see it.
> In a blind evaluation of nearly 3,000 anonymized comparisons, professors rated AI responses significantly higher than answers written by other professors, with AI winning 75% of head-to-head matchups.
75% win rate seems pretty good!
Paper link: https://law.stanford.edu/wp-content/uploads/2026/06/salinas_...
I wonder to what degree the AI was just better at communicating. My experience with attorneys is that they are often some of the worst writers.
I do wish they'd used some more objective criteria. Simply being preferable one of the things LLMs have trained for since the beginning, hence its sycophantic nature.
What criteria would you use for judging legal arguments?
The arguments need to be based on actual law, and any cited reference cases need to be real.
There's been a lot of news stories about lawyers using AI, and then getting in trouble for citing hallucinated laws or cases. It doesn't matter if the AI response is "preferred" over the human one if it gets thrown out when put under the scrutiny of a real case.
Who's gonna determine that? A bunch of law professors?
But did they? Or did they just go off what answer felt better? Did they put in any work to actually confirm the answer? Or did the busy law professors just click through and move on with their life?
maybe seeing if the case law it cited was real or imagined? Just one idea, IANAL
Well, they had the data around if the answer would be harmful to the students learning. AI was scored at 3.5% harmful answers and 12% of law professor answers were considered harmful.
Yeah, 75% win rate is a ~200 points Elo difference, which is quite massive.
AI will never convince a jury though.
A couple of acting classes might be cheaper than a lawyer, then you can go all out representing yourself.
Library outperforms student... more news at 9
Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.
I'm getting more convinced. I mean, sure it makes dumb mistakes sometimes but its a particular set of self serving mistakes, commenting out tests in order to pass. We obv don't want this behavior but I wouldn't say it's dumb.
It'll be like the Turing test, which we just blew past years ago and no one cared. After all the hand-wringing about sentience and rights of the AI if it passes the Turing test, and now we just have AI bots running 24/7 writing slop.
How does everyone else feel?
> Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.
He stands to make billions if enough people believe him — unless you also do, consider that you’re the mark. For example, if that was true, it would have to mean that AI companies either aren’t letting customers use the good models or are instructing them to frequently make errors which reveal a fundamental lack of reasoning ability.
Consider also that his wealth means he hasn’t had to defend an idea stringently since the 90s. I wouldn’t be surprised if he does think LLMs give deep answers because it often looks that way until you critically review the response and ask questions like what’s missing which require you to have a decent understanding of the problem domain.
Knowing the question is half of the answer. LLMs are great at scoping your context and answering precisely what you asked; it's also why they go off the rails when they misunderstand a part of your question. Incidentally, they're great at "knowing" and reaching for knowledge.
Humans have the advantage of perspective. We always lack some knowledge and answer broadly. This is bad if you have a particular goal in mind, but better if you're just generally learning, because you see more and learn to discriminate the correct from the wrong. And most importantly, being wrong is part of human ingenuity - because sometimes we turn something "obviously" wrong into something right.
Marc Andreessen has a strong financial incentive to feel this way and to convince others to feel this way.
I also think it’s easy to think that AI gives good answers if you don’t know the field well. In fields where I know the material, the answers are pretty variable and can be quite bad.
Getting the right answers is just half of it, you need to know the right questions to ask. I haven't yet seen AI crack that one.
He would tell you NFTs were AGIs if it might get you to buy them.
> Marc Andreessen argued that we've already reached AGI. He says that the top AI models give better answers than 99% of people he has access to, and he has access to some of the best people in their field.
Investor with vested interest in AI companies makes claim of reaching "AGI".
He is one of the last people to listen to about AGI. Unless the term "AGI" means something entirely different to him vs to independent researchers vs to CEOs, since the term has become entirely meaningless.
Personally I think this is very good. One of the hardest things out there is maintaining a society in the face of changing times and it's because law is dense and slow.
I think, in the right hands, this could be huge.
It turns out everybody has at least one right hand, even the people we trust the least.
He is basically an AI professor for law. This study just confirms his existence:
https://juliannyarko.com/
Stanford and its donors of course want to replace anyone but its administrators, so they cheer on such anti-intellectual nonsense.
This is the state of HN. Created new account. Accused without evidence. Emotional clickbait.
in mice