Art Credits: All artwork was created with OpenAI’s DALL·E 2 using the “watercolor sketch” style. The specific prompt is listed in the caption of each image.

A robot in the form of the scales of justice (variation)

Despite big promises of courtroom appearances in early 2023, DoNotPay, an AI-powered robot lawyer, won’t be at a defense table near you anytime soon. A bot can pass the bar, but it has quite a few limitations to overcome before it’s ready to make its opening argument. For starters, the tech isn’t allowed in most courtrooms. As reported by CBS News, “Of the 300 cases DoNotPay considered for a trial of its robot lawyer, only two were feasible” under the current rules.

Those kinds of courthouse rules—not to mention legal decorum—are slow to change, as are state licensing organizations. This time it was the state bar that stopped DoNotPay’s courtroom appearance, arguing that the robot’s instructions would constitute the unlicensed practice of law. Regardless of one's feelings about his company, though, DoNotPay CEO Joshua Browder makes an excellent point:

"ChatGPT is very good at holding conversations, but it's terrible at knowing the law. We've had to retrain these AIs to know the law.…AI is a high school student, and we're sending it to law school." But even then, it isn’t yet permitted to practice the law on its own.

Robot lawyer looking confused

The truth is, right now these AI-powered tools are a Swiss Army knife: they do lots of things, but none of them very well. Still, the day is coming. Even today, you can potentially save some time by applying AI tools to the right problems—freeing yourself up to focus on the high-level legal work that a bot can't match: nuanced critical thinking, knowing the law, knowing the judge, knowing your clients, and understanding the complex interactions between the law and the real world.

What is “AI”?

The term "artificial intelligence" often confuses people. This is especially true of text-generating AIs like ChatGPT, whose responses certainly sound intelligent.

In general, when we say something is "intelligent," we're saying that it perceives the world in some way, is able to remember what it perceived, and can do something with that memory. Basically, it "knows stuff."

In contrast to a person or dog or parrot or dolphin, an AI chatbot like ChatGPT doesn't "know" anything. Instead, it ingests extremely large collections of data, calculates how common given sequences of words are in that data, and extrapolates likely relationships between those words and phrases. In the case of ChatGPT, the data consists of huge amounts of text: software documentation, Wikipedia entries, Internet forums and message boards, published books and webpages—possibly including the webpage for your business—and more.

For example, if you process enough text, you'll start to notice that "dog" and "leash" often occur next to each other, as do "dog" and "hot." You'll further note that in sentences where "hot" and "dog" occur next to each other, "eating" is often present. Meanwhile, sentences with "dog" and "leash" are much more likely to include the verb "walking." Your model of this data set will begin to reflect that "dog leash" is potentially a "walking"-related phrase, while "hot dog" is more often associated with "eating."

But if you then go on to process a batch of travel blog posts about dog-friendly parks near great hot dog stands, your chatbot’s idea of what “dogs” are for will likely start to tip in a disturbing direction.
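
To make that counting idea concrete, here is a toy sketch in Python. It is our own illustration, not anything drawn from ChatGPT's actual code (which relies on a neural network trained on vastly more data), and the four-sentence "corpus" is invented for the example. It simply tallies which words show up alongside "dog leash" versus "hot dog."

```python
from collections import Counter

# A toy "training corpus." Real models ingest billions of sentences.
corpus = [
    "I put the leash on the dog before walking to the park",
    "walking the dog on a leash is required in this park",
    "we were eating a hot dog at the baseball game",
    "he kept eating his hot dog while we walked",
]

# Tally every word that appears in the same sentence as each phrase.
neighbors = {"dog leash": Counter(), "hot dog": Counter()}

for sentence in corpus:
    words = sentence.lower().split()
    if "dog" in words and "leash" in words:
        neighbors["dog leash"].update(words)
    if "hot dog" in sentence.lower():
        neighbors["hot dog"].update(words)

# Compare the verbs: "walking" pairs with leashes, "eating" with hot dogs.
for phrase in ("dog leash", "hot dog"):
    print(phrase, "->", {verb: neighbors[phrase][verb] for verb in ("walking", "eating")})
```

Feed the same tally a stack of posts about dog-friendly parks next to great hot dog stands and the two columns start to blur together, which is exactly the skew described above.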

Dog in a hotdog bun

Rather than call this "AI-written text," it's more accurate to think of it as "statistically generated output" that happens to come in the form of witty answers to odd questions.

Large Language Models (LLMs)

As Luis Ceze, a computer science professor at the University of Washington and CEO of the AI start-up OctoML, recently told Will Oremus at the Washington Post: “These [AI] models aren’t thinking…What they’re doing is a very, very complex lookup of words that figures out, ‘What is the highest-probability word that should come next in a sentence?'” 

A sea of letters

In other words, as Oremus explains, AI "mimics natural, humanlike interactions, even though the chatbot doesn’t have any self-awareness or common sense."  

Models like these, which indiscriminately reflect the statistical relationships found in human-written texts, are known as "large language models" (or "LLMs"). 

Earlier language models were used to build tools you use every day, like Google Translate, grammar checkers, and autocomplete. In fact, far simpler language models have been generating realistic-sounding text in much the same way as ChatGPT since the late 1980s—albeit usually just as a novelty; look into the history of "Mark V. Shaney" for the first well-documented instance.
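
Mark V. Shaney was powered by a Markov chain, which is essentially a lookup table recording which words tend to follow which. The sketch below is our own illustration of that technique, not the original program, and "training_text.txt" is a placeholder for any large plain-text file you care to feed it. Modern LLMs replace the lookup table with a neural network and billions of parameters, but the basic loop of picking a statistically plausible next word, over and over, is recognizably the same.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each `order`-word prefix to every word that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=40):
    """Start from a random prefix, then repeatedly pick a plausible next word.
    Frequent continuations appear more often in the follower lists, so they
    are chosen more often."""
    order = len(next(iter(chain)))
    output = list(random.choice(list(chain)))
    for _ in range(length):
        followers = chain.get(tuple(output[-order:]))
        if not followers:
            break  # this prefix never continued in the training text
        output.append(random.choice(followers))
    return " ".join(output)

# "training_text.txt" is a placeholder: any large plain-text file will do.
text = open("training_text.txt", encoding="utf-8").read()
print(generate(build_chain(text)))
```

The result reads as locally fluent but globally aimless, which is a handy mental model for what happens when today's far larger models drift away from the facts.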

What Are the Pitfalls of Using AIs in the Law Office?

To highlight the potential pitfalls of using systems like ChatGPT in your law practice, we ran an experiment: Modern Firm used ChatGPT to write a lengthy blog post about civil commitment in the state of Minnesota.

If you want to try this yourself, you can use our prompt: "Can you write a blog post on the topic of 'civil commitment' in the state of Minnesota and how an attorney can be helpful?" Just note that your results will likely differ from ours: ChatGPT's responses are partly random, so the same prompt rarely produces the same text twice.
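
If you would rather script the experiment than paste the prompt into the ChatGPT website, a rough sketch using OpenAI's Python client looks something like the following. Treat it as an assumption-laden example rather than a recipe: the model name is a placeholder for whichever model your account offers, you will need your own API key, and the API will not necessarily return the same text the website does.

```python
# Assumes the `openai` Python package (v1 or later) is installed and an
# OPENAI_API_KEY environment variable is set. A sketch, not the exact
# process used for our test article.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Can you write a blog post on the topic of 'civil commitment' in the "
    "state of Minnesota and how an attorney can be helpful?"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder: substitute the model you have access to
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```

However you run the prompt, treat whatever comes back as a rough draft to be checked line by line, for the reasons below.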

We then shared the resulting blog with a team of professional copywriters. One of the copywriters has had a hobby interest in statistically and programmatically generated text since the late 1990s. The other is a retired public defender. 

They identified four key pitfalls with using the current generation of AI tools in law offices. 

1. Plagiarism

Robot stealing an idea from a person

Many outlets have reported on AI-based text generators plagiarizing the work of others (here's just one example). Our ChatGPT-generated blog post was no exception. It failed most online plagiarism checkers, with one tool scoring it at just "66.9% original" and finding that it borrowed heavily from only a handful of sources. More distressingly, ChatGPT was not very selective: none of the content appeared to be taken from reputable sources. It did not appear to take any material from official Minnesota records or even other lawyers' websites. Instead, 16.1% of the article seemed to rely on the legal experts at "moviecultists.com."

2. Inaccuracy and Missed Nuance

Robot tripping on acid

This is another widely reported issue with AI-generated text: in short, chatbots lie. Well, "lie" is a bit strong; a lie requires intent, and, as we discussed earlier, tools like ChatGPT cannot have intent any more than a spreadsheet can. The term of art for an AI system that makes things up is that the system is “hallucinating.” Cognitive scientist and AI researcher Gary Marcus put it more bluntly in a recent 60 Minutes piece: what these bots often produce is “authoritative bullshit.”

Whatever you call it, these bots make lots of mistakes. Some are simple errors, such as the one we found in our test article, where ChatGPT repeatedly insisted that the statute in question had 25 sections; in reality, it has 24.

More concerning than these simple factual errors are the bot's subtle mistakes, ones that might only be detected by someone with expertise in the field the bot is commenting on. For example, ChatGPT repeatedly used "gravely disabled" in its answer, going so far as to define the phrase after asserting:

"Under the Minnesota Commitment and Treatment Act, an individual can be involuntarily committed if they are mentally ill and a danger to themselves or others, or if they are gravely disabled." 

This would lead most casual readers, and even some with legal training, to believe that "gravely disabled" must be an important term in the Minnesota Commitment and Treatment Act.

But the term "gravely disabled" appears nowhere in the act. The phrase—which is common in other states’ commitment statutes—does appear elsewhere in Minnesota case law, which explains why ChatGPT included it. So, some might argue this is less an error and more a case of missing nuance. An actual lawyer would want to accurately capture that nuance, saying something more along the lines of: 

"Under the Minnesota Commitment and Treatment Act and the caselaw, regulations, and local practices that inform and interpret it, an individual can be involuntarily committed if they are mentally ill and a danger to themselves or others, or if they are gravely disabled."

3. Bad Writing

Robot hands typing

ChatGPT invariably produces bad writing. Yes, the writing itself is smooth, with clean grammar and punctuation. In fact, although Grammarly flagged our ChatGPT-generated article for plagiarism, it gave it high marks for grammar, punctuation, and readability. 

But even Grammarly dinged the post for needless wordiness, poor word choice, and stylistic inconsistencies.

Our copywriters pointed out that part of the reason ChatGPT can seem to say so much is that it leans heavily on many of the same tricks and crutches employed by bad writers. Most prominently:

  • weasel words—These are unattributed, and usually unattributable, claims to authority, e.g., "Many believe that…" or "research shows…"
    • Good writing specifies who believes this—"42% of doctors surveyed…" or "Bob Loblaw of the American Raconteurs Society explained…"—or links to the research it mentions.
  • puffery (also called "peacock phrases")—These are casual overstatements that lack any real meaning, e.g., "most important," "highly influential," etc.
    • Instead, omit the fluff and include verifiable credentials or achievements, such as "Pulitzer Prize–winning journalist…" or "three-time state spelling bee champion…"
  • vagueness—Failing to go into detail, even when the topic demands it, e.g., recipes that lack cooking times, temperatures, or amounts.
    • Good writing informs.
  • a lack of focus—Drifting between closely related topics, switching perspectives, or abruptly shifting audiences.
    • Good writing speaks clearly to a specific audience about a specific topic.

These aren't just aesthetic or style concerns. Bad writing habits confuse readers or leave gaps that let readers "fill in the blanks" with what they want to believe. Those readers then make bad decisions based on incomplete understanding or wishful thinking. Either way, they're likely to ultimately decide they were misled or poorly advised—bad news for any law practice.

4. Accurate, but Unrealistic

A robot attempting to fall in love

When we asked ChatGPT to write about civil commitment and how an attorney can be helpful, it insisted on focusing tightly on how a person undergoing civil commitment might fight the process, even when we rephrased the question.

Our retired public defender pointed out that this was not the only position or role for an advocate in such a situation. In her experience, few people undergoing civil commitment had the money, mental capacity, or resources to raise such a defense. When lawyers are involved in a civil commitment, they could just as easily be assisting the family or loved ones of an individual who appears to be a danger to themselves or others.

Safely Harnessing the Power of AI for Lawyering

Taking all of these flaws and shortcomings into account, you can still put ChatGPT to work for you now—as long as you supervise it closely and keep your expectations in check. Read our piece on Safely Harnessing the Power of AI for Lawyering to learn how!