New Research Shows That AI Models Taught Legalese Are Surprisingly Efficient

By Tad Vezner

How good of a lawyer should an artificial intelligence system be? More importantly, how good do you want one to be?

When it comes to teaching machines how to understand and utilize language, it turns out that the more legalese that they know, the better, according to a new paper co-authored by Chicago-Kent College of Law Professor and Law Lab Director Daniel Martin Katz.

“It is pretty clear that a legally trained AI system is just going to perform better, but the open question is to identify the precise information diet to feed these models,” Katz states simply.

His paper explores how different large language models (LLMs) were used to solve a variety of tasks.

Pioneered by organizations such as Google, OpenAI, and the Allen Institute, LLMs such as BERT, ELMo, and GPT-3, among others, have grown increasingly popular in the field of natural language processing. Many LLMs have been trained on general language, but the question that Katz and his colleagues sought to explore is how to apply these LLMs to legal tasks. They analyzed several different models to evaluate the performance of LLMs on tasks such as evaluating contracts, including determining whether such contracts were unfair under European Union consumer law.

“A lot of effort in computer science goes into making machines understand language broadly,” Katz says. “How do you train a machine in the language of law? Well, how do you train a person? You send them [to law school] for three years, and you say a lot of words at them. You use words in a variety of contexts. In a real sense, you are training a student’s neural network (their brains).”

That’s what the models tested in Katz’s paper did: They exposed machines to a large corpus of different words and measured how effective those words were at getting the machines to solve tasks.

It turned out that, of the seven different models that were tested, the model that taught legal language got the machines to perform tasks better on average, and not just legal tasks but any type of task.

“The diet of getting legal information when it’s being trained makes it better across all tasks,” Katz says.

The paper has been deemed intriguing enough to be accepted for presentation at the Association for Computational Linguistics 2022 annual meeting in May.

“It’s a rare thing to see a law professor get a paper accepted into a computer science conference,” Katz notes. “It’s the type of place you should take this type of work: a group of people that can actually evaluate its technical merits.”

It’s a research area on the cutting edge of both computer science and the law; it’s an area that Illinois Institute of Technology and Chicago-Kent are uniquely situated to excel in, Katz notes.

“Even though machines are getting good at understanding basic language, it’s a much harder problem to understand specialist languages: medical English or law,” Katz says. “We’re trying to answer: How do we build the scientific infrastructure to have machines understand legal language?”

“Laws and their interpretations, legal arguments and agreements, are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size,” the authors note in the paper, adding that “natural language understanding technologies can be a valuable tool to support legal practitioners in these endeavors.”

Along with Katz, the paper is co-authored by Ilias Chalkidis of the University of Copenhagen, Denmark; Abhik Jana of the Universität Hamburg, Germany; Dirk Hartung of Bucerius Law School, Hamburg, Germany; Michael Bommarito of CodeX, Stanford Law School; Ion Androutsopoulos of the Athens University of Economics and Business, Greece; and Nikolaos Aletras of the University of Sheffield, United Kingdom.

Photo: Chicago-Kent College of Law Professor and Law Lab Director Daniel Martin Katz