Robolawyers Are As Good As Humans, Say Researchers

Posted on Categories Discover Magazine

Back in February 2023, the international law firm Allen & Overy gave its lawyers access to an AI chatbot to help them draft contracts. Almost immediately, this AI system began saving the company’s 3,500 lawyers around two hours per week. Later in the year, the company announced an AI system capable of contract negotiations, saving up to seven hours per negotiation.

What is less clear is just how significant these savings really are. In particular, how does the work of a Large Language Model stack up against the work of a junior lawyer in standard tasks such as drafting or reviewing a contract?

AI v Humans

Now we get an answer of sorts thanks to the work of Lauren Martin and colleagues at the AI Center of Excellence at the legal tech company Onit. Martin and co have directly compared the work of junior lawyers with that of Large Language Models and say the machines significantly outperform their human counterparts. “Large Language models stand poised to disrupt the legal industry, enhancing accessibility and efficiency of legal services,” say Martin and co.

The team come to this conclusion by asking senior lawyers to point out important legal issues in a set of real-world procurement contracts and then asking Large Language Models and junior lawyers to evaluate the same contracts.

The results were eye-opening. Large Language Models like Claude and GPT-4 matched or even exceeded human precision in identifying legal matters. When pinpointing the specific parts of contracts relating to these issues, the AI lagged slightly behind humans.

But speed-wise, the contrast was staggering. The fastest Large Language Model whipped through contract reviews in under a minute, while junior lawyers took 56 minutes on average.

Martin and co say this doesn’t include the 16 hours or so that it takes to train an AI system. But they point out that this time is roughly equal to the time it takes to instruct a junior lawyer to do the same task. “This equivalence in preparatory time challenges the notion that Large Language Models’ speed advantage is offset by their setup requirements,” say Martin and co.

And the potential cost savings were similarly immense. “While a Junior Lawyer incurs an average cost of 74 dollars per contract review, the fastest Large Language Model performed the same task for approximately 2 cents,” they say.
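To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers reported above (74 dollars versus about 2 cents, and 56 minutes versus under a minute). Because the fastest model's time is given only as "under a minute", the speed figure is a lower bound. The batch of 1,000 contracts is a hypothetical workload chosen for illustration and does not come from the research.

```python
# Figures quoted in the article, per contract review.
JUNIOR_COST_CENTS = 7400       # 74 dollars, expressed in cents
LLM_COST_CENTS = 2             # "approximately 2 cents"
JUNIOR_MINUTES = 56            # average junior lawyer review time
LLM_MINUTES_UPPER_BOUND = 1    # "under a minute", so treat 1 as an upper bound

# Hypothetical workload for illustration only (not from the article).
HYPOTHETICAL_CONTRACTS = 1000


def ratio(human, machine):
    """Return how many times larger the human figure is than the machine figure."""
    return human / machine


# Cost ratio: how many times more a junior lawyer's review costs.
cost_ratio = ratio(JUNIOR_COST_CENTS, LLM_COST_CENTS)

# Speed ratio: a lower bound, because the model took less than one minute.
min_speedup = ratio(JUNIOR_MINUTES, LLM_MINUTES_UPPER_BOUND)

# Dollar difference across the hypothetical batch of contracts.
savings_usd = (JUNIOR_COST_CENTS - LLM_COST_CENTS) * HYPOTHETICAL_CONTRACTS / 100

print(f"Cost ratio: about {cost_ratio:.0f} to 1")
print(f"Speed ratio: at least {min_speedup:.0f} to 1")
print(f"Difference over {HYPOTHETICAL_CONTRACTS} contracts: ${savings_usd:,.2f}")
```

These ratios leave out the setup time discussed above and the senior-lawyer oversight the researchers say is still needed, so they describe the raw per-review gap rather than a full cost comparison.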

Proceed With Caution

“The implications on the legal industry are profound and multifaceted,” say the researchers. Junior lawyers are likely to be given more high-value work earlier in their careers, they suggest, while routine tasks are more likely to be handed to AI systems.

But not all is rosy. The analysis cautions that AI still falls short of humans in nuanced legal understanding. Models can also “hallucinate” by raising nonexistent issues in contracts. So oversight from experienced lawyers remains crucial.

Much work is still needed to evaluate Large Language Models across more jurisdictions and more contract types. But the research clearly reveals AI’s viability for core legal tasks. Law is changing quickly, as Allen & Overy are already showing. This research, which is funded by the legal tech firm Onit, puts some more numbers on that. The era of robot lawyers has clearly arrived.

Ref: Better Call GPT, Comparing Large Language Models Against Lawyers: https://arxiv.org/abs/2401.16212

This post was prepared with the assistance of Claude.AI
