Lawyer Versus Robot

With an accuracy of 86.6%, compared to the lawyers’ accuracy of 62.3%, CaseCrunch emerged victorious. This experiment is a step towards General Legal Intelligence and challenges the status quo of Legal AI.

Format of the Competition

Throughout October, 112 lawyers pre-registered to participate in the Lawyer Challenge, which ran from 20th to 27th October. They were presented with factual scenarios of PPI mis-selling claims and asked to predict, “yes or no”, whether the Financial Ombudsman would uphold the claim. The same factual scenarios were given to CaseCrunch; whoever achieved the highest accuracy won. The participants submitted 775 predictions.

A Technical Judge and a Legal Judge independently verified the fairness of the competition. The Legal Judge was Felix Steffek (LLM, PhD), University of Cambridge Lecturer in Law and Co-Director of the Centre for Corporate and Commercial Law. The Technical Judge was Ian Dodd, UK Director of Premonition.

The factual scenarios were real decided cases from the Financial Ombudsman Service, published under the FOIA. All identifying details – such as the names of the parties, case names and dates – were removed, leaving only the facts. Lawyers completed their predictions in an unsupervised environment and were able to use all available resources. PPI mis-selling was chosen as the basis of the competition because it matched the background of most lawyers taking part in the challenge and is an area of law that is easier to learn about than others. Participants were given links to the Financial Conduct Authority’s rules detailing the basis of an Ombudsman’s decision.

Dr Steffek attested: “The factual descriptions of the problems set by the Financial Ombudsman Service are a reasonable basis for a prediction about PPI mis-selling complaints being upheld or rejected by the Ombudsman at an early stage in the advisory process. Trained lawyers from commercial London law firms, using all the tools and resources they usually work with, are able to make reasonable predictions about these problems at this point even though the information given per claim varies and further information might be revealed at later stages.”

Results

112 lawyers competed in the Challenge, ranging from Magic Circle partners to barristers and in-house counsel. Participating law firms included Bird & Bird, Kennedys, Weightmans, Allen & Overy, Berwin Leighton Paisner, DLA Piper, DAC Beachcroft, and more. “Teams” were entered from large firms including Pinsent Masons and Eversheds Sutherland.

The lawyers achieved an accuracy of 62.3%. CaseCruncher Alpha (the system entered into the competition by CaseCrunch) achieved a validation accuracy of 86.6%.
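For readers who want the scoring spelled out, the following is a minimal sketch of how accuracy on yes/no predictions is conventionally computed: the share of predictions that match the decided outcome. It is not CaseCrunch's scoring code. The function name `accuracy` and the toy data are assumptions made purely for illustration, and the article does not state whether the lawyers' figure was pooled across all predictions or averaged per participant.

```python
def accuracy(predictions, outcomes):
    """Return the fraction of yes/no predictions that match the decided outcomes."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    # Count the positions where the prediction equals the actual decision.
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(predictions)


# Made-up toy data, NOT taken from the Challenge: True = complaint upheld.
toy_outcomes = [True, False, True, True, False]
toy_predictions = [True, False, False, True, True]
toy_accuracy = accuracy(toy_predictions, toy_outcomes)
print(f"{toy_accuracy:.1%}")
```

The same measure applies to a human participant or a system, which is why the two headline percentages can be compared directly.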

The Challenge – Thoughts From the Team

Ian Dodd, UK Director of Premonition and the competition’s Technical Judge: “The session I observed produced an accuracy of 86.6%. It would also be interesting to put a £ value on the processing cost. The real number of ‘Human: 62.3% at £300 p/h and X hours’ compared to ‘AI: 86.6% at £17 p/h and X hours’ is the true bottom line.”

Jozef Maruscak, Managing Director: “We could not be happier about the outcome. We are grateful to all involved parties, especially competing lawyers who were not afraid to participate. We are not necessarily adversaries in this game; systems like ours can make the legal world more effective for everyone. I am convinced that we have now reached the point where our technology and expertise allow us to satisfy both our vision and our commercial interests. We are looking forward to finding solutions for our clients now.”

Rebecca Agliolo, Marketing Director: “Ultimately, the Challenge wasn’t about ‘winning or losing’; it was about showcasing the potential of artificial intelligence and changing the current paradigm not by talking, but by doing. The Lawyer Challenge started as an idea, and spiralled into a vision. Like any vision, it can’t belong to a single person. We hope that the Challenge will be replicated and improved – and we are proud to get the ball rolling.”

Scientific Director, Ludwig Bull: “Evaluating these results is tricky. These results do not mean that machines are generally better at predicting outcomes than human lawyers. These results show that if the question is defined precisely, machines are able to compete with and sometimes outperform human lawyers. The use case for these systems is clear. Legal decision prediction systems like ours can solve legal bottlenecks within organisations permanently and reliably.”

Ian Dodd: “After my well-known and uncomplimentary views on legal awards ceremonies, I never thought that I'd be confessing that I was at one. What's more, I was a judge and I even announced the results. However, this was no ordinary awards ceremony. It featured a real competition – one with numbers, facts and statistics. It was CaseCrunch’s Man vs. Machine challenge to predict the outcome of a series of legal cases. In other words, it was real; unlike most legal awards ceremonies. Those who think this sort of stuff is a gimmick might like to think again.”

Tom Dent-Spargo talks to the team about the potential of AI in the legal market.

Are you at CaseCrunch surprised by the results?

Rebecca Agliolo: “To an extent. The Challenge started out as an idea, and became a vision. We had no idea how it would turn out – or even whether we could get lawyers to participate in the first place. We spent months training the CaseCrunch system, and saw its high accuracy levels. If anything, we were more surprised by the lawyers’ results than CaseCrunch’s.”

Will we see a rise in the adoption of legal tech given these results?

Rebecca Agliolo: “We certainly hope so. We realised that there are two main obstacles to the substantive advancement and adoption of legal AI.

“Firstly, that the capabilities of AI are shrouded in misconceptions, as lawyers and journalists are fundamentally asking the wrong questions. Secondly, there has been too much talking and too little ‘doing’. Ultimately, the challenge wasn’t about ‘winning or losing’ – it was about showcasing the potential of artificial intelligence and changing the current paradigm not by talking, but by doing.”

What future developments are there for CaseCrunch?

Rebecca Agliolo: “We will continue evolving and perfecting our technology, and scope increasing areas of application, both in the legal industry and beyond.”

How Will Artificial Intelligence be Employed in the Legal Profession?

Felix Steffek: “The employment of artificial intelligence in the legal profession is still in its infancy. The cost-benefit advantage of artificial intelligence will guarantee its future place in legal services. It is too early, however, to reliably predict the extent to which it will replace jobs. This will depend on whether artificial intelligence will be involved in writing contracts or only in solving disputes, whether artificial intelligence will remain limited to descriptive analysis or whether it will be capable of evaluating rules and events, and whether it will be a tool that intermediaries use to enhance their services or whether it will replace intermediaries.

“The result of the artificial intelligence vs the lawyers competition is certainly impressive. The CaseCruncher Alpha predicted the outcome of dispute resolution with an accuracy of 87% while the lawyers scored an accuracy of 62%. I would, however, suggest not to put too much emphasis on the exact figures. Both sides could have achieved better or worse results under different conditions. The artificial intelligence might have benefited from more computing power. The lawyers’ results might have improved if only experts in PPI claims as opposed to commercial lawyers generally participated. The questions for the future are rather: what are the specific strengths of artificial intelligence and where do lawyers excel? What are the costs of the services provided by artificial intelligence compared to the costs of the legal profession? And most importantly, what services do customers want?”