AI Lawyers: A Guide To Fully Automated Luxury Lawyers


I am excited about the future of AI; it’s why I work for Genie. It has always been, and continues to be, our mission to enable business leaders to take control of their legals.

A fully automated luxury "AI Lawyer" is undoubtedly a long way off. Despite this, things are moving quickly. It's hard to overstate the progress researchers have made in natural language processing in recent years.

At Genie, we've long hypothesised about what a truly intelligent AI Lawyer must be able to do. Ignoring the behavioural changes required, here is my stab at the general cognitive capabilities required, and where I think the research is. In some places the research is lacking; in some places the research is there but the engineering isn't; in others the research and engineering are there, but adoption is lacking. We'd love to hear your thoughts (email me).

In this post, I break down the cognitive capabilities required by a lawyer, and how they work with clients. I then look at the state of current technology and how the capabilities required by an AI Lawyer might be achieved by technology.

I think that...

An AI lawyer should explain its decisions (Explainable Machine Learning).

Users deserve better than this when using AI

The research community has belatedly started to address explainability in AI over the last few years. Much of the motivation for explaining machine learning models seems to come from regulation, but we think this misses the point. At Genie, we believe the most compelling reason for mandating machine learning explanations is for collaboration with users.

If you were to ask for legal advice on an employment matter like “can we terminate somebody's contract?”, and the lawyer simply responded with “Yes”, only a maniac would take them at their word. You would likely ask the lawyer “Why?”, and if the lawyer responded “because they are on a fixed length contract that has already terminated”, you’d likely double check this. If this information was incorrect, then you would know the lawyer's advice was based on incorrect information. Furthermore, if the lawyer were to refer to a completely irrelevant part of the contract, you’d likely question their judgement. At Genie, we believe that machine learning models should be able to collaborate with humans in this way. Human collaboration with machines is much more powerful than replacement!

Few machine learning models are truly able to explain their decisions, with notable exceptions including decision trees and logistic regression. LIME (Ribeiro et al. 2016) is arguably the seminal paper in explaining black box models. This method, along with SHAP and Anchors, requires generating perturbations of an input, finding the corresponding outputs, and modelling the influence of the different input features on the outputs.
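To make the perturb-and-query idea concrete, here is a minimal sketch. It is not LIME itself (LIME fits a weighted linear surrogate model around the input); it is a simplified stand-in that randomly drops words and compares the model's average output with and without each word. The classifier `toy_model`, its scores and the example sentence are all invented for illustration.

```python
import random

def toy_model(words):
    """Hypothetical black-box classifier: probability that termination is allowed."""
    score = 0.1
    if "fixed" in words:
        score += 0.4
    if "expired" in words:
        score += 0.4
    return score

def explain(words, model, n_samples=500, seed=0):
    """Estimate each word's influence by randomly dropping words and comparing
    the model's average output when the word is present versus absent."""
    rng = random.Random(seed)
    present_sum = [0.0] * len(words)
    present_n = [0] * len(words)
    absent_sum = [0.0] * len(words)
    absent_n = [0] * len(words)
    for _ in range(n_samples):
        # Each sample is one perturbed input and costs one model inference.
        mask = [rng.random() < 0.5 for _ in words]
        output = model([w for w, keep in zip(words, mask) if keep])
        for i, keep in enumerate(mask):
            if keep:
                present_sum[i] += output
                present_n[i] += 1
            else:
                absent_sum[i] += output
                absent_n[i] += 1
    influence = {}
    for i, word in enumerate(words):
        if present_n[i] == 0 or absent_n[i] == 0:
            influence[word] = 0.0
        else:
            influence[word] = present_sum[i] / present_n[i] - absent_sum[i] / absent_n[i]
    return influence

sentence = "the fixed term contract has expired".split()
influence = explain(sentence, toy_model)
top_words = sorted(influence, key=influence.get, reverse=True)[:2]
print(top_words)
```

Note that `n_samples` is the number of times the model is called for a single explanation, which is where the cost of this family of methods comes from.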

A single accurate explanation of a prediction using these methods requires several hundred (or even thousands of) inferences (forward passes through the model). This is costly. Furthermore, the explanations are only local explanations, i.e. single predictions for a model are explained rather than the general properties of the model. Genie has therefore invested a lot of time into models which have a level of interpretability generally and can generate good explanations for individual predictions.

We have teamed up with Marta Kwiatkowska’s team at Oxford. Most of our work has been around robustness. We define a robust classifier to be a classifier that is unlikely to fall victim to adversarial examples, i.e. small, trivial perturbations to the input which change the output. We have developed novel methods of quantifying robustness and of using it as the basis for explanations. Our paper for the 2020 Conference on Empirical Methods in Natural Language Processing adopts maximal safe radius (see below diagram) as a measure of robustness and looks to calculate an upper bound on this using a Monte Carlo Tree Search.

This image shows the decision boundary for a classifier. A robust AI Lawyer needs to have a large maximal safe radius
Maximal safe radius illustrated for perturbing a sentence by changing the word “strange” to “odd”, “creativity” and “timeless”. If you change the word "strange" to any word within the maximal safe radius, it will not change classification.
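As a rough illustration of why a single label-flipping substitution gives an upper bound on the maximal safe radius, here is a toy sketch. It uses an exhaustive search over three candidate words rather than the Monte Carlo Tree Search used in the paper, and the 2-D embeddings, the linear classifier and which words flip the label are all invented for the example.

```python
import math

# Hypothetical 2-D word embeddings, invented for illustration only.
EMBEDDINGS = {
    "strange": (0.0, 0.0),
    "odd": (0.3, 0.1),
    "creativity": (1.5, 0.5),
    "timeless": (2.0, 0.5),
}

def classify(vector):
    """Toy linear classifier: label 1 if x + y > 1.0, otherwise label 0."""
    return 1 if vector[0] + vector[1] > 1.0 else 0

def safe_radius_upper_bound(word, candidates):
    """Return the distance to the nearest candidate that flips the label.
    If a substitution at distance d changes the classification, the maximal
    safe radius cannot be larger than d, so d is an upper bound."""
    original = EMBEDDINGS[word]
    original_label = classify(original)
    best_distance, best_word = math.inf, None
    for candidate in candidates:
        vector = EMBEDDINGS[candidate]
        if classify(vector) != original_label:
            distance = math.dist(original, vector)
            if distance < best_distance:
                best_distance, best_word = distance, candidate
    return best_distance, best_word

bound, flipping_word = safe_radius_upper_bound("strange", ["odd", "creativity", "timeless"])
print(flipping_word, round(bound, 3))
```

In a real vocabulary the candidate set is far too large to enumerate across every word position, which is why a guided search is needed to tighten the bound.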

We’ve also worked on a novel approach to explaining the decisions of state of the art transformer methods, and on how to leverage causality in explanations (more on both of these coming in the near future).

How well can neural networks explain their decisions? Genie's mark: 7/10 - simple explanations work well.

An AI Lawyer should maintain client privacy (Computational Privacy)

Our competitors' legal tech platforms' privacy.

Client confidentiality is one of the bare minimums expected from a lawyer. Lawyers shouldn’t tell you who their other clients are, or what their clients are doing. Automated lawyers shouldn't either.

Fundamentally, when dealing with contracts, lawyers use past precedents and adapt them for new clients' needs. At Genie, we know that this essential flow of using past data to advise new cases doesn't need re-inventing, and in fact, we embrace this practice 🎉. However, past documents need to be de-identified before being shared. Even documents fed into a machine learning model can leak significant amounts of data (imagine the regulatory consequences of a sequence to sequence text generation model that spits out private training data where the task includes sensitive legal text).

Privacy preserving machine learning (where the model doesn't have access to the raw underlying data) has made great leaps in recent years. Differential privacy adds calibrated noise so that individual records in the training data cannot be re-identified; multi-party computation and fully homomorphic encryption are cryptographic techniques that allow a machine learning model to train on data without seeing the underlying data; and Federated Learning trains models in a distributed manner (avoiding the direct transfer of private data). However, these technologies are still immature, and still don’t allow a lawyer to share human readable content.
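To give a flavour of the noise-adding idea behind differential privacy, here is a minimal sketch of the Laplace mechanism applied to a simple count. This is an assumption-laden illustration rather than how a model would be trained privately (there, noise is typically added to gradients instead). The records, the `has_non_compete` field and the choice of epsilon are all hypothetical.

```python
import random

# Hypothetical records: one entry per contract in a private corpus.
CONTRACTS = [
    {"client": "A", "has_non_compete": True},
    {"client": "B", "has_non_compete": False},
    {"client": "C", "has_non_compete": True},
    {"client": "D", "has_non_compete": True},
]

def has_non_compete(record):
    return record["has_non_compete"]

def private_count(records, predicate, epsilon, rng):
    """Release a count with Laplace noise of scale sensitivity / epsilon.
    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1."""
    true_count = sum(1 for record in records if predicate(record))
    scale = 1.0 / epsilon
    # The difference of two exponential draws with the same rate is Laplace distributed.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

rng = random.Random(0)
noisy = private_count(CONTRACTS, has_non_compete, epsilon=1.0, rng=rng)
print(noisy)
```

A smaller epsilon means more noise and stronger privacy, at the cost of a less useful answer. Note that the output here is a number, not human readable text, which is exactly the limitation described above.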

At Genie we have therefore concentrated on direct text modification techniques. We have teamed up with the computational privacy team at Imperial College to create novel techniques to modify text to avoid the leakage of private information. We know that not all privacy leaks are explicit in text (like “Acme Corp will pay FooBar Inc. $1M”). Many privacy leaks come from "what's left". We are particularly interested in authorship attribution: can an adversary figure out who wrote a piece of text from the patterns of language they use? (As an example, check out the subfield of native language identification, which has shown that we unintentionally leave hints about our identity in the text we write.) Stay tuned for an upcoming paper on the topic.

How well can an AI Lawyer maintain client privacy? Genie's mark: 6/10 - major engineering challenges remain, but the theory is there.

An AI Lawyer should know when it doesn't know (Bayesian Deep Learning)

The wrong reaction to unseen data

The correct reaction to unseen data

When you go to a lawyer and ask for a piece of advice that is outside of their expertise, do they give you an answer with complete confidence? Obviously not. Unfortunately, machine learning models are generally bluffers: they’ll give you full confidence based on no evidence.

It is naive to link this too closely to calibration (how far your predicted probabilities are from the true probabilities), which we already know is way off in modern neural networks. Uncertainty isn’t model confidence. A machine learning model may be very certain that the probability of an outcome is 50%, and very uncertain about an outcome it predicts with high probability.
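For reference, calibration in the sense used above is often summarised with expected calibration error: bin predictions by confidence and compare average confidence with actual accuracy in each bin. The sketch below is a minimal version under my own assumptions (equal-width bins, a made-up set of predictions), not a measurement of any real model.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between confidence and accuracy across confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for confidence, is_correct in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, is_correct))
    total = len(confidences)
    error = 0.0
    for members in bins:
        if not members:
            continue
        average_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        error += len(members) / total * abs(average_confidence - accuracy)
    return error

# A model that says "90% sure" four times but is right only twice is overconfident.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, False, False])
print(overconfident)  # roughly 0.4
```

A low score here only says the stated probabilities match observed frequencies on data like the test set. It says nothing about how the model behaves on inputs unlike anything it has seen, which is the uncertainty this section is about.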

An AI Lawyer therefore needs a way of measuring uncertainty. If a model hasn’t seen many examples similar to the one you are giving it, it needs to tell us. Gal & Ghahramani 2016 gave us a method of using dropout as a Bayesian approximation to measure uncertainty. Many extensions of this work have followed (Gal followed up in 2020 with this for text), but this family of methods requires many inferences to be run at prediction time to get a decent approximation of uncertainty. This is computationally intensive and impractical.
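The dropout idea can be sketched in a few lines: keep dropout switched on at prediction time, run the same input through the network many times, and treat the spread of the outputs as a proxy for uncertainty. The tiny network below has hand-picked weights that I made up for illustration; a real use would apply this to a network actually trained with dropout.

```python
import math
import random
import statistics

# Hypothetical fixed weights for a tiny network with one hidden layer.
HIDDEN_WEIGHTS = [[0.8, -0.5], [0.3, 0.9], [-0.6, 0.4], [0.5, 0.5]]
OUTPUT_WEIGHTS = [1.2, -0.7, 0.9, 0.4]

def forward_with_dropout(features, drop_prob, rng):
    """One stochastic forward pass: each hidden unit is dropped with probability drop_prob."""
    hidden = []
    for weights in HIDDEN_WEIGHTS:
        activation = max(0.0, sum(w * x for w, x in zip(weights, features)))  # ReLU
        keep = rng.random() >= drop_prob
        # Rescale kept units so the expected activation is unchanged.
        hidden.append(activation / (1.0 - drop_prob) if keep else 0.0)
    logit = sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden))
    return 1.0 / (1.0 + math.exp(-logit))

def mc_dropout_predict(features, n_passes=200, drop_prob=0.5, seed=0):
    """Return the mean prediction and its standard deviation over many passes."""
    rng = random.Random(seed)
    samples = [forward_with_dropout(features, drop_prob, rng) for _ in range(n_passes)]
    return statistics.mean(samples), statistics.stdev(samples)

mean_probability, uncertainty = mc_dropout_predict([1.0, 2.0])
print(mean_probability, uncertainty)
```

The `n_passes` argument is the practical problem: every prediction costs that many forward passes instead of one.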

Can an AI lawyer tell you when it's not sure? Genie's mark: 3/10 - some techniques are there but there is no wide adoption yet for NLP.

An AI Lawyer should use knowledge to answer questions (Knowledge-Intensive Machine Learning)

Smart people know where to look rather than memorise

Addressing knowledge is an underserved area in machine learning. Neural networks are essentially big pattern matching machines. However, when a single piece of information contains most of what is needed to answer a question, neural networks generally fall short. Many would argue that machine learning models contain knowledge in their parameters. This is true, but it is rarely enough to answer truly knowledge intensive tasks. A human can answer a knowledge intensive question like “What is the capital of Malawi?” after being told once that the capital of Malawi is “Lilongwe”. However, to encode this knowledge in a neural network's weights would require many, many examples.

In knowledge intensive tasks, I like to think of a machine learning model's role as memory addressing, rather than containing the knowledge directly. This is analogous to being just a computer's kernel as opposed to being the entire computer. Models should learn how to find the answers to questions, rather than actually encode the answers explicitly in the weights. This is much like a law student in an open book exam: the top students know where to look for the answer more quickly than the bottom students.

There has been promising work in this field. The SQuAD task (Q+A) uses a context piece of text which contains the answer. The AI’s task is to find the answer in the text. However, this assumes that the context contains the solution. It is impractical to run a query individually against every member of a large document corpus to answer arbitrary questions.

Differentiable Neural Computers (2016) were an interesting step in the right direction for such tasks, but they seem to have gone quiet in the last few years (if anybody has an update on the state of these and any progress they have made, please email me!). They not only learn how to read knowledge, but they even learn how to write it, in a similar way to how a computer would store and address memory on a hard disk.

Retrieval Augmented Generation (Lewis et al. 2021) has got us all hot under the collar at Genie. Rather than learning how to write memory (as in Differentiable Neural Computers), it takes the “open book exam” approach I mentioned above. It learns how to address a gigantic corpus (much like Genie's legal corpus) to search and generate an answer in natural language.
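The "open book" pattern can be sketched as two steps: retrieve the most relevant passage, then hand it to a generator along with the question. The sketch below is a deliberately crude stand-in, not RAG itself: it uses word-overlap cosine similarity instead of a learned dense retriever, and it stops at building the prompt rather than calling a generator. The three-passage corpus and the question are invented.

```python
import math
from collections import Counter

# Hypothetical miniature corpus standing in for a large legal library.
CORPUS = [
    "Either party may terminate this agreement with thirty days written notice.",
    "The employee is entitled to twenty five days of paid holiday per year.",
    "Confidential information must not be disclosed to third parties.",
]

def bag_of_words(text):
    """Lower-case the text, strip basic punctuation and count the words."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, corpus, top_k=1):
    """Rank passages by similarity to the question and keep the best top_k."""
    query = bag_of_words(question)
    ranked = sorted(corpus, key=lambda passage: cosine(query, bag_of_words(passage)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, passages):
    """Combine retrieved passages and the question into input for a generator model."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

question = "What notice must a party give to terminate the agreement?"
passages = retrieve(question, CORPUS)
prompt = build_prompt(question, passages)
print(prompt)
```

The important design point is that the knowledge lives in the corpus, so updating what the system knows means editing documents rather than retraining weights.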

Could an AI lawyer effectively look up answers to arbitrary questions from a massive legal library? Genie's mark: 4/10 - RAG is the only thing that has excited us in 5 years.

An AI Lawyer should transfer expertise between similar tasks (Transfer Learning)

Give a lawyer a contract slightly outside of their expertise, and they're likely to both offer better advice and learn quicker than somebody who is learning from scratch. Historically, machine learning did not follow this paradigm, with every new task being learned afresh. Researchers have made massive progress on transfer learning in recent years: I cannot overstate the progress.

Word2Vec (2013) was the first big step in this direction. Representing words as dense, fixed-length vectors made it possible to use neural networks effectively in NLP. A series of state of the art models were then developed.
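To show what this kind of transfer buys you, here is a small sketch: word vectors learned elsewhere are reused as frozen features, and only a tiny classifier is trained on the new task from two labelled examples. The 2-D vectors below are invented by hand to stand in for real pre-trained embeddings, and the clause labels are hypothetical.

```python
import math

# Hypothetical "pre-trained" word vectors; real ones would come from Word2Vec or similar.
PRETRAINED = {
    "terminate": (1.0, 0.2),
    "dismiss": (0.9, 0.1),
    "end": (0.8, 0.3),
    "holiday": (-0.9, 0.4),
    "leave": (-0.8, 0.5),
    "vacation": (-1.0, 0.3),
}

def embed(text):
    """Average the frozen vectors of known words (unknown words are skipped)."""
    vectors = [PRETRAINED[word] for word in text.lower().split() if word in PRETRAINED]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dimension) / len(vectors) for dimension in zip(*vectors))

def predict(text, weights, bias):
    """Probability that the text is a termination clause."""
    features = embed(text)
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def train_head(examples, epochs=200, learning_rate=0.5):
    """Fit a logistic-regression head with gradient descent; the embeddings stay fixed."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for text, label in examples:
            features = embed(text)
            error = predict(text, weights, bias) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Two labelled examples only: 1 = termination clause, 0 = holiday clause.
weights, bias = train_head([("terminate the contract", 1), ("holiday entitlement", 0)])
# "dismiss" never appeared in the training examples, but its vector sits near "terminate".
probability = predict("dismiss the employee", weights, bias)
print(probability)
```

The classifier generalises to an unseen word only because the borrowed vectors already place similar words close together, which is the whole benefit being transferred.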

Although Word2Vec was technically transfer learning, only the word representations are transferred and not network parameters. Dai and Le (2015) started to make progress by pre-training a network on a sentence completion task extracted from unlabelled data. BERT (2018) nailed the approach of pre-training networks, obtaining state of the art results on eleven NLP tasks. This approach required considerably less data on the downstream task and pioneered using transformer networks pre-trained as language models as the initialisation for any NLP task. GPT-3 took this approach to another level with 175 billion parameters. For many NLP tasks it used conditioning on a handful of examples (typically fewer than 10) to perform very well.

Can AI Lawyers apply previous experiences to new tasks? Genie's mark: 8/10 - there have been massive leaps in recent years but the models need to get smaller

TL;DR - How far away are we from an AI Lawyer?

...quite far, but significantly closer than you might think. Genie gives it an average mark of ~6/10. The industry is making good ground, and Genie is investing in research and collaborations with universities. We have teamed up with Imperial College and Oxford University to solve some of the big problem areas.

Well, this is my understanding. I'd love to know what I've missed. Please email me to chat.

Legal perspectives

Following on from the success of this article, we have asked a handful of future-focused lawyers and in-house legal teams for their views.

We're moving closer and closer to an AI system, one way or another, which can handle the "fundamentals" e.g. AI which can measure, quantify and issue standard documents, perhaps even negotiating variations to standard positions. However, we're still some way off a fully automated legal negotiation i.e. two systems which can negotiate successfully through an impasse of fixed-positions by two parties.

That final creative approach, flair or instinct, remains human. What we could see, very realistically in the next 5 years, is the scope of legal expertise becoming narrower and more advisory/expert, with less experience-based tasks becoming rapidly automated.

Harry Borovick - Lecturer at The University of Law and UK Senior Legal Counsel at LiveRamp