The Legal Industry's Shared AI Evaluation Framework

Designed and validated by 100+ legal and technology leaders from the buy side across the globe.

Legal teams have well-established playbooks for hiring humans, but none for hiring AI agents. This is that playbook, built by the legal community.

Independent

No vendor sponsorship or commercial influence.

Practitioner-shaped

Built and refined by buy-side legal, AI, and technology leaders.

Open-source

Publicly available and community-driven.

The Problem

Why Legal AI Procurement Is Broken

0%*

Of legal teams are not committed to their current AI vendor.

Many legal teams are still searching for tools that meet their needs.

0%*

Of legal teams evaluate AI tools without IT or security involvement.

Many procurement decisions happen without a structured technical review.

0+**

AI tools now target legal teams.

Many promise similar capabilities, making meaningful vendor comparison difficult.

The result: months of duplicated effort, inconsistent evaluation, and decisions driven by demos and marketing rather than evidence.

The Solution

The Legal AI Evaluation Framework

The Legal AI Evaluation Framework is the legal community's first open-access evaluation framework built to help legal teams make defensible, evidence-based procurement decisions.

100+

Legal leaders across 82 organizations and law firms

25

Countries represented

100

Evaluation sub-tests across 8 core criteria

Evaluation Criteria

The 8 Core Evaluation Criteria

1. Strategic Fit
Alignment to legal use cases, IT systems, jurisdictions, languages, and long-term legal team needs.
2. Functionality
User interface design, workflow integration, customisation, input handling, and real-world usability within existing technology stacks and processes.
3. Robustness
Factual accuracy, completeness, instruction fidelity, verifiability, citation quality, and consistency of AI-generated outputs across legal tasks.
4. Security
Architecture and data flow transparency, access control, retrieval boundaries, adversarial resistance, AI safety, and alignment with the organisation's security policy requirements.
5. Data Privacy
Data use restrictions (including no-training provisions covering aggregated and derived data), deletion, localisation, vector embedding governance, and sub-processor practices.
6. Vendor Risk
Licensing, contractual security commitments, data portability, audit trails, transparency, vendor track record, business continuity, and incident response.
7. Adoption Support
Training, support, issue resolution, documentation, and usage reporting to support rollout and sustained adoption.
8. Cost & Resourcing
Pricing model, total cost of ownership, and internal operational capacity required to deploy and maintain the tool.

Evaluation Toolkit

The 3 Stages of Legal AI Evaluation

Like hiring a knowledge worker, each stage of evaluating a legal AI tool demands a different level of scrutiny. The Evaluation Toolkit turns the Legal AI Evaluation Framework into practical tools for buyers at each stage of evaluation.

Stage 1 · Resume screening

Pre-Demo Checklist

Pass/fail screening to decide if a demo is worth booking

Proceed / Do Not Proceed

Stage 2 · Interview

Demo Scorecard

Scored validation of whether the tool performs as claimed

Proceed / Do Not Proceed

Stage 3 · Working trial

Pilot Scorecard

Evidence-based, weighted evaluation using real workflows

Proceed / Do Not Proceed

Getting Started

How to Use This Toolkit

This toolkit applies whether you are:

  • Starting a new search for a legal AI tool
  • Midway through vendor conversations and need structure
  • Running a pilot and need a consistent way to compare tools
  • Reassessing or renewing a contract with your current provider

Researching tools?

Start with the Pre-Demo Checklist (Stage 1)

Sitting in demos?

Use the Demo Scorecard (Stage 2)

Trialing a tool?

Use the Pilot Scorecard (Stage 3)

Starting from scratch? Work through all 3 stages in order.

Then make it yours. Adapt criteria based on your team's priorities. The toolkits are starting points, not rigid templates. Every legal team operates differently. The framework gives you the structure — you decide what matters most.

Timeline & Roadmap

How We Built This

This framework was built from the ground up by the people who actually do this work.

Framework Development

Complete

November to December 2025

Synthesised legal AI evaluation approaches used in real vendor selection processes and assembled the first draft of the framework.

Community Feedback

Complete

February 21, 2026

Opened the draft for community input. More than 100 legal leaders contributed feedback.

v1 Finalisation

Complete

Late February 2026

Incorporated community feedback into Version 1 of the Legal AI Evaluation Framework and the accompanying Evaluation Toolkit.

Publication and Launch

Live

March 11, 2026

Public release of the Legal AI Evaluation Framework v1, together with the practical toolkits that operationalise the framework for real evaluation workflows.

What Comes Next

Ongoing

From March 2026

Gathering post-launch feedback and releasing additional practical resources, including a vendor questionnaire, the evaluation framework report, a quick-reference cheat sheet, and community insights.

This is a living framework. It will evolve as the tools, the models, and the market change. Want to shape what comes next?

The Team

Built by Practitioners Across Legal, IT, Security, and Privacy

This framework is built and maintained by legal professionals, technologists, security experts, and privacy specialists.

Core Team

Designed and published the framework. Responsible for day-to-day development, content, and coordination.

Anna Guo

Founder, Legal Benchmarks

Over the past 13 months, Anna has benchmarked AI performance in real legal workflows alongside 500+ legal professionals. That work has shaped hundreds of procurement decisions around AI tooling.

LinkedIn
Elgar Weijtmans

Technologist & Former Lawyer

Led legal AI procurement and evaluated 40+ tools last year. Brings an end-to-end view of AI assessment, from screening and testing to piloting and making the final tool decision.

LinkedIn
Roel Schrijvers

General Counsel

Focused on data security and operational risk in AI systems. Pushes the team to ask the questions most organisations miss around security, safety, and deployment risk.

LinkedIn
Sunny Kim

Comms Lead

Leads communications and PR for the framework. Brings experience in legal and professional services communications to help the community understand and engage with the project.

LinkedIn

Advisory Board

Senior leaders from legal, technology, AI governance, and security who guide strategic decisions, share their perspective on new directions, and review drafts before they go public.

Alexandra Robins

GC, Reptune

Andie Garford-Tull

General Counsel, AI Transformation and Governance, Dentsu

Andrew Greenfeld

Head of Legal Operations, Zoom

Blake Hei

General Counsel, Hoyoverse

Celia Reinsvold

Director, Legal AI Transformation

Daniel Lönnberg

Head of Data and Innovation, Familjens Jurist i Sverige AB

Danylo Panov

Head of Legal Operations, Romexsoft

Gary Ang, PhD

ex-MAS AI Risk Lead & Investment Risk Head

Jason Tamara Widjaja

Executive AI Director

Jean Li

APAC General Counsel, Rapyd

Johan Granqvist

Head of Legal and Compliance, Swarmia

Laura Jeffords Greenberg

General Counsel, Worksome

Marc Mandel

General Counsel, Exos

Nada Alnajafi

Legal Ops Leader & Senior Corporate Counsel, Franklin Templeton

Nicolas Alejandro Panigutti

Legal Manager, Global Legal Transformation Team, Santander

Stephanie Dominy

General Counsel and Head of Ops, Tessl

Tara Herman

Senior Vice President, Deputy General Counsel, Girl Scouts of the USA

Timm Ernst

Head of Legal Tech Group, Henkel

Victor Green

Vice President, Contracts Services, Morgan Stanley

Weng Yip

Vice President, Legal, Razer Inc.

Steering Committee

A broader group of practitioners across legal, IT, security, and privacy who validate and shape the assessment criteria, review and give feedback on the framework, and publicly signal support. All members are acknowledged in the published v1 guideline and toolkit.

Aalia Manie

Ahmet Guler

Alicia Larrazabal Erminy

Anastasiia Chiganov-Zalesskaia

Andrada Popescu

Annemarie Brennan

Arjen Eeken

Arthur Souza Rodrigues

Azhar Aziz-Ismail

Bensu Aydin

Bert Vries

Carlos Eduardo Vieira Costa de Lamare

Carrie Stephenson

Charlie Morgan

Chris Werner

Claudio Bild

Claudio Klaus

Damian Tommasino

Dana Kempler

Dana Kretschmer-Konzal

Daniella Domokos

Denisa Kopandi

Dick van Lankeren Matthes

Douwe Groenevelt

Elena Tzvetinova

Elizabeth Evans

Emelie Wesselink

Eric Lystrup

Fiona Q. Nguyen

Gayk Ayvazyan

Grant Ramsey

Hugo Chow

Jan-Willem Prakke

Jennifer Sing

Joep Feringa

Jolanda Rose

Jonathan Savar

Joseph Zeke Rucker

Kars van Houten

Kevan Wee

Kim Humphrey

Lasse Milinski

Lien Tran

Luara Locateli

Lucia Loyo

Mark Zijlstra

Martin Woodward

Matthew Kohel

Maxim Svyatov

Meena Parbhu

Meeta Agarwal

Melissa Köhler-van der Hulst

Michael Odo

Mohamed Al Mamari

Nate Kostelnik

Olivia Singh

Pim Betist

Prapti Patel

Rebecca Gwilt

Robert Brosgill

Robert Jweinat

Robin Musch

Rok Popov Ledinski

Rutger Lambriex

Sarah R. Moros

Shaik Ashfaq

Siavash Shamskho

Spencer Gusick

Sven van de Kamp

Tommaso Ricci

Tunaseli Kamburoglu

Waridah Makena

Wei Yee Tan

Xuan Ming Tan

Zehra Sahin

and more

FAQ

Frequently Asked Questions

Newsletter

Stay updated

Get behind-the-scenes updates, contribution opportunities, and report updates.

* Legal Benchmarks 2026 community survey data; forthcoming AI Evaluation Framework Report.

** December 2025 Legaltech Hub directory/map count of GenAI legal tech solutions.

Disclaimer

This framework is a community resource published by Legal Benchmarks, developed with contributions from legal and AI professionals across more than 100 organisations. It is provided for informational and educational purposes only and does not constitute legal, technical, or professional advice. No contributor, editor, or affiliated organisation accepts liability for decisions made on the basis of this framework. Organisations should adapt the framework to their own context, risk profile, and regulatory environment, and consult qualified professionals before making procurement or deployment decisions.

License

This document may be shared and adapted under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence, provided that Legal Benchmarks is credited as the original source.