A study in online, collaborative legal informatics
Adam Wyner, University of Liverpool
Wim Peters, University of Sheffield
– Introduction –
This is an academic research study on legal informatics (information processing of the law). The study uses an online, collaborative tool to crowdsource the annotation of legal cases. The task is similar to legal professionals’ annotation of cases. The result will be a public corpus of searchable, richly annotated legal cases that can be further processed, analysed, or queried for conceptual annotations.
Adam and Wim are computer scientists who are interested in language, law, and the Internet.
We are inviting people to participate in this collaborative task. This is a beta version of the exercise, and we welcome comments on how to improve it. Please read through this blog post, look at the video, and get in contact.
– Highlighting, Annotations, and Legal Case Briefs –
In reading, analysing, and preparing a summary of a legal case, law students and legal professionals annotate cases by highlighting and colour coding elements of the case to make for easy identification. Different elements are annotated: the holding, the parties, the facts, and so on. A sample image of annotations is:

Annotations for Case Citations, Legal Roles, Jurisdiction, Hearing Date
– Problem –
To analyse a legal case, legal professionals annotate the case into its constituent parts. The analysis is summarised in a case brief. However, the current approach is very limited:
With annotated legal cases, we can enable conceptual search.
– Solution: Crowdsource Annotation –
We use an online legal case annotation tool and share the results to support:
The results of the study would be useful to:
Broadly speaking, a corpus of analysed cases makes case law a public resource.
– Annotations: types and features –
To crowdsource conceptual annotations of legal cases, we use the General Architecture for Text Engineering (GATE) Teamware tool. Teamware is a web-based application that provides an annotator with a text to annotate and a list of annotations to use. The task is a web-based version of what legal analysts of cases already do.
We use familiar annotations for legal cases, divided (for ease of reference) into types and features. For example, we have a type Legal Roles and various features to select among, e.g. defendant. We are counting on you to have learned and used these annotations in the course of your legal study and practice.
You do not need to memorise the types and features, as they will appear in the GATE Teamware tool. It may be handy to keep this webpage open for reference, or to print it out.
The annotations we use are:
Argument For Party – arguments for a particular party:
Facts – legal and procedural facts:
Indexes – various indicative information:
Issues – the issues before the court:
Legal Roles – the role of the parties in the case:
Other – relevant information not covered by the other annotations.
Procedural History – the disposition of the case with respect to the lower court(s):
Reasoning Outcomes – various parts of the legal decision:
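For readers who like to see the idea in code, here is a minimal sketch of how the types and features above could be represented. It is an illustration only, not how GATE Teamware stores its data. The type names and descriptions are copied from the list above; the class name Annotation, the field names, and the helper covered_text are hypothetical. The only feature value used is defendant, which is the one named in this post.

```python
from dataclasses import dataclass
from typing import Optional

# Annotation types and descriptions, copied from the list in this post.
ANNOTATION_TYPES = {
    "Argument For Party": "arguments for a particular party",
    "Facts": "legal and procedural facts",
    "Indexes": "various indicative information",
    "Issues": "the issues before the court",
    "Legal Roles": "the role of the parties in the case",
    "Other": "relevant information not covered by the other annotations",
    "Procedural History": "the disposition of the case with respect to the lower court(s)",
    "Reasoning Outcomes": "various parts of the legal decision",
}


@dataclass(frozen=True)
class Annotation:
    """A hypothetical stand-off annotation: a character span plus a type and feature."""
    start: int                      # offset of the first highlighted character
    end: int                        # offset just past the last highlighted character
    type_name: str                  # one of the keys of ANNOTATION_TYPES
    feature: Optional[str] = None   # e.g. "defendant" for the type "Legal Roles"

    def __post_init__(self):
        # Reject types that are not in the list and spans that are empty or reversed.
        if self.type_name not in ANNOTATION_TYPES:
            raise ValueError(f"unknown annotation type: {self.type_name!r}")
        if not 0 <= self.start < self.end:
            raise ValueError("span must satisfy 0 <= start < end")


def covered_text(text: str, annotation: Annotation) -> str:
    """Return the stretch of the case text that the annotation highlights."""
    return text[annotation.start:annotation.end]
```

With this sketch, highlighting the first five characters of the invented sentence "Smith, the defendant, appealed." as a Legal Roles annotation with the feature defendant would be written Annotation(0, 5, "Legal Roles", "defendant"), and covered_text would return the highlighted name.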
– Collaborate –
Take a look at the instructional video below. If you wish to collaborate on the task, send an email to Adam Wyner – adam@wyner.info
In the email, please include brief information for:
This will help us know who we are collaborating with; from the pool of candidates, we will select participants for this early study.
You will be sent a username and password so you can log in to Teamware.
We respect your privacy. We are only interested in data in the aggregate and will not reveal any personal data to third parties.
– Next –
We have an instructional video that you can open in a new tab or window and that uses QuickTime. It lasts about 14 minutes. This will give you a good idea of what you will be doing. The presenter is Adam Wyner. The link (takes a moment to load) — Case Annotation Instructional Video
After reading this blog post, viewing the instructional video, and receiving your username and password, you can log in to begin annotating at — GATE Teamware
When you are done with your task, please answer the questions on the survey to give us feedback on your experience using the annotation tool — TO APPEAR
– What Then? –
We analyse the annotations from several annotators, comparing and contrasting them (interannotator agreement). This will show us similarities and differences in how the annotations and cases are understood. The results will also help us develop a Gold Standard Corpus of legal cases, that is, a set of case annotations that annotators agree on. A Gold Standard is essential for information extraction and the development of advanced processing. We will publicly report the analysis of the exercise and make the annotated cases publicly available for re-use.
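To give a feel for what interannotator agreement means in practice, here is a small sketch under some simplifying assumptions. It assumes two annotators have each assigned one annotation type to the same sequence of text passages. It then computes the raw proportion of matching labels and Cohen's kappa, a common chance-corrected agreement measure. This post does not say which measure we will use, so kappa is only one plausible choice, and the function names and sample labels are invented for illustration.

```python
from collections import Counter


def observed_agreement(labels_a, labels_b):
    """Proportion of passages on which the two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement corrected for what chance alone would produce."""
    n = len(labels_a)
    p_observed = observed_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    p_chance = sum(
        counts_a[label] * counts_b[label]
        for label in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if p_chance == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


def agreed_labels(labels_a, labels_b):
    """Labels both annotators agree on (None where they differ): a toy gold standard."""
    return [a if a == b else None for a, b in zip(labels_a, labels_b)]


# Invented example: two annotators label the same four passages.
annotator_a = ["Facts", "Facts", "Issues", "Legal Roles"]
annotator_b = ["Facts", "Issues", "Issues", "Legal Roles"]
print(observed_agreement(annotator_a, annotator_b))
print(cohen_kappa(annotator_a, annotator_b))
print(agreed_labels(annotator_a, annotator_b))
```

In this invented example the annotators agree on three of the four passages. Kappa comes out lower than the raw proportion because some of that agreement would be expected by chance. Real annotations are spans of text rather than pre-divided passages, so comparing them also requires deciding when two highlighted spans count as the same.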
Once we have a better sense of how this study goes, we plan to roll out a larger version with more cases. And this is only the start….
– Questions –
How easy is it to learn to use the tool? Take a look at the video to get a sense of this. With a little bit of practice, it is rather straightforward.
What if I don’t agree with some of your annotations or features? Write a comment or send us an email, and we will consider your comment. Try to be as clear and specific as you can. We are not lawyers, and we are dealing with a global community with local variation, so it is likely there will be some disagreement and variation.
Can I get the results of my annotations? Our approach is to make individual contributions to the whole. So, you will be able to access annotated cases after the exercise. There will be further information on how to work with the material.
How many cases must I do? You can do one or you can do as many as we have (not many in the beta project).
How much time will it take? About as long as it would take you to do a similar highlighting and annotation task with paper and markers.
What if I have a problem with using the tool or if the tool is buggy? Be patient and try to work with the tool. Sometimes things go wrong. Write a comment or send us an email, and we will try to advise. Note – we are only users of GATE Teamware, so we are not responsible for the system itself.
How thoroughly should I annotate the cases? The more cases that are annotated fully and accurately, the better. Apply the same diligence as you would to thoroughly and carefully analyse cases with pen and paper. As you will be the beneficiary of the work of others, so too should you work to benefit them.
Do we track good annotators and bad annotators? We are interested in data in the aggregate, and are only interested in interannotator agreement and disagreement. This information will help us better understand differences in how the cases are understood and annotated. But, to be frank, if we have bad annotators, we will see this in the results; we would contact the annotator and see how best to improve the situation. As we noted above, we are not sharing information with third parties.
– Paper –
If you are interested in some of the ideas behind this project, please see our paper:
Semantic Annotations for Legal Text Processing using GATE Teamware
The paper will appear in May 2012 in the Proceedings of the LREC Conference Workshop on Semantic Processing of Legal Texts, Istanbul, Turkey. The exercise here is a version of the exercise proposed in the paper.
A shortlink to this blog page is:
http://wyner.info/LanguageLogicLawSoftware/?p=1315
– Thanks for collaborating! –
– If you have any questions, please submit a comment! –

Crowdsourced Legal Case Annotation by Adam Wyner is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.