
1. The Death and Life of Great Legal Data Standards

Thanks to the many efforts of the open government movement in the past decade, the benefits of machine-readable legal data (legal data which can be processed and easily interpreted by computers) are now widely understood. In the world of government statutes and reports, machine-readability would significantly enhance public transparency, help to increase efficiencies in providing services to the public, and make it possible for innovators to develop third-party services that enhance civic life.

In the universe of private legal data — that of contracts, briefs, and memos — machine-readability would open up vast potential efficiencies within the law firm context, allow the development of novel approaches to processing the law, and would help to drive down the costs of providing legal services.

However, while the benefits are understood, by and large the vision of rendering the vast majority of legal documents into a machine-readable standard has not been realized. While projects do exist to acquire and release statutory language in a machine-readable format (and the government has backed similar initiatives), the vast body of contractual language and other private legal documents remains trapped in a closed universe of hard copies, PDFs, unstructured plaintext and Microsoft Word files.

Though this is a relatively technical point, it has broad policy implications for society at large. Perhaps the biggest upshot is that machine-readability promises to vastly improve access to the legal system, not only for those seeking legal services, but also for those seeking to provide them.

It is not for lack of a standard specification that the status quo exists. Indeed, projects like LegalXML have developed specifications that describe a machine-readable markup for a vast range of different types of legal documents. As of writing, the project includes technical committees working on legislative documents, contracts, court filings, citations, and more.

However, by and large these efforts to develop machine-readable specifications for legal data have only solved part of the problem. Creating the standard is one thing, but actually driving adoption of a legal data standard is another (often more difficult) matter. There are a number of reasons why existing standards have failed to gain traction among the creators of legal data.

For one, the oft-cited aversion of lawyers to technology remains a relevant factor. Particularly in the case of the standardization of legal data, where the projected benefits exist in the future and the magnitude of benefit is speculative at the present moment, persuading lawyers and legislatures to adopt a new standard remains a challenge, at best.

Secondly, the financial incentives of some actors may actually be opposed to rendering the universe of legal documents into a machine-readable standard. A universe of largely machine-readable legal documents would also be one in which it may be possible for third parties to develop systems that automate and significantly streamline legal services. In the context of the ever-present billable hour, parties may resist the introduction of technological shifts that enable these efficiencies to emerge.

Third, the costs of converting existing legal data into a machine-readable standard may also pose a significant barrier to adoption. Marking up unstructured legal text can be highly costly depending on the intended machine usage of the document and the type of document in question. Asking a legislature, firm, or other organization with a large existing repository of legal documents to take on large one-time costs to render the documents into a common standard also discourages adoption.

These three reinforcing forces erect a significant cultural and economic barrier against the integration of machine-readable standards into the production of legal text. To the extent that one believes in the benefits from standardization for the legal industry and society at large, the issue is — in fact — not how to define a standard, but how to establish one.

2. Rough Consensus, Running Standards

So, how might one go about promulgating a standard? Particularly in a world in which lawyers, the very actors that produce the bulk of legal data, are resistant to change, mere attempts to mobilize the legal community to action are destined to fail in bringing about the fundamental shift necessary to render most if not all legal documents in a common machine-readable format.

In such a context, implementing a standard in a way that removes humans from the loop entirely may, in fact, be more effective. To do so, one might design code that was capable of automatically rendering legal text into a machine-readable format. This code could then be implemented by applications of all kinds, which would output legal documents in a standard format by default. This would include the word processors used by lawyers, but also integration with platforms like LegalZoom or RocketLawyer that routinely generate large quantities of legal data. Such a solution would eliminate the need for lawyer involvement from the process of implementing a standard entirely: any text created would be automatically parsed and outputted in a machine readable format. Scripts might also be written to identify legal documents online and process them into a common format. As the body of documents rendered in a given format grew, it would be possible for others to write software leveraging the increased penetration of the standard.

There are, obviously, technical limitations in realizing this vision of a generalized legal data parser. For one, designing a truly comprehensive parser is a massively difficult computer science challenge. Legal documents come in a vast diversity of flavors, and no common textual conventions allow for the perfectly accurate parsing of the semantic content of any given legal text. Quite simply, any parser will be an imperfect (perhaps highly imperfect) approximation of full machine-readability.

Despite the lack of a perfect solution, an open question exists as to whether or not an extremely rough parsing system, implemented at sufficient scale, would be enough to kickstart the creation of a true common standard for legal text. A popular solution, however imperfect, would encourage others to implement nuances to the code. It would also encourage the design of applications for documents rendered in the standard. Beginning from the roughest of parsers, a functional standard might become the platform for a much bigger change in the nature of legal documents. The key is to achieve the “minimal viable standard” that will begin the snowball rolling down the hill: the point at which the parser is rendering sufficient legal documents in a common format that additional value can be created by improving the parser and applying it to an ever broader scope of legal data.

But, what is the critical mass of documents one might need? How effective would the parser need to be in order to achieve the initial wave of adoption? Discovering this, and learning whether or not such a strategy would be effective, is at the heart of the Restatement project.

3. Introducing Project Restatement

Supported by a grant from the Knight Foundation Prototype Fund, Restatement is a simple, rough-and-ready system which automatically parses legal text into a basic machine-readable JSON format. It has also been released under the permissive terms of the MIT License, to encourage active experimentation and implementation.

The concept is to develop an easily-extensible system which parses through legal text and looks for some common features to render into a standard format. Our general design principle in developing the parser was to begin with only the most simple features common to nearly all legal documents. This includes the parsing of headers, section information, and “blanks” for inputs in legal documents like contracts. As a demonstration of the potential application of Restatement, we’re also designing a viewer that takes documents rendered in the Restatement format and displays them in a simple, beautiful, web-readable version.
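To make the idea concrete, here is a minimal sketch of what parsing headers, sections, and blanks into JSON might look like. It is not the Restatement parser, and the output shape is not the Restatement format: the function name `parseLegalText`, the field names, and the conventions (first line is the header, sections start with a number and a period, blanks are written in square brackets) are all assumptions made for illustration.

```javascript
// Hypothetical sketch: NOT the actual Restatement parser or schema.
// Assumed conventions, for this example only:
//   - the first non-empty line is the document header
//   - a section starts with a line like "1. Definitions"
//   - a blank is a bracketed placeholder like "[Company Name]"
const SECTION_HEADING = /^(\d+)\.\s+(.+)$/;
const BLANK = /\[([^\]]+)\]/g;

function parseLegalText(text) {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "");
  const doc = {
    header: lines.length > 0 ? lines[0] : "",
    sections: [],
    blanks: [],
  };
  let current = null;

  for (const line of lines.slice(1)) {
    const heading = SECTION_HEADING.exec(line);
    if (heading) {
      // Start a new section when a numbered heading is found.
      current = { number: Number(heading[1]), title: heading[2], body: [] };
      doc.sections.push(current);
    } else if (current) {
      // Any other line belongs to the section currently open.
      current.body.push(line);
    }
    // Record each distinct blank once, in order of first appearance.
    for (const match of line.matchAll(BLANK)) {
      if (!doc.blanks.includes(match[1])) doc.blanks.push(match[1]);
    }
  }
  return doc;
}

// Invented sample text, not taken from any real form document.
const sample = [
  "SIMPLE AGREEMENT",
  "1. Parties",
  "This agreement is between [Company Name] and [Investor Name].",
  "2. Amount",
  "[Investor Name] will pay [Purchase Amount].",
].join("\n");

console.log(JSON.stringify(parseLegalText(sample), null, 2));
```

Even a parser this rough produces something a viewer or form-filling tool could consume: a title, an ordered list of sections, and a list of the inputs the document expects.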

Underneath the hood, Restatement is all built upon web technology. This was a deliberate choice, as Restatement aims to provide a usable alternative to document formats like PDF and Microsoft Word. We want to make it easy for developers to write software that displays and modifies legal documents in the browser.

In particular, Restatement is built entirely in JavaScript. The past few years have been exciting for the JavaScript community. We’ve seen an incredible flourishing of not only new projects built on JavaScript, but also new tools for building cool new things with JavaScript. It seemed clear to us that it’s the platform to build on right now, so we wrote the Restatement parser and viewer in JavaScript, and made the Restatement format itself a type of JSON (JavaScript Object Notation) document.

For those who are more technically inclined, we also knew that Restatement needed a parser formalism, that is, a precise way to define how plain text can get transformed into Restatement format. We became interested in a recent advance in parsing technology, called PEG (Parsing Expression Grammar).

PEG parsers are different from other types of parsers; they’re unambiguous. That means that plain text passing through a PEG parser has only one possible valid parsed output. We became excited about using the deterministic property of PEG to mix parsing rules and code, and that’s when we found peg.js.

With peg.js, we can generate a grammar that executes JavaScript code as it parses your document. This hybrid approach is super powerful. It allows us to have all of the advantages of using a parser formalism (like speed and unambiguity) while also allowing us to run custom JavaScript code on each bit of your document as it parses. That way we can use an external library, like the Sunlight Foundation’s fantastic citation library, from inside the parser.
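For readers who want to see the two ideas side by side (ordered choice, which is what makes a PEG unambiguous, and code that runs as rules match), here is a small hand-rolled sketch. It does not use peg.js or its grammar syntax. The helper names `token`, `seq`, `choice`, and `action`, and the toy one-line grammar, are assumptions made for this example.

```javascript
// Hand-rolled PEG-style combinators; a sketch only, NOT peg.js or its API.
// A parser is a function (input, pos) -> { value, pos } on success, or null.

// Match a regular expression exactly at the current position.
function token(pattern) {
  const sticky = new RegExp(pattern.source, "y");
  return (input, pos) => {
    sticky.lastIndex = pos;
    const m = sticky.exec(input);
    return m ? { value: m[0], pos: pos + m[0].length } : null;
  };
}

// Sequence: every part must match, one after another.
function seq(...parsers) {
  return (input, pos) => {
    const values = [];
    for (const p of parsers) {
      const r = p(input, pos);
      if (!r) return null;
      values.push(r.value);
      pos = r.pos;
    }
    return { value: values, pos };
  };
}

// Ordered choice: try alternatives in order and commit to the first match.
// This is why a PEG yields at most one parse for a given input.
function choice(...parsers) {
  return (input, pos) => {
    for (const p of parsers) {
      const r = p(input, pos);
      if (r) return r;
    }
    return null;
  };
}

// Semantic action: run ordinary JavaScript on whatever a rule matched.
function action(parser, fn) {
  return (input, pos) => {
    const r = parser(input, pos);
    return r ? { value: fn(r.value), pos: r.pos } : null;
  };
}

// Toy grammar for one line: a numbered section heading, else plain text.
const heading = action(
  seq(token(/\d+/), token(/\.\s+/), token(/.+/)),
  ([num, , title]) => ({ type: "heading", number: Number(num), title })
);
const plain = action(token(/.*/), (text) => ({ type: "text", text }));
const line = choice(heading, plain);

function parseLine(input) {
  const r = line(input, 0);
  return r && r.pos === input.length ? r.value : null;
}

console.log(parseLine("3. Governing Law"));
console.log(parseLine("This Agreement is governed by the laws of Delaware."));
```

Because `choice` always tries `heading` before `plain`, a line that could be read either way is always read as a heading. The function passed to `action` is where a call to an external library could sit.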

Our next step is to prototype an “interactive parser,” a tool for attorneys to define the structure of their documents and see how they parse. Behind the scenes, this interactive parser will generate peg.js programs and run them against plaintext without the user even being aware of how the underlying parser is written. We hope that this approach will provide users with the right balance of power and usability.

4. Moving Forwards

Restatement is going fully operational in June 2014. After launch, the two remaining challenges are to (a) continue expanding the range of legal document features the parser will be able to successfully process, and (b) begin widely processing legal documents into the Restatement format.

For the first, we’re encouraging a community of legal technologists to play around with Restatement, break it as much as possible, and give us feedback. Running Restatement against a host of different legal documents and seeing where it fails will expose the areas where the parser must be bolstered to expand its potential applicability as far as possible.

For the second, Restatement will be rendering popular legal documents in the format, and partnering with platforms to integrate Restatement into the legal content they produce. We’re excited to say on launch Restatement will be releasing the standard form documents used by the startup accelerator Y Combinator, and Series Seed, an open source project around seed financing created by Fenwick & West.

It is worth adding that the Restatement team is always looking for collaborators. If what’s been described here interests you, please drop us a line! I’m available at, and on Twitter @RobotandHwang.


Jason Boehmig is a corporate attorney at Fenwick & West LLP, a law firm specializing in technology and life science matters. His practice focuses on startups and venture capital, with a particular emphasis on early stage issues. He is an active maintainer of the Series Seed Documents, an open source set of equity financing documents. Prior to attending law school, Jason worked for Lehman Brothers, Inc. as an analyst and then as an associate in their Fixed Income Division.

Tim Hwang currently serves as the managing human partner at the offices of Robot, Robot & Hwang LLP. He is curator and chair for the Stanford Center on Legal Informatics FutureLaw 2014 Conference, and organized the New and Emerging Legal Infrastructures Conference (NELIC) at Berkeley Law in 2010. He is also the founder of the Awesome Foundation for the Arts and Sciences, a distributed, worldwide philanthropic organization founded to provide lightweight grants to projects that forward the interest of awesomeness in the universe. Previously, he has worked at the Berkman Center for Internet and Society at Harvard University, Creative Commons, Mozilla Foundation, and the Electronic Frontier Foundation. For his work, he has appeared in the New York Times, Forbes, Wired Magazine, the Washington Post, the Atlantic Monthly, Fast Company, and the Wall Street Journal, among others. He enjoys ice cream.

Paul Sawaya is a software developer currently working on Restatement, an open source toolkit to parse, manipulate, and publish legal documents on the web. He previously worked on identity at Mozilla, and studied computer science at Hampshire College.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

Artisanal Algorithms

Down here in Durham, NC, we have artisanal everything: bread, cheese, pizza, peanut butter, and of course coffee, coffee, and more coffee. It’s great—fantastic food and coffee, that is, and there is no doubt some psychological kick from knowing that it’s been made carefully by skilled craftspeople for my enjoyment. The old ways are better, at least until they’re co-opted by major multinational corporations.

Artisanal Cheese. Source: Wikimedia Commons

Aside from making you either hungry or jealous, or perhaps both, why am I talking about fancy foodstuffs on a blog about legal information? It’s because I’d like to argue that algorithms are not computerized, unknowable, mysterious things—they are produced by people, often painstakingly, with a great deal of care. Food metaphors abound, helpfully I think. Algorithms are the “special sauce” of many online research services. They are sets of instructions to be followed and completed, leading to a final product, just like a recipe. Above all, they are the stuff of life for the research systems of the near future.

Human Mediation Never Went Away

When we talk about algorithms in the research community, we are generally talking about search or information retrieval (IR) algorithms. A recent and fascinating VoxPopuLII post by Qiang Lu and Jack Conrad, “Next Generation Legal Search – It’s Already Here,” discusses how these algorithms have become more complicated by considering factors beyond document-based, topical relevance. But I’d like to step back for a moment and head into the past for a bit to talk about the beginnings of search, and the framework that we have viewed it within for the past half-century.

Many early information-retrieval systems worked like this: a researcher would come to you, the information professional, with an information need, that vague and negotiable idea which you would try to reduce to a single question or set of questions. With your understanding of Boolean search techniques and your knowledge of how the document corpus you were searching was indexed, you would then craft a search for the computer to run. Several hours later, when the search was finished, you would be presented with a list of results, sometimes ranked in order of relevance and limited in size because of a lack of computing power. Presumably you would then share these results with the researcher, or perhaps just turn over the relevant documents and send him on his way. In the academic literature, this was called “delegated search,” and it formed the background for the most influential information retrieval studies and research projects for many years—the Cranfield Experiments. See also “On the History of Evaluation in IR” by Stephen Robertson (2008).

In this system, literally everything (the document corpus, the index, the query, and the results) was mediated. There was a medium, a middle-man. The dream was to some day dis-intermediate, which does not mean to exhume the body of the dead news industry. (I feel entitled to this terrible joke as a former journalist… please forgive me.) When the World Wide Web and its ever-expanding document corpus came on the scene, many thought that search engines (huge algorithms, basically) would remove any barrier between the searcher and the information she sought. This is “end-user” search, and as algorithms improved, so too would the system, without requiring the searcher to possess any special skills. The searcher would plug a query, any query, into the search box, and the algorithm would present a ranked list of results, high on both recall and precision. Now, the lack of human attention, evidenced by the fact that few people ever look below result 3 on the list, became the limiting factor, instead of the lack of computing power.

A search for delegated search

The only problem with this is that search engines did not remove the middle-man—they became the middle-man. Why? Because everything, whether we like it or not, is editorial, especially in reference or information retrieval. Everything, every decision, every step in the algorithm, everything everywhere, involves choice. Search engines, then, are never neutral. They embody the priorities of the people who created them and, as search logs are analyzed and incorporated, of the people who use them. It is in these senses that algorithms are inherently human.

Empowering the Searcher by Failing Consistently

In the context of legal research, then, it makes sense to consider algorithms as secondary sources. Law librarians and legal research instructors can explain the advantages of controlled vocabularies like the Topic and Key Number System®, of annotated statutes, and of citators. In several legal research textbooks, full-text keyword searching is anathema because, I suppose, no one knows what happens directly after you type the words into the box and click search. It seems frightening. We are leaping without looking, trusting our searches to some kind of computer voodoo magic.

This makes sense—search algorithms are often highly guarded secrets, even if what they select for (timeliness, popularity, and dwell time, to name a few) is made known. They are opaque. They apparently do not behave reliably, at least in some cases. But can’t the same be said for non-algorithmic information tools, too? Do we really know which types of factors figure in to the highly vaunted editorial judgment of professionals?

To take the examples listed above: yes, we know what the Topics and Key Numbers are, but do we really know them well enough to explain why they work the way they do, or what biases are baked in from over a century of growth and change? Without greater transparency, I can’t tell you.

How about annotated statutes: who knows how many of the cases cited on online platforms are holdovers from the soon-to-be print publications of yesteryear? In selecting those cases, surely the editors had to choose to omit some, or perhaps many, because of space constraints. How, then, did the editors determine which cases were most on-point in interpreting a given statutory section, that is, which were most relevant? What algorithms are being used today to rank the list of annotations? Again, without greater transparency, I can’t tell you.

And when it comes to citators, why is there so much discrepancy between a case’s classification and which later-citing cases are presented as evidence of this classification? There have been several recent studies, like this one and this one, looking into the issue, but more research is certainly needed.

Finally, research in many fields is telling us that human judgments of relevance are highly subjective in the first place. At least one court has said that algorithmic predictive coding is better at finding relevant documents during pretrial e-discovery than humans are.

Where are the relevant documents? Source: CC BY 2.0, flickr user gosheshe

I am not presenting these examples to discredit subjectivity in the creation of information tools. What I am saying is that the dichotomy between editorial and algorithmic, between human and machine, is largely a false one. Both are subjective. But why is this important?

Search algorithms, when they are made transparent to researchers, librarians, and software developers (i.e. they are “open source”), do have at least one distinct advantage over other forms of secondary sources—when they fail, they fail consistently. After the fact or even in close to real-time, it’s possible to re-program the algorithm when it is not behaving as expected.

Another advantage to thinking of algorithms as just another secondary source is that, demystified, they can become a less privileged (or, depending on your point of view, less demonized) part of the research process. The assumption that the magic box will do all of the work for you is just as dangerous as the assumption that the magic box will do nothing for you. Teaching about search algorithms allows for an understanding of them, especially if the search algorithms are clear about which editorial judgments have been prioritized.

Beyond Search, Or How I Learned to Stop Worrying and Love Automated Research Tools

As an employee at Fastcase, Inc. this past summer, I had the opportunity to work on several innovative uses of algorithms in legal research, most notably on the new automated citation-analysis tool Bad Law Bot. Bad Law Bot, at least in its current iteration, works by searching the case law corpus for significant signals—words, phrases, or citations to legal documents—and, based on criteria selected in advance, determines whether a case has been given negative treatment in subsequent cases. The tool is certainly automated, but the algorithm is artisanal—it was massaged and kneaded by caring craftsmen to deliver a premium product. The results it delivered were also tested meticulously to find out where the algorithm had failed. And then the process started over again.
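As a rough illustration of the general approach described above (scan later opinions for signals near a citation, then apply criteria chosen in advance), consider the following sketch. It is not Bad Law Bot and makes no claim about how Fastcase implements it. The signal list, the 80-character window, the function name `findNegativeTreatment`, and the case names and citation are all invented for this example.

```javascript
// Hypothetical sketch: NOT Fastcase's Bad Law Bot.
// The signals and the matching rule are invented for illustration.
const NEGATIVE_SIGNALS = ["overruled", "overruling", "abrogated", "reversed", "superseded"];
const WINDOW = 80; // characters of context checked before each citation

// Report later opinions that use a negative signal shortly before citing
// the target case.
function findNegativeTreatment(citingOpinions, targetCitation) {
  const findings = [];
  for (const opinion of citingOpinions) {
    const text = opinion.text;
    let at = text.indexOf(targetCitation);
    while (at !== -1) {
      const context = text.slice(Math.max(0, at - WINDOW), at).toLowerCase();
      const signal = NEGATIVE_SIGNALS.find((s) => context.includes(s));
      if (signal) findings.push({ citedBy: opinion.name, signal });
      at = text.indexOf(targetCitation, at + targetCitation.length);
    }
  }
  return findings;
}

// Invented opinions and an invented citation, for demonstration only.
const opinions = [
  {
    name: "Later Case A",
    text: "We follow the reasoning of Smith v. Jones, 1 X.2d 1, in full.",
  },
  {
    name: "Later Case B",
    text: "To the extent it holds otherwise, we have overruled Smith v. Jones, 1 X.2d 1.",
  },
];

console.log(findNegativeTreatment(opinions, "1 X.2d 1"));
```

A rule this simple will misfire, for example when an opinion says a case was "not overruled." But it misfires the same way every time, so the failures can be found by testing and the criteria adjusted.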

This is just one example of what I think the future of much general legal research will look like—smart algorithms built and tested by people, taking advantage of near unlimited storage space and ever-increasing computing power to process huge datasets extremely fast. Secondary sources, at least the ones organizing, classifying, and grouping primary law, will no longer be static things. Rather, they will change quickly when new documents are available or new uses for those documents are dreamed up. It will take hard work and a realistic set of expectations to do it well.

Computer assisted legal research cannot be about merely returning ranked lists of relevant results, even as today’s algorithms get better and better at producing these lists. Search must be only one component of a holistic research experience in which the searcher consults many tools which, used together, are greater than the sum of their parts. Many of those tools will be built by information professionals and software engineers using algorithms, and will be capable of being updated and changed as the corpus and user need changes.

It’s time that we stop thinking of algorithms as alien, or other, or too complicated, or scary. Instead, we should think of them as familiar and human, as sets of instructions hand-crafted to help us solve problems with research tools that we have not yet been able to solve, or that we did not know were problems in the first place.

Aaron Kirschenfeld is currently pursuing a dual J.D. / M.S.I.S. at the University of North Carolina at Chapel Hill. His main research interests are legal research instruction, the philosophy and aesthetics of legal citation analysis, and privacy law. You can reach him on Twitter @kirschsubjudice.

His views do not represent those of his part-time employer, Fastcase, Inc. Also, he has never hand-crafted an algorithm, let alone a wheel of cheese, but appreciates the work of those who do immensely.



The first thing we do, let’s kill all the lawyers.
- Henry VI, Pt. 2, Act 4, sc. 2.

This line, delivered by Dick the Butcher (turned revolutionary) in Shakespeare’s Henry VI, is often performed tongue-in-cheek by actors to elicit an expected laugh from the audience. The essence of the line, however, is no joke, and relates to destabilizing the rule of law by removing its agents — those who promote and enforce the law. What no one could predict, including Shakespeare himself, is the horrific precision with which such a deed could be carried out.

The 1994 Genocide in Rwanda showed this horror and more, with upwards of one million killed in the span of three months. The effect on the legal system was particularly devastating: lawyers and the justice sector were targeted, and prosecutors and judges were killed at the outset.

Rwanda’s Justice Sector Development
Since 1994, Rwanda has done a remarkable job rebuilding its society, establishing security, curbing corruption, and creating one of the fastest growing economies in sub-Saharan Africa.

Law Library at the Ministry of Justice, Kigali, Rwanda.

One of the biggest areas of development in Rwanda, and in other areas of the world, has been strengthening justice sector institutions and strengthening the rule of law. In transitional states, especially those developing systems of democratic governance, the creation of online, reliable, and accessible legal information systems is a critical component of good governance. Rwanda’s efforts and opportunities for development in this area are noted below.

From 2010-2011, I played a very small part in this development when I served as a law clerk and legal advisor to then-Chief Justice Aloysie Cyanzayire of the Supreme Court of Rwanda. Working with a USAID-funded project, I was also able to participate in legal education reform and in the development of an online database of laws, the Rwanda Legal Information Portal (Rwanda LIP). In the summer of 2013 I returned to Rwanda, with the support of the American Association of Law Libraries, to visit its law libraries and understand the role of law libraries in legal institutions and overall society. After learning that the Rwanda LIP was no longer updated (and is now offline entirely), investigating Rwanda’s online legal presence became a secondary research goal for the trip. The discovery also highlighted the importance of legal information systems and their role in justice sector reform.

Part of this justice sector reform related to changes in Rwanda’s legal system. Once a Belgian colony, at independence Rwanda inherited a civil law system, codified much of the Belgian civil code, and today the main body of laws comes from enactments of Parliament. Rwanda’s judicial system, rebuilt after the 1994 Genocide, is made up of four levels of courts: District Courts, Provincial Courts, High Courts, and the Supreme Court.

With its civil law roots, courts in Rwanda were largely unconcerned with precedent. As Rwanda became a member of the East African Community in 2007 (and adopted English as an official language), the judiciary started a transition to a hybrid common law system, considering how to assign precedential value to court decisions. With this ongoing transition in Rwanda’s legal system, an online legal information system has become a significant need for legal and civil society.

One of four computer labs, called the “digital library” at Kigali Independent University, with more than 400 computer workstations available for student use.

Online Legal Information Systems
In order to establish the rule of law in a democratic system, citizens must have access, at the very minimum, to laws of a government. To make this access meaningful, a searchable database of laws should be created to allow users of legal information to find laws based on their particular information need. For this reason alone it is important for governments in transitional states to make a commitment to developing online legal information systems.

John Palfrey aptly noted: “In most countries, primary legal information is broadly accessible in one format or another, but it is rarely made accessible online in a stable and reliable format.” This is basically the case in Rwanda. Every law library, every university library, and even the Kigali Public Library has paper copies of the Official Journal (the official laws of Rwanda). Today, however, the only current place to find laws online is through the Prime Minister’s webpage, where PDF copies of the Official Gazette are published. The website (Kinyarwanda for “law”) was frequently used by lawyers and members of the justice sector to search Rwanda’s laws, and allowed the general public not only to access laws, but also to run a full text search for keywords. This site, however, was not updated after 2011, and is now completely offline. The result is that there is no online source for searching Rwanda’s laws.

Law Library at the Parliament of the Republic of Rwanda in Kigali.

Rwanda is using its growing information infrastructure, however, to create other online quasi-legal information databases. For instance, the Rwanda Development Board created an online portal for businesses to access information on “investment related procedures” in Rwanda. The government is also allowing online registration of businesses, streamlining the processes and making it more accessible. These developments make sense with Rwanda’s reforms in the area of economic development, and its recent ranking in the top 30% globally for ease of doing business, and 3rd best in sub-Saharan Africa. While economic reform has driven these changes, justice sector reform has not yet yielded the same results for online legal information systems.

Service counter at the University Library at Kigali Independent University in Rwanda. Students aren’t allowed to browse the library stacks.

Rwanda’s Legal Information Culture
Despite the limited online access to laws, there is a high value placed on legal information in Rwanda. Every legal institution has a law library and a dedicated library staff member (although most don’t have formal education in librarianship or information management). Moreover, members of the justice sector, from staff members to Permanent Secretaries and Ministers, believe libraries and access to legal information are of critical importance. A common theme in Rwanda’s law libraries, however, is the lack of funding. Some libraries have not invested in library materials in years, and have solely relied on donations to add items to their collections. It is not altogether surprising, then, that the Rwanda LIP remained un-funded, and is now completely defunct as an online legal information system. One source close to the Rwanda LIP project indicated that funding has been sought at Parliament, but as of today has yet to be successful.

The Rwanda LIP is perhaps a victim of how it came to be; that is, through donor-funded development. Creating sustainable online databases requires a government commitment of financial support. Like its predecessor, the Rwanda LIP was created through a donor-funded initiative, and when that initiative concluded, the LIP’s source of funding ended with it. For any donor-funded development initiative, sustainability is a key concern, and significant government collaboration is necessary for initiatives to remain after donor-funded projects end. This is especially true of legal information systems, and is perhaps the cause of the Rwanda LIP’s demise. While created in partnership with the Government of Rwanda, the project failed to secure an adequate commitment for continued funding at its outset. Sustainability issues are not unique to Rwanda’s experience with online legal information systems. The availability of financial resources is one of the key challenges to creating a sustainable online database of laws. Working with developing countries in Africa, SAFLII found that sustainability issues come from “shortages of resources, skills and technical services.” While donor-funded projects have serious limitations, others experiencing the sustainability challenge have suggested databases supported by private enterprise, “offering free content as well as value-added services for sale.” One thing is certain: long-term sustainability remains one of the biggest challenges for online legal information systems.

View of the Kigali Public Library in Kigali, Rwanda.

Print to Digital Transition and Overcoming the Digital Divide
In addition to sustainability, the transition from print to digital poses its own complications, and has emerged as a major issue in law libraries, even at the most established institutions. This challenge is especially acute in developing and transitional states, where access to the internet can be a significant obstacle. This problem, known as the “digital divide,” has been described as something that “disproportionately disenfranchises certain segments of society and runs counter to the notion that inclusiveness and opportunity build strong communities and countries.” The divide is wider in developing and transitional states, where there is far less wealth and technological infrastructure for internet connectivity, and a greater disparity in access between and among communities.

Of all countries in the process of developing online legal information systems, however, Rwanda is perhaps the best suited to succeed. With high-speed fibre-optic internet cables recently installed throughout the small East African country, Rwanda has one of the best internet penetration rates in the developing world. So, while Rwanda’s law libraries (and other libraries) throughout the country have print copies of laws, there may be a legitimate opportunity to give a large number of citizens online access. For example, the Kigali Public Library, the flagship institution of the Rwanda Library Services, houses print copies of the laws of Rwanda but also has an internet cafe giving free access to online resources. Kigali Independent University has an “Internet Library” with more than 500 computers for student use. Rwanda’s law libraries are also open and accessible to the public, some of which have computers for use by the public as well. Other libraries, including the law library at the National University of Rwanda, have increasing access to online resources to serve their users.

In Rwanda, a new access to information law (Official Gazette No. 10 of 11.03.2013) makes online legal information even more critical in the developing state, and Rwanda’s current efforts can serve as an example for the importance of modernizing online legal information. The access to information law imposes a positive obligation on the Government of Rwanda, and some private companies working under government contracts, to disclose a broad range of information to the public and press. It has been stated that the law “meets standards of best practice in terms of scope and application” for freedom of information laws. Despite the law’s conditions to withhold information under Article 4, the significant shift in policy and the law’s broad range of information available are very positive signs. This and similar laws across the developing world have created a need for the improvement of existing legal information systems, or the creation of new systems to adequately make available essential legal information. A critical component to the implementation of this law, therefore, is a reliable and sustainable online legal information system.

A view of the volcanoes in the Northern Province of Rwanda.

Lessons Learned from Rwanda’s Experience
While the Rwanda LIP and its predecessor are no longer online, institutions within the justice sector of Rwanda are currently working on solutions. In the meantime, there is no meaningful way to search Rwanda’s laws online. It is possible that a stronger financial commitment at the outset of the Rwanda LIP would have prevented this. In the future, long-term sustainability should be one of the primary qualifications for creating an online system.

In the meantime, there are other ways of expanding Rwanda’s access to online legal information through databases of foreign law and secondary sources. Talking with law librarians in Rwanda, I learned that there is little, if any, research instruction being delivered from law libraries. Even in the few libraries with subscription electronic databases, users aren’t necessarily being directed to relevant legal resources. Furthermore, law librarians generally collect, catalog and retrieve legal materials for users, rather than directing users to relevant sources. Users of legal information in Rwanda (and elsewhere) would be well served by being exposed to other online sources of legal information. Sites like the LII, WorldLII, and the Directory of Open Access Journals offer access to a wealth of free online primary and secondary materials that could be useful to researchers. Creating research guides and offering research instruction in these areas costs very little, and opens up countless resources that could be valuable to users of legal information in Rwanda and elsewhere. Those working in justice sector development should investigate this possibility, in conjunction with creating online legal information systems of domestic laws.

Directional sign outside the Law Faculty at the Independent Institute of Lay Adventist of Kigali.

Finally, the majority of those working as librarians in Rwanda’s law libraries have no formal instruction in library or information science. Nonetheless, it is remarkable that those with little or no formal training are competent librarians. Formal training or not, qualified librarians generally do not have the opportunity to offer research training to users of legal information. Treating law librarians as professionals would open up many opportunities to increase the capacity of users of legal information, and the online resources available.


Brian Anderson is a Reference Librarian and Assistant Professor at the Taggart Law Library at Ohio Northern University. His research involves the use of law libraries and legal information systems to support the rule of law in developing and transitional states. In September 2013 Brian presented two papers at the 2013 Law Via the Internet conference related to this topic; one related to civil society organizations and the use of the internet to strengthen the rule of law, and another about starting online legal information systems from scratch.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.


AT4AM – Authoring Tool for Amendments – is a web editor provided to Members of the European Parliament (MEPs) that has greatly improved the drafting of amendments at the European Parliament since its introduction in 2010.

The tool, developed by the Directorate for Innovation and Technological Support of the European Parliament (DG ITEC), has replaced a system based on a collection of macros developed in MS Word and specific ad hoc templates.

Why move to a web editor?

The need to replace a traditional desktop authoring tool came from the increasing complexity of layout rules combined with a need to automate several processes of the authoring/checking/translation/distribution chain.

In fact, drafters not only faced complex rules and had to search among hundreds of templates in order to get the right one, but the drafting chain for all amendments relied on layout to transmit information down the different processes. Bold / Italic notation or specific tags were used to transmit specific information on the meaning of the text between the services in charge of subsequent revision and translation.

Over the years, an editor that was initially conceived mainly to support the printing of documents came to be used to convey information in an unsuitable manner. During drafting, documents transmitted between different services included a mix of content and layout, where the layout sometimes carried information about the business process that should have been transmitted by other means.

Moreover, encapsulating in one single file all the amendments drafted in 23 languages was a severe limitation for subsequent revisions and translations carried out by linguistic sectors. Experts in charge of legal and linguistic revision of drafted amendments, who need to work in parallel on one document grouping multilingual amendments, were severely hampered in their work.

All the needs listed above justified the EP undertaking a new project to improve the drafting of amendments. The concept was soon extended to the drafting, revision, translation and distribution of the entire legislative content in the European Parliament, and after some months the eParliament Programme was initiated to cover all projects of the parliamentary XML-based drafting chain.

It was clear from the beginning that, in order to provide an advanced web editor, the original proposal to be amended had to be converted into a structured format. After an extensive search, the Akoma Ntoso XML format was chosen, because it is the format that best covers the requirements for drafting legislation. Currently it is possible to export amendments produced via AT4AM in Akoma Ntoso. The plan is to apply the Akoma Ntoso schema to the entire legislative chain within the eParliament Programme. This will enable the EP to publish legislative texts in an open data format.

What distinguishes the approach taken by the EP from that of other legislative actors who handle XML documents is that the EP decided to use XML to feed the legislative chain rather than just converting existing documents into XML for distribution. This aspect is fundamental because requirements are much stricter when the result of XML conversion is used as the first step of the legislative chain. In fact, the proposal coming from the European Commission is first converted into XML and then loaded into AT4AM. Because the tool relies on the XML content, it is important to guarantee a valid structure and coherence between the language versions. The same articles, paragraphs, points and subpoints must appear at the correct position in all 23 language versions of the same text.
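The coherence requirement can be illustrated with a small check that every language version exposes the same sequence of structural elements. What follows is only a minimal sketch in Python: the element names (`article`, `paragraph`, `point`, `subpoint`), the `id` attributes and the sample documents are simplified, hypothetical stand-ins, not the real Akoma Ntoso schema and not the way AT4AM itself performs the check.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified structural tags (not the actual Akoma Ntoso vocabulary).
STRUCTURAL_TAGS = {"article", "paragraph", "point", "subpoint"}


def structure_signature(xml_text):
    """Return the ordered (tag, id) pairs of the structural elements in a document."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.get("id")) for el in root.iter() if el.tag in STRUCTURAL_TAGS]


def find_incoherent_versions(versions):
    """Compare each language version with the first one and list the codes that differ."""
    items = list(versions.items())
    reference = structure_signature(items[0][1])
    return [lang for lang, xml_text in items[1:] if structure_signature(xml_text) != reference]


# Invented sample versions: the German one is missing the second paragraph.
EN = ('<proposal><article id="a1"><paragraph id="a1p1">Text</paragraph>'
      '<paragraph id="a1p2">More</paragraph></article></proposal>')
FR = ('<proposal><article id="a1"><paragraph id="a1p1">Texte</paragraph>'
      '<paragraph id="a1p2">Suite</paragraph></article></proposal>')
DE = '<proposal><article id="a1"><paragraph id="a1p1">Text</paragraph></article></proposal>'

mismatched = find_incoherent_versions({"en": EN, "fr": FR, "de": DE})
print(mismatched)
```

A check of this kind only works because the content is structured: with layout-based Word files there is no reliable way to tell which paragraph in one language corresponds to which paragraph in another.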

What is the situation now?

After two years of intensive usage, Members of the European Parliament have drafted 285,000 amendments via AT4AM. The tool is also used daily by the staff of the secretariat in charge of receiving tabled amendments, checking linguistic and legal accuracy and producing voting lists. Today more than 2,300 users access the system regularly, and no one wants to go back to the traditional methods of drafting. Why?

Because it is much simpler and faster to draft and manage amendments via an editor that takes care of everything, thus allowing drafters to concentrate on their essential activity: modifying the text.

Soon after the introduction of AT4AM, the secretariat’s staff who manage drafted amendments breathed a sigh of relief, because errors like wrong position references, which were the cause of major headaches, no longer occurred.

What is better than a tool that guides drafters through the amending activity by adding all the surrounding information and taking care of all the metadata necessary for subsequent treatment, while letting the drafter focus on the text amendments and produce well-formatted output with track changes?

After some months of usage, it was clear that not only had the time needed to draft, check and translate amendments been drastically reduced, but the quality of amendments had also increased.

The slogan that best describes the strength of this XML editor is: “You are always just two clicks away from tabling an amendment!”



Web editor versus desktop editor: is it an acceptable compromise?

One of the criticisms that users often raise against web editors is that they are limited when compared with a traditional desktop rich editor. The experience at the European Parliament has demonstrated that what users lose in terms of editing features is more than compensated for by the gains from a tool specifically designed to support drafting activity. Moreover, recent technologies enable programmers to develop rich web WYSIWYG (What You See Is What You Get) editors that include many of the traditional features plus new functions specific to a “networking” tool.

What’s next?

The experience of the EP was so positive and so well received by other Parliaments that in May 2012, at the opening of the international workshop “Identifying benefits deriving from the adoption of XML-based chains for drafting legislation”, Vice President Wieland announced the launch of a new project aimed at providing an open source version of the AT4AM code.

On 19 March 2013, in a video conference with the United Nations Department for General Assembly and Conference Management from New York, the UN/DESA’s Africa i-Parliaments Action Plan from Nairobi and the Senate of Italy from Rome, Vice President Wieland announced the availability of AT4AM for All, which is the name given to this open source version, for any parliament and institution interested in taking advantage of this well-oiled IT tool that has made the life of MEPs much easier.

The code has been released under the EUPL (European Union Public Licence), an open source licence provided by the European Commission that is compatible with major open source licences like the GNU GPLv2, with the advantage of being available in the 22 official languages of the European Union.

AT4AM for All is provided with all the important features of the amendment tool used in the European Parliament and can manage all types of legislative content provided in the Akoma Ntoso XML format. This XML standard, developed through the UN/DESA’s Africa i-Parliaments Action Plan initiative, is currently undergoing certification at OASIS, a non-profit consortium that drives the development, convergence and adoption of open standards for the global information society. Those who are interested may have a look at the committee in charge of the certification: LegalDocumentML.

Currently the Documentation Division, Department for General Assembly and Conference Management of United Nations is evaluating the software for possible integration in their tools to manage UN resolutions.

The ambition of the EP is that other Parliaments with fewer resources may take advantage of this development to improve their legislative drafting chain. Moreover, the adoption of such tools allows a Parliament to move towards an XML-based legislative chain. The distribution of legislative content in open document formats like XML allows other parties to process the resulting legislation efficiently.

Thanks to the efforts of European Parliament, any parliament in the world is now able to use the advanced features of AT4AM to support the drafting of amendments. AT4AM will serve as a useful tool for all those interested in moving towards open data solutions and more democratic transparency in the legislative process.

On the AT4AM for All website it is possible to check the status of the work and run a sample editor with several document types. Any interested Parliament can go to the repository and download the code.

Claudio Fabiani is Project Manager at the Directorate-General for Innovation and Technological Support of the European Parliament. After several years in the private sector as an IT consultant, he started his career as a civil servant at the European Commission in 2001, where he managed several IT developments. Since 2008 he has been responsible for the AT4AM project, and more recently he has managed the implementation of AT4AM for All, the open source version.





Maybe it’s a bit late for a summer reading list, or maybe you’re just now starting to pack for your vacation, deep in a Goodreads list that you don’t ever expect to dig your way out of. Well, let us add to your troubles with a handful of books your editors are currently enjoying.

Clearing in the forest : law, life, and mind, by Steven L. Winter. A 2001 cognitive science argument for studying and developing law. Perhaps a little heavy for poolside, one of your editors finds it perfect for multi-day midwestern summer rainstorms, along with a pot of tea. Review by Lawrence Solan in the Brooklyn Law Review, as part of a symposium.

Digital Disconnect: How Capitalism is Turning the Internet Against Democracy, by Robert W. McChesney.

“In Digital Disconnect, Robert McChesney offers a groundbreaking critique of the Internet, urging us to reclaim the democratizing potential of the digital revolution while we still can.”

This is currently playing on my work commute.

The Cognitive Style of PowerPoint: Pitching Out Corrupts Within, by Edward Tufte. Worth re-reading every so often, especially heading into conference/teaching seasons.

Delete: The Virtue of Forgetting in a Digital Age, by VoxPopuLII contributor Viktor Mayer-Schonberger. Winner of the 2010 Marshall McLuhan Award for Outstanding Book in Media ecology, Media Ecology Association; Winner of the 2010 Don K. Price Award for Best Book in Science and Technology Politics, Section on Science, Technology, and Environmental Politics (STEP) by the American Political Science Association. Review at the Times Higher Education.

Piracy: The Intellectual Property Wars from Gutenberg to Gates, by Adrian Johns (2010). A historian’s view of Intellectual Property — or, this has all happened before. Reviews at the Washington Post and the Electronic Frontier Foundation. From the latter, “Radio arose in the shadow of a patent thicket, became the province of tinkers, and posed a puzzle for a government worried that “experimenters” would ruin things by mis-adjusting their sets and flooding the ether with howling oscillation. Many will immediately recognize the parallels to modern controversies about iPhone “jailbreaking,” user innovation, and the future of the Internet.”

The Master Switch: The Rise and Fall of Information Empires, by Tim Wu (2010). A history of communications technologies, and the cyclical (or not) trends of their openness, and a theory on the fate of the Internet. Nice reviews on Ars Technica and The Guardian.

Too Big to Know: Rethinking Knowledge Now That the Facts Aren’t the Facts, Experts Are Everywhere, and the Smartest Person in the Room Is the Room, by David Weinberger (author of the Cluetrain Manifesto). For more, check out this excerpt by Weinberger in The Atlantic and

You are not so smart, by David McRaney. Examines the myth of being intelligent — a very refreshing read for the summer. A review of the book can be found at Brainpickings, which by the way is an excellent blog and definitely worth a look.

On a rainy day you can always check out the BBC series “QI” with a new take on what we think we know but don’t know. Hosted by Stephen Fry. Comedians share their intelligence with witty humour and you will learn a thing or two along the way. The TV show has also led to a few books, e.g. Qi: the Book of General Ignorance (Q1), by John Lloyd


Sparing the cheesy beach reads, here’s a fiction set that you may find interesting.

The Ware Tetralogy: Ware #1-4 , by Rudy Rucker (currently $6.99 for the four-pack)

Rucker’s four Ware novels–Software (1982), Wetware (1988), Freeware (1997), and Realware (2000)–form an extraordinary cyberweird future history with the heft of an epic fantasy novel and the speed of a quantum processor. Still exuberantly fresh despite their age, they primarily follow two characters (and their descendants): Cobb Anderson, who instigated the first robot revolution and is offered immortality by his grateful “children,” and stoner Sta-Hi Mooney, who (against his impaired better judgment) becomes an important figure in robot-human relations. Over several generations, humans, robots, and society evolve, but even weird drugs and the wisdom gathered from interstellar signals won’t stop them from making the same old mistakes in new ways. Rucker is both witty and serious as he combines hard science and sociology with unrelentingly sharp observations of all self-replicating beings. — Publisher’s Weekly

Happy reading! We’ll return mid-August with a feature on AT4AM.



In March, Mike Lissner wrote for this blog about the troubling state of access to case law – noting with dismay that most of the US corpus is not publicly available. While a few states make official cases available, most still do not, and neither does the federal government. At Ravel Law we’re building a new legal research platform and, like Mike, we’ve spent substantial time troubleshooting access to law issues. Here, we will provide some more detail about how official case law is created and share our recommendations for making it more available and usable. We focus in particular on FDsys – the federal judiciary’s effort in this space – but the ideas apply broadly.

The Problem

If you ask a typical federal court clerk, such as our friend Rose, about the provenance of case opinions you will only learn half the story. Rose can tell you that after she and her judge finish an opinion it gets sent to a permanent court staffer. After that the story that Rose knows basically ends. The opinion at this stage is in its “slip” opinion state, and only some time later will Rose see the “official” version – which will have a citation number, copy edits, and perhaps other alterations. Yet, it is only this new “official” version that may be cited in court. For Mike Lissner, for Ravel, and for many others, the crux of the access challenge lies in steps beyond Rose’s domain, beyond the individual court’s, in fact – when a slip becomes an official opinion.

For years the federal government has outsourced the creation of official opinions, relying on Westlaw and Lexis to create and publish them. These publishers are handed slip opinions by court staff, provide some editing, assign citations and release official versions through their systems. As a result, access to case law has been de facto privatized, and restricted.


Of late, however, courts are making some strides to change the nature of this system. The federal judiciary’s primary effort in this regard is FDsys (and also see the 9th Circuit’s recent moves). But FDsys’s present course gives reason to worry that its goals have been too narrowly conceived to achieve serious benefit. This discourages the program’s natural supporters and endangers its chances of success.

We certainly count ourselves amongst FDsys’s strongest supporters, and we applaud the Judicial Conference for its quick work so far. And, as friends of the program, we want to offer feedback about how it might address the substantial skepticism it faces from those in the legal community who want the program to succeed but fear for its ultimate success and usability.

Our understanding is that FDsys’s primary goal is to provide free public access to court opinions. Its strategy for doing so (as inexpensively and as seamlessly as possible) seems to be to fully implement the platform at all federal courts before adding more functionality. This last point is especially critical. Because FDsys only offers slip opinions, which can’t be cited in court, its current usefulness for legal professionals is quite limited; even if every court used FDsys it would only be of marginal value. As a result, the legal community lacks incentive to lend its full, powerful, support to the effort. This support would be valuable in getting courts to adopt the system and in providing technology that could further reduce costs and help to overcome implementation hurdles.

Setting Achievable Goals

We believe that there are several key goals FDsys can accomplish, and that by doing so it will win meaningful support from the legal community and increase its end value and usage. With loftier goals (some modest, others ambitious), FDsys would truly become a world-class opinion publishing system. The following are the goals we suggest, along with metrics that could be used to assess them.



1. Comprehensive Access to Opinions
  - Does every federal court release every published and unpublished opinion?
  - Are the electronic records comprehensive in their historic reach?
2. Opinions that can be Cited in Court
  - Are the official versions of cases provided, not just the slip opinions?
  - And/or, can the version released by FDsys be cited in court?
3. Vendor-Neutral Citations
  - Are the opinions provided with a vendor-neutral citation (using, e.g., paragraph numbers)?
4. Opinions in File Formats that Enable Innovation
  - Are opinions provided in both human and machine-readable formats?
5. Opinions Marked with Meta-Data
  - Is a machine-readable language such as XML used to tag information like case date, title, citation, etc?
  - Is additional markup of information such as sectional breaks, concurrences, etc. provided?
6. Opinions Available in Bulk
  - Are cases accessible via bulk access methods such as FTP or an API?
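To make goals 3 through 5 concrete, here is a minimal sketch in Python of what a tagged opinion with numbered paragraphs might look like, and how a program could build a pinpoint citation from it. The element names (`opinion`, `meta`, `citation`, `p`), the `n` attribute, the case name and the citation "2013 EX 1" are all hypothetical and invented for illustration. This is not a format that FDsys or any court actually publishes.

```python
import xml.etree.ElementTree as ET


def build_opinion(title, date, citation, paragraphs):
    """Build a hypothetical XML record: tagged metadata plus numbered paragraphs."""
    opinion = ET.Element("opinion")
    meta = ET.SubElement(opinion, "meta")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "date").text = date
    ET.SubElement(meta, "citation").text = citation
    body = ET.SubElement(opinion, "body")
    for number, text in enumerate(paragraphs, start=1):
        # The paragraph number, not a page in a printed reporter, anchors the citation.
        ET.SubElement(body, "p", {"n": str(number)}).text = text
    return opinion


def pinpoint(opinion, paragraph_number):
    """Return a vendor-neutral pinpoint citation and the text of the cited paragraph."""
    citation = opinion.findtext("meta/citation")
    paragraph = opinion.find(f"body/p[@n='{paragraph_number}']")
    if paragraph is None:
        raise KeyError(f"no paragraph {paragraph_number}")
    return f"{citation}, para. {paragraph_number}", paragraph.text


# Invented example data, serialized and parsed again as a downstream tool would do.
opinion = build_opinion(
    "Example v. Sample",
    "2013-06-01",
    "2013 EX 1",
    ["The appeal is allowed.", "Costs follow the event."],
)
serialized = ET.tostring(opinion, encoding="unicode")
reloaded = ET.fromstring(serialized)
print(pinpoint(reloaded, 2))
```

Because the date, title, citation and paragraph boundaries are explicit in the markup, nothing has to be guessed from page layout, which is the step that makes PDF conversion expensive and imperfect.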


The first three goals are the basic building blocks necessary to achieve meaningful open-access to the law. As Professor Martin of Cornell Law and others have chronicled, the open-access community has converged around these goals in recent years, and several states (such as Oklahoma) have successfully implemented them with very positive results.

Goals 3-6 involve the electronic format and storage medium used, and are steps that would be low-cost enablers of massive innovation. If one intention of the FDsys project is to support the development of new legal technologies, the data should be made accessible in ways that allow efficient computer processing. Word documents and PDFs do not accomplish this. PDFs, for example, are a fine format for archival storage and human reading, but computers don’t easily read them and converting PDFs into more usable forms is expensive and imperfect.

In contrast, publishing cases at the outset in a machine-readable format is easy and comes at virtually no additional cost. It can be done in addition to publishing in PDF. Courts and the GPO already have electronic versions of cases and with a few mouse clicks could store them in a format that would inspire innovation rather than hamper it. The legal technology community stands ready to assist with advice and development work on all of these issues.

We believe that FDsys is a commendable step toward comprehensive public access to law, and toward enabling innovation in the legal space. Left to its current trajectory, however, it is certain to fall short of its potential. With some changes now, the program could be a home run for the entire legal community, ensuring that clerks like Rose can rest assured that the law as interpreted by her judge is accessible to everyone.


Daniel Lewis and Nik Reed are graduates of Stanford Law School and the co-founders of Ravel Law, a legal search, analytics, and collaboration platform. In 2012, Ravel spun out of a Stanford University Law School, Computer Science Department, and Design School collaborative research effort focused on legal citation networks and information design. The Ravel team includes software engineers and data scientists from Stanford, MIT, and Georgia Tech. You can follow them on Twitter @ravellaw
