
1. The Death and Life of Great Legal Data Standards

Thanks to the many efforts of the open government movement in the past decade, the benefits of machine-readable legal data — legal data which can be processed and easily interpreted by computers — are now widely understood. In the world of government statutes and reports, machine-readability would significantly enhance public transparency, help to increase efficiencies in providing services to the public, and make it possible for innovators to develop third-party services that enhance civic life.

In the universe of private legal data — that of contracts, briefs, and memos — machine-readability would open up vast potential efficiencies within the law firm context, allow the development of novel approaches to processing the law, and would help to drive down the costs of providing legal services.

However, while the benefits are understood, by and large the vision of rendering the vast majority of legal documents into a machine-readable standard has not been realized. While projects do exist to acquire and release statutory language in a machine-readable format (and the government has backed similar initiatives), the vast body of contractual language and other private legal documents remains trapped in a closed universe of hard copies, PDFs, unstructured plaintext and Microsoft Word files.

Though this is a relatively technical point, it has broad policy implications for society at large. Perhaps the biggest upshot is that machine-readability promises to vastly improve access to the legal system, not only for those seeking legal services, but also for those seeking to provide them.

It is not for lack of a standard specification that the status quo exists. Indeed, projects like LegalXML have developed specifications that describe a machine-readable markup for a vast range of different types of legal documents. As of writing, the project includes technical committees working on legislative documents, contracts, court filings, citations, and more.

However, by and large these efforts to develop machine-readable specifications for legal data have only solved part of the problem. Creating the standard is one thing, but actually driving adoption of a legal data standard is another (often more difficult) matter. There are a number of reasons why existing standards have failed to gain traction among the creators of legal data.

For one, the oft-cited aversion of lawyers to technology remains a relevant factor. Particularly in the case of the standardization of legal data, where the projected benefits lie in the future and their magnitude is speculative at the present moment, persuading lawyers and legislatures to adopt a new standard remains a challenge, at best.

Secondly, the financial incentives of some actors may actually be opposed to rendering the universe of legal documents into a machine-readable standard. A universe of largely machine-readable legal documents would also be one in which it may be possible for third parties to develop systems that automate and significantly streamline legal services. In the context of the ever-present billable hour, parties may resist the introduction of technological shifts that enable these efficiencies to emerge.

Third, the costs of converting existing legal data into a machine-readable standard may also pose a significant barrier to adoption. Marking up unstructured legal text can be highly costly depending on the intended machine usage of the document and the type of document in question. The need to persuade a legislature, firm, or other organization with a large existing repository of legal documents to take on large one-time conversion costs further discourages adoption.

These three reinforcing forces erect a significant cultural and economic barrier against the integration of machine-readable standards into the production of legal text. To the extent that one believes in the benefits from standardization for the legal industry and society at large, the issue is — in fact — not how to define a standard, but how to establish one.

2. Rough Consensus, Running Standards

So, how might one go about promulgating a standard? Particularly in a world in which lawyers, the very actors that produce the bulk of legal data, are resistant to change, mere attempts to mobilize the legal community to action are destined to fail in bringing about the fundamental shift necessary to render most if not all legal documents in a common machine-readable format.

In such a context, implementing a standard in a way that removes humans from the loop entirely may, in fact, be more effective. To do so, one might design code that was capable of automatically rendering legal text into a machine-readable format. This code could then be implemented by applications of all kinds, which would output legal documents in a standard format by default. This would include not only the word processors used by lawyers, but also integration with platforms like LegalZoom or RocketLawyer that routinely generate large quantities of legal data. Such a solution would eliminate the need for lawyer involvement in the process of implementing a standard entirely: any text created would be automatically parsed and output in a machine-readable format. Scripts might also be written to identify legal documents online and process them into a common format. As the body of documents rendered in a given format grew, it would be possible for others to write software leveraging the increased penetration of the standard.

There are — obviously — technical limitations in realizing this vision of a generalized legal data parser. For one, designing a truly comprehensive parser is a massively difficult computer science challenge. Legal documents come in a vast diversity of flavors, and no common textual conventions allow for perfectly accurate parsing of the semantic content of any given legal text. Quite simply, any parser will be an imperfect (perhaps highly imperfect) approximation of full machine-readability.

Despite the lack of a perfect solution, an open question exists as to whether or not an extremely rough parsing system, implemented at sufficient scale, would be enough to kickstart the creation of a true common standard for legal text. A popular solution, however imperfect, would encourage others to refine and extend the code. It would also encourage the design of applications for documents rendered in the standard. Beginning from the roughest of parsers, a functional standard might become the platform for a much bigger change in the nature of legal documents. The key is to achieve the “minimal viable standard” that will start the snowball rolling down the hill: the point at which the parser is rendering sufficient legal documents in a common format that additional value can be created by improving the parser and applying it to an ever broader scope of legal data.

But, what is the critical mass of documents one might need? How effective would the parser need to be in order to achieve the initial wave of adoption? Discovering this, and learning whether or not such a strategy would be effective, is at the heart of the Restatement project.

3. Introducing Project Restatement

Supported by a grant from the Knight Foundation Prototype Fund, Restatement is a simple, rough-and-ready system which automatically parses legal text into a basic machine-readable JSON format. It has also been released under the permissive terms of the MIT License, to encourage active experimentation and implementation.

The concept is to develop an easily extensible system that parses legal text and looks for common features to render into a standard format. Our general design principle in developing the parser was to begin with only the simplest features common to nearly all legal documents. This includes the parsing of headers, section information, and “blanks” for inputs in legal documents like contracts. As a demonstration of the potential application of Restatement, we’re also designing a viewer that takes documents rendered in the Restatement format and displays them in a simple, beautiful, web-readable version.
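
By way of illustration, a short contract fragment rendered in this spirit might look something like the JSON below. This is only a sketch of the idea: the field names used here (document, section, blank, and so on) are illustrative assumptions rather than the finalized Restatement schema.

    {
      "type": "document",
      "title": "Mutual Nondisclosure Agreement",
      "sections": [
        {
          "type": "section",
          "number": 1,
          "heading": "Term",
          "content": [
            { "type": "text",  "value": "This Agreement expires " },
            { "type": "blank", "label": "expiration date" },
            { "type": "text",  "value": " after the Effective Date." }
          ]
        }
      ]
    }

A viewer of the kind described above would simply walk this structure, rendering headings and section numbers and presenting each blank as a fillable field.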

Under the hood, Restatement is built entirely upon web technology. This was a deliberate choice, as Restatement aims to provide a usable alternative to document formats like PDF and Microsoft Word. We want to make it easy for developers to write software that displays and modifies legal documents in the browser.

In particular, Restatement is built entirely in JavaScript. The past few years have been exciting for the JavaScript community. We’ve seen an incredible flourishing of not only new projects built on JavaScript, but also new tools for building cool new things with JavaScript. It seemed clear to us that it’s the platform to build on right now, so we wrote the Restatement parser and viewer in JavaScript, and made the Restatement format itself a type of JSON (JavaScript Object Notation) document.

For those who are more technically inclined, we also knew that Restatement needed a parser formalism, that is, a precise way to define how plain text gets transformed into the Restatement format. We became interested in a relatively recent advance in parsing technology called PEG (Parsing Expression Grammar).

PEG parsers are different from other types of parsers; they’re unambiguous. That means that plain text passing through a PEG parser has only one possible valid parsed output. We became excited about using the deterministic property of PEG to mix parsing rules and code, and that’s when we found peg.js.

With peg.js, we can write a grammar that executes JavaScript code as it parses your document. This hybrid approach is super powerful. It allows us to have all of the advantages of using a parser formalism (like speed and unambiguity) while also allowing us to run custom JavaScript code on each bit of your document as it parses. That way we can use an external library, like the Sunlight Foundation’s fantastic citation library, from inside the parser.
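
To make the idea concrete, the fragment below sketches what such a grammar can look like: it splits a document into runs of plain text and fill-in blanks (written in the source as runs of underscores) and returns JSON nodes. It is a simplified, hypothetical example rather than the actual Restatement grammar, and the node names are invented for illustration.

    // Hypothetical peg.js grammar sketch, not the actual Restatement grammar.
    // Each rule's action is ordinary JavaScript that builds a JSON node.

    document
      = parts:(blank / text)* { return { type: "document", content: parts }; }

    // A fill-in blank, written in the source text as a run of underscores.
    blank
      = marks:"_"+ { return { type: "blank", length: marks.length }; }

    // Any run of characters outside a blank.
    text
      = chars:[^_]+ { return { type: "text", value: chars.join("") }; }

Generating and running the parser is then a couple of lines of JavaScript (the generator function is exposed as PEG.buildParser in the peg.js releases current as of this writing, and as peg.generate in later ones):

    var parser = PEG.buildParser(grammarSource); // grammarSource holds the grammar text above
    var doc = parser.parse("The party, ____, agrees to the terms above.");
    // doc.content is now an array of { type: "text", ... } and { type: "blank", length: 4 } nodes.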

Our next step is to prototype an “interactive parser,” a tool for attorneys to define the structure of their documents and see how they parse. Behind the scenes, this interactive parser will generate peg.js programs and run them against plaintext without the user even being aware of how the underlying parser is written. We hope that this approach will provide users with the right balance of power and usability.

4. Moving Forwards

Restatement is going fully operational in June 2014. After launch, the two remaining challenges are to (a) continue expanding the range of legal document features the parser is able to successfully process, and (b) begin widely processing legal documents into the Restatement format.

For the first, we’re encouraging a community of legal technologists to play around with Restatement, break it as much as possible, and give us feedback. Running Restatement against a host of different legal documents and seeing where it fails will expose the areas where the parser needs to be bolstered, expanding its potential applicability as far as possible.

For the second, Restatement will be rendering popular legal documents in the format, and partnering with platforms to integrate Restatement into the legal content they produce. We’re excited to say that at launch Restatement will be releasing the standard form documents used by the startup accelerator Y Combinator, as well as the documents of Series Seed, an open source project around seed financing created by Fenwick & West.

It is worth adding that the Restatement team is always looking for collaborators. If what’s been described here interests you, please drop us a line! I’m available at tim@robotandhwang.org, and on Twitter @RobotandHwang.

 

Jason Boehmig is a corporate attorney at Fenwick & West LLP, a law firm specializing in technology and life science matters. His practice focuses on startups and venture capital, with a particular emphasis on early stage issues. He is an active maintainer of the Series Seed Documents, an open source set of equity financing documents. Prior to attending law school, Jason worked for Lehman Brothers, Inc. as an analyst and then as an associate in their Fixed Income Division.

Tim Hwang currently serves as the managing human partner at the offices of Robot, Robot & Hwang LLP. He is curator and chair for the Stanford Center on Legal Informatics FutureLaw 2014 Conference, and organized the New and Emerging Legal Infrastructures Conference (NELIC) at Berkeley Law in 2010. He is also the founder of the Awesome Foundation for the Arts and Sciences, a distributed, worldwide philanthropic organization founded to provide lightweight grants to projects that forward the interest of awesomeness in the universe. Previously, he has worked at the Berkman Center for Internet and Society at Harvard University, Creative Commons, Mozilla Foundation, and the Electronic Frontier Foundation. For his work, he has appeared in the New York Times, Forbes, Wired Magazine, the Washington Post, the Atlantic Monthly, Fast Company, and the Wall Street Journal, among others. He enjoys ice cream.

Paul Sawaya is a software developer currently working on Restatement, an open source toolkit to parse, manipulate, and publish legal documents on the web. He previously worked on identity at Mozilla, and studied computer science at Hampshire College.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

The recent attention given to government information on the Internet, while laudable in itself, has been largely confined to the Executive Branch. While there is a technocratic appeal to cramming the entire federal bureaucracy into one vast spreadsheet with a wave of the president’s Blackberry, one cannot help but feel that this recent push for transparency has ignored government’s central function, to pass and enforce laws.

Advertisement on data.gov

Whether seen from the legislative or judicial point of view, law is a very prose-centric domain. This is a source of frustration to the mathematicians and computer scientists who hope to analyze it. For example, while the United States Code presents a neat hierarchy at first glance, closer inspection reveals a sprawling narrative, full of quirks and inconsistencies. Even our Constitution, admired worldwide for its brevity and simplicity, has been tortured with centuries of hair-splitting over every word.

Nowhere is this more apparent than in judicial opinions. Unlike most government employees, who must adhere to rigid style manuals; or the general public, who interact with their government almost exclusively through forms; judges are free to write almost anything. They may quote Charles Dickens, or cite Shakespeare. A judicial opinion is one part newspaper report, one part rhetorical argument, and one part short story. Analyzing it mathematically is like trying to understand a painting by measuring how much of each color the artist used. Law students spend three years learning, principally, how to tease meaning out of form, fact out of fiction.

Why does a society, in which a President can be brought down by the definition of “is,” tolerate such ambiguity at the heart of its legal system? (And why, though we obsessively test our children, our athletes, and our attorneys, is our testing of judges such a farce?)

Engineers such as myself cannot tolerate ambiguity, so we feel a natural desire to bring order out of this chaos. The approach du jour may be top-down (taxonomy, classification) or bottom-up (tagging, clustering) but the impulse is the same: we want to tidy up the law. If code is law, as Larry Lessig famously declared, why not transform law into code?

Visualization of the structure of the U.S. Code

This transformation would certainly have advantages (beyond putting law firms out of business). Imagine the economic value of knowing, with mathematical certainty, exactly what the law is. If organizations could calculate legal risk as efficiently as they can now calculate financial risk (recession notwithstanding), millions of dollars in legal fees could be rerouted toward economic growth. All those bright liberal arts graduates who suffer through law school, only to land in dismal careers, could apply themselves to more useful and rewarding occupations.

And yet, despite years of effort, years in which the World Wide Web itself has submitted to computerized organization, the law remains stubbornly resistant to tidying. Why?

There are two answers, depending on what goal we have in mind. If the goal is truly to make tenets of law provable by mechanical (i.e., algorithmic) means, just as the tenets of mathematics are, we fail before we begin. Contrary to lay perception, law is not an exact science. It’s not a science at all (says a lawyer). Computers can answer scientific questions (“What is the diameter of Neptune?”) or bibliographic ones (“What articles has Tim Wu written?”) but cannot make value judgments. Law is all about value judgments, about rights and wrongs. Like many students of artificial intelligence, I believe that I will live to see computers that can make these kinds of judgments, but I do not know if I will live to see a world in which we let them.

The second answer speaks to the goal of information management, and the forms in which law is conveyed. The indexing of the World Wide Web succeeded for two reasons, form and scale. Form, in the case of the Web, means hypertext and universal identifiers. Together, they create a network of relationships among documents, a network which, critically, can be navigated by a computer without human aid. This fact, when realized at the scale of billions of pages containing trillions of hyperlinks, allows a computer to derive useful patterns from a seemingly chaotic mass of information.

3-d visualization of hypertext documents in XanaduSpace™

Law suffers from inadequacies of both form and scale. For example, all federal case law, taken together, would comprise just a few million pages, only a fraction of which are currently available in free, electronic form. In spite of the ubiquity of technology in the nation’s courts and legislatures, the dissemination of law itself, both statutory and common, remains a paper-centric, labor-intensive enterprise. The standard legal citation system is derived from the physical layout of text in bound volumes from a single publisher. Most courts now routinely publish their decisions on the Web, but almost exclusively in PDF form, essentially a photograph of a paper document, with all semantic information (such as paragraph breaks) lost. One almost suspects a conspiracy to keep legal information out of the hands of any entity that lacks the vast human resources needed to reformat, catalog, and cross-index all this paper — in essence, to transform it into hypertext. It’s not such a far-fetched notion; if law were universally available in hypertext form, Google could put Wexis out of business in a week.

Social network of federal judges based on their clerks

But the legal establishment need not be quite so clannish with regard to Silicon Valley. For every intellectual predicting law’s imminent sublimation into the Great Global Computer, there are a hundred more keen to develop useful tools for legal professionals. The application is obvious; lawyers are drowning in information. Not only are dozens of court decisions published every day, but given the speed of modern communications, discovery for a single trial may turn up hundreds of thousands of documents. Computers are superb tools for organizing and visualizing information, and we have barely scratched the surface of what we can do in this area. Law is created as text, but who ever said we have to read it that way? Imagine, for example, animating a section of the U.S. Code to show how it changes over time, or “walking” through a 3-d map of legal doctrines as they split and merge.

Of course, all this is dependent on programmers and designers who have the time, energy, and financial support to create these tools. But it is equally dependent on the legal establishment — courts, legislatures, and attorneys — adopting information-management practices that enable this kind of analysis in the first place. Any such system has three essential parts:

  1. Machine-readable documents, e.g., hypertext
  2. Global identifiers, e.g., URIs
  3. Free and universal access

These requirements are not technically difficult to understand, nor arduous to implement. Even a child can do it, but the establishment’s (well-meaning) attempts have failed both technically and commercially. In the meantime, clever engineers, who might tackle more interesting problems, are preoccupied with issues of access, identification, and proofreading. (I have participated in long, unfruitful discussions about reverse-engineering page numbers. Page numbers!) With the extremely limited legal corpora available in hypertext form — at present, only the U.S. Code, Supreme Court opinions, and a subset of Circuit Court opinions — we lack sufficient data for truly innovative research and applications.

This is really what we mean when we talk about “tidying” the law. We are not asking judges and lawyers to abandon their jobs to some vast, Orwellian legal calculator, but merely to work with engineers to make their profession more amenable to computerized assistance. Until that day of reconciliation, we will continue our efforts, however modest, to make the law more accessible and more comprehensible. Perhaps, along the way, we can make it just a bit tidier.

Stuart Sierra is the technical guy behind AltLaw. He says of himself, “I live in New York City. I have a degree in theatre from NYU/Tisch, and I’m a master’s student in computer science. I work for the Program on Law & Technology at Columbia Law School, where I spend my day hacking on AltLaw, a free legal research site. I’m interested in the intersection of computers and human experience, particularly artificial intelligence, the web, and user interfaces.”

VoxPopuLII is edited by Judith Pratt.

As a comparative law academic, I have had an interest in legal translation for some time.  I’m not alone.  In our overseas programs at Nagoya University, we teach students from East and Central Asia who have a keen interest in the workings of other legal systems in the region, including Japan. We would like to supply them with an accessible base of primary resources on which to ground their research projects. At present, we don’t.  We can’t, as a practical matter, because the source for such material, the market for legal translation, is broken at its foundation.  One of my idle dreams is that one day it might be fixed. The desiderata are plain enough, and simple to describe. To be useful as a base for exploring the law (as opposed to explaining it), I reckon that a reference archive based on translated material should have the following characteristics:

  • Intelligibility: Text should of course be readable (as opposed to unreadable), and terms of art should be consistent across multiple laws, so that texts can safely be read together.
  • Coverage: A critical mass of material must be available. The Civil Code is not of much practical use without the Code of Civil Procedure and supporting instruments.
  • Currency: If it is out of date, its academic value is substantially reduced, and its practical value vanishes almost entirely. If it is not known to be up to date, the vanishing happens much more quickly.
  • Accessibility: Bare text is nice, but a reference resource ought to be enriched with cross-references, indexes, links to relevant cases, and the original text on which the translation is based.
  • Sustainability: Isolated efforts are of limited utility. There must be a sustained incentive to maintain the archive over time.

In an annoying confluence of market incentives, these criteria do not travel well together.  International law firms may have the superb in-house capabilities that they claim, but they are decidedly not in the business of disseminating information.  As for publishers, the large cost of achieving significant coverage means that the incentive to maintain and enhance accuracy and readability declines in proportion to the scope of laws translated by a given service.  As a result, no commercial product performs well on both of the first two criteria, and there is consequently little market incentive to move beyond them and attend to the remaining items in the list. So much for the invisible hand.

When markets fail, government can provide, of course, but a government itself is inevitably driven by well-focused interests (such as foreign investors) more than by wider communities (divorcing spouses, members of a foreign labor force, or, well, my students).  Bureaucratic initiatives tend to take on a life of their own, and without effective market signals, it is hard to measure how well real needs are actually being met.  In any case, barring special circumstances such as those obtaining within the EU, the problem of sustainability ever lurks in the background.

Unfortunately, these impediments to supply on terms truly attractive to the consumer are not limited to a single jurisdiction with particularly misguided policies; the same dismal logic applies everywhere (in a recent article, Carol Lawson provides an excellent and somewhat hopeful review of the status quo in Japan).  At the root of our discomfiture are, I think, two factors: the cookie-cutter application of copyright protection to this category of material; and a lack of adequate, recognized, and meaningful standards for legal translation (and of tools to apply them efficiently in editorial practice). The former raises an unnecessary barrier to entry. The latter saps value by aggravating agency problems, and raises risk for both suppliers and consumers of legal translations.

I first toyed with this problem a decade ago, in a fading conference paper now unknown to search engines (but still available through the kind offices of the Web Archive). At the time, I was preoccupied with the problem of barriers to entry and the dog-in-the-manger business strategies that they foster, and this led me to think of the translation conundrum as an intractable, self-sustaining Gordian knot of conflicting interests, capable of resolution only through a sudden change in the rules of the game. Developments in subsequent years, in Japan and elsewhere, have taught me that both the optimism and the pessimism embedded in that view may have been misplaced. The emergence of standards, slow and uncertain though it be, may be our best hope of improvement over time.

To be clear, the objective is not freedom as in free beer. Reducing the cost of individual statutory translations is less important than fostering an environment in which (a) scarce resources are not wasted in the competitive generation of identical content within private or protected containers; and (b) there is a reasonably clear and predictable relationship between quality (in terms of the list above) and cost. Resolving such problems is a common role for standards, both formal and informal. It is not immediately clear how far voluntary standards can penetrate a complex, dispersed and often closed activity like the legal translation service sector — but one need not look far for cases in which an idea about standardization achieved acceptance on its merits and went on to have a significant impact on behavior in a similarly fragmented and dysfunctional market. There is at least room for hope.

In 2006, as part of a Japanese government effort to improve the business environment (for that vocal group of foreign investors referred to above), an interdisciplinary research group in my own university led by Yoshiharu Matsuura and Katsuhiko Toyama released the first edition of a standard bilingual dictionary for legal translation (the SBD) to the Web. Aimed initially at easing the burden of the translation initiative on hard-pressed government officials charged with implementing it, the SBD has since gone through successive revisions, and recently found a new home on a web portal providing government-sponsored statutory translations. (This project is one of two major translation initiatives launched in the same period, the other being a funded drive to render a significant number of court decisions into English).

The benefits of the Standard Bilingual Dictionary are evident in new translations emerging in connection with the project. Other contributors to this space will have more to say about the technology and workflows underlying the SBD, and the roadmap for its future development. My personal concern is that it achieve its proper status, not only as a reference and foundation source for side products, but as a community standard. Paradoxically, restricting the licensing terms for distribution may be the simplest and most effective way of disseminating it as an industry standard.  A form of license requiring attribution to the SBD maintainers, and prohibiting modification of the content without permission, would give commercial actors an incentive to return feedback to the project.  I certainly hope that the leaders of the project will consider such a scheme, as it would help assure that their important efforts are not dissipated in a flurry of conflicting marketplace “improvements” affixed, one must assume, with more restrictive licensing policies.

There is certainly something to be said for making changes in the way that copyright applies to translated law more generally. The peak demand for law in translation comes at the point of first enactment or revision. Given the limited pool of translator time available, once a translation is prepared and published, there is a case to be made for a compulsory licensing system, as a means of widening the channel of dissemination, while protecting the economic interest of translators and their sponsors. The current regime, providing (in the case of Japan) for exclusive rights of reproduction for a period extending to fifty years from the death of the author (Japanese Copyright Act, section 51), really makes no sense in this field. As a practical matter, we must depend on legislatures, of course, for core reform of this kind. Alas, given the recent track record on copyright reform among influential legislative bodies in the United States and Europe, I fear that we may be in for a very long wait. In the meantime, we can nonetheless move the game forward by adopting prudent licensing strategies for standards-based products that promise to move this important industry to the next level.

Frank Bennett is an Associate Professor in the Graduate School of Law at Nagoya University.

VoxPopuLII is edited by Judith Pratt.

Last year, I had the incredible opportunity of spending five weeks in Port Vila, Vanuatu, working on a research project to evaluate the impact of free access to law in the Pacific Islands. While spending part of my days at the Law Faculty of the University of the South Pacific (the other part I spent at the beach), I came to realize that most academics present were working in the Pacific Languages Unit, another teaching program based at the Emalus Campus of the University. Linguists, anthropologists and ethnologists have undertaken the large task of deciphering, writing down and analyzing some of the approximately 110 different languages that are still spoken by the 215,000 Ni-Vanuatu. With an average of 1,700 speakers for each language in 1996, Vanuatu has the world record for language density.

Apart from the beaches, the reason for the recent scientific interest in the numerous languages of this otherwise unknown archipelago is that many of them are at risk of disappearing. Some, such as Ifo, have already become extinct. Others, like Araki, will die in the years to come along with the village elders. Young people instead choose to speak Bislama, an English-based creole that is more widely disseminated. Many inhabitants also speak either English or French, a legacy of the British-French Condominium under which these islands were ruled for several decades.

This depressing reality is the most obvious sign that the once vibrant Ni-Vanuatu cultures are dying—not from a slowly evolving cancer, but from a dazzling flesh-eating disease. We cannot even take the comforting position that this phenomenon is limited to a few distant islands, because the same symptoms appear in a large number of countries, and not only in the developing world. Native American Indians are in the same situation. I personally live only a few kilometers from lands granted to the Mohawk Nation and can tell you that aside from a sign near the town hall, you would never guess that Mohawks have their own language.

Although language is the most visible component of cultural identity, the arts are another universally recognized reflection of a culture.  (Hollywood movie producers show their “concern” for this by trying to convince us that file sharing is a major threat to culture!) But for French Canadians like me, law can also be added to that list. When the British Crown granted the Act of Quebec in 1774, it not only allowed French Canadians to keep their language and religion, but also their legal system (at least in private matters). Without this document, the French culture would have certainly declined and maybe disappeared from North America over the next two centuries. It is a paradox that the Act of Quebec sprang from the troubled years preceding the birth of our giant neighbor, which is now the biggest threat to our culture. Still, the fact that our ancestors managed to keep their own law accounts for the preservation of such legal concepts as unjust enrichment and the obligation to assist a person in danger. The evolution of these laws through a few centuries of intellectual isolation has uniquely shaped our legal system and continuously reflects the values of the people of Quebec. In the end, although the French Canadian accent is more often associated with our cultural identity, our dual legal system (common law / civil law) undoubtedly contributes to it.

If you admit that legal systems are subcomponents of cultures, there is no question that their diversity should be protected, or at least promoted. Unfortunately, the trend has been going in the opposite direction for a long time. Many legal systems have disappeared with the rise of modern states. Many more vanished during the colonial era, as it spread common law and civil law all around the globe. But more recently, it is the lack of accessibility in a period characterized by the free flow of information that is causing most of the damage to legal diversity. While the laws and customs of many countries in the developing world are still difficult to find at all, especially in electronic form, the legal documents that are the most easily accessible on the Internet receive unprecedented attention. For example, all across French-speaking Western Africa, French jurisprudence is more often cited than local law. This can be explained by the simple fact that local decisions are generally impossible to access. In Burkina Faso, the only remaining copies of historical decisions from the Cour de Cassation are piled under a staircase (at least they were in 2004 when I last visited the building). In contrast, every decision rendered since 1989 from the equivalent French court is freely accessible online on Legifrance, and all those published in the court bulletins are also available back to 1960. Another illustration of this problem is the ever-increasing number of citations of the decisions of the European courts at the international level. Without any doubt, the new leadership taken by these institutions, particularly in the field of humanitarian law, is a major factor in this phenomenon. But the fact that many of these decisions are freely distributed online in 23 different languages cannot be ignored either.

These two illustrations help to expose a fact that is becoming harder and harder to deny: the accessibility of legal information influences what the law is and how it evolves. It does so internally by generating competition for authority among the various recognized legal sources. It does so externally by facilitating the incorporation of foreign legal concepts or doctrines that are more effectively disseminated. The examples given above also underline the crucial role played by the Internet and free access to law in this equation. When it comes to finding legal material to cite, jurists prefer to search libraries rather than underneath staircases. They prefer to browse online instead of walking among alleys of books. They prefer freely accessible websites to highly expensive databases. The goal here is not to openly attack the prevalent legal theories, beliefs in the hierarchy of norms, or the importance of conducting all-inclusive legal research. But these findings imply that jurists do not hesitate to bend the pillars of legal positivism when confronted with necessity.

That brings me back to Vanuatu and the Pacific Islands. In this region, most countries acquired their independence after the end of the Second World War, and inherited the common law as the basis for their own legal systems. Local rules of conflict resolution were not totally annihilated by colonization, but were instead relegated to unofficial proceedings or trivial matters. With independence, we could hope that the local, customary rules that remain significant in the modern world would somehow resurface in a system of judge-made law. But this is wishful thinking that ignores the accessibility issue. First, customary law being unwritten by definition, its accessibility was hindered right from the start. Second, for years judges, lawyers and other legal practitioners from the Pacific almost exclusively had access to case reports printed in the United Kingdom. A few local initiatives at case reporting did occur over the last few decades, but the coverage was irregular and far too small to sustain a modern state’s appetite for precedents. Third, it should be added that many foreign judges who had never been confronted with the local customs and traditions remained in their positions until very recently. For these reasons, jurists from the Pacific continued to apply the latest British judgments as if they were the law in force in their own countries.

Fortunately, the Internet has brought changes in this regard. In 1997, the law library of the University of the South Pacific started to publish online cases collected from many of the jurisdictions of the region. The database grew steadily and around 2001 the Pacific Legal Information Institute (PacLII) was created in order to expand the experiment. Today, PacLII publishes more than 30,000 cases from 15 different Pacific island countries, in addition to legislative and treaty material. For most of these countries, PacLII is the sole point of access to national legal information.

In the course of my stay there I compiled the usage statistics of the PacLII website. My initial goal was to determine if local cases are downloaded by locals or instead by international users looking for foreign legal information. It turned out that five Pacific island countries are intensive users of the decisions disseminated on PacLII: Fiji (71); Vanuatu (62); Samoa (47); Kiribati (33); and Solomon Islands (21). (The number in parentheses is the number of decision files downloaded per 1,000 inhabitants in 2007.) Not surprisingly, those five countries are the ones for which PacLII has the most comprehensive databases. The only exception is Papua New Guinea, where an alternative national publisher also provides online access to cases.

More relevant for legal diversity are the numbers that came out of my analysis of citations included inside the local judgments rendered over the period 1997-2007. In parallel with the development of PacLII, citations of national decisions increased by 42% in the five countries already mentioned. Citations of regional decisions (citations from other Pacific Islands countries) increased by 462%, although they still occur only occasionally. In comparison, those percentages stagnated in the other nine countries that do not use PacLII with equivalent enthusiasm. (Papua New Guinea is excluded here because the national access provider has its own influence.)

Those numbers indicate that the online dissemination of Pacific Islands cases has had a noticeable impact on the legal systems of five countries. While it is still too early to write about the creation of a Pacific jurisprudence, there is no doubt that local decisions are slowly but surely replacing foreign cases as the primary source of precedents in those countries. It appears that free access to law has finally reversed the long-established trend of ever-increasing foreign legal influence in the region.

Even if this new phenomenon is particularly acute in Vanuatu itself, the dying legal cultures of this archipelago will certainly not be saved by this achievement alone. Many specificities of the customary dispute resolution mechanisms originating in the area must have already disappeared definitively, some for the better, some for the worse. Nevertheless, Ni-Vanuatu now possess a means to promote their own vision of what the law is and how it should be implemented. If use of this new tool does not succeed in salvaging pieces of their traditional legal cultures, at least it should help them build a new one.

Pierre-Paul Lemyre is in charge of the research and development activities of LexUM, the research group that runs the Canadian Legal Information Institute (CanLII). Previously, he was in charge of the business development of LexUM, particularly on the international stage, where he built relationships with numerous funding agencies and local partners.

VoxPopuLII is edited by Judith Pratt.