1. The Death and Life of Great Legal Data Standards

Thanks to the many efforts of the open government movement in the past decade, the benefits of machine-readable legal data (legal data which can be processed and easily interpreted by computers) are now widely understood. In the world of government statutes and reports, machine-readability would significantly enhance public transparency, help to increase efficiencies in providing services to the public, and make it possible for innovators to develop third-party services that enhance civic life.

In the universe of private legal data — that of contracts, briefs, and memos — machine-readability would open up vast potential efficiencies within the law firm context, allow the development of novel approaches to processing the law, and would help to drive down the costs of providing legal services.

However, while the benefits are understood, by and large the vision of rendering the vast majority of legal documents into a machine-readable standard has not been realized. While projects do exist to acquire and release statutory language in a machine-readable format (and the government has backed similar initiatives), the vast body of contractual language and other private legal documents remains trapped in a closed universe of hard copies, PDFs, unstructured plaintext and Microsoft Word files.

Though this is a relatively technical point, it has broad policy implications for society at large. Perhaps the biggest upshot is that machine-readability promises to vastly improve access to the legal system, not only for those seeking legal services, but also for those seeking to provide legal services, as well.

It is not for lack of a standard specification that the status quo exists. Indeed, projects like LegalXML have developed specifications that describe a machine-readable markup for a vast range of different types of legal documents. As of writing, the project includes technical committees working on legislative documents, contracts, court filings, citations, and more.

However, by and large these efforts to develop machine-readable specifications for legal data have only solved part of the problem. Creating the standard is one thing, but actually driving adoption of a legal data standard is another (often more difficult) matter. There are a number of reasons why existing standards have failed to gain traction among the creators of legal data.

For one, the oft-cited aversion of lawyers to technology remains a relevant factor. Particularly in the case of the standardization of legal data, where the projected benefits lie in the future and their magnitude is speculative at present, persuading lawyers and legislatures to adopt a new standard remains a challenge, at best.

Second, the financial incentives of some actors may actually run counter to rendering the universe of legal documents into a machine-readable standard. A universe of largely machine-readable legal documents would also be one in which third parties could develop systems that automate and significantly streamline legal services. In the context of the ever-present billable hour, parties may resist the introduction of technological shifts that enable these efficiencies to emerge.

Third, the costs of converting existing legal data into a machine-readable standard may also pose a significant barrier to adoption. Marking up unstructured legal text can be highly costly, depending on the intended machine usage of the document and the type of document in question. A legislature, firm, or other organization with a large existing repository of legal documents is unlikely to be easily persuaded to take on the large one-time cost of rendering those documents into a common standard, which further discourages adoption.

These three reinforcing forces erect a significant cultural and economic barrier against the integration of machine-readable standards into the production of legal text. To the extent that one believes in the benefits from standardization for the legal industry and society at large, the issue is — in fact — not how to define a standard, but how to establish one.

2. Rough Consensus, Running Standards

So, how might one go about promulgating a standard? Particularly in a world in which lawyers, the very actors that produce the bulk of legal data, are resistant to change, mere attempts to mobilize the legal community to action are destined to fail in bringing about the fundamental shift necessary to render most if not all legal documents in a common machine-readable format.

In such a context, implementing a standard in a way that removes humans from the loop entirely may, in fact, be more effective. To do so, one might design code capable of automatically rendering legal text into a machine-readable format. This code could then be implemented by applications of all kinds, which would output legal documents in a standard format by default. These would include not only the word processors used by lawyers, but also platforms like LegalZoom or RocketLawyer that routinely generate large quantities of legal data. Such a solution would eliminate the need for lawyer involvement in implementing a standard entirely: any text created would be automatically parsed and output in a machine-readable format. Scripts might also be written to identify legal documents online and process them into a common format. As the body of documents rendered in a given format grew, it would be possible for others to write software leveraging the increased penetration of the standard.

There are, obviously, technical limitations in realizing this vision of a generalized legal data parser. For one, designing a truly comprehensive parser is a massively difficult computer science challenge. Legal documents come in a vast diversity of flavors, and no common textual conventions allow for perfectly accurate parsing of the semantic content of any given legal text. Quite simply, any parser will be an imperfect (perhaps highly imperfect) approximation of full machine-readability.

Despite the lack of a perfect solution, an open question exists as to whether or not an extremely rough parsing system, implemented at sufficient scale, would be enough to kickstart the creation of a true common standard for legal text. A popular solution, however imperfect, would encourage others to implement nuances to the code. It would also encourage the design of applications for documents rendered in the standard. Beginning from the roughest of parsers, a functional standard might become the platform for a much bigger change in the nature of legal documents. The key is to achieve the “minimal viable standard” that will begin the snowball rolling down the hill: the point at which the parser is rendering sufficient legal documents in a common format that additional value can be created by improving the parser and applying it to an ever broader scope of legal data.

But, what is the critical mass of documents one might need? How effective would the parser need to be in order to achieve the initial wave of adoption? Discovering this, and learning whether or not such a strategy would be effective, is at the heart of the Restatement project.

3. Introducing Project Restatement

Supported by a grant from the Knight Foundation Prototype Fund, Restatement is a simple, rough-and-ready system which automatically parses legal text into a basic machine-readable JSON format. It has also been released under the permissive terms of the MIT License, to encourage active experimentation and implementation.

The concept is to develop an easily-extensible system which parses through legal text and looks for some common features to render into a standard format. Our general design principle in developing the parser was to begin with only the most simple features common to nearly all legal documents. This includes the parsing of headers, section information, and “blanks” for inputs in legal documents like contracts. As a demonstration of the potential application of Restatement, we’re also designing a viewer that takes documents rendered in the Restatement format and displays them in a simple, beautiful, web-readable version.
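As a rough illustration of the idea, the sketch below scans plain text for the three features named above (headers, section information, and blanks) and emits JSON. The output shape, the function names `parseLegalText` and `splitBlanks`, and the textual conventions it looks for are all assumptions made for this example; they are not the actual Restatement format or code.

```javascript
// Hypothetical sketch: not the real Restatement parser or output format.
// Assumed conventions: a header is a line in ALL CAPS, a section starts with
// a number like "1." or "2.3", and a blank is a run of 3+ underscores.
function parseLegalText(text) {
  const doc = { type: "document", children: [] };
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue; // skip empty lines

    const section = /^(\d+(?:\.\d+)*)\.?\s+(.*)$/.exec(line);
    if (section) {
      doc.children.push({
        type: "section",
        number: section[1],
        content: splitBlanks(section[2]),
      });
    } else if (/[A-Z]/.test(line) && line === line.toUpperCase()) {
      doc.children.push({ type: "header", text: line });
    } else {
      doc.children.push({ type: "paragraph", content: splitBlanks(line) });
    }
  }
  return doc;
}

// Split a line into text runs and "blank" placeholders for later input.
function splitBlanks(line) {
  return line
    .split(/(_{3,})/)
    .filter((part) => part !== "")
    .map((part) =>
      /^_{3,}$/.test(part) ? { type: "blank" } : { type: "text", value: part }
    );
}

const sample = [
  "CONSULTING AGREEMENT",
  "1. Parties. This agreement is between ____ and the Company.",
  "2. Term. The term begins on ______.",
].join("\n");

const parsed = parseLegalText(sample);
console.log(JSON.stringify(parsed, null, 2));
```

A parser this crude will misread plenty of real documents (any line that happens to start with a number becomes a section), which is the kind of failure a "minimal viable standard" would expect contributors to refine over time.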

Underneath the hood, Restatement is all built upon web technology. This was a deliberate choice, as Restatement aims to provide a usable alternative to document formats like PDF and Microsoft Word. We want to make it easy for developers to write software that displays and modifies legal documents in the browser.

In particular, Restatement is built entirely in JavaScript. The past few years have been exciting for the JavaScript community. We've seen an incredible flourishing of not only new projects built on JavaScript, but also new tools for building cool new things with JavaScript. It seemed clear to us that it's the platform to build on right now, so we wrote the Restatement parser and viewer in JavaScript, and made the Restatement format itself a type of JSON (JavaScript Object Notation) document.

For those who are more technically inclined, we also knew that Restatement needed a parser formalism, that is, a precise way to define how plain text gets transformed into the Restatement format. We became interested in a recent advance in parsing technology called PEG (Parsing Expression Grammar).

PEG parsers are different from other types of parsers; they're unambiguous. That means that plain text passing through a PEG parser has only one possible valid parsed output. We became excited about using the deterministic property of PEG to mix parsing rules and code, and that's when we found peg.js.

With peg.js, we can generate a grammar that executes JavaScript code as it parses your document. This hybrid approach is super powerful. It allows us to have all of the advantages of using a parser formalism (like speed and unambiguity) while also allowing us to run custom JavaScript code on each bit of your document as it parses. That way we can use an external library, like the Sunlight Foundation's fantastic citation, from inside the parser.
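To make the two ideas above concrete (ordered choice yielding a single parse, and JavaScript actions running as text is matched), here is a minimal sketch that hand-rolls a few PEG-style combinators in plain JavaScript. It is not peg.js and not Restatement's grammar; the names `literal`, `regex`, `sequence`, `choice`, and `action` and the toy grammar are hypothetical.

```javascript
// Hypothetical PEG-style combinators (not peg.js). A parser is a function
// (input, pos) => { value, pos } on success, or null on failure.
const literal = (s) => (input, pos) =>
  input.startsWith(s, pos) ? { value: s, pos: pos + s.length } : null;

const regex = (re) => {
  const sticky = new RegExp(re.source, "y"); // "y" anchors the match at lastIndex
  return (input, pos) => {
    sticky.lastIndex = pos;
    const m = sticky.exec(input);
    return m ? { value: m[0], pos: pos + m[0].length } : null;
  };
};

// Sequence: every part must match, one after another.
const sequence = (...parsers) => (input, pos) => {
  const values = [];
  for (const p of parsers) {
    const r = p(input, pos);
    if (!r) return null;
    values.push(r.value);
    pos = r.pos;
  }
  return { value: values, pos };
};

// Ordered choice: the first alternative that matches wins, so a given input
// never has two competing parses.
const choice = (...parsers) => (input, pos) => {
  for (const p of parsers) {
    const r = p(input, pos);
    if (r) return r;
  }
  return null;
};

// Semantic action: run arbitrary JavaScript on a matched value.
const action = (parser, fn) => (input, pos) => {
  const r = parser(input, pos);
  return r ? { value: fn(r.value), pos: r.pos } : null;
};

// Toy grammar: a line is either a numbered section heading or plain text.
const sectionLine = action(
  sequence(regex(/\d+/), literal(". "), regex(/.+/)),
  ([number, , title]) => ({ type: "section", number: Number(number), title })
);
const textLine = action(regex(/.+/), (value) => ({ type: "text", value }));
const line = choice(sectionLine, textLine);

const first = line("3. Governing Law", 0).value;
const second = line("This is ordinary text.", 0).value;
console.log(first, second);
```

Because `sectionLine` is listed before `textLine`, the input "3. Governing Law" is always read as a section, even though `textLine` could also match it. The function passed to `action` is where a call to an external library could sit.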

Our next step is to prototype an "interactive parser," a tool for attorneys to define the structure of their documents and see how they parse. Behind the scenes, this interactive parser will generate peg.js programs and run them against plaintext without the user even being aware of how the underlying parser is written. We hope that this approach will provide users with the right balance of power and usability.

4. Moving Forwards

Restatement is going fully operational in June 2014. After launch, the two remaining challenges are to (a) continue expanding the range of legal document features the parser can successfully process, and (b) begin widely processing legal documents into the Restatement format.

For the first, we’re encouraging a community of legal technologists to play around with Restatement, break it as much as possible, and give us feedback. Running Restatement against a host of different legal documents and seeing where it fails will expose the areas where the parser needs bolstering, so that its applicability can be expanded as far as possible.

For the second, Restatement will be rendering popular legal documents in the format, and partnering with platforms to integrate Restatement into the legal content they produce. We’re excited to say on launch Restatement will be releasing the standard form documents used by the startup accelerator Y Combinator, and Series Seed, an open source project around seed financing created by Fenwick & West.

It is worth adding that the Restatement team is always looking for collaborators. If what’s been described here interests you, please drop us a line! I’m available at tim@robotandhwang.org, and on Twitter @RobotandHwang.

 

Jason Boehmig is a corporate attorney at Fenwick & West LLP, a law firm specializing in technology and life science matters. His practice focuses on startups and venture capital, with a particular emphasis on early stage issues. He is an active maintainer of the Series Seed Documents, an open source set of equity financing documents. Prior to attending law school, Jason worked for Lehman Brothers, Inc. as an analyst and then as an associate in their Fixed Income Division.

Tim Hwang currently serves as the managing human partner at the offices of Robot, Robot & Hwang LLP. He is curator and chair for the Stanford Center on Legal Informatics FutureLaw 2014 Conference, and organized the New and Emerging Legal Infrastructures Conference (NELIC) at Berkeley Law in 2010. He is also the founder of the Awesome Foundation for the Arts and Sciences, a distributed, worldwide philanthropic organization founded to provide lightweight grants to projects that forward the interest of awesomeness in the universe. Previously, he has worked at the Berkman Center for Internet and Society at Harvard University, Creative Commons, Mozilla Foundation, and the Electronic Frontier Foundation. For his work, he has appeared in the New York Times, Forbes, Wired Magazine, the Washington Post, the Atlantic Monthly, Fast Company, and the Wall Street Journal, among others. He enjoys ice cream.

Paul Sawaya is a software developer currently working on Restatement, an open source toolkit to parse, manipulate, and publish legal documents on the web. He previously worked on identity at Mozilla, and studied computer science at Hampshire College.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

The first thing we do, let's kill all the lawyers.
- Henry VI, Pt. 2, Act 4, sc. 2.

This line, delivered by Dick the Butcher (turned revolutionary) in Shakespeare's Henry VI, is often performed tongue-in-cheek by actors to elicit an expected laugh from the audience. The essence of the line, however, is no joke, and relates to destabilizing the rule of law by removing its agents -- those who promote and enforce the law. What no one could predict, including Shakespeare himself, is the horrific precision with which such a deed could be carried out.

The 1994 Genocide in Rwanda showed this horror and more, with upwards of one million killed in the span of three months. The effect on the legal system was particularly devastating, with the targeting of lawyers and the justice sector, resulting in the targeted killing of prosecutors and judges at its outset.

Rwanda's Justice Sector Development
Since 1994, Rwanda has done a remarkable job rebuilding its society, establishing security, curbing corruption, and creating one of the fastest growing economies in sub-Saharan Africa.

Law Library at the Ministry of Justice, Kigali, Rwanda.

One of the biggest areas of development in Rwanda, and in other areas of the world, has been strengthening justice sector institutions and strengthening the rule of law. In transitional states, especially those developing systems of democratic governance, the creation of online, reliable, and accessible legal information systems is a critical component of good governance. Rwanda's efforts and opportunities for development in this area are noted below.

From 2010 to 2011, I played a very small part in this development when I served as a law clerk and legal advisor to then-Chief Justice Aloysie Cyanzayire of the Supreme Court of Rwanda. Working with a USAID-funded project, I was also able to participate in legal education reform and in the development of an online database of laws, the Rwanda Legal Information Portal (Rwanda LIP). In the summer of 2013 I returned to Rwanda, with the support of the American Association of Law Libraries, to visit its law libraries and understand the role of law libraries in legal institutions and overall society. After I learned that the Rwanda LIP was no longer updated (and is now offline entirely), investigating Rwanda's online legal presence became a secondary research goal for the trip. The discovery also highlighted the importance of legal information systems and their role in justice sector reform.

Part of this justice sector reform related to changes in Rwanda's legal system. Once a Belgian colony, at independence Rwanda inherited a civil law system, codified much of the Belgian civil code, and today the main body of laws comes from enactments of Parliament. Rwanda's judicial system, rebuilt after the 1994 Genocide, is made up of four levels of courts: District Courts, Provincial Courts, High Courts, and the Supreme Court.

With its civil law roots, courts in Rwanda were largely unconcerned with precedent. As Rwanda became a member of the East African Community in 2007 (and adopted English as an official language), the judiciary started a transition to a hybrid common law system, considering how to assign precedential value to court decisions. With this ongoing transition in Rwanda's legal system, an online legal information system has become a significant need for legal and civil society.

One of four computer labs, called the "digital library" at Kigali Independent University, with more than 400 computer workstations available for student use.

Online Legal Information Systems
In order to establish the rule of law in a democratic system, citizens must have access, at the very minimum, to laws of a government. To make this access meaningful, a searchable database of laws should be created to allow users of legal information to find laws based on their particular information need. For this reason alone it is important for governments in transitional states to make a commitment to developing online legal information systems.

John Palfrey aptly noted: "In most countries, primary legal information is broadly accessible in one format or another, but it is rarely made accessible online in a stable and reliable format." This is basically the case in Rwanda. Every law library, university library, and even the Kigali Public Library has paper copies of the Official Journal -- the official laws of Rwanda. Today, however, the only current place to find laws online is through the Prime Minister's webpage, where PDF copies of the Official Gazette are published. The website Amategeko.net (Kinyarwanda for "law") was frequently used by lawyers and members of the justice sector to search Rwanda's laws, and allowed the general public not only to access laws, but also to run a full text search for keywords. This site, however, was not updated after 2011, and is now completely offline. The result is that there is no online source for searching Rwanda's laws.

Law Library at the Parliament of the Republic of Rwanda in Kigali.

Rwanda is using its growing information infrastructure, however, to create other online quasi-legal information databases. For instance, the Rwanda Development Board created an online portal for businesses to access information on "investment related procedures" in Rwanda. The government is also allowing online registration of businesses, streamlining the processes and making it more accessible. These developments make sense with Rwanda's reforms in the area of economic development, and its recent ranking in the top 30% globally for ease of doing business, and 3rd best in sub-Saharan Africa. While economic reform has driven these changes, justice sector reform has not yet yielded the same results for online legal information systems.

Service counter at the University Library at Kigali Independent University in Rwanda. Students aren't allowed to browse the library stacks.

Rwanda's Legal Information Culture
Despite the limited online access to laws, there is a high value placed on legal information in Rwanda. Every legal institution has a law library and a dedicated library staff member (although most don't have formal education in librarianship or information management). Moreover, members of the justice sector, from staff members to Permanent Secretaries and Ministers, believe libraries and access to legal information are of critical importance. A common theme in Rwanda's law libraries, however, is the lack of funding. Some libraries have not invested in library materials in years, and have solely relied on donations to add items to their collections. It is not altogether surprising, then, that the Rwanda LIP remained un-funded, and is now completely defunct as an online legal information system. One source close to the Rwanda LIP project indicated that funding has been sought at Parliament, but as of today has yet to be successful.

The failure of the Rwanda LIP is perhaps a consequence of how it came to be; that is, through donor-funded development. Creating sustainable online databases requires a government commitment of financial support. Just as with Amategeko.net before it, the Rwanda LIP was created through a donor-funded initiative, and at its conclusion the LIP's source of funding also ended. For any donor-funded development initiative, sustainability is a key concern, and significant government collaboration is necessary for initiatives to remain after donor-funded projects end. This concept is especially true with legal information systems, and is perhaps the cause of the Rwanda LIP's demise. While created in partnership with the Government of Rwanda, it failed to adequately secure a commitment for continued funding at its outset.

Sustainability issues are not unique to Rwanda's experience with online legal information systems. The availability of financial resources is one of the key challenges to creating a sustainable online database of laws. Working with developing countries in Africa, SAFLII found that sustainability issues come from "shortages of resources, skills and technical services." While donor-funded projects have serious limitations, others experiencing the sustainability challenge have suggested databases supported by private enterprise, "offering free content as well as value-added services for sale." One thing is certain: long-term sustainability remains one of the biggest challenges for online legal information systems.

View of the Kigali Public Library in Kigali, Rwanda.

Print to Digital Transition and Overcoming the Digital Divide
In addition to sustainability, the transition from print to digital poses its own complications, and has emerged as a major issue in law libraries, even at the most established institutions. This challenge is especially acute in the context of developing and transitional states, where access to the internet can pose a significant challenge. This problem, known as the "digital divide," has been described as something that "disproportionately disenfranchises certain segments of society and runs counter to the notion that inclusiveness and opportunity build strong communities and countries." This is an even larger problem in developing and transitional states, where there is far less wealth and technological infrastructure for internet connectivity, and a greater disparity in access between and among communities.

Of all countries in the process of developing online legal information systems, however, Rwanda is perhaps the best suited to succeed. With high-speed fibre-optic internet cables recently installed throughout the small East African country, Rwanda has one of the best internet penetration rates in the developing world. So, while Rwanda's law libraries (and other libraries) throughout the country have print copies of laws, there may be a legitimate opportunity to give a large number of citizens online access. For example, the Kigali Public Library, the flagship institution of the Rwanda Library Services, houses print copies of the laws of Rwanda but also has an internet cafe giving free access to online resources. Kigali Independent University has an "Internet Library" with more than 500 computers for student use. Rwanda's law libraries are also open and accessible to the public, some of which have computers for use by the public as well. Other libraries, including the law library at the National University of Rwanda, have increasing access to online resources to serve their users.

In Rwanda, a new access to information law (Official Gazette No. 10 of 11.03.2013) makes online legal information even more critical in the developing state, and Rwanda's current efforts can serve as an example for the importance of modernizing online legal information. The access to information law imposes a positive obligation on the Government of Rwanda, and some private companies working under government contracts, to disclose a broad range of information to the public and press. It has been stated that the law "meets standards of best practice in terms of scope and application" for freedom of information laws. Despite the law's conditions to withhold information under Article 4, the significant shift in policy and the law's broad range of information available are very positive signs. This and similar laws across the developing world have created a need for the improvement of existing legal information systems, or the creation of new systems to adequately make available essential legal information. A critical component to the implementation of this law, therefore, is a reliable and sustainable online legal information system.

A view of the volcanoes in the Northern Province of Rwanda.

Lessons Learned from Rwanda's Experience
While Amategeko.net and the Rwanda LIP are no longer online, institutions within the justice sector of Rwanda are currently working on solutions. In the meantime, there is no meaningful way to search Rwanda's laws online. It is possible that a stronger financial commitment at the outset of the Rwanda LIP would have solved this. In the future, long-term sustainability should be one of the primary qualifications for creating an online system.

In the meantime, there are other ways of expanding Rwanda's access to online legal information through databases of foreign law and secondary sources. Talking with law librarians in Rwanda, I learned that there is little, if any, research instruction being delivered from law libraries. Even in the few libraries with subscription electronic databases, users aren't necessarily being directed to relevant legal resources. Furthermore, law librarians generally collect, catalog and retrieve legal materials for users, rather than directing users to relevant sources. Users of legal information in Rwanda (and elsewhere) would be well served by being exposed to other online sources of legal information. Sites like the LII, WorldLII, and the Directory of Open Access Journals offer access to a wealth of free online primary and secondary materials that could be useful to researchers. Creating research guides and offering research instruction in these areas costs very little, and opens up countless resources that could be valuable to users of legal information in Rwanda, and elsewhere. Those working in justice sector development should investigate the possibility of this, in conjunction with creating online legal information systems of domestic laws.

Directional sign outside the Law Faculty at the Independent Institute of Lay Adventist of Kigali.

Finally, the majority of those working as librarians in Rwanda's law libraries have no formal instruction in library or information science. Nonetheless, it is remarkable that those with little or no formal training are competent librarians. Formal training or not, qualified librarians generally do not have the opportunity to offer research training to users of legal information. Treating law librarians as professionals would open up many opportunities to increase the capacity of users of legal information, and the online resources available.

 

Brian Anderson is a Reference Librarian and Assistant Professor at the Taggart Law Library at Ohio Northern University. His research involves the use of law libraries and legal information systems to support the rule of law in developing and transitional states. In September 2013 Brian presented two papers at the 2013 Law Via the Internet conference related to this topic; one related to civil society organizations and the use of the internet to strengthen the rule of law, and another about starting online legal information systems from scratch.

There has been a series of efforts to create a national legislative data standard: one master XML format to which all states will adhere for bills, laws, and regulations. Those efforts have gone poorly.

Few states provide bulk downloads of their laws. None provide APIs. Although nearly all states provide websites for people to read state laws, they are all objectively terrible, in ways that demonstrate that they were probably pretty impressive in 1995. Despite the clear need for improved online display of laws, the lack of a standard data format and the general lack of bulk data have enabled precious few efforts in the private sector. (Notably, there is Robb Shecter's WebLaws.org, which provides vastly improved experiences for the laws of California, Oregon, and New York. There was also a site built experimentally by Ari Hershowitz that was used as a platform for last year's California Laws Hackathon.)

A significant obstacle to prior efforts has been the perceived need to create a single standard, one that will accommodate the various textual legal structures that are employed throughout government. This is a significant practical hurdle on its own, but failure is all but guaranteed by also engaging major stakeholders and governments to establish a standard that will enjoy wide support and adoption.

What if we could stop letting the perfect be the enemy of the good? What if we ignore the needs of the outliers, and establish a "good enough" system, one that will at first simply work for most governments? And what if we completely skip the step of establishing a standard XML format? Wouldn't that get us something, a thing superior to the nothing that we currently have?

The State Decoded
This is the philosophy behind The State Decoded. Funded by the John S. and James L. Knight Foundation, The State Decoded is a free, open source program to put legal codes online, and it does so by simply skipping over the problems that have hampered prior efforts. The project does not aspire to create any state law websites on its own but, instead, to provide the software to enable others to do so.

Still in its development (it's at version 0.4), The State Decoded leaves it to each implementer to gather up the contents of the legal code in question and interface it with the program's internal API. This could be done via screen-scraping off of an existing state code website, modifying the parser to deal with a bulk XML file, converting input data into the program's simple XML import format, or by a few other methods. While a non-trivial task, it's something that can be knocked out in an afternoon, thus avoiding the need to create a universal data format and to persuade Wexis to provide their data in that format.

The magic happens after the initial data import. The State Decoded takes that raw legal text and uses it to populate a complete, fully functional website for end-users to search and browse those laws. By packaging the Solr search engine and employing some basic textual analysis, every law is cross-referenced with other laws that cite it and laws that are textually similar. If there exists a repository of legal decisions for the jurisdiction in question, that can be incorporated, too, displaying a list of the court cases that cite each section. Definitions are detected, loaded into a dictionary, and make the laws self-documenting. End users can post comments to each law. Bulk downloads are created, letting people get a copy of the entire legal code, its structural elements, or the automatically assembled dictionary. And there's a REST-ful, JSON-based API, ready to be used by third parties. All of this is done automatically, quickly, and seamlessly. The time elapsed varies, depending on server power and the length of the legal code, but it generally takes about twenty minutes from start to finish.
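The cross-referencing and definition detection described above can be approximated with simple pattern matching. The sketch below is an illustration only, written in JavaScript for consistency with the other examples here; it is not The State Decoded's actual code. The citation pattern ("§ 1-101"), the `"Term" means ...` convention, the sample sections, and every function name are assumptions.

```javascript
// Hypothetical sketch, not The State Decoded's implementation.
// Assumed input: an array of { number, text } sections, where citations look
// like "§ 1-101" and definitions look like: "Term" means ...
function findCitations(text) {
  const cited = new Set();
  for (const m of text.matchAll(/§\s*(\d+(?:[-.]\d+)*)/g)) cited.add(m[1]);
  return [...cited];
}

// Build a reverse index: section number -> sections that cite it.
function buildCitedBy(sections) {
  const citedBy = new Map();
  for (const { number, text } of sections) {
    for (const target of findCitations(text)) {
      if (target === number) continue; // ignore self-references
      if (!citedBy.has(target)) citedBy.set(target, []);
      citedBy.get(target).push(number);
    }
  }
  return citedBy;
}

// Collect defined terms into a dictionary keyed by lowercase term.
function buildDictionary(sections) {
  const dictionary = new Map();
  for (const { number, text } of sections) {
    for (const m of text.matchAll(/"([^"]+)" means ([^.]+)\./g)) {
      dictionary.set(m[1].toLowerCase(), { definition: m[2], source: number });
    }
  }
  return dictionary;
}

// Made-up sample sections for demonstration.
const sections = [
  { number: "1-101", text: '"Vehicle" means any device used for transport.' },
  { number: "1-102", text: "A vehicle registered under § 1-103 is exempt." },
  { number: "1-103", text: "Registration follows § 1-101 and § 1-102." },
];

const citedBy = buildCitedBy(sections);
const dictionary = buildDictionary(sections);
console.log(citedBy.get("1-101"), dictionary.get("vehicle"));
```

With a reverse index like `citedBy`, a page for a section can list the laws that cite it, and a dictionary like this one is what would let defined terms be explained wherever they appear.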

The State Decoded is a free program, released under the GNU General Public License. Anybody can use it to make legal codes more accessible online. There are no strings attached.

It has already been deployed in two states, Virginia and Florida, despite not actually being a finished project yet.

State Variations
The striking variations in the structures of legal codes within the U.S. required the establishment of an appropriately flexible system to store and render those codes. Some legal codes are broad and shallow (e.g., Louisiana, Oklahoma), while others are narrow and deep (e.g., Connecticut, Delaware). Some list their sections by natural sort order, some in decimal, and a few arbitrarily switch between the two. Many have quirks that will require further work to accommodate.
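The difference between the two orderings is easy to see in a small sketch. The Python below is illustrative only: the function name `natural_key`, the `SORT_KEYS` table, and the sample section numbers are assumptions, not part of The State Decoded.

```python
import re
from decimal import Decimal

def natural_key(section):
    """Natural order: digit runs compare as integers, so 2.9 precedes 2.10."""
    return [
        (0, int(part)) if part.isdigit() else (1, part)
        for part in re.findall(r"\d+|\D+", section)
    ]

# Hypothetical per-jurisdiction setting that selects the ordering rule.
# Decimal order treats "2.10" as the number 2.1, so it precedes 2.9.
SORT_KEYS = {"natural": natural_key, "decimal": Decimal}

numbers = ["2.9", "2.10", "2.15", "10.1"]
natural_order = sorted(numbers, key=SORT_KEYS["natural"])
decimal_order = sorted(numbers, key=SORT_KEYS["decimal"])
# natural_order is ['2.9', '2.10', '2.15', '10.1']
# decimal_order is ['2.10', '2.15', '2.9', '10.1']
```

A code that switches between the two conventions partway through cannot be handled by a single key function, which is one reason such quirks have to be dealt with case by case.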

For example, California does not provide a catch line for its laws, but just a section number. One must read through a law to know what it actually does, rather than being able to glance at the title and get the general idea. Because this is a wildly impractical approach for a state code, the private sector has picked up the slack - Westlaw and LexisNexis each write their own titles for those laws, neatly solving the problem for those with the financial resources to pay for those companies' offerings. To handle a problem like this, The State Decoded either needs to be able to display legal codes that lack section titles, or pointedly not support this inferior approach, and instead support the incorporation of third-party sources of titles. In California, this might mean mining the section titles used internally by the California Law Revision Commission, and populating the section titles with those. (And then providing a bulk download of that data, allowing it to become a common standard for California's section titles.)

Many state codes have oddities like this. The State Decoded combines flexibility with open source code to make it possible to deal with these quirks on a case-by-case basis. The alternative approach is too convoluted and quixotic to consider.

Regulations
There is strong interest in seeing this software adapted to handle regulations, especially from cash-strapped state governments looking to modernize their regulatory delivery process. Although this process is still in an early stage, it looks like rather few modifications will be required to support the storage and display of regulations within The State Decoded.

More significant modifications would be needed to integrate registers of regulations, but the substantial public benefits that would provide make it an obvious and necessary enhancement. The present process required to identify the latest version of a regulation is the stuff of parody. To select a state at random, here are the instructions provided on Kansas's website:

To find the latest version of a regulation online, a person should first check the table of contents in the most current Kansas Register, then the Index to Regulations in the most current Kansas Register, then the current K.A.R. Supplement, then the Kansas Administrative Regulations. If the regulation is found at any of these sequential steps, stop and consider that version the most recent.

If Kansas has electronic versions of all this data, it seems almost punitive not to put it all in one place, rather than forcing people to look in four places. It seems self-evident that the current Kansas Register, the Index to Regulations, the K.A.R. Supplement, and the Kansas Administrative Regulations should have APIs, with a common API atop all four, which would make it trivial to present somebody with the current version of a regulation with a single request. By indexing registers of regulations in the manner that The State Decoded indexes court opinions, it would at least be possible to show people all activity around a given regulation, if not simply show them the present version of it, since surely that is all that most people want.
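The four-step procedure quoted above amounts to a cascading lookup, and a common API could perform it in one call. The following Python sketch assumes each source can be queried by regulation number; the function `find_latest`, the in-memory dictionaries standing in for the four sources, and the regulation numbers are all hypothetical placeholders, since no such Kansas APIs are described here.

```python
def find_latest(regulation_id, sources):
    """Check each source in priority order and stop at the first hit."""
    for name, lookup in sources:
        version = lookup(regulation_id)
        if version is not None:
            return name, version
    return None

# Placeholder data standing in for the four sources, newest first.
register_contents = {}
register_index = {"1-1-1": "text as amended in the current Register"}
kar_supplement = {"1-1-1": "older supplement text", "2-2-2": "supplement text"}
kar_bound = {"1-1-1": "oldest text", "2-2-2": "bound text", "3-3-3": "bound text"}

sources = [
    ("Kansas Register, table of contents", register_contents.get),
    ("Kansas Register, Index to Regulations", register_index.get),
    ("K.A.R. Supplement", kar_supplement.get),
    ("Kansas Administrative Regulations", kar_bound.get),
]

latest = find_latest("2-2-2", sources)
```

Because the sources are ordered from most to least recent, the first one that knows the regulation is by definition the current version, which is exactly the rule that the instructions ask a person to apply by hand.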

A Tapestry of Data
In a way, what makes The State Decoded interesting is not anything that it actually does, but instead what others might do with the data that it emits. By capitalizing on the program's API and healthy collection of bulk downloads, clever individuals will surely devise uses for state legal data that cannot presently be envisioned.

The structural value of state laws is evident when considered within the context of other open government data.

Major open government efforts are confined largely to the upper-right quadrant of this diagram - those matters concerned with elections and legislation. There is also some excellent work being done in opening up access to court rulings, indexing scholarly publications, and nascent work in indexing the official opinions of attorneys general. But the latter group cannot be connected to the former group without opening up access to state laws. Courts do not make rulings about bills, of course - it is laws with which they concern themselves. Law journals cite far more laws than they do bills. To weave a seamless tapestry of data that connects court decisions to state laws to legislation to election results to campaign contributions, it is necessary to have a source of rich data about state laws. The State Decoded aims to provide that data.

Next Steps
The most important next step for The State Decoded is to complete it, releasing a version 1.0 of the software. It has dozens of outstanding issues - both bug fixes and new features - so this process will require some months. In that period, the project will continue to work with individuals and organizations in states throughout the nation who are interested in deploying The State Decoded to help them get started.

Ideally, The State Decoded will be obviated by states providing both bulk data and better websites for their codes and regulations. But in the current economic climate, neither is likely to be prioritized within state budgets, so unfortunately there's liable to remain a need for the data provided by The State Decoded for some years to come. The day when it is rendered useless will be a good day.

Waldo Jaquith is a website developer with the Miller Center at the University of Virginia in Charlottesville, Virginia. He is a News Challenge Fellow with the John S. and James L. Knight Foundation and runs Richmond Sunlight, an open legislative service for Virginia. Jaquith previously worked for the White House Office of Science and Technology Policy, for which he developed Ethics.gov, and is now a member of the White House Open Data Working Group.
[Editor's Note: For topic-related VoxPopuLII posts please see: Ari Hershowitz & Grant Vergottini, Standardizing the World's Legal Information - One Hackathon At a Time; Courtney Minick, Universal Citation for State Codes; John Sheridan, Legislation.gov.uk; and Robb Schecter, The Recipe for Better Legal Information Services.]

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

The Swedish legal publisher Notisum AB has been on the Swedish market for online legal publishing since 1996.  Our Internet-based law book at www.notisum.se is read by more than 50,000 persons per week and our customers range from municipalities and government institutions to Swedish multinationals.

Now we are heading for China, and I would like to share with you some practical experiences from this highly dynamic market and our challenges in trying to conquer it.

The case for a legal monitoring tool, codenamed “EnviTool”

In close co-operation with our customers, we had developed a set of specialized Internet-based tools in Sweden for supporting the process of legal compliance and legal information sharing within big organizations. The key driver of these needs was the growing number of certificates issued under the international environmental management standard ISO 14001:2004.

ISO 14001 is a worldwide industry standard to help companies to improve their environmental performance through the implementation of an environmental management system. There is much to say about management systems. Continuous improvement is the heart of the matter--it is all about doing the right things right. Establish a plan, do what you planned, check your results and then start all over by correcting your plans. Plan, Do, Check, Act.

According to the standard, you have to identify the relevant environmental legislation for your organization. You need access to those laws and regulations, and you have to keep an updated list. You should also make the information available to the people of your organization.

By providing an online legal register, monitored for changes, with a whole set of information sharing and workflow features, Notisum helps the certified companies to comply with the environmental legislation.

We developed this system step by step.  When it came to going outside the borders of the Kingdom of Sweden, we changed the name from Rättsnätet+Miljö to EnviTool.

The case for China

Sweden is a country with very high penetration of the ISO 14001 standard, and the use of the standard is in a mature phase in most organizations. China, on the other hand, is number one in the world, with more than 70,000 certificates issued. The growth is double-digit. So China is the place to be if you have products for this specific customer group. The users of the standard are still immature in China, so we knew there were some challenges out there.

The market for legal information tools is immature overall in China, and legal compliance is not always at the top of managers’ priority lists. However, Notisum took the first steps, starting in 2009, to take on the challenge of making China our second home market. Many challenges, expected and unexpected, were waiting for EnviTool.

Step one – the product

Like many commercial ventures, the EnviTool project was the result of a randomly started chain of events. Our Swedish CEO was playing golf with a professor at KTH, the Royal Institute of Technology in Stockholm.  The professor was in charge of a student exchange program between National University of Singapore (NUS) and KTH. We were asked to host an internship for an ambitious computer science student in our company for one academic year.

The internship was successful: our student did a great job and we learned a lot about Asia and Chinese culture. We have now hosted three excellent NUS students from Singapore, all good representatives of their university and their country, and all of them bilingual in English and Chinese. That's when we decided that China would be an interesting market to try. And yes – China is far away from Sweden, it is terribly big and it was really too large a challenge for our company. We wanted to try anyway, with the hope that Singapore could be the bridge for us.

We decided to start a subsidiary in Singapore and so we did. It is easy, by the way. According to the World Bank, Singapore ranks number one in the world in ease of doing business. Coming from Sweden, ranked number 50 in the world in terms of how easily you pay your taxes, I had an almost religious moment when we got a letter of gratitude from the Singaporean tax authorities after paying our taxes. Not so in Sweden, I may add…

With the first NUS intern now as our first employee, we started translating and adapting our Internet tool together with our development manager in Sweden. The technological challenges were there, of course. We base our technology on the Microsoft .NET platform, but support for the simplified Chinese character set was not fully implemented everywhere. Multi-language support was developed, and in the beginning there were plenty of occasions when Swedish words popped up unexpectedly. The search function in Chinese is different in EnviTool, and the relations between the legislative documents were so different from those in Swedish and European law that we had to re-design our database structure.

Step two – the market research

With good help from the Swedish Trade Council in China, we did market research to see if there could be a similar market in China and if our business model could work.

After three journeys and two projects together with the trade council, we decided to give it a try. The EnviTool China project was about to take off. Learning to eat properly with chopsticks was part of the experience. Learning to appreciate the Chinese food was easier although there are some zoological challenges there too, outside the scope of this blog entry.

At this point in time we also employed a Chinese/Swedish project manager with extensive knowledge and experience in the field.

Step three – the content

Translating the tool to Chinese and English was the easy part. When it came to the content, we had to throw out everything from Sweden and put in Chinese legislation and comments. We soon found interesting challenges.

Our first experience of the Chinese legal tradition, which is in many ways different from where we come from, was the search for a standard for citations. In the Swedish databases we had successfully used computer software to automatically find citations, law titles, cross references and other document data. It became clear to us that there were no shortcuts in the Chinese material. We had to input all data manually.

We decided to restrict the information to cover relevant legislation in the EHS (Environmental, Health & Safety) and CSR (Corporate Social Responsibility) fields and to concentrate on the national level, with some provincial/municipal areas like Beijing and Shanghai. The EHS/CSR users are professionals in their field of work and their industries. They are not lawyers and are not very used to legal information systems. EnviTool was developed with EHS/CSR managers in mind. We wrote the editorial content to suit the needs of our target audience.

We realized that we needed a partner in China to provide fast and timely information. In ChinaLawInfo, established by Peking University in association with the university’s Legal Information Center, we found a great partner. They are the most important legal information provider in China and we saw that Notisum of Sweden and ChinaLawInfo had many similarities in experience and way of working. Yes, we are small and they are big, but that goes for Sweden and China all over. So EnviTool now provides the EHS/CSR laws and regulations from both ChinaLawInfo and government sources. We also have an on-going editorial co-operation in Beijing.

By now we also had good content. The EnviTool Internet service and database, provided from our Singapore company servers, were released in their first version in the fall of 2010.

Step four – market introduction

If company start-up was a short track in Singapore, it was a longer journey in the world’s second biggest economy. After we had tried 50 other names, EnviTool was finally translated to 安纬同 in Chinese and we got the business permit in August 2011.

We employed the people we needed and found a partner to help us with HR and finance issues.  Since then we have started our sales and marketing activities, moving slowly forward. The use of legal information tools served from Singapore is combined with management consulting from our team in Shanghai. We provide training in using the tool and can assist the clients in finding the laws and regulations relevant to their operation.

The second generation of the site is up and running at www.envitool.com and we are proud to have customers from China, the US, Japan and four different European countries.

What we have learned and what we think of the future

To get to know China and the Chinese people is of course one part of the fun. Being a European, you make many mistakes, sometimes because of language, sometimes cultural.

One example of this confusion was when I intervened in the editorial process. In EnviTool we provide bilingual Chinese/English short and long comments to laws and regulations. In the Swedish service, which I am more familiar with, the short comment is rendered in italics with the longer comment below in plain text. In the English version of the comments in EnviTool, the short one was not in italics. I complained and our programmer quickly changed this. Shortly thereafter, at a customer meeting, I showed the comments, now in the Chinese language version. (I don't understand a word of Chinese.) Can you imagine Chinese characters in italics? I can tell you, it makes no sense and it looks bad. That was the language mistake. The cultural mistake was managerial. A Swedish employee would have told me how stupid I was, if I came up with such a bad idea. The Asian employee (highly intelligent and highly educated) probably saw the problem and maybe thought “the boss is more stupid than usual, but he is my boss so I had better do what he tells me!”. A lot to learn, many aspects to consider.

To conclude, the start-up was a bit slow because of the red tape, but so far our government contacts have been smooth. We have felt very welcome with Chinese authorities such as the Ministry of Environmental Protection and local governments. In the end, our goals are similar: better environmental and occupational health & safety legal compliance, and with it a better environment and a better life for the citizens.

We know it will take a long time for us to get the knowledge and experience needed to be a significant player in the Chinese market, and we are prepared to stay there and step by step build our presence.  It took many years to build a loyal and substantial customer base in Sweden. It will take even longer in China.

 

Magnus Svernlöv is the founder and chairman of the Swedish online legal publisher Notisum (www.notisum.se) and its Chinese subsidiary Envitool (www.envitool.cn). He holds an MBA from INSEAD, France, an MScEE degree from Chalmers University of Technology, Sweden and a BA from the School of Business, Economics and Law, University of Gothenburg, Sweden. He welcomes any comment or feedback to ms@notisum.se

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed. The information above should not be considered legal advice. If you require legal representation, please consult a lawyer.

 

Farmland outside Matatiele

My father was, as was his father before him, a country lawyer in a remote but very beautiful part of South Africa, in the foothills of the Maluti mountains on the border between South Africa and Lesotho. Prominent in his legal office near the Magistrate's Court were shelves of leather-bound volumes of South African statutes, cases, and law reports, which I found impressive, with their gold blocking on red spines. Even back then, South African lawyers were well supplied with legal publications, the production of which dated back to the mid-19th century, when a Dutch immigrant, Jan Carel Juta (who was married to Karl Marx's sister), published the first law reports. This means that the legal profession in South Africa has access to a century and a half of legal records, something of undoubted value, given that many African countries have no legal publications at all.

If it was a court day, one could hear from my father's office the hubbub of conversations in Sotho, Xhosa, English, and Afrikaans floating down the road from outside the Magistrate's Court, where blanket-clad Sotho men down from the mountains had tied up their horses at a hitching post alongside police vans and farmers' trucks.

Rural settlements

This was Wild West country in the 19th century -- and cross-border cattle rustling cases continue to figure large -- but when I grew up, in the wake of the Second World War, it presented itself as a quiet village, in a prosperous farming area surrounded by very large 'trust lands' (in colonial- and apartheid-speak) of traditional Black peasant communities, where the place names were those of the presiding chiefs. This naming was a symptom of the colonial manipulation of the legal system, described by Mahmood Mamdani, to impose an autocratic and patriarchal 'customary' system, a heritage that lingers on in a democratic South Africa. In a legal practice like my father's, there was a startling dichotomy between the well-paid work done for the prosperous white community with its commercial- and property-law needs, and the customary-law and criminal cases that came from the overwhelmingly larger black communities, dependent on legal aid or paying their fees in small cash installments to a clerk in a back office.

Village traders

I was thus aware at a young age of conflicting values at the intersection between western concepts of the law, its formal and Latinate expression and punctilious enforcement, and the needs of rural black communities; the problematic role that language played in the adversarial ritual of criminal court procedure, alien to many participants; and the difficulties inherent in responding to the needs of very large and widely geographically dispersed poor and disenfranchised communities. The stories my father told about his days in court as a defending attorney were often tales of incomprehension compounded by mistranslation.

This rural setting provides a vivid and useful map of divergent needs for access to legal information in the complexity of an African context. In fact this setting throws a stark spotlight on issues of legal access that are easily obscured in the global North. In an urban setting in South Africa, the issues would differ in their details but would be broadly the same: the question is how to bridge the gap between the formalities and rituals of colonially-based and imported legal discourse and the ways in which the legal system impacts on the lives of most of the population. In this context, how does one transform into action Nick Holmes's concerns, as expressed in his VoxPopuLII blog, about making the law accessible, i.e., suited to meeting the needs of citizens and lawyers in less privileged practices, in an appropriate language and format? Or, to use Isabel Moncion's distinction between the law and justice, how does one communicate the law in such a way as to reach the people who need the information? And lastly -- of vital importance in an African setting where resources are scarce -- how does one make such a publishing enterprise sustainable?

I do not come to this discussion with a legal training. I would have become a lawyer, no doubt, like the generations of my father's family, but 1950s gender stereotypes got in the way. Instead, I became an academic publisher, and then a consultant and researcher on the potential of digital media in Africa. This trajectory gives a particular coloration to my concerns for access to legal information in Africa: my approach brings together an acknowledgement of the need for professional skills and sustainability with an awareness of the serious limitations of the current publishing regime in providing comprehensive access to legal information.

Law publishing in South Africa

The fact that South Africa has a well-established legal publishing sector sets that nation apart from the rest of Africa. The strength of the legal publishing industry is a reflection not only of South Africa's prosperity, but also of the distinctiveness of the South African legal system, a fusion of Roman-Dutch and British legal traditions. The uniqueness of this system meant that South African law publishing could not rely on purely British sources, and gave local South African legal publishers a market not subject to competition from Britain. However, the nature of this legal system also gave it a tendency, at least in its early stages, towards a particularly impenetrable mode of expression, fueled by the Latinisms of its Roman roots.

Lawyers in practice, the legal departments of big companies, and the courts are relatively well served by the South African legal publishing industry, and the system is self-sustaining. However, there are problems. One is that the industry still clings to print-based business models. The focus is on the readership that can pay and on the topics that are of interest to this readership. The danger resides in seeing this situation as sufficient: in seeing the relatively wealthy market being served as the whole market, and the narrow range of publications produced as satisfying the totality of publication needs. With the South African legal profession still struggling to diversify out of white male dominance, this is an important issue.

As global media have consolidated in the last few decades, South African legal publishers have shown a decreasing willingness to try to find ways of addressing commercially marginal markets. This has meant that, although mainstream legal publishers in South Africa have long produced digital publications, there is reliance on a high-price market model. In other countries one might talk of a failure to address niche markets, but in South Africa it is the mass of the population who are marginalised by this business model. A smaller specialist publisher, Simon Sefton's Siber Ink, seems more aware than the bigger players of the need for accessible language and affordable prices for legal resources, as well as active social media engagement to create debates about key community issues.

Some hope of solutions to the question of access by otherwise marginalised readers lies in the development, on the margins of the publishing industry, of innovative smaller players leveraging digital media to reach new readerships, often using open source models that combine the free and the paid for.

Access to legal information - The role of government

The main efforts being put into access to legal information in South Africa are quite rightly focusing on government-generated information, which, being taxpayer funded, should be in the public domain and is indeed available on the South African Government Information site. Progress is being made by the Southern African Legal Information Institute (SAFLII) in improving the accessibility of primary legal resources, and success would yield a substantial body of information that would then be available for interpretation and translation.

Beyond this, government practice in ensuring this level of access is patchy. Some departments are good at posting legislation on their Websites, others less so. Government Gazettes, although theoretically accessible to all, can be difficult to find and navigate; and the collation of legislative amendments with the original Acts is also patchy. There is -- at least in theory -- an acceptance of the need in government for an open government approach, but the fact that there is a publishing industry serving the profession and the courts ironically reduces the pressure to achieve this goal.

South Africa Truth and Reconciliation Commission Report

The Truth and Reconciliation Commission

There is a danger, however, when government sees the print-publication profit model as the natural and only way of producing sustainable publications. This was brought home in 1998 with a very important publication: the Report on the Truth and Reconciliation Commission (TRC). This sad and salutary story is worth telling in some detail.  But first, a disclaimer: I was working at the time for the company that distributed the Report, and I was actively involved in securing the bid from publishers, although I was not supportive of the business model that was imposed in the end.

Five volumes of testimony, analysis, and findings from the Commission were produced to high production standards. The compilers saw the archival material that lay behind these volumes as 'the Commission's greatest legacy' and the published volumes as 'a window on this incredible resource, offering a road map to those who wish to travel into our past' (p.2).  The Department of Justice, working on the stereotypical view of how publication works, insisted that production and printing costs had to be fully recovered. The Department set a high price to be charged by the appointed distributor, Juta Law and Academic Publishers.

The second set of problems arose with the digital version of the publication that Juta had offered to develop. The digital division of the legal publisher insisted on high prices. It was this inappropriate digital business model that created a row in the press. Then, a 'pirate' version of the publication was produced by the developer of the TRC Website, who claimed that he had the rights to a free online product. Public opinion was firmly behind the idea that the digital version should be free and that the publisher was profiteering out of South Africa's pain.

In the end, hardly any copies of the Report were sold. The lesson was a hard one for a publishing company: digital content that is seen as part of the national heritage cannot be subjected to high-price commercial strategies.

The full text of the TRC Report is now online on the South African Government Information Website.

The LRC Website

Leaping the divide - Law and land

What is more difficult and diffuse is the route to providing access to really useful information that could help communities engage with the impact of legislation on their lives, whether the issue be housing policy or land tenure legislation, gender rights or press freedom.

If we go back to my initial example of rural communities and their access to the law, there is a dauntingly wide range of issues at stake -- questions of individual agency, gender rights, fair labour practice, property rights and access to land, food sustainability, and a number of human rights issues -- including legislative process as the ANC government implements the Communal Land Rights Act of 2004. In Matatiele, the village in which my father practised, there has been a long-drawn-out dispute about provincial boundaries, with the community challenging the legislative process in the Constitutional Court.

Questions of access to this kind of information are addressed in an ecosystem broader than the conventional publishing industry. NGOs and research units based in universities and national research councils address the wider concerns of community justice; using a variety of business models, these organizations produce a range of publications and work closely with communities. In the case of the Communal Land Rights Act, the Legal Resources Centre (LRC) supported a Constitutional Court challenge and published a book on the Act and its problems. The LRC, like other organisations of its kind, makes booklets, brochures, and reports freely available online. These efforts tend to be donor-funded and, increasingly, donors like the Canadian International Development Research Centre (IDRC) insist that publications be distributed under Creative Commons licenses. In the case of books published by commercial publishers, this means an open access digital version, and a print version for sale.

A major problem in providing commentary on legislative issues for the general public is that of ensuring a lack of bias. In the case of the Communal Land Rights Act -- as well as for the other critical justice issues that it covers -- the LRC explicitly aimed to provide a comprehensive insight into the issues for experts and the general public; the Centre accordingly placed the full text of its submissions to the hearings as well as answering affidavits on a CD-ROM and online. It also produces a range of resources, online text, and audio, targeted at communities.

Similar publication efforts are undertaken by a number of other NGOs and research centres -- such as the Institute for Poverty, Land, and Agrarian Studies (PLAAS) at the University of the Western Cape and the African Centre for Cities at the University of Cape Town -- on a wide range of issues. These organizations' publishing activities tend to be interdisciplinary and the general practice is to place reports and other publications online for free download. There is a growing movement, in scholarly publishing in particular, to redefine what constitutes 'proper' publishing; this process has yielded the notion of a continuum between scholarly (and professional) work and the 'translation' of this work into more accessible versions.

A useful strategic exercise would be to tag and aggregate the legal publishing contributions of NGOs and research centres -- as these resources are often difficult to track, or hidden deep in university websites -- preferably alongside social networking spaces for discussion and evaluation.

Sustainability models

These civil society publishers are generally dependent on donor funding. They need to be recognised as part of the publishing ecosystem. The question is how to create publishing models that offer longer-term sustainability and that might work beyond a well-resourced country like South Africa. The most promising and sustainable future looks to be in small and innovative digital companies using open source publishing models, offering free content as well as value-added services for sale. Examples are currently found mostly in textbook and training models, like the Electric Book Works Health Care series, which offers free content online, with payment for print books, training, and accreditation.

What is clear is that multi-pronged solutions must be found over time to the question of how to bridge the divide in African access to reliable and relevant legal information, and that a promising site for these solutions is the intersection between research and civil society organisations and community activists.

Eve Gray is an Honorary Research Associate in the Centre for Educational Technology at the University of Cape Town and an Associate in the IP Law and Policy Research Unit. She is a specialist in scholarly communications in the digital age, working on strategies for leveraging information technologies to grow African voices in an unequal global environment.

Photos: Eve Gray CC BY

VoxPopuLII is edited by Judith Pratt. Editor-in-Chief is Robert Richards, to whom queries should be directed. The statements above are not legal advice or legal representation. If you require legal advice, consult a lawyer. Find a lawyer in the Cornell LII Lawyer Directory.