open source software » VoxPopuLII

AT4AM: the XML web editor used by Members of European Parliament

Electronic government, European Union, Legal XML, Legislative information systems, open source software

Aug 15, 2013

AT4AM (Authoring Tool for Amendments) is a web editor provided to Members of the European Parliament (MEPs). Since its introduction in 2010, it has greatly improved the drafting of amendments at the European Parliament.

The tool, developed by the European Parliament’s Directorate-General for Innovation and Technological Support (DG ITEC), replaced a system based on a collection of MS Word macros and specific ad hoc templates.

Why move to a web editor?

The need to replace a traditional desktop authoring tool came from the increasing complexity of layout rules combined with a need to automate several processes of the authoring/checking/translation/distribution chain.

In fact, drafters not only faced complex rules and had to search among hundreds of templates in order to get the right one, but the drafting chain for all amendments relied on layout to transmit information down the different processes. Bold / Italic notation or specific tags were used to transmit specific information on the meaning of the text between the services in charge of subsequent revision and translation.

Over the years, an editor that was initially conceived mainly to support the printing of documents came to be used to convey information in an unsuitable manner. During drafting, documents transmitted between different services included a mix of content and layout, where the layout sometimes carried information about the business process that should instead have been transmitted by other means.

Moreover, encapsulating in one single file all the amendments drafted in 23 languages was a severe limitation for subsequent revisions and translations carried out by linguistic sectors. Experts in charge of legal and linguistic revision of drafted amendments, who need to work in parallel on one document grouping multilingual amendments, were severely hampered in their work.

All the needs listed above justified the EP undertaking a new project to improve the drafting of amendments. The concept was soon extended to the drafting, revision, translation and distribution of the entire legislative content in the European Parliament, and after some months the eParliament Programme was initiated to cover all projects of the parliamentary XML-based drafting chain.

It was clear from the beginning that, in order to provide an advanced web editor, the original proposal to be amended had to be converted into a structured format. After an extensive search, the Akoma Ntoso XML format was chosen, because it is the format that best covers the requirements for drafting legislation. Currently it is possible to export amendments produced via AT4AM in Akoma Ntoso. It is planned to apply the Akoma Ntoso schema to the entire legislative chain within the eParliament Programme. This will enable the EP to publish legislative texts in an open data format.

What distinguishes the approach taken by the EP from that of other legislative actors who handle XML documents is that the EP decided to use XML to feed the legislative chain rather than just converting existing documents into XML for distribution. This aspect is fundamental because requirements are much stricter when the result of XML conversion is used as the first step of the legislative chain. In fact, the proposal coming from the European Commission is first converted to XML and then loaded into AT4AM. Because the tool relies on the XML content, it is important to guarantee a valid structure and coherence between the language versions. The same articles, paragraphs, points and subpoints must appear at the correct position in all 23 language versions of the same text.
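To make the coherence requirement concrete, here is a minimal Python sketch of one way such a check could work. It is not the converter or validator used by the EP. The element names (`article`, `paragraph`, `point`, `subpoint`) and the tiny sample documents are simplified assumptions for illustration, not the actual Akoma Ntoso schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural element names, chosen for illustration only.
STRUCTURAL_TAGS = {"article", "paragraph", "point", "subpoint"}


def structure_outline(xml_text):
    """Return the ordered list of (depth, tag) pairs for structural elements."""
    root = ET.fromstring(xml_text)
    outline = []

    def walk(element, depth):
        tag = element.tag.split("}")[-1]  # drop a namespace prefix, if any
        if tag in STRUCTURAL_TAGS:
            outline.append((depth, tag))
        for child in element:
            walk(child, depth + 1)

    walk(root, 0)
    return outline


def incoherent_versions(versions):
    """Compare every language version against the first one.

    `versions` maps a language code to its XML text. The result lists the
    language codes whose structural outline differs from the reference.
    """
    languages = list(versions)
    reference = structure_outline(versions[languages[0]])
    return [
        lang for lang in languages[1:]
        if structure_outline(versions[lang]) != reference
    ]


# Invented sample documents: the German version is missing one point.
EN = "<body><article><paragraph><point/><point/></paragraph></article></body>"
FR = "<body><article><paragraph><point/><point/></paragraph></article></body>"
DE = "<body><article><paragraph><point/></paragraph></article></body>"

print(incoherent_versions({"en": EN, "fr": FR, "de": DE}))  # ['de']
```

Comparing only the sequence of structural elements, and not the text, is what makes the check independent of language: two versions can differ in every word and still be coherent.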

What is the situation now?

After two years of intensive usage, Members of the European Parliament have drafted 285,000 amendments via AT4AM. The tool is also used daily by the staff of the secretariat in charge of receiving tabled amendments, checking linguistic and legal accuracy and producing voting lists. Today more than 2,300 users access the system regularly, and no one wants to go back to the traditional methods of drafting. Why?

Because it is much simpler and faster to draft and manage amendments via an editor that takes care of everything, thus allowing drafters to concentrate on their essential activity: modifying the text.

Soon after the introduction of AT4AM, the secretariat’s staff who manage drafted amendments breathed a sigh of relief, because errors like wrong position references, which were the cause of major headaches, no longer occurred.

What is better than a tool that guides drafters through the amending activity by adding all the surrounding information and taking care of all the metadata necessary for subsequent treatment, while letting the drafter focus on the text amendments and produce well-formatted output with track changes?

After some months of usage, it was clear that not only had the time needed to draft, check and translate amendments been drastically reduced, but the quality of amendments had also increased.

The slogan that best describes the strength of this XML editor is: “You are always just two clicks away from tabling an amendment!”

Web editor versus desktop editor: is it an acceptable compromise?

One of the criticisms that users often raise against web editors is that they are limited when compared with a traditional desktop rich editor. The experience at the European Parliament has demonstrated that what users lose in terms of editing features is more than compensated for by the gains of having a tool specifically designed to support drafting activity. Moreover, recent technologies enable programmers to develop rich web WYSIWYG (What You See Is What You Get) editors that include many of the traditional features plus new functions specific to a “networking” tool.

What’s next?

The experience of the EP was so positive and so well received by other parliaments that in May 2012, at the opening of the international workshop “Identifying benefits deriving from the adoption of XML-based chains for drafting legislation”, Vice President Wieland announced the launch of a new project aimed at providing an open source version of the AT4AM code.

On 19 March 2013, in a video conference with the United Nations Department for General Assembly and Conference Management in New York, the UN/DESA’s Africa i-Parliaments Action Plan in Nairobi and the Senate of Italy in Rome, Vice President Wieland announced the availability of AT4AM for All, the name given to this open source version, to any parliament or institution interested in taking advantage of this well-oiled IT tool that has made the life of MEPs much easier.

The code has been released under the EUPL (European Union Public Licence), an open source licence provided by the European Commission that is compatible with major open source licences like GNU GPLv2, with the advantage of being available in the 22 official languages of the European Union.

AT4AM for All is provided with all the important features of the amendment tool used in the European Parliament and can manage all types of legislative content provided in the Akoma Ntoso XML format. This XML standard, developed through the UN/DESA’s Africa i-Parliaments Action Plan initiative, is currently undergoing a certification process at OASIS, a non-profit consortium that drives the development, convergence and adoption of open standards for the global information society. Those who are interested may have a look at the committee in charge of the certification: LegalDocumentML.

Currently the Documentation Division, Department for General Assembly and Conference Management of the United Nations is evaluating the software for possible integration in their tools to manage UN resolutions.

The ambition of the EP is that other parliaments with fewer resources may take advantage of this development to improve their legislative drafting chain. Moreover, the adoption of such tools allows a parliament to move towards an XML-based legislative chain. The distribution of legislative content in open document formats like XML allows other parties to process the legislation produced efficiently.

Thanks to the efforts of the European Parliament, any parliament in the world is now able to use the advanced features of AT4AM to support the drafting of amendments. AT4AM will serve as a useful tool for all those interested in moving towards open data solutions and more democratic transparency in the legislative process.

On the AT4AM for All website it is possible to check the status of the work and run a sample editor with several document types. Any interested parliament can go to the repository and download the code.

Claudio Fabiani is Project Manager at the Directorate-General for Innovation and Technological Support of the European Parliament. After several years in the private sector as an IT consultant, he started his career as a civil servant at the European Commission in 2001, where he managed several IT developments. Since 2008 he has been responsible for the AT4AM project, and more recently he has managed the implementation of AT4AM for All, the open source version.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

Following the Law with Scout

Citizen participation in lawmaking, free access to law, Legislative information systems, Open Government Data, open source software

Sep 30, 2012

At my organization, the Sunlight Foundation, we follow the rules. I don’t just mean that we obey the law — we literally track the law from inception to enactment to enforcement. After all, we are a non-partisan advocacy group dedicated to increasing government transparency, so we have to do this if we mean to serve one of our main functions: creating and guarding good laws, and stopping or amending bad ones.

One of the laws we work to protect is the Freedom of Information Act. Last year, after a Supreme Court ruling provided Congress with motivation to broaden the FOIA’s exemption clauses, we wanted to catch any attempts to do this as soon as they were made. As many reading this blog will know, one powerful way to watch for changes to existing law is to look for mentions of where that law has been codified in the United States Code. In the case of the FOIA, it’s placed at 5 U.S.C. § 552. So, what we wanted was a system that would automatically sift through the full text of all legislation, as soon as it was introduced or revised, and email us if such a citation appeared.

With modern web technology, and the fact that the Government Printing Office publishes nearly every bill in Congress in XML, this was actually a fairly straightforward thing to build internally. In fact, it was so straightforward that the next question felt obvious: why not do this for more kinds of information, and make it available freely to the public?

That’s why we built Scout, our search and notification system for government action. Scout searches the bills and speeches of Congress, and every federal regulation as they’re drafted and proposed. Through the awe-tacular power of our Open States project, Scout also tracks legislation as it emerges in statehouses all over the country. It offers simple and advanced search operators, and any search can be turned into an email alert or an RSS feed. If your search turns up a bill worth following, you can subscribe to bill-specific alerts, like when a vote on it is coming up.

This has practical applications for, really, just about everyone. If you care about an issue, be it as an environmental activist, a hunting enthusiast, a high (or low) powered lawyer, or a government affairs director for a company – finding needles in the giant haystack of government is a vital function. Since launching, Scout’s been used by thousands of people from a wide variety of backgrounds, by professionals and ordinary citizens alike.

Search and notifications are simple stuff, but simple can be powerful. Soon after Scout was operational, our original FOIA exemption alerts, keyed to mentions of 5 U.S.C. § 552, tipped us off to a proposal that any information a government passed to the Food and Drug Administration be given blanket immunity to FOIA if the passing government requested it.

If that sounds crazily broad, that’s because it is, and when we in turn passed this information on to the public interest groups who’d helped negotiate the legislation, they too were shocked. As is so often the case, the bill had been negotiated for 18 months behind closed doors, the provision was inserted anonymously immediately before formal introduction, and the bill was scheduled for a vote as soon as Senate processes would allow.

Because of Scout’s advance warning, there was just barely enough time to get the provision amended to something far narrower, through a unanimous floor vote hours before final passage. Without it, it’s entirely possible the provision would not have been noticed, much less changed.

This is the power of information; it’s why many newspapers, lobbying shops, law firms, and even government offices themselves pay good money for services like this. We believe everyone should have access to basic political intelligence, and are proud to offer something for free that levels the playing field even a little.

Of particular interest to the readers of this blog is that, since we understand the value of searching for legal citations, we’ve gone the extra mile to make US Code citation searches extra smart. If you search on Scout for a phrase that looks like a citation, such as “section 552 of title 5”, we’ll find and highlight that citation in any form, even if it’s worded differently or referencing a subsection (such as “5 U.S.C. 552(b)(3)”). If you’re curious about how we do this, check out our open source citation extraction engine – and feel free to help make it better!
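To give a feel for what matching a citation “in any form” involves, here is a minimal Python sketch. It is not the open source engine mentioned above. The two regular expressions are assumptions that cover only the forms quoted in this post (“section 552 of title 5”, “5 U.S.C. 552(b)(3)”, “5 U.S.C. § 552”), and the function names are hypothetical.

```python
import re

# Form 1: "5 U.S.C. 552", "5 U.S.C. § 552", "5 U.S.C. 552(b)(3)"
USC_STYLE = re.compile(
    r"(?P<title>\d+)\s+U\.?S\.?C\.?\s*(?:§+\s*)?"
    r"(?P<section>\d+[a-z]?)(?P<subs>(?:\([0-9A-Za-z]+\))*)"
)

# Form 2: "section 552 of title 5", "section 552(b)(3) of title 5"
PROSE_STYLE = re.compile(
    r"section\s+(?P<section>\d+[a-z]?)(?P<subs>(?:\([0-9A-Za-z]+\))*)"
    r"\s+of\s+title\s+(?P<title>\d+)",
    re.IGNORECASE,
)


def find_citations(text):
    """Return every US Code citation found in text, in order of appearance."""
    citations = []
    for pattern in (USC_STYLE, PROSE_STYLE):
        for match in pattern.finditer(text):
            citations.append({
                "title": match.group("title"),
                "section": match.group("section"),
                # "(b)(3)" becomes ["b", "3"]
                "subsections": re.findall(r"\(([0-9A-Za-z]+)\)", match.group("subs")),
                "match": match.group(0),
                "start": match.start(),
            })
    return sorted(citations, key=lambda c: c["start"])


def cites(text, title, section):
    """True if text cites the given title and section in any supported form."""
    return any(
        c["title"] == title and c["section"] == section
        for c in find_citations(text)
    )


bill_text = "Amends section 552 of title 5 and cross-references 5 U.S.C. 552(b)(3)."
for citation in find_citations(bill_text):
    print(citation["match"], citation["subsections"])
print(cites(bill_text, "5", "552"))  # True
```

Normalizing every match to the same title and section fields is what lets a single alert on title 5, section 552 fire regardless of how a bill happens to word the reference.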

It’s worth emphasizing that all of this is possible because of publicly available government information. In 2012, our legislative branch (particularly GPO and the House Clerk) and executive branch (particularly the Federal Register) provide a wealth of foundational information, and in open, machine-readable formats. Our code for processing it and making it available in Scout is all public and open source.

Anyone reading this blog is probably familiar with how easily legal information, even when ostensibly in the public domain, can be held back from public access. The judicial branch is particularly badly afflicted by this, where access to legal documents and data is dominated by an oligopoly of pay services both official (PACER) and private-sector (Westlaw, LexisNexis).

It’s easy to argue that legal information is arcane and boring to the everyday person, and that the only people who actually understand the law work at a place with the money to buy access to it. It’s also easy to see that as it stands now, this is a self-fulfilling prophecy. If this information is worth this much money, services that gate it amplify the political privilege and advantage that money brings.

The Sunlight Foundation stands for the idea that when government information is made public, no matter how arcane, it opens the door for that information to be made accessible and compelling to a broader swathe of our democracy than any one of us imagines. We hope that through Scout, and other projects like Open States and Capitol Words, we’re demonstrating a few important reasons to believe that.

Eric Mill is a software developer and international program officer for the Sunlight Foundation. He works on a number of Sunlight’s applications and data services, including Scout and the Congress app for Android.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

[Editor’s Note: For topic-related VoxPopuLII posts please see, among others: Nick Holmes, Accessible Law; Matt Baca & Olin Parker, Collaborative, Open Democracy with LexPop; and John Sheridan, Legislation.gov.uk.]

LexML Brazil Project

elegislation, elegislation systems, information retrieval, Legal identifiers, Legal metadata, Legal ontologies, Legal text processing, Legal XML, Legislative information systems, open source software, search

Oct 15, 2010

This post is divided into three sections. The first introduces the LexML Brazil Project and its unified search portal; the second presents some aspects related to semantic interoperability; and the third describes the current work and future direction of the project.

Before going on to the aforementioned subjects, a few words about Brazil and its legislative and legal systems are necessary. Brazil is a country of continental proportions, composed of 27 states and more than five thousand municipalities, or cities, as in Brazil no distinction is made between town and city. As a federative system, each state and municipality has its own legislative chamber. While states and cities follow a unicameral system, the Federation itself has a bicameral system, with the National Congress divided into a Chamber of Deputies and the Federal Senate. These legislatures generate a great number of laws, or normative acts. The abundance of normative acts is very significant, considering that, in contrast with Common Law systems, Brazil’s legal system, based on the Civil Law, is characterized by the predominance of normative acts.

According to Edilenice Passos, “the proliferation of normative acts, of higher or lower hierarchy, eventually causes total chaos, for this big mass of juridical documents hampers the work of lawyers, of researchers, and of the very citizens, who are ruled by Brazilian laws.” Edilenice Passos also cites Arnoldo Wald, who, in 1969, was already alerting Brazilians that “the true legislative labyrinth created as a result of an inflation of statutes passed in recent years has turned the ruling Brazilian law into a patchwork, in which the mere legislative updating becomes a daily torture for a lawyer and a judge who are searching for the rules applicable to a specific subject, from among acts, supplementary acts, institutional acts, decree-laws, and other normative acts.”

Almost all Brazilian legal and legislative information is available through the Internet. However, this information is distributed among several thousand sites, each containing documents produced by a specific government institution. Thus, the relationships between acts of different institutions are not available explicitly, making it very hard to understand this “legal patchwork.”

Nowadays, much time is lost looking for this information, filtering the results of search engines. As Roy Tennant says, “Librarians like to search; everyone else likes to find,” and further adds: “People generally want to find everything they can on a topic, ranked by relevance and displayed in ways that make it easy to narrow in on their goal.”

Born to address these issues, LexML Brazil is an information network that aims to organize Brazil’s legislative and legal information. The project is an initiative of the “Comunidade TI Controle” (IT Control Community) and is being implemented by the Brazilian Federal Senate, through PRODASEN (the Senate’s special secretariat for information systems) and Interlegis (a virtual community of Brazilian legislatures).

LexML Brazil’s first product is the Legislative and Legal Information Portal, which opened on June 30, 2009, indexing 1.28 million documents. By September 2010, its index covered more than 1.5 million documents. By indexing the metadata collected from several institutions using the OAI-PMH protocol, the portal unifies access to a variety of legislative and legal information sources, which is a step toward the goal of guaranteeing Brazilians’ constitutional right of access to information.
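As an illustration of what collecting metadata over OAI-PMH looks like, here is a minimal Python sketch of a harvester’s two basic steps: building a `ListRecords` request and reading records from the response. It is not the portal’s actual harvester. The base URL, the identifier, and the sample response (trimmed to the elements used here) are invented, and the use of the generic `oai_dc` metadata format is an assumption.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) extracted from a response body."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Invented, trimmed response used instead of a real network call.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:doc-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample act</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE_RESPONSE)
print(records)
print(list_records_url("https://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with each returned token until the response carries none, which is how large collections are paged through over this protocol.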

LexML Portal

The LexML Portal home page layout is very simple and is similar to Google’s main page. On this screen, it is possible to restrict the search to Legislation, Jurisprudence, or Bills.

The search results page allows the user to refine the search by using filters, according to his or her information requirements. Five filters are available: location, issuing authority, document type, date, and acronyms.

The detail page provides links to the official publication version of each document, and to other publications available in information systems of network participants, which, in this particular case, are: National Press, Presidency, Chamber of Deputies, and Federal Senate. General information about the document is available by clicking one of the “Mais Detalhes” (More details) links, which directs the Web browser to the corresponding network participant’s metadata page. A service providing automatic identification of textual references can be activated by clicking the “Linker” label.

Semantic Interoperability

While systems interoperability and syntactic issues can be managed with the established standards of representation, codification, and exchange (XML, METS, Unicode, OAI-PMH, etc.), structural and semantic interoperability demands the adoption of a reference model that allows the integration of several models and the use of a unified terminology for indexing different sources of information. According to Patel et al., the general purpose of semantic interoperability is “to support complex and advanced context-sensitive query processing over heterogeneous information resources.” Lack of semantic interoperability then generates the “information silos” problem, characterized by the lack of information integration and consequent inability to process complex queries.

The next section presents the design choices made by the LexML Brazil Project to address issues related to semantic interoperability using Ranganathan’s “stratification planes” classification system, featuring: an idea plane, a verbal plane, and a notational plane.

Idea Plane

The idea plane is composed of the abstract entities of a domain, independently of how they are nominated or identified.

The metadata standards that propose to address interoperability issues do so either for a specific, restricted domain or for heterogeneous domains. Specialized metadata standards (MARC, EAD, MODS, etc.) allow different sources of information about specific domains (bibliographical or archival information) to be integrated and searched in an advanced form. On the other hand, the Dublin Core standard is one of the few that try to integrate arbitrarily heterogeneous sources using a minimum set of elements and qualifiers. Its characteristic simplicity enables easy adoption by multiple actors, but also hinders query processing, preventing the use of the rich chain of relationships among entities. The lack of generality or expressiveness of these standards precludes their use for achieving semantic interoperability of heterogeneous sources of legislative and legal information in Brazil.

An alternative is to use formal ontologies instead of metadata standards. According to Martin Doerr, “recently, more and more projects and theoreticians support the use of formal ontologies as common conceptual schema for information integration.” One such ontology, the CIDOC CRM model, was designed to help the integration, mediation, and interchange of heterogeneous cultural heritage information. It was developed in 1994 and has since been approved as the ISO 21127:2006 standard. The CIDOC CRM model is then a natural choice for conceptual schemas of legal and legislative information, if one considers that the text corpus consisting of a nation’s sources of law is a part of the nation’s cultural heritage information.

However, the CIDOC CRM “document” concept lacks the necessary detail needed to describe the relationships among the several information abstraction levels: work, expression, manifestation, and item. That requirement is fulfilled by the FRBR_ER entity-relationship model, which was considered as a reference model in earlier phases of the project (“An Adaptation of the FRBR Model to Legal Norms,” João Lima, Proceedings of the V Legislative XML Workshop, Florence, 2005).

The FRBR_OO standard, an ontology created by a working group formed in 2003 by representatives of IFLA (International Federation of Library Associations and Institutions) and ICOM (International Council of Museums) for purposes of harmonizing both models, was adopted by the LexML project because it combines the advantages of both models while addressing their shortcomings. As such, FRBR_OO manifests a great affinity to the LexML domain (“A Time-aware Ontology for Legal Resources,” João Lima et al., Proceedings of the Tenth International ISKO Conference, 2008).

One of the great innovations of the CIDOC CRM model is the information structuring around temporal events, a central concept in the model. This contrasts with most other metadata models, which have resources as the central objects of interest. This innovative approach defines events as entities that connect actors, things (concrete and abstract), places, and time intervals.

This particular emphasis could be criticized on the ground that the user is generally interested in a specific resource, such as the text of a law. However, the result of a search for information about a law is much more relevant if it includes an organized list of events related to the resource, along with the resource itself.

The importance of choosing a suitable reference model is easily observable in the present discussion about what particular syntax to use to codify persistent identifiers — urn:lex, LegisLink, Akoma Ntoso, etc. Before reaching the syntax level, such discussions should focus first on the idea plane, where a greater potential for integration exists. A consensus reached at this level would allow great flexibility for the specification of diverse persistent identifier syntaxes.

Verbal Plane

The CIDOC CRM ontology separates the class of types and denominations from other classes. Multiple names, identifiers, and types can be attributed to all entities of the CRM, allowing any domain class to be classified by several taxonomies and be known by multiple names and identifiers.

This approach is used in LexML to represent different terms that identify the same concepts. Six classes form LexML’s uniform resource identifiers: place, authority, type of document, event, type of content, and language. To externalize the LexML vocabularies specification, we recommend, and use, the W3C SKOS (Simple Knowledge Organization System).

Notational Plane

The definition of uniform and persistent identifiers is fundamental for the creation and maintenance of an information chain. Identifiers are already part of the legal domain. For identification purposes, numbers are attributed to rulings, decisions, abridgments, and bills, allowing references by means of textual remissions. In the computational environment, the creation of persistent and uniform identifiers allows not only identification and reference, but also access to documents by means of textual hyperlinks.

Based on the experience of the Italian project Norme in Rete with respect to URN (Uniform Resource Name) identifiers, LexML defines a grammar for the construction of identifiers for legislative and legal documents in Brazil. As an example, the name “urn:lex:br:federal:lei:1993-06-21;8666” identifies, in a persistent and unique way, the “Federal Act No. 8666, of June 21, 1993.” If all information systems agree with respect to the identifiers, it is possible to share descriptive metadata, as well as information about semantic relationships, such as regulation, amendment, abrogation, etc.
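The example name above can be taken apart mechanically. The Python sketch below handles only names shaped exactly like that example; it does not implement the full LexML grammar, which is richer. The field names are assumptions loosely based on the classes listed earlier (place, authority, type of document), and `LexUrn`, `parse_lex_urn`, and `build_lex_urn` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LexUrn:
    place: str           # e.g. "br"
    authority: str       # e.g. "federal"
    document_type: str   # e.g. "lei"
    issued: date         # date of the act
    number: str          # number of the act


def parse_lex_urn(urn):
    """Parse a name of the form urn:lex:<place>:<authority>:<type>:<date>;<number>."""
    parts = urn.split(":")
    if len(parts) != 6 or parts[0] != "urn" or parts[1] != "lex":
        raise ValueError(f"not a simple urn:lex name: {urn!r}")
    place, authority, document_type, descriptor = parts[2:]
    issued_text, separator, number = descriptor.partition(";")
    if not separator or not number:
        raise ValueError(f"descriptor must look like <date>;<number>: {descriptor!r}")
    return LexUrn(place, authority, document_type,
                  date.fromisoformat(issued_text), number)


def build_lex_urn(urn):
    """Serialize a LexUrn back to its textual form."""
    return (f"urn:lex:{urn.place}:{urn.authority}:{urn.document_type}:"
            f"{urn.issued.isoformat()};{urn.number}")


act = parse_lex_urn("urn:lex:br:federal:lei:1993-06-21;8666")
print(act.issued.year, act.number)  # 1993 8666
print(build_lex_urn(act))           # urn:lex:br:federal:lei:1993-06-21;8666
```

Because the name carries only facts about the act itself (who issued it, what kind of document it is, when, and under what number), two systems that have never exchanged data can still compute the same identifier independently.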

The Linker service, accessible through the LexML Portal (see, e.g., Act 11.705 without linker and Act 11.705 with linker), creates hyperlinks automatically through a dynamic textual analysis that identifies textual remissions of [i.e., citations to] normative documents. These hyperlinks can be used to navigate through textual remissions.
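The following Python sketch illustrates the general idea of turning a textual remission into a hyperlink. It is not the Linker’s implementation. It recognizes only the English rendering used in this post (“Federal Act No. 8666, of June 21, 1993”), whereas the real service analyzes Portuguese text. The pattern, the resolver address, and the function name are all assumptions.

```python
import re

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Matches e.g. "Federal Act No. 8666, of June 21, 1993" (illustrative pattern).
REMISSION = re.compile(
    r"Federal Act No\. (?P<number>\d+), of (?P<month>[A-Z][a-z]+) "
    r"(?P<day>\d{1,2}), (?P<year>\d{4})"
)


def link_remissions(text, resolver="https://resolver.example/"):
    """Wrap each recognized remission in a link to a hypothetical resolver."""

    def to_link(match):
        month = MONTHS.get(match.group("month"))
        if month is None:
            return match.group(0)  # unknown month name: leave the text alone
        urn = "urn:lex:br:federal:lei:{:04d}-{:02d}-{:02d};{}".format(
            int(match.group("year")), month, int(match.group("day")),
            match.group("number"),
        )
        return '<a href="{}{}">{}</a>'.format(resolver, urn, match.group(0))

    return REMISSION.sub(to_link, text)


print(link_remissions("See Federal Act No. 8666, of June 21, 1993."))
```

The key step is that the matched text is normalized into a persistent identifier before the link is built, so the hyperlink does not depend on where any particular copy of the cited act happens to be hosted.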

Future Directions

LexML 1.0 consists of the Search Portal, the Resolution Service, the Persistent Identifier, and the Linker Service. The next version, LexML 2.0, will go further: it will involve the development of open source tools for managing the complete text of documents encoded according to the LexML Brazil XML Schema, which was derived from the schemas of the Akoma Ntoso Project.

The complete management of document texts in a structured form has been a goal of the project since its inception. As early as 2000, the Federal Constitution Portal was implemented following this idea. This portal allows the user to see all the versions of the constitutional text through a timeline, with the option to see the list of historical changes [see, e.g., art. 12] and with the ability to navigate bi-directional links [for example, in art. 154, click on the blue arrows].

During the development of that portal, taking into account the various forms of XML used to encode normative texts in many countries, and especially the experience of the Italian project Norme in Rete, a decision was made to make a unified portal and a persistent identifier a priority of the LexML project. Presently, our efforts to build open source tools for management of document texts are being renewed. One of these tools, a LexML Document Editor, will enable the authoring of legal texts as if using a word processor, but producing a structured document at the end. Another tool is the Compiler, which will semi-automatically generate modified versions of documents that have been updated by other legal acts. The Consolidator will help to simplify the display of legal information — and users’ experience of the legal system — through the consolidation of several related normative acts into a single act. The Comparator will be used to display the differences between versions of a document. The last tool, the Publisher, will be used to render XML content in different formats, such as html, PDF, PDF-A, EPUB, etc., with the ability to choose different views of the same text, such as the original text, the updated text as of a specific date, etc.

Last but not least, the Information Management Committee, which is a community of practice composed of librarians, archivists, and information analysts of several institutions of the three Brazilian governmental branches, interested in the management of legal and legislative information, is responsible for the definition of the priority and long range planning of the LexML Brazil Project.

[Editor’s Note: For documentation, schemas, and controlled vocabularies respecting LexML Brazil, please see the LexML Brazil Project Website. For more information on these issues, please see the following VoxPopuLII posts: John Sheridan on Legislation.gov.uk, Ivan Mokanov on CANLII’s innovative legal citation system, Joe Carmel on LegisLink, and Robb Shecter on OregonLaws.org.]

The LexML Brazil core team, from left to right: João Lima (joaolima at senado.gov.br) is the leader of The LexML Project. His Information Science Ph.D. thesis details many of the concepts presented here; João Holanda (jholanda at senado.gov.br) holds a BSc in History from UnB; João Rafael (jrafael at senado.gov.br) holds a MSc in Computer Science from UFMG and a BSc in Computer Science from UnB; Marcos Fragomeni (fragomeni at senado.gov.br) holds a BSc in Computer Science from UnB.

VoxPopuLII is edited by Judith Pratt. Editor in chief is Robert Richards.

Crowdsourcing Legal Commentary

Adding legal commentary to free access to law services, Applications, Crowdsourcing and free access to law, Crowdsourcing and legal information systems, Crowdsourcing the writing of secondary legal resources, free access to law, Legal commentary, Legal metadata, Legal social media, Legal social networks, Legal treatises, open source software, Public access to legal information, Secondary legal resources, Web 2.0 and law, Wikis and law

Mar 31, 2010

Background: The Need for Free Legal Commentaries

A legal commentary — also known as a legal treatise — is an unofficial text, intended to complement a particular source of law, often consisting of one or more statutes. A commentary on a statute provides information on how to interpret terms in the statutory text, summarizes examples of the statute’s application, references other relevant parts of the statutory law, and explains the legislative history and policy background of the statute. As statutory law is typically written in an open-ended way, setting forth norms in general language and usually without examples of how the law should be applied, a newcomer may have difficulty understanding it. Legal commentaries help with this.

In many jurisdictions, the texts of statutes are published by the legislature, usually without claim to copyright, and thus are made available to all, including to free-access-to-law services. Legal commentaries, by contrast, are written by private parties, who have a copyright on the resulting text, copies of which they typically sell at price levels that prevent most persons other than legal professionals from accessing them. Thus, most citizens may have free access to law, but not to the texts necessary for understanding it.

My background is that of a software developer, but in 2005 I started law school in Sweden. At that time, I couldn’t find any good freely accessible web service containing Swedish statutory law, so I built one called lagen.nu (which, translated, means “the law, now”). When the site debuted it contained around 4,000 pages of statutory law, and another 10,000 pages with headnotes on legal cases, with hyperlinks and cross-references. Over the next few years the site was gradually improved, with better hyperlinking of references, the addition of the full text of case law, and an improved graphical interface. The purpose of the site was and is to make law accessible to the common person. But making available official data such as statutory and case law can only get you so far, as there are many aspects of the law which are not apparent from the face of primary legal documents.

The Swedish legal system, like most civil law systems, is based more on statutory law than are most common law systems. Jurists in Sweden therefore spend much time interpreting statutory law. There are several publishers — including Thomson Fakta and Norstedts Juridik — that provide legal commentaries on Swedish central acts. The commentaries are written for legal professionals, which is evident in both the extent of the commentaries and their price. (As an example, the standard commentary on the Swedish code of judicial procedure fills four loose-leaf binders and costs 4435 SEK [$611 US].) As such, they are not accessible in any realistic sense to laypeople.

Therefore, I started thinking about how one could create a free commentary on Swedish law. Traditional commentaries require enormous resources, the most critical and expensive being the time of professors or other experts. Commentaries written for legal professionals generally fall into either of two categories: (1) in-depth works, or (2) practice tools. In-depth works — such as Fitger’s Rättegångsbalken — provide extensive treatment of the subject, delve deeply into the historical and teleological background of the regulation in question, and examine every conceivable exceptions-from-the-exception-to-the-rule detail. Those commentaries can be ten times the length of the actual statutory text, and are typically published in book form. In particular, they are written for readers who already understand the basic concepts and structure of the regulation, and who have time to dig deeper. For ordinary people, this level of detail is neither needed nor wanted.

In addition, there are practice tools, shorter commentaries more suited for the practicing lawyer who needs to quickly understand a particular regulation. These are still written for professionals, and are typically accessible as part of an electronic database subscription. Such subscriptions are priced far beyond the reach of non-professionals as well.

Thus neither form of commentary is written for laypeople, and neither is readily available to them. In order for the law to be accessible to all, it needs to be explained differently, and the explanation needs to be freely available. Writing this sort of commentary doesn’t require a tenured professor. In fact, the basic aspects of any central act can be adequately explained by any law student having the following two qualities:

a) A thorough understanding of the subject (for example, having taken the relevant course and having received a good grade); and

b) A talent for explaining complex things succinctly. At this, the student may even outdo the professor, as the student — having recently been a novice respecting the topic — will have a better understanding of how the subject is approached by someone new to it.

In any given act, there are a handful of key sections. A brief introduction to the act, combined with short explanations of these key sections, would be enough to create something useful to a nonlawyer. A person with a good knowledge of the act could write this in an evening. Something more extensive, like a basic commentary on the interesting parts of any central act, could be written in one or two weeks of work.

Still, there are a lot of central acts in Swedish law, and of only a few of these do I have extensive knowledge. Since I wouldn’t be able to write all the commentary myself, I’d have to create something that made it possible for people more knowledgeable than I to contribute their knowledge. Thus, the solution would have two aspects — one technological, one social. During the spring, summer, and autumn of 2009, we tried to create this solution.

Technology: Collaborative Writing Tools

In our proposed free online system, a commentary for a single act would normally contain a brief introduction to the act itself, followed by comments on the most important sections. A simple way to do this is to write the entire commentary as a single text document, divided into different parts, each referencing a particular section of the act. Acts in the Swedish legal system vary greatly in length: some are only a sentence long, while others run to over 100,000 words and contain more than 1,000 sections. Nevertheless, we decided to use a single commentary text per act in order to keep things simple.

Swedish acts are typically divided into chapters, which are divided into sections (though for shorter acts, only sections are used). For example, the second section of the fourth chapter of an act is referenced as “4 kap. 2 §”. We adopted a convention of using such a reference as a header preceding the commentary for the referenced section.

Apart from headings, we tried to use as little formatting as possible. Basic bold and italics were desirable, as were different sorts of lists (ordered, unordered, and definition lists), but not things like multiple fonts or footnotes. Hyperlinks, both internal within the system and external to other web pages, were encouraged.

Apart from commentaries on the acts themselves, we wanted to be able to describe important concepts referred to in the acts. To write succinct explanations of statutes, one often has to refer to central legal concepts (such as “the rule of law”). If that reference can be linked to a page that describes the concept in greater detail, the commentary can be made shorter and the user can decide whether to follow the link if more explanation is needed. Furthermore, the concept can be referred to from multiple commentaries.

In order for the process to be as simple as possible, we needed a web-based editing system that didn’t require any particular piece of software on the user’s computer. We also didn’t want to spend significant time developing or customizing software.

The decision was made to use MediaWiki, the software behind Wikipedia, for the task. MediaWiki is a very robust and well-developed piece of software with an active developer community. It’s also easy to extend with “hooks,” that is, small pieces of code that are configured to run in certain situations (such as when a page is saved). Pages can be hyperlinked, and basic formatting such as headings and lists can be done using a simple markup language, normally referred to as “wikitext.”

As MediaWiki is based around the editing of pages, the commentary for each act in our system would take the form of a page, and the description for each legal concept would also constitute a page. In Swedish law, acts and ordinances are published in a collection called SFS (Svensk författningssamling). Each act or ordinance is given an identifying number: e.g., the public access to information and secrecy act is known as “SFS 2009:400”. In our free online system, this would correspond to a commentary page called SFS/2009:400.

Unlike Wikipedia, we did not want to use MediaWiki as the interface for the reader. The primary reason for this was that we wanted to present the statutory text and the commentary side-by-side. In order to do that, we needed to extract the text that we had edited using MediaWiki, split it up into commentaries for each individual section, and then weave the statutory text and the commentary text together.
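As a rough illustration of the splitting and weaving step, here is a minimal Python sketch. It assumes that each section's commentary is introduced by a wikitext heading such as `== 4 kap. 2 § ==` and that the statute is available as a dictionary keyed by the same reference strings. The heading syntax, the function names, and the data layout are assumptions made for this example, not a description of the actual lagen.nu code.

```python
import re

# Assumed convention: a level-2 wikitext heading holding a section reference,
# e.g. "== 4 kap. 2 § ==" or, for acts without chapters, "== 3 § ==".
HEADING = re.compile(
    r"^==[ \t]*((?:\d+[ \t]*kap\.[ \t]*)?\d+[ \t]*[a-z]?[ \t]*§)[ \t]*==[ \t]*$",
    re.MULTILINE,
)


def split_commentary(wikitext):
    """Return (introduction, {section reference: commentary text})."""
    # re.split with one capturing group yields
    # [intro, ref1, body1, ref2, body2, ...].
    parts = HEADING.split(wikitext)
    intro = parts[0].strip()
    sections = {}
    for ref, body in zip(parts[1::2], parts[2::2]):
        # Normalize whitespace so "4  kap. 2 §" and "4 kap. 2 §" are one key.
        sections[" ".join(ref.split())] = body.strip()
    return intro, sections


def weave(statute, sections):
    """Pair each statute section with its commentary (or None), in statute order."""
    return [(ref, text, sections.get(ref)) for ref, text in statute.items()]


# Invented sample data, for illustration only.
wikitext = """This act regulates something.

== 1 kap. 1 § ==
Explains the purpose of the act.

== 4 kap. 2 § ==
The key rule; see also 1 kap. 1 §.
"""

statute = {
    "1 kap. 1 §": "Statutory text of the first section.",
    "1 kap. 2 §": "Statutory text of the second section.",
    "4 kap. 2 §": "Statutory text of another section.",
}

intro, sections = split_commentary(wikitext)
rows = weave(statute, sections)
```

Each entry in `rows` holds a section reference, its statutory text, and the matching commentary, so a template can render the two columns side by side and leave the commentary column empty for sections nobody has commented on yet.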

The architecture of lagen.nu is such that no pages are ever created dynamically in response to a user’s request. Instead, everything is pre-rendered in the form of static HTML files on disc. This makes the site responsive even when running on modest hardware and under high loads. A hook was developed so that every time a wiki page in the main namespace was saved, a program to weave together that commentary text with the statutory text was run. The result was stored as a normal HTML file on disc.

Writing the program to do this weaving was not a trivial task. In particular, translating the minimalist wikitext markup into XHTML fragments proved to be difficult, and not all MediaWiki markup was supported. We also needed to do custom processing of the commentary text. Luckily, a library was found that parsed the MediaWiki markup and returned the equivalent XHTML markup. This library was extended in order to identify references to acts, sections, and legal cases present in the system. This enabled the commentary authors, who were not expected to learn advanced hyperlinking syntax, to just write normal legal prose, which was then linked automatically.
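The automatic linking of plain legal prose can be sketched along the following lines. This is only an illustration in Python: the two reference patterns, the URL scheme, and the `K4P2`-style anchors are hypothetical, and the library we actually extended handled far more cases (including references to legal cases).

```python
import re
from html import escape

# Assumed patterns: "SFS 2009:400" for an act, and "4 kap. 2 §" or "2 §" for a
# section within the act that the commentary belongs to.
REFERENCE = re.compile(
    r"SFS (?P<act>\d{4}:\d+)"
    r"|(?:(?P<chapter>\d+) kap\. )?(?P<section>\d+) §"
)


def link_references(text, current_act):
    """Wrap recognized references in <a> elements and escape everything else."""
    out = []
    pos = 0
    for m in REFERENCE.finditer(text):
        out.append(escape(text[pos:m.start()]))
        if m.group("act"):
            href = "/" + m.group("act")  # hypothetical URL scheme
        else:
            anchor = "P" + m.group("section")
            if m.group("chapter"):
                anchor = "K" + m.group("chapter") + anchor
            href = "/" + current_act + "#" + anchor  # hypothetical anchors
        out.append('<a href="%s">%s</a>' % (href, escape(m.group(0))))
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


# "1942:740" is just a placeholder for the act being commented on.
html = link_references("See 4 kap. 2 § and SFS 2009:400.", "1942:740")
```

The point of the design is that authors never see any of this: they write “4 kap. 2 §” as they would in any legal text, and the linking happens when the page is woven.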

Using this setup, the editing process boiled down to these steps:

1) On the main site, find the act to be commented on, and click the “edit commentary” link.
2) This leads to a MediaWiki editing page or, if the user is not logged in, to a login form.
3) The user creates or edits the commentary, saving occasionally.
4) When saving, the edit can be flagged as a “minor edit”. When this is the case, the weaving process is not run, enabling the user to save work that is still in progress without changing the contents of the main site.
5) When the user is satisfied (which does not necessarily mean that he/she is altogether done with writing the commentary, just that it is in such a state that it can be published on the main site), the page is saved and the weaving process is run.
6) The user can then check the main site and verify that the statutory text and the commentary text look OK side by side.
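The publishing rule behind these steps (weave only on full saves of pages in the main namespace, and write the result as a static file) can be simulated in a few lines of Python. This is a sketch under stated assumptions, not the real hook: the actual hook runs inside MediaWiki, and the file-naming scheme and function names below are invented for illustration.

```python
import tempfile
from pathlib import Path

MAIN_NAMESPACE = 0  # MediaWiki's numeric id for the main namespace


def should_publish(namespace, is_minor_edit):
    """Weave and publish only full saves of pages in the main namespace."""
    return namespace == MAIN_NAMESPACE and not is_minor_edit


def static_path(root, page_title):
    """Map a page title such as 'SFS/2009:400' to an HTML file under root."""
    # Hypothetical naming scheme: replace characters that are awkward in file names.
    safe = page_title.replace("/", "_").replace(":", "-")
    return Path(root) / (safe + ".html")


def on_page_saved(root, namespace, page_title, is_minor_edit, render):
    """Simulated save hook; `render` stands in for the weaving program."""
    if not should_publish(namespace, is_minor_edit):
        return None  # work in progress: the main site is left untouched
    target = static_path(root, page_title)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render(page_title), encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as root:
    # A minor edit is saved in the wiki but not published.
    draft = on_page_saved(root, MAIN_NAMESPACE, "SFS/2009:400", True, str.upper)
    # A normal save triggers the (stand-in) weaving and writes the static file.
    final = on_page_saved(root, MAIN_NAMESPACE, "SFS/2009:400", False,
                          lambda title: "<h1>" + title + "</h1>")
    published_name = final.name
    published_html = final.read_text(encoding="utf-8")
```

Because the web server only ever serves the files written this way, a burst of readers costs nothing more than static file delivery.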

Social Aspects: Coordinating the Writing Effort

The plan was to get motivated law students to write the commentary for our free online system. We formed a group of 14 people, consisting mostly of law students, but including some practicing or retired lawyers. They were each given an assignment, normally consisting of a single act, and the instruction to write the best commentary on that act that they could fashion, within a 40-hour period. For some longer acts, this limit was extended to 60 or 80 hours. The authors retained copyright in their work, and were to receive attribution for it, but they agreed to license it under a Creative Commons license (specifically the CC-BY-SA license). They were also given a small monetary compensation for their work. The deadline for completing the writing assignments was set for October 1st.

At the start of summer 2009, we held an initial workshop where the motivation and impetus of the project were explained, and where people were given some hands-on instruction in how to write and edit commentary. For the rest of the summer, project members communicated using an email mailing list, together with occasional get-togethers on Saturdays at the central library in Stockholm.

During this time, the framework for writing, as well as the results, were constantly examined and improved. My roles were those of software developer (fine-tuning the weaving process), editor (suggesting improvements to individual commentaries as they were written), and project manager (gently reminding everyone of the impending October deadline).

It soon became apparent that everyone had his or her own idea of how commentaries should be written. In general, this diversity is a good thing — if everyone tries out their ideas, we can see what works and what doesn’t. However, if everyone is forced to invent their own wheels, progress is slow and resources that could be used for writing are wasted on other things.

Therefore, we wrote a style guide, containing basic guidelines with examples on how to write concisely and simply, how to refer and link to legal concepts and external resources, how to prioritize, whom to write for, and so on. A getting-started guide, and a shorter guide to MediaWiki markup, were also written.

During the writing period we were featured in the daily press, trade magazines and national radio. We were also nominated for a prize awarded by the Centre for Easy-to-Read.

Evaluation: Does It Work?

From July to September 2009, we commented on 17 acts, 1,200 individual sections, and 500 legal concepts, resulting in a total output of over 400 pages of text. A great amount of knowledge was created in an amazingly short time span. The quality assurance, which was done by me, proved to be an interesting challenge, as I’m not (as stated above) an expert on all of the acts for which commentaries were written. However, having taken basic classes on all the areas of law addressed by the commentaries, I was able to recognize most of the content of the commentaries (and to dig deeper if I found statements that seemed at odds with what I had learned). Overall, the quality of the commentaries was surprisingly high.

In conclusion, the commentary project has been a learning experience. Following the intensive activity of the summer and autumn, fewer commentaries have been added to the system. One reason for this might be that we no longer have the funds to pay authors. The monetary compensation was small but not insignificant. However, it seemed that the major incentives for authors were the opportunity to make law more accessible to others, as well as the chance to educate themselves by explaining the law to others. We have now enabled non-registered users to comment on the commentaries, using the Discussion namespace of MediaWiki. This way, we have identified and fixed many omissions and mistakes in the original commentaries.

One thing that seemed to work really well was the procedure of assigning work and following up on it. Writing a commentary on a long statute can be accomplished using free time over the course of a month or two. Setting a deadline, monitoring progress, and providing relevant feedback can serve as incentives for authors, too.

The commentaries produced by our project are written using best-efforts principles. There are no guarantees that the information in the commentaries is complete, updated, or even correct. However, having a principal author for each commentary gives that author an incentive to ensure the information in the commentary is as good as possible. Authors may use, and in fact have used, their commentaries as work samples. Even so, having someone other than the author read the text and provide feedback can improve the quality of a commentary as well.

With the commentary project, I feel we have proved that legal commentaries can be written in a crowdsourced way (even though we used monetary compensation to motivate our authors), and that wiki technology has the required capabilities, and is sufficiently user-friendly, for such an undertaking.

The next step for our commentaries project is to formalize and sustain an assignment and feedback workflow. This will require multiple project managers, and ideally some form of delegation of responsibility for different parts of the law. Further, I am confident that the project can be sustained using a voluntary framework. This remains to be proven, though.

Note: This project was supported by a grant from Internetfonden, for which we are eternally grateful.

Staffan Malmgren — the creator of the Swedish free access to law service lagen.nu — is a project manager at the Swedish Courts administration by day, working on coordinating the publishing of official legal information on the web, using structured and standardized formats (by night, he’s working on his final law school thesis, on the subject of jurisprudential relevance ranking in legal information systems).

Confessions of a Legal Info-holic

digital law, Digital law libraries, digital libraries, india, information retrieval, liis, open source software

Feb 1, 2010

In an extraordinary story, Jorge Luis Borges writes of a “Total Library”, organized into ‘hexagons’ that supposedly contained all books:

When it was proclaimed that the Library contained all books, the first impression was one of extravagant happiness. All men felt themselves to be the masters of an intact and secret treasure. . . . At that time a great deal was said about the Vindications: books of apology and prophecy which . . . [contained] prodigious arcana for [the] future. Thousands of the greedy abandoned their sweet native hexagons and rushed up the stairways, urged on by the vain intention of finding their Vindication. These pilgrims disputed in the narrow corridors . . . strangled each other on the divine stairways . . . . Others went mad. . . . The Vindications exist . . . but the searchers did not remember that the possibility of a man’s finding his Vindication, or some treacherous variation thereof, can be computed as zero. As was natural, this inordinate hope was followed by an excessive depression. The certitude that some shelf in some hexagon held precious books and that these precious books were inaccessible, seemed almost intolerable.

About three years ago I spent almost an entire sleepless month coding OpenJudis – my rather cool, “first-of-its-kind” free online database of Indian Supreme Court cases. The database hosts the full texts of about 25,000 cases decided since 1950. In this post I embark on a somewhat personal reflection on the process of creating OpenJudis – what I learnt about access to law (in India), and about “legal informatics,” along with some meditations on future pathways.

Having, by now, attended my share of FLOSS events, I know it is the invariable tendency of anyone who’s written two lines of free code to consider themselves qualified to pronounce on lofty themes – the nature of freedom and liberty, the commodity, scarcity, etc. With OpenJudis, likewise, I feel like I’ve acquired the necessary license to inflict my theory of the world on hapless readers – such as those at VoxPopuLII!

I begin this post by describing the circumstances under which I began coding OpenJudis. This is followed by some of my reflections on how “legal informatics” relates to and could relate to law.

Online Access to Law in India

India is privileged to have quite a robust ICT architecture. Internet access is relatively inexpensive, and the ubiquity of “cyber cafes” has resulted in extensive Internet penetration, even in the absence of individual subscriptions.

Government bodies at all levels are statutorily obliged to publish, on the Internet, vital information regarding their structure and functioning. The National Informatics Centre (NIC), a public sector corporation, is responsible for hosting, maintaining and updating the websites of government bodies across the country. These include, inter alia, the websites of the Union (federal) Government, the various state governments, union and state ministries, constitutional bodies such as the Election Commission and the Planning Commission, and regulatory bodies such as the Securities and Exchange Board of India (SEBI). These websites typically host a wealth of useful information including, for example, the full texts of applicable legislation, subordinate legislation, administrative rulings, reports, census data, and application forms.

The NIC has also been commissioned by the judiciary to develop websites for courts at various levels and publish decisions online. As a result, beginning around the year 2000, the Supreme Court and various high courts have been publishing their decisions on their websites. The full texts of all Supreme Court decisions rendered since 1950 have been made available, which is an invaluable free resource for the public. Most High Court websites, however, have not yet made archival material available online, so at present, access remains limited to decisions from the year 2000 onwards. More recently, the NIC has begun setting up websites for subordinate courts, although this process is still at a very embryonic stage.

Apart from free government websites, a handful of commercial enterprises have been providing online access to legal materials. Among them, two deserve special mention. SCCOnline, a product of one of the leading law report publishers in India, provides access to the full texts of decisions of the Indian Supreme Court. The CD version of SCCOnline sells for about INR 70,000 (about US$1,500), which is around the same price the company charges for a full set of print volumes of its reporter. For an additional charge, the company offers updates to the database. The other major commercial venture in the field is Manupatra, which offers access to the full text of decisions of various courts and tribunals as well as the texts of legislation. Access is provided for a basic charge of about US$100, plus a charge of about US$1 per document downloaded. While seemingly modest by international standards, these charges are unaffordable for large sections of the legal profession and the lay public.

OpenJudis

In December 2006, I began coding OpenJudis. My reasons were purely selfish. While the full texts of the decisions of the Supreme Court were already available online for free, the search engine on the government website was unreliable and inadequate for (my) advanced research needs. The formatting of the text of cases themselves was untidy, and it was cumbersome to extract passages from them. Frequently, the website appeared overloaded with users, and alternate free sources were unavailable. I couldn’t afford any of the commercial databases. My own private dissatisfaction with the quality of service, coupled with (in retrospect) my completely naive optimism, led me to attempt OpenJudis. A third crucial factor on the input side was time, and a “room of my own,” which I could afford only because of a generous fellowship I had from the Open Society Institute.

I began rashly, by serially downloading the full texts of the 25,000 decisions on the Supreme Court website. Once that was done (it took about a week), I really had no notion of how to proceed. I remember being quite exhilarated by the sheer fact of being in possession of twenty-five thousand Supreme Court decisions. I don’t think I can articulate the feeling very well. (I have some hope, however, that readers of this blog and my fellow LII-ers will intuitively understand this feeling.) Here I was, an average Joe poking around on the Internet, and just-like-that I now had an archive of 25,000 key documents of our republic, cumulatively representing the articulations of some of the finest (and some not-so-fine) legal minds of the previous half-century, sitting on my laptop. And I could do anything with them.

The word “archive,” incidentally, as Derrida informs us, derives from the Greek arkheion, the residence of the superior magistrates, the archons – those who commanded. The archons both “held and signified political power,” and were considered to possess the right to both “make and represent the law.” “Entrusted to such archons, these documents in effect speak the law: they recall the law and call on or impose the law”. Surely, or I am much mistaken, a very significant transformation has occurred when ordinary citizens become capable of housing archives – when citizens can assume the role of archons at will.

Giddy with power, I had an immediate impulse to find a way to transmit this feeling, to make it portable, to dissipate it: an impulse that will forever mystify economists wedded to “rational” incentive-based models of human behavior. I wasn’t a computer engineer, I didn’t have the foggiest idea how I’d go about it, but I was somehow going to host my own online free database of Indian Supreme Court cases. The audacity of this optimism bears out one of Yochai Benkler’s insights about the changes wrought by the new “networked information economy” we inhabit. According to Benkler,

The belief that it is possible to make something valuable happen in the world, and the practice of actually acting on that belief, represent a qualitative improvement in the condition of individual freedom [because of NIE]. They mark the emergence of new practices of self-directed agency as a lived experience, going beyond mere formal permissibility and theoretical possibility.

Without my intending it, the archive itself suggested my next task. I had to clean up the text and extract metadata. This process occupied me for the longest time during the development of OpenJudis. I was very new to programming and had only just discovered the joys of Regular Expressions. More than my inexperience with programming techniques, however, it was the utter heterogeneity of reporting styles that took me a while to accustom myself to. Both opinion-writing and reporting styles had changed dramatically in the course of the fifty years my database covered, and this made it difficult to find patterns when extracting, say, the names of judges involved. Eventually, I had cleaned up the texts of the decisions and extracted an impressive (I thought) set of metadata, including the names of parties, the names of the judges, and the date the case was decided. To compensate for the absence of headnotes, I extracted names of statutes cited in the cases as a rough indicator of what each case might relate to. I did all this programming in PHP with the data housed in a MySQL database.
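To give a flavor of this kind of extraction, here is a small sketch, written in Python rather than the PHP I actually used. The sample judgment, its header labels, and all the patterns are invented for illustration; the real texts were far messier, which is why more than one date pattern is tried in turn.

```python
import re
from datetime import date

# Several date styles are tried in order, since no single pattern fits
# fifty years of reporting. All formats here are invented samples.
DATE_PATTERNS = [
    re.compile(r"DATE OF JUDGMENT:?\s*(\d{1,2})/(\d{1,2})/(\d{4})"),
    re.compile(r"Decided on:?\s*(\d{1,2})\.(\d{1,2})\.(\d{4})", re.IGNORECASE),
]
BENCH = re.compile(r"^BENCH:\s*(.+)$", re.MULTILINE)
# Crude heuristic: a run of capitalized words ending in "Act, <year>".
# Any capitalized word directly before the name (e.g. "The") is captured too.
STATUTE = re.compile(r"\b((?:[A-Z][a-z]+ )+Act, \d{4})")


def extract_metadata(text):
    """Return the decision date, the judges, and the statutes cited."""
    decided = None
    for pattern in DATE_PATTERNS:
        m = pattern.search(text)
        if m:
            day, month, year = (int(g) for g in m.groups())
            decided = date(year, month, day)
            break
    bench = BENCH.search(text)
    judges = [j.strip() for j in bench.group(1).split(",")] if bench else []
    statutes = sorted(set(STATUTE.findall(text)))
    return {"decided": decided, "judges": judges, "statutes": statutes}


sample = """PETITIONER: A. EXAMPLE
RESPONDENT: STATE OF EXAMPLE
DATE OF JUDGMENT: 12/03/1975
BENCH: Judge One, Judge Two

The appellant relies on the Indian Contract Act, 1872 and the
Limitation Act, 1963. Under the Limitation Act, 1963 the claim is barred.
"""

meta = extract_metadata(sample)
```

The list of statutes is deduplicated and sorted, which is enough for it to serve as a rough subject indicator in the absence of headnotes.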

And then I encountered my first major roadblock that threatened to jeopardize the whole operation: I ran my first full-text Boolean search on the MySQL database and the results took a staggering 20 minutes to display. I was devastated! More elaborate searches took longer. Clearly, this was not a model I could host online. Or do anything useful with. Nobody in their right mind would want to wait 20 minutes for the results of their search. I had to look for a quicker database, or, as I eventually discovered, a super fast, lightweight indexing search engine. After a number of failed attempts with numerous free search engine software programs, none of which offered either the desired speed or the search capability I wanted, I was getting quite desperate. Fortunately, I discovered Swish-e, a lightweight, Perl-based Boolean search engine which was extremely fast and, most importantly, free – exactly what I needed. The final stage of creating the interface, uploading the database, and activating the search engine happened very quickly, and sometime in the early hours of December 22nd, 2006, OpenJudis went live. I sent announcement emails out to several e-groups and waited for the millions to show up at my doorstep.
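Why an indexing engine is so much faster than scanning every document can be shown with a toy inverted index. The Python sketch below illustrates the general technique only; it makes no claim about how Swish-e works internally, and the class, its method names, and the sample documents are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Maps each word to the set of documents containing it, built once up front."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> ids of documents containing it

    def add(self, doc_id, text):
        for word in tokenize(text):
            self.postings[word].add(doc_id)

    def lookup(self, word):
        return self.postings.get(word.lower(), set())

    def all_of(self, *words):
        """Boolean AND: documents containing every word."""
        sets = [self.lookup(w) for w in words]
        return set.intersection(*sets) if sets else set()

    def any_of(self, *words):
        """Boolean OR: documents containing at least one word."""
        return set().union(*(self.lookup(w) for w in words))

    def without(self, hits, word):
        """Boolean NOT, applied to an existing result set."""
        return hits - self.lookup(word)


index = InvertedIndex()
index.add(1, "Appeal under the Contract Act dismissed")
index.add(2, "Writ petition on freedom of speech allowed")
index.add(3, "Appeal on freedom of trade allowed")

both = index.all_of("appeal", "allowed")
either = index.any_of("contract", "speech")
narrowed = index.without(index.all_of("freedom"), "speech")
```

A query touches only the posting sets of the words it mentions, so its cost depends on how many documents contain those words rather than on the total size of the archive.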

They never did. After a week, I had maybe a hundred users. In a month, a few hundred. I received some very complimentary emails, which was nice, but it didn’t compensate for the failure of “millions” to show up. Over the next year, I added some improvements:
1) First, I built an automatic update feature that would periodically check the Supreme Court website for new cases and update the database on its own.
2) In October 2007, I coded a standalone MS Windows application of the database that could be installed on any system running Windows XP. This made sense in a country where PC penetration is higher than Internet penetration. The Windows application became quite popular and I received numerous requests for CDs from different corners of the country.
3) Around the same time, I also coded a similar application for decisions of the Central Information Commission – the apex statutory tribunal for adjudicating disputes under the Right to Information Act.
4) In February 2008, both applications were included in the DVD of Digit Magazine – a popular IT magazine in India.

Unfortunately, in August 2008, the Supreme Court website changed its design so that decisions could no longer be downloaded serially in the manner I had been accustomed to. One can only speculate about what prompted this change, since no improvements were made to the actual presentation of the cases. The new format was far more difficult for me to “hack,” and since my work left me with no time to attempt to circumvent it, I abandoned the attempt.

Fortunately, at around the same time, an exciting new project called IndianKanoon was started by Sushant Sinha, an Indian computer science graduate at Michigan. In addition to decisions of the Supreme Court, his site covers several high courts and links up to the text of legislation of various kinds. Although I have not abandoned plans to develop OpenJudis, the presence of IndianKanoon has allowed me to step back entirely from this domain, secure in the knowledge that it is being taken forward by abler hands than mine.

Predictions, Observations, Conclusions

I’d like to end this already-too-long post with some reflections, randomly ordered, about legal information online.
1) I think one crucial area commonly neglected by most LIIs is client-side software that enables users to store local copies of entire databases. The urgency of this need is highlighted in the following hypothetical about digital libraries by Siva Vaidhyanathan (from The Anarchist in the Library):

So imagine this: An electronic journal is streamed into a library. A library never has it on its shelf, never owns a paper copy, can’t archive it for posterity. Its patrons can access the material and maybe print it, maybe not. But if the subscription runs out, if the library loses funding and has to cancel that subscription, or if the company itself goes out of business, all the material is gone. The library has no trace of what it bought: no record, no archive. It’s lost entirely.

It may be true that the Internet will be around for some time, but it might be worthwhile for LIIs to stop emulating the commercial database models of restricting control while enabling access. Only then can we begin to take seriously the task of empowering users to become archons.

2) My second observation pertains to interface and usability. I have long been planning to incorporate a set of features including tagging, highlighting, annotating, and bookmarking that I myself would most like to use. Additionally, I have been musing about using Web 2.0 to enable user participation in maintenance and value-add operations, for example by allowing users to proofread the text of judgments and to compose headnotes. At its most ambitious, in these “visions” of mine, OpenJudis looks like a combination of LII + social networking + Wikipedia.

A common objection to this model is that it would upset the authority of legal texts. In his brilliant essay A Brief History of the Internet from the 15th to the 18th century, the philosopher Lawrence Liang reminds us that the authority of knowledge that we today ascribe to printed text was contested for the longest period in modern history.

Far from ensuring fixity or authority, this early history of Printing was marked by uncertainty, and the constant refrain for a long time was that you could not rely on the book; a French scholar Adrien Baillet warned in 1685 that “the multitude of books which grows every day” would cast Europe into “a state as barbarous as that of the centuries that followed the fall of the Roman Empire.”

Europe’s non-descent into barbarism offers us a degree of comfort in dealing with Adrien Baillet-type arguments made in the context of legal information. The stability that we ascribe to law reports today is a relatively recent historical innovation that began in the mid-19th century. “Modern” law has longer roots than that.

3) While OpenJudis may look like quite a mammoth endeavor for one person, I was at all times intensely aware that this was by no means a solitary undertaking, and that I was “standing on the shoulders of giants.” They included the nameless thousands at the NIC who continue to design websites, scan and upload cases on the court websites – a Sisyphean task – and the thousands whose labor collectively produced the free software I used: Fedora Core 4, PHP, MySQL, Swish-E. And lastly, the nameless millions who toil to make the physical infrastructure of the Internet itself possible. Like the ground beneath our feet, we take it for granted, even as the tragic events in Haiti in recent weeks remind us to be more attentive. (For a truly Herculean endeavor, however, see Sushant Sinha’s IndianKanoon website, about which many ballads may be composed in the decades to come.)

It might be worthwhile for the custodians of LIIs to enable users to become derivative producers themselves, to engage in “practices of self-directed agency” as Benkler suggests. Without sounding immodest, I think the real story of OpenJudis is how the Internet makes it plausible and thinkable for average Joes like me (and better-than-average people like Sushant Sinha) to think of waging unilateral wars against publishing empires.

4) So, what is the impact that all this ubiquitous, instant, free electronic access to legal information is likely to have on the world of law? In a series of lectures titled “Archive Fever,” the philosopher Derrida posed a similar question in a somewhat different context: What would the discipline of psychoanalysis have looked like, he asked, if Sigmund Freud and his contemporaries had had access to computers, televisions, and email? In brief, his answer was that the discipline of psychoanalysis itself would not have been the same – it would have been transformed “from the bottom up” and its very events would have been altered. This is because, in Derrida’s view:

The archive . . . in general is not only the place for stocking and for conserving an archivable content of the past. . . . No, the technical structure of the archiving archive also determines the structure of the archivable content even in its coming into existence and in its relationship to the future. The archivization produces as much as it records the event.

The implication, following Derrida, is that in the past, law would not have been what it currently is if electronic archives had been possible. And the obverse is true as well: in the future, because of the Internet, “rule of law” will no longer observe the logic of the stable trajectories suggested by its classical “analog” commentators. New trajectories will have to be charted.

5) In the same book, Derrida describes a condition he calls “Archive fever”:

It is to burn with a passion. It is never to rest, interminably, from searching for the archive right where it slips away. It is to run after the archive even if there’s too much of it. It is to have a compulsive, repetitive and nostalgic desire for the archive, an irrepressible desire to return to the origin, a homesickness, a nostalgia for the return to the most archaic place of absolute commencement.

I don’t know about other readers of VoxPopuLII (if indeed you’ve managed to continue reading this far!), but for the longest time during and after OpenJudis, I suffered distinctly from this malady. I downloaded indiscriminately whole sets of data that still sit unused on my computer, not having made it into OpenJudis. For those in a similar predicament, I offer Borges’s quote with which I began this text, as a reminder of the foolishness of the notion of “Total Libraries.”

Prashant Iyengar is a lawyer affiliated with the Alternative Law Forum, Bangalore, India. He is currently pursuing his graduate studies at Columbia University in New York. He runs OpenJudis, a free database of Indian Supreme Court cases.

VoxPopuLII is edited by Judith Pratt. Editor in Chief is Rob Richards.

Preserving Born-Digital Legal Materials…Where to Start?

Digital law libraries, Law librarians and legal informatics, open source software, Standards

Jan 10, 2010

It’s tempting to begin any discussion of digital preservation and law libraries with a mind-blowing statistic. Something to drive home the fact that the clearly-defined world of information we’ve known since the invention of movable type has evolved into an ephemeral world of bits and bytes, that it’s expanding at a rate that makes it nearly impossible to contain, and that now is the time to invest in digital preservation efforts.

But, at this point, that’s an argument that you and I have already heard. As we begin the second decade of the 21st century, we know with certainty that the digital world is ubiquitous because we ourselves are part of it. Ours is a world where items posted on blogs are cited in landmark court decisions, a former governor and vice-presidential candidate posts her resignation speech and policy positions to Facebook, and a busy 21st-century president is attached at the thumb to his Blackberry.

We have experienced an exhilarating renaissance in information, which, as many have asserted for more than a decade, is threatening to become a digital dark age due to technology obsolescence and other factors. There is no denying the urgent need for libraries to take on the task of preserving our digital heritage. Law libraries specifically have a critically important role to play in this undertaking. Access to legal and law-related information is a core underpinning of our democratic society. Every law librarian knows this to be true. (I believe it’s what drew us to the profession in the first place.)

Frankly speaking, our current digital preservation strategies and systems are imperfect – and they most likely will never be perfected. That’s because digital preservation is a field that will be in a constant state of change and flux for as long as technology continues to progress. Yet, tremendous strides have been made over the past decade to stave off the dreaded digital dark age, and libraries today have a number of viable tools, services, and best practices at our disposal for the preservation of digital content.

Law libraries and the preservation of born-digital content

In 2008, Dana Neacsu, a law librarian at Columbia University Law School, and I decided to explore the extent to which law libraries were actively involved in the preservation of born-digital legal materials. So, we conducted a survey of digital preservation activity and attitudes among state and academic law libraries.

We found an interesting incongruity among our respondent population of library directors who represented 21 law libraries: less than 7 percent of the digital preservation projects being planned or underway at our respondents’ libraries involved the preservation of born-digital materials. The remaining 93 percent involved the preservation of digital files created through the digitization of print or tangible originals. Yet, by a margin of 2 to 1, our respondents expressed that they believed born-digital materials to be in more urgent need of preservation than print materials.

This finding raises an interesting question: If law librarians (at least those represented among our respondents) believe born-digital materials to be in more urgent need of preservation, why were the majority of digital preservation resources being invested in the preservation of files resulting from digitization projects?

I speculate that part of the problem is that we often don’t know where to start when it comes to preserving born-digital content. What needs to be preserved? What systems and formats should we use? How will we pay for it?

What needs to be preserved? A few thoughts…

Determining what needs to be preserved is not as complicated as it may seem. The mechanisms for content selection and collection development that are already in place at most law libraries lend themselves nicely to prioritizing materials for digital preservation, as I have learned through the Georgetown Law Library’s involvement in The Chesapeake Project Legal Information Archive. A collaborative effort between Georgetown and partners at the State Law Libraries of Maryland and Virginia, The Chesapeake Project was established to preserve born-digital legal information published online and available via open-access URLs (as opposed to within subscription databases).

So, how did we approach selection for the digital archive? Within a broad, shared project collection scope (limited to materials that were law- or policy-related, digitally born, and published to the “free Web” per our Collection Plan) each library simply established its own digital archive selection priorities, based on its unique institutional mandates and the research needs of its users. Libraries have historically developed their various print collections in a similar manner.

The Maryland State Library focused on collecting documents relating to public-policy and legal issues affecting Maryland citizens. The Virginia State Library collected the online publications of the Supreme Court of Virginia and other entities within Virginia’s judicial branch of government. As an academic library, the Georgetown Law Library developed topical and thematic collection priorities based on research and educational areas of interest at the Georgetown University Law Center. (Previously, online materials selected for the Georgetown Law Library’s collection had been printed from the Web on acid-free paper, bound, cataloged, and shelved. Digital preservation offered an attractive alternative to this system.)

To build our topical digital archive collections, the Georgetown Law Library assembled a team of staff subject specialists to select content (akin to our collection development selection committee), and, to make things as simple as possible, submissions were made and managed using a Delicious bookmark account, which allowed our busy subject specialists to submit online content for preservation with only a few clicks.

As a research library, we preserved information published to the free Web under a claim of fair use. Permission from copyright holders was sought only for items published either outside of the U.S. or by for-profit entities. Taking our cues from the Internet Archive, we determined to respect the robots.txt protocol in our Web harvesting activities and provide rights holders with instructions for requesting the removal of their content from the archive.
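
To illustrate what respecting robots.txt can look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. It is not The Chesapeake Project's actual harvesting code. The robots.txt content, the `ExampleArchiveBot` user agent, the `example.org` URLs, and the `may_harvest` helper are all hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a publisher's site might serve it.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_harvest(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for candidate in (
        "https://example.org/reports/2009.pdf",
        "https://example.org/private/draft.pdf",
    ):
        allowed = may_harvest(SAMPLE_ROBOTS_TXT, "ExampleArchiveBot", candidate)
        print(candidate, "->", "harvest" if allowed else "skip")
```

A real harvester would fetch each site's live robots.txt file (for example with `set_url()` and `read()`) before queuing any of its pages, and would skip whatever the site owner has disallowed.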

Fear of duplicating efforts

We have, on occasion, knowingly added digital materials to our archive collection that were already within the purview of other digital preservation programs. There is a fear of duplicating efforts when it comes to digital preservation, but there is also a strong argument to be made for multiple, geographically dispersed entities maintaining duplicate preserved copies of important digital resources.

This philosophy, especially as relates to duplicating the digital-preservation efforts of the Government Printing Office, is currently being echoed among several Federal Depository Libraries (and prominently by librarians who contribute to the Free Government Information blog) who are supporting the concept of digital deposit to maintain a truly distributed Federal Depository Library Program. Should there ever be a catastrophic failure at GPO, or even a temporary loss of access (such as that caused by the PURL server crash last August), user access to government documents would remain uninterrupted, thanks to this distributed preservation network. Currently there are 156 academic law libraries listed as selective depositories on the Federal Depository Library Directory; each of these would be a candidate for digital deposit should the program come to fruition.

Libraries with perpetual access or post-cancellation access agreements with publishers may also find it worthwhile to invest in digital preservation activities that may be redundant. Some publishers offer easy post-cancellation access to purchased digital content via nonprofit initiatives such as Portico and LOCKSS, both of which function as digital preservation systems. Other publishers, however, may simply provide subscribers with a set of CDs or DVDs containing their purchased subscription content. In these cases, it is worthwhile to actively preserve these files within a locally managed digital archive to ensure long-term accessibility for library patrons, rather than relegating these valuable digital files, stored on an unstable optical medium, to languishing on a shelf.

Law reviews and legal scholarship

It has been suggested that academic law libraries take responsibility for the preservation of digital content cited within their institutions’ law reviews to ensure that future researchers will be able to reference source materials even if they are no longer available at the cited URLs. While there aren’t specific figures relating to the problem of citation link rot in law reviews, research on Web citations appearing in scientific journals has shown that roughly 10 percent of these citations become inactive within 15 months of the citing article’s publication. When it comes to Web-published law and policy information, our own Chesapeake Project evaluation efforts have found that about 14 percent, or 1 out of every 7, Web-based items had disappeared from their original URLs within two years of being archived.

In the near future, we may find ourselves in the position of taking responsibility for the digital preservation of our law reviews themselves, given the call to action in the Durham Statement on Open Access to Legal Scholarship. After all, if law schools end print publication of journals and commit “to keep the electronic versions available in stable, open, digital formats” within open-access online repositories, there is an implicit mandate to ensure that those repositories offer digital preservation functionality, or that a separate dark digital preservation system be used in conjunction with the repository, to ensure long-term access to the digital journal content. (It is important to note that digital repository software and services do not necessarily feature standard digital preservation functionality.)

Speaking of digital repositories, the responsibility for establishing and maintaining institutional repositories most certainly falls to the law library, as does the responsibility for preserving the digital intellectual output of its law school’s faculty, institutes, centers, and students (many of whom go on to impressive heights).

At the Georgetown Law Library, we’ve also taken on the task of preserving the intellectual output published to the Law Center’s Web sites.

The Preserv project has compiled an impressive bibliography on digital preservation aimed specifically at preservation services for institutional repositories (but also covering many of the larger issues in digital preservation), which is worth reviewing.

What systems and formats should we use?

Did I mention that our current digital preservation strategies and systems are imperfect? Well, it’s true. That’s the bad news. No matter which system or service you choose, you will surely encounter occasional glitches, endure system updates and migrations, and be forced to revise your processes and workflows from time to time. This is a fledgling, evolving field, and it’s up to us to grow and evolve along with it.

But, take heart! The good news is that there are standards and best practices established to guide us in developing strategies and selecting digital preservation systems, and we have multiple options to choose from. The key to embarking on a digital preservation project is to be versed in the language and standards of digital preservation, and to know what your options are.

The language and standards of digital preservation

I have heard a very convincing argument against standards in digital preservation: Because digital preservation is a new, evolving field, complying with rigid standards can be detrimental to systems that require a certain amount of adaptability in the face of emerging technological challenges. While I agree with this argument, I also believe that it is tremendously useful for those of us who are librarians, as opposed to programmers or IT specialists, to have standards as a starting point from which to identify and evaluate our options in digital preservation software and services.

There are a number of standards to be aware of in digital preservation. Chief among these is the Open Archival Information System (OAIS) Reference Model, which provides the central framework for most work in digital preservation. A basic question to ask when evaluating a digital preservation system or service is, “Does this system conform to the OAIS model?” If not, consider that a red flag.

The Trustworthy Repositories Audit & Certification Criteria and Checklist, or TRAC, is a digital repository evaluation tool currently being incorporated into an international standard for auditing and certifying digital archives. A small number of large repositories have undergone (or are undergoing) TRAC audits, including E-Depot at the Koninklijke Bibliotheek (National Library of the Netherlands), LOCKSS, Portico, and HathiTrust. This number can be expected to increase in the coming years.

The TRAC checklist is also a helpful resource to consult in conducting your own independent evaluations. Last year, for example, the libraries participating in The Chesapeake Project commissioned the Center for Research Libraries to conduct an assessment (as opposed to a formal audit) of our OCLC digital archive system based on TRAC criteria, which provided useful information to strengthen the project.

The PREMIS Data Dictionary provides a core set of preservation metadata elements to support the long-term preservation and future renderability of digital objects stored within a preservation system. The PREMIS working group has created resources and tools to support PREMIS implementation, available via the Library of Congress’s Web site. It is useful to consult the data dictionary when establishing local policy, and to ask about PREMIS compatibility when evaluating digital preservation options.
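
One kind of preservation metadata is fixity information: a checksum recorded when a file enters the archive, so that later checks can confirm the file has not changed. The following is a minimal sketch, assuming a plain Python dictionary whose keys loosely borrow the names of PREMIS's fixity and size elements (`messageDigestAlgorithm`, `messageDigest`, `size`). The dictionary layout and the helper functions `fixity_record` and `is_unchanged` are illustrative assumptions, not an official PREMIS serialization or tool.

```python
import hashlib
import os


def fixity_record(path: str, algorithm: str = "sha256") -> dict:
    """Compute a checksum and byte size for the file at path."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        # Key names loosely follow PREMIS element names (an assumption here).
        "messageDigestAlgorithm": algorithm,
        "messageDigest": digest.hexdigest(),
        "size": os.path.getsize(path),
    }


def is_unchanged(path: str, record: dict) -> bool:
    """Recompute fixity and compare it with a previously stored record."""
    return fixity_record(path, record["messageDigestAlgorithm"]) == record
```

A library might store such a record at ingest and rerun `is_unchanged` on a schedule; a mismatch signals that the stored copy has been altered or corrupted and should be restored from another copy.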

While we’re on the exciting topic of metadata, the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH, not to be confused with OAIS), is another protocol to watch for, especially if discovery and access are key components of your preservation initiative. OAI-PMH is a framework for sharing metadata between various “silos” of content. Essentially, the metadata of an OAI-PMH compliant system could be shared with and made discoverable via a single, federated search interface, allowing users to search the contents of multiple, distributed digital archives at the same time.
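
As a rough sketch of how such metadata sharing works, an OAI-PMH harvester sends an HTTP request with a `verb` parameter (such as `ListRecords`) and a `metadataPrefix` (such as `oai_dc` for simple Dublin Core), then reads the XML that comes back. The base URL, the hand-written and abbreviated sample response, and the function names below are hypothetical; only the verb, the parameter names, and the XML namespaces come from the protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Dublin Core element namespace used inside oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hand-written, abbreviated example of a ListRecords response (hypothetical data).
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archive.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Judicial Branch Report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the request URL a harvester would send to an OAI-PMH endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles_from_response(xml_text: str) -> list:
    """Extract every Dublin Core title from a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC_NS + "title")]


if __name__ == "__main__":
    print(list_records_url("https://archive.example.org/oai"))
    print(titles_from_response(SAMPLE_RESPONSE))
```

A federated search service would run this kind of harvest against each participating archive and index the combined metadata in one place.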

For an easy-to-read overview of digital preservation practices and standards, I recommend Priscilla Caplan’s The Preservation of Digital Materials, which appeared in the Feb./March 2008 issue of Library Technology Reports. There are also a few good online glossaries available to help decipher digital preservation jargon: the California Digital Library Glossary, the Internet Archive’s Glossary of Web Archiving Terms, and the Digital Preservation Coalition’s Definitions and Concepts.

Open source formats and software

Open source and open standard formats and software play a vital role in the lifecycle management of digital content. In the context of digital preservation, open-source formats, which make their source code and specifications freely available, facilitate the future development of tools that can assist in the migration of files to new formats as technology progresses and older formats become obsolete. PDF, for example, although developed originally as a proprietary format by Adobe Systems, became a published open standard in 2008, meaning that developers will have a foundation for making these files accessible in the future.

Other open source formats commonly used in digital preservation include the TIFF format for digital images, the ARC or WARC file for Web archiving, and the Extensible Markup Language (XML) text format for encoding data or document structure information. Microsoft formats, such as Word Documents, do not comply with open standards; the proprietary nature of these formats will inhibit future access to these documents when these formats become obsolete. The Library of Congress has a useful Web site devoted to digital formats and sustainability (including moving image and sound formats), which is worth reviewing.

Open source software is also looked upon favorably in digital preservation because, as with open source formats, the software development and design process is transparent, allowing current and future developers to build new interfaces or updates for the software over time.

Open source does not necessarily mean free-of-charge, and in fact, many service providers utilize open source software and open standards in developing fee-based or subscription digital preservation solutions.

Digital preservation solutions

There are many factors to consider in selecting a digital preservation solution. What is the nature of the content being preserved, and can the system accommodate it? Is preservation the sole purpose of the system — so that the system need include only a dark archive — or is a user access interface also necessary? How much does the system cost, and what are the expected ongoing maintenance costs, both in terms of budget and staff time? Is the system scalable, and can it accommodate a growing amount of content over time? This list could go on…

Keep in mind that no system will perfectly accommodate your needs. (Have I mentioned that digital preservation systems will always be imperfect?) And there is no use in waiting for the “perfect system” to be developed. We must use what’s available today. In selecting a system, consider its adherence to digital preservation standards, the stability of the institution or organization providing the solution, and the extent to which the digital preservation system has been accepted and adopted by institutions and user communities.

In a perfect world, perhaps every law library would implement a free, build-it-yourself, OAIS-compliant, open-source digital preservation solution with a large and supportive user community, such as DSpace or Fedora. These systems put full control in the hands of the libraries, which are the true custodians of the preserved digital content. But, in practice, our law libraries often do not have the staff and technological expertise to build and maintain an in-house digital preservation system.

As a result, several reputable library vendors and nonprofit organizations have developed fee-based digital preservation solutions, often built using open-source software. The Internet Archive offers the Archive-It service for the preservation of Web sites. The Stanford University-based LOCKSS program provides a decentralized preservation infrastructure for Web-based and other types of digital content, and the MetaArchive Cooperative provides a preservation repository service using the open-source LOCKSS software. The Ex Libris Digital Preservation System and the collaborative HathiTrust repository both support the preservation of digital objects.

For The Chesapeake Project, the Georgetown, Maryland State, and Virginia State Law Libraries use OCLC systems: the Digital Archive for preservation, coupled with a hosted instance of CONTENTdm as an access interface.

In our experience, working with a vendor that hosted our content at a secure offsite location and managed system updates and migrations allowed us to focus our energies on the administrative and organizational aspects of the project, rather than the ongoing management of the system itself. We were able to develop shared project documentation, including preferred file format and metadata policies, and conduct regular project evaluations. Moreover, because our project was collaborative, it worked to our advantage to enlist a third party to store all three libraries’ content, rather than place the burden of hosting the project’s content upon one single institution. In short, working with a vendor can actually benefit your project.

The ultimate question: How will we pay for it?

We still seem to be in the midst of a global economic recession that has impacted university and library budgets. Yet, despite budget stagnation, there has been a steady increase in the production of digital content.

Digital preservation can be expensive, and law library staff members with digital preservation expertise are few. The logical solution to these issues of budget and staff limitations is to seek out opportunities for collaboration, which would allow for the sharing of costs, resources, and expertise among participating institutions.

Collaborative opportunities exist with the Library of Congress, which has created a network of more than 130 preservation partners throughout the U.S., and the law library community is also in the process of establishing its own collaborative digital archive, the Legal Information Archive, to be offered through the Legal Information Preservation Alliance, or LIPA.

During the 2009 AALL annual meeting, LIPA’s executive director announced that The Chesapeake Project had become a LIPA-sanctioned project under the umbrella of the new Legal Information Archive. As a collaborative project with expenses shared by three law libraries, The Chesapeake Project’s costs are currently quite low compared to other annual library expenditures, such as those for subscription databases. These annual costs will decrease as more law libraries join this initiative.

I firmly believe that law libraries must invest in digital preservation if we are to remain relevant and true to our purpose in the 21st century. The core reason libraries exist is to build collections, to make those collections accessible, to assist patrons in using our collections, and to preserve our collections forever. No other institution has been created to take on this responsibility. Digital preservation represents an opportunity in the digital age for law libraries to reclaim their traditional roles as stewards of information, and to ensure that our digital legal heritage will be available to legal scholars and the public well into the future.

Sarah Rhodes is the digital collections librarian at the Georgetown Law Library in Washington, D.C., and a project coordinator for The Chesapeake Project Legal Information Archive, a digital preservation initiative of the Georgetown Law Library in collaboration with the State Law Libraries of Maryland and Virginia.

VoxPopuLII is edited by Judith Pratt. Editor in Chief is Rob Richards.

Venture Capital and Peer Production

open source software, Peer production, software innovation

Oct 19, 2009

This blog entry focuses on the need for more and better software to reap the benefits of the legal information treasures available. As you’ll see, this turns out to be more complex than one may think.

For commercial software developers, it is surprisingly hard to stay radically innovative, especially when they are successful. To start with, software development itself is a risky undertaking. Despite five decades of research in managing this development process, projects frequently are late, over budget, and much less impressive than originally envisioned. IBM once famously bet the company on a new computer platform, but the development of the associated operating system was so far behind schedule that it threatened IBM’s existence. Management was tempted to throw ever more human resources at the development problem, only to discover that this in itself causes further delays – leaving us with the useful term “mythical man-month”.

But the difficulty in envisioning hurdles in a complex software engineering project is not the only source of risk for innovative software developers. Successful developers may pride themselves on a large and increasing user base. Such success, however, creates its own unintended constraints.

Customers will dislike rapid change in the software they use, as they will have to relearn how to operate it, may have to expend efforts on converting data to new formats, and/or may need to adjust the preferences and customization options they utilized. This gets worse if the successful software is the platform for a thriving ecosystem of other developers and service providers. Any severe change in the underlying platform means that those living in it have to adapt their code. Each time a customer has to invest time in relearning a software product, it offers competing software providers a chance to nab a customer. This prompts software developers, especially very successful ones, to be relatively conservative in their plans for updates and upgrades. They don’t want to undermine their market success, and thus will be tempted to opt for gradual rather than radical innovation when designing the next version of their successful wares.

We have seen it over and over again: Microsoft’s Word, Powerpoint and Excel have gone through numerous iterations over the past decades, but the basic elements of the user experience have changed relatively little. Similarly, concerns for legacy code by third party developers have been a key holdback for Microsoft’s Windows product team. Don’t break something – even if it is utterly ancient and inefficient, buggy and broken – as long as it works for the customers. That’s the understandable, but frustrating, mantra.

Or think of Google: the search engine’s user interface hasn’t seen any major changes since its inception more than a decade ago. Only Apple, it seems, has been getting away with radical innovation that breaks things and forces users to relearn, to convert data, and to expend time. That is the advantage of a small but fervently loyal user base. But even Apple has recently seen the need to take a breather in radical change with Snow Leopard.

And in the legal information context, think of Westlaw and Lexis/Nexis. Despite direct competition with one another, when was the last time we saw a truly radical innovation coming from either of these two companies?

Radical innovation requires the will to risk alienating users. As companies grow and pay attention to shareholder expectations, the will-to-risk often wanes. With radical innovation in the marketplace, the challenge lies in the time axis. If one is very successful with a radically new product at time T, it is hard to throw that product away, and try to risk radically reinventing it, for T+1.

On a macro level, we combat this conservative tendency against radical change by providing incentives for innovative entrepreneurs to develop and market competing offerings. If enough customers are unhappy with Excel, perhaps entrepreneurs with radically new and improved concepts of how to crunch and manage numbers in a structured way will seize the opportunity and develop a new killer app that they’ll pit against Excel. That’s enormously risky, but also offers the potential of very steep rewards. Angel investors and venture capitalists thrive on providing the lubricant (in the form of financial resources) for such high risk, high reward propositions. They flourish on the improbable. What they don’t like are “small ideas.” (It happened to me, too, when I pitched innovative ideas to VCs; they thought my ideas had a very high likelihood of success, but not enough of a lever to reap massive returns. Obviously I was dismayed, but they were right: it is what we need if we want to incentivize radical innovation.)

This also implies, however, that for venture capital to work, markets need to be large enough to offer high rewards for risky ventures. If the market is not large enough, venture capital may not be available for a sufficient number of radical innovators to keep pushing the limit. Therefore, existing providers may survive for a long time with incremental innovations. Perhaps that is why Westlaw and Lexis are still around, even though they could fight the tendency toward piecemeal development if they wanted to.

Other large corporations, realizing the bias towards incremental innovation, have repeatedly resorted to radical steps to remedy the problem. They have established skunk works, departments that are largely disconnected from the rest of the company, freeing the members to try revolutionary rather than evolutionary solutions. Sometimes companies acquire a group of radically innovative engineers from the outside, to inject some fresh thinking into internal development processes that may have become too stale.

Peer production models, almost always based on an open source foundation, are not dependent on market success. (On the drivers of peer production see Yochai Benkler’s “The Wealth of Networks”). They are not profit driven, and thus may put less pressure on the developers to abstain from radical change. Because Firefox does not have to win in the marketplace, its developers can, at least in theory, be bolder than their commercial counterparts.

Unfortunately, open-source peer produced software may also lose its appetite for radical innovation over time – not because of monetary incentives, but because of the collaborative structures utilized in the design process. If a large number of volunteering bug reporters, testers, and coders with vastly differing values and preferences work on a joint project, it is likely that development will revert towards a common denominator of what needs to be done, and thus be inherently gradual and evolutionary, rather than radical. Of course, a majority of participants may at rare moments get together and agree on a revolution – much like those in what then was a British colony in 1776. But that is the brilliant exception to a rather boring rule.

Indecisiveness that stems from too small a common ground, however, is not the only danger. At the other end of the spectrum, communities and groups whose members have too many ties to one another develop a mental alignment, or “group think,” that equally stifles radical innovation. Northwestern University professor Brian Uzzi has written eloquently about this problem. Finding the sweet spot between the two extremes is what’s necessary, but in the absence of an outside mechanism that balance is difficult to achieve for open source peer-producing groups.

If we would like to remedy this situation, how could we offer incentives to peer producing communities to more often give radical rather than incremental innovation a try? What could be the mechanism that takes on the role of venture capitalists and skunk works in the peer production context?

It surely isn’t telling dissenters with a radically new idea to “fork out” of a project. That’s like asking a commercial design group to leave the company and try on their own, but without providing them with enough resources or incentives. Not a good idea if we want to make radical innovation – the experimentation with revolutionary rather than incremental ideas – easier, not harder.

But what is the venture capital/skunk works equivalent in the peer-producing world?

A few thoughts come to mind, but I invite you to add your ideas, because I may not be thinking radically enough.

(1) Users: Users, from large to small, could volunteer, perhaps through a website, to dedicate some modicum of their time to advancing an open source project not by contributing to its design, but by committing to being first adopters of more radical design solutions. One may imagine a website that helps link users (including law firms) willing to dedicate some “risk” to such riskier open source peer produced projects, perhaps on a sectoral basis (could this be yet another mission for the LII?).

(2) Designers: Quite a number of corporations and organizations explicitly support open source peer producing projects, mostly by dedicating some of their human resources to improving the code base. These organizations could, if they wanted to improve the capability of such projects to push for more radical innovation, set up incentives for employees to select riskier projects.

(3) Tools: The very tools used to organize peer production of software code already offer many techniques for managing a diverse array of contributors. These tools could be altered to evaluate a group’s level of diversity and willingness to take risks, based on the findings of social network theory. Such an approach would at least provide the community with a sense of its potential and propensity for radical innovation, and could help group organizers in influencing group composition and group dynamics. (Yes, this is “data.gov” and the government IT dashboards applied to this context.)
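
To make the third idea a little more concrete, here is a rough sketch of what such a tool might compute. It is only an illustration under loud assumptions: the data shape (a map from contributor to the files they have touched), the notion that two contributors are "tied" when they have worked on the same file, and the `low` and `high` cut-offs are all hypothetical, chosen for the example rather than taken from social network research. The function simply measures how many pairs of contributors share a tie, which is one crude proxy for where a group sits between "too little common ground" and "too many ties."

```python
from itertools import combinations


def tie_density(contributions):
    """Return the fraction of contributor pairs that share at least one file.

    `contributions` maps a contributor name to the set of files they have
    touched. This data shape is a hypothetical stand-in for whatever a real
    project-hosting tool would record.
    """
    people = sorted(contributions)
    pairs = list(combinations(people, 2))
    if not pairs:
        # Fewer than two contributors: no ties are possible.
        return 0.0
    tied = sum(1 for a, b in pairs if contributions[a] & contributions[b])
    return tied / len(pairs)


def describe(density, low=0.2, high=0.8):
    """Label a density value. The cut-offs are illustrative assumptions only."""
    if density < low:
        return "sparse"  # little common ground between contributors
    if density > high:
        return "dense"   # many overlapping ties; possible group think
    return "mixed"


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = {
        "ana": {"parser.c", "ui.c"},
        "ben": {"ui.c"},
        "cy": {"docs.md"},
    }
    d = tie_density(sample)
    print(f"tie density: {d:.2f} ({describe(d)})")
```

A real dashboard would need far richer signals (who reviews whose code, how often newcomers join, and so on), but even a single number like this could give group organizers a starting point for discussion.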

These are nothing more than a few ideas. Many more are necessary to identify the best ones to implement. But given the rise and importance of peer production, and the constraints inherent in how it is organizing itself, the conversation about how to best provide incentives for radical innovation in the legal information context – and beyond – is one we must have.

[NB: What do you all think? How does this apply to the world of legal information, and to specialized software applications that support it — things like point-in-time legislative systems, specialized processing tools, and so on? Comments please…. (the ed.)]

Viktor Mayer-Schönberger is Associate Professor of Public Policy and Director of the Information + Innovation Policy Research Centre at the LKY School of Public Policy / National University of Singapore. He is also a faculty affiliate of the Belfer Center of Science and International Affairs at Harvard University. He has published many books, most recently “Delete – The Virtue of Forgetting in the Digital Age.” He is a frequent public speaker and a sought-after expert for print and broadcast media worldwide. He is also on the boards of numerous foundations, think tanks and organizations focused on studying the foundations of the new economy, and advises governments, businesses and NGOs on new economy and information society issues. In his spare time, he likes to travel, go to the movies, and learn about architecture.

VoxPopuLII is edited by Judith Pratt.
