Over the last several months, I have spent an awful lot of time travelling. I met with a lot of people who work in legal information, both here in the US and abroad. And I had every intention of filling this blog with posts about interesting things I'd seen and heard -- a kind of travelogue of legal informatics. It's been slow in coming. Actually, I'm not convinced that a travelogue per se is what's needed. Event-by-event reporting is easy. Drawing a map of everything that's going on out there in the world of legal information is not.
It's a big world now. People are doing legal informatics in a lot of places. And that phrase -- "doing legal informatics" -- now includes a breathtaking number of disciplines and perspectives. It used to be that we thought of ourselves as situated at the corner of law publishing and computer science. Now we need to add big chunks of information science (itself a composite field), legal bibliography, digital librarianship, e-government studies, political science, and sociology of the professions. At various times during the last six months I've had fascinating discussions with representatives from each of those academic disciplines, and from the practical side of librarianship, government, and publishing. Each was working in a distinct context leading to different kinds of insights and solutions. Each worked within a different legal regime. Each had mapped a different part of the world.
We badly need communication across borders -- national, disciplinary, institutional. It's important that we do that now. We have opportunities -- and challenges -- of unprecedented scale and scope. We can act on those most effectively if we can stitch together all the little maps, overlay them, and get a more complex and mutually informed view of the world. And stop reinventing the wheel in each of its out-of-the-way corners.
It has taken well over a decade to reach this point. Legal and government information showed up on the Web in the early '90s -- our own efforts here at the LII and Carl Malamud's liberation of the EDGAR database were leading examples. Those were quickly followed by open-access projects in Canada and Australia. Digital-government projects began in the US around 1995 or 1996, including many self-publication projects in courts and legislatures. These efforts created significant pools of data based on open standards, and the availability of that data made it possible for information-science researchers to pay far more attention to legal data than they could when it was behind proprietary barriers. Now we're seeing a lot more work on legal data by computer scientists working with language technologies, database specialists, semantic-Web engineers, and others. In Europe, work on integration of government information was propelled (and, ultimately, funded) by the requirements of unification. Everywhere, more and more courts, legislatures, and agencies are putting information on the Internet in more and better ways using improved technologies.
A condensed narrative like the preceding demands oversimplification, and I apologize if I've slighted anyone out of sheer middle-aged forgetfulness. And this tale no doubt has its beginnings much earlier -- you could, for example, point at the long cooperation between the statistical arms of various government agencies and academia and industry as part of the story. But as with so many other things the rise of the Web was the start of a new wave. That long, slow groundswell -- the product of many individual efforts over a decade and a half -- is now peaking.
The American press first saw fit to remark on it about a year ago, with the release of extensive caselaw datasets by public.resource.org -- Carl Malamud's latest effort. A community has started to form. Just within the last year or so, we've seen:
- a variety of open-government initiatives continuing important work already underway at the Sunlight Foundation
- a critical study of the e-rulemaking system by a special committee of the American Bar Association
- public clamor for greater Congressional transparency
- a highly influential paper by a group at Princeton advocating the creation of a robust market in value-added government data, premised on wide release in bulk form
- the creation of a public commons of legal-education information
- an effort to identify and standardize "Rosetta Stone" datasets that link together other datasets in ways that promote aggregation, integration, and interoperability
- advocacy for new models for cooperation between volunteers wanting to "hack government" and the government itself
- more and more attempts to bring open standards to legislation and judicial opinions, including MetaLex, OAI4Courts, and the nascent (and as yet unpublished) urn:lex standard
But ... we need to be talking to each other much, much more. We need the kind of efficiency that we can only get by learning from one another. We need to make informed choices between inexpensive automated approaches that work by brute force and the hand-crafted, highly accurate approaches of legal bibliography that are not always scalable or affordable. We need to recalibrate what we mean by "authority", and begin to think about measures of quality and reliability for legal text that avoid the creation of unnatural monopolies in legal information.
We do all need to be talking more, and this week the LII starts a modest effort in that direction. Our new guest blog, VoxPopuLII, is designed to help the conversation along with biweekly posts from folks you may not have heard from before. They're from all different tribes in all different places on the intellectual and global map. We've asked for their big ideas -- and if you've got big ideas of your own, I'd invite you to get in touch with me about writing something for us. And of course we invite your comments and suggestions about what you find there.