“There is nothing like looking, if you want to find something.”

-J.R.R. Tolkien


This summer, at close to the very last minute, I set out for Cambridge, Massachusetts to pursue a peculiar quest for open access to law. Steering clear of the dragon on its pile of gold, I found some very interesting people in a library doing something in some ways parallel, and in many ways complementary, to what we do at LII.

At the Harvard Law School Library, there’s a group called the Library Innovation Lab, which uses technology to improve the preservation of, and public access to, library materials, including by digitizing large corpora of legal documents. Its work complements ours at the LII, and I went there to develop tools that would help both groups, and others as well.

The LIL summer fellowship program that made this possible brought together a group with wide-ranging interests: substantive research, from Neel Agrawal’s website on the history of African drumming laws to Muira McCammon’s research on the Guantanamo detainee library; crowdsourced documentation and preservation projects, such as Tiffany Tseng’s Spin and Pix devices, Alexander Nwala’s Local Memory, and Ilya Kreymer’s Webrecorder; and infrastructure projects, such as Jay Edwards’s Caselaw Access Project API.

My project involved work on a data model to help developers make connections between siloed collections of text and metadata. I hope it will help future developers automate the process of connecting concepts in online legal corpora (both LIL’s and ours at LII) to enriching data and context from multiple sources.

The work involved exploring a somewhat larger-than-usual number of ontologies, structured vocabularies, and topic models. Each, in turn, came with one or more sets of subjects. Some (like Eurovoc and the topic models) came with sizable amounts of machine-readable text; others (like Linked Data For Libraries) came with very little machine-accessible text. As I came to understand both the manageable and the insurmountable challenges associated with each one, I developed a far greater appreciation for the intuition that had led me to the project all along: there is a lot of useful information locked in these resources, and each has a role to play.

In the process, I drew enormous inspiration from the dedication and creativity of the LIL group, from Paul Deschner’s Haystacks project, which provides a set of filters to create a manageable list of books on any subject or search term, to Brett Johnson’s work supporting the H2O open textbook platform, to Matt Phillips’s exploration of private talking spaces, to the Caselaw Access Project visualizations such as Anastasia Aisman’s topic mapping and Jack Cushman’s word clouds (supported by operational, programming, and metadata work from Kerri Fleming, Andy Silva, Ben Steinberg, and Steve Chapman). (All of this is thanks to the Harvard Law Library leadership of Jonathan Zittrain, LIL founder Kim Dulin, managing director Adam Ziegler, and library director Jocelyn Kennedy.)

And back again…

Returning home to LII, I’m grateful to have the rejuvenating energy that arises from talking to new people, observing how other high-performing groups do their work, and having had the dedicated time to bring a complicated idea to fruition. All in all, it was a marvelous summer with marvelous people. But they did keep looking at me as if to ask why I’d brought along thirteen dwarfs, and how I managed to vanish any time I put that gold ring on my finger.