
Ending up in legal informatics was probably more or less inevitable for me, as I wanted to study both law and electrical engineering from early on, and I just hoped that the combination would start making some sense sooner or later. ICT law (which I still pursue sporadically) emerged as an obvious choice, but AI and law seemed a better fit for my inner engineer.

The topic for my (still ongoing-ish) doctoral project just sort of emerged. Reading through the books recommended by my master’s thesis supervisor (professor Peter Blume in Copenhagen), a sentence in Cecilia Magnusson Sjöberg’s dissertation caught my eye: “According to Bench-Capon and Sergot, fuzzy logic is unsuitable for modelling vagueness in law.” (translation ar) Having had some previous experience with fuzzy control, I found this an interesting question to study in more detail. To me, vagueness and uncertainty did indeed seem like good places to use fuzzy logic, even in the legal domain.

After going through loads of relevant literature, I started looking for an example domain to do some experiments. The result was MOSONG, a fairly simple model of trademark similarity that used Type-2 fuzzy logic to represent both vagueness and uncertainty at the same time. Testing MOSONG yielded perfect results on the first validation set as well, which to me seemed more suspicious than positive. If the user/coder could decide the cases correctly without the help of the system, would it not affect the coding process as well? As a consequence I also started testing the system on a non-expert population (undergraduates, of course), and the performance started to conform better to my expectations.

My original idea for the thesis was to look at different aspects of legal knowledge by building a few working prototypes like MOSONG and then explaining them in terms of established legal theory (the usual suspects, starting from Hart, Dworkin, and Ross). Testing MOSONG had, however, made me perhaps more attuned to the perspective of an extremely naive reasoner, certainly a closer match for an AI system than a trained professional. From this perspective I found conventional legal theory thoroughly lacking, and so I turned to the more general psychological literature on reasoning and decision-making. After all, there is considerable overlap between cognitive science and artificial intelligence as multidisciplinary ventures. Around this time, the planned title of my thesis also received a subtitle, thus becoming Fuzzy Systems and Legal Knowledge: Prolegomena to a Cognitive Theory of Law, and a planned monograph morphed into an article-based dissertation instead.

Illustration by David Plunkett

One particularly useful thing I found was the dual-process theory of cognition, on which I presented a paper at IVR-2011 just a couple of months before Daniel Kahneman’s Thinking, Fast and Slow came out and everyone started thinking they understood what System 1 and System 2 meant. In my opinion, the dual-process theory has important implications for AI and law, and also explains why the field has struggled to create widely used systems of practical utility. Representing legal reasoning only in classically rational System 2 terms may be adequate for expert human reasoners (and simple prototype systems), but AI needs to represent the ecological rationality (as opposed to the cognitive biases) of System 1 as well, and to do this properly, different methods are needed, and on a different scale. Hello, Big Dada!

In practice this means that the ultimate way to properly test one’s theories of legal reasoning computationally is through a full-scale R&D process of an AI system that hopefully does something useful. In an academic setting, doing the R part is no problem, but the D part is a different matter altogether, both because much of the work required can be fairly routine and too uninteresting from a publication standpoint, and because the muchness itself makes the project incompatible with normal levels of research funding. Instead, typically, an interested external recipient is required in order to get adequate funding. A relevant problem domain and a base of critical test users should also follow as a part of the bargain.

In the case of legal technology, the judiciary and the public administration are obvious potential recipients. Unfortunately, there are at least two major obstacles to this. One is attitudinal, as exemplified by the recent case of a Swedish candidate judge whose career path was cut short after creating a more usable IR system for case law on his own initiative. The other is structural: public sector software procurement in general is in a state of crisis, due both to a limited understanding of how to develop software systems that result in efficiency rather than frustration, and to the constraints of procurement law and associated practices, which make such projects almost impossible to carry out successfully even if the required will and know-how were there.

The private sector is of course the other alternative. With law firms, the prevailing business model based on hourly billing offers no financial incentives for technological innovation, as most notably pointed out by Richard Susskind, and the attitudinal problems may not be all that different. Legal publishers are generally not much better, either. And overall, in large companies the organizational culture is usually geared towards an optimal execution of plans from above, making it too rigid to properly foster innovation, and for established small companies the required investment and the associated financial risk are too great.

So what is the solution? To all early-stage legal informatics researchers out there: Find yourselves a start-up! Either start one yourself (with a few other people with complementary skillsets) or find an existing one that is already trying to do something where your skills and knowledge should come in handy, maybe just on a consultancy basis. In the US, there are already over a hundred start-ups in the legal technology field. The number of start-ups doing intelligent legal technology (and European start-ups in the legal field in general) is already much smaller, so it should not be too difficult to gain a considerable advantage over the competition with the right idea and a solid implementation. I myself am fortunate enough to have found a way to leverage all the work I have done on MOSONG by co-founding Onomatics earlier this year.

This is not to say that just any idea, even one that is good enough to be the foundation for a doctoral thesis, will make for a successful business. This is indeed a common pitfall with the commercialization of academic research in general. Just starting with an existing idea, a prototype or even a complete system and then trying to find problems it (as such) could solve is a proven path to failure. If all you have is a hammer, all your problems start to look like nails. This is also very much the case with more sophisticated tools. A better approach is to first find a market need and then start working towards a marketable technological solution for it, of course using all one’s existing knowledge and technology whenever applicable, but without being constrained by them when other methods work better.

Testing one’s theories by seeing whether they can actually be used to solve real-world problems is the best way forward towards broader relevance for one’s own work. Doing so typically involves considerable amounts of work that is neither scientifically interesting nor economically justifiable in an academic context, but which all the same is necessary to see if things work as they should. Because of this, such real-world integration is more feasible when done on a commercial basis. In this lies a considerable risk for the findings of this type of applied research to remain entirely confidential and proprietary as trade secrets, rather than becoming published at least to some degree, thus fuelling future research also in the broader research community and not just the individual company. To avoid this, active cooperation between the industry and academia should be encouraged.

Anna Ronkainen is currently working as the Chief Scientist of Onomatics, Inc., a legal technology start-up of which she is a co-founder. Previously she has worked with language technology both commercially and academically for over fifteen years. She is a serial dropout with (somehow) an LL.M. from the University of Copenhagen, and she expects to defend her LL.D. thesis Fuzzy Systems and Legal Knowledge: Prolegomena to a Cognitive Theory of Law at the University of Helsinki during the 2013/14 academic year. She blogs at (with Anniina Huttunen) and


VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

The making of law is a quintessentially public activity. The suggestion that any private person or group owns the law-making process is a cynical one; if it were true, there would be something deeply wrong with the state.

At first blush, then, “public legal information” seems to be redundant. If it’s legal information, then of course it’s public. Ever since Hammurabi inscribed his code of laws on a large stone and placed it in a public place for all to read, it’s been understood that public dissemination is part of the law-making process. Thus, when we discover that the dissemination of legal information is largely in the hands of private, profit-making publishing companies, it’s only natural to think of this as a worrisome condition. Somehow, it seems, private interests have appropriated a public function, and something ought to be done to return legal information to its proper public status.

My purpose in this blog entry (which must absolutely not be confused with the purpose, policy, or opinion of my employer, or anyone else for that matter) is neither to support nor oppose the notion that legal information should become more public or less private. It is, instead, to reflect on why the dissemination of legal information, at least in the United States, became a subject of private commerce, and to say a few words about the relationship between public legal information services like the LII and commercial law publishers.

One way of explaining the prevalence of private legal information is as a default of the public authorities. The government that makes the law had the responsibility to disseminate it, but it didn’t, and so private publishers stepped in to fill the need. There is an element of truth to this, but I don’t think it tells the whole story. It is true that various law-making bodies have, from time to time, seemed to decide that their responsibility to promulgate the law extended no further than ensuring that the law was somehow available to the public: that it was open for inspection in a clerk’s office, perhaps, or even just that it was spoken in a public place. As long as the law was not kept secret, it was public enough.

But before we criticize the government too harshly for failing to take the steps needed to make the law actually known to the public rather than simply theoretically available, consider what those next steps are. Let’s look first at the situation before the coming of electronic publishing.

For about the first two hundred years of the republic, the only practical way to disseminate the law was in print. And the process of printing the law has two elements that are inextricably intertwined: the mechanical tasks of setting type, printing pages, binding leaves into a book, and so forth; and the editorial task of deciding what the printed book would contain, and how it would be organized.

The making of the law is a quintessentially public function, and the mechanical parts of book production can be done as readily by the government as by anyone else. But the task of editing the law is inherently difficult for governments to perform. It’s hard enough for a representative body of elected officials to decide what the text of the law is. Deciding what law should be published in what volume, how it should be arranged within the volume, and what indexing and abstracting should be applied to it are tasks of a complexity and delicacy that make them very difficult for collective decision-makers.

Consider, for example, the fairly common situation in which a state makes official publication of its session laws, but leaves the publication of the codification of those laws to private publishers. The printing of session laws requires a relatively low level of editorial judgment—the laws are printed one after another in the order they were passed or signed by the governor. If an index is provided, it doesn’t need to be a very good index, because, as a practical matter, almost no one is going to use it. The task of printing session laws is often contracted to private publishers, but it can also be done quite satisfactorily by the government itself if the government has the necessary expertise in the mechanical aspects of book production.

Now, a state with published session laws has done considerably better than one that simply makes laws available for inspection in the office of the legislative clerk, but it still falls far short of making those laws known to the public. No one learns the state of the law by reading the session laws if a code is available. And producing a code requires a much higher degree of editorial judgment in the selection, arrangement, and indexing of its contents.

The federal government and some states have taken charge of the basic aspects of the codification process for statutes. And sometimes, as with the United States Code, there is an officially-produced version of the codified laws. But even this is never the resource of choice for those who wish to learn the state of the law. Even if there is an official code available, legal researchers much prefer to use an annotated code: a code which has the basic structure of the official code, but which enhances it with a much broader array of editorial aids, chiefly in the form of annotations and detailed indexes. And very few legislatures publish official annotated codes.

Private publishers came to dominate the task of making the law known because they excel at the editorial functions needed to make a large body of law intelligible to those who wish to learn the law. It is not impossible for public bodies to edit the law, but they’re not very good at it. Even where public law publishing has not been made largely irrelevant by the products of the private publishers, its editorial quality is suspect. My favorite example of this is the official index to the Federal Register: it is not a subject index, it’s an agency index. This illustrates one of the problems of public law publishing: from a bureaucratic point of view, what’s important is not what the law is about, but who made it.

Private law publishers were better than the government at the essential editorial tasks necessary to make the law known. And making the law known is a task that admits of degrees. A person with access to the Statutes at Large and the Federal Register had, in some sense, the wherewithal to know the federal tax laws of the United States. A person with access to the United States Code and the Code of Federal Regulations would have found the task much easier. A person with access to United States Code Annotated (or United States Code Service) and the Code of Federal Regulations Annotated would do better still. But no serious tax practitioner would venture into the area without access to one of the premier loose-leaf sets like the United States Tax Reporter or the CCH Standard Federal Tax Reporter.

In the world of paper law publishing, no matter what level of service the government chose to provide, there would always be (at least for areas of the law of wide interest or high value) commercial offerings that were better than the official ones. This might not be the case if the law were simple, straightforward, and easy to understand. But in the real world, the law is almost always so complicated and voluminous that it’s all but incomprehensible without a good measure of editorial guidance.

Thus, the state of the legal information world before the coming of computers was one where there was a limited amount of official publication, but in which nearly all of the most useful and valuable information tools were produced by private companies. Notwithstanding the private status of the publishers, the law publishing industry was seen by many people as a quasi-public activity. All of the major players made it a point to cultivate a public-service image for the enterprise, and a reputation for both punctilious accuracy and strict neutrality.

OK, so much for the world of paper books occupied by our ancestors. How much of this is still relevant in the age of the internet? Most of it, I think.

With the coming of the internet and advanced text-processing tools, the mechanical tasks associated with the production of paper books have largely been supplanted. No one needs to know how to set type, print pages, or bind books in order to create a useful public legal information resource on the internet. But the mechanical aspects of book production were never the real problem. In order for a public legal information resource on the internet to be able to take the place of private law publishers, there also needs to be a good technological substitute for the editorial component of law publishing.

For a while, this seemed possible to many people. Lexis became a major legal information resource without any significant investment in editorial work. The Saltonian orthodoxy that held sway in the information retrieval community from the 1960s through the 1980s taught that the days of manual intellectual indexing were numbered and that the future belonged to clever free text algorithms. But beginning with the Blair and Maron paper in 1985, it has become increasingly clear that, though exceedingly useful, free text retrieval techniques are not a complete substitute for intellectual indexing. The death knell for the Saltonian outlook sounded a few years ago when Lexis decided that it needed to add a measure of intellectual indexing to its main case law databases. Lexis, which founded its business on the idea that free text provided adequate access to the law, and which spent years pooh-poohing the intellectual indexing offered by Westlaw, threw in the towel. Perhaps there will, one day, be technology that obviates the need for human editorial effort, but that horizon is now too distant for a company that aspires to provide high-end legal information today.

The hope of liberating legal information from the private sector was, I think, greatly influenced by the early success of Lexis. With the coming of the internet and some related technologies, it became apparent that the functional equivalent of the early Lexis system could be assembled at relatively low cost by anyone who cared to do so. And as long as Lexis seemed to prosper without making any investment in editorial resources, it seemed reasonable that a public legal information resource could follow in its wake. Now, however, that most people have lost the technological optimism of the Saltonians, it is less easy to see how a public resource can reasonably compete with the offerings of the major private publishers.

Now a sophisticated tax practitioner would not attempt to do serious research without having access to the full array of tax information available through an electronic resource like Checkpoint, or Westlaw, or Lexis. And many would insist on having access to more than one of these services, and to a number of others besides. Electronic publication has not made the products of the private publishers dispensable; it has just changed the list of indispensable resources, and inspired the creation of some new ones.

I don’t mean to suggest that the availability of the new crop of public (and low-cost commercial) legal information resources has had no effect on legal research—far from it. But that effect has not, for the most part, been to make legal research simpler and less expensive: sometimes it’s simpler and cheaper, but just as often it’s more complex and more expensive. The unambiguous effect is that legal research is, or at least can be, better.

There are any number of respects in which legal research has gotten better in recent years—let me just mention one obvious example. There was a time when few people cared that the paper citation indexes everyone relied on were months out of date—there was only one way to check citations, so that one way was good enough. One of the first major effects of electronic publishing on actual research practice was to make it possible, and thus needful, to verify citations to within days, or sometimes hours or even minutes, rather than months.

The effect of the public legal information movement has not been to supplant commercial services, but to drive them to innovate. If basic legal information is freely available, the only way to make money in the segment is to offer more. If the goal of the LII is to put Westlaw and Lexis out of business, LII is bound to fail. But LII can, and does, make legal information better and more available. The new low-end providers like Loislaw are pushed to provide more because they need to be better than LII. Lexis has been pushed to offer more, because it has to be better than Loislaw. And Westlaw has been pushed to offer more because it has to be better than Lexis.

LII is important to legal information not because it’s the best service, but because it alters the ecology of the legal information market. It makes everyone else better. And when it gets better, everyone else will get better still.

Dan Dabney is Senior Director for Classification at Thomson Reuters Global Resources in Zug, Switzerland. Dan has a law degree and a Ph.D. in library and information studies, and worked as a lawyer, a law librarian, and a library school professor before entering the private sector. He was one of the principal designers of KeyCite.
