“To be blunt, there is just too much stuff.” (Robert C. Berring, 1994)
Law is an information profession where legal professionals take on the role of intermediaries towards their clients. Today, those legal professionals routinely use online legal research services like Westlaw and LexisNexis to gain electronic access to legislative, judicial and scholarly legal documents.
Put simply, legal service providers make legal documents available online and enable users to search these text collections in order to find documents relevant to their information needs. For quite some time the main focus of providers has been the addition of more and more documents to their online collections. Quite contrary to other areas, like Web search, where an increase in the number of available documents has been accompanied by major changes in the search technology employed, the search systems used in online legal research services have changed little since the early days of computer-assisted legal research (CALR).
It is my belief, however, that the search technology employed in CALR systems will have to change dramatically in the coming years. The future of online legal research services will depend more and more on the systems’ ability to create useful result lists for users’ queries. The continuing need to make additional texts available will only speed up the change. Electronic availability of a sufficient number of potentially relevant texts is no longer the main issue; quick findability of a few highly relevant documents among hundreds or even thousands of other potentially relevant ones is.
To reach that goal, from a search system’s perspective, relevance ranking is key. In a constantly growing number of situations – just as Professor Berring stated almost 20 years ago (see above) – even carefully chosen keywords bring back “too much stuff”. Successful ranking, that is, the ordering of search results according to their estimated relevance, becomes the main issue. A system’s ability to correctly assess the relevance of texts for every individual user, and for every one of their queries, will quickly become – or has arguably already become in most cases – the next holy grail of computer-assisted legal research.
Until a few years back providers could still successfully argue that search systems should not be blamed for the lack of “theoretically, maybe, sometimes feasible” relevance-ranking capabilities, but rather that users had to be blamed for their missing search skills. I do not often hear that line of argumentation any longer, which certainly does not have to do with any improvement of (Boolean) search skills of end users. Representatives of service providers do not dare to follow that line of argumentation any longer, I think, because every single day every one of them uses Google by punching in vague, short queries and still mostly gets back sufficiently relevant top results. Why should this not work in CALR systems?
Indeed. Why, one might ask, is there not more Web search technology in contemporary computer-assisted legal research? Sure, according to another often-stressed argument of system providers, computer-assisted legal research is certainly different from Web search. In Web search we typically do not care about low recall as long as this guarantees high precision, while in CALR trading off recall for precision is problematic. But even with those clear differences, I have, for example, not heard a single plausible argument why the cornerstone of modern Web search, link analysis, should not be successfully used in every single CALR system out there.
These statements certainly are blunt and provocative generalizations. Erich Schweighofer, for example, showed as early as 1999 (pre-mainstream-Web) that there had in fact been technological changes in legal information retrieval, in his well-named piece “The Revolution in Legal Information Retrieval or: The Empire Strikes Back”. And there have also been free CALR systems like PreCYdent that have fully employed citation-analysis techniques in computer-assisted legal research and have thereby – even if they did not manage to stay profitable – shown “one of the most innovative SE [search engine] algorithms”, according to experts.
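For readers unfamiliar with link or citation analysis, the basic idea can be sketched in a few lines. The following is only a minimal illustration, assuming a PageRank-style power iteration over a tiny, made-up citation graph; the function name `pagerank`, the damping value of 0.85, and the four hypothetical cases are my own assumptions for illustration, and they do not describe how PreCYdent or any existing CALR system actually works.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Estimate document importance from a citation graph.

    `citations` maps each document to the list of documents it cites.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every document that cites or is cited.
    docs = set(citations)
    for cited in citations.values():
        docs.update(cited)
    docs = sorted(docs)
    n = len(docs)

    # Start with equal scores that sum to 1.
    rank = {d: 1.0 / n for d in docs}

    for _ in range(iterations):
        # Every document receives a small baseline score.
        new_rank = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = citations.get(d, [])
            if targets:
                # A document passes its score on to the documents it cites.
                share = damping * rank[d] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A document that cites nothing spreads its score evenly.
                share = damping * rank[d] / n
                for t in docs:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: A and B cite C, and C cites D.
citations = {
    "Case A": ["Case C"],
    "Case B": ["Case C"],
    "Case C": ["Case D"],
    "Case D": [],
}

scores = pagerank(citations)
ranking = sorted(scores, key=scores.get, reverse=True)
for doc in ranking:
    print(f"{doc}: {scores[doc]:.3f}")
```

In this toy graph the documents that are cited, directly or indirectly, by others end up above those that nobody cites. A real system would presumably combine such a score with text-based relevance to the query rather than rank on citations alone.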
An exhaustive and objective discussion of the various factors that contribute to the slow technological change in computer-assisted legal research cannot be offered by me alone, nor in this short post. There is a whole mix of reasons why there is not (yet) more “Google” in CALR, including system providers’ fear of being held liable for query modifications which might (theoretically) lead to wrong expert advice, and the lack of pressure from potential and existing customers to use more modern search technology.
What I want to highlight, however, is one more general explanation which is seldom put forward explicitly. What slows down technological innovation in online legal research, in my opinion, is also the interest of the whole legal profession to hold on to a conception of “legal relevance” that is immune to any kind of computer algorithm. A successfully employed, Web search-like ranking algorithm in CALR would after all not only produce comfortable, highly relevant search results, but would also reveal certain truths about legal research: The search for documents of high “legal relevance” to a specific factual or legal situation is, in most cases, a process which follows clear rules. Many legal research routines follow clear and pre-defined patterns which could be translated into algorithms. The legal profession will have to accept that truth at some point, and will therefore have to define and communicate “legal relevance” much less mystically and more pragmatically.
Again, at this point, one might ask “Why?” I am certain that if the legal profession, that is, legal professionals and their CALR service providers, does not include up-to-date search technology in its CALR systems, someone else will at some point do so without much involvement of legal professionals. To be blunt: at this point, Google can still serve as an example for our systems; at some point soon, it might simply set an example instead of our systems.
Anton Geist is Law Librarian at WU (Vienna University of Economics and Business) University Library. He holds law degrees from the University of Vienna (2006) and the University of Edinburgh (2010). He is grateful for feedback and discussions and can be contacted at email@example.com.
 Berring, Robert C. (1994), Collapse of the Structure of the Legal Research Universe: The Imperative of Digital Information, 69 Wash. L. Rev. 9.
VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed. The information above should not be considered legal advice. If you require legal representation, please consult a lawyer.
[Editor's Note] For topic-related VoxPopuLII posts please see: Núria Casellas, Semantic Enhancement of legal information … Are we up for the challenge?; Marcie Baranich, HeinOnline Takes a New Approach to Legal Research With Subject Specific Research Platforms; Elisabetta Fersini, The JUMAS Experience: Extracting Knowledge From Judicial Multimedia Digital Libraries; João Lima et al., LexML Brazil Project; Joe Carmel, LegisLink.Org: Simplified Human-Readable URLs for Legislative Citations; Robert Richards, Context and Legal Informatics Research; John Sheridan, Legislation.gov.uk