
§.1.- Foreword

«If folksonomies work for pictures (Flickr), books (Goodreads), questions and answers (Quora), basically everything else (Delicious), why shouldn’t they work for law?» (Stefania Manzioli)

In a post on this blog, Stefania Manzioli distinguishes three uses of taxonomies in law: (1) in the research of legal documents, (2) in teaching law to students, and (3) in the practical application of the law.

In regard to her first point, she notes (observation #1) that increasing the availability of legal resources requires changing the whole information architecture, and – correctly, in my opinion – she raises some objections to the heuristic efficiency of folksonomies: (objection #1) they are too “flat” to be useful for legal research, and (objection #2) it is likely that non-expert users could “pollute” the set of tags. Notwithstanding these issues, she states (prediction #1) that folksonomies could be helpful to non-legal users.

On the second point, she notes (observation #2) that folksonomies could be beneficial in studying the law, because they could make it easier to penetrate its conceptual frameworks; she also formulates the hypothesis (prediction #2) that this teaching method could shape a more flexible mindset in students.

In discussing the third point, she notes (observation #3) that different taxonomies entail different ways of applying the law, and (prediction #3) she formulates the hypothesis that, in a distant future in which folksonomies replaced taxonomies, the result would be a whole new way to apply the law.

I appreciated Manzioli's post and accepted with pleasure the invitation of Christine Kirchberger – to whom I am grateful – to share my views with the readers of this prestigious blog. Hereinafter I intend to focus on the theoretical profiles that aroused my curiosity. My position is partly different from that of Stefania Manzioli.

 

§.2.- Introduction

In order to identify the issues stemming from folksonomies, I think it is worth making some preliminary clarifications.

In collective tagging systems, tagging lets us describe the content of an object – an image, a song or a document – label it with any lexical expression preceded by the hash symbol (“#”), and share it with our friends and followers, or recommend it to an audience of strangers.

Folksonomies (a blend of the words “folk” and “taxonomy”) are sets of categories resulting from users' tagging of online resources, allowing a “many-to-many” connection between tags, users and resources.
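To make the “many-to-many” structure concrete, a folksonomy can be modeled as a set of (user, tag, resource) triples, from which the connections in any direction can be recovered. The following Python sketch is purely illustrative (the class, users, and tags are invented for the example, not taken from any real system):

```python
class Folksonomy:
    """Illustrative model: a folksonomy as a set of (user, tag, resource) triples."""

    def __init__(self):
        self.triples = set()

    def tag(self, user, tag, resource):
        self.triples.add((user, tag, resource))

    def resources_for(self, tag):
        return {r for (_, t, r) in self.triples if t == tag}

    def tags_for(self, resource):
        return {t for (_, t, r) in self.triples if r == resource}

    def users_of(self, tag):
        return {u for (u, t, _) in self.triples if t == tag}

f = Folksonomy()
f.tag("alice", "#flowers", "photo1")
f.tag("bob", "#summer", "photo1")
f.tag("bob", "#flowers", "photo2")
print(sorted(f.resources_for("#flowers")))  # ['photo1', 'photo2']
print(sorted(f.tags_for("photo1")))         # ['#flowers', '#summer']
```

Because every triple links all three elements, the same structure answers “which resources carry this tag?”, “which tags describe this resource?”, and “which users employ this tag?” without privileging any one hierarchy.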

Basic pattern of a folksonomy

 

Thomas Vander Wal coined the word a decade ago – ten years is a really long time in ICT – and these technologies, as reported by Stefania Manzioli, have now been adopted in most social networks and e-commerce systems.

The main feature of folksonomies is that tags aggregate spontaneously into a semantic core; therefore, they are often associated with taxonomies or ontologies, although in the latter cases hierarchies and categories are established before the collection of data, a priori.

Simplifying, I can say that tags may describe three aspects of a resource (take, for example, a picture of a flowerpot lit by the sun):

(1) The content of the resource (e.g. #flowers),

(2) Its interaction with other specific resources and with the environment in general (e.g. #sun or #summer),

(3) The effect that the resource has on the users who access it (e.g. #beautiful).

Since it seems to me that none of these aspects should be disregarded in an overall assessment of folksonomies, I will consider all of them.

Having regard to law, these tend to match three major issues:

(1) Law as a “content”. Users select legal documents among others available and choose those that seem most relevant. As a real interest is normally the driving criterion of the search, and as this interest typically arises from the need to solve a legal problem, I designate this profile with the expression «Quid juris?».

(2) Law as a “concept”. This problem emerges because a single legal document cannot be conceived separately from the context in which it appears, namely the relations it has with the legal system to which it belongs. Consequently, it becomes inevitable to ask what the law is, as a common feature of all legal documents. Recalling Immanuel Kant in the “Metaphysics of Morals”, here I use the expression «Quid jus?».

(3) Law as a “sentiment”. What emerges in folksonomies is a subjective attitude that concerns the meaning attributed to the search for resources and that affects the way in which it is performed. To this I refer using the expression «Cur jus?».

 

§.3.- Folksonomies, Law, and «Quid juris?»: legal information management and collective tagging systems

In this respect, I definitely agree with Stefania Manzioli. Folksonomies seem to open very interesting perspectives in the field of legal information management; it must be admitted, however, that these technologies still have some limitations. For instance: precisely because resources are tagged freely, it is difficult to use them to build taxonomies or ontologies; inexperienced users classify resources less efficiently than others, diluting the efforts of more skilled users and “polluting” well-established catalogs; vice versa, even experienced users can make mistakes in the allocation of tags, worsening the quality of the information being shared.

Though in some cases these issues can be addressed in several ways – e.g., the use of tags can be guided through tag recommendation, hence the distinction between broad and narrow folksonomies – and even if it can reasonably be expected that these tools will work even better in the future, for now we can say that folksonomies are useful mainly to complement pre-existing classifications.
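Tag recommendation, one of the guiding techniques just mentioned, can work by suggesting tags that frequently co-occur with those a user has already applied. A minimal Python sketch, with invented tag sets and no claim to match any deployed system:

```python
from collections import Counter

# Hypothetical tag sets previously applied to resources by other users.
tagged_resources = [
    {"#contract", "#damages", "#civil-law"},
    {"#contract", "#civil-law"},
    {"#contract", "#damages"},
    {"#criminal-law", "#evidence"},
]

def recommend(current_tags, corpus, k=3):
    """Suggest up to k tags that most often co-occur with the user's tags."""
    counts = Counter()
    for tags in corpus:
        if tags & current_tags:          # resource shares at least one tag
            counts.update(tags - current_tags)
    return [tag for tag, _ in counts.most_common(k)]

print(recommend({"#contract"}, tagged_resources))
# Suggests '#damages' and '#civil-law' (each co-occurs twice).
```

Even so simple a heuristic nudges users toward an emerging shared vocabulary, which is precisely what distinguishes a broad folksonomy from uncoordinated labelling.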

I may add, as an example, that Italian law requires the creation of “user-created taxonomies (folksonomies)”: see the “Guidelines for websites of public administrations” of 29 July 2011, page 20. These guidelines were issued pursuant to art. 4 of Directive n. 8 of 26 November 2009 of the “Minister for Public Administration and Innovation”, in accordance with the Legislative Decree of 7 March 2005, n. 82, the “Digital Administration Code” (O.J. n. 112 of 16 May 2005, S.O. n. 93). It may be interesting to point out that in Italian law innovation in administrative bodies is promoted by a specific institution, the Agency for Digital Italy (“Agenzia per l'Italia Digitale”), which coordinates actions in this field and sets standards for usability and accessibility. Folksonomies indeed fall into this latter category.

Following this path, one municipality (Turin) has recently set up a “social bookmarking” system for the benefit of its citizens, called TaggaTO.

 

§.4.- Folksonomies, Law, and «Quid jus?»: the difference between the “map” and the “territory”

In this regard, my theoretical approach differs from Stefania Manzioli's. Here is why our conclusions are opposite.

Human beings are “tagging animals”: labelling things is a natural habit. We can see it in everyday life: each of us organizes his environment at home (we have jars with “salt” or “pepper” written on the caps) and at work (we use folders with “invoices” or “bank account” printed on the cover). The significance of tags is obvious if we consider using them with other people: they allow us to establish and share a common information framework. For the same reasons of convenience, tags have been included in most of the software applications we use (documents, e-mail, calendars) and, as said above, in many online services. To sum up, labels help us build a representation of reality: they are tools for our knowledge.

In regard to reality and knowledge, it may be recalled that the twentieth century saw two philosophical perspectives: the “continental tradition”, focused on the first (reality) and fairly common in Europe, and “analytic philosophy”, centered on the second (knowledge) and widespread in the USA, UK and Scandinavia. More recently, this distinction has lost much of its heuristic value, and we have seen a different approach rising: the “philosophy of information”, which, developing some theoretical aspects of cybernetics, proposes a synthesis of reality and knowledge in a unifying vision that originates from a naturalistic notion of “information”.

I will try to simplify: if reality is a kind of “territory”, and if taxonomies (and ontologies in general) can be considered a sort of representation of knowledge, then they can be considered “maps”.

In light of these premises, I should explain what “sharing resources” and “shared knowledge” mean to me in folksonomies. Folksonomies are indeed a kind of “map”, but different from ontologies. In a metaphor: ontologies could be seen as “maps” created by a single geographer overlapping the reliefs of many “territories” and sold indiscriminately to travelers; folksonomies could be seen as “maps” that the inhabitants of different territories help each other draw by telephone or by text message. Both solutions have advantages and disadvantages: the former may be detailed but more difficult to consult, while the latter may be always up to date but affected by inaccuracies. In this sense, folksonomies could be called “antifragile” – according to the brilliant metaphor of Nassim Nicholas Taleb – because their value improves with increased use, while ontologies could be seen as “fragile”, because of the linearity of their process of production and distribution.

Therefore, just as the “map” is not the “territory”, reality does not change depending on its representation. Nevertheless, this does not mean that “maps” are not helpful for traveling to unknown “territories”, or for reaching the destination faster even in “territories” that are well known (just like driving a car with the aid of GPS).

On the application of folksonomies to the field of law, I shall say that, after all, legal science has always been a kind of “natural folksonomy”. Indeed, it has always been a widespread knowledge, ready to be practiced, open to discussion, and above all perfectly “antifragile”: new legal issues to be solved prompt further use of the system, thus causing an increase in knowledge and therefore greater accuracy in the description of the legal domain. In this regard, Stefania Manzioli in her post also mentioned the Corpus Juris Civilis, which for centuries was crucial to Western legal culture. Scholars came to Italy from all over Europe to study it, at first noting a few elucidations in the margins of the text (glossatores), then commenting on what they had learned (commentatores), and using their legal competence to decide cases submitted to them as judges or to argue at trial as lawyers.

Modern tradition has refused all of this, imposing a rationalistic and rigorous view of law. This approach – “fragile”, to continue with Nassim Nicholas Taleb's paradigm – has spread in different directions, which, simplifying, I can reduce to three:

(1) Legal imperativism: law as embodied in the words of the sovereign.

Leviathan (Thomas Hobbes)

(2) Legal realism: law as embodied in the words of the judge.

 

Gavel

(3) Legal formalism: law as embodied in administrative procedures.

 

The Castle (Franz Kafka)

For too long we have been led to pretend to see only the “map” and to ignore the “territory”. In my opinion, the application of folksonomies to law can be very useful for overcoming these prejudices emerging from traditional legal positivism, and for revisiting a concept of law that is a step closer to its origin and its nature. I wrote “a step closer”, I'd like to clarify, to emphasize that the “map”, even if obtained through a participatory process, remains a representation of the “territory”, and to suggest that the vision known as the “philosophy of information” seems an attempt to overlay or replace the two terms – hence its “naturalism” – rather than to draw a “map” as similar as possible to the “territory”.

 

§.5.- Folksonomies, Law, and «Cur jus?»: the user in folksonomies, from “anybody” to “somebody”

This profile does not fall within the topics covered in Manzioli’s post, but I would like to take this opportunity to discuss it because it is the most intriguing to me.

Each of us arranges his resources according to the meaning that he intends to give his world. Think of how each of us arrays the resources containing information that he needs in his work: the books on the desk of a scholar, the files on the bench of a lawyer or a judge, the documents in the archive of a company. We place things around us depending on the problem we have to address: we use the surrounding space to help us find the solution.

With folksonomies, in general, we simply do the same in a context in which the concept of “space” is just a matter of abstraction.

What does this mean? By organizing things, we create “information”. Gregory Bateson, in a very famous book, Steps to an Ecology of Mind – in which he wrote on “maps” and “territories” too – stated that “information” is “the difference that makes the difference”. This definition, brilliant in its simplicity, raises the tremendous problem of the meaning of our existence and of the freedom of the will. This issue can be illustrated through a very interesting app called “Somebody”, recently released by the contemporary artist Miranda July.

The app works as follows: a message addressed to a given person is written and transmitted to another person, who delivers it verbally. In other words, the actual recipient receives the message from an individual who is unknown to him. The point that fascinates me is this: someone suddenly comes out to tell you that you “make a difference,” that you are not “anybody” because you are “somebody” to “somebody.” Moreover, at the same time this person, by addressing you, becomes “somebody,” because the sender of the message chose him among others, since he “meant something” to him.

For me, the meaning of this amazing app can be summed up in this simple equation:

 

“Being somebody” = “Mean something” = “Make a difference”
 

This formula means that each of us believes he is worth something (“being somebody”), that his life has a meaning (“meaning something”), and that his choices or actions can change something – even if only slightly – in this world (“making a difference”).

Returning to Bateson: if it is important to each of us to “make a difference”, if we all want to be “somebody”, then how could we settle for recognizing ourselves as just an “organizing agent”? Self-consciousness is related to semantics and to freedom of choice: whoever is not free at all creates no “difference” in the world. Poetically, Miranda July makes people talk to each other, giving a meaning to humanity and a purpose to freedom: this is what “making a difference” means for humans.

In applying folksonomies to law, we should consider all of this. It is true that folksonomies record the way in which each user arrays the available legal documents, but the purpose for which this activity is carried out should also be emphasized. Therefore, it should be clear that an efficient cataloguing of resources depends on several conditions: certainly the user must know the law and remember its ontologies, but he must also be focused on what he is doing. This means that the user needs to be well motivated, so as to recognize the value of what he is doing and thus give meaning to his activity.

 

§.6.- Conclusion

I believe that folksonomies can teach us a lot. In them we can find not only an extraordinary technical tool, but also – and most importantly – a reason to overcome traditional legal positivism – which is “ontological” and therefore “fragile” – and thus rediscover cooperation not only among experts, but also with non-experts, in the name of an “antifragile” shared legacy of knowledge that is called “law”.

All this will work – or at least, it will work better – if we remember that we are human beings.

 

Federico Costantini

Federico Costantini. Lawyer and Researcher in Philosophy of Law (Legal Informatics) in the Department of Legal Sciences at the University of Udine (Italy). He holds an LL.M. in Law and a Ph.D. in Philosophy of Law from the University of Padua (Italy).
His research moves among philosophy, computer science and law, focusing primarily on the struggle between technology and human nature.
He teaches Legal Informatics at the Faculty of Law of the University of Udine. His lectures on cyberlaw (especially in privacy, copyright and social networks) are aimed at bringing out the critical profiles of the "Information Society".

 

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

Artisanal Algorithms

Down here in Durham, NC, we have artisanal everything: bread, cheese, pizza, peanut butter, and of course coffee, coffee, and more coffee. It's great—fantastic food and coffee, that is, and there is no doubt some psychological kick from knowing that it's been made carefully by skilled craftspeople for my enjoyment. The old ways are better, at least until they're co-opted by major multinational corporations.

Artisanal Cheese. Source: Wikimedia Commons

Aside from making you either hungry or jealous, or perhaps both, why am I talking about fancy foodstuffs on a blog about legal information? It's because I'd like to argue that algorithms are not computerized, unknowable, mysterious things—they are produced by people, often painstakingly, with a great deal of care. Food metaphors abound, helpfully I think. Algorithms are the “special sauce” of many online research services. They are sets of instructions to be followed and completed, leading to a final product, just like a recipe. Above all, they are the stuff of life for the research systems of the near future.

Human Mediation Never Went Away

When we talk about algorithms in the research community, we are generally talking about search or information retrieval (IR) algorithms. A recent and fascinating VoxPopuLII post by Qiang Lu and Jack Conrad, “Next Generation Legal Search – It's Already Here,” discusses how these algorithms have become more complicated by considering factors beyond document-based, topical relevance. But I'd like to step back for a moment and head into the past for a bit to talk about the beginnings of search, and the framework that we have viewed it within for the past half-century.

Many early information-retrieval systems worked like this: a researcher would come to you, the information professional, with an information need, that vague and negotiable idea which you would try to reduce to a single question or set of questions. With your understanding of Boolean search techniques and your knowledge of how the document corpus you were searching was indexed, you would then craft a search for the computer to run. Several hours later, when the search was finished, you would be presented with a list of results, sometimes ranked in order of relevance and limited in size because of a lack of computing power. Presumably you would then share these results with the researcher, or perhaps just turn over the relevant documents and send him on his way. In the academic literature, this was called “delegated search,” and it formed the background for the most influential information retrieval studies and research projects for many years—the Cranfield Experiments. See also “On the History of Evaluation in IR” by Stephen Robertson (2008).

In this system, literally everything—the document corpus, the index, the query, and the results—were mediated. There was a medium, a middle-man. The dream was to some day dis-intermediate, which does not mean to exhume the body of the dead news industry. (I feel entitled to this terrible joke as a former journalist... please forgive me.) When the World Wide Web and its ever-expanding document corpus came on the scene, many thought that search engines—huge algorithms, basically—would remove any barrier between the searcher and the information she sought. This is “end-user” search, and as algorithms improved, so too would the system, without requiring the searcher to possess any special skills. The searcher would plug a query, any query, into the search box, and the algorithm would present a ranked list of results, high on both recall and precision. Now, the lack of human attention, evidenced by the fact that few people ever look below result 3 on the list, became the limiting factor, instead of the lack of computing power.

A search for delegated search

The only problem with this is that search engines did not remove the middle-man—they became the middle-man. Why? Because everything, whether we like it or not, is editorial, especially in reference or information retrieval. Everything, every decision, every step in the algorithm, everything everywhere, involves choice. Search engines, then, are never neutral. They embody the priorities of the people who created them and, as search logs are analyzed and incorporated, of the people who use them. It is in these senses that algorithms are inherently human.

Empowering the Searcher by Failing Consistently

In the context of legal research, then, it makes sense to consider algorithms as secondary sources. Law librarians and legal research instructors can explain the advantages of controlled vocabularies like the Topic and Key Number System®, of annotated statutes, and of citators. In several legal research textbooks, full-text keyword searching is anathema because, I suppose, no one knows what happens directly after you type the words into the box and click search. It seems frightening. We are leaping without looking, trusting our searches to some kind of computer voodoo magic.

This makes sense—search algorithms are often highly guarded secrets, even if what they select for (timeliness, popularity, and dwell time, to name a few) is made known. They are opaque. They apparently do not behave reliably, at least in some cases. But can't the same be said for non-algorithmic information tools, too? Do we really know which types of factors figure in to the highly vaunted editorial judgment of professionals?

To take the examples listed above—yes, we know what the Topics and Key Numbers are, but do we really know them well enough to explain why they work the way they do, what biases are baked in from over a century of growth and change? Without greater transparency, I can't tell you.

How about annotated statutes: who knows how many of the cases cited on online platforms are holdovers from the soon-to-be print publications of yesteryear? In selecting those cases, surely the editors had to choose to omit some, or perhaps many, because of space constraints. How, then, did the editors determine which cases were most on-point in interpreting a given statutory section, that is, which were most relevant? What algorithms are being used today to rank the list of annotations? Again, without greater transparency, I can't tell you.

And when it comes to citators, why is there so much discrepancy between a case's classification and which later-citing cases are presented as evidence of this classification? There have been several recent studies, like this one and this one, looking into the issue, but more research is certainly needed.

Finally, research in many fields is telling us that human judgments of relevance are highly subjective in the first place. At least one court has said that algorithmic predictive coding is better at finding relevant documents during pretrial e-discovery than humans are.

Where are the relevant documents? Source: CC BY 2.0, flickr user gosheshe

I am not presenting these examples to discredit subjectivity in the creation of information tools. What I am saying is that the dichotomy between editorial and algorithmic, between human and machine, is largely a false one. Both are subjective. But why is this important?

Search algorithms, when they are made transparent to researchers, librarians, and software developers (i.e. they are “open source”), do have at least one distinct advantage over other forms of secondary sources—when they fail, they fail consistently. After the fact or even in close to real-time, it's possible to re-program the algorithm when it is not behaving as expected.

Another advantage to thinking of algorithms as just another secondary source is that, demystified, they can become a less privileged (or, depending on your point of view, less demonized) part of the research process. The assumption that the magic box will do all of the work for you is just as dangerous as the assumption that the magic box will do nothing for you. Teaching about search algorithms allows for an understanding of them, especially if the search algorithms are clear about which editorial judgments have been prioritized.

Beyond Search, Or How I Learned to Stop Worrying and Love Automated Research Tools

As an employee at Fastcase, Inc. this past summer, I had the opportunity to work on several innovative uses of algorithms in legal research, most notably on the new automated citation-analysis tool Bad Law Bot. Bad Law Bot, at least in its current iteration, works by searching the case law corpus for significant signals—words, phrases, or citations to legal documents—and, based on criteria selected in advance, determines whether a case has been given negative treatment in subsequent cases. The tool is certainly automated, but the algorithm is artisanal—it was massaged and kneaded by caring craftsmen to deliver a premium product. The results it delivered were also tested meticulously to find out where the algorithm had failed. And then the process started over again.
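The general approach described here—scanning the corpus for pre-selected signals and classifying treatment accordingly—might be sketched as follows. The signal phrases, the citation pattern, and the example text are all invented for illustration; the actual Bad Law Bot algorithm is proprietary and far more refined:

```python
import re

# Hypothetical signal phrases suggesting negative treatment of a cited case.
NEGATIVE_SIGNALS = [
    r"\boverruled\b", r"\babrogated\b", r"\bsuperseded\b",
    r"\bdisapproved\b", r"\breversed\b",
]
# Crude, invented pattern for reporter citations such as "123 F.2d 456".
CITATION = re.compile(r"\d+ [A-Z][a-z.]+ ?\d*d? \d+")

def negative_treatments(opinion_text):
    """Return citations appearing in the same sentence as a negative signal."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", opinion_text):
        if any(re.search(sig, sentence, re.I) for sig in NEGATIVE_SIGNALS):
            hits.extend(CITATION.findall(sentence))
    return hits

text = ("We conclude that Smith v. Jones, 123 F.2d 456, was wrongly decided "
        "and is hereby overruled. We affirm the judgment below.")
print(negative_treatments(text))  # ['123 F.2d 456']
```

The craft lies in choosing and tuning the signals and the windows around them; the testing loop described above is what turns a crude pattern-matcher like this into a dependable tool.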

This is just one example of what I think the future of much general legal research will look like—smart algorithms built and tested by people, taking advantage of near unlimited storage space and ever-increasing computing power to process huge datasets extremely fast. Secondary sources, at least the ones organizing, classifying, and grouping primary law, will no longer be static things. Rather, they will change quickly when new documents are available or new uses for those documents are dreamed up. It will take hard work and a realistic set of expectations to do it well.

Computer assisted legal research cannot be about merely returning ranked lists of relevant results, even as today's algorithms get better and better at producing these lists. Search must be only one component of a holistic research experience in which the searcher consults many tools which, used together, are greater than the sum of their parts. Many of those tools will be built by information professionals and software engineers using algorithms, and will be capable of being updated and changed as the corpus and user need changes.

It's time that we stop thinking of algorithms as alien, or other, or too complicated, or scary. Instead, we should think of them as familiar and human, as sets of instructions hand-crafted to help us solve problems with research tools that we have not yet been able to solve, or that we did not know were problems in the first place.

Aaron KirschenfeldAaron Kirschenfeld is currently pursuing a dual J.D. / M.S.I.S. at the University of North Carolina at Chapel Hill. His main research interests are legal research instruction, the philosophy and aesthetics of legal citation analysis, and privacy law. You can reach him on Twitter @kirschsubjudice.

His views do not represent those of his part-time employer, Fastcase, Inc. Also, he has never hand-crafted an algorithm, let alone a wheel of cheese, but appreciates the work of those who do immensely.

 


[Editor's Note: We are pleased to publish this piece from Qiang Lu and Jack Conrad, both of whom worked with Thomson Reuters R&D on the WestlawNext research team. Jack Conrad continues to work with Thomson Reuters, though currently on loan to the Catalyst Lab at Thomson Reuters Global Resources in Switzerland. Qiang Lu is now based at Kore Federal in the Washington, D.C. area. We read with interest their 2012 paper from the International Conference on Knowledge Engineering and Ontology Development (KEOD), “Bringing order to legal documents: An issue-based recommendation system via cluster association”, and are grateful that they have agreed to offer some system-specific context for their work in this area. Their current contribution represents a practical description of the advances that have been made between the initial and current versions of Westlaw, and what differentiates a contemporary legal search engine from its predecessors.  -sd]

In her blog on “Pushing the Envelope: Innovation in Legal Search” (2009) [1], Edinburgh Informatics Ph.D. candidate K. Tamsin Maxwell presents her perspective of the state of legal search at the time. The variations of legal information retrieval (IR) that she reviews – everything from natural language search (e.g., vector space models, Bayesian inference net models, and language models) to NLP and term weighting – refer to techniques that are now 10, 15, even 20 years old. She also refers to the release of the first natural language legal search engine by West back in 1993: WIN (Westlaw Is Natural) [2]. Adding to this on-going conversation about legal search, we would like to check back in, a full 20 years after the release of that first natural language legal search engine. The objective we hope to achieve in this posting is to provide a useful overview of state-of-the-art legal search today.

What Maxwell’s article could not have predicted, even five years ago, are some of the chief factors that distinguish state-of-the-art search engines today from their earlier counterparts. One of the most notable distinctions is that unlike their predecessors, contemporary search engines, including today’s state-of-the-art legal search engine, WestlawNext, separate the function of document retrieval from document ranking. Whereas the first retrieval function primarily addresses recall, ensuring that all potentially relevant documents are retrieved, the second and ensuing function focuses on the ideal ranking of those results, addressing precision at the highest ranks. By contrast, search engines of the past effectively treated these two search functions as one and the same. So what is the difference? Whereas the document retrieval piece may not be dramatically different from what it was when WIN was first released in 1993, what is dramatically different lies in the evidence that is considered in the ranking piece, which allows potentially dozens of weighted features to be taken into account and tracked as part of the optimal ranking process.
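The two-stage architecture can be illustrated in miniature: a recall-oriented retrieval stage gathers every document sharing a term with the query, and a separate ranking stage orders that candidate set with a weighted score. The corpus, the popularity figures, and the weights below are invented for the example and stand in for the far richer evidence a production system would use:

```python
# Toy corpus: document id -> full text (invented cases).
corpus = {
    "case1": "adverse possession of real property requires open and notorious use",
    "case2": "the statute of limitations for adverse possession is twenty years",
    "case3": "breach of contract damages are measured by expectation",
}
# Hypothetical aggregated-user-behavior signal: how often each document is opened.
popularity = {"case1": 0.9, "case2": 0.4, "case3": 0.7}

def retrieve(query):
    """Stage 1 (recall): keep any document sharing a term with the query."""
    terms = set(query.lower().split())
    return [doc for doc, text in corpus.items() if terms & set(text.split())]

def rank(query, candidates, w_text=0.7, w_pop=0.3):
    """Stage 2 (precision): order candidates by a weighted blend of evidence."""
    terms = set(query.lower().split())
    def score(doc):
        overlap = len(terms & set(corpus[doc].split())) / len(terms)
        return w_text * overlap + w_pop * popularity[doc]
    return sorted(candidates, key=score, reverse=True)

hits = retrieve("adverse possession")
print(rank("adverse possession", hits))  # ['case1', 'case2']
```

Separating the stages means the cheap, inclusive first pass never discards a potentially relevant document, while the expensive, feature-rich scoring runs only over the candidates.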

Figure 1. The set of evidence (views) that can be used by modern legal search engines.

In traditional search, the principal evidence considered was the main text of the document in question. In the case of traditional legal search, those documents would be cases, briefs, statutes, regulations, law reviews and other forms of primary and secondary (a.k.a. analytical) legal publications. This textual set of evidence can be termed the document view of the world. In the case of legal search engines like Westlaw, there also exists the ability to exploit expert-generated annotations or metadata. These annotations come in the form of attorney-editor generated synopses, points of law (a.k.a. headnotes), and attorney-classifier assigned topical classifications that rely on a legal taxonomy such as West’s Key Number System [3]. The set of evidence based on such metadata can be termed the annotation view. Furthermore, in a manner loosely analogous to today’s World Wide Web and the lattice of inter-referencing documents that reside there, today’s legal search can also exploit the multiplicity of both out-bound (cited) sources and in-bound (citing) sources with respect to a document in question, and, frequently, the granularity of these citations is not merely at a document-level but at the sub-document or topic level. Such a set of evidence can be termed the citation network view. More sophisticated engines can examine not only the popularity of a given cited or citing document based on the citation frequency, but also the polarity and scope of the arguments they advance.

In addition to the “views” described thus far, a modern search engine can also harness what has come to be known as aggregated user behavior. While individual users and their individual behavior are not considered, in instances where there is sufficient accumulated evidence, the search function can consider document popularity by means of a user view. That is to say, in addition to a document being returned in a result set for a certain kind of query, the search provider can also tabulate how often a given document was opened for viewing, how often it was printed, or how often it was checked for its legal validity (e.g., through citator services such as KeyCite [4]). (See Figure 1.) This form of marshaling and weighting of evidence only scratches the surface, for one can also track evidence between two documents within the same research session, e.g., noting that when one highly relevant document appears in result sets for a given query type, another document typically appears in the same result sets. In summary, such a user view represents a rich and powerful additional means of leveraging document relevance as indicated through professional user interactions with legal corpora such as those mentioned above.
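
A rough illustration of how such aggregated behavior might be tabulated follows; the event names and weights are invented for the sketch (the actual signals and weights of any commercial system are proprietary):

```python
from collections import Counter

# Hypothetical aggregated interaction log: (doc_id, event) pairs drawn
# from many research sessions; individual users are never identified.
events = [
    ("case_1", "view"), ("case_1", "print"), ("case_1", "keycite"),
    ("case_1", "view"), ("case_2", "view"),
]

# Assumed weights: printing a document or checking its validity signals
# more deliberate interest than merely opening it.
WEIGHTS = {"view": 1.0, "print": 3.0, "keycite": 2.0}

def popularity(log):
    """Collapse raw events into a weighted popularity score per document."""
    counts = Counter(log)            # (doc, event) -> frequency
    scores = Counter()
    for (doc, event), n in counts.items():
        scores[doc] += WEIGHTS[event] * n
    return scores

scores = popularity(events)
# case_1: 2 views + 1 print + 1 keycite check -> 2*1.0 + 3.0 + 2.0 = 7.0
```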

It is also worth noting that today’s search engines may factor in a user’s preferences, for example, by knowing what jurisdiction a particular attorney-user practices in, and what kinds of sources that user has historically preferred, over time and across numerous result sets.

While the materials or data relied upon in the document view and citation network view are authored by judges, law clerks, legislators, attorneys and law professors, the summary data present in the annotation view is produced by attorney-editors. By contrast, the aggregated user behavior data represented in the user view is produced by the professional researchers who interact with the retrieval system. The result of this rich and diverse set of views is that the power and effectiveness of a modern legal search engine come not only from its underlying technology but also from the collective intelligence of all of the domain expertise represented in the generation of its data (documents) and metadata (citations, annotations, popularity and interaction information). Thus, the legal search engine offered by WestlawNext (WLN) represents an optimal blend of advanced artificial intelligence techniques and human expertise [5].

Given this wealth of diverse material representing various forms of relevance information and tractable connections between queries and documents, the ranking function executed by modern legal search engines can be optimized through a series of training rounds that “teach” the machine what forms of evidence make the greatest contribution for certain types of queries and available documents, along with their associated content and metadata. In other words, the re-ranking component learns how to weigh the “features” representing this evidence in a manner that will produce the best (i.e., highest precision) ranking of the documents retrieved.
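
The idea of “teaching” a ranker to weigh evidence can be sketched with a toy pairwise-training loop. The three features and the training pairs below are invented stand-ins for the view-based evidence discussed above; production systems use far richer feature sets and learning methods:

```python
# Each candidate document is represented by an invented feature vector:
# (text-match score, citation count, popularity). Training pairs
# (better, worse) teach the model how to weigh this evidence.
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_pairwise(pairs, n_features, epochs=50, lr=0.1):
    """Perceptron-style pairwise ranker: nudge the weights whenever the
    'better' document fails to outscore the 'worse' one."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            if dot(w, better) <= dot(w, worse):
                w = [wi + lr * (b - a) for wi, b, a in zip(w, better, worse)]
    return w

pairs = [
    ([0.9, 12, 7.0], [0.8, 1, 0.5]),  # relevant case beats a weak match
    ([0.4, 30, 9.0], [0.7, 0, 0.1]),  # heavily cited case beats a text-only hit
]
w = train_pairwise(pairs, 3)
# After training, the better document in each pair scores higher.
assert all(dot(w, b) > dot(w, a) for b, a in pairs)
```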

Nevertheless, a search engine is still highly influenced by the user queries it has to process, and for some legal research questions, an independent set of documents grouped by legal issue would be a tremendous complementary resource for the legal researcher, one at least as effective as trying to assemble the set of relevant documents through a sequence of individual queries. For this reason, WLN offers in parallel a complement to search, entitled “Related Materials,” which in essence is a document recommendation mechanism. These materials are clustered around the primary, secondary and sometimes tertiary legal issues in the case under consideration.

Legal documents are complex and multi-topical in nature. By detecting the top-level legal issues underlying the original document and delivering recommended documents grouped according to these issues, a modern legal search engine can provide a more effective research experience to a user when providing such comprehensive coverage [6,7]. Illustrations of some of the approaches to generating such related material are discussed below.
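
As a simple illustration of issue-based grouping, here is a sketch (with invented case names and labels, not the actual WestlawNext clustering described in [6,7]) of how recommendations can be grouped under each topical label assigned to a multi-topical document:

```python
from collections import defaultdict

# Hypothetical topic labels assigned to documents (e.g., by a classifier
# working against a legal taxonomy). A multi-topical case has several labels.
doc_topics = {
    "canyon_ferry": ["speech/campaign", "speech/political-committee"],
    "case_a": ["speech/campaign"],
    "case_b": ["speech/political-committee", "tax/exemption"],
    "case_c": ["tax/exemption"],
}

def related_materials(doc_id, corpus):
    """Group recommended documents under each legal issue of doc_id."""
    groups = defaultdict(list)
    for topic in corpus[doc_id]:
        for other, topics in corpus.items():
            if other != doc_id and topic in topics:
                groups[topic].append(other)
    return dict(groups)

rec = related_materials("canyon_ferry", doc_topics)
# Each of the source case's issues gets its own cluster of recommendations.
```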

Take, for example, an attorney who is running a set of queries that seeks to identify a group of relevant documents involving “attractive nuisance” for a party that witnessed a child nearly drown in a swimming pool. After a number of attempts using several different key terms in her queries, the attorney selects the “Related Materials” option, which subsequently provides access to the spectrum of “attractive nuisance”-related documents. Such sets of issue-based documents can represent a mother lode of relevant materials. In this instance, pursuing this navigational path rather than a query-based one turns out to be a good choice. Indeed, the query-based approach could take time and would lead to a gradually evolving set of relevant documents. By contrast, harnessing the cluster of documents produced for “attractive nuisance” may turn out to be the most efficient approach to total recall and the desired degree of relevance.

To further illustrate the benefit of a modern legal search engine, we will conclude our discussion with an instructive search using WestlawNext, and its subsequent exploration by way of this recommendation resource available through “Related Materials.”

The underlying legal issue in this example is “church support for specific candidates”, and a corresponding query is issued in the search box. Figure 2 provides an illustration of the top cases retrieved.


Figure 2: Search result from WestlawNext

Let’s assume that the user decides to closely examine the first case. By clicking the link to the document, the content of the case is rendered, as in Figure 3. Note that on the right-hand side of the panel, the major legal issues of the case “Canyon Ferry Road Baptist Church … v. Unsworth” have been automatically identified and presented with hierarchically structured labels, such as “Freedom of Speech / State Regulation of Campaign Speech” and “Freedom of Speech / View of Federal Election Campaign Act / Definition of Political Committee.” By presenting these closely related topics, a user is empowered to dive deep into the relevant cases and other relevant documents without explicitly crafting any additional or refined queries.


Figure 3: A view of a case and complementary materials from WestlawNext

By selecting these sets of relevant topics, a set of recommended cases will be rendered under that particular label. Figure 4, for example, shows the related topic view of the case under the label of “Freedom of Speech / View of Federal Election Campaign Act / Definition of Political Committee.” Note that this process can be repeated based on the particular needs of a user, starting with a document in the original results set.


Figure 4: Related Topic view of a case

In summary, by utilizing the combination of human expert-generated resources and sophisticated machine-learning algorithms, modern legal search engines bring the legal research experience to an unprecedented and powerful new level. For those seeking the next generation in legal search, it’s no longer on the horizon. It’s already here.

References

[1] K. Tamsin Maxwell, “Pushing the Envelope: Innovation in Legal Search,” in VoxPopuLII, Legal Information Institute, Cornell University Law School, 17 Sept. 2009. http://blog.law.cornell.edu/voxpop/2009/09/17/pushing-the-envelope-innovation-in-legal-search/
[2] Howard Turtle, “Natural Language vs. Boolean Query Evaluation: A Comparison of Retrieval Performance,” In Proceedings of the 17th Annual International ACM-SIGIR Conference on Research & Development in Information Retrieval (SIGIR 1994) (Dublin, Ireland), Springer-Verlag, London, pp. 212-220, 1994.
[3] West's Key Number System: http://info.legalsolutions.thomsonreuters.com/pdf/wln2/L-374484.pdf
[4] West's KeyCite Citator Service: http://info.legalsolutions.thomsonreuters.com/pdf/wln2/L-356347.pdf
[5] Peter Jackson and Khalid Al-Kofahi, “Human Expertise and Artificial Intelligence in Legal Search,” in Structuring of Legal Semantics, A. Geist, C. R. Brunschwig, F. Lachmayer, G. Schefbeck Eds., Festschrift ed. for Erich Schweighofer, Editions Weblaw, Bern, pp. 417-427, 2011.
[6] On cluster definition and population: Qiang Lu, Jack G. Conrad, Khalid Al-Kofahi, William Keenan, “Legal Document Clustering with Built-in Topic Segmentation,” In Proceedings of the 2011 ACM-CIKM Twentieth International Conference on Information and Knowledge Management (CIKM 2011) (Glasgow, Scotland), ACM Press, pp. 383-392, 2011.
[7] On cluster association with individual documents: Qiang Lu and Jack G. Conrad, “Bringing Order to Legal Documents: An Issue-based Recommendation System via Cluster Association,” In Proceedings of the 4th International Conference on Knowledge Engineering and Ontology Development (KEOD 2012) (Barcelona, Spain), SciTePress DL, pp. 76-88, 2012.

Jack G. Conrad currently serves as Lead Research Scientist with the Catalyst Lab at Thomson Reuters Global Resources in Baar, Switzerland. He was formerly a Senior Research Scientist with the Thomson Reuters Corporate Research & Development department. His research areas fall under a broad spectrum of Information Retrieval, Data Mining and NLP topics. Some of these include e-Discovery, document clustering and deduplication for knowledge management systems. Jack has researched and implemented key components for WestlawNext, West’s next-generation legal search engine, and PeopleMap, a very large scale Public Record aggregation system. Jack completed his graduate studies in Computer Science at the University of Massachusetts–Amherst and in Linguistics at the University of British Columbia–Vancouver.

Qiang Lu was a Senior Research Scientist with the Thomson Reuters Corporate Research & Development department. His research interests include data mining, text mining, information retrieval, and machine learning. He has extensive experience applying NLP technologies to various data sources, such as news, legal, financial, and law enforcement data. Qiang was a key member of the WestlawNext research team. He has a Ph.D. in computer science and engineering from the State University of New York at Buffalo. He is now a managing associate at Kore Federal in the Washington, D.C. area.

VoxPopuLII is edited by Judith Pratt. Editors-in-Chief are Stephanie Davidson and Christine Kirchberger, to whom queries should be directed.

“To be blunt, there is just too much stuff.” (Robert C. Berring, 1994 [1])

Law is an information profession where legal professionals take on the role of intermediaries towards their clients. Today, those legal professionals routinely use online legal research services like Westlaw and LexisNexis to gain electronic access to legislative, judicial and scholarly legal documents.

Put simply, legal service providers make legal documents available online and enable users to search these text collections in order to find documents relevant to their information needs. For quite some time the main focus of providers has been the addition of more and more documents to their online collections. In contrast to other areas, like Web search, where an increase in the number of available documents has been accompanied by major changes in search technology, the search systems used in online legal research services have changed little since the early days of computer-assisted legal research (CALR).

It is my belief, however, that the search technology employed in CALR systems will have to change dramatically in the coming years. The future of online legal research services will more and more depend on the systems’ ability to create useful result lists for users’ queries. The continuing need to make additional texts available will only speed up the change. Electronic availability of a sufficient number of potentially relevant texts is no longer the main issue; quick findability of a few highly relevant documents among hundreds or even thousands of other potentially relevant ones is.

To reach that goal, from a search system’s perspective, relevance ranking is key. In a constantly growing number of situations – just as Professor Berring stated almost 20 years ago (see above) – even carefully chosen keywords bring back “too much stuff”. Successful ranking, that is, the ordering of search results according to their estimated relevance, becomes the main issue. A system’s ability to correctly assess the relevance of texts for every individual user, and for every one of their queries, will quickly become – or has arguably already become in most cases – the next holy grail of computer-assisted legal research.

Until a few years back, providers could still successfully argue that search systems should not be blamed for lacking “theoretically, maybe, sometimes feasible” relevance-ranking capabilities; rather, users were to blame for their poor search skills. I do not hear that line of argument often any longer, which certainly is not due to any improvement in the (Boolean) search skills of end users. Representatives of service providers no longer dare to follow that line of argument, I think, because every single day every one of them uses Google, punching in vague, short queries, and still mostly gets back sufficiently relevant top results. Why should this not work in CALR systems?

Indeed. Why, one might ask, is there not more Web search technology in contemporary computer-assisted legal research? According to another often-stressed argument of system providers, computer-assisted legal research is certainly different from Web search: in Web search we typically do not care about low recall as long as this guarantees high precision, while in CALR trading off recall for precision is problematic. But even with those clear differences, I have, for example, not heard a single plausible argument why the cornerstone of modern Web search, link analysis, should not be successfully used in every single CALR system out there.
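
For readers unfamiliar with link analysis, its core, PageRank-style scoring, is straightforward to sketch over a citation graph. The mini-corpus below is invented; real citator data would of course be vastly larger:

```python
def pagerank(links, damping=0.85, iters=50):
    """Simple power-iteration PageRank over a citation graph.
    links maps each document to the documents it cites."""
    nodes = set(links) | {d for cited in links.values() for d in cited}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, cited in links.items():
            if cited:
                share = damping * rank[src] / len(cited)
                for dst in cited:
                    new[dst] += share
            else:  # dangling node: spread its rank evenly
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
        rank = new
    return rank

# Hypothetical mini-corpus: a landmark case cited by everything else.
links = {"landmark": [], "case_a": ["landmark"], "case_b": ["landmark", "case_a"]}
rank = pagerank(links)
# The frequently cited landmark case ends up with the highest rank.
```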

These statements certainly are blunt and provocative generalizations. Erich Schweighofer, for example, showed as early as 1999 (pre-mainstream Web), in his aptly named piece “The Revolution in Legal Information Retrieval or: The Empire Strikes Back”, that there had in fact been technological changes in legal information retrieval. And there have also been free CALR systems like PreCYdent that fully employed citation-analysis techniques in computer-assisted legal research and thereby – even if they did not manage to stay profitable – demonstrated “one of the most innovative SE [search engine] algorithms”, according to experts.

An exhaustive and objective discussion of the various factors that contribute to the slow technological change in computer-assisted legal research is certainly beyond both my own perspective and the scope of this short post. For a whole mix of reasons, there is not (yet) more “Google” in CALR, including system providers’ fear of being held liable for query modifications which might (theoretically) lead to wrong expert advice, and the lack of pressure from potential and existing customers to adopt more modern search technology.

What I want to highlight, however, is one more general explanation that is seldom put forward explicitly. What slows down technological innovation in online legal research, in my opinion, is also the whole legal profession’s interest in holding on to a conception of “legal relevance” that is immune to any kind of computer algorithm. A successfully employed, Web search-like ranking algorithm in CALR would, after all, not only produce comfortable, highly relevant search results, but would also reveal certain truths about legal research: the search for documents of high “legal relevance” to a specific factual or legal situation is, in most cases, a process which follows clear rules. Many legal research routines follow clear and pre-defined patterns which could be translated into algorithms. The legal profession will have to accept that truth at some point, and will therefore have to define and communicate “legal relevance” much less mystically and more pragmatically.

Again, one might ask “Why?” I am certain that if the legal profession, that is, legal professionals and their CALR service providers, does not include up-to-date search technology in its CALR systems, someone else will at some point do so, without the need for much involvement of legal professionals. To be blunt: at this point, Google can still serve as an example for our systems; at some point soon, it might simply set an example instead of them.

Anton Geist is Law Librarian at WU (Vienna University of Economics and Business) University Library. He holds law degrees from the University of Vienna (2006) and the University of Edinburgh (2010). He is grateful for feedback and discussions and can be contacted at home@antongeist.com.

[1] Berring, Robert C. (1994), Collapse of the Structure of the Legal Research Universe: The Imperative of Digital Information, 69 Wash. L. Rev. 9.

The information above should not be considered legal advice. If you require legal representation, please consult a lawyer.

[Editor's Note] For topic-related VoxPopuLII posts please see: Núria Casellas, Semantic Enhancement of legal information … Are we up for the challenge?; Marcie Baranich, HeinOnline Takes a New Approach to Legal Research With Subject Specific Research Platforms; Elisabetta Fersini, The JUMAS Experience: Extracting Knowledge From Judicial Multimedia Digital Libraries; João Lima et al., LexML Brazil Project; Joe Carmel, LegisLink.Org: Simplified Human-Readable URLs for Legislative Citations; Robert Richards, Context and Legal Informatics Research; John Sheridan, Legislation.gov.uk.

RIP, MIX, LEARN: FROM CASE LAW TO CASEBOOKS

Like many projects, the Free Law Reporter (FLR) started out as a way to scratch our own itch. As a publisher of legal education materials and developer of legal education resources, CALI finds itself working with the text of the law all the time. Our open casebook project, eLangdell, is the most obvious example.

The theme of the 2006 Conference for Law School Computing was “Rip, Mix, Learn”; it first introduced the idea of open access casebooks and what later became the eLangdell project. At the keynote talk I laid out a path to open access electronic casebooks using public domain case law as a starting point. On the ebook front, I was a couple of years early.

The basic idea was that casebooks were made up of cases (mostly) and that it was a fairly obvious idea to give the full text of cases to law faculty so that they could write their own casebooks and deliver them to their students electronically via the Web or as PDF files. This was before the Amazon Kindle and Apple iPad legitimized the ebook marketplace.

The devilish details involved getting our hands on the full text of cases. We did a quick-and-dirty study of the 100 top casebooks and found that there was a lot of overlap in the cases. This was not too surprising, but it meant that the universe of case law -- as represented by all the cases in all the law school casebooks -- was only about 5,000 cases, and that if you extended that to all the cases mentioned -- not just included -- in a casebook, the number was closer to 15,000. I approached the major vendors of online case databases to try to obtain unencumbered copies of these cases, but I had no luck. Although disappointing, this too was not surprising, considering that these same case law database vendors are part of larger corporations that also sell print casebooks to the law school market.

Of course, the cases themselves are public domain, and anyone with a userID and password could access and download the cases I needed. But the end-user agreements that every user must click “I Agree” to include contract language that precludes anyone from making copies of these public domain cases for anything but personal use. Contract law trumped access to the public domain materials.

Fast forward a couple of years, to the appearance of Carl Malamud's public.resource.org, providing tarballs of well-formatted case law every single week. Add to that the promise of re-keying a large back catalog of cases via the YesWeScan.org project (also from public.resource.org), and we could now begin to explore ideas that had been simmering on the back burner for several years.

CASE SEARCH AS EBOOK: LEAN FORWARD / LEAN BACK

One of the neat features at FreeLawReporter.org is that it allows you to convert the results of a search into a downloadable ebook in .epub format, which you can read on your Apple iPad, Barnes & Noble Nook, and other ereader devices. (.epub ebooks may be readable on Amazon Kindles soon.) The idea for this feature sprang from some articles I had read about how people read on the Web versus how people read books. Jakob Nielsen explains it well in a post entitled “Writing Style for Print vs. Web”:

Print publications — from newspaper articles to marketing brochures — contain linear content that's often consumed in a more relaxed setting and manner than the solution-hunting behavior that characterizes most high-value Web use.

What does this have to do with case law and ebooks?

It's all about what kind of reading you are doing. When you are doing research -- especially online research, which involves refining your search terms, clicking through lots of links, and opening lots of browser tabs -- you are “leaning forward,” actively looking for something that you plan to read in greater depth later. In the case of legal research, the results of your efforts are a collection of cases -- dozens or hundreds of pages long. Once you have found the most on-point cases, you know that you need to read them deeply and carefully in order to follow and understand the arguments. This type of reading I call “leaning back”; it is more suited to the environment you create as a book reader than the one you create as a Web reader.

Turning case law searches into books seems like a natural consequence of the movement between “lean forward” Web searching and “lean back” book reading. There is a lot of anecdotal writing about this, but I am hard-pressed to find scientific literature that is definitive. Fortunately, with FreeLawReporter.org, open source tools, and a smart developer, we can experiment and let users decide what works best for them. This is an important point that deserves some expansion.
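
To sketch how a search-results-to-ebook feature can work, the following builds a minimal EPUB 2 file from a list of case texts using only the standard library. This is an illustration of the container format, not FLR's actual code; a production tool would add a table of contents, styling, and validation:

```python
import os
import tempfile
import zipfile

def search_results_to_epub(path, title, cases):
    """Pack a list of (case_name, text) search results into a minimal
    EPUB 2 file: a zip with an uncompressed 'mimetype' entry first,
    a container.xml pointer, an OPF package file, and XHTML chapters."""
    container = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles><rootfile full-path="content.opf"
    media-type="application/oebps-package+xml"/></rootfiles>
</container>"""
    manifest = "".join(
        f'<item id="c{i}" href="case{i}.xhtml" media-type="application/xhtml+xml"/>'
        for i in range(len(cases)))
    spine = "".join(f'<itemref idref="c{i}"/>' for i in range(len(cases)))
    opf = f"""<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title><dc:language>en</dc:language>
    <dc:identifier id="id">urn:flr:demo</dc:identifier>
  </metadata>
  <manifest>{manifest}</manifest>
  <spine>{spine}</spine>
</package>"""
    with zipfile.ZipFile(path, "w") as z:
        # The mimetype entry must come first and be stored uncompressed.
        z.writestr("mimetype", "application/epub+zip",
                   compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", container)
        z.writestr("content.opf", opf)
        for i, (name, text) in enumerate(cases):
            z.writestr(f"case{i}.xhtml",
                       f"<html xmlns='http://www.w3.org/1999/xhtml'>"
                       f"<head><title>{name}</title></head>"
                       f"<body><h1>{name}</h1><p>{text}</p></body></html>")

# Hypothetical usage: bundle one found case into an ebook.
epub_path = os.path.join(tempfile.mkdtemp(), "results.epub")
search_results_to_epub(epub_path, "Attractive Nuisance Results",
                       [("Hypothetical v. Case", "Full text of the opinion.")])
```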

"FREE" AS IN "FREE TO EXPERIMENT AND INNOVATE"

The primary products of the online legal database vendors are targeted at big law firms. Big firms get the big cases, have the big clients, and spend the most on legal research. As you move down the scale of firm size, you also move down in ability and willingness to pay for legal research, or ability to charge the cost of legal research back to the client. By the time you arrive at small firms and solo practitioners, the amount of time spent doing legal research is much reduced, and, in the case of purely transactional practices, legal research is done only rarely.

The use of these databases in legal education, however, is different. Legal research instructors try to give students a flavor of what using the databases in the real world will be like, but without knowing what type of law the students will end up practicing. The instruction, therefore, must be generalized. The databases are optimized for users who have almost unlimited (in time and cost) access. The databases were not designed for optimizing legal education. With the online database vendors, you get a powerful and comprehensive product, but you cannot change it to suit particular educational goals. You must adjust to it.

A database of the law should be available to the legal education community as a free, open, and customizable system that has affordances for instructors and researchers, i.e., law librarians and law faculty. We are only beginning to explore these ideas, but one analogy is that Wexis is to the Free Law Reporter as Windows is to Linux. The free and open aspect of the Free Law Reporter (FLR) will let legal research instructors, law faculty, law students, and even the public do things that are not possible within the contractually locked-down and/or digitally rights-managed systems that are designed primarily as a product for the most expensive lawyers in the marketplace.

With FLR, we can experiment with tweaking the algorithms behind the search engine to optimize for specific legal research situations. With FLR, we could create closed-universe subsets that could be used for legal research exercises or even final exams. With FLR, we could try out all sorts of things that we cannot do anywhere else.
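
As one concrete example of such tweaking, a BM25-style scoring function exposes parameters (k1 for term-frequency saturation, b for document-length normalization) that an instructor could adjust for a particular exercise or closed-universe set. This is a generic retrieval sketch with invented documents, not FLR's actual engine:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score tokenized documents against a query with BM25.
    k1 and b are the tunable knobs an instructor might experiment with."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()                       # document frequency per term
    for d in docs:
        for t in set(d):
            df[t] += 1
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "attractive nuisance doctrine swimming pool child".split(),
    "contract breach damages".split(),
]
scores = bm25_scores(["attractive", "nuisance"], docs)
# The on-point document outscores the unrelated one.
```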

I don't expect FLR to be a replacement for anything else. It is a new thing that we have not seen before -- a playground, a workshop, a research project, and a tool shed for legal educators. It can only grow in value and increase in quality, but we need help.

WHY “REPORTER”?

The choice of the name "Free Law Reporter" was deliberate. The “free” refers to both the cost and the open source aspects of the project, in the Free Software Foundation tradition. Richard Stallman has often expounded on the importance of access to the code you run on your computer; so too should every citizen have access to the laws of the land. In the past, case law was outsourced by the government to vendors who created the original Reporter system, which was made widely available to the public via state, county, and academic law libraries. Many libraries have, of necessity, cut back on their print subscriptions, reduced their hours of access, reduced their staff, or closed altogether, but the real loss of access to the public started when the law transitioned to online legal databases.

Now that online access to the law is the new normal, the disintermediation of law libraries is nearly complete, but the courts and governments have not maintained equal access during the transition. In the legal publishing lifecycle, there is an opportunity to add value between the generation of the raw data of law and its fee-based publication by online database vendors. FLR, with the help of law librarians, can seize that opportunity. This is not just a value proposition respecting public access to the law. Academic law libraries should have free and open access to the law, access that allows them to define and construct the educational environment for law students.

I am not sure whether the Free Law Reporter (FLR) can grow into what I envision. We are only at the beginning, but I believe it's about time we got started. I do know that we cannot succeed without the assistance and participation of the law librarian community. Right now, this assistance is mostly provided by law schools' continued annual CALI membership.

We are working to make participation in the growth of FLR possible, by finding ways to tap the cognitive surplus of law librarians, students, faculty, and lawyers. The key challenge, I believe, is the construction of a participation framework where many small contributions can be aggregated into something of great, cumulative value. Wikipedia, Linux, and many other open source projects are exemplars from which we can take cues. There is so much to do and I am excited by the technical and organizational challenges that FLR presents. Expect to hear more from us about this project as we get our legs underneath us.

John Mayer is the Executive Director of the Center for Computer-Assisted Legal Instruction (CALI), a 501(c)(3) consortium of over 200 US law schools. He has a BS in Computer Science from Northwestern University and an MSCS from the Illinois Institute of Technology. He can be reached at jmayer@cali.org or @johnpmayer.

VoxPopuLII is edited by Judith Pratt. Editor-in-Chief is Robert Richards, to whom queries should be directed.

Indian Kanoon is a free search engine for Indian law, providing access to more than 1.4 million central laws and judgments from the Supreme Court of India, 24 High Courts, and 17 law tribunals, as well as constituent assembly debates, law commission reports, and a few law journals.

The development of Indian Kanoon began in the summer of 2007, and the service was publicly announced on 4 January 2008. Developing it was a part-time project while I was working towards my doctorate in Computer Science at the University of Michigan under the guidance of Professor Farnam Jahanian of Arbor Networks fame. My work on Indian Kanoon continues to be a part-time affair because of my full-time job at Yahoo! India (Bangalore). Keep in mind, however, that I don't have a law background, nor am I an expert on information retrieval. My PhD thesis is entitled Context-Aware Network Security.

The Genesis

Indian Kanoon was started as a result of my curiosity about publicly available law data. In a blog article, Indian Kanoon - The road so far and the road ahead, written a year after the launch of Indian Kanoon, I explained how the project was started, how it ran during the first year, and the promises for the next year.

When I was considering starting Indian Kanoon, the idea of free Indian law search was not new. Prashant Iyengar, a law student from NALSAR Hyderabad, had faced the same problem: the law data was available, but the search tools were far from satisfactory. So he started OpenJudis to provide search tools for the Indian law data that was publicly available. He traces the availability of government data and the development of OpenJudis in detail in his VoxPopuLII post, Confessions of a Legal Info-holic.

Prashant Iyengar traces the genesis, successes, and impacts of Indian Kanoon in a more detailed fashion in his 2010 report, Free Access to Law in India - Is it Here to Stay?

The Goal

I have to make it clear that Indian Kanoon was started in a very informal fashion; the goals of Indian Kanoon were not well established at the outset. The broadest goal for the project came to me while I was writing the "About" page of Indian Kanoon. From this point on, the goals for Indian Kanoon started to crystallize. The second paragraph of this page summed it up as follows:

"Even when laws empower citizens in a large number of ways, a significant fraction of the population is completely ignorant of their rights and privileges. As a result, common people are afraid of going to police and rarely go to court to seek justice. People continue to live under fear of unknown laws and a corrupt police."

The Legal Thirst

During the first year after the launch of Indian Kanoon, one constant doubt lingered in the minds of everyone familiar with the project (including me): just how many people really need a tool like Indian Kanoon? After all, this was a very specialized tool, which quite possibly would be useful only to lawyers or law students. But what constantly surprises me is the increasing number of users of the Website. Indian Kanoon now has roughly half a million users per month, and the number keeps growing.

The obvious question is: Why is this legal thirst -- this desire for access to the full text of the law -- arising in India now? I can think of umpteen reasons, such as an increase in the number of Indian citizens getting on the Internet, which is proving to be a better access medium than libraries; or that the general media awareness of law, or the spread of blogging culture, is fueling this desire.

On further reflection, I think there are two main drivers of this thirst for legal information. The first is the free and open access to legal resources that is now available. Until very recently, most law resources in India were provided by libraries or Websites that charged a significant amount of money; in effect, they barred access for a significant portion of the population that wanted to look into legal issues. The average time spent per page on the Indian Kanoon Website is six minutes; this shows that most users actually read the legal text, and apparently find it easier to understand than they had expected. (This is precisely what I discovered when I began to read legal texts on a regular basis.)

The spread of the Internet, considered by itself, is not an important reason for the current thirst for law in India, in my view. Subscription-based legal Websites have been around for a while in India, but because of the pay-walls that they erected, none of them has been able to generate a strong user base. While the open nature of the Internet made it easy to compete against these providers, it is the availability of legal information free of charge -- not just the availability of the Internet -- that has removed huge barriers, both to start-ups and to access by the public.

The second major reason for this thirst for legal information -- and for the traffic growth to Indian Kanoon -- lies in technological advancement. Government websites and even private legal information providers in India are, generally, quite technologically deficient. To provide access to law documents, these providers typically have offered interfaces that are mere replicas of the library world. For example, our Supreme Court website allows searching for judgments by petitioner, respondent, case number, etc. While lawyers are often accustomed to using these interfaces, and of course understand these technical legal terms, requiring prior knowledge of this kind of technical legal information as a prerequisite for performing a search raises a big barrier to access by common people. Further, the free-text search engines provided by these Websites have no notion of relevance. So while the technology world has significantly advanced in the areas of text search and relevance, government-based -- and, to some extent, private, fee-based -- legal resources in India have remained tied to stone-age technology.

Better Technology Improves Access

Allowing users to try and test any search terms that they have in mind, and providing a relevant set of links in response to their queries, significantly reduces the need for users to understand technical legal information as a prerequisite for reading and comprehending the law of the land. So, overall, I think advances in technology, some of which have been introduced by Indian Kanoon, are responsible for fostering a desire to read the law, and for affording more people access to the legal resources of India.
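To make concrete what "a notion of relevance" adds over the unranked, field-based lookups described above, here is a toy term-frequency ranking sketch in Python. The documents and query are invented, and real search engines use far more sophisticated scoring (TF-IDF, BM25, and beyond); this only illustrates the idea of ordering results by how well they match free text.

```python
from collections import Counter

# Invented mini-corpus of judgment snippets (not real Indian Kanoon data).
docs = {
    "A": "bail was granted and bail conditions were set by the high court",
    "B": "the court dismissed the petition",
}

def score(query, text):
    """Sum the occurrences of each query term in the document text."""
    counts = Counter(text.split())
    return sum(counts[term] for term in query.split())

# Rank documents by descending score for a free-text query.
ranked = sorted(docs, key=lambda d: score("bail court", docs[d]), reverse=True)
```

Document "A" outranks "B" here simply because it mentions the query terms more often; a field-based interface that only matches on petitioner or case number has no way to produce such an ordering.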

The Road Ahead

Considering, however, that fear of unknown laws remains in the minds of large numbers of the Indian people, now is not the time to gloat over the initial success of Indian Kanoon. The task of Indian Kanoon is far from complete, and certainly more needs to be done to make searching for legal information by ordinary people easy and effective.

Sushant Sinha runs the search engine Indian Kanoon and currently works on the document processing team for Yahoo! India. He earned his PhD in Computer Science from the University of Michigan under the guidance of Professor Farnam Jahanian, and received his bachelor's and master's degrees in computer science from IIT Madras, Chennai. He was born and brought up in Jamshedpur, India, and was recently named one of "18 Young Innovators under 35 in India" by MIT's Technology Review India.

VoxPopuLII is edited by Judith Pratt. Editor in chief is Robert Richards.

[Editor's Note: We are republishing here, with some corrections, a post by Dr. Núria Casellas that appeared earlier on VoxPopuLII.]

The organization and formalization of legal information for computer processing, in order to support decision-making or enhance information search, retrieval, and knowledge management, is not recent, and neither is the need to represent legal knowledge in a machine-readable form. Nevertheless, since the first ideas of computerization of the law in the late 1940s, the appearance of the first legal information systems in the 1950s, and the first legal expert systems in the 1970s, claims such as Hafner's, that "searching a large database is an important and time-consuming part of legal work," which drove the development of legal information systems during the 1980s, have not yet been left behind.

Similar claims may be found nowadays. On the one hand, the amount of unstructured (or poorly structured) legal information and documents made available by governments, free access initiatives, blawgs, and portals on the Web will probably keep growing as the Web expands. On the other, the increasing quantity of legal data managed by legal publishing companies, law firms, and government agencies, together with the high quality requirements applicable to legal information/knowledge search, discovery, and management (e.g., access and privacy issues, copyright, etc.), has renewed the need to develop and implement better content management tools and methods.

Information overload, however important, is not the only concern for the future of legal knowledge management; other and growing demands are increasing the complexity of the requirements that legal information management systems and, in consequence, legal knowledge representation must face in the future. Multilingual search and retrieval of legal information to enable, for example, integrated search between the legislation of several European countries; enhanced laypersons' understanding of and access to e-government and e-administration sites or online dispute resolution capabilities (e.g., BATNA determination); the regulatory basis and capabilities of electronic institutions or normative and multi-agent systems (MAS); and multimedia, privacy or digital rights management systems, are just some examples of these demands.

How may we enable legal information interoperability? How may we foster legal knowledge usability and reuse between information and knowledge systems? How may we go beyond the mere linking of legal documents or the use of keywords or Boolean operators for legal information search? How may we formalize legal concepts and procedures in a machine-understandable form?

In short, how may we handle the complexity of legal knowledge to enhance legal information search and retrieval or knowledge management, taking into account the structure and dynamic character of legal knowledge, its relation with common sense concepts, the distinct theoretical perspectives, the flavor and influence of legal practice in its evolution, and jurisdictional and linguistic differences?

These are challenging tasks, for which different solutions and lines of research have been proposed. Here, I would like to draw your attention to the development of semantic solutions and applications and the construction of formal structures for representing legal concepts in order to make human-machine communication and understanding possible.

Semantic metadata

For example, in the search and retrieval area, we still perform nowadays most legal searches in online or application databases using keywords (that we believe to be contained in the document that we are searching for), maybe together with a combination of Boolean operators, or supported with a set of predefined categories (metadata regarding, for example, date, type of court, etc.), a list of pre-established topics, thesauri (e.g., EuroVoc), or a synonym-enhanced search.

These searches rely mainly on syntactic matching, and -- with the exception of searches enhanced with categories, synonyms, or thesauri -- they will return only documents that contain the exact term searched for. To perform more complex searches, to go beyond the term, we require the search engine to understand the semantic level of legal documents; a shared understanding of the domain of knowledge becomes necessary.
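The difference between purely syntactic matching and synonym-enhanced search can be sketched in a few lines of Python. This is a toy model with made-up documents and a one-entry thesaurus, not the behavior of any real retrieval system:

```python
# Invented mini-corpus and thesaurus entry, for illustration only.
docs = {
    1: "the statute was enacted in 2008",
    2: "the act entered into force in 2008",
}
thesaurus = {"statute": {"statute", "act", "law"}}

def keyword_search(term, corpus):
    """Purely syntactic matching: only documents containing the exact term."""
    return {doc_id for doc_id, text in corpus.items() if term in text.split()}

def expanded_search(term, corpus):
    """Expand the query with thesaurus synonyms before matching."""
    terms = thesaurus.get(term, {term})
    return {d for t in terms for d in keyword_search(t, corpus)}
```

Searching for "statute" syntactically misses document 2, which expresses the same notion with a synonym; the thesaurus-expanded search retrieves both. Going beyond this, to genuinely semantic search, requires the shared conceptual models discussed next.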

Although the quest for the representation of legal concepts is not new, these efforts have recently been driven by the success of the World Wide Web (WWW) and, especially, by the later development of the Semantic Web. Sir Tim Berners-Lee described it as an extension of the Web “in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”

From Web 2.0 to Web 3.0

Thus, the Semantic Web is envisaged as an extension of the current Web, which now comprises collaborative tools and social networks (the Social Web or Web 2.0). The Semantic Web is sometimes also referred to as Web 3.0, although there is no widespread agreement on this matter, as different visions exist regarding the enhancement and evolution of the current Web.

These efforts also include the Web of Data (or Linked Data), which relies on the existence of standard formats (URIs, HTTP and RDF) to allow the access and query of interrelated datasets, which may be granted through a SPARQL endpoint (e.g., Govtrack.us, US census data, etc.). Sharing and connecting data on the Web in compliance with the Linked Data principles enables the exploitation of content from different Web data sources with the development of search, browse, and other mashup applications. (See the Linking Open Data cloud diagram by Cyganiak and Jentzsch below.) [Editor's Note: Legislation.gov.uk also applies Linked Data principles to legal information, as John Sheridan explains in his recent post.]


Thus, to allow semantics to be added to the current Web, new languages and tools (ontologies) were needed, as the development of the Semantic Web is based on the formal representation of meaning in order to share with computers the flexibility, intuition, and capabilities of the conceptual structures of human natural languages. In the subfield of computer science and information science known as Knowledge Representation, the term "ontology" refers to a consensual and reusable vocabulary of identified concepts and their relationships regarding some phenomena of the world, which is made explicit in a machine-readable language. Ontologies may be regarded as advanced taxonomical structures, where concepts are formalized as classes and defined with axioms, enriched with the description of attributes or constraints, and properties.
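As a toy analogue of the taxonomical backbone of an ontology, the sketch below runs a subsumption (is-a) check over a tiny, invented legal class hierarchy. Real ontologies state such axioms in languages like OWL and delegate this inference to a reasoner; this Python fragment only illustrates the underlying idea:

```python
# Invented subclass axioms: each class maps to its direct superclass.
subclass_of = {
    "Contract": "LegalAct",
    "LegalAct": "LegalConcept",
    "Obligation": "LegalConcept",
}

def is_a(cls, ancestor):
    """Walk the subclass chain to test whether cls is subsumed by ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

is_a("Contract", "LegalConcept")  # True: Contract -> LegalAct -> LegalConcept
```

A reasoner does this (and much more: disjointness, property restrictions, consistency checking) over the full axiom set of an ontology.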

The task of developing interoperable technologies (ontology languages, guidelines, software, and tools) has been taken up by the World Wide Web Consortium (W3C). These technologies were arranged in the Semantic Web Stack according to increasing levels of complexity (like a layer cake). In this stack, higher layers depend on lower layers (and the latter are inherited from the original Web). These languages include XML (eXtensible Markup Language), a markup metalanguage usually used to add structure to documents, and the so-called ontology languages: RDF/RDFS (Resource Description Framework/Schema), OWL, and OWL2 (Ontology Web Language). While the RDF language offers simple descriptive information about the resources on the Web, encoded in sets of triples of subject (a resource), predicate (a property or relation), and object (a resource or a value), RDFS allows the description of sets. OWL offers an even more expressive language to define structured ontologies (e.g., class disjointness, union, or equivalence).
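The triple model itself is simple enough to sketch directly. The following fragment stores a few invented RDF-style triples as (subject, predicate, object) tuples and pattern-matches them, very loosely mimicking a SPARQL basic graph pattern. The `ex:` and `dc:`-style prefixes are placeholders, not real vocabulary terms; real applications would use an RDF library and actual URIs:

```python
# Invented triples about a (real) document, using placeholder prefixes.
triples = {
    ("ex:Constitution1958", "rdf:type", "ex:LegalDocument"),
    ("ex:Constitution1958", "ex:jurisdiction", "ex:France"),
    ("ex:Constitution1958", "ex:promulgated", "1958"),
}

def match(pattern, store):
    """Return triples matching a pattern; None acts as a wildcard variable,
    as in a (much simplified) SPARQL basic graph pattern."""
    s, p, o = pattern
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Everything we know about the 1958 Constitution":
facts = match(("ex:Constitution1958", None, None), triples)
```

A SPARQL endpoint generalizes exactly this kind of wildcard matching to joins over many patterns, across datasets published as Linked Data.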

Moreover, a specification to support the conversion of existing thesauri, taxonomies, or subject headings into RDF triples has recently been published: SKOS, the Simple Knowledge Organization System standard. These specifications may be exploited in Linked Data efforts, such as the New York Times vocabularies. Also, EuroVoc, the multilingual thesaurus for activities of the EU, is now available in this format.

Although there are different views in the literature regarding the scope of the definition or main characteristics of ontologies, the use of ontologies is seen as the key to implementing semantics for human-machine communication. Many ontologies have been built for different purposes and knowledge domains, for example:

  • OpenCyc: an open source version of the Cyc general ontology;
  • SUMO: the Suggested Upper Merged Ontology;
  • the upper ontologies PROTON (PROTo Ontology) and DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering);
  • the FRBRoo model (which represents bibliographic information);
  • the RDF representation of Dublin Core;
  • the Gene Ontology;
  • the FOAF (Friend of a Friend) ontology.

Although most domains are of interest for ontology modeling, the legal domain offers a perfect area for conceptual modeling and knowledge representation to be used in different types of intelligent applications and legal reasoning systems, not only due to its complexity as a knowledge-intensive domain, but also because of the large amount of data that it generates. The use of semantically-enabled technologies for legal knowledge management could provide legal professionals and citizens with better access to legal information; enhance the storage, search, and retrieval of legal information; make possible advanced knowledge management systems; enable human-computer interaction; and even satisfy some hopes respecting automated reasoning and argumentation.

Regarding the incorporation of legal knowledge into the Web or into IT applications, or the more complex realization of the Legal Semantic Web, several directions have been taken, such as the development of XML standards for legal documentation and drafting (including Akoma Ntoso, LexML, CEN Metalex, and Norme in Rete), and the construction of legal ontologies.

Ontologizing legal knowledge

During the last decade, research on the use of legal ontologies as a technique to represent legal knowledge has increased and, as a consequence, a very interesting debate about their capacity to represent legal concepts and their relation to the different existing legal theories has arisen. It has even been suggested that ontologies could be the “missing link” between legal theory and Artificial Intelligence.

The literature suggests that legal ontologies may be distinguished by the levels of abstraction of the ideas they represent, the key distinction being between core and domain levels. Legal core ontologies model general concepts which are believed to be central for the understanding of law and may be used in all legal domains. In the past, ontologies of this type were mainly built upon insights provided by legal theory and largely influenced by normativism and legal positivism, especially by the works of Hart and Kelsen. Thus, initial legal ontology development efforts in Europe were influenced by hopes and trends in research on legal expert systems based on syllogistic approaches to legal interpretation.

More recent contributions at that level include the LKIF-Core Ontology, the LRI-Core Ontology, the DOLCE+CLO (Core Legal Ontology), and the Ontology of Fundamental Legal Concepts. Such ontologies usually include references to the concepts of Norm, Legal Act, and Legal Person, and may contain the formalization of deontic operators (e.g., Prohibition, Obligation, and Permission).
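To make the idea of formalizing deontic operators concrete, here is a minimal, invented sketch in Python. Actual core ontologies express these notions axiomatically in OWL or LKIF rather than in code, and the norms below are made up for illustration:

```python
from enum import Enum

class Deontic(Enum):
    OBLIGATION = "obligation"
    PERMISSION = "permission"
    PROHIBITION = "prohibition"

# Invented norm base: each norm pairs an operator with an actor and an action.
norms = [
    {"operator": Deontic.PROHIBITION, "actor": "Processor",
     "action": "transfer-data-abroad"},
    {"operator": Deontic.PERMISSION, "actor": "Processor",
     "action": "store-data"},
]

def permitted(actor, action, norm_base):
    """Treat an action as permitted unless an explicit prohibition covers it
    (one common closure assumption; others are possible)."""
    return not any(
        n["operator"] is Deontic.PROHIBITION
        and n["actor"] == actor and n["action"] == action
        for n in norm_base
    )
```

Even this toy shows why formalization matters: once norms are machine-readable, questions like "may this actor perform this action?" become computable queries rather than manual lookups.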

Domain ontologies, on the other hand, are directed towards the representation of conceptual knowledge regarding specific areas of the law or domains of practice, and are built with particular applications in mind, especially those that enable communication (shared vocabularies), or enhance indexing, search, and retrieval of legal information. Currently, most legal ontologies being developed are domain-specific ontologies, and some areas of legal knowledge have been heavily targeted, notably the representation of intellectual property rights respecting digital rights management (IPROnto Ontology, the Copyright Ontology, the Ontology of Licences, and the ALIS IP Ontology), and consumer-related legal issues (the Customer Complaint Ontology (or CContology), and the Consumer Protection Ontology). Many other well-documented ontologies have also been developed for purposes of the detection of financial fraud and other crimes; the representation of alternative dispute resolution methods, privacy compliance, patents, cases (e.g., Legal Case OWL Ontology), judicial proceedings, legal systems, and argumentation frameworks; and the multilingual retrieval of European law, among others. (See, for example, the proceedings of the JURIX and ICAIL conferences for further references.)

A socio-legal approach to legal ontology development

Thus, there are many approaches to the development of legal ontologies. Nevertheless, in the current legal ontology literature there are few explicit accounts or insights into the methods researchers use to elicit legal knowledge, and the accounts that are available reflect a lack of consensus as to the most appropriate methodology. For example, some accounts focus solely on the use of text mining techniques towards ontology learning from legal texts; while others concentrate on the analysis of legal theories and related materials to extract and formalize legal concepts. Moreover, legal ontology researchers disagree about the role that legal experts should play in ontology development and validation.

In this regard, at the Institute of Law and Technology, we are developing a socio-legal approach to the construction of legal conceptual models. This approach stems from our collaboration with firms, government agencies, and nonprofit organizations (and their experts, clients, and other users) for the gathering of either explicit or tacit knowledge according to their needs. This empirically-based methodology may require the modeling of legal knowledge in practice (or professional legal knowledge, PLK), and the acquisition of knowledge through ethnographic and other social science research methods, together with the extraction (and merging) of concepts from a range of different sources (acts, regulations, case law, protocols, technical reports, etc.) and their validation by both legal experts and users.

For example, the Ontology of Professional Judicial Knowledge (OPJK) was developed in collaboration with the Spanish School of the Judiciary to enhance the search and retrieval capabilities of a Web-based frequently-asked-questions system (IURISERVICE) containing a repository of practical knowledge for Spanish judges in their first appointment. The knowledge was elicited from an ethnographic survey in Spanish First Instance Courts. The Neurona Ontologies, on the other hand, developed for a data protection compliance application, are based on the knowledge of legal experts and the requirements of enterprise asset management, together with the analysis of privacy and data protection regulations and technical risk management standards.

This approach tries to take into account many of the criticisms that developers of legal knowledge-based systems (LKBS) received during the 1980s and the beginning of the 1990s, including, primarily, the lack of legal knowledge or legal domain understanding of most LKBS development teams at the time. These criticisms were rooted in the widespread use of legal sources (statutes, case law, etc.) directly as the knowledge for the knowledge base, instead of including in the knowledge base the "expert" knowledge of lawyers or law-related professionals.

Further, in order to represent knowledge in practice (PLK), legal ontology engineering could benefit from the use of social science research methods for knowledge elicitation, institutional/organizational analysis (institutional ethnography), as well as close collaboration with legal practitioners, users, experts, and other stakeholders, in order to discover the relevant conceptual models that ought to be represented in the ontologies. Moreover, I understand the participation of these stakeholders in ontology evaluation and validation to be crucial to ensuring consensus about, and the usability of, a given legal ontology.

Challenges and drawbacks

Although the use of ontologies and the implementation of the Semantic Web vision may offer great advantages to information and knowledge management, there are great challenges and problems to be overcome.

First, the problems related to knowledge acquisition techniques and bottlenecks in software engineering are inherent in ontology engineering, and ontology development is quite a time-consuming and complex task. Second, as ontologies are directed mainly towards enabling some communication on the basis of shared conceptualizations, how are we to determine the sharedness of a concept? And how are context-dependencies or (cultural) diversities to be represented? Furthermore, how can we evaluate the content of ontologies?

Current research is focused on overcoming these problems through the establishment of gold standards in concept extraction and ontology learning from texts, and the idea of collaborative development of legal ontologies, although these techniques might be unsuitable for the development of certain types of ontologies. Also, evaluation (validation, verification, and assessment) and quality measurement of ontologies are currently an important topic of research, especially ontology assessment and comparison for reuse purposes.

Regarding ontology reuse, the general belief is that the more abstract (or core) an ontology is, the less it owes to any particular domain and, therefore, the more reusable it becomes across domains and applications. This generates a usability-reusability trade-off that is often difficult to resolve.

Finally, once created, how are these ontologies to evolve? How are ontologies to be maintained and new concepts added to them?

Over and above these issues, more particularized discussions are taking place in the legal domain: for example, discussion of the advantages and drawbacks of adopting an empirically based perspective (bottom-up), and of the complexity of establishing clear connections with legal dogmatics or general legal theory approaches (top-down). To what extent are these two different perspectives on legal ontology development incompatible? How might they complement each other? What is their relationship with text-based approaches to legal ontology modeling?

I would suggest that empirically based, socio-legal methods of ontology construction constitute a bottom-up approach that enhances the usability of ontologies, while the general legal theory-based approach to ontology engineering fosters the reusability of ontologies across multiple domains.

The scholarly discussion of legal ontology development also embraces more fundamental issues, among them the capabilities of ontology languages for the representation of legal concepts, the possibilities of incorporating a legal flavor into OWL, and the implications of combining ontology languages with the formalization of rules.

Finally, the potential value to legal ontology of other approaches, areas of expertise, and domains of knowledge construction ought to be explored, for example: pragmatics and sociology of law methodologies, experiences in biomedical ontology engineering, formal ontology approaches, and the relationships between legal ontology and legal epistemology, legal knowledge and common sense or world knowledge, expert and layperson's knowledge, legal information and Linked Data possibilities, and legal dogmatics and political science (e.g., in e-Government ontologies).

As you may see, the challenges faced by legal ontology engineering are great, and the limitations of legal ontologies are substantial. Nevertheless, the potential of legal ontologies is immense. I believe that law-related professionals and legal experts have a central role to play in the successful development of legal ontologies and legal semantic applications.

[Editor's Note: For many of us, the technical aspects of ontologies and the Semantic Web are unfamiliar. Yet these technologies are increasingly being incorporated into the legal information systems that we use everyday, so it's in our interest to learn more about them. For those of us who would like a user-friendly introduction to ontologies and the Semantic Web, here are some suggestions:

Dr. Núria Casellas is a visiting researcher at the Legal Information Institute at Cornell University. She is a researcher at the Institute of Law and Technology and an assistant professor at the UAB Law School (on leave). She has participated in several national and European-funded research projects regarding legal ontologies and legal knowledge management: these concern the acquisition of knowledge in judicial settings (IURISERVICE), modeling privacy compliance regulations (NEURONA), drafting legislation (DALOS), and the Legal Case Study of the Semantically Enabled Knowledge Technologies (SEKT VI Framework project), among others. Co-editor of the IDT Series, she holds a Law Degree from the Universitat Autònoma de Barcelona, a Master's Degree in Health Care Ethics and Law from the University of Manchester, and a PhD ("Modelling Legal Knowledge through Ontologies. OPJK: the Ontology of Professional Judicial Knowledge").

VoxPopuLII is edited by Judith Pratt. Editor in Chief is Robert Richards.

In May of this year, HeinOnline began taking a new approach to legal research, offering researchers the ability to search or browse varying types of legal research material all related to a specialized area of law in one database. We introduced this concept as a new legal research platform with the release of World Constitutions Illustrated: Contemporary & Historical Documents & Resources, which we’ll discuss in further detail later on in this post. First, we must take a brief look at how HeinOnline started and where it is going. Then, we will continue on by looking at the scope of the new platform and how it is being implemented across HeinOnline’s newest library modules.

This is how we started...
Traditionally, HeinOnline libraries featured one title or a single type of legal research material. For example, the Law Journal Library, HeinOnline’s largest and most used database, contains law and law-related periodicals. The Federal Register Library contains the Federal Register dating back to inception, with select supporting resources. The U.S. Statutes at Large Library contains the U.S. Statutes at Large volumes dating back to inception, with select supporting resources.


This is where we are going...
The new subject-specific legal research platform, introduced earlier this year, has shifted from that traditional approach to a more dynamic approach of offering research libraries focused on a subject area, versus a single title or resource. This platform combines primary and secondary resources, books, law review articles, periodicals, government documents, historical documents, bibliographic references and other supporting resources all related to the same area of law, into one database, thus providing researchers one central place to find what they need.


How is this platform being implemented?
In May, HeinOnline introduced the platform with the release of a new library called World Constitutions Illustrated: Contemporary & Historical Documents & Resources. The platform has since been implemented in every new library that HeinOnline has released, including History of Bankruptcy: Taxation & Economic Reform in America, Part III and the Intellectual Property Law Collection.

Pilot project: World Constitutions Illustrated
First, let’s look at the pilot project, World Constitutions Illustrated. Our goal when releasing this new library was to present legal researchers with a different scope than what is currently available for those studying constitutional law and political science. To achieve this, the library was built upon the new legal research platform, which brings together: constitutional documents, both current and historical; secondary sources such as the CIA’s World Fact Book, Modern Legal Systems Cyclopedia, the Library of Congress’s Country Studies and British and Foreign State Papers; books; law review articles; bibliographies; and links to external resources on the Web that directly relate to the political and historical development of each country. By presenting the information in this format, researchers no longer have to visit multiple Web sites or pull multiple sources to obtain the documentary history of the development of a country’s constitution and government.

Inside the interface, every country has a dedicated resource page that includes the Constitutions and Fundamental Laws, Commentaries & Other Relevant Sources, Scholarly Articles Chosen by Our Editors, a Bibliography of Select Constitutional Books, External Links, and a news feed. Let’s take a look at France.


Constitutions & Fundamental Laws
France has a significant hierarchy of constitutional documents, from the current constitution as amended to 2008 all the way back to the Declaration of the Rights of Man and of the Citizen, promulgated in 1789. Within the hierarchy of documents, one can find consolidated texts, amending laws, and the original text in multiple languages when translations are available.


Commentaries & Other Relevant Sources
Researchers will find more than 100 commentaries and other relevant sources of information related to the development of the government of France and the French Constitution. These sources include secondary source books and classic constitutional law books. To further connect these sources to the French Constitution, our Editors have reviewed each source book and classic constitutional book and linked researchers to the specific chapters or sections of the works that directly relate to the study of the French Constitution. For example, the work titled American Nation: A History, by Albert Bushnell Hart, has direct links to chapters from within volumes 11 and 13, each of which discusses and relates to the development of the French government.


Scholarly Articles Chosen by Our Editors
This section features more than 40 links to scholarly articles from HeinOnline’s Law Journal Library that are directly related to the study of the French Constitution and the development of the government of France. The Editors hand-selected and included these articles from the thousands of articles in the Law Journal Library due to their significance and relation to the constitutional and political development of the nation. When browsing the list of articles, one will also find Hein’s ScholarCheck integrated, which allows a researcher to view other law review articles that cite that specific article. In order for researchers to access the law review articles, they must be subscribed to the Law Journal Library.


Bibliography of Select Constitutional Books
There are thousands of books related to constitutional law. Our Editors have gone through an extensive list of these resources and hand-selected books relevant to the constitutional development of each country. The selections are presented as a bibliography within each country. France has nearly 100 bibliographic references. Many bibliographic references also contain the ISBN, which links to WorldCat, allowing researchers to find the work in a nearby library.


External Links
External links are also selected by the Editors as they are developing the constitutional hierarchies for each country. If there are significant online resources available that support the study of the constitution or the country’s political development, the links are included on the country page.


News Feeds
The last component on each country’s page is a news feed featuring recent articles about the country’s constitution. The feed is powered by a Google RSS news feed, and researchers can use the RSS icon to add it to their own RSS readers.
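For researchers who would rather consume such a feed programmatically than in an RSS reader, a minimal sketch using only Python's standard library follows. The sample feed and its URLs are invented for illustration; RSS 2.0 items follow the same `<item>`/`<title>`/`<link>` shape regardless of the publisher.

```python
import xml.etree.ElementTree as ET

# Invented sample standing in for a fetched Google News RSS document.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>France constitution - Google News</title>
    <item>
      <title>Constitutional reform debated</title>
      <link>https://example.com/article-1</link>
      <pubDate>Mon, 01 Aug 2011 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Council reviews new statute</title>
      <link>https://example.com/article-2</link>
      <pubDate>Tue, 02 Aug 2011 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def feed_headlines(rss_xml: str):
    """Return (title, link) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]
```

In practice the XML string would come from fetching the feed URL shown behind the RSS icon; the parsing step is the same.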


In addition to the significant and comprehensive coverage of every country in World Constitutions Illustrated, the collection also features an abundance of material related to the study of constitutional law at a higher level. This makes it useful for those researching more general or regional constitutional topics.

Searching capabilities on the new platform
To further enhance the platform, researchers are given a comprehensive search tool that can search documents and books by a number of metadata points, including document date, promulgated date, document source, title, and author. Researchers studying the current constitution can narrow the search to just the documents that currently make up a country’s constitution. A search can also run across all the documents, classic books, and reference books for a specific country, or be limited to a specific type of resource. Search results are faceted, allowing researchers to quickly narrow the result set by document type, date, country, and title.
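Faceted results of this kind are straightforward to model. The sketch below is purely illustrative (the records and field names are hypothetical, not HeinOnline's data model): it counts how many results fall under each facet value, and filters the result set when a facet is chosen.

```python
from collections import Counter

# Hypothetical records standing in for a set of search results;
# the field names (doc_type, date, country) are illustrative only.
RESULTS = [
    {"doc_type": "constitution", "date": 1958, "country": "France"},
    {"doc_type": "commentary",   "date": 1875, "country": "France"},
    {"doc_type": "commentary",   "date": 1958, "country": "France"},
    {"doc_type": "book",         "date": 1921, "country": "Germany"},
]


def facet_counts(results, fields):
    """For each facet field, count how many results carry each value."""
    return {f: Counter(r[f] for r in results) for f in fields}


def apply_facet(results, field, value):
    """Drill down: keep only the results matching the chosen facet value."""
    return [r for r in results if r[field] == value]
```

Displaying `facet_counts` alongside the result list gives the researcher the counts shown next to each facet, and each click simply re-filters with `apply_facet` and recomputes the counts.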


Contributing to the project
An underlying concept behind the new legal research platform is encouraging legal scholars, law libraries, subject area experts, and other professionals to contribute to the project. HeinOnline wants to work with scholars and libraries from all around the world to continue to build upon the collection and to continue developing the constitutional timelines for every country. Several libraries and scholars from around the world have already contributed constitutional works from their libraries to World Constitutions Illustrated.

Extending the platform beyond the pilot project
As mentioned earlier, this platform has been implemented in every new library HeinOnline has released, including History of Bankruptcy: Taxation & Economic Reform in America, Part III and the Intellectual Property Law Collection, so it is worth taking a brief look at each.

History of Bankruptcy: Taxation & Economic Reform in America, Part III
The History of Bankruptcy library includes more than 172,000 pages of legislative histories, treatises, documents, and more, all related to bankruptcy law in America. The primary resources in this library are the legislative histories, which can be browsed by title, public law number, or popular name. Also included are classic books dating back to the late 1800s and links to scholarly articles selected by our Editors for their significance to the study of bankruptcy law in America.


As with the searching capabilities in World Constitutions Illustrated, researchers can narrow a search by type of resource or search across everything in the library. Search results are faceted, allowing researchers to quickly narrow the result set by document type, date, or title.


Intellectual Property Law Collection
The Intellectual Property Law Collection, released just over a month ago, features nearly 2 million pages of legal research material related to patents, trademarks, and copyrights in America. It includes more than 270 books, more than 100 legislative histories, links to more than 50 legal periodicals, federal agency documents, the Manual of Patent Examining Procedure, CFR Title 37, U.S. Code Titles 17 & 35, and links to scholarly articles chosen by our Editors, all related to intellectual property law in America.


Furthermore, this library features a Google Patent Search widget that allows researchers to search across more than 7 million patents made available to the public through an arrangement between Google and the United States Patent and Trademark Office.


Searching in the Intellectual Property Law Collection lets researchers search across all types of documents, or narrow a search to just books, legislative histories, or federal agency decisions, for example. Search results are faceted, allowing researchers to quickly narrow the result set by document type, date, country, or title.


HeinOnline is the modern link to legal history, and the new legal research platform advances that objective. The platform brings together primary and secondary sources, supporting documents, books, periodicals, and links to articles and other online sources, making it a central starting point for legal research. The Editors select the books, articles, and sources they deem significant to each area of the law and present them in a single database, making it easier for researchers to find what they need. With the tremendous growth of digital media and online sources, it can be difficult to navigate quickly to the most significant sources of information; HeinOnline’s goal is to make that navigation easier with this new legal research platform.

Marcie Baranich is the Marketing Manager at William S. Hein & Co., Inc. and is responsible for strategic marketing for both Hein's traditional products and its growing online legal research database, HeinOnline. In addition to her marketing role, she is also active in product development, training, and support for HeinOnline. She is an author of the HeinOnline Blog, Wiki, YouTube channel, Facebook, and Twitter pages, and manages the strategic use of these resources to communicate with and assist users in their legal research.

VoxPopuLII is edited by Judith Pratt.

Editor-in-Chief is Robert Richards, to whom queries should be directed.