Sushant Sinha » VoxPopuLII


Indian Kanoon - The Genesis And The Legal Thirst


Apr 22, 2011

Indian Kanoon is a free search engine for Indian law, providing access to more than 1.4 million central laws and judgments from the Supreme Court of India, 24 High Courts, and 17 law tribunals, along with constituent assembly debates, law commission reports, and a few law journals.

The development of Indian Kanoon began in the summer of 2007 and was publicly announced on 4 January 2008. Developing this service was a part-time project when I was working towards my doctorate in Computer Science at the University of Michigan under the guidance of Professor Farnam Jahanian of Arbor Networks fame. My work on Indian Kanoon continues to be a part-time affair because of my full-time job at Yahoo! India (Bangalore). Keep in mind, however, that I don’t have a law background, nor am I an expert on information retrieval. My PhD thesis is entitled Context-Aware Network Security.

The Genesis

Indian Kanoon was started as a result of my curiosity about publicly available law data. In a blog article, Indian Kanoon – The road so far and the road ahead, written a year after the launch of Indian Kanoon, I explained how the project was started, how it ran during the first year, and the promises for the next year.

When I was considering starting Indian Kanoon, the idea of free Indian law search was not new. Prashant Iyengar, a law student from NALSAR Hyderabad, faced the same problem. The law data was available but the search tools were far from satisfactory. So he started OpenJudis to provide search tools for Indian law data that were publicly available. He traces the availability of government data and the development of OpenJudis in detail in his VoxPopuLII post, Confessions of a Legal Info-holic.

Prashant Iyengar traces the genesis, successes, and impacts of Indian Kanoon in a more detailed fashion in his 2010 report, Free Access to Law in India – Is it Here to Stay?

The Goal

I have to make it clear that Indian Kanoon was started in a very informal fashion; the goals of Indian Kanoon were not well established at the outset. The broadest goal for the project came to me while I was writing the “About” page of Indian Kanoon. From this point on, the goals for Indian Kanoon started to crystallize. The second paragraph of this page summed it up as follows:

“Even when laws empower citizens in a large number of ways, a significant fraction of the population is completely ignorant of their rights and privileges. As a result, common people are afraid of going to police and rarely go to court to seek justice. People continue to live under fear of unknown laws and a corrupt police.”

The Legal Thirst

During the first year after the launch of Indian Kanoon, one constant doubt that lingered in the minds of everyone familiar with the project (including me) concerned just how many people really needed a tool like Indian Kanoon. After all, this was a very specialized tool, which quite possibly would be useful only to lawyers or law students. But what constantly surprises me is the increasing number of users of the Website. Indian Kanoon now has roughly half a million users per month, and the number keeps growing.

The obvious question is: Why is this legal thirst — this desire for access to full text of the law — arising in India now? I can think of umpteen reasons, such as an increase in the number of Indian citizens getting on the Internet, which is proving to be a better access medium than libraries; or that the general media awareness of law, or the spread of blogging culture, is fueling this desire.

On further reflection, I think there are two main drivers of this thirst for legal information. The first one is the resources now available for free and open access to law. Until very recently, most law resources in India were provided by libraries or Websites that charged a significant amount of money. In effect, they shut out a significant portion of the population that wanted to look into legal issues. The average time spent per page on the Indian Kanoon Website is six minutes; this suggests that most users actually read the legal text, and apparently find it easier to understand than they had previously expected. (This is precisely what I discovered when I began to read legal texts on a regular basis.)

The spread of the Internet, considered by itself, is not an important reason for the current thirst for law in India, in my view. Subscription-based legal Websites have been around for a while in India, but because of the pay-walls that they erected, none of them has been able to generate a strong user base. While the open nature of the Internet made it easy to compete against these providers, it is the availability of legal information free of charge, and not just the availability of the Internet, that has removed huge barriers, both to start-ups and to access by the public.

The second major reason for this thirst for legal information — and for the traffic growth to Indian Kanoon — lies in technological advancement. Government websites and even private legal information providers in India are, generally, quite technologically deficient. To provide access to law documents, these providers typically have offered interfaces that are mere replicas of the library world. For example, our Supreme Court website allows searching for judgments by petitioner, respondent, case number, etc. While lawyers are often accustomed to using these interfaces, and of course understand these technical legal terms, requiring prior knowledge of this kind of technical legal information as a prerequisite for performing a search raises a big barrier to access by common people. Further, the free-text search engines provided by these Websites have no notion of relevance. So while the technology world has significantly advanced in the areas of text search and relevance, government-based — and, to some extent, private, fee-based — legal resources in India have remained tied to stone-age technology.
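To make the notion of relevance concrete, here is a minimal sketch of one textbook approach, TF-IDF scoring, in Python. It is not a description of how Indian Kanoon ranks results; the function names and the three-document corpus are assumptions invented for illustration. A query term counts for more when it is frequent within a document and rare across the collection, so the most pertinent documents come first instead of in arbitrary order.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank(query, documents):
    """Return (doc_id, score) pairs sorted by descending TF-IDF score.

    `documents` maps a document id to its full text. Documents that
    contain none of the query terms are left out of the result.
    """
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()}
    total_docs = len(documents)

    scores = {}
    for term in set(tokenize(query)):
        # Number of documents that contain the term at least once.
        doc_freq = sum(1 for counts in term_counts.values() if term in counts)
        if doc_freq == 0:
            continue
        # Terms that appear in fewer documents get a higher weight.
        idf = math.log(1 + total_docs / doc_freq)
        for doc_id, counts in term_counts.items():
            if term in counts:
                # Term frequency, normalised by document length.
                tf = counts[term] / sum(counts.values())
                scores[doc_id] = scores.get(doc_id, 0.0) + tf * idf

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical corpus, invented for illustration only.
SAMPLE_DOCS = {
    "case-a": "The tenant challenged the eviction order passed by the rent controller.",
    "case-b": "The appeal concerns income tax assessed on agricultural land.",
    "case-c": "Eviction of a tenant requires notice; the eviction here was without notice.",
}

if __name__ == "__main__":
    for doc_id, score in rank("tenant eviction", SAMPLE_DOCS):
        print(doc_id, round(score, 3))
```

A user who types two everyday words gets the documents that discuss them most, without needing to know a petitioner's name or a case number.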

Better Technology Improves Access

Allowing users to try and test any search terms that they have in mind, and providing a relevant set of links in response to their queries, significantly reduces the need for users to understand technical legal information as a prerequisite for reading and comprehending the law of the land. So, overall, I think advances in technology, some of which have been introduced by Indian Kanoon, are responsible for fostering a desire to read the law, and for affording more people access to the legal resources of India.

The Road Ahead

Considering, however, that fear of unknown laws remains in the minds of large numbers of the Indian people, now is not the time to gloat over the initial success of Indian Kanoon. The task of Indian Kanoon is far from complete, and certainly more needs to be done to make searching for legal information by ordinary people easy and effective.

Sushant Sinha runs the search engine Indian Kanoon and currently works on the document processing team for Yahoo! India. Earlier he earned his PhD in Computer Science from the University of Michigan under the guidance of Professor Farnam Jahanian. He received his bachelor’s and master’s degrees in computer science from IIT Madras, Chennai and was born and brought up in Jamshedpur, India. He was recently named one of “18 Young Innovators under 35 in India” by MIT’s Technology Review India.

VoxPopuLII is edited by Judith Pratt. Editor in chief is Robert Richards.

Confessions of a Legal Info-holic


Feb 01, 2010

In an extraordinary story, Jorge Luis Borges writes of a “Total Library”, organized into ‘hexagons’ that supposedly contained all books:

When it was proclaimed that the Library contained all books, the first impression was one of extravagant happiness. All men felt themselves to be the masters of an intact and secret treasure. . . . At that time a great deal was said about the Vindications: books of apology and prophecy which . . . [contained] prodigious arcana for [the] future. Thousands of the greedy abandoned their sweet native hexagons and rushed up the stairways, urged on by the vain intention of finding their Vindication. These pilgrims disputed in the narrow corridors . . . strangled each other on the divine stairways . . . . Others went mad. . . . The Vindications exist . . . but the searchers did not remember that the possibility of a man’s finding his Vindication, or some treacherous variation thereof, can be computed as zero. As was natural, this inordinate hope was followed by an excessive depression. The certitude that some shelf in some hexagon held precious books and that these precious books were inaccessible, seemed almost intolerable.

About three years ago I spent almost an entire sleepless month coding OpenJudis – my rather cool, “first-of-its-kind” free online database of Indian Supreme Court cases. The database hosts the full texts of about 25,000 cases decided since 1950. In this post I embark on a somewhat personal reflection on the process of creating OpenJudis – what I learnt about access to law (in India), and about “legal informatics,” along with some meditations on future pathways.

Having, by now, attended my share of FLOSS events, I know it is the invariable tendency of anyone who’s written two lines of free code to consider themselves qualified to pronounce on lofty themes – the nature of freedom and liberty, the commodity, scarcity, etc. With OpenJudis, likewise, I feel like I’ve acquired the necessary license to inflict my theory of the world on hapless readers – such as those at VoxPopuLII!

I begin this post by describing the circumstances under which I began coding OpenJudis. This is followed by some of my reflections on how “legal informatics” relates to and could relate to law.

Online Access to Law in India

India is privileged to have quite a robust ICT architecture. Internet access is relatively inexpensive, and the ubiquity of “cyber cafes” has resulted in extensive Internet penetration, even in the absence of individual subscriptions.

Government bodies at all levels are statutorily obliged to publish, on the Internet, vital information regarding their structure and functioning. The National Informatics Centre (NIC), a public sector corporation, is responsible for hosting, maintaining and updating the websites of government bodies across the country. These include, inter alia, the websites of the Union (federal) Government, the various state governments, union and state ministries, constitutional bodies such as the Election Commission and the Planning Commission, and regulatory bodies such as the Securities and Exchange Board of India (SEBI). These websites typically host a wealth of useful information including, illustratively, the full texts of applicable legislations, subordinate legislations, administrative rulings, reports, census data, application forms etc.

The NIC has also been commissioned by the judiciary to develop websites for courts at various levels and publish decisions online. As a result, beginning around the year 2000, the Supreme Court and various high courts have been publishing their decisions on their websites. The full texts of all Supreme Court decisions rendered since 1950 have been made available, which is an invaluable free resource for the public. Most High Court websites, however, have not yet made archival material available online, so at present, access remains limited to decisions from the year 2000 onwards. More recently the NIC has begun setting up websites for subordinate courts, although this process is still at a very embryonic stage.

Apart from free government websites, a handful of commercial enterprises have been providing online access to legal materials. Among them, two deserve special mention. SCCOnline – a product of one of the leading law report publishers in India – provides access to the full texts of decisions of the Indian Supreme Court. The CD version of SCCOnline sells for about INR 70,000 (about US$1,500), which is around the same price the company charges for a full set of print volumes of its reporter. For an additional charge, the company offers updates to the database. The other major commercial venture in the field is Manupatra, which offers access to the full text of decisions of various courts and tribunals as well as the texts of legislation. Access is provided for a basic charge of about US$100, plus a charge of about US$1 per document downloaded. While seemingly modest by international standards, these charges are unaffordable by large sections of the legal profession and the lay public.

OpenJudis

In December 2006, I began coding OpenJudis. My reasons were purely selfish. While the full texts of the decisions of the Supreme Court were already available online for free, the search engine on the government website was unreliable and inadequate for (my) advanced research needs. The formatting of the text of cases themselves was untidy, and it was cumbersome to extract passages from them. Frequently, the website appeared overloaded with users, and alternate free sources were unavailable. I couldn’t afford any of the commercial databases. My own private dissatisfaction with the quality of service, coupled with (in retrospect) my completely naive optimism, led me to attempt OpenJudis. A third crucial factor on the input side was time, and a “room of my own,” which I could afford only because of a generous fellowship I had from the Open Society Institute.

I began rashly, by serially downloading the full texts of the 25,000 decisions on the Supreme Court website. Once that was done (it took about a week), I really had no notion of how to proceed. I remember being quite exhilarated by the sheer fact of being in possession of twenty five thousand Supreme Court decisions. I don’t think I can articulate the feeling very well. (I have some hope, however, that readers of this blog and my fellow LII-ers will intuitively understand this feeling.) Here I was, an average Joe poking around on the Internet, and just-like-that I now had an archive of 25,000 key documents of our republic, cumulatively representing the articulations of some of the finest (and some not-so-fine) legal minds of the previous half-century, sitting on my laptop. And I could do anything with them.

The word “archive,” incidentally, as Derrida informs us, derives from the Greek arkheion, the residence of the superior magistrates, the archons – those who commanded. The archons both “held and signified political power,” and were considered to possess the right to both “make and represent the law.” “Entrusted to such archons, these documents in effect speak the law: they recall the law and call on or impose the law”. Surely, or I am much mistaken, a very significant transformation has occurred when ordinary citizens become capable of housing archives – when citizens can assume the role of archons at will.

Giddy with power, I had an immediate impulse to find a way to transmit this feeling, to make it portable, to dissipate it – an impulse that will forever mystify economists wedded to “rational” incentive-based models of human behavior. I wasn’t a computer engineer, I didn’t have the foggiest idea how I’d go about it, but I was somehow going to host my own online free database of Indian Supreme Court cases. The audacity of this optimism bears out one of Yochai Benkler’s insights about the changes wrought by the new “networked information economy” we inhabit. According to Benkler,

The belief that it is possible to make something valuable happen in the world, and the practice of actually acting on that belief, represent a qualitative improvement in the condition of individual freedom [because of NIE]. They mark the emergence of new practices of self-directed agency as a lived experience, going beyond mere formal permissibility and theoretical possibility.

Without my intending it, the archive itself suggested my next task. I had to clean up the text and extract metadata. This process occupied me for the longest time during the development of OpenJudis. I was very new to programming and had only just discovered the joys of Regular Expressions. More than my inexperience with programming techniques, however, it was the utter heterogeneity of reporting styles that took me a while to accustom myself to. Both opinion-writing and reporting styles had changed dramatically in the course of the fifty years my database covered, and this made it difficult to find patterns when extracting, say, the names of judges involved. Eventually, I had cleaned up the texts of the decisions and extracted an impressive (I thought) set of metadata, including the names of parties, the names of the judges, and the date the case was decided. To compensate for the absence of headnotes, I extracted names of statutes cited in the cases as a rough indicator of what each case might relate to. I did all this programming in PHP with the data housed in a MySQL database.
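The kind of extraction described above can be sketched with a few regular expressions. The original work was done in PHP; this sketch uses Python, and the header layout, field labels, party and judge names, and patterns are all hypothetical, invented for illustration. Real reports vary far more than this sample, which is exactly the difficulty the paragraph describes.

```python
import re
from datetime import datetime

# Hypothetical judgment text; the layout and names are invented.
SAMPLE = """PETITIONER: FIRST PARTY
RESPONDENT: STATE OF EXAMPLE
DATE OF JUDGMENT: 14/03/1975
BENCH: SHARMA, A.; VERMA, B.

The appellant relies on the Indian Penal Code, 1860 and on the
Code of Criminal Procedure, 1973. The Indian Penal Code, 1860 is
discussed at length below.
"""

# Labelled header fields, one per line.
FIELD = re.compile(
    r"^(PETITIONER|RESPONDENT|DATE OF JUDGMENT|BENCH):[ \t]*(.+)$", re.MULTILINE
)
# A run of capitalised words (optionally joined by "of"/"and"),
# followed by a comma and a four-digit year.
NAME_YEAR = re.compile(r"\b([A-Z][a-z]+(?:\s+(?:of|and|[A-Z][a-z]+))*),\s+(\d{4})\b")


def extract_metadata(text):
    """Pull parties, date, judges and cited statutes out of one judgment."""
    fields = {label: value.strip() for label, value in FIELD.findall(text)}

    statutes = []
    for name, year in NAME_YEAR.findall(text):
        name = " ".join(name.split())  # collapse line breaks inside a name
        if name.startswith("The "):
            name = name[4:]
        citation = f"{name}, {year}"
        # Keep only names that contain the word "Act" or "Code", once each.
        if {"Act", "Code"} & set(name.split()) and citation not in statutes:
            statutes.append(citation)

    raw_date = fields.get("DATE OF JUDGMENT")
    return {
        "petitioner": fields.get("PETITIONER"),
        "respondent": fields.get("RESPONDENT"),
        "date": datetime.strptime(raw_date, "%d/%m/%Y").date() if raw_date else None,
        "judges": [j.strip() for j in fields.get("BENCH", "").split(";") if j.strip()],
        "statutes": statutes,
    }


if __name__ == "__main__":
    print(extract_metadata(SAMPLE))
```

Each pattern handles one layout only. Covering fifty years of changing reporting styles would mean a growing family of such patterns, with fallbacks for cases that match none of them.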

And then I encountered my first major roadblock that threatened to jeopardize the whole operation: I ran my first full-text Boolean search on the MySQL database and the results took a staggering 20 minutes to display. I was devastated! More elaborate searches took longer. Clearly, this was not a model I could host online. Or do anything useful with. Nobody in their right mind would want to wait 20 minutes for the results of their search. I had to look for a quicker database, or, as I eventually discovered, a super fast, lightweight indexing search engine. After a number of failed attempts with numerous free search engine software programs, none of which offered either the desired speed or the search capability I wanted, I was getting quite desperate. Fortunately, I discovered Swish-e, a lightweight, Perl-based Boolean search engine which was extremely fast and, most importantly, free – exactly what I needed. The final stage of creating the interface, uploading the database, and activating the search engine happened very quickly, and sometime in the early hours of December 22nd, 2006, OpenJudis went live. I sent announcement emails out to several e-groups and waited for the millions to show up at my doorstep.
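The speed difference comes from indexing. An indexing engine does the expensive work once, building a map from each term to the documents that contain it, so that a Boolean query becomes a handful of set operations instead of a scan of every document. The sketch below shows that general idea in Python. It is not how Swish-e is implemented, and the function names and sample texts are assumptions made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_and(index, *terms):
    """Ids of documents that contain every one of the terms."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, *terms):
    """Ids of documents that contain at least one of the terms."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


def search_not(index, all_ids, term):
    """Ids of documents that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Hypothetical documents, invented for illustration only.
SAMPLE_DOCS = {
    1: "Bail granted under the Code of Criminal Procedure",
    2: "Bail refused; murder charge under the Penal Code",
    3: "Land acquisition compensation enhanced",
}

if __name__ == "__main__":
    index = build_index(SAMPLE_DOCS)
    # "bail AND NOT murder"
    print(search_and(index, "bail") & search_not(index, SAMPLE_DOCS, "murder"))
```

Building the index costs one pass over the collection. After that, each query touches only the posting sets for its own terms, which is why results can come back quickly even over tens of thousands of judgments.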

They never did. After a week, I had maybe a hundred users. In a month, a few hundred. I received some very complimentary emails, which was nice, but it didn’t compensate for the failure of “millions” to show up. Over the next year, I added some improvements:
1) First, I built an automatic update feature that would periodically check the Supreme Court website for new cases and update the database on its own.
2) In October 2007, I coded a standalone MS Windows application of the database that could be installed on any system running Windows XP. This made sense in a country where PC penetration is higher than Internet penetration. The Windows application became quite popular and I received numerous requests for CDs from different corners of the country.
3) Around the same time, I also coded a similar application for decisions of the Central Information Commission – the apex statutory tribunal for adjudicating disputes under the Right to Information Act.
4) In February 2008, both applications were included in the DVD of Digit Magazine – a popular IT magazine in India.

Unfortunately, in August 2008, the Supreme Court website changed its design so that decisions could no longer be downloaded serially in the manner I had been accustomed to. One can only speculate about what prompted this change – since no improvements were made to the actual presentation of the cases. The only thing that changed was that one could no longer download cases serially as I’d been doing. The new format was far more difficult for me to “hack” and I abandoned the attempt. My work left me with no time to attempt to circumvent the new format.

Fortunately at the same time, an exciting new project called IndianKanoon was started by Sushant Sinha, an Indian computer science graduate at Michigan. In addition to decisions of the Supreme Court, his site covers several high courts and links up to the text of legislation of various kinds. Although I have not abandoned plans to develop OpenJudis, the presence of IndianKanoon has allowed me to step back entirely from this domain – secure in the knowledge that it is being taken forward by abler hands than mine.

Predictions, Observations, Conclusions

I’d like to end this already-too-long post with some reflections, randomly ordered, about legal information online.
1) I think one crucial area commonly neglected by most LIIs is client-side software that enables users to store local copies of entire databases. The urgency of this need is highlighted in the following hypothetical about digital libraries by Siva Vaidhyanathan (from The Anarchist in the Library):

So imagine this: An electronic journal is streamed into a library. A library never has it on its shelf, never owns a paper copy, can’t archive it for posterity. Its patrons can access the material and maybe print it, maybe not. But if the subscription runs out, if the library loses funding and has to cancel that subscription, or if the company itself goes out of business, all the material is gone. The library has no trace of what it bought: no record, no archive. It’s lost entirely.

It may be true that the Internet will be around for some time, but it might be worthwhile for LIIs to stop emulating the commercial database models of restricting control while enabling access. Only then can we begin to take seriously the task of empowering users into archons.

2) My second observation pertains to interface and usability. I have long been planning to incorporate a set of features including tagging, highlighting, annotating, and bookmarking that I myself would most like to use. Additionally, I have been musing about using Web 2.0 to enable user-participation in maintenance and value-add operations – allowing users to proofread the text of judgments and to compose headnotes. At its most ambitious, in these “visions” of mine, OpenJudis looks like a combination of LII + social networking + Wikipedia.

A common objection to this model is that it would upset the authority of legal texts. In his brilliant essay A Brief History of the Internet from the 15th to the 18th century, the philosopher Lawrence Liang reminds us that the authority of knowledge that we today ascribe to printed text was contested for the longest period in modern history.

Far from ensuring fixity or authority, this early history of Printing was marked by uncertainty, and the constant refrain for a long time was that you could not rely on the book; a French scholar Adrien Baillet warned in 1685 that “the multitude of books which grows every day” would cast Europe into “a state as barbarous as that of the centuries that followed the fall of the Roman Empire.”

Europe’s non-descent into barbarism offers us a degree of comfort in dealing with Adrien Baillet-type arguments made in the context of legal information. The stability that we ascribe to law reports today is a relatively recent historical innovation that began in the mid-19th century. “Modern” law has longer roots than that.

3) While OpenJudis may look like quite a mammoth endeavor for one person, I was at all times intensely aware that this was by no means a solitary undertaking, and that I was “standing on the shoulders of giants.” They included the nameless thousands at the NIC who continue to design websites, scan and upload cases on the court websites – a Sisyphean task – and the thousands whose labor collectively produced the free software I used: Fedora Core 4, PHP, MySQL, Swish-e. And lastly, the nameless millions who toil to make the physical infrastructure of the Internet itself possible. Like the ground beneath our feet, we take it for granted, even as the tragic events in Haiti in recent weeks remind us to be more attentive. (For a truly Herculean endeavor, however, see Sushant Sinha’s IndianKanoon website, about which many ballads may be composed in the decades to come.)

It might be worthwhile for the custodians of LIIs to enable users to become derivative producers themselves, to engage in “practices of self-directed agency” as Benkler suggests. Without sounding immodest, I think the real story of OpenJudis is how the Internet makes it plausible and thinkable for average Joes like me (and better-than-average people like Sushant Sinha) to think of waging unilateral wars against publishing empires.

4) So, what is the impact that all this ubiquitous, instant, free electronic access to legal information is likely to have on the world of law? In a series of lectures titled “Archive Fever,” the philosopher Derrida posed a similar question in a somewhat different context: What would the discipline of psychoanalysis have looked like, he asked, if Sigmund Freud and his contemporaries had had access to computers, televisions, and email? In brief, his answer was that the discipline of psychoanalysis itself would not have been the same – it would have been transformed “from the bottom up” and its very events would have been altered. This is because, in Derrida’s view:

The archive . . . in general is not only the place for stocking and for conserving an archivable content of the past. . . . No, the technical structure of the archiving archive also determines the structure of the archivable content even in its coming into existence and in its relationship to the future. The archivization produces as much as it records the event.

The implication, following Derrida, is that in the past, law would not have been what it currently is if electronic archives had been possible. And the obverse is true as well: in the future, because of the Internet, “rule of law” will no longer observe the logic of the stable trajectories suggested by its classical “analog” commentators. New trajectories will have to be charted.

5) In the same book, Derrida describes a condition he calls “Archive fever”:

It is to burn with a passion. It is never to rest, interminably, from searching for the archive right where it slips away. It is to run after the archive even if there’s too much of it. It is to have a compulsive, repetitive and nostalgic desire for the archive, an irrepressible desire to return to the origin, a homesickness, a nostalgia for the return to the most archaic place of absolute commencement.

I don’t know about other readers of VoxPopuLII (if indeed you’ve managed to continue reading this far!), but for the longest time during and after OpenJudis, I suffered distinctly from this malady. I downloaded indiscriminately whole sets of data that still sit unused on my computer, not having made it into OpenJudis. For those in a similar predicament, I offer Borges’s quote with which I began this text, as a reminder of the foolishness of the notion of “Total Libraries.”

Prashant Iyengar is a lawyer affiliated with the Alternative Law Forum, Bangalore, India. He is currently pursuing his graduate studies at Columbia University in New York. He runs OpenJudis, a free database of Indian Supreme Court cases.

VoxPopuLII is edited by Judith Pratt. Editor in Chief is Rob Richards.
