
THE JUDICIAL CONTEXT: WHY INNOVATE?

The progressive deployment of information and communication technologies (ICT) in the courtroom (audio and video recording, document scanning, courtroom management systems), together with the requirement for paperless judicial folders pushed by e-justice plans (Council of the European Union, 2009), is quickly transforming the traditional judicial folder into an integrated multimedia folder, where documents, audio recordings and video recordings can be accessed, usually via a Web-based platform. This trend is leading to a continuous increase in the number and the volume of case-related digital judicial libraries, where the full content of each single hearing is available for online consultation. A typical trial folder contains: audio hearing recordings, audio/video hearing recordings, transcriptions of hearing recordings, hearing reports, and attached documents (scanned text documents, photos, evidence, etc.). The ICT container is typically a dedicated judicial content management system (court management system), usually physically separated and independent from the case management system used in the investigative phase, but interacting with it.

Most of the ICT deployment to date has focused on case management systems and ICT equipment in the courtrooms, with content management systems at different organisational levels (court or district). ICT deployment in the judiciary has reached different levels in the various EU countries, but the trend toward full e-justice is clearly in progress. Accessibility of judicial information, both in case registries (more widely deployed) and in case e-folders, has been strongly enhanced by state-of-the-art ICT technologies. Usability of the electronic judicial folders, however, is still limited by a traditional support toolset: information search is limited to text search, transcription of audio recordings (indispensable for text search) is still a slow and fully manual process, template filling is a manual activity, and so on. Part of the information available in the trial folder is not yet directly usable, but requires a time-consuming manual search. Information embedded in audio and video recordings, describing not only what was said in the courtroom, but also the specific trial context and the way in which it was said, still needs to be exploited. While the information is there, information extraction and semantically empowered judicial information retrieval still await proper exploitation tools. The growing amount of digital judicial information calls for the development of novel knowledge management techniques and their integration into case and court management systems. In this challenging context, a novel case and court management system has recently been proposed.

The JUMAS project (JUdicial MAnagement by digital libraries Semantics) started in February 2008, with the support of the Polish and Italian Ministries of Justice. JUMAS seeks to improve the usability of multimedia judicial folders — including transcriptions, information extraction, and semantic search — in order to provide users with a powerful toolset able to fully exploit the knowledge embedded in the multimedia judicial folder.

The JUMAS project has several objectives:

  • (1) direct searching of audio and video sources without a verbatim transcription of the proceedings;
  • (2) exploitation of the hidden semantics in audiovisual digital libraries in order to facilitate search and retrieval, intelligent processing, and effective presentation of multimedia information;
  • (3) fusing information from multimodal sources in order to improve accuracy during the automatic transcription and the annotation phases;
  • (4) optimizing the document workflow to allow the analysis of (un)structured information for document search and evidence-based assessment; and
  • (5) supporting a large scale, scalable, and interoperable audio/video retrieval system.

JUMAS is currently under validation in the Court of Wroclaw (Poland) and in the Court of Naples (Italy).

THE DIMENSIONS OF THE PROBLEM

In order to explain the relevance of the JUMAS objectives, we report some volume data related to the judicial domain context. Consider, for instance, the Italian context, where there are 167 courts, grouped in 29 districts, with about 1400 courtrooms. In a law court of medium size (10 courtrooms), about 150 hearings per courtroom are held during a single legal year, with an average duration of 4 hours. Considering that in approximately 40% of them only audio is recorded, in 20% both audio and video, while the remaining 40% have no recording, the multimedia recording volume in question is 2400 hours of audio and 1200 hours of audio/video per year. The dimensioning related to the audio and audio/video documentation starts from the hypothesis that multimedia sources must be acquired at high quality in order to obtain good results in audio transcription and video annotation, which in turn affect the performance of the retrieval functionalities. Following these requirements, one arrives at a storage requirement of about 8.7 megabytes per minute (MB/min) for audio and 39 MB/min for audio/video. This means that during a legal year, a court of medium size needs to allocate about 4 terabytes (TB) for its audio and audio/video material. Under these hypotheses, the overall volume generated by all the courts in the justice system — for Italy alone — in one year is about 800 TB. This shows how the justice sector is a major contributor to the data deluge (The Economist, 2010).
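These figures can be reproduced with a few lines of arithmetic, using the volumes and per-minute rates quoted above:

```python
# Back-of-the-envelope storage estimate for a medium-sized court,
# using the volumes and bit rates quoted in the text.
courtrooms = 10
hearings_per_courtroom = 150   # per legal year
hours_per_hearing = 4

total_hours = courtrooms * hearings_per_courtroom * hours_per_hearing  # 6000 h

audio_hours = 0.40 * total_hours   # audio-only hearings: 2400 h
av_hours = 0.20 * total_hours      # audio + video hearings: 1200 h

AUDIO_MB_PER_MIN = 8.7
AV_MB_PER_MIN = 39

storage_mb = audio_hours * 60 * AUDIO_MB_PER_MIN + av_hours * 60 * AV_MB_PER_MIN
storage_tb = storage_mb / 1_000_000   # about 4 TB per court per year
```

Scaling the roughly 4 TB per medium-sized court up to all 167 courts gives the order of magnitude of the 800 TB per year cited above.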

In order to manage such quantities of complex data, JUMAS aims to:

  • Optimize the workflow of information through search, consultation, and archiving procedures;
  • Introduce a higher degree of knowledge through the aggregation of different heterogeneous sources;
  • Speed up and improve decision processes by enabling discovery and exploitation of knowledge embedded in multimedia documents, in order to consequently reduce unnecessary costs;
  • Model audio-video proceedings in order to compare different instances; and
  • Allow traceability of proceedings during their evolution.

THE JUMAS SYSTEM

To achieve the above-mentioned goals, the JUMAS project has delivered the JUMAS system, whose main functionalities (depicted in Figure 1) are: automatic speech transcription, emotion recognition, human behaviour annotation, scene analysis, multimedia summarization, template-filling, and deception recognition.


Figure 1: Overview of the JUMAS functionalities

The architecture of JUMAS, depicted in Figure 2, is based on a set of key components: a central database, a user interface on a Web portal, a set of media analysis modules, and an orchestration module that allows the coordination of all system functionalities.

Figure 2: Overview of the JUMAS architecture

The media stream recorded in the courtroom includes both audio and video that are analyzed to extract semantic information used to populate the multimedia object database. The outputs of these processes are annotations: i.e., tags attached to media streams and stored in the database (Oracle 11g). The integration among modules is performed through a workflow engine and a module called JEX (JUMAS EXchange library). While the workflow engine is a service application that manages all the modules for audio and video analysis, JEX provides a set of services to upload and retrieve annotations to and from the JUMAS database.
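As a rough illustration of the data flow, an annotation of the kind described here is essentially a tag attached to a time interval of a media stream. The record below is a hypothetical sketch; the field names are illustrative and are not the actual JEX or database schema:

```python
from dataclasses import dataclass

# Hypothetical annotation record: a tag produced by an analysis module,
# attached to an interval of a courtroom media stream.
@dataclass
class Annotation:
    media_id: str    # identifier of the courtroom audio/video stream
    start_s: float   # start of the annotated interval, in seconds
    end_s: float     # end of the annotated interval, in seconds
    producer: str    # analysis module that produced the tag, e.g. "asr"
    tag: str         # the annotation payload itself

# e.g. an emotion tag attached to 6.5 seconds of a hearing recording
a = Annotation("hearing-042", 125.0, 131.5, "emotion", "anger")
```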

JUMAS: THE ICT COMPONENTS

KNOWLEDGE EXTRACTION

Automatic Speech Transcription. For courtroom users, the primary sources of information are the audio recordings of hearings and proceedings. In light of this, JUMAS provides an Automatic Speech Recognition (ASR) system (Falavigna et al., 2009; Rybach et al., 2009) trained on real judicial data coming from courtrooms. Two ASR systems have been developed: the first provided by Fondazione Bruno Kessler for the Italian language, and the second delivered by RWTH Aachen University for the Polish language. Currently, the ASR modules in the JUMAS system achieve 61% accuracy on the generated automatic transcriptions, and represent the first contribution for populating the digital libraries with judicial trial information. In fact, the resulting transcriptions are the main information resource: they are enriched by the other modules and can then be consulted by end users through the information retrieval system.

Emotion Recognition. Emotional states are an aspect of the knowledge embedded in courtroom media streams that can be used to enrich the content available in multimedia digital libraries. Enabling the end user to consult transcriptions together with their associated semantics is an important achievement: the end user retrieves an enriched written sentence instead of a “flat” one. Even if there is an open ethical discussion about the usability of this kind of information, it radically changes the consultation process, since sentences can assume different meanings according to the affective state of the speaker. To this purpose an emotion recognition module (Archetti et al., 2008), developed by the Consorzio Milano Ricerche jointly with the University of Milano-Bicocca, is part of the JUMAS system. A set of real-world human emotions obtained from courtroom audio recordings has been gathered for training the underlying supervised learning model.

Human Behavior Annotation. A further fundamental information resource is related to the video stream. In addition to emotional states identification, the recognition of relevant events that characterize judicial proceedings can be valuable for end users. Relevant events occurring during proceedings trigger meaningful gestures, which emphasize and anchor the words of witnesses, and highlight that a relevant concept has been explained. For this reason, the human behavior recognition modules (Briassouli et al., 2009, Kovacs et al., 2009), developed by CERTH-ITI and by MTA SZTAKI Research Institute, have been included in the JUMAS system. The video analysis captures relevant events that occur during the course of a trial in order to create semantic annotations that can be retrieved by judicial end users. The annotations are mainly concerned with the events related to the witness: change of posture, change of witness, hand gestures, gestures indicating conflict or disagreement.

Deception Detection. Discriminating between truthful and deceptive assertions is one of the most important activities performed by judges, lawyers, and prosecutors. In order to support their reasoning activities, whether weighing corroborating or contradicting declarations (lawyers and prosecutors) or judging the accused (judges), a deception recognition module has been developed as a support tool. The deception detection module, developed by the Heidelberg Institute for Theoretical Studies, is based on the automatic classification of the sentences produced by the ASR systems (Ganter and Strube, 2009). In particular, in order to train the deception detection module, the output of the ASR module has been manually annotated with the help of the minutes of the transcribed sessions. The knowledge extracted for training the classification module deals with lies, contradictory statements, quotations, and expressions of vagueness.

Information Extraction. The current amount of unstructured textual data available in the judicial domain, especially related to transcriptions of proceedings, highlights the necessity of automatically extracting structured data from unstructured material, to facilitate efficient consultation processes. In order to address the problem of structuring data coming from the automatic speech transcription system, Consorzio Milano Ricerche has defined an environment that combines regular expressions, probabilistic models, and background information available in each court database system. Thanks to this functionality, the judicial actors can view each individual hearing as a structured summary, where the main information extracted consists of the names of the judge, lawyers, defendant, victim, and witnesses; the names of the subjects cited during a deposition; the date cited during a deposition; and data about the verdict.

KNOWLEDGE MANAGEMENT

Information Retrieval. Currently, to retrieve audio/video materials acquired during a trial, the end user must manually consult all of the multimedia tracks. Identifying a particular position or segment of a multimedia stream, in order to view or listen to specific declarations, is possible only by remembering the time stamp when the events occurred, or by watching or hearing the whole recording. The amalgamation of automatic transcriptions, semantic annotations, and ontology representations allows us to build a flexible retrieval environment, based not only on simple textual queries, but also on broad and complex concepts. In order to define an integrated platform for cross-modal access to audio and video recordings and their automatic transcriptions, a retrieval module able to perform semantic multimedia indexing and retrieval has been developed by the Information Retrieval group at MTA SZTAKI (Darczy et al., 2009).

Ontology as Support to Information Retrieval. An ontology is a formal representation of the knowledge that characterizes a given domain, through a set of concepts and a set of relationships that hold among them. In the judicial domain, an ontology is a key element supporting the retrieval process performed by end users: text-based retrieval functionalities alone are not sufficient for finding and consulting transcriptions (and other documents) related to a given trial. A first contribution of the ontology component, developed by the University of Milano-Bicocca (CSAI Research Center) for the JUMAS system, is a query expansion functionality. Query expansion aims at extending the original query specified by end users with additional related terms. The whole set of keywords is then automatically submitted to the retrieval engine. The main objective is to narrow the search focus or to increase recall.
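The idea of query expansion can be sketched with a toy term map standing in for the ontology (the entries below are invented for illustration; they are not the JUMAS ontology):

```python
# Toy stand-in for the ontology: each term maps to related terms.
ONTOLOGY = {
    "witness": ["testimony", "deposition"],
    "verdict": ["sentence", "judgment"],
}

def expand_query(terms):
    """Extend the user's query terms with related terms; the whole
    expanded set is then submitted to the retrieval engine."""
    expanded = list(terms)
    for term in terms:
        expanded.extend(ONTOLOGY.get(term, []))
    return expanded
```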

User Generated Semantic Annotations. Judicial users usually manually tag some documents for purposes of highlighting (and then remembering) significant portions of the proceedings. An important functionality, developed by the European Media Laboratory and offered by the JUMAS system, relates to the possibility of digitally annotating relevant arguments discussed during a proceeding. In this context, the user-generated annotations may aid judicial users in future retrieval and reasoning processes. The user-generated annotations module included in the JUMAS system allows end users to assign free tags to multimedia content in order to organize the trials according to their personal preferences. It also enables judges, prosecutors, lawyers, and court clerks to work collaboratively on a trial; e.g., a prosecutor who is taking over a trial can build on the notes of his or her predecessor.

KNOWLEDGE VISUALIZATION

Hyper Proceeding Views. The user interface of JUMAS — developed by ESA Projekt and Consorzio Milano Ricerche — is a Web portal, in which the contents of the database are presented in different views. The basic view allows browsing of the trial archive, as in a typical court management system, to view general information (dates of hearings, name of people involved) and documents attached to each trial. JUMAS’s distinguishing features include the automatic creation of a summary of the trial, the presentation of user-generated annotations, and the Hyper Proceeding View: i.e., an advanced presentation of media contents and annotations that allows the user to perform queries on contents, and jump directly to relevant parts of media files.


Multimedia Summarization. Digital videos represent a fundamental information resource about the events that occur during a trial: such videos can be stored, organized, and retrieved in a short time and at low cost. However, considering the dimensions that a video resource can assume during the recording of a trial, judicial actors have specified several requirements for digital trial videos: fast navigation of the stream, efficient access to data within the stream, and effective representation of relevant contents. One possible solution to these requirements lies in multimedia summarization, which derives a synthetic representation of audio/video contents with a minimal loss of meaningful information. In order to address the problem of defining a short and meaningful representation of a proceeding, a multimedia summarization environment based on an unsupervised learning approach has been developed (Fersini et al., 2010) by Consorzio Milano Ricerche jointly with University of Milano-Bicocca.

CONCLUSION

The JUMAS project demonstrates the feasibility of enriching a court management system with an advanced toolset for extracting and using the knowledge embedded in a multimedia judicial folder. Automatic transcription, template filling, and semantic enrichment help judicial actors not only to save time, but also to enhance the quality of their judicial decisions and performance. These improvements are mainly due to the ability to search not only text, but also events that occur in the courtroom. The initial results of the JUMAS project indicate that automatic transcription and audio/video annotations can provide additional information in an affordable way.

Elisabetta Fersini has a post-doctoral research fellow position at the University of Milano-Bicocca. She received her PhD with a thesis on “Probabilistic Classification and Clustering using Relational Models.” Her research interest is mainly focused on (Relational) Machine Learning in several domains, including Justice, Web, Multimedia, and Bioinformatics.

VoxPopuLII is edited by Judith Pratt.

Editor-in-Chief is Robert Richards, to whom queries should be directed.

In this post, I will describe how natural language processing can help in creating computer systems dealing with the law.

A lot of computer systems are being designed to help users deal with legal texts — accessing, understanding, or applying them. [Editor’s Note: Michael Poulshock’s Jureeka is an example of a system that automates the application of legal texts.] Other systems — such as DALOS — are about creating legal texts, providing support for the writers, or simulating the effects of a text. Such systems are based on something more than “just” the legal text: there is XML mark-up, an OWL ontology, or a representation of the rules in SWRL or some programming language. This means that any piece of legislation that you want to use on your computer system needs to be translated into this computer representation.

We try to support this translation using natural language processing, so that (part of) the translation can be done by a computer. This automation should have a number of advantages. First of all, computers are cheaper than human experts, and automating the process should reduce the amount of resources needed for this task. Second, the models that are produced by automated processes are more consistent; human experts may treat two similar sentences differently, but a computer program will always behave the same. Finally, an approach that exploits the structure of the text ensures that there is a clear mapping between the elements of the computer model and the original text.

Natural Language Processing isn’t perfect yet: computers cannot understand human language. However, legal text is quite structured, and offers a lot more handholds for automated translation than, say, a novel.

Document Structure

The first step that we will have to undertake is to determine the structure of the document. Online services like Legislation.gov.uk and wetten.nl can make it easier to access legal documents because they can point you to the right part of the document (such as a chapter, paragraph, sentence, etc.). In most law texts, the structure has been made explicit using clear headings, like: Chapter 1 or Chapter 1. General Provisions. So, in order to detect structure, we need to detect these headings. This means we’ll need to search the document for lines starting with Chapter, followed by some designation (which we refer to as an index), and perhaps followed by some text – say, the title of the chapter. The index can be a lot of things: Arabic numbers (1, 2, 3, …), Roman numbers (I, II, III, …) or letters (a, b, c, …). Sometimes the index is an ordinal appearing before the chapter label: First chapter. It may even be a combination of several numbers and letters (5.2a). This is not a great problem, as we can more or less assume that whatever follows the word Chapter is the index.

The main problem with this approach is that there are also regular sentences that start with the word Chapter, and we need to separate those out. To do so, we can use some heuristics: A title will not end with a full stop (.); a heading will always start on a new line; etc.
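A minimal sketch of such a heading detector, combining the Chapter pattern with the full-stop heuristic (the regex is illustrative and covers only some of the index forms discussed above; it assumes the text has already been split into lines):

```python
import re

HEADING = re.compile(
    r'^Chapter\s+'                              # the label
    r'(?P<index>\d+[a-z]?|[IVXLCDM]+|[a-z])'    # Arabic, Roman, or letter index
    r'(?:\.\s*(?P<title>.*?))?'                 # optional title after the index
    r'\s*$'
)

def is_heading(line):
    """A line is a heading if it matches the Chapter pattern and
    does not end in a full stop (the heuristic from the text)."""
    m = HEADING.match(line.strip())
    if not m:
        return False
    title = m.group('title') or ''
    return not title.endswith('.')
```

For instance, "Chapter 1. General Provisions" is accepted, while a running sentence that merely starts with the word Chapter is rejected because it does not fit the label-index shape or ends with a full stop.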

This procedure to find the headings for chapters is repeated to find headings for sections, subsections, etc. Also, some sections (like numbered paragraphs or list items) will not have a full heading, but just a number, which we also need to recognise. Finally, some sections don’t have a heading but can be recognised because they start with a fixed language pattern. For example, a preamble in a (recent) Dutch Law — such as this — will start with: We, Beatrix, Queen of the Netherlands, Princess of Orange-Nassau, etc. etc. etc.

This procedure assumes that the input for the process is just text. Many documents will contain more information — such as textual markup — and headings may be more easily identified because they are marked as bold text, or even as headings. So, in situations where the input is made up of documents that are marked-up in a consistent way, it may be easier to recognise the patterns by taking layout into account in addition to text.

To actually find the patterns, we can use existing toolkits like GATE. After the patterns have been found, and the structure has been recognised, we can store it using a format such as MetaLex.

References

The second step is to detect the references from a portion of a law text to other portions of that text, or from a law text to other texts. References, like headings, follow a pattern. The simplest patterns are rather similar to headings; the text chapter 13 is probably a reference, unless it is part of a heading. Just like headings, basic references consist of a label (section, chapter, article) and an index (13, 13.2.1, XIII, m). And, just as with headings, we can find the references by looking for these patterns in the text.

However, this is only the simplest form of references. Besides references to a specific section, such as chapter 13, there are of course also references to a complete law. Some of these references follow a pattern as well, such as the law of October 1st, 2007. Most laws are cited by means of a citation title, though, such as the Railroad Act. Such titles can contain all kinds of words, and they don’t follow a strict pattern. Thus, such references cannot be detected using patterns. Instead, we use a list containing all (citation) titles to detect such references.

Other, more complex references contain multiple references in one statement, such as articles 13 and 14, or multiple levels: article 13, item e, of the Railroad Act, or even more complex combinations of the two: articles 13, item e, 14, item f, 15 and 16, items a and b, of the Railroad Act. Though more complex than the simple combination of label and index, these references still follow clear (sometimes recurring) patterns, and can be found in the text by searching for such patterns.
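These label-plus-index patterns can be sketched as a regular expression. The version below is a simplified illustration: it handles comma- and "and"-separated indices with optional items, but not citation titles or every index form mentioned above:

```python
import re

INDEX = r'(?:\d+[a-z]?|[IVXLCDM]+)'                  # 13, 5a, XIII, ...
LABEL = r'(?:[Aa]rticles?|[Cc]hapters?|[Ss]ections?)'
ITEM  = r'(?:,?\s*item\s+[a-z]\b)?'                  # optional "item e"
CONN  = r'(?:\s*,\s*(?:and\s+)?|\s+and\s+)'          # "," / ", and" / " and"

# A label followed by one or more indices, each optionally refined by an item.
REF = re.compile(LABEL + r'\s+' + INDEX + ITEM +
                 r'(?:' + CONN + INDEX + ITEM + r')*')

def find_references(text):
    return [m.group(0) for m in REF.finditer(text)]
```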

At the Leibniz Center for Law, we’ve created a parser based on these patterns, which has an accuracy of over 95%. For each reference found, we can construct a standardised name, and store it. With this technology, not only can we add hyperlinks to documents; we can also search for documents that refer to some specific document.

Classification

Now that we’ve got the structure and links in place, it’s time to start with the actual meaning of the text. Rather than tackling the entire text as a whole, we’ve selected sentences as the basic building blocks, and we attempt to create computer models for individual sentences first. Later, we can integrate those individual models into a complete model.

As a first step in creating the models, we start by assigning a broad meaning, or classification, to each sentence. Does the sentence give a definition for a concept, describe an obligation, or make a change in another law? In total, we distinguish fourteen different classes of sentences that appear in Dutch law texts. The next step in our automated approach is to assign a class to each sentence automatically.

To do so, we turn once again to language patterns. Legal language is rather strict, and legislative drafters don’t vary their language a lot — in a novel, variation may make for a more appealing text, but in a law, variation invites ambiguity. In fact, there are official Guidelines for Legislative Drafting that (among other things) reduce the variety of texts used. [Editor’s Note: For example, drafters of legislation in the U.S. House of Representatives Office of the Legislative Counsel have used Donald Hirsch’s Drafting Federal Law.] This means that for each of our classes, there’s a rather limited set of language patterns used. For example, definitions will look like one of these:

Under … is understood …

This law understands under … …

There are some variations in word order, but in the end, a small set of patterns is sufficient to describe all commonly used phrases. There is only one class of sentences where we cannot define a full set of patterns: obligations. In Dutch laws, obligations are often expressed without signal words like must or is obliged to. Instead, the obligations are presented as a fact:

No bodies are buried on a closed cemetery.

However, since obligations are the only class of sentences lacking all-encompassing patterns, we can assume that any sentence that does not match a pattern is an obligation.
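The pattern-plus-fallback logic can be sketched as follows. The patterns are illustrative English glosses of a few classes only (the real set covers fourteen classes and is in Dutch):

```python
import re

# Illustrative English glosses of a few class patterns.
CLASS_PATTERNS = [
    ("definition", re.compile(r'is understood|understands under')),
    ("change",     re.compile(r'is replaced by|is repealed')),
    ("permission", re.compile(r'\bmay\b|is allowed to')),
]

def classify(sentence):
    for label, pattern in CLASS_PATTERNS:
        if pattern.search(sentence):
            return label
    # No pattern matched: assume the sentence states an obligation.
    return "obligation"
```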

Based on the patterns found, we’ve created a classifier that attempts to sort sentences into these different classes. This classifier has an accuracy of 91%, and we expect that this can be improved a bit further.

(As a side note: For classification tasks such as these, a machine learning approach is often preferred; see, e.g., here. With such an approach, you provide the computer, not with patterns, but with a bunch of sample sentences. The computer will then extract its own patterns from those sentences, and use these to classify any new sentences. We’ve tried this approach as well (using the toolkit WEKA), and reached similarly accurate results.)

Modelling

Having classified the sentences, we now want to create models of the sentences. In essence, this means breaking down each sentence into smaller components and defining relationships between them. In some cases, the patterns used to classify the sentence already give us sufficient information to break up the sentence. Suppose we have a sentence like:

In article 7.12, sub one, second sentence, «article 7.3b» is replaced by: article 7.3c.

We classify this sentence as a replacement because of the phrase “is replaced by”. We can then also conclude that the text between angle quotes is the text to be replaced, the text following the colon is the replacing text, and the reference preceding it (which we’ve already detected) is the location where the replacement should take place.
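A sketch of a parser for this replacement pattern, assuming exactly the sentence shape shown above (real texts require more variants):

```python
import re

REPLACEMENT = re.compile(
    r'^In (?P<location>.+?), '        # reference to the place of the change
    r'«(?P<old>.+?)» is replaced by: '
    r'(?P<new>.+?)\.?$'
)

def parse_replacement(sentence):
    """Return the location reference, the text to be replaced, and the
    replacing text, or None if the sentence does not fit the pattern."""
    m = REPLACEMENT.match(sentence)
    return m.groupdict() if m else None
```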

This works fine for sentences that are somehow “about” the law. But for sentences that deal with some other domain, such as taxes, traffic, or commerce, we cannot predict all the elements. These sentences could be about anything — and statutes are full of such sentences. For such sentences, we need to follow a generic method. The aim is to model rules as a situation or action that is allowed or not allowed, similar to the models created in the HARNESS system of the ESTRELLA project. For example, for an obligation, we assume that the sentence describes some action that must be done. We try to identify who should be doing the action, and what other elements are involved. Thus, for the sentence:

Our Minister issues a warrant to the negligent person.

we would like to extract the following information:

Obligation
Action: Issue
Agent: Our Minister
Patient: Warrant
Recipient: Negligent person

(Such a table, or frame, is not the same as a computer model, but has all the elements needed to create one.)

Now, identifying these different elements of the sentence (agent, patient, recipient) is something that computational linguists have already worked on for a long time, which means we do not have to start from scratch. Instead, we can use existing parsers to do much of the work for us. For our Dutch laws, we use the Alpino parser. Such a parser will create a parse tree of a sentence. In this parse tree, the sentence will be split up into parts. The parser can identify which part is the subject, the direct object, the indirect object, etc. Based on this information, we can determine the agent, patient, and recipient (so-called semantic roles). In a sentence with a verb in the active voice, the subject is the agent, the direct object is the patient, and the indirect object is the recipient. Furthermore, the parser will determine the relationship between words, such as an adjective that modifies a noun. This information, too, helps us to make more accurate models.
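For an active-voice sentence, the mapping from syntactic to semantic roles can be sketched as follows; the input dictionary is a hand-made stand-in for the relevant parts of an Alpino parse tree:

```python
def to_frame(classification, parse):
    """Map syntactic roles to semantic roles, assuming active voice."""
    return {
        "type": classification,
        "action": parse["verb"],
        "agent": parse.get("subject"),          # active voice: subject = agent
        "patient": parse.get("direct_object"),
        "recipient": parse.get("indirect_object"),
    }

# Simplified parse of: "Our Minister issues a warrant to the negligent person."
parse = {
    "verb": "issue",
    "subject": "Our Minister",
    "direct_object": "warrant",
    "indirect_object": "negligent person",
}
frame = to_frame("obligation", parse)
```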

We start out with the output of these parsers, and then try to extract all terms that have some more significance. If we want an application to compute whether or not a situation is allowed, a word like car can be treated in a generic way, but terms like allowed and not allowed need special attention.

To Be Continued…

We still need to refine the method for making these models, and evaluate the results. After that, the individual models will need to be merged. But even as things stand now, we think these tools will help with getting legal text from paper into your computer systems.

[Editor’s Note: For more information about this topic, please see Dr. Adam Wyner’s post, Weaving the Legal Semantic Web with Natural Language Processing.]

Emile de Maat is a researcher at the Leibniz Center for Law (University of Amsterdam). His research focuses on the automatic extraction of metadata and meaning from legal sources.


Question: Is there a good reason why judges should not be blogging their opinions?

Follow my thinking here.

I, like many librarians, love books. By that I mean I love physical books. I love the feel of paper in my hand. I love the smell of books. When I attended library school, there was no doubt in my mind that I would work in a place surrounded by shelf after shelf of beautiful books. I was confident that I would be able to transfer that love of books to a new generation.

That’s not how things turned out. Without recounting exactly how I got here, I should say that I am a technology librarian, and have been since even before I graduated library school. Technology is where I found my calling, and where libraries seem to need the most help. As I delve deeper into the world of library technology, particularly in the academic setting, I am increasingly forced to confront an uncomfortable reality: Print formats are inferior to electronic. And in some of my darker moments, I may even go so far as to echo the comments of Jeff Jarvis in his book “What Would Google Do” when he writes: “print sucks.”

On page 71, talking about the burden of physical “stuff,” Jarvis writes:

“It’s expensive to produce content for print, expensive to manufacture, and expensive to deliver. Print limits your space and your ability to give readers all they want. It restricts your timing and the ability to keep readers up-to-the-minute. Print is already stale when it’s fresh. It is one-size-fits-all and can’t be adapted to the needs of each customer. It comes with no ability to click for more. It can’t be searched or forwarded. It has no archive. It kills trees. It uses energy. And you really should recycle it, though that’s just a pain. Print sucks. Stuff sucks.”

In this paragraph, Jarvis may as well have been talking about the current state of online legal information. Although we may not have figured out the magic bullets of authenticity and preservation, the fact remains that print is a burden. In many cases, it is a burden to our governments, and our libraries.

There are good reasons to proceed cautiously towards online legal information. However, the most significant barriers to accepting new modes of publishing official legal information online, like judges’ blogging opinions, may be cultural and political. In the end, law librarians and other legal professionals can’t allow our own nostalgia and habit to stand in the way of changes that can, should, and must happen.

AALL Working Groups

As many readers may know, the American Association of Law Libraries (AALL) began forming state working groups earlier this year. The purpose of those working groups was to “help AALL ensure access to electronic legal information in your state.” This is certainly a worthwhile goal, and one I obviously support. But the PDF document online, calling for formation of these working groups, sends a mixed message.

The very first duty of each working group is to “take action to oppose any plan in your state to eliminate an official print legal resource in favor of online-only unless the electronic version is digitally authenticated and will be preserved for permanent public access, or to charge fees to access legal information electronically. This is an increasingly common problem as states respond to severe budget cuts.”

Perhaps it’s just the phrasing of the document that bothered me. Rather than even providing guidance to states planning to eliminate print legal resources, AALL has set as its default position the opposition to any such plan.

In fairness, I note that the document hints that online-only legal resources might be acceptable if states don’t charge for them, or if such resources meet the rather complex standards laid out in the Association of Reporters of Judicial Decisions’ Statement of Principles.

The Association of Reporters of Judicial Decisions (ARJD) published its Statement of Principles: “Official” On-Line Documents in February 2007 and revised it in May 2008. Most tellingly, in Principle 3 of the Statement they write: “Print publication, because of its reliability, is the preferred medium for government documents at present.”

Later in the document we find out why print is so reliable. Talking about electronic versions, the ARJD says they should not be considered official unless they are “permanent in that they are impervious to corruption by natural disaster, technological obsolescence, and similar factors and their digitized form can be readily translated into each successive electronic medium used to publish them.”

Without question, electronic material must be able to survive a natural disaster. The practice of storing information on a single server or keeping all backups in the same facility could be problematic. But emerging trends and best practices could help safeguard against these problems. In addition, programs like LOCKSS (Lots of Copies Keep Stuff Safe) can help alleviate some of these concerns by making sure many copies of each digital item exist at multiple geographic locations.

Also, concerns about digital format obsolescence have largely been overstated. PDF documents are not going anywhere anytime soon. Even conservative estimates establish PDF as a reliable format for the foreseeable future.

HTML may be no different. Consider that the very first Web document, Links and Anchors, is almost valid HTML5. Nearly 20 years later, that document is compatible with modern Web browsers.

On the other side of the equation, is print impervious to natural disaster, or even technological obsolescence? Of course not. At Yale, with our rare books library and large historical collection, I have witnessed firsthand the damage time can do to a physical book. Even more importantly, books in the last hundred years have been published so cheaply that they may fall apart even sooner than books published centuries ago.

Print and Electronic Costs

The reality is that moving to online-only legal information is a good thing for everyone involved in producing and consuming such information. The burden of print is not limited to the costs forced upon states that produce it; that burden is also borne by libraries and citizens who consume it.

As mentioned above respecting the AALL working group document, many states are already looking at going online-only to cut costs, and why shouldn’t they? With current budget situations across the country being what they are, printing costs being particularly high, and electronic publishing costs being so low, of course states are looking at saving money by ending needless printing.

But libraries would also benefit from the cost savings of governments’ moving to electronic formats. Not only do libraries currently have to subsidize printing costs by paying for the “official” print copies of legal materials; libraries also have to pay for the shelf space, as well as manpower to process incoming material and place it on the shelf, and may also have to pay additional costs for preserving the physical material. Not to mention the fact that we may pay for additional services that furnish access to the exact same material in an electronic format.

The costs involved in dealing with print legal resources are well known to most librarians. So why aren’t we clamoring for governments to publish online-only legal information?

Officialness, Authenticity, Preservation, and Citeability

Of course there are genuine concerns about online-only legal information. The big sticking points seem to be (in no particular order) officialness, authenticity, preservation, and citeability. Each issue is worthy of, and has been the subject of, much discussion.

Officialness may be, in some ways, both the easiest and the most difficult hurdle for online-only legal information to leap. To make an online version of legal material official, an appropriate authoritative body need only declare that version “official.” The task seems simple enough.

The more difficult part may be political. With organizations like AALL and ARJD currently opposing online-only options, that action may be politically difficult. Persuading lawyers, judges, and legislatures to approve such a declaration could be even more difficult. Can you imagine a bill, regulation, or some other action making a blog the “official” outlet for a particular court’s opinions?

The question of authenticity is more difficult to deal with from a technological perspective, although there has been interesting work done with respect to PDFs, electronic signatures, and public and private keys. The Government Printing Office (GPO) has done a great job leading the way in the area of authenticity: http://www.gpoaccess.gov/authentication/. The new Legislation.gov.uk site unveiled recently has taken a different approach from the GPO’s. As John Sheridan has written in an earlier post, at the moment The U.K. National Archives are not taking any steps towards authenticating the information on the Legislation.gov.uk site, but they recognize the need to address the issue at some point. John Joergensen at Rutgers-Camden has taken yet another approach. And Claire Germain, in a recent paper about authentication practices respecting international legal information (pdf), states that those practices vary throughout the world. Thus the prickly question of authenticating online legal information is an issue that’s not going away any time soon.
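To make the authentication problem concrete, here is a minimal sketch of the integrity-checking half of such schemes, using only a hash digest (the docket text is invented; real systems such as the GPO’s additionally sign the digest with a private key so readers can verify the source, not just that the content is unchanged):

```python
import hashlib

def fingerprint(document: bytes) -> str:
    """Return a SHA-256 digest that can be published alongside a document.

    This shows only the integrity half of authentication; official schemes
    also sign the digest with a private key so readers can verify *who*
    vouches for the document, not merely that it is unaltered.
    """
    return hashlib.sha256(document).hexdigest()

# Hypothetical opinion text, invented for this example:
official = b"Opinion of the Court, Docket No. 10-1234 ..."
published_digest = fingerprint(official)

# A reader recomputes the digest over a downloaded copy:
downloaded = b"Opinion of the Court, Docket No. 10-1234 ..."
assert fingerprint(downloaded) == published_digest  # copy is intact

tampered = b"Opinion of the Court, Docket No. 10-1235 ..."
assert fingerprint(tampered) != published_digest    # alteration detected
```

Any one-character change to the text produces a completely different digest, which is what makes this the foundation that signature-based approaches build on.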

AALL and ARJD have made a big deal about preservation of online legal information, an issue that’s important for librarians, too. Unfortunately, this is another area where no good answer exists to guide us. As Sarah Rhodes wrote earlier this year, “our current digital preservation strategies and systems are imperfect – and they most likely will never be perfected.”

The Library of Congress National Digital Information Infrastructure & Preservation Program (NDIIPP) has some helpful resources. The Legal Information Preservation Alliance (LIPA) also provides some good guidance in this area. However, many librarians are still reluctant to accept that digital preservation practices may enable us to end our reliance on print.

A similar reluctance can be seen in resistance to the Durham Statement, which — though directed at law reviews — also says something about other kinds of online legal information. Most notably, Margaret Leary of the University of Michigan chose not to sign the Durham Statement, and discussed her decision to continue to rely on print at a recent AALL program. In a listserv posting quoted in Richard Danner’s recent paper, Ms. Leary asserted: “I do not agree with the call to stop publishing in print, nor do I think we have now or will have in the foreseeable future the requisite ‘stable, open, digital formats’.” Similarly, Richard Leiter explains that he signed the Durham Statement with an asterisk because of the statement’s call for an end to the printing of law reviews.

What constitutes ‘stable, open, digital formats’ for the purposes of satisfying some librarians is unclear. As I mentioned earlier, a number of digital formats currently fit this description. This makes me think that there’s something else going on here, a resistance to abandoning print for other reasons.

Citeability also becomes an issue as print legal information disappears. If there is no print reporter volume in which an opinion is issued, then how would one cite to an opinion (setting aside for a moment Lexis and Westlaw citations)?

However, efforts towards implementing “medium-neutral legal citation formats” have already been made. According to Ivan Mokanov’s recent VoxPopuLII post, most citations in Canada are of a neutral format. In the United States, LegisLink.org has made an effort to improve online citations, as Joe Carmel describes in his recent post. Work on URN:LEX and other standards has resulted in some progress towards dealing with the citeability issue. Organizations like the AALL Electronic Legal Information Access & Citation Committee also deserve credit for taking this on. [Editor’s Note: Those organizations have produced universal citation standards — such as the AALL Universal Citation Guide — which have been adopted by a number of U.S. jurisdictions.] Even The Bluebook supports alternative citation formats. For example, rule 10.3.3, “Public Domain Format,” specifies how to cite to a public domain or “medium-neutral format.” The Bluebook even goes so far as to allow citation in a jurisdiction’s specified format.
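Medium-neutral citations are also friendlier to software. As a hedged sketch (the pattern below is illustrative only; it does not implement any particular jurisdiction’s rule or the URN:LEX standard), a citation of the form year / court / sequential opinion number / paragraph can be parsed mechanically:

```python
import re

# Illustrative pattern for citations like "2010 ME 45, ¶ 12":
# year, court abbreviation, sequential opinion number, optional paragraph.
# Real jurisdiction rules vary; this is a sketch, not an official grammar.
NEUTRAL_CITE = re.compile(
    r"(?P<year>\d{4})\s+(?P<court>[A-Z]{2,})\s+(?P<number>\d+)"
    r"(?:,?\s*(?:¶|para\.?)\s*(?P<paragraph>\d+))?"
)

def parse_cite(text: str):
    """Return the components of the first neutral citation found, or None."""
    m = NEUTRAL_CITE.search(text)
    return m.groupdict() if m else None

print(parse_cite("2010 ME 45, ¶ 12"))
# {'year': '2010', 'court': 'ME', 'number': '45', 'paragraph': '12'}
```

A volume-and-page citation, by contrast, carries no structure a program can resolve without a lookup table tied to a commercial reporter.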

But despite all this work, nothing has yet stuck.

The Next Step

One thing you’ll notice respecting all of these issues is that they are currently unsettled. While AALL and ARJD have both suggested that they would look favorably on online-only legal information if it were official, authenticated, and preserved (they do not mention citeability), there is no indication of when we will reach a level of achievement on these issues that would be satisfactory to these organizations. Can governments, libraries, and citizens afford to wait?

Asking states to continue to bear the burden of publishing material in print as they run out of funding, and libraries to bear the expense of preserving that print, is irresponsible. While we might not have all of the answers now, we certainly have enough to move forward in an intelligent manner.

The National Conference of Commissioners on Uniform State Laws (NCCUSL) has been working on an Authentication and Preservation of State Electronic Legal Materials Act. [Editor’s Note: The Chair of the Act’s Drafting Committee is Michele L. Timmons, the Revisor of Statutes for the State of Minnesota, and its Reporter is Professor Barbara Bintliff of the University of Texas School of Law.] According to the Study Committee’s Report and Recommendations for the Act’s Drafting Committee, the goal of the draft should be to “describ[e] minimum standards for the authentication and preservation of online state legal materials.” This seems like an appropriate place to start.

Rather than setting unrealistic or vague expectations, the minimum standards provided by the draft act seem to allow some flexibility for how states could address some of these issues. As opposed to working towards a “stable and open digital format,” which seems more a moving target than an attainable goal, the draft act sets forth an outline for how states can get started with publishing official and authentic online-only legal information. While far from finished, the draft act appears to be a step in the right direction.

What Is the Real Issue?

I think the real sticking point on this matter is mental or emotional. It comes from an uneasiness about how to deal with new methods of publishing legal information. For hundreds of years, legal information has been based in print. Even information available on the Lexis and Westlaw online services has its roots in print, if not full print versions of the same material. It’s as if the lack of a print or print-like version will cause librarians to lose the compass that helps us navigate the complex legal information landscape.

Of course, publishing legal information electronically brings its own challenges and costs for libraries. Electronic memory and space are not free, and setting up the IT infrastructure to consume, make available, and preserve digital materials can be costly. But in the long run, dealing with electronic material can and will be much easier and less costly for all involved, as well as giving greater access to legal information to the citizens who need it.

So Judges Blogging?

Question: Is there a good reason why judges should not be blogging their opinions?

Although he was the co-chair of the ARJD committee that produced the Statement of Principles, even Frank Wagner, the outgoing U.S. Supreme Court reporter of decisions, acknowledges that “budgetary constraints may eventually force most governmental units to abandon the printed word in favor of publishing their official materials exclusively online.” He also recognizes that the GPO’s work in this area may put an end to the printed U.S. Reports sooner than other “official publications.”

So if an appropriate authority were to make them official, some form of authentication were decided on, and methods of preservation and citation were taken into account, would you feel comfortable with judges’ blogging their opinions?

We have to get over our unease with new formats for publishing online legal information. We have to stop handcuffing governments and libraries by placing unrealistic and unattainable expectations on them for publishing online legal information. We have to prepare ourselves for a world where online is the only outlet for official legal information.

I still enjoy taking a book off the shelf and reading. I enjoy flipping through and browsing the pages. But nostalgia and habit are not valid strategies for libraries of the future.

Jason Eiseman is the Librarian for Emerging Technologies at Yale Law School. He has experience in academic and law firm libraries working with intranets, websites, and technology training.


In this post, I’d like to connect a specific area of my expertise—electronic voting (e-voting)—to issues of interest to the legal information community. Namely, I’ll talk about how new computerized methods of voting might affect elements of direct democracy: that is, ballot questions, including referenda and recall. Since some readers may be unfamiliar with issues related to electronic voting, I’ll spend the first two parts of this post giving some background on electronic voting and internet voting. I’ll then discuss how ballot questions change the calculus of e-voting in subtle ways.

Background on E-voting

The images of officials closely scrutinizing punchcard ballots during the 2000 U.S. presidential election tend to give the mistaken impression that if we could just fix the outdated technology we used to cast ballots, a similar dispute wouldn’t happen again. However, elections are about “people, processes, and technology”; focusing on just one of those elements disregards the fact that elections are complex systems. Since 2000, the system of election administration in the United States has seen massive reform, with a lot of attention paid to issues of voting technology.

In the years after 2000, this system that had mostly “just worked” in previous decades was now seen as having endemic, fundamental problems. At the turn of the 20th century, frauds involving ballot box-stuffing, vote-buying, and coercion were the major policy concern and the principal focus of reform. In contrast, at the turn of the 21st century, the prevalence of close, contentious contests—e.g., see this example of an analysis of New Jersey elections—often put the winning margin well within the “error” or “noise” level associated with ballot casting methods.

In 2002, Congress passed the Help America Vote Act (HAVA), which provided the first federal funding for election administration, created the Election Assistance Commission (EAC), and established the first federal requirements for voting systems, provisional balloting, and statewide voter registration databases. As my colleague Aaron Burstein and I argue in an article currently in preparation, in terms of advancing the state-of-the-art in voting technology, HAVA conspicuously focused on providing funds that had to be spent quickly on types of voting systems that were then available on the market or soon would be available. The systems on the market at the time were invariably of a specific type: “Direct Recording Electronic” (DRE) voting machines, in which the record of a voter’s vote is kept entirely in digital form.

In the years since the passage of HAVA, computer science, usability, and information systems researchers have highlighted a number of shortcomings with this species of voting equipment. Three principal critiques voiced by this community are:

  • There is no proper way to do a recount on these systems. That is, if a race is close and a candidate calls for a recount, in most cases this will mean simply rerunning the software that added up all the digital votes; the exact same number would result. DREs do not keep a record that captures the voter’s intent; rather, these systems “collapse” voter intent into a digital representation kept in digital memory. In other types of systems, such as optical scan systems—where voters fill in bubbles on paper ballots which are then scanned in for counting—the voter’s marks are directly preserved with the ballot. In a traditional recount with non-DRE systems, election staffers interpret these marks made by voters and come up with a count based on how a trained human would interpret ballots. This is not possible with DRE voting systems and lever machines, which do not preserve individual records of voter intent.
  • There is no way to know if the software that runs DREs is correctly recording votes, and we’ve seen numerous cases of software errors, including errors that have resulted in lost ballots. However, the addition of a “voter-verified paper record” (VVPR)—that is, an independent record that the voter can verify before casting his or her vote—alleviates not only this problem of recounting records that show voter intent, but also the myriad of problems associated with software flaws and “malware” (malicious software) in these machines. If voters check these records and agree that the records reflect how they want to vote, this renders the paper records “independent” of the system’s software, and the records can safely be audited and/or recounted if there do turn out to be software-based problems.
  • In a number of state-level technical reviews of voting systems, of which I have been a part in California and Ohio, we have found serious vulnerabilities in each voting system we examined. These findings leave little confidence in the equipment that was purchased by election officials in the wake of the 2000 election. Moreover, this was a clear indication that the systems for certifying this equipment at the state and federal level had serious shortcomings that have allowed sub-standard systems into the field.
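The difference between rerunning a DRE tally and auditing voter-verified paper records can be made concrete with a toy simulation (the contest, the 2% software flaw, and all counts below are invented for illustration):

```python
import random

# Toy model: each voter's intent is recorded twice, once by (possibly
# buggy) software and once on a voter-verified paper record (VVPR).
random.seed(1)
intents = [random.choice(["A", "B"]) for _ in range(1000)]

def buggy_scan(intent):
    """Hypothetical software flaw: silently drops 1 in 50 votes for A."""
    if intent == "A" and random.random() < 0.02:
        return None  # lost ballot
    return intent

digital = [buggy_scan(i) for i in intents]
paper = list(intents)  # VVPRs preserve intent independently of the software

# "Recounting" a DRE means re-adding the same digital records:
recount_1 = sum(1 for v in digital if v == "A")
recount_2 = sum(1 for v in digital if v == "A")
assert recount_1 == recount_2  # identical by construction; reveals nothing

# Auditing the paper records against the digital tally can expose the flaw:
paper_count = sum(1 for v in paper if v == "A")
print(paper_count - recount_1)  # number of votes for A lost by the software
```

Rerunning the addition can never disagree with itself; only an independent record of voter intent gives the audit something to check against.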

Now, in 2010, many states have passed laws requiring auditable voting systems, and increasing numbers of election officials are moving from DRE-based systems to optical scan systems. Despite these reforms which have, in my opinion, moved e-voting in the right direction, the specter of internet voting looms large.

Internet Voting

During public talks I am often asked, “When will we vote over the internet?” People have an intuitive feeling that since they’re doing so much online, it makes sense to vote online, too. However, we need to recognize what kinds of activities the internet is good for, and voting is perhaps the last thing we want to happen online.

Things that we do online now that require high security, such as banking, are not anonymous processes; there is a named record associated with each transaction. Yet the secret ballot is a very important part of removing coercion and vote-buying as possibly corrupting influences on the vote. (See this superb article by Allison Hayward: “Bentham & Ballots: Tradeoffs between Secrecy and Accountability in How We Vote”.)

Moreover, banks and other online establishments can purchase insurance to contain the risk of losses due to online fraud (although there are some indications that even this is becoming more difficult due to the increased sophistication and magnitude of online banking fraud). But there is still no firm that offers insurance for computer intrusions and attacks, or simply just errors, because it is very difficult to estimate the magnitude and likelihood of such losses. The “value” of a vote is very different from the value of currency: the value of your vote doesn’t just matter to you as a voter; it also matters to other voters. (“Vote dilution,” for example, is when processes conspire to render one voter’s vote more or less effective than another’s.) Also, it can be very hard to estimate the fitness of a given piece of software; said another way, we haven’t yet figured out how to write impervious or bug-free software.

Finally, as I mention above, the voting systems that the market has responded with in recent years leave a lot to be desired in terms of security, usability, and reliability. Internet voting essentially takes systems like those and adds the complications of sending voted electronic ballots over the public internet from users’ personal computers—neither of which are reliable or secure—with no VVPR.

We are far from the day in which highly secure processes can happen over the public internet from users’ computing devices. We will have to make significant technical advances in the security of personal computing devices and in network security before we can be sure that internet votes can be cast in a manner that approaches the privacy and security afforded by polling place voting.

Unfortunately, most designs for internet voting systems are un-auditable. Since these systems lack a paper trail, it is impossible to tell whether the voted ballot contents received at election headquarters correspond with what the voter intended to vote. The answer here would seem to be cryptographic voting systems, where the role of a paper trail is played by cryptographically secure records that can be transmitted over the network. Systems of this type have become increasingly sophisticated, easy to use, and easy to understand, and have even been used in a binding municipal election here in the U.S.

E-voting and Direct Democracy

Elections don’t just elect people in the U.S.; in many states, voters vote on elements of direct democracy, specifically ballot referenda and recall questions. However, we should be even more concerned about opportunities to game these kinds of contests — and, equivalently, about how errors introduced by ballot casting methods for ballot questions could affect how we govern — than we are about the risks of voting fraud in candidate races.

It’s difficult to compare the importance of candidate elections to that of ballot questions. Certainly, ballot questions can be as simple as asking the voters to approve of city ordinances, such as increasing the amount of square footage for single-family homes. And, of course, every fourth year we elect the President of the United States, which unequivocally changes how our entire country is governed and operates. In between these two extremes are elections that many people don’t vote on, from judicial elections to highly contentious ballot propositions (like Proposition 8 in California), or transportation tax bonds that can result in hundreds of millions of dollars for local firms.

Can we compare the risks involved with candidate elections and ballot questions? In some sense, being able to bound the risk of fraud or error causing the election of the wrong candidate is similar to that resulting in “electing” the wrong decision in a ballot question; it’s equally difficult to compare the relative importance of elected contests and to decide on some level of likelihood that a contest runs a high risk of being targeted for attack or might be especially sensitive to errors in the count. Polling may help, but it’s far from perfect. However, ballot questions have one aspect that should make this process a bit easier: rather than having the considerable uncertainty of what policies a potential candidate may institute once elected, ballot measures are concrete policy proposals or actions where we know very well what will happen if they are passed. This would seem to make ballot questions more attractive to attack; the uncertainty involved with what candidates may do is not present, so the net benefit of a successful attack, all other things being equal, should be larger.

Are there special risks involved with ballot questions that we should be concerned about in the face of electronic voting methods? Certainly. First, ballot propositions are invariably at the end of the ballot; hence, they’re referred to as “down-ticket” contests. Post-election auditing, where a subset of ballot records is hand-counted as a check against the electronic results, often doesn’t include ballot questions. To be sure, states like California require post-election auditing of all contests on the ballot. But there are many states that do not do comprehensive election auditing; they either don’t do any auditing at all or focus their auditing attention on top-ticket contests on the ballot (for more, see Sections 1 and 2 of: “Implementing Risk-Limiting Post-Election Audits in California”).
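A back-of-the-envelope calculation shows why margins matter for auditing (this is a simplification for illustration, not one of the actual risk-limiting audit methods cited in “Implementing Risk-Limiting Post-Election Audits in California”): if overturning a contest with margin m would require misrecording at least m/2 of the ballots, a uniform random sample of n ballots misses all of them with probability at most (1 − m/2)^n, which suggests a rough sample size.

```python
import math

def naive_sample_size(margin: float, risk_limit: float) -> int:
    """Back-of-the-envelope audit sample size (illustration only).

    Assumes overturning a contest with (diluted) margin `margin` requires
    misrecording at least margin/2 of all ballots, and asks how many
    uniformly sampled ballots are needed so the chance of seeing none of
    them falls below `risk_limit`. Real risk-limiting audits are
    statistically more careful than this sketch.
    """
    error_fraction = margin / 2
    return math.ceil(math.log(risk_limit) / math.log(1 - error_fraction))

# A close contest (2% margin) needs far more scrutiny than a blowout (20%):
print(naive_sample_size(0.02, 0.05))  # 299 ballots
print(naive_sample_size(0.20, 0.05))  # 29 ballots
```

Down-ticket ballot questions with narrow margins are exactly the contests that demand the largest samples, and they are the ones audits most often skip.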

While we have seen little evidence of fraud using newer computerized voting systems compared to the massive record of paper ballot fraud in our country’s past, this should serve as little comfort. Just as in finance, “past results are no indication of future performance”; the same holds for adversarial security. That we haven’t seen much evidence of computer fraud involving voting systems doesn’t mean it isn’t happening and doesn’t mean it can’t happen. Multi-million dollar ballot questions and constitutional amendments are exactly the kinds of law-making activities in which I expect to see the first evidence of outright computerized election hacking. This rings especially true if we start using the public internet for casting ballots. While foreign interests or hackers out of the reach of US law enforcement might certainly be interested in top-ticket candidate contests, the opportunities to affect state and local law as well as economic interests embodied in ballot questions would seem to be especially attractive.

Where Should We Go From Here?

To be sure, there is a lot of momentum behind moving parts of our elections processes online. In some cases, such as online voter registration, the security and reliability risks are small and the net benefits are particularly high. However, I can’t say the same about internet voting, especially in the sense that elements of direct democracy may be particularly attractive to powerful foreign interests and parties outside our collective jurisdiction. The recently passed Military and Overseas Voter Empowerment (MOVE) Act has been interpreted to allow states to experiment with online ballot casting, and the relevant agencies charged with implementing the law—the Department of Defense’s Federal Voting Assistance Program (FVAP), the EAC, and the National Institute of Standards and Technology (NIST)—have collectively interpreted the MOVE Act as requiring them to institute standards and pilot programs for internet voting for military and overseas voters. I’m on record as disagreeing with this interpretation, but I can understand that they feel limited-scale pilot projects are appropriate. I predict that the first incontrovertible evidence of computerized vote manipulation will be associated with military and overseas internet voting efforts, and it’s not hard to imagine a down-ticket ballot question as being the focus of such an attack.

Should we re-think our forays into computerized voting? Definitely not. In my opinion, this is more a question of responsible uses of technology in elections than a black or white decision about using computerized voting systems or not. There is much good that stems from the use of computerized voting systems, including improved accessibility for the disabled and voters who don’t speak English, improved usability of ballots on-screen versus what can be accomplished on paper, and the speed and accuracy of computerized vote counts on election night. However, these voting systems must be recountable and auditable, and those audits must be conducted after each election in such a way that we limit the risk of an incorrect candidate or ballot measure being certified as the winner.

In contrast to the beginning of the past decade, when election officials were swimming in federal money for the purchase of equipment and trying to spend these funds before a looming deadline, what we really need is regular commitments of federal funding to improve local election administration. With a sustained source of federal funds to budget and plan for technology upgrades, the market will be stable, rather than going through the upheaval of mergers and dissolutions we have recently seen. Elections are perhaps the most poorly funded of all of the critical elements of democracy in the U.S., and we get what we pay for.

Joseph Lorenzo Hall is a postdoctoral researcher at the UC Berkeley School of Information and a visiting postdoctoral fellow at the Princeton Center for Information Technology Policy. His Ph.D. thesis examined electronic voting as a critical case study in the transparency of digital government systems.



This post explores ways in which information technology (IT) can enhance access to justice. What does it mean when we talk about “the access to justice crisis,” and how can information technology help to resolve it? The discussion that follows is based on my 2009 book, Technology for Justice: How Information Technology Can Support Judicial Reform, particularly Part 4, on the role of information and IT in access to justice.

The normative framework for access to justice

International conventions guarantee access to a court. Everyone is entitled to a fair and public hearing by an independent tribunal in the determination of their civil rights and obligations or of any criminal charge against them, according to the International Covenant on Civil and Political Rights (article 14) and regional conventions like the European Convention on Human Rights (article 6). In practice, the normative framework for access to justice does not provide us with clearly defined concepts.

The major barriers to access to justice identified in the scholarly literature are:

  • Distance, which can be a factor impeding access to courts. In many countries, courts are concentrated in the main urban centers or in the capital.
  • Language barriers, which are present when justice seekers use a language that is different from the language of the courts.
  • Physical challenges, like impaired sight and hearing and motor and cognitive impairments; as barriers to access, these are an emerging topic in the debate on technology support in courts.

These first three factors are all relatively straightforward and do not strike at the heart of the legal process.

  • Cost, in its many forms (lawyers’ fees, court fees, and the other components of the price of access to justice), has been identified as a factor affecting access to courts. However, cost is extremely hard to research and subject to many ramifications. Because of this complexity, cost will not be discussed directly in this post.
  • Lack of information and knowledge, lack of familiarity with the court process, the complexity of legal and administrative systems, and lack of access to legal information are commonly identified factors (Cotterrell, The Sociology of Law p. 251; Hammergren, Envisioning Reform: Improving Judicial Performance in Latin America, p. 136). They are related because they all refer to the availability of information. They are the starting point for our discussion.

Potentially, information on the Internet can provide some form of solution for these problems, in two ways. First, access to information can support fairer administration of justice by equipping people to respond appropriately when confronted with problems with a potentially legal solution. Access to information can compensate, to some extent, for the disadvantage one-shotters experience in litigation, thereby increasing their chance of obtaining a fair decision. Second, the Internet provides a channel for legal information services, although experience with such online service provision is limited in most judiciaries. The discussion here will therefore focus on lack of information and knowledge as a barrier to access to justice, and the first step is to identify the barriers.

Knowledge and information barriers to access to justice

What are the information barriers individuals experience when they encounter problems with a potentially legal solution? We need empirical evidence to find an answer to this question, and fortunately some excellent research has been done, which may help us. In the U.K., Hazel Genn led a team that researched what people do and think about going to law. Their 1999 report is called Paths to Justice. A similar exercise led by Ben van Velthoven and Marijke ter Voert in The Netherlands, called Geschilbeslechtingsdelta 2003 (Dispute Resolution Delta 2003), was published in 2004. Although there are some marked differences between them, both studies looked at how people deal with “justiciable problems”: problems that are experienced as serious and have a potentially legal solution. Analysis of empirical evidence of people and their justiciable problems in England and Wales and The Netherlands produced the following findings with regard to these barriers:

  • Inaction in the face of a justiciable problem because of lack of information and knowledge occurs in a small percentage of cases.
  • Unavailability of advice negatively affects dispute resolution outcomes. It lowers the resolution rate. Cases in which people attempted to find advice were resolved with a higher rate of success than those of the self-helpers.
  • Respecting the inability to find advice: If people go looking for advice, the barriers to finding it have more to do with their own competencies, such as confidence, emotional fortitude, and literacy skills, than with the availability of the advice. In the United Kingdom, about 20 percent of the population is so poor at reading and writing that they cannot cope with the demands of modern life, according to data from the National Literacy Trust. In The Netherlands, the percentage of similarly low literacy is estimated at about 10 percent, according to data from the Stichting Lezen en Schrijven, the Reading and Writing Foundation.
  • Respecting incompetence in implementing the information received: Different competence levels will affect what can be done with information and advice. Competencies in implementing the information received include, for example, skills such as working out what the problem is, what result is wanted, and how to find help; simple case-recording skills; managing correspondence; confidence and assertiveness; and negotiating skills, according to research reported by Advicenow in 2005. Some people do not want to be empowered by having information available. They want assistance, or even someone to take over dealing with their problem. People with low levels of competence in terms of education, income, confidence, verbal skill, literacy skill, or emotional fortitude are likely to need some help in resolving justiciable problems.
  • Ignorance about legal rights exists across most social groups. Genn notes that people generally are not educated about their legal rights (Genn p. 102).
  • Respecting lack of confidence in the legal system and the courts and negative feelings about the justice system, Genn observes that people are unwilling voluntarily to become involved with the courts. People associate courts with criminal justice. People’s image of the courts is formed by media stories about high profile criminal cases (Genn p. 247). This issue is related to the public image of courts, as well as to the wider role of courts as setters of norms.

Information needs for resolving justiciable problems

After identifying knowledge and information barriers, the next step is to uncover needs for information and knowledge related to access to justice. Those needs are most strongly related to the type of problem people experience. The most frequently occurring justiciable problems are simple, easy-to-solve problems, mostly those concerning goods and services. People themselves resolve such problems, occasionally with advice from specialist organizations like the consumers’ unions (e.g., in the U.S., the National Consumers League). For more important, more complex problems, people tend to seek expert help more frequently. The most difficult to resolve are problems involving a longer-term relationship, such as labor or family problems. Any of the problems discussed in this section may lead to a court procedure. However, the problems that are the toughest to resolve are also the ones that most frequently come to court.

The first need people experience is for information on how to solve their problem. In The Netherlands, the primary sources for this type of information are specialized organizations, with legal advice providers in second place. In England and Wales, solicitors are the first port of call, followed by the Citizens’ Advice Bureaux. In both countries, the police are a significant source of information on justiciable problems. This is especially remarkable because the problems researched were not criminal justice issues.

If people require legal information, they primarily need straightforward information about rules and regulations. Next, they look for information about ways to settle and handle disputes once they arise. Information about court procedures is a separate category that becomes relevant only in the event people need to go to court.

Respecting taking their case to court: People need information on how to resolve problems, on rights and duties, and on taking a case to court. The justiciable problems that normally come to court tend to be difficult for people themselves to resolve. These problems are also experienced as serious. Many of them involve long-term relationships: family, employment, neighbors. Therefore, people will tend to go looking for advice. Some of them may need assistance. Most people seek and receive some kind of advice before they come to court.

In summary, information needs in this context are mostly problem-specific. Most problems are resolved by people themselves, sometimes with the help of information, or help in the form of advice or assistance. The help is provided by many different organizations, but mostly by specialized organizations or providers of legal aid and alternative dispute resolution (ADR).

Different dispute resolution cultures

There are, besides these general trends, interesting differences between England and Wales and The Netherlands. The results with regard to dispute outcome, for instance, show the following:

[Table 1: Dispute outcomes in England and Wales and The Netherlands]

The Netherlands has fewer unresolved disputes, more disputes resolved by agreement, and a rate of resolution by adjudication half that of England and Wales. It looks as if there is more capacity for resolving justiciable problems in Dutch society than there is in society in England and Wales. Apart from the propensity to settle differences, a legacy of the justice system that Voltaire described in one of his letters, many factors may be at work in The Netherlands to produce a higher level of problem-solving capacity. One probable factor is the level of education and the related competence levels for dealing with problems and the legal framework. The functional illiteracy rate is only half that in the United Kingdom. Another factor may be a propensity to settle differences by reducing the complexity of problems through policies and routines.

Diversion or access, empowerment or court improvement?

The debate respecting whether diversion or court improvement should come first as an objective of legal policy has been going on for some time. These are the options under discussion:

  • Preventing problems and disputes from arising;
  • Equipping as many members of the public as possible to solve problems when they do arise without needing recourse to legal action;
  • Diverting cases away from the courts into private dispute resolution forums; and
  • Enhancing access to legal forums for the resolution of disputes.

Genn argues that it is not an answer to say that diversion and access should be the twin objectives of policy, because they logically conflict. I would like to contribute some observations that could provide a way out of this apparent dilemma.

First, user statistics from the introduction of the online claim service Money Claim Online and the case study in Chapter 2.3 of my book suggest that changes in procedure facilitating access do not in themselves lead to higher caseloads. Changes observed in the caseloads are attributable to market forces in both instances.

The other observation is that Paths to Justice and the Dispute Resolution Delta clearly found that self-help is experienced as more satisfying and less stressful than legal proceedings. Moreover, resolutions are to a large degree problem specific. A way out of the dilemma could be that specialist organizations that make it their business to provide specific information, advice, and assistance, should enhance their role. There is an empirical basis for this way out in the research reported in Paths to Justice and the Dispute Resolution Delta. Although goods and services problems are largely resolved through self-help, out-of-court settlement, or ADR, nonetheless a fair number of them still come to court. Devising ways to assist individuals in informal problem solving and diverting them to other dispute resolution mechanisms can keep still more of these problems out of court. Even in matters for which a court decision is compulsory, like divorce, mediation mechanisms can sort out differences before the case is filed. Clearly, information on the Internet will provide an entry point for all of these dispute resolution services. Online information can thus help to keep as many problems out of court as possible. All this should not keep us from making the experience of going to court, when it is necessary, less stressful. Information can help reduce people’s stress, even as it improves their chances of achieving justice. The Internet can be a vehicle for this kind of information service, too.

Taking up this point, the next section focuses on courts and how information technology, particularly the Internet, can support them in their role as information providers to improve access to justice. Two strands concerning the role of information in access to justice run through this theme: information to keep disputes out of court, and information on taking disputes to court.

Information to keep disputes out of court

An almost implicit understanding in the research literature is that parties with information on the “rules of thumb” of how courts deal with types of disputes will settle their differences more easily and keep them out of court. Such information supports settlement in the shadow of the law. Most of this type of settlement will be done with the support of legal or specialist organizations. In the pre-litigation stage, information about the approaches judges and courts generally take to specific types of problems can help the informal resolution of those problems. This will require that information about the way courts deal with those types of problems becomes available. Some of the ways in which courts deal with specific issues are laid down in policies. Moreover, judicial decision making is sometimes assisted by decision support systems reflecting policies. In order to help out-of-court settlement, policies and decision support systems need to be available publicly.

Information on taking disputes to court

If a dispute needs to come to court, information can reduce the disadvantage one-shotters have in dealing with the court and with legal issues. This disadvantage of the one-shotters — those who come to court only occasionally — over against the repeat players who use courts as a matter of business, was enunciated by Marc Galanter in his classic 1974 article, Why the Haves Come Out Ahead: Speculations on the Limits of Legal Change. Access to information for individual, self-represented litigants increases their chances of obtaining just and fair decisions. Litigants need information on how to take their case to court. This information needs to be legally correct, as well as effective. By “effective,” I mean that the general public can understand the information, and that someone after reading it will (1) know what to do next, and (2) be confident that this action will yield the desired result. In a case study, I have rated several court-related Web sites in the U.K. and in The Netherlands on those points, and found most of them wanting. My test was done in 2008, and most of the sites have since changed or been replaced. And although the U.K. Court Service leaflet D 184 on how to get a divorce got the best score, my favorite Web site is Advicenow.

[Table 2: Ratings of court-related Web sites in the U.K. and The Netherlands]
Such an information service requires a proactive, demand-oriented attitude from courts and judiciaries. Multi-channel information services, such as a letter from the court with reference to information on the court’s or judiciary’s Web site, can meet people’s information needs.

Beyond information push

Other forms of IT, increasingly interactive, can provide access to court. [Editor’s note: Document assembly systems for self-represented litigants are a notable example.] Not all of them require full-scale implementation of electronic case management and electronic files. In order to be effective for everyone, the information services discussed will require human help backup. There are also technologies to provide this, but they may still not be sufficient for everyone. The information services discussed here, in order to be effective, will need to be provided by a central agency for the entire legal system. A final finding is the importance of public trust in the courts in order for individuals to achieve access to justice. Judiciaries can actively contribute to improved access to justice in this field by ensuring that correct information about their processes is furnished to the public.

In summary, access to justice can be effectively improved with IT services. Such services can help to ameliorate the access-to-justice crisis by keeping disputes out of court. The information services identified here should serve the purpose of getting justice done. They should not keep people from getting the justice they deserve by preventing them from taking a justified concern to court. If people need to go to court, information services can help them deal with the courts more effectively.

[Editor’s Note: A very useful list of resources about applying technology to access to justice appears at the technola blog.]


Dory Reiling, mag. iur. Ph.D., is a judge in the first instance court in Amsterdam, The Netherlands. She was the first information manager for The Netherlands’ Judiciary, and a senior judicial reform expert at The World Bank. She is currently on the editorial board of The Hague Journal on the Rule of Law and on the Board of Governors of The Netherlands’ Judiciary’s Web site Rechtspraak.nl. She has a Weblog in Dutch, and an occasional Weblog in English, and can be followed on Twitter at @doryontour.


The Problem: URLs and Internal Links for Legislative Documents

Legislative documents reside at various government Websites in various formats (TXT, HTML, XML, PDF, WordPerfect). URLs for these documents are often too long or difficult to construct. For example, here is the URL for the HTML format version of bill H.R. 3200 of the 111th U.S. Congress:

http://www.gpo.gov/fdsys/pkg/BILLS-111hr3200IH/html/BILLS-111hr3200IH.htm

More importantly, “deep” links to internal locations (often called “subdivisions” or “segments”) within a legislative document (the citations within the law, such as section 246 of bill H.R. 3200) are often not supported, or are non-intuitive for users to create or use. For most legislative Websites, users must click through or fill out forms and then scroll or search for the specific location in the text of legislation. This makes it difficult if not impossible to create and share links to official citations. Enabling internal links to subdivisions of legislative documents is crucial, because in most situations, users of legal information need access only to a subdivision of a legal document, not to the entire document.

A Solution: LegisLink

LegisLink.org is a URL Redirection Service with the goal of enabling Internet access to legislative material using citation-based URLs rather than requiring users to repeatedly click and scroll through documents to arrive at a destination.  Let’s say you’re reading an article at CNN.com and the article references section 246 in H.R. 3200.  If you want to read the section, you can search for H.R. 3200 and more than likely you will find the bill and then scroll to find the desired section.  On the other hand, you can use something like LegisLink by typing the correct URL.  For example: http://legislink.org/us/hr-3200-ih-246.
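The mechanics behind such a redirection service are straightforward. Here is a rough Python sketch of the idea (the real LegisLink is written in Perl; the citation pattern and GPO URL template below are illustrative assumptions, not the actual LegisLink implementation):

```python
import re

# Illustrative GPO URL template, modeled on the H.R. 3200 example above.
GPO_HTML = ("http://www.gpo.gov/fdsys/pkg/BILLS-{congress}{bill}{stage}"
            "/html/BILLS-{congress}{bill}{stage}.htm")

def resolve(path, congress="111"):
    """Map a citation-style path like 'us/hr-3200-ih-246' to a target URL."""
    m = re.match(r"us/([a-z]+)-(\d+)-([a-z]+)(?:-(\d+))?$", path)
    if m is None:
        raise ValueError("unrecognized citation: " + path)
    kind, number, stage, section = m.groups()
    url = GPO_HTML.format(congress=congress, bill=kind + number,
                          stage=stage.upper())
    if section:                 # deep link: relies on an anchor in the file
        url += "#SEC-" + section
    return url

print(resolve("us/hr-3200-ih-246"))
# -> http://www.gpo.gov/fdsys/pkg/BILLS-111hr3200IH/html/BILLS-111hr3200IH.htm#SEC-246
```

Because all knowledge of where documents live is concentrated in the URL template, published citation-style links keep working even when a government site reorganizes: only the template changes.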

[Screen shot: the LegisLink interface]

Benefits

There are several advantages of having a Web service that resolves legislative and legal citations.

(1)   LegisLink provides links to citations that are otherwise not easy for users to create.  In order to create a hyperlink to a location in an HTML or XML file, the publisher must include unique anchor or id attributes within their files.  Even if these attributes are included, they are often not exposed as links for Internet users to re-use.   On the other hand, Web-based software can easily scan a file’s text to find a requested citation and then redirect the user to the requested location.  For PDF files, it is possible to create hyperlinks to specific pages and locations when using the Acrobat plug-in from Adobe.  In these cases, hyperlinks can direct the user to the document location at the official Website.

For example, here is the LegisLink URL that links directly to section 246 within the PDF version of H.R. 3200: http://legislink.org/us/hr-3200-ih-246-pdf

In cases where governments have not included ids in HTML, XML or TXT files, LegisLink can replicate a government document on the LegisLink site, insert an anchor, and then redirect the user to the requested location.
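As a rough illustration of this replicate-and-anchor step, the following Python sketch (my own, hypothetical; the section-heading pattern is an assumption about how bill text marks sections) inserts an anchor in front of a requested section:

```python
import re

def insert_anchor(html, section):
    """Insert an <a name="SEC-..."> anchor before the heading for `section`."""
    # Assumed heading style: "SEC. 246." or "SECTION 246", case-insensitive.
    pattern = re.compile(r"(SEC(?:TION)?\.?\s*%s\b)" % re.escape(section),
                         re.IGNORECASE)
    anchored, count = pattern.subn(r'<a name="SEC-%s"></a>\1' % section,
                                   html, count=1)
    if count == 0:
        raise LookupError("section %s not found" % section)
    return anchored  # serve this copy, then redirect to ...#SEC-<section>

page = "<p>SEC. 245. ...</p><p>SEC. 246. PROTECTING PRIVACY.</p>"
print(insert_anchor(page, "246"))
```

The replicated copy is only a temporary vehicle for the anchor; the goal remains to land the user on official text.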

(2)   LegisLink makes it easy to get to a specific location in a document, which saves time.  Law students and presumably all law professionals are relying on online resources to a greater extent than ever before.  In 2004, Stanford Law School published the results of their survey that found that 93% of first year law students used online resources for legal research at least 80% of the time.

(3)   Creating and maintaining a .org site that acts as an umbrella for all jurisdictions makes it easier to locate documents and citations, especially when they have been issued by a jurisdiction with which one is unfamiliar.  Legislation and other legal documents tend to reside at multiple Websites within a jurisdiction.  For example, while U.S. federal legislation (i.e., bills and slip laws) is stored at thomas.loc.gov (HTML and XML) and gpo.gov (at FDsys and GPO Access) (TXT and PDF), the United States Code is available at uscode.house.gov and at gpo.gov (FDsys and GPO Access), while roll call votes are at clerk.house.gov and www.senate.gov.   Governments tend to compartmentalize activities, and their Websites reflect much of that compartmentalization.  LegisLink.org or something like it could, at a minimum, provide a resource that helps casual and new users find where official documents are stored at various locations or among various jurisdictions.

(4) LegisLinks won’t break over time. Governments sometimes change the URL locations for their documents. This often breaks previously relied-upon URLs (a result that is sometimes called “link rot”). A URL Redirection Service lessens these eventual annoyances to users because the syntax for the LegisLink-type service remains the same. To “fix” the broken links, the LegisLink software is simply updated to link to the government’s new URLs. This means that previously published LegisLinks won’t break over time.

(5)   A LegisLink-type service does not require governments to expend resources.  The goal of LegisLink is to point to government or government-designated resources.  If those resources contain anchors or id attributes, they can be used to link to the official government site.  If the documents are in PDF (non-scanned), they can also be used to link to the official government site.  In other cases, the files can be replicated temporarily and slightly manipulated (e.g., the tag <a name=SEC-#> can be added at the appropriate location) in order to achieve the desired results.

Alternatives

While some Websites have implemented Permalinks and handle systems (e.g., the Library of Congress’s THOMAS system), these systems tend to link users to the document level only. They also generally only work within a single Internet domain, and casual users tend not to be aware of their existence.

Other technologies at the forefront of this space include recent efforts to create a URN-based syntax for legal documents (URN:LEX). To quote from the draft specification, “In an on-line environment with resources distributed among different Web publishers, uniform resource names allow simplified global interconnection of legal documents by means of automated hypertext linking.”

The syntax for URN:LEX is a bit lengthy, but because of its specificity, it needs to be included in any universal legal citation redirection service. The inclusion of URN:LEX syntax does not, however, mitigate the need for additional simpler syntaxes.  This distinction is important for the users who just want to quickly access a particular legislative document, such as a bill that is mentioned in a news article.  For example, if LegisLink were widely adopted, users would come to know that the URL http://legislink.org/us/hr-3200 will link to the current Congress’s H.R. 3200; the LegisLink URL is therefore readily usable by humans. And use of LegisLink for a particular piece of legislation is to some extent consistent with the use of URN:LEX for the same legislation: for example, a URN:LEX-based address such as http://legislink.org/urn:lex/us/federal:legislation:2009;111.hr.3200@official;thomas.loc.gov:en$text-html could also lead to the current Congress’s H.R. 3200. A LegisLink-type service can include the URN:LEX syntax, but the URN:LEX syntax cannot subsume the simplified syntax being proposed for LegisLink.org.

Citability.org, another effort to address these issues, calls for the replication of all government documents for point-in-time access. In addition, Citability.org envisions including date and time information as part of the URL syntax in order to provide access to the citable content that was available at the specified date and time. LegisLink has more modest goals: it focuses on linking to currently provided government documents and locations within those documents. Since legislation is typically stored as separate, un-revisable documents for a given legislative term (lasting 2 years in many U.S. jurisdictions), the use of date and time information is redundant with legislative session information.

The primary goal of a legislative URL Redirection Service such as LegisLink.org is to expedite the delivery of needed information to the Internet user. In addition, the LegisLink tools used to link to legislative citations in one jurisdiction can be re-used for other jurisdictions; this reduces developers’ labor as more jurisdictions are added.

Next Steps

The LegisLink.org site is organized by jurisdiction: each jurisdiction has its own script, and all scripts can re-use common functions. The prototype is currently being built to handle the United States (us), Colorado (us-co), and New Zealand (nz). The LegisLink source code is available as text files at http://legislink.org/code.html.

The challenges of a service like LegisLink.org are: (1) determining whether the legal community is interested in this sort of solution, (2) finding legislative experts to define the needed syntax and results for jurisdictions of interest, and (3) finding software developers interested in helping to work on the project.

This project cannot be accomplished by one or two people. Your help is needed, whether you are an interested user or a software developer. At this point, the code for LegisLink is written in Perl. Please join the LegisLink wiki site at http://legislink.wikispaces.org to add your ideas, to discuss related information, or just to stay informed about what’s going on with LegisLink.

Joe Carmel is a part-time consultant and software developer hobbyist. He was previously Chief of the Legislative Computer Systems at the U.S. House of Representatives (2001-2005) and spearheaded the use of XML for the drafting of legislation, the publication of roll call votes, and the creation and maintenance of the U.S. Congressional Biographical Directory.


There has been much discussion on this blog about law-related information retrieval systems, ontologies, and metadata. Today, I’d like to take you into another corner of legal informatics: rule-based legal information systems. I’ll tell you what they are, what their strengths and limitations are, and how they’re made. I’ll also explain why I’m optimistic about their potential to expand public access to law and to improve the way legal expertise is deployed and consumed.

First, what are they?

A rule-based expert system represents knowledge of a particular domain — such as medicine, finance, or law — in the form of “if-then” rules. Here’s an example of a rule:

the employee is entitled to standard FMLA leave IF
the employee is an eligible employee AND
the reason for the leave is enumerated in 29 U.S.C. § 2612

A rule consists of a bunch of variables (here, three Boolean statements) together with some logical operators (if, then, and, or, not, mathematical operators, etc.). Rules are chained together to form a rulebase, which is basically a database of rules. “Chained together” means that the rules connect to each other: a condition in one rule is the consequent or conclusion in another rule. For example, here’s a rule that links to our first rule:

the reason for the leave is enumerated in 29 U.S.C. § 2612 IF
the employee needs to care for a newborn child OR
the employee is becoming an adoptive or foster parent OR
the employee’s relative has a serious health condition OR
the employee cannot perform their job due to a serious health condition

Each of the conditions in this new rule can be defined by yet more rules. And other rules can sprout off of the main rule tree to form a complex web of inference. If we were to visualize such a network of rules, it might begin to look something like this:

[Figure: a rulebase visualized as a network of interconnected rules]

The rulebase inputs are shown in blue and the outputs – or “goals” – are highlighted in orange. The core function of the inference engine (or rule engine) is to figure out what conclusions can be drawn from the input facts. Also, given incomplete information, an inference engine will figure out what additional facts are needed in order to reach one of the goals.
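Both behaviors can be sketched in a few lines. The following Python fragment (a toy illustration of backward chaining, not how any production rule engine actually works) chains through the two FMLA rules above and records any input facts it still needs:

```python
# A toy backward-chainer over the two FMLA rules above. The rule and fact
# names are informal paraphrases, not any real rule-engine syntax.
RULES = {
    "employee is entitled to standard FMLA leave": {
        "all": ["employee is an eligible employee",
                "reason for the leave is enumerated in 29 U.S.C. 2612"],
    },
    "reason for the leave is enumerated in 29 U.S.C. 2612": {
        "any": ["employee needs to care for a newborn child",
                "employee is becoming an adoptive or foster parent",
                "employee's relative has a serious health condition",
                "employee cannot perform their job due to a serious health condition"],
    },
}

def prove(goal, facts, missing):
    """Prove `goal` from known `facts`; record unknown input facts in `missing`."""
    if goal in facts:                 # a fact the user has already supplied
        return facts[goal]
    rule = RULES.get(goal)
    if rule is None:                  # an input we still need to ask about
        missing.add(goal)
        return False
    # Evaluate every condition (no short-circuiting) so `missing` is complete.
    results = [prove(c, facts, missing)
               for c in rule.get("all", rule.get("any"))]
    return all(results) if "all" in rule else any(results)

facts = {"employee is an eligible employee": True,
         "employee needs to care for a newborn child": True}
needed = set()
print(prove("employee is entitled to standard FMLA leave", facts, needed))  # True
```

With no facts supplied, `prove` returns False and `needed` ends up holding every unanswered input, which is exactly the information an interview screen would use to decide what to ask next.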

Rule-based systems in context

From this extremely simple example we can start to get a sense of the strengths and limitations of rule-based representations of legal knowledge. Let’s start with the strengths. First, the law, to a significant degree, seems to consist of rules, and representing them in a constrained, logical language is fairly straightforward and natural. As a result, rule-based systems are transparent: the system code looks a lot like the text that’s being represented. This “isomorphism” means that you can trace the system logic back to the original source material, easily spot errors, and quickly adapt to changes in the law. Furthermore, rule-based systems can justify their determinations by explaining how they arrived at a particular conclusion and by providing audit trails. It’s also fairly easy for people to interact with rule-based systems, as they integrate well with interviews. In short, it’s relatively easy to put legal knowledge into rule-based systems, easy to maintain it, and easy to get it out.

But all this simplicity comes with a price: the sophistication of the knowledge that can be represented. For one thing, common sense knowledge does not lend itself to simple rule-based representations, as the decades-long Cyc project illustrates. A significant portion of my own rule-authoring effort is spent representing mundane concepts, like figuring whether a given date falls on a legal holiday or counting the number of weeks in which a given condition is true. Secondly, there’s the problem of how to model vague or “open-textured” concepts. For instance, if a liability determination turns upon whether a person’s conduct was “reasonable”, the uncertainty and fuzziness of that term can’t be modeled in a way analogous to human thinking. A third limitation facing rule-based systems is the “knowledge acquisition bottleneck.” This is the effort required to codify, test, and validate expert domain knowledge. Part of the challenge derives from the reasons I’ve already mentioned, and part results from the need to capture the knowledge of human subject matter experts who don’t always think in complete and precise “if-then” constructs. Another criticism often lodged at legal expert systems is that law is in essence not rule-based but is instead a fray of competing textual interpretations which cannot be accurately modeled.

My view is that, even given these limitations, there are still many problems that can be solved by rule-based systems. No one is asking them to solve all legal automation problems, or claiming that all legal knowledge can be represented in the form of rules. (Part of why little attention is paid to these systems today is that they were over-hyped during the artificial intelligence boom of the 1970s and 80s.) But there is a place for them, and that place is quite large even given the semantic confines that I just described. Rule-based systems are ideal for encoding legal principles found in statutes, regulations, and agency decisions — that is, law that’s explicit and knowable, but logically complicated. And there are millions of pages of such law, across thousands of jurisdictions around the world, just waiting to be embedded in rule-based systems.

Let me give you a few examples of what rule-based information systems can do, although chances are that you’ve already encountered one. Perhaps, like millions of American taxpayers, you used TurboTax tax preparation software to file your taxes this year. This and other tax preparation programs interview you about your income and finances, perform a multitude of behind-the-scenes calculations, and then fill out the relevant tax forms for you. I don’t actually know how this software was constructed, but if I were doing it I would absolutely take a rule-based approach. In fact, my team did use a rule engine when tasked with building a tax law advisory system for the IRS. That system, the Interactive Tax Assistant, answers seven common tax questions, is driven by about 1,300 rules, and contains around 200 question screens. Rule-based design can also produce systems like the Australian Visa Wizard, DirectLaw, and The Benefit Bank. Other rule-driven systems work behind the scenes at government agencies and corporations to process claims by making fast, consistent, and transparent decisions.

Available tools

In my view, the premier tool for engineering rule-based legal information systems is Oracle Policy Modeling (OPM, formerly known as Haley Office Rules, RuleBurst, and Softlaw). (Full disclosure: I used to work for Oracle.) OPM lets you write natural language rules that capture statutory text, calculations, date and time-based reasoning, and basic ontological relationships. It has decent debugging and rulebase visualization features (that’s how I created the rule network diagram above), and an excellent regression testing facility. OPM lets you deploy rulebases as Web interviews and integrate them into other computer systems. The major downside to OPM is its cost: I understand the list price to be in the ballpark of $100K per license.

You can also model legal rules using other business rule engines, such as ILOG, Blaze Advisor, JBoss Drools (free), and Jess (free). JBoss Drools has a promising feature that lets you create Domain Specific Languages by mapping natural language expressions to the underlying programming code. You could also use traditional logic programming / expert system languages like Prolog or CLIPS, which are extremely powerful but which do not allow for isomorphic representation of the law. OWL-centric ontology editors such as Protege are also beginning to support rule-based knowledge representation.
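To give a flavor of the DSL idea (in Python, not Drools), here is a toy sketch in which natural-language rule phrases are mapped onto executable predicates. The phrase templates and the fact keys are invented for illustration.

```python
import re

# Toy illustration of a Domain Specific Language mapping: each
# natural-language phrase template is paired with a predicate over
# a dictionary of facts. Phrases and fact keys are invented.

PHRASE_MAP = [
    (r"the applicant is at least (\d+) years old",
     lambda m, facts: facts["age"] >= int(m.group(1))),
    (r"the applicant resides in (\w+)",
     lambda m, facts: facts["state"] == m.group(1)),
]

def evaluate_phrase(phrase, facts):
    for pattern, predicate in PHRASE_MAP:
        m = re.fullmatch(pattern, phrase)
        if m:
            return predicate(m, facts)
    raise ValueError(f"No mapping for phrase: {phrase!r}")

facts = {"age": 70, "state": "Oregon"}
print(evaluate_phrase("the applicant is at least 65 years old", facts))  # True
print(evaluate_phrase("the applicant resides in Kansas", facts))         # False
```

The appeal of this layering is that the rule author works in phrases that read like the source law, while the mapping to executable code is maintained separately.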

To address the lack of freely-available, practical legal modeling tools, I’ve been working on Jureeka.org, a project affiliated with Stanford’s CodeX Center for Computers and Law. Jureeka is an open, Web-based rule authoring platform that lets lawyers, law students, and other subject matter experts represent their knowledge as “if-then” rules. Jureeka then uses the rules to generate jurisdiction-specific interviews, which present the relevant topic in a digestible manner. Its strengths are that it’s completely Web-based, it makes navigation of the rules easy, and it lets rule authors work collaboratively to rapidly develop knowledge bases in a wiki-like fashion. The motivating vision is to provide a way for legal knowledge engineers to build topical rulebases, and then connect these modules together to form an information backbone that drives other IT systems and helps the general public get answers to their legal questions.

[Screenshot: Jureeka]

Jureeka is very much a work in progress, and I’ll be the first to admit that its main weakness is the oversimplicity of its rule syntax. (For example, I’m currently working on an ontology layer and a way to reason across multiple instances of an object or variable.) But this is the type of knowledge-generating project that I’d like to see a developer community coalesce around.

Future potential

Rule-based programming is not the be-all and end-all of legal informatics, but it does have significant untapped potential. Government agencies are beginning to adopt rule-based legal information systems as a way to better serve the public. I think there are also lucrative opportunities available for law firms to seize the first mover advantage by automating slices of the law of interest to consumers. Rule-based systems can help nonprofit organizations advance their missions by guiding constituents through labyrinthine legal processes. And these systems are of obvious benefit to corporations, which need to comply with a variety of regulations across numerous jurisdictions.

Rule-based systems can also benefit the legislative drafting process. For example, an early incarnation of the OPM software helped the Australian Taxation Office simplify that country’s tax code. In addition to this kind of legislative refactoring (which entails clarifying and reorganizing Rube Goldberg-like legal texts), legislatures could also promulgate law in an “inference-ready” machine readable form. That is, portions of the law could be written in a syntax that both humans and machines can read, making the law not only accessible but executable. I’m not merely referring to high-level metadata; I’m talking about code that is intended to be run in an inference engine and that can be deployed as is into society’s computing infrastructure. [See, e.g., Professor Monica Palmirani’s example of legal rules coded in the Legal Knowledge Interchange Format (LKIF) (at slides 48 through 50); please note that this is a 4.5M download.]

Some people have raised the objection that rule-based systems and their creators engage in the unauthorized practice of law by dispensing “legal advice.” I think this concern is overblown and founded upon a lack of understanding of how these systems work. Legal advice entails applying the law to the facts of a particular case or, conversely, interpreting facts in light of the applicable law. Rule-based systems don’t do that. Instead, they break up complicated legal provisions into atomic pieces and ask users to determine how each atom applies to them. Conceptually, it’s no different than reading a plain language description of legal rules and applying those rules to your own situation.
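The atomization described above can be sketched in a few lines of Python. Each elementary condition of a (hypothetical, invented) tax provision becomes a separate yes/no question; the user, not the system, decides how each atom applies to their facts, and the conclusion then follows mechanically.

```python
# Sketch of interview-style atomization. The provision and its
# conditions are invented for illustration, not real tax law.

ATOMS = {
    "employed": "Were you employed during the tax year?",
    "paid_premiums": "Did you pay health insurance premiums yourself?",
    "not_reimbursed": "Were the premiums NOT reimbursed by your employer?",
}

def deduction_available(answers):
    # The system draws no legal inference of its own: the conclusion
    # follows mechanically from the user's answers to each atom.
    return all(answers[atom] for atom in ATOMS)

answers = {"employed": True, "paid_premiums": True, "not_reimbursed": False}
print(deduction_available(answers))  # False
```

Note that the system never characterizes the user's facts; it only combines the user's own yes/no determinations, which is the distinction the unauthorized-practice objection tends to miss.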

My goal in this post has been to introduce you to something that you may not have heard about and to convince you that it is a viable and worthwhile activity. Rule-based legal information systems have been around for a few decades, but we still have a long way to go until our rule-based legal modeling tools are as sophisticated as Mathematica is in the domain of mathematical computation. As we move in that direction, and as our legal knowledge engineering proficiency grows, we can advance toward the day when all people can take equal advantage of their legal rights. Knowing that they have them is the first step.

Michael Poulshock is a consultant specializing in legal knowledge engineering and a Fellow at Stanford University’s CodeX Center for Computers and Law. He is the creator of Jureeka.org and the Jureeka legal research browser add-on for Firefox and Chrome. He was previously a human rights lawyer.

VoxPopuLII is edited by Judith Pratt. Editor in chief is Robert Richards.