<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Making Metasausage</title>
	<atom:link href="http://blog.law.cornell.edu/metasausage/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.law.cornell.edu/metasausage</link>
	<description>A blog about legislative metadata and other grinding concerns</description>
	<lastBuildDate>Wed, 27 Feb 2013 16:15:14 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Our legislative metadata model</title>
		<link>http://blog.law.cornell.edu/metasausage/2013/02/27/our-legislative-metadata-model/</link>
		<comments>http://blog.law.cornell.edu/metasausage/2013/02/27/our-legislative-metadata-model/#comments</comments>
		<pubDate>Wed, 27 Feb 2013 15:01:07 +0000</pubDate>
		<dc:creator>Bruce Thomas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.law.cornell.edu/metasausage/?p=80</guid>
		<description><![CDATA[Some months ago, I and some of my colleagues at the LII began to release a series of white papers that were written as part of the construction of a (mostly) comprehensive metadata model for Federal legislation.  They are appearing as a series of blog posts in this blog.  One which seemed more appropriate for <a href='http://blog.law.cornell.edu/metasausage/2013/02/27/our-legislative-metadata-model/'>[...]</a>]]></description>
				<content:encoded><![CDATA[<div>Some months ago, I and some of my colleagues at the LII began to release a series of white papers that were written as part of the construction of a (mostly) comprehensive metadata model for Federal legislation.  They are appearing as <a href="http://blog.law.cornell.edu/metasausage">a series of blog posts in this blog</a>.  <a href="http://blog.law.cornell.edu/voxpop/2013/01/24/metadata-quality-in-a-linked-data-context/">One which seemed more appropriate for VoxPopuLII </a>&#8211; it had to do with metadata quality concerns that are not limited to legislation &#8212;  was <a href="http://blog.law.cornell.edu/voxpop/2013/01/24/metadata-quality-in-a-linked-data-context/">posted there</a> yesterday.  We&#8217;ll continue to adapt the white papers as blog posts and release them as Metasausage posts, but we thought that it was high time that we released<a href="http://blog.law.cornell.edu/metasausage/downloads-and-related-information/"> full documentation of the model</a>.  Many of you have known of its existence for a while; we&#8217;ve been slow to release it because, well, we&#8217;re just overwhelmed with work.</div>
<div></div>
<div>The model is Linked-Data-friendly and designed to be highly extensible.  We think it could serve as a reference model (by which I think I really mean &#8220;extensible scaffolding&#8221;) for a much more comprehensive metadata model for Federal legislation.  As you&#8217;ll see when you read the documentation, we made no attempt to model things where we lacked domain expertise (appropriations and reconciliation being two), nor did we try to deal with the finer points of House and Senate rules when modeling process.</div>
<div></div>
<div>We&#8217;ll be interested in your reactions to it, and very, very interested in taking it further.  Over the next month or so, we&#8217;ll actually build out what we&#8217;ve already put in the Open Metadata Registry into a full Linked Data representation online.  Our hope is that this is a very big stone that can be used to make some <a href="http://liicr.nl/ZIon7E">Stone Soup</a>.</div>
<div></div>
<div>The model was primarily done by myself, Diane Hillmann, John Joergensen, and Jon Phipps; other contributors included Sara Frug, Wayne Weibel, Dave Shetland, and Rob Richards.  We had a lot of help from many of you, as well.</div>
<div></div>
<div>I suspect that there may be some glitches in the documentation itself because, as most of you probably know, e-book compilation software is twitchy, and it wouldn&#8217;t surprise me if different versions have a range of ugly formatting problems.  Let me know and we&#8217;ll clean &#8217;em up.  Most of all, we&#8217;re interested in knowing what you think of the model.</div>
<div></div>
]]></content:encoded>
			<wfw:commentRss>http://blog.law.cornell.edu/metasausage/2013/02/27/our-legislative-metadata-model/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Modeling people and organizations for legislative information</title>
		<link>http://blog.law.cornell.edu/metasausage/2012/07/17/modeling-people-and-organizations-for-legislative-information/</link>
		<comments>http://blog.law.cornell.edu/metasausage/2012/07/17/modeling-people-and-organizations-for-legislative-information/#comments</comments>
		<pubDate>Tue, 17 Jul 2012 14:00:11 +0000</pubDate>
		<dc:creator>Bruce Thomas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.law.cornell.edu/metasausage/?p=46</guid>
		<description><![CDATA[These days, there&#8217;s no need to settle on a single answer to the question of what standard to reference in describing people and organizations. The current environment &#8212; where the speed of change can be daunting &#8212; demands strategies that start with descriptive properties that meet local needs as expressed in use cases. Taking the <a href='http://blog.law.cornell.edu/metasausage/2012/07/17/modeling-people-and-organizations-for-legislative-information/'>[...]</a>]]></description>
				<content:encoded><![CDATA[<h2 dir="ltr"><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://blog.law.cornell.edu/metasausage/files/2012/07/faceless-people-cover1.jpeg"><img class="alignleft  wp-image-51" style="margin: 10px 15px;" title="faceless people cover" src="http://blog.law.cornell.edu/metasausage/files/2012/07/faceless-people-cover1-1024x805.jpg" alt="" width="208" height="164" /></a>These days, there&#8217;s no need to settle on a single answer to the question of what standard to reference in describing people and organizations. The current environment &#8212; where the speed of change can be daunting &#8212; demands strategies that start with descriptive properties that meet local needs as expressed in use cases. Taking the further step of mapping these known needs to a variety of existing standards best provides both local flexibility and interoperability with the wider world. </span></strong></strong></h2>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">In the world of Web standards, most thinking about how to describe people and organizations begins with the FOAF vocabulary (</span><a href="http://xmlns.com/foaf/spec/"><span style="font-size: 15px; font-family: Arial; color: #1155cc; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://xmlns.com/foaf/spec/</span></a><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">), developed in 2000 as ‘Friend of a Friend’ and now used extensively on the web to describe people, groups, and organizations. FOAF is an RDF-based specification, and as such is poised to gain further in importance as the ideas behind Linked Data gain traction and more widespread implementation. FOAF is quite simple on its face, but as an RDF vocabulary it is easily extended to meet needs for richer and more complex information.  FOAF is now in a stable state, and its developers have recently<a href="http://dublincore.org/documents/dcmi-foaf/"> entered into an agreement</a> with the Dublin Core Metadata Initiative (DCMI), to provide improved persistence and sustainability for the website and the standard.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">More recent standards efforts are emerging and deserve attention as well. Several that address the building of descriptions for <a href="http://www.google.com/url?q=http%3A%2F%2Fwww.w3.org%2FTR%2F2012%2FWD-vocab-people-20120405%2F&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNH_ZPB7LPhZ6AVpB5LuqEsMaZXgqA">people</a> and <a href="http://www.w3.org/TR/2012/WD-vocab-org-20120405/.">organizations</a> are in working draft at the World Wide Web Consortium (W3C). Although still in draft status, they offer several alternative methods for description that look very useful. Because organizations in these standards are declared as subclasses of foaf:Agent, the close association with the FOAF standard is built in. </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">What may be most useful about FOAF &#8212; and more recent standards that seek to extend it &#8212;  is both its simple and unambiguous method of providing identification of people and groups, as well as its recommendations for minting URIs for significant information about the person or group identified. </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">But despite its wide adoption, there are some limitations to basic FOAF that weigh on any assessment of its capacity to describe the diversity of people and organizations involved in the legislative process.  Critically, FOAF lacks a method for assigning a temporal dimension to roles, membership, or affiliations.  That temporal requirement is essential to any model used for describing relationships between legislation, legislators, and legislative groups or organizations, both retrospectively and prospectively.  The <a href="http://www.w3.org/standards/techs/gld#w3c_all">emerging W3C standard for modeling governmental organizational structures</a> (which includes the vocabularies for describing people and organizations mentioned above) contemplates extensions to FOAF designed to address this limitation.  Another emerging standard, the <a href="http://eac.staatsbibliothek-berlin.de/">Society of American Archivists’ EAC-CPF</a>, also includes provisions for temporal metadata, and seems to take a very broad view of what it models, making it a standard worth watching.</span></p>
<p><img class="alignright" style="border: 0px; margin: 10px 15px;" title="LoC People and Orgs presentation - final.005.jpg" src="http://blog.law.cornell.edu/metasausage/files/2012/07/LoC-People-and-Orgs-presentation-final.005.jpg" alt="LoC People and Orgs presentation  final 005" width="294" height="221" border="0" /></p>
<p><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Thinking about affiliations gives a good feel for the process of working with standards; it takes a certain amount of careful thought and some trimming to fit. As an illustration, think about a member of Congress and her history as a congressional committee member. It&#8217;s not unusual for a member to serve on a committee for a while, become its chairperson, become ranking minority member after a change in the majority party, become chairperson once again, and finally settle down as a regular member of the committee. One might imagine this as a series of memberships in the committee, each with a different flavor, or as a single membership with changing roles. The figure at the right illustrates that history. The illustration at the top represents the &#8220;serial-membership&#8221; approach that is used in the W3C standard. 
In it, each membership also represents a specific role within the committee and has a duration; the total timespan of an individual&#8217;s committee service can only be found by calculation or inference. The bottom illustration represents roles and membership independently. It is a little clunky in that it assigns durations to both the roles and the overall membership, which might be considered duplicative. Nevertheless, we prefer it. It does not require predecessor/successor relationships to link the individual role-memberships into a continuous membership span, nor does it require the slightly contrived idea of a &#8220;plain-vanilla&#8221; membership.</span></strong></strong></p>
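<p>The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not the model itself: the class names, the member "Rep. Example", and every date below are hypothetical, and the sketch assumes that the serial role-memberships are contiguous.</p>

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class RoleMembership:
    """Serial-membership style: one record per role held, each with a duration."""
    member: str
    committee: str
    role: str
    start: date
    end: Optional[date]  # None means the role is still held


def total_span(records: List[RoleMembership]) -> Tuple[date, Optional[date]]:
    """Infer overall committee service from the individual role-memberships.

    Assumes the records are contiguous; gaps in service are not detected.
    """
    starts = [r.start for r in records]
    ends = [r.end for r in records]
    overall_end = None if None in ends else max(ends)
    return min(starts), overall_end


@dataclass
class Role:
    """A dated role held within a membership."""
    title: str
    start: date
    end: Optional[date]


@dataclass
class Membership:
    """Independent style: one membership with its own duration, plus dated roles."""
    member: str
    committee: str
    start: date
    end: Optional[date]
    roles: List[Role]


COMMITTEE = "House Judiciary Committee"
WHO = "Rep. Example"  # hypothetical member; all dates are made up

# Serial approach: "plain-vanilla" member records are needed to fill the span.
serial = [
    RoleMembership(WHO, COMMITTEE, "member", date(2001, 1, 3), date(2005, 1, 3)),
    RoleMembership(WHO, COMMITTEE, "chairperson", date(2005, 1, 3), date(2007, 1, 3)),
    RoleMembership(WHO, COMMITTEE, "ranking minority member", date(2007, 1, 3), date(2011, 1, 3)),
    RoleMembership(WHO, COMMITTEE, "chairperson", date(2011, 1, 3), date(2013, 1, 3)),
    RoleMembership(WHO, COMMITTEE, "member", date(2013, 1, 3), date(2015, 1, 3)),
]

# Independent approach: the span is stated once; only distinctive roles are listed.
independent = Membership(
    WHO, COMMITTEE, date(2001, 1, 3), date(2015, 1, 3),
    roles=[
        Role("chairperson", date(2005, 1, 3), date(2007, 1, 3)),
        Role("ranking minority member", date(2007, 1, 3), date(2011, 1, 3)),
        Role("chairperson", date(2011, 1, 3), date(2013, 1, 3)),
    ],
)
```

<p>In the serial form, the overall span has to be computed with something like <code>total_span(serial)</code>; in the independent form it is read directly from <code>independent.start</code> and <code>independent.end</code>, at the cost of stating durations twice.</p>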
<p><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We think that modelers are often tempted to choose standards prematurely, taking a kind of Chinese-menu approach to modeling that can be overly influenced by the appeal of one-stop shopping. Our preference has been to model as closely to the data as we can. Once we have a model that is faithful to the data, we can start to think about which of its components should be taken from existing models &#8212; no sooner. In that way we avoid representation problems and also some hidden &#8220;gotchas&#8221; such as nonsensical or inappropriate default values, relationships that almost work, and so on. The same can be said of structure and hierarchy among objects &#8212; it is best to start modeling in a way that is very flat and very close to the data, and only once that is completed to gather things into sub- and super-classes, sub-properties, and so on.</span></strong></strong></p>
<h3 dir="ltr"><span style="font-size: 16px; font-family: Arial; color: #666666; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Standards encountered in libraries</span></h3>
<p><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">One question that always arises in discussing standards like FOAF in a library context is the prevalence of the MARC model in most discussions of description of people and organizations. Traditionally, libraries have used MARC name authority records as the basis for uniquely identifying people and organizations, providing text strings for both identification and display. Similar functionality has been attempted with the recent additions to the Library of Congress’s Metadata Authority Description Schema (MADS). MADS was originally developed as an XML expression of the traditional MARC authority data. Now, with the arrival of a public draft standard, focus is shifting toward an <a href="http://www.loc.gov/standards/mads/rdf/v1.html">RDF expression</a> to provide a path for migration of MADS data into the Semantic Web environment.  </span></strong></strong></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">MADS, like its parent USMARC Authority Format, focuses on preferred names for people and organizations, including variants, rather than on describing the person or organization more fully. As such it provides a useful ‘hook’ into library data referencing the person or organization, but is not really designed to accommodate the broader uses required for this project. </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">There is also a question about where this new RDF pathway for MADS might go, given the traditional boundaries of the MARC name authority world. In that tradition, names are added to the distributed file based on ‘literary warrant’, requiring that there be an object of descriptive interest which is by or about the person or organization that is a candidate for inclusion.  That is not a particularly useful basis for describing legislators, hearing witnesses, or others who have not written books or been the subject of them. Control of names and name variants will surely be necessary in the new web environment, and the extensive data and experience with the inherent problems of change in names will be essential, but not sufficient, for more widely-scoped projects like this one.</span></p>
<h2 dir="ltr"><span style="font-size: 19px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Groups vs. Organizations </span></h2>
<p><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Legislatures create myriad documents that must be identified and related to one another. For each of those documents, there are people and organizations fulfilling a variety of roles in the events the documents narrate, the creation of the documents themselves, the endorsement  of their conclusions, or the implementation of whatever those documents describe. Those people and organizations include not only legislators and the various committees and other sub-organizations of the legislature, but also the executive branch which, primarily through the President, exercises the final steps in the legislative process, as well as bearing responsibility for implementation. Finally, there are other parties, often outside government, who are involved in the legislative process as hearing witnesses or authors of committee prints, whose identity and organizational affiliations are essential to full description and interpretation. 
These latter present a particularly strong case for linked-data approaches, as they are unlikely to have any sort of formal description constructed for them by the legislature. The Congressional Biographical Dictionary is an excellent resource &#8212; but it is a dictionary of Congresspeople, not of all those who appear in Congressional documents. The latter would be impossible for any single entity to construct and maintain. But the task can be divided and conquered in concert with resources like the New York Times linked-data publishing effort, DBPedia, Freebase, and so on.</span></strong></strong></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">When discussing organizations, it is sometimes useful to distinguish between more and less formal groupings.  In the FOAF specification, that distinction is conceptualized in the categories “group” and “organisation”.  Generally, FOAF imagines that an “organisation” is a more formalized entity with fairly well defined memberships and descriptions, whereas a “group” is a more informal concept, intended to capture collections of agents where a strict specification of membership may be problematic or impossible.  In practice, the distinction tends to be a very blurry one, and seems to be a sort of summary calculation done on a number of dimensions:</span></p>
<p>&nbsp;</p>
<ul style="margin-top: 0pt; margin-bottom: 0pt;">
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">the temporal stability of the group itself, for example “people eating dinner at Tom&#8217;s house”, as opposed to “the House Judiciary Committee”;</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">the temporal stability of the group’s membership, which may be relatively fixed or constantly churning (“the Supreme Court” versus “the people waiting in the anteroom”);</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">the existence of institutional trappings such as office locations, meeting rooms and websites;</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">the level of “institutionalization” or “officialness”.  In the case of government institutions in any branch, that may often rest on some legal authority that establishes the group and describes its scope of operations (as with the Federal courts). It may also take the form of a single, very narrow capability (as when an agency is said to have “gift authority”).  Finally, it may also be established through tradition.  For example, the Congressional Black Caucus has existed for over 40 years, and occupies offices in the Longworth House Office Building, but has no formal existence in law.</span></li>
</ul>
<p><strong id="internal-source-marker_0.24581000371836126" style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><br />
<span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Because that distinction is so blurry, we have chosen to treat all organizations similarly, using common properties that allow users to determine how official the organization is by ‘following their noses’.  The accumulation of statement-level data about any of the dimensions listed above (or others, for that matter) serves as evidence. </span></strong><strong id="internal-source-marker_0.24581000371836126" style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Thus, users of the model are free to draw their own conclusions about the “officialness” of any collection of people, although a statutory or constitutional mandate might well be interpreted as dispositive.</span></strong></p>
<p><strong style="color: #000000; font-family: Times; font-size: medium; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We end here, as usual, with a few musical selections: <a href="http://www.youtube.com/watch?v=9-8gn6vGu_w">1</a>, <a href="http://www.youtube.com/watch?v=sNdOtaHniQc">2</a>, <a href="http://www.youtube.com/watch?v=Db0N4R5O-BU">3</a>. Next time: Events.</span></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.law.cornell.edu/metasausage/2012/07/17/modeling-people-and-organizations-for-legislative-information/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Identifiers, part 3</title>
		<link>http://blog.law.cornell.edu/metasausage/2012/06/11/identifiers-part-3/</link>
		<comments>http://blog.law.cornell.edu/metasausage/2012/06/11/identifiers-part-3/#comments</comments>
		<pubDate>Mon, 11 Jun 2012 18:53:54 +0000</pubDate>
		<dc:creator>Bruce Thomas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.law.cornell.edu/metasausage/?p=33</guid>
		<description><![CDATA[[This is part 3 of a three-part post on identifiers. Here are parts 1 and 2] How well does current practice measure up? To judge by the examples presented so far, current practice in legislative identifiers for US materials might best be described as “coping”, and specifically “coping in a way that was largely designed <a href='http://blog.law.cornell.edu/metasausage/2012/06/11/identifiers-part-3/'>[...]</a>]]></description>
				<content:encoded><![CDATA[<p><img style="float: left;" src="http://blog.law.cornell.edu/metasausage/files/2012/05/identifier-e1336301865368.jpg" alt="" width="125" height="125" />[<em>This is part 3 of a three-part post on identifiers. Here are parts <a href="http://blog.law.cornell.edu/metasausage/2012/05/07/identifiers-part-1/">1</a> and <a href="http://blog.law.cornell.edu/metasausage/2012/05/15/identifiers-part-2/">2</a></em>]</p>
<h2 style="font-weight: normal; font-size: medium; font-family: Times; display: inline !important;" dir="ltr"><strong><span style="font-family: Arial; vertical-align: baseline; white-space: pre-wrap; font-size: large;">How well does current practice measure up?</span></strong></h2>
<p><span id="internal-source-marker_0.8899018629454076"><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">To judge by the examples presented so far, current practice in legislative identifiers for US materials might best be described as “coping”, and specifically “coping in a way that was largely designed to deal with the problems of print”. Current practice presents a welter of &#8220;identifiers&#8221;, monikers, names, and titles, all believed by those who create and use them to be sufficiently rigorous to qualify as identifiers whether they are or not.  It might be useful to divide these into four categories:</span></span></p>
<ul style="font-weight: normal; font-size: medium; font-family: Times; margin-top: 0pt; margin-bottom: 0pt;">
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Well-understood</span><span style="vertical-align: baseline; white-space: pre-wrap;"> monikers, issued in predetermined ways as part of the legislative process by known actors.  Their administrative stability may well be the product of statutory requirement or of requirements embedded in House or Senate rules. Many of these will also correspond to definite stages in the legislative process. Examples would include House and Senate bill and resolution numbers.</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">Monikers </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">arising from need</span><span style="vertical-align: baseline; white-space: pre-wrap;"> and possibly semi-formalized, or possibly “bent” versions of monikers created for a purpose other than the one they end up serving.  Monikers of this kind are widely relied on, but nobody is really responsible for them.  Some end up being embedded in retrieval systems because they’re all there is.  A variety of such approaches are on display in the world of House committee prints.</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">Monikers </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">imposed after the fact</span><span style="vertical-align: baseline; white-space: pre-wrap;"> in an effort to systematize things or otherwise compensate for any deficiencies of monikers issued at earlier stages of the process.  Certainly internal database identifiers would fit this description; so would most official citation.</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">A grab-bag of other monikers. These might be created within government (as with GPO’s SuDoc numbers), or outside government altogether (as with accession numbers or other schemes that identify historical papers held in other libraries).  Here, a good model would provide a set of properties enabling others to relate their schemes to ours.</span></li>
</ul>
<h2 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Identifiers in a Linked Data context</span></h2>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">John Sheridan (of legislation.gov.uk) has <a href="http://liicr.nl/N4FN8o">written eloquently about the use of legislative Linked Data</a> to support the development of “accountable systems”. The key idea is that exposing legislative data using Linked Data techniques has particular informational and economic value when that data </span><span style="font-weight: normal; font-size: 15px; font-family: Arial; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">defines real-world objects</span><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;"> for legal purposes.  If we turn our attention from statutes to regulations, that value becomes even more obvious.</span></p>
<h3 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><strong><span style="font-family: Arial; vertical-align: baseline; white-space: pre-wrap; font-size: large;">Valuable features of Linked Data approaches to legislative information</span></strong></h3>
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Ability to reference real-world objects</span></h4>
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-size: 16px; font-family: Arial; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">“</span><span style="font-size: 15px; font-family: Arial; font-weight: normal; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">On the Semantic Web, URIs identify not just Web documents, but also real-world objects like people and cars, and even abstract ideas and non-existing things like a mythical unicorn. We call these real-world objects or things.” &#8212; <a href="http://www.w3.org/TR/cooluris/#semweb">Leo Sauermann and Richard Cyganiak, “Cool URIs for the Semantic Web”</a></span></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">There are no unicorns in the United States Code. Nevertheless, legislative data describes and references many, many things.  More, it provides fundamental definitions of how those things are seen by Federal law.  It is valuable to be able to expose such definitions &#8212; and other fundamental information &#8212; in a way that allows it to be related to other collections of information for consumption by a global audience.</span></p>
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Avoiding cumbersome standards-building processes</span></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">In <a href="http://www.jenitennison.com/blog/node/140">a particularly insightful blog post that discusses the advantages of the Linked Data methods</a> used in building legislation.gov.uk, Jeni Tennison points out the ability that RDF and Linked Data standards have to solve a longstanding problem in government information systems: the social problem of standard-setting and coordination:</span></p>
<p style="font-weight: normal; font-size: medium; font-family: Times; margin-left: 36pt; margin-top: 0pt; margin-bottom: 0pt;" dir="ltr"><span style="font-size: 15px; font-family: Arial; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">RDF has this balance between allowing individuals and organisations complete freedom in how they describe their information and the opportunity to share and reuse parts of vocabularies in a mix-and-match way. This is so important in a government context because (with all due respect to civil servants) we really want to avoid a situation where we have to get lots of civil servants from multiple agencies into the same room to come up with the single government-approved way of describing a school. We can all imagine how long that would take.</span></p>
<p style="font-weight: normal; font-size: medium; font-family: Times; margin-left: 36pt; margin-top: 0pt; margin-bottom: 0pt;" dir="ltr"><span style="font-size: 15px; font-family: Arial; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">The other thing about RDF that really helps here is that it’s easy to align vocabularies if you want to, post-hoc. </span><a href="http://www.w3.org/TR/rdf-schema/"><span style="font-size: 15px; font-family: Arial; color: #000099; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">RDFS</span></a><span style="font-size: 15px; font-family: Arial; font-style: italic; vertical-align: baseline; white-space: pre-wrap;"> and </span><a href="http://www.w3.org/TR/owl-overview/"><span style="font-size: 15px; font-family: Arial; color: #000099; font-style: italic; vertical-align: baseline; white-space: pre-wrap;">OWL</span></a><span style="font-size: 15px; font-family: Arial; font-style: italic; vertical-align: baseline; white-space: pre-wrap;"> define properties that you can use to assert that this property is really the same as that property, or that anything with a value for this property has the same value for that other property. This lowers the risk for organisations who are starting to publish using RDF, because it means that if a new vocabulary comes along they can opportunistically match their existing vocabulary with the new one. It enables organisations to tweak existing vocabularies to suit their purposes, by creating specialised versions of established properties.</span></p>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">While Tennison’s remarks here concentrate on vocabularies, a similar point can be made about identifier schemes; it is easy to relate multiple legacy identifiers to a “gold standard”.</span></p>
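<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">As a minimal sketch of what that could look like (not a description of any existing system): legacy identifiers are tied to one gold-standard URI by sameAs-style statements, and a lookup simply follows those statements.  Every URI under example.org below is invented for illustration, and the triples are plain Python tuples rather than the output of an RDF library.</span></p>

```python
# Hypothetical sketch: relate several legacy identifiers to one gold-standard URI.
# All example.org URIs are invented; only the owl:sameAs property URI is real.
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

GOLD = "http://example.org/id/bill/112/hr/1234"

# Each statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("http://example.org/legacy/system-a/112-hr-1234", SAME_AS, GOLD),
    ("http://example.org/legacy/system-b/0004213", SAME_AS, GOLD),
]


def to_gold(identifier, triples=TRIPLES):
    """Return the gold-standard URI asserted for a legacy identifier.

    If no mapping has been asserted, the identifier is returned unchanged.
    """
    for subject, predicate, obj in triples:
        if subject == identifier and predicate == SAME_AS:
            return obj
    return identifier


def legacy_for(gold, triples=TRIPLES):
    """List every legacy identifier asserted to be the same as a gold URI."""
    return sorted(s for s, p, o in triples if p == SAME_AS and o == gold)
```

<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Because the mapping is just a set of statements, a new legacy scheme can be added after the fact without touching the gold-standard identifiers themselves.</span></p>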
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><strong><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Layering and API-building</span></strong></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Well-designed, URI-based identifier schemes create APIs for the underlying data.  At the moment, the leading example for legislative information is the scheme used by legislation.gov.uk, described in summary at </span><a style="font-weight: normal; font-size: medium; font-family: Times;" href="http://data.gov.uk/blog/legislationgovuk-api"><span style="font-size: 15px; font-family: Arial; color: #000099; vertical-align: baseline; white-space: pre-wrap;">http://data.gov.uk/blog/legislationgovuk-api</span></a><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">  and in detail in a collection of developer documentation linked from that page.  Because a URI is resolvable, functioning as a sort of retrieval hook, it is also the basis of a well-organized scheme for accessing different facets of the underlying information.  </span><a style="font-weight: normal; font-size: medium; font-family: Times;" href="http://legislation.gov.uk/"><span style="font-size: 15px; font-family: Arial; color: #000099; vertical-align: baseline; white-space: pre-wrap;">legislation.gov.uk</span></a><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">  uses a three-layer system to distinguish the abstract identity of a piece of legislation from its current online expression as a document and from a variety of format-specific representations.  </span></p>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">That is an inspiring approach, but we would want to extend it to encompass point-in-time as well as point-in-process identification (such as being able to retrieve all of the codified fragments of a piece of legislation as codified, using its original bill number, popular name, or what-have-you).  At the moment, </span><a style="font-weight: normal; font-size: medium; font-family: Times;" href="http://legislation.gov.uk/"><span style="font-size: 15px; font-family: Arial; color: #000099; vertical-align: baseline; white-space: pre-wrap;">legislation.gov.uk</span></a><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;"> does this only via search, but the recently announced Dutch statutory collection at </span><a style="font-weight: normal; font-size: medium; font-family: Times;" href="http://doc.metalex.eu/"><span style="font-size: 15px; font-family: Arial; color: #000099; vertical-align: baseline; white-space: pre-wrap;">http://doc.metalex.eu/</span></a><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;"> does support some point-in-time features.   It is worth pointing out that the American system presents greater challenges than either of these,  because of our more chaotic legislative drafting practices, the complexity of the legislative process itself, and our approach to amendment and codification.</span></p>
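<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">To make the layering idea concrete, here is a small sketch of a three-layer URI scheme with an optional point-in-time segment.  The base URL, the path layout, and the function names are all assumptions made for illustration; they are not taken from any publisher's documented API.</span></p>

```python
# Hypothetical sketch of a three-layer URI scheme. The base URL and the
# path layout are assumptions for illustration only.
BASE = "http://example.org"


def identifier_uri(doc_type, year, number):
    """URI for the abstract piece of legislation (the thing itself)."""
    return f"{BASE}/id/{doc_type}/{year}/{number}"


def document_uri(doc_type, year, number, as_of=None):
    """URI for an online expression of the legislation as a document.

    as_of is an optional ISO date string selecting a point-in-time version.
    """
    uri = f"{BASE}/{doc_type}/{year}/{number}"
    return f"{uri}/{as_of}" if as_of else uri


def representation_uri(doc_type, year, number, fmt, as_of=None):
    """URI for one format-specific representation, such as XML or RDF."""
    return f"{document_uri(doc_type, year, number, as_of)}/data.{fmt}"
```

<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">In this sketch, representation_uri("act", 2012, 7, "xml", as_of="2012-06-11") names the XML form of the document as it stood on a given date, while identifier_uri("act", 2012, 7) stays the same no matter which version or format is requested.</span></p>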
<h3 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><strong><span style="font-family: Arial; vertical-align: baseline; white-space: pre-wrap; font-size: large;">Identifier challenges arising from Linked Data (and Web exposure generally)</span></strong></h3>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">The idea that we would publish legislative information using Linked Data approaches has obvious granularity implications (see above), but there are others that may prove more difficult.  Here we discuss three:  uniqueness over wider scope, resolvability, and the practical needs of “identifier manufacturing”:</span></p>
<h4 style="font-size: medium; font-family: Times; display: inline !important;" dir="ltr"><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Uniqueness over wider scope</span></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Many of the identifiers developed in the closed silo of the world of legal citation could be reused as URIs in a linked data context, exposing them to use and reuse in environments outside the world where legal citation has developed.  In the open world, identifiers need to carry their context with them, rather than have that context assumed or dependent on bespoke processes for resolution or access.   For the most part, citation of judicial opinions survives wide exposure in fair style.  Other identifiers used for government documents do not cope as well.   Above, we mentioned bill numbers as being limited in chronological scope; other identifiers (particularly those that rely heavily on document titles or dates as the sole means of distinction from other documents in the same corpus) may not fare well either.</span></p>
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Resolvability</span></h4>
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><strong id="internal-source-marker_0.8899018629454076" style="font-weight: normal;"><span style="font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">The differences between URNs (Uniform Resource Names) and URLs (Uniform Resource Locators, the URIs based on the HTTP protocol) are significant.  Wikipedia notes that URNs are similar to personal names and URLs to street addresses; the former rely on resolution services to function.  In many cases, URNs can provide the basis for URLs, with resolution built into the http address, but in the world we’re now working in, URNs must be seen as insufficient for creating linked open data.</span></strong></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">In reality, they have different goals.  URIs provide resolvability &#8212; that is, the ability to actually find your way to an information resource,  or to information about a real-world thing that is not on the web.  As Jeni Tennison remarks in her blog, they do that at the expense of creating a certain amount of ambiguity.  Well-designed URN schemes, on the other hand, can be unambiguous in what they name, particularly if they are designed to be part of a global document identification scheme from the beginning, as they are in <a href="http://tools.ietf.org/html/draft-spinosa-urn-lex-06">the emerging URN:Lex specification</a>.   </span></p>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">For our purposes, we probably want to think primarily in terms of URIs, but (as with legacy identifier schemes) there will be advantages to creating sensible linkages between our system, which emphasizes resolvability, and others that emphasize a lack of ambiguity and coordination with other datasets.  </span></p>
<h4 style="font-size: medium; font-family: Times; display: inline !important;" dir="ltr"><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Things not on the Web</span></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Legislation is created by real people and it acts on real things.  It is incredibly valuable to be able to relate legislative documents to those things.  The challenge lies, as it always has,  in eliminating ambiguity about which object we are talking about.  A newer and more subtle need is the need to distinguish references to the real-world object itself from references to representations of the object on the web.  The problems of distinguishing one John Smith from another are already well understood in the library community.  URIs present a new set of challenges.  For instance, we might want to think about how we are to correctly interpret a URI that might refer to John Smith, the off-web object that is the person himself, and a URI that refers to the Wikipedia entry that is (possibly one of many) on-web representations of John Smith.  This presents <a href="http://www.jenitennison.com/blog/node/159">a variety of technical challenges that are still being resolved</a>. </span></p>
<h4 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-size: 16px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Practical manufacturing and assignment of Web-oriented identifiers</span></h4>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Thinking about the highly granular approach needed to make legislative data usefully recombinant &#8212; as suggested in the section on fragmentation and recombination above &#8212; quickly leads to practical questions about where all those granular identifiers will come from. The problem becomes more acute when we begin to think about retrofitting such schemes to large bodies of legacy information.  For these among other reasons, the ability to manufacture and assign high-quality identifiers by automated means has become the Philosopher’s Stone of digital legal publishers.  It is not that easy to do.  </span></p>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">The reasons are many, and some arise from design goals that may not be shared by everyone, or from misperceptions about the data.  For example, it’s reasonable to assume that a sequence of accession numbers represents a chronological sequence of some kind, but as we’ve already seen, that’s not always the case.  Legacy practices complicate this.  For example, it would be interesting to see how the sequence of Supreme Court cases for which we have an exact chronological record (via file datestamping associated with electronic transmission) corresponds to their sequence as officially published in printed volumes.  It may well be that sequence in print has been dictated as much by page-layout considerations as by chronology.  It might well be that two organizations assigning sequential identifiers to the same corpus retrospectively would come up with a different sequence.</span></p>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Those are the problems we encounter in an identifier scheme that is, theoretically, content-independent.  Content-dependent schemes can be even more challenging.  Automatic creation of identifiers typically rests on the automated extraction of one or more document features that can be concatenated to make a unique identifier of wide scope.  There are some document collections where that may be difficult or impossible, either because there is no combination of extractable document features that will result in a unique identifier, or because legacy practices have somehow obliterated necessary information, or because it is not easy to extract the relevant features by automated means.  We imagine that retroconversion of House Committee prints would present exactly this challenge.  </span></p>
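<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">The two failure modes described here (no unique combination of features, and features that cannot be extracted at all) are easy to test for once candidate features are in hand.  The following is a rough sketch under stated assumptions: the feature names, the record layout, and the accession field are invented for illustration and do not describe any real collection.</span></p>

```python
from collections import defaultdict

# Hypothetical sketch: build identifiers by concatenating extracted features,
# then report collisions and documents that cannot be identified at all.
# The feature names below are invented for illustration.
FEATURES = ("chamber", "committee", "congress", "serial")


def make_identifier(doc, features=FEATURES):
    """Join the chosen features with '/'; return None if any is missing."""
    parts = [doc.get(name) for name in features]
    if any(part in (None, "") for part in parts):
        return None
    return "/".join(str(part).strip().lower() for part in parts)


def audit(docs, features=FEATURES):
    """Return (collisions, unidentified) for a list of document records.

    collisions maps each identifier shared by several documents to their
    accession numbers; unidentified lists accession numbers of documents
    for which no identifier could be built.
    """
    seen = defaultdict(list)
    unidentified = []
    for doc in docs:
        ident = make_identifier(doc, features)
        if ident is None:
            unidentified.append(doc["accession"])
        else:
            seen[ident].append(doc["accession"])
    collisions = {i: accs for i, accs in seen.items() if len(accs) != 1}
    return collisions, unidentified
```

<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Running an audit like this over a sample of a corpus gives a quick indication of whether a proposed feature combination is workable before anyone commits to it.</span></p>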
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">At the same time, it is worth remembering that the technologies available for extracting document features are improving dramatically, suggesting that a layered, incremental approach might be rewarded in the future.  While the idea of “graceful degradation” seems at first blush to be less applicable to identifiers than to other forms of metadata, it is possible to think about the problem a little differently in the context of corpus retroconversion.  That is a complicated discussion, but it seems possible that the use of provisional, accession-based identifiers within a system of properties and relationships designed to accommodate incomplete knowledge about the document might yield good results.</span></p>
<h2 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">A final note on economics</span></h2>
<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Identifiers have special value in an information domain where authority is as important as it is for legal information.  In the event of disputes, parties need to be able to definitively identify a dispositive, authoritative version of a statute, regulation, or other legal document.  There is, then, a temptation toward a soft monopoly in identifiers: the idea that there should be a definitive, authoritative copy somewhere leads to the idea of a definitive, authoritative identifier administered by a single organization. Very often, challenges of scale and scope have dictated that that be a commercial publisher.  Such a scheme was followed for many years in the citation of judicial opinions, resulting in an effective monopoly for one publisher.  That is proving remarkably difficult and expensive to undo, even though it has had serious cost implications and other detrimental effects on the legal profession and for the public.  Care is needed to ensure that the soft, natural monopoly that arises from the creation of authoritative documents by authoritative sources does not harden into real impediments to the free flow of public information, as it did in the case of judicial opinions.</span></p>
<h2 style="font-weight: normal; font-size: medium; font-family: Times;" dir="ltr"><span style="font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">What we recommend</span></h2>
<p><span style="font-weight: normal;">This is not a complete set of general recommendations &#8212; really more a series of guideposts or suggestions, to be altered and tempered by institutional realities:</span></p>
<ul style="font-weight: normal; font-size: medium; font-family: Times; margin-top: 0pt; margin-bottom: 0pt;">
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">At the most fundamental level, </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">everything should have an identifier</span><span style="vertical-align: baseline; white-space: pre-wrap;">. It should be available for use by the public. For example, Congressional committee reports appear not to have any identifiers, but it would be reasonable to assume that some system is in use in the background, at least for their publication by GPO.</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">Many legacy identifier systems will need to be </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">extended or modified to create a gold standard system</span><span style="vertical-align: baseline; white-space: pre-wrap;">, probably issued by a third party and not by the document creators themselves.  That is especially the case because there is nobody in a position to compel good practice by document creators over the long term.  Such a gold standard will need to be:</span>
<ul style="margin-top: 0pt; margin-bottom: 0pt;">
<li style="list-style-type: circle; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Unambiguous</span><span style="vertical-align: baseline; white-space: pre-wrap;">. For example, existing bill and resolution numbers would need to be extended by, e.g., a date of introduction.</span></li>
<li style="list-style-type: circle; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Designed to resist tampering</span><span style="vertical-align: baseline; white-space: pre-wrap;">. When things are numbered and labelled, there is a temptation to alter numbers and labels to serve short-term interests.  The reservation of “important” bill numbers under House procedural rules is an example; another (from the executive branch) is the long-standing practice of manipulating Regulation Identifier Numbers (RINs) to color assessments of agency activity.</span></li>
<li style="list-style-type: circle; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Clear as to the separation </span><span style="vertical-align: baseline; white-space: pre-wrap;">of titling, dating, and identification functions.  Presidential documents provide a good example of something currently needing improvement in this respect.</span></li>
<li style="list-style-type: circle; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">Taking advantage of </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">carefully designed relationships</span><span style="vertical-align: baseline; white-space: pre-wrap;"> among identifiers to allow the retention of well-understood legacy monikers for foreground use, while making use of a well-structured “gold standard” from the beginning.  Those relationships should enable automated linkage that will allow retrieval across multiple, related identifier systems.</span></li>
</ul>
</li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="vertical-align: baseline; white-space: pre-wrap;">Where possible, </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">retain useful semantics</span><span style="vertical-align: baseline; white-space: pre-wrap;"> in identifiers as a way of increasing access and reducing errors.  It is possible that different audiences will require different semantics, making this unlikely to happen in the background, but it should be possible to retain this functionality in the foreground.</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Maintain granularity at the level of common citation and cross-referencing practice</span><span style="vertical-align: baseline; white-space: pre-wrap;">, but with a distinction between identifiers and labels.  </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Identifiers</span><span style="vertical-align: baseline; white-space: pre-wrap;"> should be assigned at the whole-document level, with the notion of “whole document” determined on a corpus-by-corpus basis.  </span><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Labels</span><span style="vertical-align: baseline; white-space: pre-wrap;"> may be assigned to subdocuments (e.g., a section of a bill) for purposes of navigation and retrieval.  This is similar in function and purpose to the distinction between HREF and NAME attributes in HTML anchor tags.</span></li>
<li style="list-style-type: disc; font-size: 15px; font-family: Arial; vertical-align: baseline;"><span style="font-style: italic; vertical-align: baseline; white-space: pre-wrap;">Use a layered approach</span><span style="vertical-align: baseline; white-space: pre-wrap;">.  In our view, it is important not to hold future systems hostage to what is practicable in legacy document collections.  In general, it will be much harder to implement good practices over documents that were not “born digital”.  That is not a good reason to water down our prospective approach, but it is a good reason to design systems that degrade gracefully when it becomes difficult or impossible to deal with older collections. That is particularly true at a time when the technologies for extracting metadata from legacy documents are improving dramatically, suggesting that a layered, incremental approach might produce great gains in the future.</span></li>
</ul>
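<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Two of the recommendations above, extending bill numbers with a date of introduction and separating whole-document identifiers from subdocument labels, can be sketched in a few lines.  This is only an illustration: the base URI, the path layout, the label syntax, and the dates used in the comments are assumptions, not a proposal for the actual scheme.</span></p>

```python
from datetime import date

# Hypothetical sketch: the base URI and path layout are invented for illustration.
BASE = "http://example.org/id/bill"


def bill_identifier(chamber, number, introduced):
    """Whole-document identifier: chamber and number, qualified by the
    date of introduction so that reused bill numbers stay distinct."""
    return f"{BASE}/{introduced.isoformat()}/{chamber.lower()}/{number}"


def with_label(identifier, label):
    """Attach a subdocument label (for example, a section) as a URI fragment,
    much as a NAME anchor is addressed from an HREF in HTML."""
    return f"{identifier}#{label}"


# The same bill number introduced on two different (illustrative) dates
# yields two different identifiers.
first = bill_identifier("HR", 1, date(2009, 1, 6))
second = bill_identifier("HR", 1, date(2011, 1, 5))
```

<p><span style="font-weight: normal; font-size: 15px; font-family: Arial; vertical-align: baseline; white-space: pre-wrap;">Here the identifier names the bill as a whole, and with_label(first, "sec-2") only navigates within it; the label carries no identity of its own.</span></p>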
<p><em>We conclude, as always, with a <a href="http://www.youtube.com/watch?v=R0mylMh__Sc">musical selection</a> or <a href="http://www.youtube.com/watch?v=eYOC4d9YN34">two</a>.  Next time, some stuff about people and organizations as we find them in the legislative world.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.law.cornell.edu/metasausage/2012/06/11/identifiers-part-3/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Identifiers, Part 2</title>
		<link>http://blog.law.cornell.edu/metasausage/2012/05/15/identifiers-part-2/</link>
		<comments>http://blog.law.cornell.edu/metasausage/2012/05/15/identifiers-part-2/#comments</comments>
		<pubDate>Tue, 15 May 2012 15:29:55 +0000</pubDate>
		<dc:creator>Bruce Thomas</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Federal legislation]]></category>
		<category><![CDATA[identifiers]]></category>

		<guid isPermaLink="false">http://blog.law.cornell.edu/metasausage/?p=20</guid>
		<description><![CDATA[[ Part 2 in a 3 part series. Last time we talked about some general characteristics of identifiers for legislation, and some sources of confusion in legacy systems. This time: some design problems having to do with granularity and use, and the fact that identifiers are situated in legal and bureaucratic process. ] Identifier granularity <a href='http://blog.law.cornell.edu/metasausage/2012/05/15/identifiers-part-2/'>[...]</a>]]></description>
				<content:encoded><![CDATA[<p style="font-size: 9px;"><span style="font-size: medium; font-family: Arial; color: #000000; background-color: transparent; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><em><a href="http://blog.law.cornell.edu/metasausage/files/2012/05/identifier-e1336301865368.jpg"><img class="alignleft size-full wp-image-11" title="identifier" src="http://blog.law.cornell.edu/metasausage/files/2012/05/identifier-e1336301865368.jpg" alt="The Prisoner: Number 6" width="125" height="125" /></a>[ Part 2 in a 3 part series. <a href="http://blog.law.cornell.edu/metasausage/2012/05/07/identifiers-part-1/">Last time we talked about some general characteristics of identifiers for legislation</a>, and some sources of confusion in legacy systems. This time: some design problems having to do with granularity and use, and the fact that identifiers are situated in legal and bureaucratic process. ]</em></span></p>
<p dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Identifier granularity</span></p>
<p><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">How small a thing should we try to identify? It’s difficult to make general prescriptions about that, for needs vary from corpus to corpus.  For the most part, we assume that identifier granularity should follow common citation or cross-referencing practice &#8212; that is, the smallest thing we identify or label should be the smallest thing that standard citation practice would allow a user to navigate to.  That will vary from collection to collection, and from context to context. For example, it’s quite common for citation to the US Code to refer to objects at the subsection level, sometimes right down to the paragraph level.  On the other hand, references to the Code in the Parallel Table of Authorities and Rules generally refer to a full section.  Similarly, although cross-references within the Code of Federal Regulations can be very granular, external references typically target the Part level. 
In any corpus, amendments can be expressed in ways that are very granular indeed.</span></strong></strong></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Our citation and cross-referencing practices have evolved in the context of print, and we may be able to do things that better reflect the dynamic nature of legislative text.  The move from print to digital overturns background assumptions about practicality.  For example, print typically makes different assumptions about identifier stability than you would find, say, in an online legislative drafting system.  Good examples of this are found in citation practice for the Code of Federal Regulations, which typically cites material at the Part level because (one imagines) changes in numbering and naming of sections are so frequent as to render identifiers tied to such fine divisions unstable &#8212; at least in print, where the shelf life of such fine-grained identifiers is shorter than the shelf life of the edition by an order of magnitude. In a digital environment, it is possible to manage identifiers more closely, permitting graceful failure of those that are no longer valid, and providing automated navigation to things that have moved. We look at some of the possibilities and implications in sections on granularity, fragmentation, and recombination below.  All of those capabilities carry costs, and over-design is a real possibility.</span></p>
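<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">To make "graceful failure" and "automated navigation to things that have moved" a little more concrete, here is a minimal Python sketch of one way a resolver might behave.  It is an illustration only, not a description of any deployed system: the identifier strings, the three lookup tables, and the names <code>resolve</code> and <code>Resolution</code> are all invented for the example.</span></p>

```python
from dataclasses import dataclass
from typing import Optional

# All identifiers and tables below are hypothetical illustration data.
CURRENT = {
    "example/part-1",
    "example/part-1/sec-1.1",
    "example/part-1/sec-1.2",
}
# Forwarding records: old identifier -> where the material now lives.
MOVED = {"example/part-1/sec-1.5": "example/part-1/sec-1.2"}
# Retired identifiers fall back to the nearest stable container (here, the Part).
RETIRED = {"example/part-1/sec-1.9": "example/part-1"}

MAX_HOPS = 10  # assumed guard against accidental forwarding cycles


@dataclass(frozen=True)
class Resolution:
    status: str            # "current", "moved", "retired", or "unknown"
    target: Optional[str]  # identifier to navigate to, if there is one


def resolve(identifier: str) -> Resolution:
    """Resolve an identifier, following moves and failing gracefully."""
    hops = 0
    was_moved = False
    # Follow forwarding records until we reach something that has not moved.
    while identifier in MOVED and hops != MAX_HOPS:
        identifier = MOVED[identifier]
        was_moved = True
        hops += 1
    if identifier in CURRENT:
        return Resolution("moved" if was_moved else "current", identifier)
    if identifier in RETIRED:
        # Graceful failure: send the reader to the enclosing container.
        return Resolution("retired", RETIRED[identifier])
    return Resolution("unknown", None)


print(resolve("example/part-1/sec-1.5"))
print(resolve("example/part-1/sec-1.9"))
```

<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The fallback to the enclosing Part mirrors the print habit of citing at the Part level, while still letting fine-grained identifiers work for as long as they remain valid.  Maintaining the forwarding and retirement tables is exactly the kind of cost mentioned above.</span></p>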
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Metadata, markup, and embedding</span></h2>
<p><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Thinking about granularity leads to ideas about the linkages between metadata and the target object itself.  Often metadata applies to chunks of documents rather than whole documents.  Cross-referencing in statutes and legislation is usually done at the subdocument level, for instance, and subject-matter classification of a bill containing multiple unrelated provisions would be better if the subject classifications could be separately tied to specific provisions within the bill. That need becomes particularly acute when something important, but unrelated to the main purpose of the bill, has been “snuck in” to a much larger piece of legislation.  A stunning example of such a Frankenstein’s monster appears at <a href="http://www.gpo.gov/fdsys/pkg/PLAW-111publ226/pdf/PLAW-111publ226.pdf">111 Pub. L. 226</a>. 
It is described in its preamble as modernizing the air-traffic control system, but its first major Title heading describes it as an “Education Jobs Fund”,  and its second major Title contemplates highly technical apparatus for providing fiscal relief to state governments.</span></strong></strong></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">We are aware that sometimes we are thinking in terms that are not currently supported by the markup of documents in existing XML-creating systems.  However, we think it makes sense to design identifier systems that are more capable than some document collections will currently support via markup, in the expectation that markup in those collections will evolve to the same granularity as current cross-referencing and citation practice, and that point-in-time systems supporting the full lifecycle of legislative drafting, passage, and codification will become the norm.  Right now, divisions of statutory and regulatory text below the section level (“subsection containers”) are among the most prominent examples of “missing markup”; they are provided for in the <a href="http://xml.house.gov/bill.html">legislative XML DTDs at (e.g.) xml.house.gov</a>, but do not survive into the FDsys versions from GPO.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Most often, we imagine that the flow of document processing leads from markup to metadata, since as a practical matter a lot of metadata is generated simply by extracting text features that have been tagged with some XML or HTML element.  Sometimes the flow is in the other direction; we may want to embed metadata in the documents for various purposes.  Use of microformats, microdata, and other such schemes can be handy for various applications; the use of <a href="http://citationstylist.org/announcements/">research-management software like Zotero</a>, or the embedding of information about document authenticity comes to mind.  These are not part of a legislative data model per se, but represent use cases worth thinking about.</span></p>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Stresses and strains</span></h2>
<p dir="ltr"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Next, we turn to things that affect the design of real-world identifier systems, perhaps rendering them less “pure” in information-design terms than we might like.</span></p>
<h3 dir="ltr"><span style="font-size: 19px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Semantics versus purity</span></h3>
<p dir="ltr"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Some systems enforce notions of identifier purity &#8212; often defined as some combination of uniqueness, orderliness, and ease of collation and sorting &#8212; by rigorously stripping all semantics from identifiers.  That is an approach that can function reasonably well in back-end systems, but greatly reduces the usefulness of the identifiers to humans (because understanding what the identifier identifies requires database reflection), and introduces extra possibilities for error in application because (among other reasons) errors caused by human transcription are hard to catch when the identifiers are meaningless strings of letters and numbers.  On the other hand, “pure” opaque identifiers counter a tendency to assume that one knows what a semantically laden identifier means, when in fact one might not.  And sometimes opaque identifiers can be used to provide stability in situations where labels change frequently but the labelled objects do not.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">At the other end of the spectrum, identifier systems that are heavily burdened with semantics have problems with uniqueness, length, persistence, language, and other issues arising from inherent ambiguity of labels and other home-brewed identifier components.  It is worth remembering, too, that one person’s helpful semantics are another’s mumbo-jumbo; just walk up to someone at random and ask them the dates of the 75th Congress if you need proof of that. Useful systems find a middle ground between extremes of incomprehensible rigor and mindlessly verbose recitation of loosely-constructed labels. </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">It’s worth noting in passing that it can be very difficult to prevent the unwanted exposure of “back-end” identifiers to end users.  For example, URIs constructed from back-end components often find their way into the browser bars of authors researching online, who then paste them into documents that would be better served by more brain-compatible, human-digestible versions. </span></p>
<div dir="ltr">
<table style="border-width: initial; border-color: initial; border-image: initial; border-collapse: collapse; width: 624px; border-style: none;">
<colgroup>
<col width="*" />
<col width="*" />
<col width="*" /></colgroup>
<tbody>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Moniker type</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Identifier</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Notes</span></td>
</tr>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Citation</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">18 USC 47</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Standard citation ignores all but Title and section number; intermediate aggregations not needed, and confusing.</span></td>
</tr>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Popular name</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Wild Horse Annie Act</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-family: Arial; font-size: 14px;">Descriptive and often used in popular accounts, the press, agency guidance on related rules, etc., but hard to find in the codified version.</span></td>
</tr>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">LII URI, “presentable” version</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">http://www.law.cornell.edu/uscode/18/47.html</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Based on title and section number</span></td>
</tr>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">LII URI, “formal” version</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><a href="http://www.law.cornell.edu/uscode/18/usc_sec_18_00000047----000-.html"><span style="font-size: 13px; font-family: Arial; color: #0000cc; background-color: #ffff00; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://www.law.cornell.edu/uscode/18/usc_sec_18_00000047----000-.html</span></a></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Also title and section based, but padded and normalized to allow proper collation; “supersection” aggregations above the section level are similarly disambiguated.</span></td>
</tr>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">USGPO URI, GPOAccess</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">http://frwebgate.access.gpo.gov/cgi-bin/getdoc.cgi?dbname=browse_usc&amp;docid=Cite:+18USC47</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Parameterized search returning 1 result.</span></td>
</tr>
<tr style="height: 0px;">
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">FindLaw URI</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">http://codes.lp.findlaw.com/uscode/18/I/3/47</span></td>
<td style="border-image: initial; vertical-align: top; border-width: 1px; border-color: #aaaaaa; border-style: dotted; padding: 7px;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Seemingly mysterious, because it interjects subtitle and part numbering, which are not used in citation.  Note that this hierarchy would also vary from Title to Title of the Code &#8212; not all have Subtitles, for example.</span></td>
</tr>
</tbody>
</table>
</div>
<p dir="ltr"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The table above shows some “monikers in the wild” &#8212; various real-world approaches to the problem of identifying a particular section of the US Code.  The “formal” LII identifier, highlighted in yellow, shows just how elaborate an identifier needs to be if it is to accommodate all the variation that is present in <a href="http://www.law.cornell.edu/wiki/lexcraft/section_identifiers_lii">US Code section numbering</a> (there is, for example, a 12 USC 1749bbb-10c), while still supporting collation.  The FindLaw URI demonstrates the fragility of hierarchical schemes; the intermediate path components would vary enormously from Title to Title, and occasionally lead to some confusion about structure, as intermediate levels of aggregation are called different things in different Titles. It is hard to tell, for example, whether FindLaw interpolates &#8220;missing&#8221; levels into the URIs in order to maintain an identical scheme across Titles with dissimilar &#8220;supersection&#8221; structure.</span></p>
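<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Why pad at all?  A short Python sketch may help show the collation problem that padding addresses.  To be clear, this is not the LII scheme (we do not reproduce its rules here); it is a simplified stand-in that zero-pads each run of digits to an assumed fixed width so that plain string sorting matches the order a reader would expect.  The function name <code>collation_key</code>, the width of 8, and every section number in the sample list other than 47 and 1749bbb-10c are made up for illustration.</span></p>

```python
import re

WIDTH = 8  # assumed padding width; chosen only for this illustration


def collation_key(section: str) -> str:
    """Zero-pad every run of digits so string order follows numeric order."""
    # The capturing group makes re.split keep the digit runs in the result.
    parts = re.split(r"(\d+)", section.lower())
    return "".join(p.zfill(WIDTH) if p.isdigit() else p for p in parts)


# Sample section numbers; only "47" and "1749bbb-10c" come from the text.
sections = ["100", "47", "1749bbb-10c", "1749bbb-9", "1749"]

naive = sorted(sections)                       # plain string comparison
collated = sorted(sections, key=collation_key)  # padded comparison

print(naive)     # "100" sorts ahead of "47"
print(collated)  # "47" now comes first
print(collation_key("1749bbb-10c"))
```

<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Even this toy version produces keys that nobody would want to type or read aloud, which is the trade-off between collation and human legibility that the table illustrates.</span></p>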
<h3 dir="ltr"><span style="font-size: 19px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Administrative zones of control and procedural rules</span></h3>
<p dir="ltr"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Every identifier implies a zone of administrative control:  somebody has to assign it, somebody has to ensure its uniqueness, and somebody or something has to resolve it to an actual document location, physical or electronic.  Though it has taken years, the community has recognized that <a href="http://www.w3.org/2001/tag/doc/URNsAndRegistries-50">qualities of persistence and uniqueness are primarily created by administrative rather than technical apparatus</a>.  That becomes a much more critical factor when dealing with government documents, which may be surrounded by legal restrictions on who may assign identifiers and when, and in some cases what the actual formats must be.  A legislative document may have its roots in ideas and policies formed well outside government, and pass through numerous internal zones of control as it makes its way through the legislature. It may emerge at the other end via a somewhat mysterious intellectual process in which it is blown to bits and the fragments reassigned to a coherent, but altogether different, intellectual structure with its own system of identifiers (we call this ‘codification’).  There may be internal or external requirements that, at various points in the process, cause the document to be expressed in a variety of publications and formats each carrying its own system of citations and identifiers.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The legacy process, then, is an accretive one in which an object acquires multiple monikers from multiple sources, each with its own requirements and rules.  Sometimes those requirements and rules are shaped by concerns that are outside, and perhaps at odds with, sound information-organization practice.  </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">For example, the House and Senate each have their own rules of procedure, in which bill numbering is specified.  Bill numbers are usually accession numbers that reset with each new Congress, but the rules of procedure create exceptions.  Under the rules of the House for  the 106th Congress, the first ten bill numbers were reserved for use by the Speaker of the House for a specified time period. During the 107th and 108th Congresses (at least), the time period was extended to the full first session.  We surmise that this may have represented an attempt to reserve “important” bill numbers for things important to the majority party’s legislative agenda.  Needless to say, this rendered any relationship between bill numbers and chronology or order of introduction questionable, at least in a limited number of cases. The important point is that identifier usage will be hostage to political considerations for as long as it is controlled by rules of procedure; that situation is not likely to change.  </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">But there are also virtues to the legacy process, primarily because close association with long-standing institutional practices lends long-term stability to identifier schemes.  Bill numbers have institutional advocates, are well understood, and are unlikely to change very much in deployment or format. They provide good service within their intended scope, however much they may lose when taken outside it.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">That being said, a “gold standard” system of identifiers, specified and assigned by a relatively independent body, is needed at the core.  That gold standard can then be extended via known, stable relationships with existing identifier systems, and designed for extensible use by others outside the immediate legislative community.</span></p>
<h3 dir="ltr"><span style="font-size: 19px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Status, tracing, versioning and parallel activity</span></h3>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">It is useful to distinguish between </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">tracing</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> the evolution of a 
bill or other legislative document and </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">recording the status</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> of that document.  </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Status</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> usually records a strong association between some version of the document and a particular, well-known stage or event in the process by which it is created, revised, or made binding.  That presents two problems.  There is a </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">granularity</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> problem, in that some legislative events that cause alteration of the document are so trivial that to distinguish all of them would be to create an unnecessarily fine-grained, burdensome, and unworkable system. 
There is a </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">stability</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> problem in that legislative processes change over time, sometimes in ways that are very significant, as when (in 1975) the House rules changed to allow bills to be considered by multiple committees, and sometimes in ways that are not, as when House procedural rules are revised in trivial, short-lived ways at the beginning of each new Congress.  Optimally, bill status would be a property related to a small vocabulary of documented legislative milestones or events that remains very stable over time.  Detailed tracing of the evolution of a bill would be enabled through a series of relationships among documents that would (for instance) identify predecessor and successor drafts as well as other inter-document relationships.  These properties would exist as part of the data model without need for additional semantics in the identifiers. Such a scheme might readily be extended to accommodate the existence of multiple, parallel drafts, as sometimes happens during committee process.</span></strong></strong></span></span></h2>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">In this way, the model would answer questions about the “version” of a given document either by making assertions about its “status” (that is, whether it is tied to some well-known milestone in legislative process) or by some combination of properties that are chained back to such a milestone.  For example, a document might be described as a “committee draft from Committee X that is a direct revision of the document submitted to the committee, dated on such-and-such a date”.  The exact “version” of the document is given by a chain of relationships tied back to a draft that can be definitively associated with a stable milestone in the legislative process.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">It’s worth noting that while it would certainly be possible to identify versions using “version numbers” built out by extending the accession number of the root document with various semantically-derived text strings, it’s not necessary to do so.  The identifiers could, in fact, be anything at all.  All that is needed is for them to be linked to well-known “milestone” documents (e.g., the version reported out of committee) by a chain of relationships (for example, “isSuccessorVersionOf”) that leads back to the milestone.  This may be particularly important when the document-to-document relationship extends across boundaries between zones of administrative control, or outside government altogether.</span></p>
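<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">To make the idea concrete, here is a minimal Python sketch of version tracing by relationship rather than by identifier semantics.  Everything in it is an assumption made for illustration: the milestone vocabulary, the Draft record, the chain_to_milestone helper, and the opaque identifiers are hypothetical, and only the relationship name “isSuccessorVersionOf” comes from the discussion above.</span></p>

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately small vocabulary of stable milestones.
MILESTONES = {"introduced", "reported", "engrossed", "enrolled"}


@dataclass(frozen=True)
class Draft:
    identifier: str                                # opaque; carries no semantics
    status: Optional[str] = None                   # set only for milestone documents
    is_successor_version_of: Optional[str] = None  # identifier of the predecessor


def chain_to_milestone(identifier: str, drafts: dict[str, Draft]) -> list[str]:
    """Follow isSuccessorVersionOf links until a milestone document is reached."""
    path = []
    seen = set()
    current = identifier
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in version chain at {current}")
        seen.add(current)
        draft = drafts[current]
        path.append(current)
        if draft.status in MILESTONES:
            return path
        current = draft.is_successor_version_of
    raise LookupError(f"{identifier} does not chain back to a milestone")


# Made-up drafts: one milestone document and two later revisions.
drafts = {
    d.identifier: d
    for d in [
        Draft("a91f", status="introduced"),
        Draft("77c2", is_successor_version_of="a91f"),
        Draft("x-committee-3", is_successor_version_of="77c2"),
    ]
}

print(chain_to_milestone("x-committee-3", drafts))
# prints ['x-committee-3', '77c2', 'a91f']
```

<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The identifiers in the sketch are arbitrary strings on purpose: the “version” of the committee draft is whatever the chain says it is, not anything encoded in its name.</span></p>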
<h3 dir="ltr"><span style="font-size: 19px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Granularity</span></h3>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">To a great extent, the things that are being &#8216;identified&#8217; by identifiers are discrete documents, traditionally rendered as discrete print works. There are, however, significant exceptions that should be accommodated. In addition, changes in the nature and structure of documents that may be issued in the future should be anticipated as well.</span></strong></strong></span></span></h2>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The issue of “granularity” arises from the need to identify parts of a discrete document. For example, although a congressional hearing is published as a single document (sometimes in multi-volume form), it may be useful to make specific references to the testimony of individual witnesses. Even more significant would be mapping the relationships between the U.S. Code and the public laws from which it is derived. In these cases, the identifiers available should be more fine-grained than the documents being identified. So, although a Public Law or slip law can be completely identified and described by a given set of identifiers, it is valuable to have additional identifiers available for sub-parts of these documents, so that their relationships to sections of the U.S. Code can be adequately mapped and described.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Of course, admitting such identifiers can be a slippery slope. The set of things that </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><em>could</em></span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> be identified in legislative documents is fairly unbounded, and any identifiers will arguably be useful to </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">someone</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">. An attempt to label all possible things, however, is madness, and should be avoided. The result would be large numbers of unused or seldom-used identifiers, which would over-complicate entities and the overall structure of the identifier system.</span></p>
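<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">One way to picture a bounded approach to sub-part identifiers is the Python sketch below.  It is an illustration only: the path-style identifier format, the short list of identified kinds, and the helper names sub_part_id and is_part_of are all assumptions of ours, not part of any existing scheme, and the Public Law identifier is a made-up placeholder.</span></p>

```python
# Hypothetical list of sub-part kinds worth identifying; anything finer is refused,
# which is one crude way of staying off the slippery slope.
IDENTIFIED_KINDS = ("title", "section", "subsection")


def sub_part_id(parent_id: str, kind: str, label: str) -> str:
    """Extend a document identifier with one level of sub-part."""
    if kind not in IDENTIFIED_KINDS:
        raise ValueError(f"sub-parts of kind {kind!r} are not identified")
    return f"{parent_id}/{kind}/{label}"


def is_part_of(candidate_id: str, whole_id: str) -> bool:
    """True when candidate_id names the whole itself or one of its sub-parts."""
    return candidate_id == whole_id or candidate_id.startswith(whole_id + "/")


law = "publaw/999/1"  # invented placeholder, not a real Public Law
section = sub_part_id(law, "section", "2")

print(section)                   # prints publaw/999/1/section/2
print(is_part_of(section, law))  # prints True
```

<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Asking for an identifier at a level that is not on the list (a single sentence, say) raises an error rather than minting a label nobody will use.</span></p>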
<h3 dir="ltr"><span style="font-size: 19px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Fragmentation and recombination</span></h3>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Identifiers are used in ways that go well beyond slapping a unique label on a relatively static document.  They help us keep track of resources that can, in the electronic environment, be highly mobile.  Legislation is often fragmented and re-combined into new expressions, some official and some not.  
For many legal purposes, it is important for the fragments to be recognized as </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">authentic</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">, that is, carrying the same weight of authority as the work from which they were originally taken.  Current practice accommodates this through the use of a variety of officially-published finding aids, including significant ones associated with the US Code:  the Table of Popular Names, the “Short Title” notes, and Table III of the printed edition of the US Code, which is essentially a codification map. <a href="http://liicr.nl/qdWBWi">Elsewhere</a>, we&#8217;ve referred to such a work as a “pont”, that is, something that bridges two isolated legal information resources.  Encoding  of ponts in engineered ways that facilitate use in retrieval systems is a particularly crucial function that should be supported by the identifier model.</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> </span><br />
</strong></strong></span></span></h2>
<h4 dir="ltr"><span style="font-size: 16px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Codification</span></h4>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Codification presents challenges, the more so because it can erect substantial barriers for inexperienced researchers.  Citizens often seek legislation by popular name (“Wild Horse Annie Act”). They don’t get far.  The problem is usually (though not always) more difficult than simply uncovering an association between the popular name of the act they’re seeking and some coherent chunk of the United States Code, or a fragment within a document that carries a Public Law number.  
Often, the original legislation has been codified in ways that scatter fragments over multiple Titles of the US Code.</span></strong></strong></span></span></h2>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">That is so because even a coherent piece of legislation &#8212; and many are not &#8212;  typically addresses a bundle of issue-related concerns, or the needs of a particular constituency.  A “farm bill” might contain provisions related to tax, land use, regulation of commodities, water rights, and so on.  All of those belong in very different places under the system of topics used by the US Code.  Thus, legislation is fragmented and recombined during the process of codification.  While this results in much more coherent intellectual organization of statutes over the long term, it makes it difficult for users to exchange the tokens they have &#8212; usually the popular name of an Act, or some other moniker assigned by the press </span><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">(“Obamacare”) </span></strong></strong></span><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: 
normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">&#8211; for access to what they are seeking.</span></strong></strong></span></p>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://uscode.house.gov/table3/table3years.htm">Table III of the United States Code</a> provides a map from provisions of Public Laws to their eventual destination within the US Code, as the Code existed at the time of classification.  That is potentially very useful to a present-day audience, provided that the relationships expressed in it can be traced forward through time; changes to the Code from the time of classification forward  would need to be accounted for.  
That would rest on two things:  an identifier system capable of tracking the fragments of the original Act as they are codified, and a series of relationships that account for both the process of codification and the processes by which the Code itself subsequently evolves.</span><br />
</strong></strong></span></span></h2>
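<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The two ingredients just described can be sketched in a few lines of Python.  This is a toy under stated assumptions, not a description of how Table III is actually encoded: the two mapping tables, the function current_locations, and every identifier in them are invented for illustration.</span></p>

```python
# Classification at the time of codification: Public Law provision to Code section(s).
# All identifiers here are invented placeholders, not real citations.
CLASSIFIED_TO = {
    "publaw/999/1/section/2": ["usc/7/100"],
    "publaw/999/1/section/3": ["usc/26/200", "usc/43/300"],
}

# Later changes to the Code itself: old section to its successor(s).
# An empty list records a repeal.
SUCCEEDED_BY = {
    "usc/7/100": ["usc/7/100a"],
    "usc/7/100a": ["usc/7/150"],
    "usc/43/300": [],
}


def current_locations(provision_id: str) -> list[str]:
    """Trace a provision from its original classification to present-day sections."""
    results = []
    pending = list(CLASSIFIED_TO.get(provision_id, []))
    seen = set()
    while pending:
        section = pending.pop(0)
        if section in seen:
            continue
        seen.add(section)
        if section in SUCCEEDED_BY:
            pending.extend(SUCCEEDED_BY[section])  # keep following the chain
        else:
            results.append(section)                # no later change recorded
    return results


print(current_locations("publaw/999/1/section/2"))  # prints ['usc/7/150']
print(current_locations("publaw/999/1/section/3"))  # prints ['usc/26/200']
```

<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The first table stands in for the snapshot that a classification map provides; the second stands in for the relationships that would have to be recorded as the Code evolves.  Without the second, the first goes stale the moment a section is moved or repealed.</span></p>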
<h4 dir="ltr"><span style="font-size: 16px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Fragmentary re-use</span></h4>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><strong id="internal-source-marker_0.3747975789010525" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Codification is really a special case of something we might call “fragmentary re-use” &#8212; an application in which a document snippet, or other excerpt from an object, is reused outside its parent.  Next time, we&#8217;ll discuss the problems of identifier exposure in a Linked Data context, noting that identifiers must carry their own context.  A noteworthy example of this is the legislative fragment that needs to carry some link back to its provenance, and specifically its legal status or authority.  Minimally, this would be an identifier resolvable to a data resource describing the provenance of the fragment.  Such an approach might fit well into a “layered” URI scheme such as that used by legislation.gov.uk. </span></strong></span></span></h2>
<p><span style="font-family: Arial; color: #000000; background-color: transparent; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap; font-size: 14px;"><span style="font-family: Arial; color: #000000; background-color: transparent; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">[ </span><span style="font-family: Arial; color: #000000; background-color: transparent; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><em>Fragmented and recombined as we are, we'll stop here with a <a href="http://www.youtube.com/watch?v=HdX6egy2k4c">song about codification</a>, a <a href="http://readyrickshaw.com/toob/node/63#2">granular, highly-recombinant and NSFW musical selection</a>, and <a href="http://www.youtube.com/watch?v=41VOvxj0Khc&amp;ob=av2n">a third that queries the object itself (at 2:45)</a> and makes heavy use of visual and audio recombination. <a href="http://blog.law.cornell.edu/metasausage/2012/06/11/identifiers-part-3/">Next time: some problems with current practice, identifier manufacturing, and what happens when we think about Linked Data, as we surely should</a></em></span><span style="font-family: Arial; color: #000000; background-color: transparent; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> ]</span></span></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.law.cornell.edu/metasausage/2012/05/15/identifiers-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Identifiers, Part 1</title>
		<link>http://blog.law.cornell.edu/metasausage/2012/05/07/identifiers-part-1/</link>
		<comments>http://blog.law.cornell.edu/metasausage/2012/05/07/identifiers-part-1/#comments</comments>
		<pubDate>Mon, 07 May 2012 12:52:35 +0000</pubDate>
		<dc:creator>Bruce Thomas</dc:creator>
				<category><![CDATA[citation]]></category>
		<category><![CDATA[document identifiers]]></category>
		<category><![CDATA[Identifiers]]></category>
		<category><![CDATA[legacy systems]]></category>
		<category><![CDATA[Federal legislation]]></category>
		<category><![CDATA[identifiers]]></category>

		<guid isPermaLink="false">http://blog.law.cornell.edu/metasausage/?p=7</guid>
		<description><![CDATA[[NB: Making MetaSausage is a new blog on legislative metadata and legislative systems.  It's a place to talk geek about legislation.  We make no promises, but we think posts will appear every couple of weeks.  Comments encouraged. ] The law-creating process described in How Our Laws Are Made (HOLAM), and other civics texts like it, <a href='http://blog.law.cornell.edu/metasausage/2012/05/07/identifiers-part-1/'>[...]</a>]]></description>
				<content:encoded><![CDATA[<p><em>[NB: Making MetaSausage is a new blog on legislative metadata and legislative systems.  It's a place to talk geek about legislation.  We make no promises, but we think posts will appear every couple of weeks.  Comments encouraged. ]</em></p>
<p><em><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><strong style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://blog.law.cornell.edu/metasausage/files/2012/05/identifier.jpg"><img class=" wp-image-11 alignleft" title="identifier" src="http://blog.law.cornell.edu/metasausage/files/2012/05/identifier-300x300.jpg" alt="The Prisoner: Number 6" width="168" height="168" /></a>The law-creating process described in </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://thomas.loc.gov/home/lawsmade.toc.html">How Our Laws Are Made</a></span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> (HOLAM), and other civics texts like it, is a lot like the Mississippi River:  formed out of a zillion small tributaries, many of them 
nameless, joined into a stream that passes through a number of jurisdictions and has lots of side passages, loops and eddies, eventually breaking up again into a series of tiny streams passing through a delta.   There is a central part of the process  &#8211; the mainstream &#8212; that is fairly well mapped, with placenames and milestones that are pretty well understood.  There are hundreds of smaller streams and brooks at either end of the process that are not well understood or named at all, and a few places in the middle where the main stream branches unpredictably.  It is a <a title="Mike Wirth HOLAM infographic" href="http://www.ethicreate.com/portfolio/how-our-laws-are-made-mike-wirth-art/">complicated map</a>, and it describes a territory where many people, places and things are named  &#8211; but  many are not, and some are named in ways that are ambiguous, confusing, or conflicting.</span></strong></strong></em></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">This post is about </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">identifiers, </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">and particularly </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">document identifiers</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> : snippets of text that uniquely identify  documents that are either generated by the legislative process or are found in its vicinity.  That idea is simple enough. But well-thought-out, carefully constructed identifiers are an important foundation of any data model &#8212; and are surprisingly difficult to design.  Legislative data models have (at least) two purposes:  first, they are a kind of specification that precisely describes data encountered in and around the legislative process, the precise relationships among the data items and elements, and (significantly) relationships between the data and the real-world people, groups, and processes that create and manipulate the data.  
Second, they are a device to enable communication among system-builders, stakeholders, and users about what is to be collected, what is to be expressed or retrieved, and so on.   Before any of that can be built in a way that is both precise and communicative, we must be sure of what exactly we are talking about.  Identifiers should answer that question &#8212; </span><em><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">what the hell are we talking about?</span></em><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> &#8212; unambiguously.   Or at least we would like them to.  Often, our legacy identifier systems don’t do that very well.  As we shall see,  many existing identifier schemes are burdened with competing constraints and conflicting expectations, with less-than-ideal results.</span></p>
<h2 dir="ltr"><span style="font-size: 24px; font-family: Arial; color: #000000; background-color: transparent; font-weight: bold; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">What do identifiers do?</span></h2>
<p><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">In print, identifiers have worked differently than we really want them to in an electronic environment.  The conventions of printed books &#8212; use of pagination, difficulty of recall once issued, relative stability of editions, and most of all the assumption that identifiers will be interpreted by human readers with some knowledge of their context and purpose &#8212; result in identifiers that are less rigorous than what we need in a world of granular data consumed and processed by machines.  Some illustrations are found below. In reality our legacy &#8220;identifiers&#8221; are often less-rigorous monikers serving multiple functions, and in a digital environment we must unpack them into separate items with separate functions.  Here are some of the functions:</span></strong></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">a)</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> Unique naming.</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">  The diverse monikers that document creators and administrators use in current practice are supposed to provide unique names for documents.  Sometimes they do; often they don’t.  Usually that is because a moniker that is unique within a particular scope loses uniqueness in some wider, unanticipated arena. That is especially likely to happen when a collection of objects is moved from its original, intended scope onto the open Web, but you can find examples closer to home.  A Congressional bill number is a good example: it is unique only within the Congress during which it was assigned.  There might be an “H.R. 1234” in several Congresses; “108 H.R. 1234” is made unique by the addition of the number of the Congress during which it was introduced.  Of course, human error is often at fault, as when, for about a year in the late 1990s, there were two very different section 512s in Title 17 of the US Code.</span></p>
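<p>The bill-number example can be sketched as a small data structure. This is a minimal illustration in Python, not a description of any real system: the <code>BillId</code> class and its field names are hypothetical, and the two bills are invented for the purpose.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BillId:
    """Hypothetical identifier: a bill number qualified by its Congress."""
    congress: int   # e.g. 108
    bill_type: str  # e.g. "H.R." or "S."
    number: int     # e.g. 1234

    def local_name(self):
        # Unique only within the Congress that assigned it.
        return f"{self.bill_type} {self.number}"

    def qualified_name(self):
        # Prefixing the Congress number restores uniqueness across Congresses.
        return f"{self.congress} {self.bill_type} {self.number}"


# Two different (invented) bills from two Congresses share a local name.
a = BillId(108, "H.R.", 1234)
b = BillId(109, "H.R.", 1234)

same_local_name = a.local_name() == b.local_name()
same_qualified_name = a.qualified_name() == b.qualified_name()
```
</pre>
<p>Here <code>same_local_name</code> is true and <code>same_qualified_name</code> is false: the moniker only becomes a usable key once its scope is made part of the name.</p>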
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">b) </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Navigational reference</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.  Identifiers often serve as search terms or convenient handles for taking the reader to another document, or for retrieving it (we discuss retrieval in the next section).  Standard caselaw citation practice is a special case of this, created specifically for printed books.  In that legacy context, unique identification and citation functions are often run together badly, usually because numbered pages are not sufficiently granular to uniquely identify individual items.   For example, two briefly-reported judicial opinions might well appear on the same page of a print reporter, and thus carry an identical citation.  The citation is then a perfectly good tool for navigating to each case within a series of printed volumes, but is not a unique name or identifier for either of them.   
A look at </span><a style="font-family: Times; font-size: medium; white-space: normal;" href="http://bulk.resource.org/courts.gov/c/F3/173/"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://bulk.resource.org/courts.gov/c/F3/173/</span></a><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> will show that numerous cases, each quite short, originally appeared on page 421 of Volume 173 of West’s Federal Reporter, 3rd Series.   A sample is here: </span><a style="font-family: Times; font-size: medium; white-space: normal;" href="http://liicr.nl/rimZJe"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://liicr.nl/rimZJe</span></a><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> .  Any of the cases listed might be cited as 173 F.3d 421.</span></p>
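<p>A short sketch shows why such a citation works for navigation but fails as a key. The records below are hypothetical: the internal ids and case names are invented, and only the citation format follows the example above.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical records: (internal_id, case_name, citation).
opinions = [
    ("op-0001", "Example v. Sample", "173 F.3d 421"),
    ("op-0002", "Doe v. Roe", "173 F.3d 421"),
    ("op-0003", "Placeholder v. Stand-In", "173 F.3d 422"),
]

# Index every opinion under its volume/reporter/page citation.
by_citation = defaultdict(list)
for internal_id, case_name, citation in opinions:
    by_citation[citation].append(internal_id)

# Any citation that maps to more than one opinion cannot be a unique name.
ambiguous = {c: ids for c, ids in by_citation.items() if len(ids) != 1}
```
</pre>
<p>In this sketch <code>ambiguous</code> holds the single citation shared by two opinions. Looking up that citation still takes the reader to the right page, but it cannot say which opinion was meant.</p>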
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">c) </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Retrieval hook/container label</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.   Here, we distinguish use of a citation as a retrieval hook from its use as a navigational device. As we make our way around the Web, that distinction is usually blurred. Following a link to its destination puts a chunk of text in front of our eyes, and so it’s hard to remember that the link might refer to the contents of a container for which it also provides a label, rather than to a simple destination milestone.  </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://liicr.nl/pzPQWx">To make the distinction clear</a>, it’s useful to think about incorporation-by-reference or other forms of embedding.  Suppose that we wish to present the current text of a subsection of a statute inside some other online document &#8212; a citizen’s guide to Social Security benefits, for example.  We would likely do that via machine retrieval of the particular statutory subsection based on its identifier &#8212; but our goal would be to summon up a chunk of text, not navigate to a particular destination.  Put another way, our current practice conflates the use of citation as a means of identifying a </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">point, milestone, or destination</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> in a document (a navigational reference) with a means of identifying a </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">labelled subdocument</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> that can be referenced or retrieved for other purposes (a container label).</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">As an example, the <a href="http://thomas.loc.gov/home/thomas.php">THOMAS</a> pages for individual bills and resolutions aggregate a great deal of information from the Congressional Record (CR).  The Bill Summary ‘Actions’ link to both a textual representation of the CR page that begins with the desired text (but sometimes runs past it into unrelated material) and a PDF representation of the whole page (where the desired text may start towards the end), plus subsequent pages if the relevant section extends past the initial page.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">For a specific example of this, the <a href="http://en.wikipedia.org/wiki/Lilly_Ledbetter_Fair_Pay_Act_of_2009">Lilly Ledbetter Fair Pay Act of 2009</a> has a list of major actions on THOMAS, one of which is a “motion to proceed to consideration of measure withdrawn in Senate” on Jan. 13, 2009.  The link for information on that motion is to CR S349: a specific page of the Congressional Record. Invoking that link leads to this display:</span></p>
<p><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><br />
<img src="https://lh3.googleusercontent.com/j1hi6xZUOwb4ZkLC782u_o72UkzqOpVCmOtpJe9sqAOJXkpwqnhXzQukWRWZpLxt5wU8cxPLy-wwD17wFdta8uAP65Ft0s3iGqns2opd2QQnQW8A_R8" alt="THOMAS listing of the items on Congressional Record page S349" width="719" height="483" /><br />
<span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">The THOMAS page lists the four items on the particular Congressional Record page, the last of which is the item sought.  When that item is invoked, a default page with the specific text of the motion is retrieved; a link at the head of the text leads to the PDF version of that page, where the Lilly Ledbetter motion appears at the bottom.</span></strong></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">d) </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Thread tag/associative marker</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.   Some monikers group related documents into threads &#8212; aggregations whose internal arrangement is implicitly chronological.  An insurance company claim number is, in exactly this way, a dual-purpose tool.  On the one hand, it refers uniquely to a document (a claim form) that you submit after your fender-bender.  On the other, the insurance company tells you that you must “use this claim number in all correspondence”  &#8211; that is,  use it to prospectively tag related documents.  That creates a labelled group of documents. If we then sort the group chronologically, it becomes a kind of narrative thread.  </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">In this way, the moniker implies a relationship between the documents without explicitly naming or describing it, and is also pressed into service as the identifier for one or more documents in the cluster. Regulatory docket numbers function in this manner. That is intentional, because dockets are meant to be gathering places for documents. What is confusing  &#8211; and important to remember &#8212; is that a moniker that uniquely identifies a </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">process</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> &#8212; a regulatory rulemaking &#8212; has been bent to identify a </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">collection of items</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> associated with that process, and neither the association, nor the collection of items, nor any particular document has been uniquely identified.</span></p>
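<p>The thread-tag idea can be sketched in a few lines. Everything here is assumed for illustration: the claim numbers, document ids, dates, and the <code>thread</code> helper are all hypothetical. Note that each document carries its own id, separate from the tag that groups it.</p>
<pre>
```python
from datetime import date

# Hypothetical documents, each tagged with a claim (or docket) number.
documents = [
    {"doc_id": "d3", "thread": "CLAIM-42", "dated": date(2012, 3, 9), "kind": "adjuster report"},
    {"doc_id": "d1", "thread": "CLAIM-42", "dated": date(2012, 1, 5), "kind": "claim form"},
    {"doc_id": "d2", "thread": "CLAIM-42", "dated": date(2012, 2, 1), "kind": "repair estimate"},
    {"doc_id": "d4", "thread": "CLAIM-77", "dated": date(2012, 2, 14), "kind": "claim form"},
]


def thread(tag):
    """Return the documents sharing a tag, oldest first."""
    members = [d for d in documents if d["thread"] == tag]
    return sorted(members, key=lambda d: d["dated"])


# Sorting the tagged group by date yields the narrative thread.
timeline = [d["doc_id"] for d in thread("CLAIM-42")]
```
</pre>
<p>The tag <code>CLAIM-42</code> picks out three documents in date order, but it does not identify any one of them; that job falls to the separate <code>doc_id</code> field.</p>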
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Another conceptually-related but distinct example of this is the use of “captive search” URIs to meet a user’s need to dynamically assemble a set of related documents. For instance, one can retrieve all the environmental law decisions of the Supreme Court at this link:</span></p>
<p><a style="font-family: Times; font-size: medium; white-space: normal;" href="http://www.law.cornell.edu/supct/search/index.html?query=environment+or+environmental%20or%20EPA"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://www.law.cornell.edu/supct/search/index.html?query=environment+or+environmental%20or%20EPA</span></a></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Such URIs embed search terms (“environment”, “environmental”, “EPA”) and, when used in links, retrieve the set of documents found by searching on those terms.  Typically, they are used to deal with instability or growth in the underlying corpus of things being searched. They are “automatically” kept up to date as the collection changes, inasmuch as they just provoke a search of the changed collection that presents results based on the current collection contents. </span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">In that way, they are a great help to site designers. Problems can arise, however, if the user imagines that the URI somehow identifies the exact set of items retrieved for any time period other than the moment of retrieval. Precisely because the method is dynamic, the user may or may not retrieve the same document set at a later invocation.   As a low-cost, low-effort alternative to semantic tagging, however, the approach is irresistible.  </span></p>
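<p>A captive-search URI of this kind is easy to construct, which is much of its appeal. The sketch below uses only the Python standard library; the <code>captive_search_uri</code> helper is hypothetical, and the <code>query</code> parameter and the "or" syntax are assumptions inferred from the single example link above, not a documented interface.</p>
<pre>
```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address taken from the example link; the query syntax is assumed.
BASE = "http://www.law.cornell.edu/supct/search/index.html"


def captive_search_uri(terms):
    """Build a search URI that embeds the given terms joined by 'or'."""
    return BASE + "?" + urlencode({"query": " or ".join(terms)})


uri = captive_search_uri(["environment", "environmental", "EPA"])

# The URI records a query, not a result set: decoding it gives back
# only the search expression that will be re-run on each visit.
embedded_query = parse_qs(urlsplit(uri).query)["query"][0]
```
</pre>
<p>Nothing in the resulting URI records which documents matched on a given day, which is exactly why the same link may return a different set later.</p>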
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Some newer systems,  such as <a href="http://viaf.org/">VIAF</a>, do allow the ad-hoc construction of URIs for dynamically assembled sets of objects that are then fixed as a permanent group identified by the newly-minted URI. Assuming that an appropriate search could be designed, one might thus construct URIs for any useful group of items found in an authority file, for example a list of all subcommittees of the House Armed Services Committee that have existed up to the present:</span></p>
<p><a style="font-family: Times; font-size: medium; white-space: normal;" href="http://viaf.org/viaf/search?query=local.names+all+%22house%20armed%20services%20committee%20subcommittee%22+and+local.sources+any+%22lc%22"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://viaf.org/viaf/search?query=local.names+all+%22house%20armed%20services%20committee%20subcommittee%22+and+local.sources+any+%22lc%22</span></a></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">e) </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Process milestone</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.  The grant of a moniker by an official body can be an acknowledgement that official notice must now be taken, or that some process has begun, ended, or reached some other important stage.  That is obviously the case with bills, where a single piece of legislation may receive a number of identifiers as it makes its way through the process, culminating in a Public Law number at the time of signing. The existence of such a PL number can be taken as evidence that the bill has been passed into law.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">f) </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Proxy for provenance</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">.  Again because monikers are often assigned by officials or organizations with special standing, they become proxies for provenance.  The existence of a bill number is evidence that the Clerk of the House has seen something and acted in a particular way with respect to it; it is valuable evidence in any attempt to establish authority.</span></p>
<p><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">g) </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Popular names, professional terms of art, and other vernacular uses.</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">  Monikers notably find their way into popular and professional use, some in ways that are quite persistent.  News media frequently refer to legislation by a popular name created by Congress based on the names of sponsors (the “Taft-Hartley Act”) or by the press itself (“Obamacare”).  They can be politicized (“death tax”), or serve as a kind of marketing tool (“USA-PATRIOT Act”). Some labels and identifiers become very closely associated with the things they label, becoming terms of art in their own right.  Thus, it is common to refer to a “501(c)(3) nonprofit” or a “Subchapter K”  partnership.  Vernacular labels have particular importance for citizens, who often use them as input to search systems.  At this writing, developers at the Sunlight Foundation have just <a href="http://liicr.nl/KEPC5I">started an initiative to collect such labels</a> through crowdsourcing.</span></p>
<p>We&#8217;ll break off here with a <a href="http://www.youtube.com/watch?v=hfwFpRnOeGg">musical selection</a> or <a href="http://www.youtube.com/watch?v=r2iIHup9zKA">two</a>.</p>
<p><em>[<a href="http://blog.law.cornell.edu/metasausage/2012/05/15/identifiers-part-2/">Next time: identifier granularity and some other characteristics; stresses and strains on identifier design</a>.]</em></p>
<p>Some other reading you might find useful:</p>
<ul>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Sauermann, L. and Cyganiak, R. (eds.). <a href="http://www.w3.org/TR/cooluris/#semweb">Cool URIs for the Semantic Web</a>. </span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Bruce, T.R. and Shetland, D.A. (2007). <a href="http://liicr.nl/pzPQWx">Resilience in Identifier Design</a>. </span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Bruce, T. R., and Richards, R. C. (2011). Adapting Specialized Legal Metadata to the Digital Environment: The Code of Federal Regulations Parallel Table of Authorities and Rules. Paper presented at ICAIL 2011: The 13th International Conference on Artificial Intelligence and Law. Slides at </span><a style="font-family: Times; font-size: medium; white-space: normal;" href="http://liicr.nl/qdWBWi"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://liicr.nl/qdWBWi</span></a><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> .  Full text available from the authors.</span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #111111; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Campbell, D. <a href="http://dcpapers.dublincore.org/ojs/pubs/article/view/868/864">Identifying the Identifiers</a>. International Conference on Dublin Core and Metadata Applications, North America. </span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><a style="color: #000000; font-family: Times; font-size: medium; white-space: normal;" href="http://legislation.gov.uk/"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">legislation.gov.uk</span></a><span style="font-size: 15px; font-family: Arial; color: #111111; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> : </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">high-level API documentation at </span><a style="color: #000000; font-family: Times; font-size: medium; white-space: normal;" href="http://data.gov.uk/blog/legislationgovuk-api"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://data.gov.uk/blog/legislationgovuk-api</span></a></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #111111; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://doc.metalex.eu/">Netherlands statutes online</a></span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #111111; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://www.niso.org/news/events/2010/nameid/resources/">NISO collection of resources on name identifiers</a>.  Online at </span><a style="color: #000000; font-family: Times; font-size: medium; white-space: normal;" href="http://www.niso.org/news/events/2010/nameid/resources/"><span style="font-size: 15px; font-family: Arial; color: #000099; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: underline; vertical-align: baseline; white-space: pre-wrap;">http://www.niso.org/news/events/2010/nameid/resources/</span></a></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Spinosa, P., Francesconi, E., and Lupo, C. <a href="http://tools.ietf.org/html/draft-spinosa-urn-lex-06">A Uniform Resource Name (URN) Namespace for Sources of Law (LEX)</a>.  IETF informational draft.</span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Tennison, Jeni. <a href="http://www.jenitennison.com/blog/node/159">What do URIs Mean Anyway</a>. </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: italic; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Jeni’s Musings</span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"> blog, July 5, 2011. </span><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Jeni’s other blog posts are useful as well.</span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;">Wikipedia article on <a href="http://en.wikipedia.org/wiki/Uniform_Resource_Name">Uniform Resource Names</a></span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #111111; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://www.w3.org/2005/Incubator/lld/wiki/Draft_recommendations_page#Create_URIs_for_library_resources_in_good_time_.5BGD.5D">W3C Library Linked Data Incubator group wiki, Recommendation on URIs </a></span></strong></li>
<li><strong id="internal-source-marker_0.20811152062378824" style="color: #000000; font-family: Times; font-style: normal; font-variant: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; font-weight: normal;"><span style="font-size: 15px; font-family: Arial; color: #000000; background-color: transparent; font-weight: normal; font-style: normal; font-variant: normal; text-decoration: none; vertical-align: baseline; white-space: pre-wrap;"><a href="http://www.w3.org/2001/tag/doc/URNsAndRegistries-50">W3C Technical Architecture Group. URNs, Namespaces, and Registries</a>.  </span></strong></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.law.cornell.edu/metasausage/2012/05/07/identifiers-part-1/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
