{"id":121,"date":"2014-01-27T07:05:00","date_gmt":"2014-01-27T12:05:00","guid":{"rendered":"http:\/\/blog.law.cornell.edu\/metasausage\/?p=121"},"modified":"2014-01-26T05:39:16","modified_gmt":"2014-01-26T10:39:16","slug":"modeling-legislative-information-post-passage","status":"publish","type":"post","link":"https:\/\/blog.law.cornell.edu\/metasausage\/2014\/01\/27\/modeling-legislative-information-post-passage\/","title":{"rendered":"Modeling legislative information post-passage"},"content":{"rendered":"
<\/p>\n
[Editor’s note: this post was co-authored by Tom Bruce<\/a>, John Joergensen<\/a>, Diane Hillmann<\/a>, and Jon Phipps<\/a>. References to the “model” here refer to the LII data model for legislative information that is described and published <\/a>elsewhere.<\/span>\u00a0]<\/p>\n This post lays out some design criteria for metadata that apply to compilations of enacted legislation, and to the tools commonly used to conduct research with them. \u00a0Large corpora discussed here include Public Laws<\/a>, the Statutes at Large<\/a>, and the United States Code<\/a>. \u00a0This \u201cpost-passage\u201d category also takes in signing statements<\/a>, and — perhaps a surprise to some — a variety of finding aids. \u00a0Finding aids receive particular attention because<\/p>\n Let’s begin with a discussion of signing statements, which might be considered the \u201cfirst stop\u201d after legislation is passed.<\/p>\n Signing statements have been used by many presidents over the years as a way to record their position on new legislation. For most of our history, their use has been rare and noncontroversial. However, during the George W. Bush administration they were used to declare legal positions on the constitutionality of sections of laws being signed. \u00a0\u00a0\u00a0<\/span><\/p>\n Since they had never previously been controversial, there had been little interest in collecting or indexing these documents in any systematic manner. With the change in their use, this attitude has changed, and there is a need to easily and quickly locate these documents, particularly within the context of the legislation to which they are linked.<\/p>\n Currently, Presidential signing statements are collected as part of the Weekly Compilation of Presidential Documents and Daily Compilation of Presidential Documents<\/a>. These are collected and issued by the White House press secretary, and published by the Office of the Federal Register. 
As they are not technically required by law to be published, they do not appear in the Federal Register or in Title 3 of the Code of Federal Regulations.<\/p>\n Although they appear in the daily and weekly compilations, they are not marked or categorized in any particular manner. In FD\/SYS, the accompanying MODS file includes a subject topic \u201cbill signings\u201d, marking the document as related to that category of event. \u00a0\u201cBill Signings\u201d is also included in the MODS <category1> tag that exists in presidential documents. That designation, however, is also used for remarks as well as formal signing statements. In addition, it is unclear whether that designation has been used with any consistency. \u00a0The MODS files for signing statements include no information designating the document as a signing statement, but only as a \u201cPRESDOCU\u201d. The MODS files do, however, have references to the public law to which they refer. They also have a publication date that matches the date on which the president signed the subject law.<\/p>\n In order to make signing statements findable, the existing links to relevant legislation that are already represented in the GPO MODS files should be built into the model, along with the publication date information and a designation of the president who is issuing the statement. \u00a0In addition to that, however, the categorization of a signing statement as a signing statement needs to be added in the same fashion in which we have categorized other documents, and implemented with consistency. If the implementation and study of signing statements continue as an important area of user inquiry, they will need to be identifiable.<\/p>\n Finally, as with all such documents, there is always a desire to assist the researcher and the public by including evaluation aids. \u00a0It is tempting, for example, to indicate whether a statement includes a challenge to the constitutionality or enforceability of a law. 
\u00a0We believe, however, that it would be a mistake to build this into the model. \u00a0If interpretive aids of this kind are themselves properly linked to their related legislation, they will be easily found.<\/p>\n We have singled out signing statements because they appeared prominently among use cases we have collected, and in other conversations about the \u201cpost-passage\u201d corpora. \u00a0In reality, many other presidential documents relate closely to legislative materials before and after passage. \u00a0We will consider them in later sections of this document as we encounter them in finding aids.<\/p>\n Enacted Federal legislation is published by many groups in many formats, including (among versions published by the legislative branch) Public Laws, the Statutes at Large, and the United States Code. \u00a0\u00a0Privately published editions of the US Code are also common (and indeed prevalent), either in electronic or printed form, and it is likely that their use exceeds that of the officially published versions.<\/p>\n First, as to necessity: research needs have no respect for administrative boundaries or data stovepipes. Many researchers will wish to trace the history of a law from the introduction of a bill through to its final resting place in the US Code. \u00a0As to means, our model incorporates a series of properties that describe the codification of particular legislative measures (or provisions); they might be applied at the whole-document or subdocument level. That essentially replicates what is found in Tables I, II and III as we describe them below. \u00a0This area of the model might, however, require extension in light of more detailed information about the codification process itself. 
\u00a0We are aware, for example, that current finding aids and the data in them make it far easier to find out what happened to a particular provision in a bill (forward tracing) than it is to find out where a particular provision in the US Code came from (reverse tracing), and that the finding aids do not support all common use cases with certainty.<\/p>\n Virtually every document we have encountered in our survey of legislative corpora becomes \u201cfrozen\u201d at some point, either by being finalized, or by being captured as a series of sequential snapshots. \u00a0That is not the case with the US Code, which is continually revised as new legislation is passed. \u00a0This creates a series of updating problems that involve not only modeling the current state of the Code, but also:<\/p>\n tracking new codification decisions<\/p>\n<\/li>\n tracking changes in the state of material that has been changed, moved, or repealed,<\/p>\n<\/li>\n revising and archiving metadata that has been changed or rendered irrelevant by changes in the underlying material<\/p>\n<\/li>\n<\/ul>\n and so on.<\/p>\n It seems likely to us that there are both engineering and policy decisions involved here. Any legislative data model needs to have hooks that allow connection to more detailed models, maintained by others, that track codification decisions. Most use cases that look at statutes and ask, \u201cwhat happened to that statute?\u201d or \u201cwhere did this come from?\u201d will need those features. \u00a0The policy question simply involves deciding whether and how to connect to data developed by others (for example, if it were desirable to trace legislation from congress.gov into the US Code). \u00a0As to engineering, it may be simpler in the short run to simply model the finding aids that currently assist users in coping with the print-based stovepipes involved. 
\u00a0That has drawbacks that we describe in some detail later on, but has the advantage of being relatively simple to do at the level of functionality that the print-based aids currently provide.<\/p>\n Whatever approach is taken, maintenance will be an issue; most automated approaches will require the direct acceptance of data originated by others. \u00a0The Office of the Law Revision Counsel is building a system to track not only codified legislative text but to record the decisions taken. \u00a0Linking to such a system would extend, at low cost, the capabilities of existing systems in very useful ways, but it is not clear whether OLRC will expose any of this tracking metadata for public use.<\/p>\n Bills become Public Laws. \u00a0Often, they are then chopped into small bits and sprayed over the US Code. \u00a0\u00a0Even the most coherent bill — and many fall far short of that mark — is a bundle of provisions that are related by common concern with a public policy issue \u00a0(eg. an \u201cantitrust law\u201d) or by their relationship to a particular constituency (eg. a \u201cfarm bill\u201d). \u00a0\u00a0The individual provisions might most properly relate to very different portions of the US Code; \u00a0a farm bill could contain provisions related to income tax, land use, environmental regulation, and so on; many will amend existing provisions in the Code. Mapping and recording of the codification decisions involved is thus a major concern for modelers.<\/p>\n The extreme granularity of the changes involved can be seen (eg.) in the Note to 26 USC 1, which contains literally hundreds of entries like the following:<\/p>\n 2004\u2014Subsec. (f)(8). Pub. L. 108\u2013311<\/a>, \u00a7\u00a7 101(c), 105, temporarily amended par. (8) generally, substituting provisions relating to elimination of marriage penalty in 15-percent bracket for provisions relating to phaseout of marriage penalty in 15-percent bracket. 
See Effective and Termination Dates of 2004 Amendments note below.<\/p>\n<\/blockquote>\n For our purposes here it is the mapping of the Public Law subsection to a named paragraph in the codified statute that is interesting. It proclaims the need for identifiers at a very fine-grained level. \u00a0The XML standard used by the House and Senate for legislation contains mechanisms for markup and identification down to the so-called \u201csubitem\u201d level, which is the lowest level of named container in bills and resolutions (the text in our example is actually at the \u201csubsection\u201d level of the Act). \u00a0It seems to us unlikely that mapping is consistently between particular levels of the substructure (that is, it seems unlikely that sublevel X in the Public Law always maps to something at sublevel Y of the US Code). \u00a0Sanity checking, then, will be difficult.<\/p>\n Identifiers within the US Code provide some interestingly dysfunctional examples<\/a>. \u00a0They can usefully be thought of as having three basic types: \u00a0\u201csection\u201d identifiers, which (sensibly) identify sections, \u201csubsection\u201d identifiers, which apply to named chunks within a section, \u00a0and \u201csupersection\u201d identifiers, which identify aggregations of materials above the section level but below the level of the Title: \u00a0subtitles, parts, subparts, chapters, and subchapters.<\/p>\n Official citation takes no notice of supersection identifiers, but many topical references in other materials do. Chapters should get particular attention, because they are often containers for the codified version of an entire Act. Supersection identifiers are confusing and problematic when considered across the entire Code, \u00a0because identical levels are labelled differently from Title to Title. \u00a0For example, in most, the \u201cPart\u201d level occurs above \u201cChapter\u201d in the hierarchy, and in some, that order is reversed. 
\u00a0It should also be noted that practically any supersection — no matter how many other levels may exist beneath it in the hierarchy — can have a section as its direct descendant. \u00a0There are also \u201canonymous\u201d supersections that are implied by the existence of table-of-contents subheadings that have no official name; these appear in various places in the Code.<\/p>\n To our way of thinking, this suggests that the use of opaque identifiers for the intermediate supersections is the best approach for unique identification. Path-based accessors that use level-labels such as \u201csubtitle\u201d and \u201csection\u201d are obviously useful, too, \u00a0however confusing they might seem when accessors from different titles with different labelling hierarchies are compared side by side.<\/p>\n As to section identifiers, the main problem is that years of accumulated insertions have resulted in an identifier system that appears far from rational. \u00a0For example, \u201c1749bbb-10c\u201d is a valid section number in Title 12. \u00a0It may nevertheless make sense to use citation as the basis for identifier construction rather than making the identifiers fully opaque. \u00a0As to subsection labeling, it is pretty consistent throughout the Code, and can be thought of as an extension to the system of section identifiers.<\/p>\n Traditional library approaches to these complex sets of materials have been very simple: they\u2019ve been cataloged as \u2018serials\u2019 (open ended, continuing publications), with very little detail. That allows libraries to represent the materials in their catalogs, and to provide a bibliographic record that acts as a hook for check-in data, and is used to track receipt and inventory of individual physical volumes. 
In the law library context, where few users access these basic resources through a catalog, this approach has been sufficient, efficient and low-maintenance.<\/p>\n However, as this information \u2018goes digital\u2019, that strategy breaks down in some predictable ways, many of which we\u2019ve documented elsewhere in this project\u2019s papers; the biggest is that much of the time we would like more detailed information about smaller granules than the \u201cserial\u201d approach contemplates. As we make a fuller transition to digital access of this information, these limited approaches no longer provide even minimal access to this critical material.<\/p>\n There are a good many finding aids that can be used to trace Federal legislation through the codification process, and to follow authority relationships between legislative- and executive-branch materials, such as presidential documents and the Code of Federal Regulations. \u00a0All were originally designed for distribution in tabular form, at first on paper, and more recently on Web pages. \u00a0In the new environment we imagine, the approach they represent is problematic. It may nevertheless be worthwhile to model the finding aids themselves for use in the short term, as better implementations require significant analysis and administrative coordination.<\/p>\n A look at the Parallel Table of Authorities [PTOA] shows where the problems are likely to be found. \u00a0\u00a0Like all other tabular finding aids that originate in print, it was designed for consumption by human experts capable of fairly sophisticated interpretation of its contents. \u00a0It embeds a series of reductive design decisions that trade conciseness against the need for some \u201cunpacking\u201d by the reader. \u00a0Conciseness is a virtue in print, but it is at best unnecessary and at worst confusing when the data is to be consumed and processed by machines. 
\u00a0A couple of examples will illustrate:<\/p>\n Some PTOA entries map ranges of US Code sections against ranges of CFR Parts, in what appears to be a many-to-many relationship. \u00a0It is unlikely that every pair that we could generate by simple combinatorial expansion represents a valid authority relationship. Indeed, as we shall see, the various finding aids differ considerably in the meaning they assign to a \u201crange\u201d of sections and \u00a0in the treatment that they intend for them.<\/p>\n<\/li>\n The table simply states that there is a relationship between each of the two cells in every row of the table, without saying what it is. \u00a0The name of the table would lead the reader to believe that the relationship is one of authorization, but in fact other language around the table suggests that there are as many as four different types of relationship possible. \u00a0These are not explicitly identified.<\/p>\n<\/li>\n<\/ul>\n To model the finding aid, in this case, would be to perpetuate a less-than-accurate representation of the data. \u00a0As a practical matter of software project planning and management, it might be worth doing so anyway, in order to more quickly provide users with a semi-automated, electronic version of something familiar and useful. But that is not the best we could do. \u00a0Most of the finding aids associated with Federal statutes have similar re-modeling issues, and should be reconceived for the Semantic Web environment in order to achieve better results.<\/p>\n Most of the finding aids make use of granular references; in the case of Public Laws, these are often at the section level or below, and in the case of the US Code they are often to named subsections. \u00a0The granularity of references may or may not be reflected in the granularity of the structural XML markup of any particular edition of those resources.<\/p>\n The Statutes at Large use a page-based citation system that creates two interesting modeling issues. 
\u00a0First, on its own, a page-based citation is not a unique identifier for a statute in Stat. L., because more than one may appear on one page. \u00a0\u00a0Second, it was not ever thus. \u00a0Stat. L. has used three different numbering schemes at various times, each containing ambiguities. \u00a0These would be extraordinarily difficult to resolve under any circumstances, and particularly so given the demands of codification we describe later in the section on the Table III finding aid. Taking these two things together, it seems that there is no way to accurately create a pinpoint link between a provision of an Act in its Public Law format and a specific location in the Statutes at Large; the finest resolution possible is at page granularity.<\/p>\n It would thus seem that the most sensible approach would be to use a somewhat \u201cloose and floppy\u201d relationship like \u201cisPublishedAt\u201d to describe the relationship involved, since the information available from the Table does not really support pinpoint accuracy. \u00a0\u00a0That is unfortunate, in that there are important use cases that need such links. \u00a0For example, statutes are frequently described in judicial opinions using citations that refer only to the Statutes at Large, sometimes because the case in question predates the US Code and no other reference can exist, and sometimes because the writer has omitted other citation. \u00a0It is effectively impossible to construct a pinpoint link if the cite contains a subsection reference; one has to cite to the nearest page, relying on the reader to find the relevant statute on the page somewhere. \u00a0It would be equally difficult to trace through a Stat.L. 
citation to the relevant provision of the US Code in situations where the USC citation has been omitted.<\/p>\n In short, identifiers in this part of the legislative jungle have two problems: first, they sometimes do not exist at a sufficiently granular resolution in the relevant XML versions, and second, granular identifiers do not resolve or map well to materials whose citation has traditionally been based on print volume and page numbers.<\/p>\n Some of the finding aids we describe below provide mappings between Presidential documents and the codified statutes in the US Code. \u00a0Identifiers for Presidential documents are assigned by the Office of the Federal Register, and are typically accession numbers. \u00a0It is worth noting that OFR provides a number of finding aids and subject-matter descriptions of Presidential documents, though these are beyond our scope here.<\/p>\n As to GPO, it appears at first blush that the MODS metadata for the US Code as found in FD\/SYS does not reflect associations with Executive Orders, although they are vaguely modeled in the MODS files associated with the Executive Orders themselves. \u00a0There would be some virtue in being able to find information in both directions. \u00a0That is especially true in situations where the state of the law cannot be fully understood without referring to both the Code and related Executive Orders simultaneously. \u00a0For example, 4 USC 1, in its most current version, claims that there are 48 stars on the flag of the United States; it is only possible to find out where the other two came from by referencing the Executive Orders that accompanied statehood for Alaska and Hawai\u2019i.<\/p>\n For the general public, the Table of Popular Names (TOPN) is probably the single most useful finding aid for Federal legislation. That is because it bridges the gap between popular accounts of legislation (for example, in the news media) and the codified collections of laws that are in effect. 
\u00a0Where, exactly, do we find the Lilly Ledbetter Fair Pay Act in the modern statute book? \u00a0The answer to that question isn\u2019t obvious.<\/p>\n Very broadly, there are two ways in which an Act may be codified. \u00a0First, it could be moved into the Code wholesale, typically as a new Chapter containing numbered sections that reflect the section divisions in the Act. \u00a0Second, it could be disassembled into a bag of provisions and scattered all over the Code, with each section placed in a region of the Code dictated by its subject matter. \u00a0\u00a0In such cases, the notes to the Code section that describes the \u201cShort Title\u201d of the Act generally contain a roadmap of what has been done with the rest of it. \u00a0\u00a0That also happens when the Act contains language that consists entirely of instructions for amending existing statutes already codified.<\/p>\n For example, the TOPN entry for the Lilly Ledbetter Fair Pay Act looks like this:<\/p>\n Lilly Ledbetter Fair Pay Act of 2009<\/p>\n Pub. L. 111-2, Jan. 29, 2009, 123 Stat. 5<\/p>\n Short title, see 42 U.S.C. 2000a note<\/p><\/blockquote>\n It maps the identifier for the Public Law version of the Act to the Statutes at Large, with a page reference to the Stat. page on which the Act begins. \u00a0It also maps to the \u201cShort Title\u201d section of the USC, whose note contains information about what has been done with the Act.<\/p>\n Short Title of 2009 Amendment<\/p>\n Pub. L. 111\u20132<\/a>, \u00a7 1, Jan. 29, 2009, 123 Stat. 5<\/a>, provided that: \u201cThis Act [amending sections 2000e\u20135 and 2000e\u201316 of this title and sections 626, 633a, and 794a of Title 29, Labor, and enacting provisions set out as notes under section 2000e\u20135 of this title] may be cited as the \u2018Lilly Ledbetter Fair Pay Act of 2009\u2019.\u201d<\/p>\n<\/blockquote>\n This entry makes an important point about codified legislation. 
\u00a0While it is natural to believe that codification consists of taking something that contains entirely new legislative language, breaking it into pieces, and plugging the pieces into the Code (or substituting them for old ones), that is not exactly what happens much of the time. \u00a0Any Act could be, and often is, a laundry list of directives to amend existing codified statutes in some way or other. \u00a0\u00a0In such cases, the text of the Act is not incorporated into the Code itself, but into the Notes, in a manner similar to the example just given. \u00a0That is a subtle difference, but an important one, as we shall see in the discussion of Table III below. \u00a0It introduces an extra layer of mapping into the process, in a way that is partially obscured by the fact that inclusion is in the Notes rather than in the text of the Code. \u00a0One result of this is that, in general, it is easier to look at a current provision and find out where it came from than it is to look at an historical provision and find out what happened to it.<\/p>\n From a data modeler\u2019s perspective, the TOPN is useful but not necessary; an equivalent finding aid can be constructed by aggregating data from other tables, or by simply referring to the short titles and popular names given in the text of the Act itself. The relationships modeled by TOPN aggregate information:<\/p>\n from the Acts or bills themselves (House and Senate identifiers for bills, and the name of the Act as it\u2019s found either in the bill or (better) in the Public Law version);<\/p>\n<\/li>\n from Table III, which describes where the Public Law is codified; and<\/p>\n<\/li>\n from Table I, which models an extra \u201cchange of address\u201d that is applied in cases where codified legislation has been reorganized for passage into positive law.<\/p>\n<\/li>\n<\/ul>\n Table I describes the treatment of individual sections in Titles that have been revised for enactment as positive law. 
\u00a0The Table is a straightforward mapping of \u201cold\u201d section numbers in a Title to \u201cnew\u201d section numbers that apply after the Title was made into positive law. \u00a0As such, Table I entries also have a temporal dimension — the mappings need only be applied when tracing a citation to the Code as it existed before the date of positive law enactment to a location in the Code after that date.<\/p>\n A relational-database expert obsessed with normalization would say that Table I is, then, really two tables — one that maps old sections to new sections within a Title, and a second, implied table that says whether or not each of the 51 Titles has been enacted into positive law, and if so, when. \u00a0The researcher wanting to trace a particular reference would follow this heuristic:<\/p>\n Does my reference fall within a positive-law Title?<\/p>\n<\/li>\n If so, does my reference precede the date of enactment into positive law?<\/p>\n<\/li>\n If so, what is the number of the \u201cnew\u201d section?<\/p>\n<\/li>\n<\/ul>\n Thus, the model will need to reflect properties of the Title itself (\u201cenactedAsPositiveLaw\u201d) and of the mapping relationship of old to new (\u201chasPositiveLawSection\u201d).<\/p>\n The United States Code was preceded by an earlier attempt at regularized organization, the Revised Statutes of 1878 . \u00a0Citations to the Revised Statutes are to sequentially-numbered Sections, with \u201cRev.Stat.\u201d as the series indicator. \u00a0Table II provides a map between Rev. Stat. 
cites and sections of the US Code, along with a number of status indicators; the two most important (and common) of these indicate that a statute has been repealed, or that Table I needs to be applied because the classification shown was done prior to positive-law enactment of the Title.<\/p>\n Unlike other finding aids we describe, where the meaning of mappings between ranges and lists of things can be both combinatorial and ambiguous, Table II appears straightforward. A list or range of items in the Rev. Stat. columns can be mapped one-to-one to the corresponding list or range in the USC column. \u00a0The first element in the list or range in Rev. Stat. maps to the first element in the list in USC, the second to the second, and so on. \u00a0Simple reciprocal relationships should obtain.<\/p>\n That is particularly important in light of the relationship between Table II and Table III. \u00a0In Table III, the references for all statutes passed before 1874 point to the Revised Statutes, and not to the US Code. \u00a0So, for those statutes, in order to determine where they may still exist as part of the US Code, reference needs to be made first to Table III, to obtain the R.S. section where the statute was first encoded, and then to Table II, to determine where that R.S. section was re-encoded in the US Code. \u00a0Without the straightforward, one-to-one relationship between the R.S. and US Code expressed in Table II, the connection between pre-1874 statutes and current US Code sections would not be possible.<\/p>\n Table III, which maps individual provisions within Public Laws to pages in the Statutes at Large and to sections of the US Code, exhibits a number of interesting problems. 
\u00a0Here is how one such mapping appears in the LRC\u2019s online tool:<\/p>\n <\/p>\n In this case, we\u2019re mapping the individual provisions of PL 110-108 (readable at http:\/\/www.gpo.gov\/fdsys\/pkg\/PLAW-110publ108\/html\/PLAW-110publ108.htm<\/a> ) \u00a0to a range of pages in the Statutes at Large and to sections in the US Code (and their notes). The GPO version helpfully contains markers for the Stat. L. page breaks. \u00a0Some noteworthy observations:<\/p>\n The Public Law needs section-level identifiers. Notes sections within the USC need their own identifiers, as do pages within the Statutes at Large.<\/p>\n<\/li>\n Since the Stat. L. citation for the Act always goes to the first page of the Act as it appears in Stat.L., there is ambiguity between<\/p>\n 121 Stat. 1024, the citation\/identifier indicating the whole Act for purposes of external citation, and<\/p>\n<\/li>\n 121 Stat. 1024, the single-page reference that describes where Section 1 of the Act can be found (and for that matter, supposedly, some of Sections 2-6 as well)<\/p>\n<\/li>\n<\/ul>\n<\/li>\n For some time periods, chapter numbers would disambiguate individual laws where more than one statute appears on a single page, although as we have seen, chapter numbers have uniqueness problems of their own. Chapter numbers play no role in this example, as they were not used after 1957.<\/p>\n<\/li>\n The Act is classified to the notes in the relevant USC sections.<\/p>\n In the case of section 1 of the Act, the notes simply state the name of the Act.<\/p>\n<\/li>\n In the case of section 151, the entire text of the legislation appears in the notes for the Act. \u00a0It would appear that it is done this way because the legislation\u2019s provisions amount to a series of instructions for amending existing statutes, and thus can\u2019t be codified per se. 
\u00a0Rather, they are a description of what should be done to change things that have been codified already.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n GPO\u2019s MODS file is evidently created by machine extraction of USC citations, because it incorrectly identifies the Act as modifying 26 USC 4251. It\u2019s possible, though, \u00a0that the presence of a USC section in the MODS file might simply mean \u201cfound at the scene of the crime by our parser\u201d rather than \u201cchanged by the Act\u201d. \u00a0The relationship is unclear, and may be impossible to express clearly in XML.<\/p>\n<\/li>\n GPO\u2019s MODS file for the Act treats the mapping implied by the second line of the example pretty loosely, describing small collections of US Code and Stat.L. pages associated with the Act, but not describing any particular relationship between the items in each collection or between collections. \u00a0This is, again, a place where XML falls short of what is possible in an RDF-based, machine-readable model.<\/p>\n<\/li>\n<\/ul>\n The second line of the table entry is the most interesting. \u00a0At first glance, it appears to describe a many-to-many relationship between a range of sections in the Act and a range of pages in the Statutes at Large. \u00a0But it seems improbable that such a relationship would actually describe anything useful, and a quick side-by-side look at the Act and the Statute shows that such an interpretation is incorrect. \u00a0The actual arrangement of page breaks in Stat. L. would indicate that the mapping should be otherwise:<\/p>\n Section 2 appears in its entirety on 121 Stat 1024.<\/p>\n<\/li>\n Section 3 spans the break between 1024 and 1025.<\/p>\n<\/li>\n Section 4 spans 1025 and 1026<\/p>\n<\/li>\n Sections 5 and 6 appear in their entirety on 1026<\/p>\n<\/li>\n<\/ul>\n Why is that? \u00a0The simplest explanation is that the entries in the table — numbers separated by a dash — do not represent lists of individual sections. 
Instead, they represent clusters of sections that are related to each other as clusters. \u00a0They seem to be saying, \u201csomewhere in this clump of legislative language, you\u2019ll find things that relate to things in this other clump of legislative language, and the clumps span multiple sections or provisions, possibly ordered differently in each document\u201d.<\/p>\n Looking at the text itself — which is a series of detailed, interrelated amending instructions — shows that indeed it would be a horrible (and likely very confusing) task to pick the provisions apart into a fully granular mapping, leaving \u201ccluster-to-cluster\u201d mapping as the only viable strategy for describing the relationship between the two texts.<\/p>\n A detailed model of Table III would then require:<\/p>\n clarifying the distinction between a page reference to the first page of an Act as it appears in Stat.L. and the citation of the statute as a whole.<\/p>\n<\/li>\n<\/ul>\n describing each section or subsection (granule) within the Public Law as one that is either<\/p>\n new statutory language, or<\/p>\n<\/li>\n a set of instructions for amending existing language<\/p>\n<\/li>\n<\/ul>\n<\/li>\n describing each target in the USC as either<\/p>\n an actual statute, or<\/p>\n<\/li>\n notes to the statute. It is worth remarking that, in any of the finding aids, the fact that something has been classified to the notes provides a clue as to what that thing is and what the nature of the classified relationship might be. 
This may indicate a need for subproperties that would be accommodated in some future extension.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n distinguishing relationships that involve re-publication (as between Public Laws and Statutes at Large) from those that involve restatement or codification (as between either of those and the US Code)<\/p>\n<\/li>\n using different properties to describe provision-to-provision and cluster-to-cluster relationships<\/p>\n<\/li>\n<\/ul>\n Taken together, these requirements would capture more accurately the relationships the original Table was meant to express. \u00a0In some sense this is an interpretive act: any Table that records codification decisions does, after all, record a set of interpretations, and so will its model. \u00a0But in this case the interpretation is an official one, entrusted to the Law Revision Counsel and in any case practically unavoidable.<\/p>\n Table IV \u201clists the Executive Orders that implement general and permanent law as contained in the United States Code\u201d. \u00a0Executive Orders are instructions from the President mandating an action, reorganization, or policy change in some part of the executive branch. \u00a0They are promulgated pursuant to statutory authority and, as lawful orders of the chief executive, have the force of law. \u00a0They are published in the Federal Register and appear in the annual Compilation of Presidential Documents. \u00a0They are sequentially numbered, but are also identified by date of signing, title, and the authoring president. \u00a0All four of these identifying attributes are specified in the GPO MODS files that accompany these documents in FD\/SYS. \u00a0In addition, there is a reference to the volume and issue number of the Weekly Compilation in which the order appears. 
\u00a0Finally, the MODS files typically include a reference to the enabling law as well.<\/p>\n The Table shows that:<\/p>\n Executive Orders have identifiers, apparently accession numbers that run from the beginning of time.<\/p>\n<\/li>\n Nearly all refer to the \u201cnotes\u201d attached to sections of the USC, since (as the description says) Executive Orders are typically implementation instructions independent of the language of the statute itself.<\/p>\n<\/li>\n<\/ul>\n References to the notes have special features worth remarking. \u00a0Often, the mapping is given to the note preceding (\u201cnt. prec.\u201d) a particular section. \u00a0That distinctive language is rooted in the way that the LRC conceives of the Code\u2019s structure. \u00a0In the minds of the LRC, the Code consists of Titles that are divided into sections. \u00a0Intermediate levels of aggregation — subtitles, parts, subparts, chapters, and subchapters — are convenient fictions used to organize the material in a manner similar to the tabs found in a card catalog. \u00a0Thus, the \u201cnote preceding\u201d a section is most often a note that is attached to the chapter of which the section is a part (chapters are typically, but not always, the level that aggregates sections, and often correspond to an Act as a whole). \u00a0As modelers, we\u2019re presented with a choice between fictions: either we join LRC in pretending that the intermediate levels of aggregation don\u2019t exist, or we make use of them. \u00a0The latter presents other problems with representing parent-child relationships in the structure, but fortunately that is a concern for XML markup designers and not so much for us.<\/p>\n It would seem that the best approach might be to model both sets of relationships: a hierarchical structure based on aggregations, and a sequential structure suggested by the \u201cinsertion model\u201d just described. 
\u00a0In terms of the model, this is just a matter of making sure that identifiers are in place that will facilitate both approaches. \u00a0The main issues raised by this approach have to do with XML markup and encoding; as with other corpora we have encountered (e.g., the Congressional Record), user needs demand, and the model can accommodate, far more than the current publicly available XML encoding of the document will support.<\/p>\n Thus, we would end up with:<\/p>\n a set of unique identifiers for sections, based on title and section numbers and thus reflecting current citation practice;<\/p>\n<\/li>\n a set of subsection identifiers that extend section identifiers in a way that is based on nested subsection labeling;<\/p>\n<\/li>\n a set of supersection identifiers that is based on human-readable hierarchy, represented as paths, e.g. \u201c\/uscode\/title42\/subtitle1\/part3\/subpart5\/chapter7\/subchapterA\u201d;<\/p>\n<\/li>\n a set of completely opaque identifiers for both section and supersection levels. \u00a0There is less need for this at the subsection level, but any such system could easily be extended;<\/p>\n<\/li>\n parent-child relationships between<\/p>\n subsections and sections<\/p>\n<\/li>\n sections and supersections<\/p>\n<\/li>\n supersections and containing supersections<\/p>\n<\/li>\n<\/ul>\n<\/li>\n next-previous relationships between sections. \u00a0These should take no account of supersection boundaries.<\/p>\n<\/li>\n<\/ul>\n As we\u2019ve said in other contexts, it is worthwhile to remember that nothing limits us to a single identifier for any object.<\/p>\n Table V maps Presidential proclamations to the US Code. \u00a0Proclamations differ from Executive Orders in that they do not \u201clegislate\u201d as such. \u00a0Rather, they are issued to commemorate a significant event or similar occasion. \u00a0Like Executive Orders, they are published in the Federal Register, and appear in the Compilation of Presidential Documents. 
Like Executive Orders, they are sequentially numbered (without reference to year, president, etc.), and are also identified by date, title, issuing president, and the volume and issue number of the Weekly Compilation. \u00a0All these identifiers are typically present in the GPO MODS files in FD\/SYS.<\/p>\n Before 1950 or so, the vast majority of proclamations established national monuments. More recently, topics as diverse as the maintenance of the Strategic Petroleum Reserve, tariff schedules, and the celebration of Armed Forces Day show up frequently. \u00a0As with Executive Orders and Table IV, the proclamations have accession numbers, and the vast majority of references are to notes attached to the Code and not the Code itself.<\/p>\n Table VI maps reorganization plans to the US Code. \u00a0Reorganization plans are essentially executive orders that describe major alterations to executive-branch agencies and organization, though they do not carry executive-order identifiers. \u00a0For example, Reorganization Plan 3 of 1970 establishes the Environmental Protection Agency and expands the structure of the National Oceanic and Atmospheric Administration. \u00a0Generally they carry citations to the Statutes at Large and to the Federal Register (the FR cite does not appear in Table VI). \u00a0While no concise identifier exists for them in and of themselves, it appears that they could be identified by a year-number combination (e.g. \u201cRP-1970-3\u201d). \u00a0These associations can readily be modeled by associating an identifier for the plan itself with the page references, through one or more \u201cisPublishedAt\u201d relationships.<\/p>\n The Parallel Table of Authorities and Rules describes relationships between statutes in the US Code and the CFR Parts that they authorize. For the most part, the PTOA maps ranges of sections in the US Code to lists of Parts in the Code of Federal Regulations. 
It has limitations, described by GPO as follows:<\/p>\n Entries in the table are taken directly from the rulemaking authority citation provided by Federal agencies in their regulations. Federal agencies are responsible for keeping these citations current and accurate. Because Federal agencies sometimes present these citations in an inconsistent manner, the table cannot be considered all-inclusive. The portion of the table listing the United States Code citations is the most comprehensive, as these citations are entered into the table whenever they are given in the authority citations provided by the agencies. United States Statutes at Large and public law citations are carried in the table only when there are no corresponding United States Code citations given.<\/p>\n The suggestions made here, then, are observations about a critically important finding aid, strongly related to legislative material, that is in need of some help. \u00a0Thinking about the PTOA and the various ways in which modeling techniques such as the ones we recommend might improve it provides an interesting overview of the problems of legislative finding aids in general.<\/p>\n Richards and Bruce have written extensively about its organization and improvement. \u00a0They note four major areas to address:<\/p>\n Ambiguity in the description of the relationships themselves. \u00a0The Table supposedly models four different types of relationship: express authorization, implied authorization, interpretation, and application. \u00a0These are not distinguished in the PTOA entries.<\/p>\n<\/li>\n Ambiguity in relationship targeting. \u00a0Entries on both sides of the table are typically given as ranges or lists, implying many-to-many relationships that can be combinatorially expanded. 
\u00a0It is not clear whether, in fact, all the sections of the US Code that could be enumerated from a range on the left side of the table would relate to particular Parts of the CFR enumerated from the lists on the right side of the table. \u00a0It seems unlikely.<\/p>\n<\/li>\n Granularity problems related to citation of the CFR materials by Part. \u00a0In reality, the authorizing relationship would typically run from a statute to a particular section of the CFR, but the targeting in the PTOA is to the Part containing that section. It is likely that this is not a problem with granularity so much as it is an informed design decision driven by problems with the volatility of section-level identifiers as compared to printed finding aids. \u00a0Sections of the CFR come and go with some frequency, often moving around within an individual Part. Parts change far less frequently. \u00a0In print, where updating is difficult and withdrawal of stale material even more so, identifier stability is a much bigger concern. \u00a0It is possible that a digital resource could track things much more closely.<\/p>\n<\/li>\n Directionality and reciprocity. \u00a0It is not clear which of the four possible relationships between entries are reciprocal and which are strictly directional, nor is the Table necessarily intended to be used bidirectionally.<\/p>\n<\/li>\n<\/ul>\n Unfortunately, improvement is unlikely, as it would require the collection of improved information from each of the hundreds of agencies involved. Nevertheless, a simplified model can provide at least some useful information. \u00a0The LII currently models the PTOA as a single relationship between individual pairs of identifiers, asserting that each pair in a combinatorial expansion of entries on each side of the table has some such relationship. 
\u00a0That is undoubtedly imprecise, but it is as good as anything currently available and far better than nothing.<\/p>\n As always, we close with a musical selection<\/a>.<\/p>\n Law Librarians\u2019 Society of Washington, DC, \u201cUnited States Statutes and the United States Code: Historical Outlines, Notes, Lists, Tables, and Sources\u201d. At http:\/\/www.llsdc.org\/statutes-code\/<\/a> .<\/p>\n<\/li>\n Wikipedia article on the US Code, at http:\/\/en.wikipedia.org\/wiki\/United_states_code<\/a><\/p>\n<\/li>\n<\/ul>\n Richards, Robert, and Thomas Bruce, \u201cAdapting Specialized Legal Material to the Digital Environment: The Code of Federal Regulations Parallel Table of Authorities and Rules\u201d, in ICAIL \u201911: Proceedings of the 13th International Conference on Artificial Intelligence and Law. A slide deck based on the presentation is at http:\/\/liicr.nl\/M7QyTG<\/a> .<\/p>\n<\/li>\n<\/ul>\n Bruce, Thomas, and David Shetland, \u201cRemarks on Identifiers for Legislation\u201d, at http:\/\/www.law.cornell.edu\/wiki\/lexcraft\/lii_remarks_on_identifiers_for_legislation<\/a> .<\/p>\n<\/li>\n<\/ul>\n Table of Popular Names, online at http:\/\/uscode.house.gov\/popularnames\/popularnames.htm<\/a><\/p>\n<\/li>\n Table I, online at http:\/\/uscode.house.gov\/tables\/usctable1.htm<\/a><\/p>\n<\/li>\n Table II, online at http:\/\/uscode.house.gov\/tables\/usctable2.htm<\/a><\/p>\n<\/li>\n Table III, online at http:\/\/uscode.house.gov\/tables\/usctable3.htm<\/a><\/p>\n<\/li>\n Table IV, online at http:\/\/uscode.house.gov\/tables\/usctable4.htm<\/a><\/p>\n<\/li>\n Table V, online at http:\/\/uscode.house.gov\/tables\/usctable5.htm<\/a><\/p>\n<\/li>\n Table VI, online at http:\/\/uscode.house.gov\/tables\/usctable6.htm<\/a><\/p>\n<\/li>\n Parallel Table of Authorities and Rules, online at http:\/\/www.gpo.gov\/help\/parallel_table.pdf<\/a><\/p>\n<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":" [Editor’s note: this post was co-authored by Tom Bruce, 
John Joergensen, Diane Hillmann, and Jon Phipps. References to the “model” here refer to the LII data model for legislative information that is described and published elsewhere.\u00a0] This post lays out some design criteria for metadata that apply to compilations of enacted legislation, and to […]<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/posts\/121"}],"collection":[{"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/comments?post=121"}],"version-history":[{"count":16,"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/posts\/121\/revisions"}],"predecessor-version":[{"id":139,"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/posts\/121\/revisions\/139"}],"wp:attachment":[{"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/media?parent=121"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/categories?post=121"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.law.cornell.edu\/metasausage\/wp-json\/wp\/v2\/tags?post=121"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}\n
Signing statements<\/a><\/h2>\n
Overarching issues<\/h3>\n
Existing metadata<\/h4>\n
Forms of enacted Federal legislation<\/h2>\n
Overarching issues<\/h3>\n
How do post-passage materials relate to existing systems such as THOMAS, congress.gov, or GovTrack?<\/h4>\n
Updating<\/h4>\n
Identifiers and identifier granularity<\/h4>\n
US Code identifiers<\/h3>\n
Public Laws, Statutes at Large, \u00a0and the United States Code<\/h3>\n
Existing metadata<\/h4>\n
Finding aids<\/h2>\n
General problems<\/h3>\n
Deficiencies of print representations<\/h4>\n
Identifier granularity and alignment<\/h4>\n
Identifiers for Presidential documents: general characteristics<\/h4>\n
The Table of Popular Names (TOPN)<\/a><\/h3>\n
US Code Table I<\/a><\/h3>\n
US Code Table II<\/a><\/h3>\n
US Code Table III<\/a><\/h3>\n
US Code Table IV<\/a><\/h3>\n
US Code Table V<\/a><\/h3>\n
US Code Table VI<\/a><\/h3>\n
The Parallel Table of Authorities (PTOA)<\/h3>\n
References<\/h2>\n
General<\/h3>\n
Other papers<\/h3>\n
Finding aids<\/h3>\n