{"id":2061,"date":"2011-11-17T06:41:07","date_gmt":"2011-11-17T11:41:07","guid":{"rendered":"http:\/\/blog.law.cornell.edu\/voxpop\/?p=2061"},"modified":"2011-11-17T06:41:07","modified_gmt":"2011-11-17T11:41:07","slug":"legal-prosumers-how-can-government-leverage-user-generated-content","status":"publish","type":"post","link":"https:\/\/blog.law.cornell.edu\/voxpop\/2011\/11\/17\/legal-prosumers-how-can-government-leverage-user-generated-content\/","title":{"rendered":"Legal Prosumers: How Can Government Leverage User-Generated Content?"},"content":{"rendered":"
Prosumption: shifting the barriers between information producers and consumers

One of the major revolutions of the Internet era has been the shifting of the frontiers between producers and consumers [1]. Prosumption refers to the emergence of a new category of actors who not only consume content but also contribute to its creation and sharing. Under the umbrella of Web 2.0, many sites enable users to share multimedia content, data, experiences [2], views and opinions on different issues, and even to act cooperatively to solve global problems [3]. Web 2.0 has become fertile terrain for the proliferation of valuable user data, enabling user profiling, opinion mining, trend and crisis detection, and collective problem solving [4].

The private sector has long understood the potential of user data, using it to analyse customer preferences and satisfaction, find sales opportunities, develop marketing strategies, and drive innovation. Recently, corporations have relied on Web platforms to gather ideas from clients for improving existing products and services or developing new ones (see, for instance, Dell's IdeaStorm, Salesforce's IdeaExchange, and My Starbucks Idea). Similarly, Lego's Mindstorms encourages users to share their robot designs online; under the Terms of Service, those designs become public knowledge and can be freely reused by Lego (and anyone else). Companies have also recently been mining social network data to anticipate future actions of the Occupy Wall Street movement.

Even scientists have caught up, adopting collaborative methods that enable the participation of laymen in scientific projects [5].

Now, how far has government gone in taking up this opportunity?

Some recent initiatives indicate that the public sector is aware of the potential of the "wisdom of crowds." In the domain of public health, MedWatcher is a mobile application that allows the general public to submit information about experienced drug side effects directly to the US Food and Drug Administration. In other cases, governments have asked citizens for general input and ideas, such as the brainstorming session organized by the Obama administration, the wiki launched by the New Zealand Police to gather citizens' suggestions for the drafting of a new policing act to be presented to parliament, or the website of the Department of Transport and Main Roads of the State of Queensland, which encourages citizens to share their stories of road tragedies.

Even in so crucial a task as the drafting of a constitution, governments have relied on citizens' input through crowdsourcing [6]. More recently, several other initiatives have fostered crowdsourcing for constitutional reform in Morocco and in Egypt.

It is thus undeniable that we are witnessing an accelerated redefinition of the frontiers between experts and non-experts, scientists and non-scientists, doctors and patients, public officers and citizens, professional journalists and street reporters. The 'Net has provided the infrastructure and the platforms that enable collaborative work. Network connection is hardly a problem anymore. The problem is data analysis.

In other words: how to make sense of the flood of data produced and distributed by heterogeneous users?
And more importantly, how to make sense of user-generated data in the light of more institutional sets of data (e.g., scientific, medical, legal)? The efficient use of crowdsourced data in public decision making requires building an informational flow between user experiences and institutional datasets.

Similarly, enhancing user access to public data means matching user case descriptions with institutional data repositories ("What are my rights and obligations in this case?"; "Which public office can help me?"; "What is the delay in the resolution of my case?"; "How many cases like mine have there been in this area in the last month?").

From the point of view of data processing, we are clearly facing a problem of semantic mapping and data structuring. The challenge is thus to overcome the flood of isolated information while avoiding excessive management costs. There is still a long way to go before tools for content aggregation and semantic mapping are generally available. This is why private firms and governments still mostly rely on the manual processing of user input.

The new producers of legally relevant content: a taxonomy

Before digging deeper into the challenges of efficiently managing crowdsourced data, let us take a closer look at the types of user-generated data flowing through the Internet that have some kind of legal or institutional flavour. One type of user data emerges spontaneously from citizens' online activity. User data can also be prompted by institutions as a result of participatory governance initiatives.

This variety of media and knowledge producers gives rise to a plurality of textual genres, semantically rich but difficult to manage given their heterogeneity and rapid evolution.

Managing crowdsourcing

The goal of crowdsourcing in an institutional context is to extract and aggregate content relevant to the management of public issues and to public decision making. Knowledge management strategies vary considerably depending on how the user data have been generated. We can think of three possible strategies for managing the flood of user data.

Pre-structuring: prompting the citizen narrative in a strategic way

A possible solution is to elicit user input in a structured way; that is, to impose some constraints on user input. This is the solution adopted by IdeaScale, a software application that was used by the Open Government Dialogue initiative of the Obama Administration. In IdeaScale, users are asked to check whether their idea has already been covered by other users before adding a new one. They are also invited to vote for the best ideas, so that the community itself rates, and thus indirectly filters, the users' input.

The MIT Deliberatorium, a technology aimed at supporting large-scale online deliberation, follows a similar strategy. Users are expected to follow a series of rules that enable the correct creation of a knowledge map of the discussion: each post should be limited to a single idea, it should not be redundant, and it should be linked to a suitable part of the knowledge map. Furthermore, posts are validated by moderators, who ensure that new posts follow the rules of the system. Other systems that implement the same idea are featurelist and Debategraph [7].

While these systems enhance the creation and visualization of structured argument maps and promote community engagement through rating systems, they have a number of limitations. The most important is that human intervention is needed to manually check the correct structure of the posts. Semantic technologies can play an important role in bridging this gap.
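To make the rule set concrete, here is a minimal sketch, in Python and with hypothetical data structures, of the kind of checks a pre-filter (or a moderator assisted by one) might run before a post is attached to the knowledge map. It is not the Deliberatorium's actual implementation; the word-count and lexical-overlap heuristics are placeholders for the judgments that moderators currently make by hand.

```python
from dataclasses import dataclass, field

@dataclass
class Post:
    text: str
    parent_id: str  # the node of the knowledge map this post claims to extend

@dataclass
class KnowledgeMap:
    nodes: dict = field(default_factory=dict)  # node_id -> Post

def lexical_overlap(a: str, b: str) -> float:
    """Crude Jaccard overlap of word sets, standing in for real redundancy detection."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if (wa or wb) else 0.0

def validate(post: Post, kmap: KnowledgeMap, max_words: int = 80) -> list[str]:
    """Return rule violations; an empty list means the post can be attached to the map."""
    problems = []
    # Rule 1: each post should be limited to a single idea (rough length heuristic).
    if len(post.text.split()) > max_words:
        problems.append("post is probably expressing more than a single idea")
    # Rule 2: the post should not be redundant with existing contributions.
    if any(lexical_overlap(post.text, p.text) > 0.8 for p in kmap.nodes.values()):
        problems.append("post duplicates an existing contribution")
    # Rule 3: the post should be linked to a suitable part of the knowledge map.
    if kmap.nodes and post.parent_id not in kmap.nodes:
        problems.append("post is not attached to any existing node of the map")
    return problems

kmap = KnowledgeMap(nodes={"root": Post("Should the policing act cover online conduct?", "")})
print(validate(Post("Yes, online conduct should be covered.", "root"), kmap))  # -> []
```

Even such a crude filter shows where automation is easy (checking attachment to an existing node) and where it is not (deciding whether two posts really express the same idea), which is precisely where semantic technologies come in.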
Semantic analysis through ontologies and terminologies

Ontology-driven analysis of user-generated text implies finding a way to bridge Semantic Web data structures, such as formal ontologies expressed in RDF or OWL, with the unstructured, implicit ontologies emerging from user-generated content. Sometimes these emergent lightweight ontologies take the form of unstructured lists of terms that users employ to tag online content. A number of works have dealt with this issue, especially in the field of social tagging of Web resources in online communities. More concretely, several works have proposed models for reconciling top-down metadata structures (ontologies) with bottom-up tagging mechanisms (folksonomies).

The possibilities range from transforming folksonomies into lightly formalized semantic resources (Lux and Dsinger, 2007; Mika, 2005) to mapping folksonomy tags to the concepts and instances of available formal ontologies (Specia and Motta, 2007; Passant, 2007). At the basis of these works lies the notion of emergent semantics (Mika, 2005), which questions the autonomy of engineered ontologies and emphasizes the value of meaning emerging from distributed communities working collaboratively through the Web.

We have recently worked on several case studies in which we proposed a mapping between legal and lay terminologies. We followed the approach proposed by Passant (2007) and enriched the available ontologies with the terminology appearing in lay corpora. For this purpose, OWL classes were complemented with a has_lexicalization property linking them to lay terms.
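As a concrete illustration of this enrichment step, the following is a minimal sketch using the Python rdflib library. The namespaces, the SalesContract class, and the lay terms are illustrative placeholders rather than the actual ONTOMEDIA resources; the point is simply how a has_lexicalization annotation property can link an expert-level OWL class to the expressions consumers actually use.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import OWL, RDF, RDFS

MCO = Namespace("http://example.org/mco#")  # hypothetical namespace standing in for the Mediation-Core Ontology
LEX = Namespace("http://example.org/lex#")  # hypothetical namespace for the bridging property

g = Graph()
g.bind("mco", MCO)
g.bind("lex", LEX)

# A hypothetical expert-level class from the ontology.
g.add((MCO.SalesContract, RDF.type, OWL.Class))
g.add((MCO.SalesContract, RDFS.label, Literal("sales contract", lang="en")))

# Declare the bridging property and attach lay lexicalizations found in the user corpus.
g.add((LEX.has_lexicalization, RDF.type, OWL.AnnotationProperty))
for lay_term in ["purchase", "the thing I bought", "my order"]:
    g.add((MCO.SalesContract, LEX.has_lexicalization, Literal(lay_term, lang="en")))

print(g.serialize(format="turtle"))
```

Serialized as Turtle, such statements leave the engineered ontology untouched while layering the lay vocabulary on top of it, so that queries phrased in common-sense language can be routed to the right classes.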
The first case study that we conducted belongs to the domain of consumer justice and was framed within the ONTOMEDIA project. We proposed to reuse the available Mediation-Core Ontology (MCO) and Consumer Mediation Ontology (COM) as anchors to legal, institutional, and expert knowledge, and therefore as entry points for the queries posed by consumers in common-sense language.

The user corpus contained around 10,000 consumer questions and 20,000 complaints addressed to the Catalan Consumer Agency between 2007 and 2010. We applied a traditional terminology extraction methodology to identify candidate terms, which were subsequently validated by legal experts. We then manually mapped the lay terms to the ontological classes; the relations used for this mapping are mostly has_lexicalisation and has_instance.
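As an illustration of the automatic first step of that methodology, the sketch below extracts frequent word bigrams from a plain-text corpus as candidate terms. It stands in for the actual extraction tooling used in the project, which is not reproduced here; the toy sample corpus is invented, and real candidates would still require validation by legal experts, as noted above.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "my", "i", "it", "is", "was", "that"}

def candidate_terms(corpus: str, top_n: int = 50) -> list[tuple[str, int]]:
    """Return the most frequent word bigrams in the corpus as candidate lay terms."""
    tokens = [t for t in re.findall(r"[a-zà-ÿ]+", corpus.lower()) if t not in STOPWORDS]
    bigrams = Counter(zip(tokens, tokens[1:]))
    return [(" ".join(pair), freq) for pair, freq in bigrams.most_common(top_n)]

if __name__ == "__main__":
    # A toy English stand-in for the (Catalan) corpus of consumer questions and complaints.
    sample = ("I want to cancel my phone contract. The phone contract was renewed "
              "without my consent, and the phone company refuses to cancel the contract.")
    for term, freq in candidate_terms(sample)[:5]:
        print(f"{freq:3d}  {term}")
```

Frequency-based extraction of this kind over-generates, which is exactly why expert validation and manual mapping to the ontological classes remained necessary.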