<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Language Logic Law Software &#187; ontology</title>
	<atom:link href="http://wyner.info/LanguageLogicLawSoftware/index.php/category/ontology/feed/" rel="self" type="application/rss+xml" />
	<link>http://wyner.info/LanguageLogicLawSoftware</link>
	<description>Dr. Adam Wyner&#039;s blog on legal informatics for legal professionals</description>
	<lastBuildDate>Wed, 18 Jan 2012 21:09:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>General Architecture for Text Engineering Summer School 2011</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/05/22/general-architecture-for-text-engineering-summer-school-2011/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/05/22/general-architecture-for-text-engineering-summer-school-2011/#comments</comments>
		<pubDate>Sun, 22 May 2011 19:31:49 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[text analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1090</guid>
		<description><![CDATA[I had the opportunity (thanks Katie Atkinson!) to attend the General Architecture for Text Engineering Summer School 2011. The GATE people have really developed this summer school very well. It was well attended (70 participants?) and well structured (three sections and various talks). GATE attracts a good, outgoing, helpful, and diverse group of people. A [...]]]></description>
			<content:encoded><![CDATA[<p>I had the opportunity (thanks <a href="http://www.csc.liv.ac.uk/~katie/">Katie Atkinson</a>!) to attend the <a href="https://gate.ac.uk/conferences/fig/fig4.html">General Architecture for Text Engineering Summer School 2011</a>.  The GATE people have really developed this summer school very well.  It was well attended (70 participants?) and well structured (three sections and various talks).  GATE attracts a good, outgoing, helpful, and diverse group of people.  A whole week of GATE and never a dull moment.  Geeky, but true.  And text analytics seems to be a growing area (at least according to the May 2011 issue of New Scientist, which lists it as one of seven &#8220;disruptive&#8221; technologies; I&#8217;ve always wanted to be bad).</p>
<p>As this was my second time at the GATE summer school, I sat in on the Advanced GATE session.  All the slides and all the materials for hands-on exercises are available on the <a href="https://gate.ac.uk/wiki/TrainingCourseMay2011/">GATE Summer School Wiki</a>.  In my week, we covered the following:</p>
<ul>
<li>Module 9: Ontologies and Semantic Annotation
<ul>
<li><em>Introduction to Ontologies</em></li>
<li><em>GATE Ontology Editor</em></li>
<li><em>GATE Ontology Annotation Tools for Entities and Relations</em></li>
<li><em>Automatic Semantic Annotation in GATE</em></li>
<li><em>Measuring Performance</em></li>
<li><em>Using the Large Knowledge Base gazetteer (LKB)</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 10: Advanced GATE Applications
<ul>
<li><em>Customising ANNIE</em></li>
<li><em>Working with different languages</em></li>
<li><em>Complex applications</em></li>
<li><em>Conditional Processing</em></li>
<li><em>Section-by-section processing</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 11: Machine Learning
<ul>
<li><em>Machine learning and evaluation concepts</em></li>
<li><em>Using ML in GATE</em></li>
<li><em>Engines and algorithms</em></li>
<li><em>Entity learning hands-on session</em></li>
<li><em>Relation extraction hands-on session</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 12: Opinion Mining
<ul>
<li><em>Introduction to opinion mining and sentiment analysis</em></li>
<li><em>Using GATE tools to perform sentiment analysis</em></li>
<li><em>Machine learning for sentiment analysis hands-on session</em></li>
<li><em>Future directions for opinion mining</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 13: Semantic Technology and Linked Open Data: Basics, Tools, and Applications
<ul>
<li><em>Linked Open Data: Introduction of key principles and some key tools (FactForge, LinkedLifeData)</em></li>
<li><em>Semantic Annotation with Linked Data</em></li>
<li><em>Semantic Search</em></li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/05/22/general-architecture-for-text-engineering-summer-school-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Presentation at Legal Know-how Workshop, Nov. 10, 2010</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/10/08/presentation-at-legal-know-how-workshop-nov-10-2010/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/10/08/presentation-at-legal-know-how-workshop-nov-10-2010/#comments</comments>
		<pubDate>Fri, 08 Oct 2010 17:25:41 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=873</guid>
		<description><![CDATA[I have been invited to make a presentation on Textual information extraction and ontologies for legal case-based reasoning at a Legal Know-how Workshop, which is an industry-oriented event organised by the International Society for Knowledge Organization UK. Date: 10 November 2010 Time: 13:30-19:00 Venue: University College London Medical Sciences Building A. V. Hill Lecture [...]]]></description>
			<content:encoded><![CDATA[<p>I have been invited to make a presentation on <em>Textual information extraction and ontologies for legal case-based reasoning</em> at a <a href="http://www.iskouk.org/events/legal_knowledge_nov2010.htm">Legal Know-how Workshop</a>, which is an industry-oriented event organised by the <a href="http://www.iskouk.org/">International Society for Knowledge Organization UK</a>.</p>
<p>Date:  10 November 2010<br />
Time:  13:30-19:00<br />
Venue: University College London<br />
Medical Sciences Building<br />
A. V. Hill Lecture Theatre<br />
Gower Street<br />
London, WC1E 6BT </p>
<p>See the workshop website for registration fee (either free or under £25) and booking.</p>
<p>This will be a very interesting opportunity to hear from and talk with industry consultants and experts about the latest developments in legal knowledge management.  My thanks to Stella Dextre Clarke of ISKO-UK for organising the event and inviting me to take part.</p>
<h2><a class="bluelink" name="programme"></a><span class="style9"> Programme </span></h2>
<table width="100%"  border="0" cellpadding="2" cellspacing="1" bgcolor="#c1d0c5">
<tr bgcolor="#e4f2f9">
<td width="9%" align="center" valign="top" bgcolor="#efefef" style="color: #336699;"><span class="style14">13:30</span></td>
<td valign="top" bgcolor="#efefef" class="kokonamecell">Registration </td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">14:00</td>
<td valign="top" bgcolor="white" class="kokonamecell">Welcome from ISKO-UK by Stella Dextre Clarke</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">14:05</td>
<td valign="top" bgcolor="white" class="kokonamecell"><span class="style26">Legal knowledge &#8211; the practitioner&#8217;s viewpoint</span><br />
<span class="style30"><a href="legal_knowledge_speakers.htm#farquharson" class="style15">Melanie Farquharson</a>, 3Kites Consulting</span>
<p>This session will focus on the practical situations in which lawyers look for knowledge in order to deliver legal services to their clients. It will identify some typical &#8216;use cases&#8217; and consider ways in which knowledge can be delivered to the practitioner &#8211; even without them having to look for it.</p>
</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">14:35</td>
<td valign="top" bgcolor="white" class="kokonamecell"><span class="style26">Why lawyers need taxonomies &#8211; adventures in organising legal knowledge</span><br />
<span class="style29"><span class="style30"><a class="redlink" href="legal_knowledge_speakers.htm#jacob">Kathy Jacob</a> &#038; <a class="redlink" href="legal_knowledge_speakers.htm#barker">Lynley Barker</a>, Pinsent Masons LLP;<br />
<a href="legal_knowledge_speakers.htm#barber" class="style15">Graham Barbour</a> &#038; <a href="legal_knowledge_speakers.htm#fea" class="style15">Mark Fea</a>, LexisNexis</span></span>
<p>This presentation will cover the practical issues encountered by a law firm in its quest to improve findability of one of its key resources &#8211; knowledge and information. We will discuss our approach to building taxonomies, the tools and processes deployed and how we anticipate our taxonomy will be applied and consumed by lawyers and publishers.<br />
The LexisNexis part of the presentation will focus on the challenges of building and applying legal taxonomies to suit the breadth and depth of content they provide online. It will also examine ways in which taxonomies can be surfaced in the user interface and help to drive compelling functionality that improves the user&#8217;s search experience.</p>
</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">15:20</td>
<td valign="top" bgcolor="white" class="kokonamecell"><span class="style26"><strong>Taxonomy management at Clifford Chance</strong></span><br />
<a href="legal_knowledge_speakers.htm#bergman" class="style15">Mats Bergman</a>, Clifford Chance
<p>This talk will describe how taxonomy management works in practice at Clifford Chance. As an increasing number of core knowledge resources are making use of the same set of firm-wide taxonomies, the increased interdependencies necessitate the implementation of a controlled process for updating the taxonomies. A simple governance model will be presented. Some thoughts will follow on the evolution of taxonomy development within a larger organisation and the current challenge of using social tagging in conjunction with controlled vocabularies.</p>
</td>
</tr>
<tr bgcolor="#e4f2f9">
<td align="center" valign="top" bgcolor="#efefef"><span class="style14">15:50</span></td>
<td valign="top" bgcolor="#efefef" class="kokonamecell style21  style14">Refreshments (Lower Refectory)</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">16:20</td>
<td valign="top" bgcolor="white" class="kokonamecell"><span class="style26"><strong>Textual information extraction and ontologies for legal case-based reasoning</strong></span><br />
           <a href="legal_knowledge_speakers.htm#wyner" class="style15">Adam Wyner</a>, University of Liverpool
<p>
This talk gives a brief overview of current developments and prospects in two related areas of the legal semantic web for legal cases &#8211; textual information extraction and ontologies.  Textual information extraction is a process of automatically annotating and extracting textual information from the legal case base (precedents), thereby identifying elements such as participants, the roles the participants play, the factors which were considered in arriving at a decision, and so on.  The information is valuable not only for search (to find applicable precedents), but also to populate an ontology for legal case-based reasoning.  An ontology is a formal representation of key aspects of the knowledge of legal professionals with which we can reason (e.g. given an assertion that something is a legal case, we can infer other properties) and with respect to which we can write rules (e.g. reasoning using case factors to arrive at a legal decision).  Since it is expensive to manually populate an ontology (meaning to read cases and input the data into the ontology), we use textual information extraction to automatically populate the ontology.  We conclude with an appeal for open source, collaborative development of legal knowledge systems among partners in academia, industry, and government.</p>
</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">17:00</td>
<td valign="top" bgcolor="white" class="kokonamecell"><span class="style26"><strong>Collaboration across boundaries</strong></span><br />
<a href="legal_knowledge_speakers.htm#sippings" class="style15">Gwenda Sippings</a> &#038; <a href="legal_knowledge_speakers.htm#bredenoord" class="style15">Gerard Bredenoord</a>, Linklaters LLP
<p>In this presentation, we will look at approaches to managing legal know-how in a major global law firm. We will describe several boundaries that we have to consider when organising our know-how, including boundaries between professionals, countries, internal and external resources and the well debated boundary between information and knowledge. We will also share some of the ways in which we are making our know-how available to the fee earners and other professionals in the firm, using social and technological solutions.</p>
</td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="white">17:35</td>
<td valign="top" bgcolor="white" class="kokonamecell"><span class="style26"><strong>Reconciling the taxonomy needs of different users</strong></span><br /><a class="redlink" href="legal_knowledge_speakers.htm#sturdy">Derek Sturdy</a>, Tikit Knowledge Services
<p>The last decade has seen the development of a substantial number of legal know-how and knowledge databases. It has also shown up a serious question on whether the metadata, and especially the taxonomies, that are applied to the various knowledge items, should be tailored to the particular needs of end-users, or whether, so to speak, &quot;one size can fit all&quot;. In particular, this talk will discuss the overlapping, but discrete, needs of those using knowledge resources primarily for legal drafting and document production, and of those conducting legal research, and will address the relative value today, (as opposed to in 2000), of the effort put into internal metadata creation for those two sorts of end-users.</p>
</td>
</tr>
</table>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/10/08/presentation-at-legal-know-how-workshop-nov-10-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Legal Case Ontology OWL file and Case Graphic</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/05/05/legal-case-ontology-owl-file/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/05/05/legal-case-ontology-owl-file/#comments</comments>
		<pubDate>Wed, 05 May 2010 14:48:27 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=817</guid>
		<description><![CDATA[In conjunction with the paper by Rinke Hoekstra and me (as previously noted on this blog), we are making the ontology and a graphic of Popov v. Hayashi available: Legal Case Ontology v9 This is the OWL file. It was developed using Protege version 4, a knowledge acquisition and editing tool. As we have not [...]]]></description>
			<content:encoded><![CDATA[<p>In conjunction with the paper by Rinke Hoekstra and me (<a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2010/04/28/new-article-on-legal-case-ontologies-in-knowledge-engineering-review/">as previously noted on this blog</a>), we are making the ontology and a graphic of <em>Popov v. Hayashi</em> available:</p>
<p><a href="http://wyner.info/research/ontologies/LegalCaseOntology_v9.owl">Legal Case Ontology v9</a></p>
<p>This is the OWL file.  It was developed using <a href="http://protege.stanford.edu/">Protege</a> version 4, a knowledge acquisition and editing tool.</p>
<p>As we have not previously made this a publicly available ontology, consider it a <em>beta</em> release.  Comments very welcome.</p>
<p>The graphic is the ontological representation of <em>Popov v. Hayashi</em>; it is a pdf file.</p>
<p><a href="http://wyner.info/research/ontologies/popov-v-hayashi-relations-inferred.pdf">Ontological Graphic for <em>Popov v. Hayashi</em></a></p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/05/05/legal-case-ontology-owl-file/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>New Article on Legal Case Ontologies in Knowledge Engineering Review</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/04/28/new-article-on-legal-case-ontologies-in-knowledge-engineering-review/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/04/28/new-article-on-legal-case-ontologies-in-knowledge-engineering-review/#comments</comments>
		<pubDate>Wed, 28 Apr 2010 09:50:06 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=804</guid>
		<description><![CDATA[Rinke Hoekstra and I have a paper which will appear in Knowledge Engineering Review. A Legal Case OWL Ontology with an Instantiation of Popov v. Hayashi Adam Wyner and Rinke Hoekstra To appear in Knowledge Engineering Review Abstract The paper provides an OWL ontology for legal cases with an instantiation of the legal case Popov [...]]]></description>
			<content:encoded><![CDATA[<p>Rinke Hoekstra and I have a paper which will appear in Knowledge Engineering Review.</p>
<p><a href="http://wyner.info/research/Papers/WynerHoekstraKER2010Ontology.pdf"><strong>A Legal Case OWL Ontology with an Instantiation of <em>Popov v. Hayashi</em></strong></a><br />
Adam Wyner and Rinke Hoekstra<br />
To appear in Knowledge Engineering Review</p>
<p><em>Abstract</em><br />
The paper provides an OWL ontology for legal cases with an instantiation of the legal case <em>Popov v. Hayashi</em>. The ontology makes explicit the conceptual knowledge of the legal case domain, supports reasoning about the domain, and can be used to annotate the text of cases, which in turn can be used to populate the ontology. A populated ontology is a case base which can be used for information retrieval, information extraction, and case based reasoning. The ontology contains not only elements of indexing the case (e.g. the parties, jurisdiction, and date), but as well elements used to reason to a decision such as argument schemes and the components input to the schemes. We use the Protege ontology editor and knowledge acquisition system, current guidelines for ontology development, and tools for visual and linguistic presentation of the ontology.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/04/28/new-article-on-legal-case-ontologies-in-knowledge-engineering-review/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Forthcoming Article:  On Controlled Natural Languages:  Properties and Prospects</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/22/forthcoming-article-on-controlled-natural-languages-properties-and-prospects/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/22/forthcoming-article-on-controlled-natural-languages-properties-and-prospects/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 19:35:33 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[controlled natural language]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=738</guid>
		<description><![CDATA[I am a co-author of the forthcoming article On Controlled Natural Languages: Properties and Prospects. From the abstract: This collaborative report highlights the properties and prospects of Controlled Natural Languages (CNLs). The report poses a range of questions concerning the goals of the CNL, the design, the linguistic aspects, the relationships and evaluation of CNLs, [...]]]></description>
			<content:encoded><![CDATA[<p>I am a co-author of the forthcoming article <em>On Controlled Natural Languages:  Properties and Prospects</em>.  From the abstract:</p>
<blockquote><p>
This collaborative report highlights the properties and prospects of Controlled Natural Languages (CNLs).  The report poses a range of questions concerning the goals of the CNL, the design, the linguistic aspects, the relationships and evaluation of CNLs, and the application tools.  In posing the questions, the report attempts to structure the field of CNLs and to encourage further systematic discussion by researchers and developers.
</p></blockquote>
<p>The reference and link to the article:</p>
<p>A. Wyner, K. Angelov, G. Barzdins, D. Damljanovic, N. Fuchs, S. Hoefler, K. Jones, K. Kaljurand, T. Kuhn, M. Luts, J. Pool, M. Rosner, R. Schwitter, and J. Sowa. <a href="http://wyner.info/research/Papers/CNLP&#038;P.pdf">On Controlled Natural Languages: Properties and Prospects</a>, to appear in: N.E. Fuchs (ed.), Workshop on Controlled Natural Languages, CNL 2009, LNCS/LNAI 5972, Springer, 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/22/forthcoming-article-on-controlled-natural-languages-properties-and-prospects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Instructions for GATE&#8217;s Onto Root Gazetteer</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/notes-on-onto-root-gazetteer/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/notes-on-onto-root-gazetteer/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 18:41:48 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[RDF/XML]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[taxonomy]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=355</guid>
		<description><![CDATA[In this post, I present User Manual notes for GATE&#8217;s Onto Root Gazetteer (ORG) and references to ORG. In Discussion of GATE&#8217;s Onto Root Gazetteer, I discuss aspects of Onto Root Gazetteer which I found interesting or problematic. These notes and discussion may be of use to those researchers in legal informatics who are interested [...]]]></description>
			<content:encoded><![CDATA[<p>In this post, I present User Manual notes for GATE&#8217;s <em>Onto Root Gazetteer</em> (ORG) and references to ORG.  In <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/discussion-of-gates-onto-root-gazetteer/">Discussion of GATE&#8217;s Onto Root Gazetteer</a>, I discuss aspects of Onto Root Gazetteer which I found interesting or problematic.  These notes and discussion may be of use to those researchers in legal informatics who are interested in text mining and annotation for the semantic web.</p>
<p>Thanks to Diana Maynard, Danica Damljanovic, Phil Gooch, and the GATE User Manual for comments and materials which I have liberally used.  Errors rest with me (and please tell me where they are so I can fix them!).</p>
<p><strong>Purpose</strong></p>
<p>Onto Root Gazetteer links text to an ontology by creating Lookup annotations which come from the ontology rather than a default gazetteer.  The ontology is preprocessed to produce a flexible, dynamic gazetteer; that is, it is a gazetteer which takes into account alternative morphological forms and can be added to.  An important advantage is that text can be annotated as an individual of the ontology, thus facilitating the population of the ontology.</p>
<p>Besides being flexible and dynamic, some advantages of ORG over other gazetteers:</p>
<ul>
<li>It is more richly structured (see it as a gazetteer containing other gazetteers)</li>
<li>It allows one to relate textual and ontological information by adding instances.</li>
<li>It gives one richer annotations that can be used for further processes.</li>
</ul>
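<p>To make the idea concrete, here is a minimal sketch in Python of what a gazetteer derived from an ontology and matched on root forms might look like.  It is a toy, not how ORG is implemented: the class names are only loosely modelled on the GATE ontology used later in this post, <em>naive_root</em> is a crude stand-in for a real morphological analyser, and all function names are hypothetical.</p>

```python
import re

# Hypothetical class names, loosely modelled on the GATE ontology used later in this post.
ONTOLOGY_CLASSES = ["LanguageResource", "ProcessingResource", "ResourceParameter"]


def naive_root(word):
    """Crude stand-in for a morphological analyser: lowercase and strip a plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_gazetteer(class_names):
    """Map the root-form phrase derived from each class name to that class."""
    gazetteer = {}
    for name in class_names:
        # Split CamelCase, e.g. "LanguageResource" becomes ["Language", "Resource"].
        words = re.findall(r"[A-Z][a-z]+|[A-Z]+(?![a-z])", name)
        gazetteer[tuple(naive_root(w) for w in words)] = name
    return gazetteer


def lookup(text, gazetteer):
    """Return (matched text, class name) pairs, preferring the longest match at each position."""
    tokens = re.findall(r"[A-Za-z]+", text)
    roots = [naive_root(t) for t in tokens]
    longest = max((len(key) for key in gazetteer), default=0)
    results = []
    position = 0
    while position != len(tokens):
        for size in range(min(longest, len(tokens) - position), 0, -1):
            key = tuple(roots[position:position + size])
            if key in gazetteer:
                matched = " ".join(tokens[position:position + size])
                results.append((matched, gazetteer[key]))
                position += size
                break
        else:
            # No entry starts at this token; move on to the next one.
            position += 1
    return results


print(lookup("language resources and parameters", build_gazetteer(ONTOLOGY_CLASSES)))
```

<p>Because matching is done on roots, the plural &#8220;language resources&#8221; is linked to the class LanguageResource.  The real ORG also applies heuristics that this toy does not attempt, so its annotations will differ.</p>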
<p>In the following, we present the step by step instructions for &#8216;rolling your own&#8217;, then show the results of the &#8216;prepackaged&#8217; example that comes with the plugin.</p>
<p><strong>Setup</strong></p>
<p>Step 1.  Add (if not already used) the Onto Root Gazetteer plugin to GATE following the usual plugin instructions.</p>
<p>Step 2.  Add (if not already used) the Ontology Tools (OWLIM Ontology LR, OntoGazetteer, GATE Ontology Editor, OAT) plugin.  ORG uses ontologies, so one must have these tools to load them as language resources.</p>
<p>Step 3.  Create (or load) an ontology with OWLIM (see the instructions on the ontologies).  This is the ontology that is the language resource that is then used by Onto Root Gazetteer.  Suppose this ontology is called myOntology.  It is important to note that OWLIM can only use OWL-Lite ontologies (see the documentation about this).  Also, I succeeded in loading an ontology only from the resources folder of the Ontology_Tools plugin (rather than from another drive); I don&#8217;t know if this is significant.</p>
<p>Step 4.  In GATE, create processing resources with default parameters:</p>
<ul>
<li>Document Reset PR</li>
<li>RegEx Sentence Splitter (or ANNIE Sentence Splitter, but that one is likely to run slower)</li>
<li>ANNIE English Tokeniser</li>
<li>ANNIE POS Tagger</li>
<li>GATE Morphological Analyser</li>
</ul>
<p>Step 5.  When all these PRs are loaded, create an Onto Root Gazetteer PR and set its initial parameters.  The mandatory ones are listed below (though some are set by default):</p>
<ul>
<li>Ontology: select previously created myOntology</li>
<li>Tokeniser: select previously created Tokeniser</li>
<li>POSTagger: select previously created POS Tagger</li>
<li>Morpher: select previously created Morpher.</li>
</ul>
<p>Step 6. Create another PR, a Flexible Gazetteer. In its initial parameters, it is mandatory to select the previously created OntoRootGazetteer for gazetteerInst.  For the parameter inputFeatureNames, click the button on the right and, when prompted with a window, enter &#8216;Token.root&#8217; in the text box provided, then click the Add button. Click OK, give the new PR a name (optional), and then click OK again.</p>
<p>Step 7.  To create an application, right-click on Applications, then New &rarr; Pipeline (or Corpus Pipeline).  Add the following PRs to the application in this order:</p>
<ul>
<li>Document Reset PR</li>
<li>RegEx Sentence Splitter</li>
<li>ANNIE English Tokeniser</li>
<li>ANNIE POS Tagger</li>
<li>GATE Morphological Analyser</li>
<li>Flexible Gazetteer</li>
</ul>
<p>Step 8.  Run the application over the selected corpus.</p>
<p>Step 9.  Inspect the results.  Look at the Annotation Set with Lookup and also the Annotation List to see how the annotations appear.</p>
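<p>The order in Step 7 matters because the Flexible Gazetteer reads the Token.root feature, which only exists once the Morphological Analyser has run.  The following Python sketch checks an ordering against such dependencies.  Apart from the Token.root requirement described in Step 6, the &#8220;needs&#8221; and &#8220;adds&#8221; sets are my illustrative assumptions rather than anything taken from the GATE documentation, and <em>check_order</em> is a hypothetical helper, not part of GATE.</p>

```python
# Each entry: (PR name from Step 7, what it needs, what it adds).
# The needs/adds sets are illustrative assumptions, not GATE's documented behaviour.
PIPELINE = [
    ("Document Reset PR", set(), set()),
    ("RegEx Sentence Splitter", set(), {"Sentence"}),
    ("ANNIE English Tokeniser", set(), {"Token"}),
    ("ANNIE POS Tagger", {"Sentence", "Token"}, {"Token.category"}),
    ("GATE Morphological Analyser", {"Token", "Token.category"}, {"Token.root"}),
    ("Flexible Gazetteer", {"Token.root"}, {"Lookup"}),
]


def check_order(pipeline):
    """Return (PR name, missing inputs) for every PR that runs before its inputs exist."""
    available = set()
    problems = []
    for name, needs, adds in pipeline:
        missing = needs - available
        if missing:
            problems.append((name, sorted(missing)))
        available |= adds
    return problems


print(check_order(PIPELINE))

# Running the Flexible Gazetteer before the Morphological Analyser is reported as a problem.
swapped = PIPELINE[:4] + [PIPELINE[5], PIPELINE[4]]
print(check_order(swapped))
```

<p>In GATE itself nothing enforces this; a misordered pipeline simply runs, and the gazetteer would most likely find nothing to match against.</p>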
<p><strong>Small Example</strong></p>
<p>The ORG plugin comes with a demo application which sets up not only all the PRs and LRs (the text, corpus, and ontology), but also the application itself, ready to run.  This is the file <em>exampleApp.xgapp</em>, which is in the resource folder of the plugin (Ontology_Based_Gazetteer).  To use it, start GATE with a clean slate (no other PRs, LRs, or applications), right-click on Applications, choose Restore application from file, and load the file from the folder just given.</p>
<p>The ontology which is used for an illustration is for GATE itself, giving the classes, subclasses, and instances of the system.  While the ontology is loaded along with the application, one can find it <a href="http://gate.ac.uk/ns/gate-kb">here</a>.  The text is simple (and comes with the application):  <em>language resources and parameters</em>.</p>
<p>FIGURE 1 (missing at the moment)</p>
<p>FIGURE 2 (missing at the moment)</p>
<p>One can see that the token &#8220;language resources&#8221; is annotated with respect to the class LanguageResource, &#8220;resources&#8221; is annotated with GATEResource, and &#8220;parameters&#8221; is annotated with ResourceParameter.  We discuss this further below.</p>
<p>One further aspect is important and useful.  Since the ontology tools have been loaded and a particular ontology has been used, one can not only see the ontology (open the OAT tab in the window with the text), but <em>one can annotate the text with respect to the ontology</em> &#8212; highlight some text and a popup menu allows one to select how to annotate the text.  With this, one can add instances (or classes) to the ontology.</p>
<p><strong>Documentation</strong></p>
<p>One can consult the following for further information about how the gazetteer is made, among other topics:</p>
<ul>
<li>GATE User Manual on <a href="http://gate.ac.uk/sale/tao/splitch13.html#x18-31400013.8">Onto Root Gazetteer</a>.</li>
<li>See section 3.1. of <a href="http://gate.ac.uk/sale/lrec2008/clone-ql/clone-ql-paper.pdf">this paper</a>.</li>
<li>See section 4 of <a href="http://www.tao-project.eu/resources/publicdeliverables/d3-1.pdf">TAO Deliverables</a>.</li>
<li>See Peters and Maynard in <a href="http://www.neon-project.org/web-content/images/Publications/neon_2009_d1042.pdf">A GATE-way into NeON</a>.</li>
</ul>
<p><strong>Discussion</strong></p>
<p>See the related post <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/discussion-of-gates-onto-root-gazetteer/">Discussion of GATE&#8217;s Onto Root Gazetteer</a>.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/notes-on-onto-root-gazetteer/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Discussion of GATE&#8217;s Onto Root Gazetteer</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/discussion-of-gates-onto-root-gazetteer/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/discussion-of-gates-onto-root-gazetteer/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 16:20:04 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[ontology]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=498</guid>
		<description><![CDATA[In Instructions for GATE&#8217;s Onto Root Gazetteer, I have information to set up Onto Root Gazetteer. In this post, I discuss aspects of the Onto Root Gazetteer that I found interesting or problematic. For me, the documentation was not helpful as too much technical information was provided (e.g. preprocessing the ontology) rather than the steps [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/notes-on-onto-root-gazetteer/">Instructions for GATE&#8217;s Onto Root Gazetteer</a>, I provide information on setting up Onto Root Gazetteer (ORG).  In this post, I discuss aspects of the Onto Root Gazetteer that I found interesting or problematic.</p>
<p>For me, the documentation was not helpful, as it provided too much technical information (e.g. preprocessing the ontology) rather than just the steps to get it to run.  Also, there was no clearly illustrated walk-through example.  I would still like (and will provide in the near future) a richer text (a nice paragraph) and a simpler ontology (a couple of classes, subclasses, object and data properties, and individuals) to illustrate fully just what is done.</p>
<p>Though I have it running, there are several questions (and partial answers or musings):</p>
<ul>
<li>What is the annotation relative to the ontology good for?</li>
<li>What is the difference between gazetteers derived from ontologies and default gazetteers?</li>
<li>What are the selection criteria for annotating the tokens?</li>
<li>What is the relationship between the annotated text and the ontology?</li>
</ul>
<p>Concerning the first point, presumably more annotations allow more processing capabilities.  A (simple) example would be very helpful.</p>
<p>Concerning the second point, matters are more complex (to my mind).  First, default gazetteers (or flexible gazetteers for that matter) are <em>flat</em> lists (a list containing no sublists as parts) where the <em>items in the list are annotated as per the properties of the list</em>; for example, if we have a gazetteer for Organisation (call this the header of the list) which lists IBM, BBC, Hackney Council (call these the items of the list), then every token of IBM, BBC, and Hackney Council found in the corpus will be annotated <em>Organisation</em>.  If there is a token <em>organisation</em> in the corpus, it will <em>not</em> be annotated with <em>Organisation</em>; similarly, no token of IBM in the corpus is annotated IBM.  The list categorises, in effect, IBM, BBC, and Hackney Council as of the type <em>Organisation</em>.</p>
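<p>The behaviour of a flat gazetteer described above can be sketched in a few lines of Python.  This is only a toy model, not GATE code: the names <em>GAZETTEER</em> and <em>annotate_flat</em> are hypothetical, and matching is plain case-sensitive string search rather than anything GATE actually does.</p>

```python
# Toy model of a flat (default) gazetteer, not GATE code.
# Each list has a header and items; a matching token is annotated
# with the header of its list, never with the item itself.
GAZETTEER = {
    "Organisation": ["IBM", "BBC", "Hackney Council"],
}


def annotate_flat(text, gazetteer):
    """Return sorted (start, end, matched_string, annotation) tuples."""
    annotations = []
    for header, items in gazetteer.items():
        for item in items:
            start = text.find(item)
            while start != -1:
                annotations.append((start, start + len(item), item, header))
                start = text.find(item, start + 1)
    return sorted(annotations)


text = "IBM and the BBC are each an organisation; so is Hackney Council."
result = annotate_flat(text, GAZETTEER)
for start, end, matched, label in result:
    print(matched, "is annotated", label)
```

<p>In this sketch the word &#8220;organisation&#8221; in the text receives no annotation, because only the items of the list are looked up, and every match is labelled with the header <em>Organisation</em> rather than with its own string.</p>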
<p>ORG works differently (I believe, but may be wrong), but these points are not made in the documentation.  First, a gazetteer which is derived from an ontology preserves the subsumption hierarchy of the ontology, giving us a list of lists.  Such a gazetteer is a taxonomy of terminology, which is <em>not</em> the same as an ontology (though the two are frequently mistaken to be identical).  Second, if a token in the text is found to (flexibly) match an item in the gazetteer, then the token is annotated with that item, meaning that if the string IBM is a token in our text and an item in the gazetteer, then the token is annotated <em>IBM</em>.  In these respects, ORGs work differently from other gazetteers.</p>
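<p>To make the contrast concrete, here is a minimal Python sketch of a gazetteer derived from an ontology, as I understand the behaviour described above.  It is an illustration under assumptions, not ORG itself: the tiny ontology, the <em>example.org</em> namespace, and the function names are all hypothetical, and the flexible (root-based) matching is left out in favour of exact word matching.</p>

```python
# Toy model of an ontology-derived gazetteer, not GATE or ORG code.
# The ontology, namespace and names below are hypothetical.
NS = "http://example.org/onto#"
SUBCLASS_OF = {  # child class mapped to parent class
    "Company": "Organisation",
    "Broadcaster": "Organisation",
}
INSTANCE_OF = {  # individual mapped to its class
    "IBM": "Company",
    "BBC": "Broadcaster",
}


def ancestors(cls):
    """Walk up the subsumption hierarchy from cls to the root."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def annotate_from_ontology(text):
    """Annotate each matched word with the URI of the matching resource."""
    annotations = []
    for word in text.replace(".", " ").split():
        if word in INSTANCE_OF:
            cls = INSTANCE_OF[word]
            annotations.append({"string": word, "URI": NS + word,
                                "type": "instance",
                                "classes": [cls] + ancestors(cls)})
        elif word in SUBCLASS_OF or word in SUBCLASS_OF.values():
            annotations.append({"string": word, "URI": NS + word,
                                "type": "class",
                                "classes": ancestors(word)})
    return annotations


onto_result = annotate_from_ontology("IBM is an Organisation.")
for annotation in onto_result:
    print(annotation)
```

<p>Here the token IBM is annotated with the resource IBM itself, and the hierarchy above it (Company, then Organisation) remains recoverable, whereas a flat gazetteer would only label it with the header of its list.</p>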
<p>The third question might be addressed in the richer documentation concerning ORG.  It relates to observations concerning the results of the example application.  Consider the following.  The token &#8220;language resources&#8221; has the annotation:</p>
<p>URI=http://gate.ac.uk/ns/gate-ontology#LanguageResource, heuristic_level=0, majorType=, propertyURI=http://www.w3.org/2000/01/rdf-schema#label, type=class</p>
<p>The token &#8220;resources&#8221; has the annotation:</p>
<p>URI=http://gate.ac.uk/ns/gate-ontology#GATEResource, heuristic_level=0, majorType=, propertyURI=http://www.w3.org/2000/01/rdf-schema#label, type=class</p>
<p>And the token &#8220;parameters&#8221; has annotation:</p>
<p>URI=http://gate.ac.uk/ns/gate-ontology#ResourceParameter, heuristic_level=0, majorType=, propertyURI=http://www.w3.org/2000/01/rdf-schema#label, type=class</p>
<p>We see that the tokens in the text are annotated in relation to the ontology.  Yet it is not clear why the token &#8220;resources&#8221; is not annotated with LanguageResource or ResourceParameter since these are components of the ORG as well.  Likely there is some prioritising among the annotations that we need to learn.</p>
<p>Finally, concerning the last question, matters are somewhat unclear (to me), largely because the lines between annotations, gazetteers, and ontologies are blurred; for me the key unclarity centres on annotations in the text that match items in the gazetteer.  Consider the issue from a different point of view.  ORG was developed in the context of a project to support ontology development from text &#8212; find terms and relations which are candidates for the ontology, then (if one wants) use the terms and relations to build the ontology.  For example, if one sees lots of occurrences of &#8220;organisation&#8221; in the text, then perhaps it would be introduced as a concept in the ontology.  We have a <em>many-one</em> relation from the tokens to the ontology.  This makes sense.  Seen another way, we have a default gazetteer where every given token (e.g. IBM) in a text has the same annotation, giving the impression of a <em>one-many</em> relation.  This also makes sense.  Neither of these seems problematic to me, largely because I don&#8217;t really know much or presume much about the meaning of the annotation on the token:  from the text, I abstract the concept; from the gazetteer, I label tokens as belonging to the same annotation class.  In no case is a token &#8220;organisation&#8221; annotated with Organisation; even if it were, I couldn&#8217;t really object unless I said more about what I think the annotation means.</p>
<p>Contrast these points with what goes on with ORG (admittedly, this gets pretty philosophical, and in terms of day-to-day practice, it may not be relevant).  First, it seems that one instance in the ontology is associated with multiple tokens in the text.  Second, an instance or class in the ontology can be associated with a token that is intended to have some similar meaning &#8212; e.g. the individual IBM in the ontology is associated by annotation with every token of IBM in the text, and similarly for the classes.  Neither of these makes sense to me in terms of what ontologies are intended to represent, which is a state of knowledge (the fixed concepts, object and data properties, and individuals) about a domain.  On the first point, how can I be assured that the intended meaning of tokens is the same throughout the corpus?  In one document, we might find IBM as the name of a non-existent company, in another for an existing company, and in yet another for a company that has gone bankrupt.  Simply put, the string might remain the same, but the knowledge we have about it may vary.  Ontologies (as they are currently represented) do <em>not</em> allow such dynamic interpretation.  To ignore this point risks having annotations (and whatever might flow from the annotations) slip; for example, it would be wrong to find a relationship between IBM and owners where the company doesn&#8217;t exist.  On the second point, conceptually it makes no sense to say that a token &#8220;organisation&#8221; is itself associated with the concept or instance &#8216;organisation&#8217; in the ontology.  Of course, in developing the ontology, going from the text to the ontology makes good sense since one is abstracting from the text to the ontology.  Yet, in that move, one makes something different &#8212; a concept over all the &#8220;ideas&#8221; drawn from the tokens.
So, I disagree emphatically with Peters and Maynard (from the NeON article):  &#8220;Texts are annotated with ontology classes, and the textual elements function as instances of these classes.&#8221;  The textual element &#8220;organisation&#8221; or &#8220;IBM&#8221; is an instance of the concept organisation or the individual IBM?  I think this is a category mistake.</p>
<p>In general, I find the relationship between the text, intermediate representations (gazetteers), and ontologies (higher level representations of knowledge) rather interesting, but somewhat murky.  As I said earlier, perhaps this is just philosophy.  Depending on the domain of discussion, the corpus, and the way the annotations and ontologies are used, perhaps my intuition of lurking trouble will not be realised&#8230;  Equally, there is likely something simple that I&#8217;m missing.  If so, please enlighten me.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/11/24/discussion-of-gates-onto-root-gazetteer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Meeting with John Sheridan on the Semantic Web and Public Administration</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/08/11/meeting-with-john-sheridan-on-the-semantic-web-and-public-administration/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/08/11/meeting-with-john-sheridan-on-the-semantic-web-and-public-administration/#comments</comments>
		<pubDate>Tue, 11 Aug 2009 20:02:44 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[e-Government]]></category>
		<category><![CDATA[law]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=263</guid>
		<description><![CDATA[I met today with John Sheridan, Head of e-Services, Office of Public Sector Information, The National Archives, located at the Ministry of Justice, London, UK. Also at the meeting was John&#8217;s colleague Clare Allison. John and I had met at the ICAIL conference in Barcelona, where we briefly discussed our interests in applications of Semantic [...]]]></description>
			<content:encoded><![CDATA[<p>I met today with John Sheridan, Head of e-Services, Office of Public Sector Information, The National Archives, located at the Ministry of Justice, London, UK.  Also at the meeting was John&#8217;s colleague Clare Allison.  John and I had met at the ICAIL conference in Barcelona, where we briefly discussed our interests in applications of Semantic Web technologies to legal informatics in the public sector.  Recently, John got back in contact to talk further about how we might develop projects in this area.</p>
<p>Perhaps most striking to me is that John made it clear that the government (at least his sector) is proactive, looking for research and development projects that make government data available and usable in a variety of ways.  In addition, he wanted to develop a range of collaborations to better understand the opportunities the Semantic Web may offer.</p>
<p>As part of catching up with what is going on, I took a look around the web for relatively recent documents on related activities.</p>
<ul>
<li>Blog notes on a meeting in Feb. 2009 of <a href="http://blog.okfn.org/2009/02/06/barcamp-ukgovweb-2009/">BarCamp-UKGovWeb</a> which contains a discussion of open government and re-using government data.</li>
<li><a href="http://www.appsi.gov.uk/presentations/jsheridan-presentation-09-12-2008.pdf">Slide presentation by John Sheridan</a> at the Advisory Panel on Public Sector Information, Dec. 2008.</li>
<li>Blog notes from <a href="http://blogs.talis.com/nodalities/tag/open-government">Talis</a> on open government</li>
<li>A <a href="http://cloudofdata.com/2009/07/john-sheridan-talks-about-the-drive-to-get-government-data-online/">podcast with John Sheridan</a> about open government.</li>
<li>Slides by <a href="http://tomheath.com/slides/2009-07-cercedilla-how-to-publish-linked-data.pdf">Tom Heath on Linked Data</a>.</li>
</ul>
<p>In our discussion, John gave me an overview of the current state of affairs in public access to legislation, in particular, <a href="http://www.legislation.gov.uk/">the legislative markup and API</a>.  The markup is intended to support publication, revision, and maintenance of legislation, among other possibilities.  We also had some discussion about developing an ontology of government which would be linked to legislation.</p>
<p>Another interesting dimension is that John&#8217;s office is one of a few that I know of which are actively engaged in developing a knowledge economy partly encouraged by public administrative requirements and goals.  Others in this area are the Dutch and the US (with xml.gov).  All very promising, and the discussions are well worth following up.</p>
<p>Copyright &copy; 2009 Adam Wyner</p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/08/11/meeting-with-john-sheridan-on-the-semantic-web-and-public-administration/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Participating in One-Lex &#8212; Managing Legal Resources on the Semantic Web</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/07/22/participating-in-one-lex-managing-legal-resources-on-the-semantic-web/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/07/22/participating-in-one-lex-managing-legal-resources-on-the-semantic-web/#comments</comments>
		<pubDate>Wed, 22 Jul 2009 15:29:21 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[e-Government]]></category>
		<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[ontology]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=236</guid>
		<description><![CDATA[Later this summer, I&#8217;ll be participating in the summer school Managing Legal Resources in the Semantic Web, September 7 to 12 in San Domenico di Fiesole (Florence, Italy). This program will focus on several aspects of legal document management: Drafting methods, to improve the language and the structure of legislative texts Legal XML standards, to [...]]]></description>
			<content:encoded><![CDATA[<p>Later this summer, I&#8217;ll be participating in the summer school <a href="http://www.one-lex.eu/Activities/summerschool09/legislativexml.html">Managing Legal Resources in the Semantic Web</a>, September 7 to 12 in San Domenico di Fiesole (Florence, Italy).  This program will focus on several aspects of legal document management:</p>
<ul>
<li>Drafting methods, to improve the language and the structure of legislative texts</li>
<li>Legal XML standards, to improve the accessibility and interoperability of legal resources</li>
<li>Legal ontologies, to capture legal metadata and legal semantics</li>
<li>Formal representation of legal contents,  to support legal reasoning and argumentation</li>
<li>Workflow models, to cope with the lifecycle of legal documentation</li>
</ul>
<p>While I&#8217;m familiar with several of these areas, I&#8217;m using this opportunity to fill in my knowledge in these key areas.</p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/07/22/participating-in-one-lex-managing-legal-resources-on-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>General Architecture for Text Engineering Summer School</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/07/22/general-architecture-for-text-engineering-summer-school/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/07/22/general-architecture-for-text-engineering-summer-school/#comments</comments>
		<pubDate>Wed, 22 Jul 2009 14:54:46 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=220</guid>
		<description><![CDATA[Next week I&#8217;m attending a week-long summer school on General Architecture for Text Engineering (GATE). GATE is an open-source and extensible toolkit for text mining, which has been used in a variety of areas. After having worked with people who had their &#8220;hands on&#8221; the tools, I decided it would better suit me to [...]]]></description>
			<content:encoded><![CDATA[<p>Next week I&#8217;m attending a week-long summer school on <a href="http://www.gate.ac.uk/conferences/fig09/">General Architecture for Text Engineering (GATE)</a>.  GATE is an open-source and extensible toolkit for text mining, which has been used in a variety of areas.  After having worked with people who had their &#8220;hands on&#8221; the tools, I decided it would better suit me to be able to work with the material myself.  I&#8217;ve been looking forward to this summer school for some time and am excited at the prospect of applying GATE tools to a DB of legal cases as well as developing an ontology.</p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2009/07/22/general-architecture-for-text-engineering-summer-school/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

