<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Language Logic Law Software &#187; GATE</title>
	<atom:link href="http://wyner.info/LanguageLogicLawSoftware/index.php/category/gate/feed/" rel="self" type="application/rss+xml" />
	<link>http://wyner.info/LanguageLogicLawSoftware</link>
	<description>Dr. Adam Wyner&#039;s blog on legal informatics for legal professionals</description>
	<lastBuildDate>Wed, 18 Jan 2012 21:09:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>CFP &#8211; Workshop on Semantic Processing of Legal Texts (SPLeT 2012)</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/12/19/cfp-workshop-on-semantic-processing-of-legal-texts-splet-2012/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/12/19/cfp-workshop-on-semantic-processing-of-legal-texts-splet-2012/#comments</comments>
		<pubDate>Mon, 19 Dec 2011 18:59:33 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[controlled natural language]]></category>
		<category><![CDATA[law]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[text analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1233</guid>
		<description><![CDATA[In conjunction with Language Resources and Evaluation Conference 2012 (LREC 2012) 27 May, 2012 Istanbul, Turkey Context: The legal domain represents a primary candidate for web-based information distribution, exchange and management, as testified by the numerous e-government, e-justice and e-democracy initiatives worldwide. The last few years have seen a growing body of research and practice [...]]]></description>
			<content:encoded><![CDATA[<p>In conjunction with</p>
<p><strong><a href="http://www.lrec-conf.org/lrec2012/">Language Resources and Evaluation Conference 2012 (LREC 2012)</a></strong></p>
<p>27 May, 2012<br />
Istanbul, Turkey</p>
<p><strong>Context</strong>:</p>
<p>The legal domain represents a primary candidate for web-based information distribution, exchange and management, as testified by the numerous e-government, e-justice and e-democracy initiatives worldwide. The last few years have seen a growing body of research and practice in the field of Artificial Intelligence and Law which addresses a range of topics: automated legal reasoning and argumentation, semantic and cross-language legal information retrieval, document classification, legal drafting, legal knowledge discovery and extraction, as well as the construction of legal ontologies and their application to the law domain. In this context, it is of paramount importance to use Natural Language Processing techniques and tools that automate and facilitate the process of knowledge extraction from legal texts.</p>
<p>Since 2008, the SPLeT workshops have been a venue where researchers from the Computational Linguistics and Artificial Intelligence and Law communities meet, exchange information, compare perspectives, and share experiences and concerns on the topic of legal knowledge extraction and management, with particular emphasis on the semantic processing of legal texts. Within the Artificial Intelligence and Law community, there have also been a number of dedicated workshops and tutorials specifically focussing on different aspects of semantic processing of legal texts at conferences such as JURIX-2008, ICAIL-2009, ICAIL-2011, as well as in the International Summer School “Managing Legal Resources in the Semantic Web” (2007, 2008, 2009, 2010, 2011). </p>
<p>To continue this momentum and to advance research, a 4th Workshop on “Semantic Processing of Legal Texts” is being organized at the LREC-2012 conference to bring to the attention of the broader LR/HLT (Language Resources/Human Language Technology) community the specific technical challenges posed by the semantic processing of legal texts and also share with the community the motivations and objectives which make it of interest to researchers in legal informatics. The outcome of these interactions is expected to advance research and applications and foster interdisciplinary collaboration within the legal domain.</p>
<p>New to this edition of the workshop are two sub-events (described below) to provide common and consistent task definitions, datasets, and evaluation for legal-IE systems along with a forum for the presentation of varying but focused efforts on their development.</p>
<p>The main goals of the workshop and associated events are to provide an overview of the state-of-the-art in legal knowledge extraction and management, to explore new research and development directions and emerging trends, and to exchange information regarding legal language resources and human language technologies and their applications. </p>
<p><strong>Sub-events</strong>:</p>
<p><em>Dependency Parsing</em><br />
The first sub-event will be a shared task specifically focusing on dependency parsing of legal texts: although this is not a domain-specific task, it is a task which creates the prerequisites for advanced IE applications operating on legal texts, which can benefit from reliable preprocessing tools. For this year our aim is to create the prerequisites for more advanced domain-specific tasks (e.g. event extraction) to be organized in future SPLeT editions. We strongly believe that this could be a way to attract the attention of the LR/HLT community to the specific challenges posed by the analysis of this type of text and to have a clearer idea of the current state of the art. The languages dealt with will be Italian and English. A specific Call for Participation for the shared task is available on a <a href="http://poesix1.ilc.cnr.it/splet_shared_task/">dedicated page</a>.</p>
<p><em>Semantic Annotation</em><br />
The second sub-event will be an online, manual, collaborative, semantic annotation exercise, the results of which will be presented and discussed at the workshop. The goals of the exercise are: (1) to gain insight into and work towards the creation of a gold standard corpus of legal documents in a cohesive domain; and (2) to test the feasibility of the exercise and to get feedback on its annotation structure and workflow. The corpus to be annotated will be a selection of documents drawn from EU and US legislation, regulation, and case law in a particular domain (e.g. consumer or environmental protection). For this exercise, the language will be English. A specific Call for Participation for this annotation exercise is available on a <a href="http://wyner.info/LanguageLogicLawSoftware/?p=744">dedicated page</a>.</p>
<p><strong>Areas of Interest</strong>:</p>
<p>The workshop will focus on the topics of the automatic extraction of information from legal texts and the structural organisation of the extracted knowledge. Particular emphasis will be given to the crucial role of language resources and human language technologies. </p>
<p>Papers are invited on, but not limited to, the following topics:</p>
<ul>
<li>Construction, extension, merging, customization of legal language resources, e.g. terminologies, thesauri, ontologies, corpora</li>
<li>Information retrieval and extraction from legal texts</li>
<li>Semantic annotation of legal text</li>
<li>Legal text processing</li>
<li>Multilingual aspects of legal text semantic processing</li>
<li>Legal thesauri mapping</li>
<li>Automatic classification of legal documents</li>
<li>Logical analysis of legal language</li>
<li>Automated parsing and translation of natural language legal arguments into a logical formalism</li>
<li>Dialogue protocols for legal information processing</li>
<li>Controlled language systems for law</li>
</ul>
<p><strong>Workshop Schedule &#8211; TBA</strong>:</p>
<p><strong>Workshop Registration and Location &#8211; TBA</strong>:</p>
<p><strong>Webpage URL</strong>:</p>
<p><a href="http://wyner.info/LanguageLogicLawSoftware/?p=1233">http://wyner.info/LanguageLogicLawSoftware/?p=1233</a></p>
<p><strong>Important Dates</strong>:</p>
<ul>
<li>Submission:  10 February 2012</li>
<li>Acceptance Notification:  5 March 2012</li>
<li>Final Version:  23 March 2012</li>
<li>Workshop date:  27 May 2012</li>
</ul>
<p><strong>Author Guidelines</strong>: </p>
<p>Submissions are solicited from researchers working on all aspects of semantic processing of legal texts. Authors are invited to submit papers describing original completed work, work in progress, interesting problems, case studies or research trends related to one or more of the topics of interest listed above. The final version of the accepted papers will be published in the Workshop Proceedings. </p>
<p>Short or full papers can be submitted. Short papers are expected to present new ideas or new visions that may influence the direction of future research, yet they may be less mature than full papers. While an exhaustive evaluation of the proposed ideas is not necessary, insight and in-depth understanding of the issues are expected. Full papers should be better developed and evaluated. Short papers will be reviewed the same way as full papers by the Program Committee and will be published in the Workshop Proceedings. </p>
<p>Full paper submissions should not exceed 10 pages, short papers 6 pages; both should be typeset using a font size of 11 points. Style files will be made available by LREC for the camera-ready versions of accepted papers. Papers should be submitted electronically, no later than February 10, 2012. The only accepted format for submitted papers is Adobe PDF.</p>
<p><strong>Submit papers to</strong>:</p>
<p>Submission will be electronic using START paper submission software available at:</p>
<p><a href="https://www.softconf.com/lrec2012/SPLeT2012/">https://www.softconf.com/lrec2012/SPLeT2012/</a></p>
<p>Note that when submitting a paper through the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. For further information on this new initiative, please refer to:</p>
<p><a href="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012">http://www.lrec-conf.org/lrec2012/?LRE-Map-2012</a></p>
<p><strong>Publication</strong>:</p>
<p>After the workshop a number of selected, revised, peer-reviewed articles will be published in a Special Issue on Semantic Processing of Legal Texts of the <em>AI and Law Journal</em> (Springer).</p>
<p><strong>Contact Information</strong>:</p>
<p>Address any queries regarding the workshop to:</p>
<p>lrec_legalWS@ilc.cnr.it</p>
<p><strong>Program Committee Co-Chairs</strong>:</p>
<p>Enrico Francesconi (National Research Council, Italy)<br />
Simonetta Montemagni (National Research Council, Italy)<br />
Wim Peters (University of Sheffield, UK)<br />
Adam Wyner (University of Liverpool, UK)</p>
<p><strong>Program Committee (Preliminary)</strong>:</p>
<p>Kevin Ashley (University of Pittsburgh, USA)<br />
Johan Bos (University of Rome, Italy)<br />
Daniele Bourcier (Humboldt Universitat, Germany)<br />
Pompeu Casanovas (Universitat Autonoma de Barcelona, Spain)<br />
Jack Conrad (Thomson Reuters, USA)<br />
Matthias Grabmair (University of Pittsburgh, USA)<br />
Antonio Lazari (Scuola Superiore S.Anna, Italy)<br />
Leonardo Lesmo (Universita di Torino, Italy)<br />
Marie-Francine Moens (Katholieke Universiteit Leuven, Belgium)<br />
Thorne McCarty (Rutgers University, USA)<br />
Raquel Mochales Palau (Catholic University of Leuven, Belgium)<br />
Paulo Quaresma (Universidade de Evora, Portugal)<br />
Tony Russell-Rose (UXLabs, UK)<br />
Erich Schweighofer (Universitat Wien, Austria)<br />
Rolf Schwitter (Macquarie University, Australia)<br />
Manfred Stede (University of Potsdam, Germany)<br />
Daniela Tiscornia (National Research Council, Italy)<br />
Tom van Engers (University of Amsterdam, Netherlands)<br />
Giulia Venturi (Scuola Superiore S.Anna, Italy)<br />
Vern R. Walker (Hofstra University, USA)<br />
Radboud Winkels (University of Amsterdam, Netherlands)</p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/12/19/cfp-workshop-on-semantic-processing-of-legal-texts-splet-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Papers Accepted to the JURIX 2011 Conference</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/10/13/papers-accepted-to-the-jurix-2011-conference/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/10/13/papers-accepted-to-the-jurix-2011-conference/#comments</comments>
		<pubDate>Thu, 13 Oct 2011 13:42:04 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[RDF/XML]]></category>
		<category><![CDATA[e-Government]]></category>
		<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[text analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1204</guid>
		<description><![CDATA[My colleagues and I have had two papers (one long and one short) accepted for presentation at The 24th International Conference on Legal Knowledge and Information Systems (JURIX 2011). The papers are available on the links. On Rule Extraction from Regulations Adam Wyner and Wim Peters Abstract Rules in regulations such as found in the [...]]]></description>
			<content:encoded><![CDATA[<p>My colleagues and I have had two papers (one long and one short) accepted for presentation at <a href="https://www.univie.ac.at/RI/JURIX2011/">The 24th International Conference on Legal Knowledge and Information Systems (JURIX 2011)</a>.  The papers are available at the links below.</p>
<p><a href="http://wyner.info/research/Papers/WynerPetersJURIX2011.pdf">On Rule Extraction from Regulations</a><br />
Adam Wyner and Wim Peters</p>
<p>Abstract<br />
Rules in regulations such as found in the US Federal Code of Regulations can be expressed using conditional and deontic rules.  Identifying and extracting such rules from the language of the source material would be useful for automating rulebook management and translating into an executable logic.  The paper presents a linguistically-oriented, rule-based approach, which is in contrast to a machine learning approach.  It outlines use cases, discusses the source materials, reviews the methodology, then provides initial results and future steps.</p>
<p><a href="http://wyner.info/research/Papers/Populating_JURIX2011.pdf">Populating an Online Consultation Tool</a><br />
Sarah Pulfrey-Taylor, Emily Henthorn, Katie Atkinson, Adam Wyner, and Trevor Bench-Capon</p>
<p>Abstract<br />
The paper addresses the extraction, formalisation, and presentation of public policy arguments.  Arguments are extracted from documents that comment on public policy proposals.  Formalising the information from the arguments enables the construction of models and systematic analysis of the arguments.  In addition, the arguments are represented in a form suitable for presentation in an online consultation tool.  Thus, the forms in the consultation correlate with the formalisation and can be evaluated accordingly.  The stages of the process are outlined with reference to a working example.</p>
<p><a href="http://wyner.info/LanguageLogicLawSoftware/?p=1204">Shortlink to this page.</a></p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/10/13/papers-accepted-to-the-jurix-2011-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Draft &#8212; Materials for LEX 2011</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/09/08/materials-for-lex-2011/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/09/08/materials-for-lex-2011/#comments</comments>
		<pubDate>Thu, 08 Sep 2011 12:03:10 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[RDF/XML]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[text analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1146</guid>
		<description><![CDATA[Draft post At the links below, you can find the slides and hands on materials on GATE for the LEX summer school on Managing Legal Resources in the Semantic Web. GATE Legislative Rulebook By Adam Wyner Distributed under the Creative Commons Attribution-Non-Commercial-Share Alike 2.0]]></description>
			<content:encoded><![CDATA[<p>Draft post</p>
<p>At the links below, you can find the slides and hands-on materials on GATE for the LEX summer school on Managing Legal Resources in the Semantic Web.</p>
<p><a href="http://wyner.info/research/Papers/WynerGATELegislativeRulebook.zip">GATE Legislative Rulebook</a></p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/09/08/materials-for-lex-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Recent Paper</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/06/13/recent-paper/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/06/13/recent-paper/#comments</comments>
		<pubDate>Mon, 13 Jun 2011 16:29:50 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[case-based reasoning]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[text analytics]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1134</guid>
		<description><![CDATA[A paper I presented at 4th Workshop on Legal Ontologies and Artificial Intelligence Techniques is to appear in the journal Rivista Informatica e diritto, an Italian journal on AI and Law. Towards Annotating and Extracting Textual Legal Case Elements Adam Wyner Abstract The paper presents an outline of a method for semantic, conceptual search in [...]]]></description>
			<content:encoded><![CDATA[<p>A paper I presented at the <a href="http://www.ittig.cnr.it/loait/loait10.html">4th Workshop on Legal Ontologies and Artificial Intelligence Techniques</a> is to appear in the journal <a href="http://www.ittig.cnr.it/EditoriaServizi/AttivitaEditoriale/InformaticaEDiritto/presentazione.htm">Rivista Informatica e diritto</a>, an Italian journal on AI and Law.</p>
<p><a href="http://wyner.info/research/Papers/WynerLOAIT2010Final.pdf">Towards Annotating and Extracting Textual Legal Case Elements</a><br />
Adam Wyner</p>
<p>Abstract<br />
The paper presents an outline of a method for semantic, conceptual search in legal case documents using the GATE tool.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/06/13/recent-paper/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>General Architecture for Text Engineering Summer School 2011</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/05/22/general-architecture-for-text-engineering-summer-school-2011/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/05/22/general-architecture-for-text-engineering-summer-school-2011/#comments</comments>
		<pubDate>Sun, 22 May 2011 19:31:49 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[ontology]]></category>
		<category><![CDATA[text analytics]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1090</guid>
		<description><![CDATA[I had the opportunity (thanks Katie Atkinson!) to attend the General Architecture for Text Engineering Summer School 2011. The GATE people have really developed this summer school very well. It was well attended (70 participants?) and well structured (three sections and various talks). GATE attracts a good, outgoing, helpful, and diverse group of people. A [...]]]></description>
			<content:encoded><![CDATA[<p>I had the opportunity (thanks <a href="http://www.csc.liv.ac.uk/~katie/">Katie Atkinson</a>!) to attend the <a href="https://gate.ac.uk/conferences/fig/fig4.html">General Architecture for Text Engineering Summer School 2011</a>.  The GATE people have really developed this summer school very well.  It was well attended (70 participants?) and well structured (three sections and various talks).  GATE attracts a good, outgoing, helpful, and diverse group of people.  A whole week of GATE and never a dull moment.  Geeky, but true.  And text analytics seems to be a growing area (at least according to the May 2011 issue of New Scientist, which lists it as one of seven &#8220;disruptive&#8221; technologies; I&#8217;ve always wanted to be bad).</p>
<p>As this was my second time at the GATE summer school, I sat in on the Advanced GATE session.  All the slides and all the materials for hands-on exercises are available on the <a href="https://gate.ac.uk/wiki/TrainingCourseMay2011/">GATE Summer School Wiki</a>.  In my week, we covered the following:</p>
<ul>
<li>Module 9: Ontologies and Semantic Annotation
<ul>
<li><em>Introduction to Ontologies</em></li>
<li><em>GATE Ontology Editor</em></li>
<li><em>GATE Ontology Annotation Tools for Entities and Relations</em></li>
<li><em>Automatic Semantic Annotation in GATE</em></li>
<li><em>Measuring Performance</em></li>
<li><em>Using the Large Knowledge Base gazetteer (LKB)</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 10: Advanced GATE Applications
<ul>
<li><em>Customising ANNIE</em></li>
<li><em>Working with different languages</em></li>
<li><em>Complex applications</em></li>
<li><em>Conditional Processing</em></li>
<li><em>Section-by-section processing</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 11: Machine Learning
<ul>
<li><em>Machine learning and evaluation concepts</em></li>
<li><em>Using ML in GATE</em></li>
<li><em>Engines and algorithms</em></li>
<li><em>Entity learning hands-on session</em></li>
<li><em>Relation extraction hands-on session</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 12: Opinion Mining
<ul>
<li><em>Introduction to opinion mining and sentiment analysis</em></li>
<li><em>Using GATE tools to perform sentiment analysis</em></li>
<li><em>Machine learning for sentiment analysis hands-on session</em></li>
<li><em>Future directions for opinion mining</em></li>
</ul>
</li>
</ul>
<ul>
<li>Module 13: Semantic Technology and Linked Open Data: Basics, Tools, and Applications
<ul>
<li><em>Linked Open Data: Introduction of key principles and some key tools (FactForge, LinkedLifeData)</em></li>
<li><em>Semantic Annotation with Linked Data</em></li>
<li><em>Semantic Search</em></li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/05/22/general-architecture-for-text-engineering-summer-school-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ICAIL 2011 Tutorial:  Textual Information Extraction from Legal Resources Using GATE</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/02/19/textual-information-extraction-from-legal-resources-using-gate/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/02/19/textual-information-extraction-from-legal-resources-using-gate/#comments</comments>
		<pubDate>Sat, 19 Feb 2011 18:55:49 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=1011</guid>
		<description><![CDATA[Slides for ICAIL tutorial, Monday, June 6, 2011, University of Pittsburgh. Textual Information Extraction from Legal Resources using GATE]]></description>
			<content:encoded><![CDATA[<p>Slides for ICAIL tutorial, Monday, June 6, 2011, University of Pittsburgh.</p>
<p><a href="http://wyner.info/research/Papers/WynerICAIL2011LegalTextAnalytics02.pdf">Textual Information Extraction from Legal Resources using GATE</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2011/02/19/textual-information-extraction-from-legal-resources-using-gate/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Legal Know-How Workshop Presentations</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/11/16/legal-know-how-workshop-presentations/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/11/16/legal-know-how-workshop-presentations/#comments</comments>
		<pubDate>Tue, 16 Nov 2010 16:24:07 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[law]]></category>
		<category><![CDATA[legal knowledge management]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=932</guid>
		<description><![CDATA[On November 10, 2010, I gave a presentation at the International Society for Knowledge Organisation&#8217;s meeting on Legal Know-How. It was an interesting meeting, where I got the opportunity to present my work to members of the legal profession, hear what law firms are doing about knowledge management, and make some good new contacts. The slides [...]]]></description>
			<content:encoded><![CDATA[<p>On November 10, 2010, I gave a presentation at the International Society for Knowledge Organisation&#8217;s meeting on Legal Know-How.  It was an interesting meeting, where I got the opportunity to present my work to members of the legal profession, hear what law firms are doing about knowledge management, and make some good new contacts.</p>
<p>The slides of all the talks, including mine, are available:</p>
<p><a href="http://www.iskouk.org/events/legal_knowledge_nov2010.htm">ISKO-UK Legal Know-How meeting</a></p>
<p>In a couple of weeks, ISKO will also add mp3s of the talks, so one can see the slides and hear the talks.  Nice way to do things, as remarks and narration are almost more crucial than the slides themselves.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/11/16/legal-know-how-workshop-presentations/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Paper accepted at JURIX 2010</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/10/08/paper-accepted-at-jurix-2010/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/10/08/paper-accepted-at-jurix-2010/#comments</comments>
		<pubDate>Fri, 08 Oct 2010 16:59:54 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=869</guid>
		<description><![CDATA[My colleague Wim Peters and I have had our paper Lexical Semantics and Expert Legal Knowledge towards the Identification of Legal Case Factors accepted for presentation at JURIX 2010. The list of accepted papers is here. The paper will appear in the proceedings, but it is available by clicking on the paper title above. Abstract [...]]]></description>
			<content:encoded><![CDATA[<p>My colleague Wim Peters and I have had our paper</p>
<p><a href="http://wyner.info/research/Papers/WynerPetersCaseFactorsJURIX2010Final.pdf">Lexical Semantics and Expert Legal Knowledge towards the Identification of Legal Case Factors</a></p>
<p>accepted for presentation at <a href="http://conference.jurix.nl/2010/">JURIX 2010</a>.  The list of accepted papers is <a href="http://www.jurix.nl/?p=197">here</a>.  The paper will appear in the proceedings, but it is available by clicking on the paper title above.</p>
<p><em>Abstract</em><br />
Legal case factors are textually represented facts which are represented in reported legal case decisions.  Precedent decisions contribute to the decision of a case under consideration.  As textually represented facts, factors linguistically encode semantic properties and relationships among the entities which can be leveraged to identify and extract the legal case factors from decisions.  We integrate legal and linguistic resources in a text analysis tool with which we annotate textual passages.  Using annotations tailored to legal case factors, the legal researcher can rapidly zero in on textual spans which represent specific combinations of factors, participants, and semantic properties which bear on who played what role with respect to a factor.  The research reports progress on the development of a tool.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/10/08/paper-accepted-at-jurix-2010/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Information Extraction of Legal Case Features with Lists and Rules</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/21/information-extraction-of-legal-case-features-with-lists-and-rules/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/21/information-extraction-of-legal-case-features-with-lists-and-rules/#comments</comments>
		<pubDate>Thu, 21 Jan 2010 18:04:01 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=445</guid>
		<description><![CDATA[In this post, we show how legal case features can be annotated using lists and rules in GATE. By features, we mean a range of detailed information that may be relevant to searching for cases or extracting information such as the parties, the other legal professionals involved (judges, lawyers, etc), location, decision, case citation, legislation, [...]]]></description>
			<content:encoded><![CDATA[<p>In this post, we show how legal case features can be annotated using lists and rules in GATE.  By features, we mean a range of detailed information that may be relevant to searching for cases or extracting information, such as the parties, the other legal professionals involved (judges, lawyers, etc.), location, decision, case citation, legislation, and so on.  In a forthcoming related post, we discuss how to use an ontology to annotate cases.  For background on case based reasoning, see <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/information-extraction-of-legal-case-factors/">Information Extraction of Legal Case Factors</a>.  (See <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/introduction-to-a-series-of-posts-on-legal-information-extraction-with-gate/">introductory notes</a> on this and related posts.)</p>
<p><strong>Features of cases</strong></p>
<p>Legal cases contain a wealth of detailed information such as:</p>
<ul>
<li>Case citation.
</li>
<li>Names of parties.
</li>
<li>Roles of parties, meaning plaintiff or defendant.
</li>
<li>Type of court.
</li>
<li>Names of judges.
</li>
<li>Names of attorneys.
</li>
<li>Roles of attorneys, meaning the side they represent.
</li>
<li>Final decision.
</li>
<li>Cases cited.
</li>
<li>Relation of precedents to current case.
</li>
<li>Case structural features such as sections.
</li>
<li>Nature of the case, meaning keywords which classify the case in terms of subject (e.g. criminal assault, intellectual property,&#8230;).
</li>
</ul>
<p>With respect to these features, one would want to make a range of queries (using some appropriate query language).</p>
<ul>
<li>In what cases has company X been a defendant?
</li>
<li>In what cases has attorney Y worked for company X, where X was a defendant?
</li>
<li>What are the final decisions for judge Z?
</li>
<li>If the case concerns criminal assault, was a weapon used?
</li>
</ul>
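<p>To make the queries above concrete, here is a minimal sketch in Python, assuming the annotated features have already been exported into simple records.  The record layout (<em>CaseRecord</em>), the field names, and the sample data are hypothetical illustrations; they are not the output format of GATE or of our application.</p>

```python
from dataclasses import dataclass, field


@dataclass
class CaseRecord:
    """Hypothetical record of the features extracted from one case."""
    citation: str
    parties: dict      # party name -> role, e.g. "defendant"
    attorneys: dict    # attorney name -> name of the party represented
    judge: str
    decision: str
    keywords: set = field(default_factory=set)


def cases_with_defendant(cases, party):
    """In what cases has `party` been a defendant?"""
    return [c.citation for c in cases if c.parties.get(party) == "defendant"]


def cases_attorney_for_defendant(cases, attorney, party):
    """In what cases has `attorney` worked for `party` as a defendant?"""
    return [
        c.citation
        for c in cases
        if c.parties.get(party) == "defendant"
        and c.attorneys.get(attorney) == party
    ]


def decisions_of_judge(cases, judge):
    """What are the final decisions for `judge`?"""
    return [(c.citation, c.decision) for c in cases if c.judge == judge]


# Invented sample data, for illustration only.
CASES = [
    CaseRecord("Case A", {"X Corp": "defendant", "Smith": "plaintiff"},
               {"Y": "X Corp"}, "Z", "affirmed", {"criminal assault", "weapon"}),
    CaseRecord("Case B", {"X Corp": "plaintiff", "Jones": "defendant"},
               {"Y": "X Corp"}, "Z", "reversed"),
]

print(cases_with_defendant(CASES, "X Corp"))
print(cases_attorney_for_defendant(CASES, "Y", "X Corp"))
print(decisions_of_judge(CASES, "Z"))
```

<p>The point of the sketch is only that, once the features are annotated and extracted, each query reduces to a simple filter over the records; any query language with equivalent expressive power would do.</p>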
<p>We initially based our work on the <a href="http://www-rohan.sdsu.edu/~bransfor/thesis/thesis/index.html">Bransford-Koons Ph.D. Thesis 2005</a>, commenting on, adapting, and adding to it.  Since the lists and rules are highly specific, we used the same cases from California Criminal Courts as were used in that work.</p>
<p><strong>Output</strong></p>
<p>We have the following sample outputs from our lists and rules applied to <em>People v. Coleman, 117 Cal App. 2d 565</em>.  In the first figure, we find the address, court district, citation, case name, counsels for each side, and the roles.  There are aspects which need to be further cleaned up, but this gives a flavour of the annotations.</p>
<p><img src="http://wyner.info/research/L3S/Graphics/CalCrimAnnot01.png" alt="Case Features I" /></p>
<p>In the second figure, we focus on additional information such as structural sections (e.g. Opinion), the name of the judge, and terms having a bearing on criminal assault and weapons.</p>
<p><img src="http://wyner.info/research/L3S/Graphics/CalCrimAnnot02.png" alt="Case Features II" /></p>
<p>In the final figure, we identify the decision.</p>
<p><img src="http://wyner.info/research/L3S/Graphics/CalCrimAnnot03.png" alt="Case Features III" /></p>
<p><strong>GATE</strong></p>
<p>In the <a href="http://wyner.info/research/L3S/Files/DSACaseInfo.tar.gz">archive</a>, we have the application, lists, JAPE rules, and graphics.  The <em>lists.def</em> file in this archive is associated with the various other lists.  The JAPE rules may have different names from those found in the application and discussed below but, so far as we understand, this should make no difference to the functionality.</p>
<p><em>Lists</em></p>
<p>The following gazetteer lists were used; they are contained in a master list labelled DSAGaz.  We give samples and comment on them below.</p>
<ul>
<li>lists.def.  The gazetteer list which contains the lists below.  When this is imported along with the standard ANNIE list, it is renamed in the application.
</li>
<li>attack_words.lst.  Actions that can be construed as attacks such as <em>hit, hitting, throw, thrown, threw,&#8230;.</em>
</li>
<li>intention.lst.  Terms for intention such as <em>intend, intends, intending,&#8230;, expect, expects,&#8230;.</em>
</li>
<li>judgements.lst.  Terms related to judgement such as <em>granted, denied, reversed, overturned, remanded,&#8230;.</em>
</li>
<li>judgeindicator.lst.  The indicator <em>J.</em>  This indicator is problematic where <em>J.</em> is part of an individual&#8217;s name.
</li>
<li>criminal_assault.lst.  Terms related to assault such as <em>assault, violent injury, ability,&#8230;.</em>  It is unclear just how cohesive this set of terms is.
</li>
<li>legal_appellate_districts.lst.  A list of appellate districts such as <em>Fifth Appellate District, Fifth Dist.,&#8230;.</em>
</li>
<li>legal_casenames.lst.  Terms that can be used to indicate case names such as <em>v., In Re,&#8230;.</em>
</li>
<li>legal_counselnames.lst.  Terms for counselor titles such as <em>Attorney General, Deputy Public Defender,&#8230;.</em>
</li>
<li>legal_general.lst.  Terms for footnotes or numbering sections such as <em>fn., No.,&#8230;.</em>
</li>
<li>legal_opinion_sections.lst.  Terms for sections of legal opinion such as <em>concurring, counsel, dissenting, opinion,&#8230;.</em>
</li>
<li>legal_coa.lst.  Terms for causes of action such as <em>aggravated assault, assault, breaking and entering, burglary, robbery,&#8230;.</em>
</li>
<li>legal_code_citations.lst.  Code citation information such as <em>Civ. Code, Penal Code,&#8230;.</em>
</li>
<li>us_district_abb_01.lst.  Abbreviations for legal districts such as <em>Cal., P., Wis.,&#8230;.</em>
</li>
<li>us_context_abb_01.lst.  Abbreviations for participant roles such as <em>App., Rptr,&#8230;.</em>
</li>
<li>legal_citations.lst.  Abbreviations for citations, related to districts, such as <em>Cal.2d, Cal.App. 3d,&#8230;.</em>
</li>
<li>legal_parties.lst.  Terms for legal roles such as <em>amicus curiae, appellant, appellee, counsel, defendant, plaintiff, victim, witness,&#8230;.</em>
</li>
<li>lower_courts.lst.  Phrases for other courts such as <em>Municipal Court of, Superior Court of,&#8230;.</em>
</li>
<li>possible_weapons.lst.  A list of items that could be weapons such as <em>automobile, bat, belt,&#8230;.</em>
</li>
<li>weapons.lst.  A list of items that are weapons such as <em>assault rifle, axe, club, fist, gun,&#8230;.</em>
</li>
</ul>
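<p>As a rough illustration of what a gazetteer lookup does with such lists, the following Python sketch matches list entries against a text and reports which list each match came from.  It is not how GATE implements its gazetteers; the in-memory <em>LISTS</em> dictionary, the function names, and the sample sentence are assumptions made for the example, using a few of the entries quoted above.</p>

```python
# Hypothetical in-memory stand-ins for two of the .lst files; the real
# lists are longer and are loaded by GATE via lists.def.
LISTS = {
    "weapons": ["assault rifle", "axe", "club", "fist", "gun"],
    "attack_words": ["hit", "hitting", "throw", "thrown", "threw"],
}


def build_index(lists):
    """Map each entry, as a tuple of lower-cased words, to its list name."""
    index = {}
    for list_name, entries in lists.items():
        for entry in entries:
            index[tuple(entry.lower().split())] = list_name
    return index


def lookup(text, index):
    """Return (matched phrase, list name) pairs, preferring longer matches."""
    words = [w.strip(".,;:()") for w in text.lower().split()]
    longest = max(len(key) for key in index)
    found = []
    i = 0
    while i != len(words):
        step = 1
        # Try the longest candidate phrase first, then shorter ones.
        for size in range(longest, 0, -1):
            candidate = tuple(words[i:i + size])
            if len(candidate) == size and candidate in index:
                found.append((" ".join(candidate), index[candidate]))
                step = size
                break
        i += step
    return found


INDEX = build_index(LISTS)
print(lookup("The defendant threw an axe and fired an assault rifle.", INDEX))
```

<p>The sketch lower-cases both the entries and the text, so capitalisation does not matter, but it does nothing about morphological variants or about context: <em>hit</em> is reported as an attack word whether the text concerns an assault or a baseball game.</p>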
<p><em>Discussion of Lists</em></p>
<p>We used some of the lists directly from Bransford-Koons 2005, but they are clearly in need of reconstruction and extension.  A general problem is that the lists are defined for US case law and particularly the California district courts.  Thus, we cannot simply apply the lists to different jurisdictions, e.g. the United Kingdom; the lists and rules must be relativised to different contexts.  More technically, list entries have alternative graphical (capital or lower case) or morphological forms, which would be better addressed using a Flexible Gazetteer.  In addition, it is unclear how one could bound the range of relevant terms appropriately and give them interpretations that are relevant to the context; in general, a lexicon or ontology could give us a better list of terms, but we must find some means to construe them as need be in the legal context.  For example, we have a range of <em>attack action terms</em> such as <em>hit, hitting, throw, thrown, threw,&#8230;</em>; in some contexts, e.g. baseball, these actions need not be construed as attacks.  Some means needs to be found to ascribe the appropriate interpretation in context.  A related issue is whether we must list all alternative forms of some terms (also taking into consideration spaces) or whether we can better write JAPE rules; this is relevant for the list of appellate districts, where we find both abbreviations and alternative elements of information as in <em>Fifth Appellate District</em>, <em>Fifth Appellate District Div 1</em>, and <em>Fifth Appellate District, Division 1</em>.  Along these lines, we would prefer a systematic means to relate abbreviations to the terms they abbreviate.  In our view, general solutions are better than specific ones which list information; lists ought to contain arbitrary information, while JAPE rules construct systematic information.</p>
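<p>To illustrate the contrast between listing every variant and stating a rule, here is a small Python sketch that recognises the appellate district forms mentioned above with one regular expression.  This is an analogue of what a JAPE rule might do, not a rule from our archive; the set of ordinals and the accepted abbreviations are assumptions.</p>

```python
import re

# Assumed ordinals; extend as needed for the courts at hand.
ORDINALS = "First|Second|Third|Fourth|Fifth|Sixth"

DISTRICT = re.compile(
    r"\b(" + ORDINALS + r")\s+"              # group 1: the ordinal
    r"(?:Appellate\s+District|Dist\.)"       # full form or abbreviation
    r"(?:,?\s+Div(?:ision|\.)?\s+(\d+))?"    # group 2: optional division
)


def find_districts(text):
    """Return (ordinal, division or None) for each district mention."""
    return [(m.group(1), m.group(2)) for m in DISTRICT.finditer(text)]


SAMPLE = (
    "Fifth Appellate District; Fifth Appellate District Div 1; "
    "Fifth Appellate District, Division 1; Fifth Dist."
)
print(find_districts(SAMPLE))
```

<p>One pattern covers the four surface forms and returns the same structured information for each, whereas a list would need a separate entry for every combination of ordinal, abbreviation, and division.</p>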
<p><em>JAPE Rules</em></p>
<p>Given the lists, we have JAPE rules to annotate the relevant portions of text.</p>
<ul>
<li>AppellantCounsel:  annotates the appellant counsel.
</li>
<li>RespondentCounsel:  annotates the respondent counsel.
</li>
<li>DSACounsellor:  annotates counsels.
</li>
<li>SectionsTerm:  annotates sections relative to the list of section terms.
</li>
<li>CaseRoleGeneral
</li>
<li>DSACaseName2:  annotates the case name.
</li>
<li>DSACaseName:  annotates the case name.
</li>
<li>DSACaseCit:  annotates the case citation.
</li>
<li>CriminalAssault:  annotates terms for criminal assault.
</li>
<li>CauseOfAction:  annotates causes of action.
</li>
<li>AttackTerm:  annotates attack terms.
</li>
<li>AppellateDistrict:  annotates districts of courts.
</li>
<li>DecisionStatement:  annotates a sentence as the decision statement.
</li>
<li>JudgementTerm:  annotates terms related to judgement.
</li>
<li>JudgeName:  annotates the names of judges.
</li>
<li>JudgeInd:  annotates the judge name indicator.
</li>
<li>IntentTerm:  annotates terms of intent.
</li>
</ul>
<p><em>Discussion</em></p>
<p>Some of these rules annotate sentences, while others annotate entities with respect to some property.  Some of the rules, such as the rule for the roles of counsels, don&#8217;t work quite as well as we would wish and could stand further refinement; the solution we have is rather ad hoc.  Nonetheless, as a first pass, the lists and rules give some indication of what is possible.</p>
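<p>To give a sense of how a rule can build on a list lookup, the following Python sketch finds a judge name by starting from the indicator <em>J.</em> and collecting the capitalised tokens to its left.  It is a hypothetical analogue of the general idea, not a transcription of the JudgeName rule in the archive; the function name and the pre-tokenised input are assumptions.</p>

```python
def judge_names(tokens):
    """Return the capitalised token runs that directly precede 'J.'."""
    names = []
    for i, token in enumerate(tokens):
        if token != "J.":
            continue
        end = i
        # Skip an optional comma between the name and the indicator.
        if end > 0 and tokens[end - 1] == ",":
            end -= 1
        # Walk left over capitalised tokens to collect the name.
        start = end
        while start > 0 and tokens[start - 1][:1].isupper():
            start -= 1
        if start != end:
            names.append(" ".join(tokens[start:end]))
    return names


# Invented token sequences, for illustration only.
print(judge_names(["affirmed", ".", "Griffin", ",", "J.", ",", "concurred", "."]))
print(judge_names(["John", "J.", "Smith", "testified"]))
```

<p>The second call shows the weakness noted for the judge indicator list: where <em>J.</em> is a middle initial, the sketch wrongly reports <em>John</em> as a judge, so a usable rule needs further context.</p>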
<p><em>Order of application</em></p>
<ul>
<li>Document Reset PR
</li>
<li>RegexSentenceSplitter
</li>
<li>ANNIE English Tokeniser
</li>
<li>ANNIE POS Tagger
</li>
<li>MorphologicalAnalyzer
</li>
<li>DSAGaz
</li>
<li>AnnieGaz
</li>
<li>Flexible Gazetteer
</li>
<li>NPChunker
</li>
<li>ANNIE NE Transducer
</li>
<li>IntentTerm
</li>
<li>JudgeInd
</li>
<li>JudgeName
</li>
<li>JudgementTerm
</li>
<li>DecisionStatement
</li>
<li>Weapons
</li>
<li>AppellateDistrict
</li>
<li>AttackTerm
</li>
<li>CauseOfAction
</li>
<li>CriminalAssault
</li>
<li>DSACaseCit
</li>
<li>DSACaseName
</li>
<li>DSACaseName2
</li>
<li>DSACaseNameAZW
</li>
<li>CaseRoleGeneral
</li>
<li>SectionsTerm
</li>
<li>DSACounsellor
</li>
<li>RespondentCounsel
</li>
<li>AppellantCounsel
</li>
</ul>
<p><strong>Discussion</strong></p>
<p>Despite the limitations, this gives some useful, preliminary results which can easily be built upon.  Moreover, we know of no other public, open system of annotating case elements (or factors).</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/uk/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/21/information-extraction-of-legal-case-features-with-lists-and-rules/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Information Extraction with ANNIC</title>
		<link>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/information-extraction-with-annic/</link>
		<comments>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/information-extraction-with-annic/#comments</comments>
		<pubDate>Wed, 20 Jan 2010 17:55:48 +0000</pubDate>
		<dc:creator>Adam Wyner</dc:creator>
				<category><![CDATA[GATE]]></category>
		<category><![CDATA[legal knowledge engineering]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://wyner.info/LanguageLogicLawSoftware/?p=636</guid>
		<description><![CDATA[In Information Extraction of Legal Case Factors, we presented lists and rules for annotation of legal case factors. In this post, we go one step further and use the ANNotations In Context (ANNIC) tool of GATE. This is a plugin which helps to search for annotations, visualise them, and inspect features. It is useful for [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/information-extraction-of-legal-case-factors/">Information Extraction of Legal Case Factors</a>, we presented lists and rules for annotation of legal case factors.  In this post, we go one step further and use the <strong>ANN</strong>otations <strong>I</strong>n <strong>C</strong>ontext (ANNIC) tool of GATE.  This is a plugin which helps to search for annotations, visualise them, and inspect features.  It is useful for JAPE rule development.  We outline how to plug in, load, and run ANNIC.  (See <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/introduction-to-a-series-of-posts-on-legal-information-extraction-with-gate/">introductory notes</a> on this and related posts.)</p>
<p><strong>Introduction to ANNIC</strong></p>
<p>ANNIC is an annotation indexing and retrieval system.  It is integrated with the data stores, where results of annotations on a corpus can be saved.  Once a processing pipeline is run over the corpus, we can use ANNIC to query and inspect the contexts where annotations appear; the queries are in a subset of the JAPE language, so can be complex.  The results of the queries are presented graphically, making them easy to understand.  As such, ANNIC is a very useful tool in the development of rules as one can discover and test patterns in corpora.  There is also an export facility, so the results can be presented in a file, but this is not a full information extraction system such as one might want with templates.</p>
<p>For later, but important to know from the documentation:  &#8220;Be warned that only the annotation sets, types and features initially indexed will be updated when adding/removing documents to the datastore. This means, for example, that if you add a new annotation type in one of the indexed document, it will not appear in the results when searching for it.&#8221;  This implies that where one adds new annotations to the pipeline, one should delete the old data store and create another one with respect to the new results.  For example, if one ran the pipeline without POS, one cannot add POS later and inspect it in the pipeline.</p>
<p>Further details on ANNIC are available at <a href="http://gate.ac.uk/g8/page/print/2/releases/gate-5.0-build3244-ALL/doc/tao/splitch9.html#x11-3290009.29">GATE documentation on ANNIC</a> and there is an <a href="http://">online video</a>.</p>
<p><strong>Instantiating the serial data store</strong></p>
<p>The following steps create the requisite parts and inspect them with ANNIC.  One starts with an empty GATE, then adds processing resources, language resources, and pipelines, since these can all be related to the data store in a later step.  This material is adapted or adopted from the GATE ANNIC documentation, cutting out many of the options.  The steps instantiate a serial data store (SSD), which is how the annotated documents are saved and searched.  The application, lists, and rules that this example uses are from <a href="http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/information-extraction-of-legal-case-factors/">Information Extraction of Legal Case Factors</a>.</p>
<ul>
<li>Right click on Datastores > Create datastore.</li>
<li>From the drop-down list select &#8220;Lucene Based Searchable DataStore&#8221;.</li>
<li>At the input window, provide the following parameters:</li>
<ul>
<li>DataStore URL: Select an empty folder where the data store is created.</li>
<li>Index Location: Select an empty folder. This is where the index will be created.</li>
<li>Annotation Sets: Provide the annotation sets that you wish to include or exclude from being indexed.  There are options here, but we want to index all the annotation sets in all the documents, so make this list empty.</li>
<li>Base-Token Type: These are the basic tokens of any document (e.g. Token) which your documents must have in order to be indexed.</li>
<li>Index Unit Type: This specifies the unit of Index (e.g. Sentence). In other words, only annotations lying within the boundaries of these unit annotations are indexed (e.g. in the case of Sentence, annotations that span the boundary between two sentences are not considered for indexing).  We use the Sentence unit.</li>
<li>Features: Users can specify the annotation types and features that should be included or excluded from being indexed (e.g. exclude SpaceToken, Split, or Person.matches).</li>
</ul>
<li>Click OK. If all parameters are OK, a new empty searchable SSD will be created.</li>
<li>Create an empty corpus and save it to the SSD.</li>
<li>Populate the corpus with some documents.  Each document in the corpus is automatically indexed and saved to the data store.</li>
<li>Load some processing resources and then a pipeline.  Run the pipeline over the corpus.</li>
<li>Once the pipeline has finished (and there are no errors), save the corpus in the SSD by right clicking on the corpus, then &#8220;Save to its datastore&#8221;.</li>
<li>Double click on the SSD file under Datastores.  Click on the &#8220;Lucene DataStore Searcher&#8221; tab to activate the search GUI.</li>
<li>Now you are ready to specify a search query of your annotated documents in the SSD.</li>
</ul>
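<p>The effect of the Index Unit Type parameter can be pictured with a short Python sketch.  It mimics only the behaviour described above, namely that annotations crossing a unit boundary are left out; it is not the indexing code of the Lucene based data store, and the offsets, annotation types, and function names are invented for the example.</p>

```python
def within(annotation, unit):
    """True if the annotation lies wholly inside the unit's offsets."""
    _, start, end = annotation
    unit_start, unit_end = unit
    return start >= unit_start and unit_end >= end


def indexable(annotations, units):
    """Keep only the annotations contained in some index unit."""
    return [a for a in annotations if any(within(a, u) for u in units)]


# Invented character offsets: two Sentence units and three annotations.
SENTENCES = [(0, 20), (21, 45)]
ANNOTATIONS = [
    ("Person", 0, 7),            # inside the first sentence
    ("Organization", 30, 45),    # inside the second sentence
    ("Quote", 15, 30),           # crosses the sentence boundary
]

print(indexable(ANNOTATIONS, SENTENCES))
```

<p>With Sentence as the unit, the annotation that crosses the boundary is dropped, which is why a pattern that spans two sentences cannot be found by a later search.</p>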
<p><strong>Output</strong></p>
<p>The GUI opens with parts as shown in the following two figures:</p>
<p><img src="http://wyner.info/research/L3S/Graphics/ANNIC01.png" alt="ANNIC search for 'Trandes' string" /></p>
<p><img src="http://wyner.info/research/L3S/Graphics/ANNIC02.png" alt="ANNIC search for disclosure concept" /></p>
<p><strong>Working with the GUI</strong></p>
<p>The figures above show three main sections.  In the top section, on the left, there is a blank text area in which one can write a query (more on this below); the search query returns the &#8220;content&#8221; of the annotations.  There are also options to select a corpus, an annotation set, the number of results, and the size of the context (i.e. the number of tokens to the left and right of what one searches for).  In the central section, one can see a visualisation of annotations and values given the search query; an annotation rows manager lets one add (green plus sign) or remove (red minus sign) annotation types and features to display there.  The bottom section contains the results table, i.e. a list of the text spans across the corpus that match the query, with their left and right contexts.  It also contains tabbed panes of statistics, such as how many instances of a particular annotation appear.</p>
<p><em>Queries</em></p>
<p>The queries written in the blank text area are a subset of the JAPE patterns and use the annotations used in the pipeline.  Queries are activated by hitting ENTER (or the Search icon).  The following are some template patterns of JAPE clauses which can be used as SSD queries.</p>
<ul>
<li>String</li>
<li>{AnnotationType}</li>
<li>{AnnotationType == String}</li>
<li>{AnnotationType.feature == feature value}</li>
<li>{AnnotationType1, AnnotationType2.feature == featureValue} </li>
<li>{AnnotationType1.feature == featureValue,<br />
AnnotationType2.feature == featureValue}</li>
</ul>
<p>Specific queries are:</p>
<ul>
<li>Trandes &#8212; returns all occurrences of the string where it appears in the corpus.</li>
<li>{Person} &#8212; returns annotations of type Person.</li>
<li>{Token.string == "Microsoft"} &#8212; returns all occurrences of &#8220;Microsoft&#8221;.</li>
<li>{Person}({Token})*2{Organization} &#8212; returns Person followed by zero to two tokens followed by Organization.</li>
<li>{Token.orth == "upperInitial", Organization} &#8212; returns Token with feature orth set to &#8220;upperInitial&#8221; and which is also annotated as Organization.</li>
<li>{Token.string == "Trandes"}({Token})*10{Secret} &#8212; returns the string &#8220;Trandes&#8221; followed by zero to ten tokens followed by Secret.</li>
<li>{Token.string == "not"}({Token})*4{Secret} &#8212; returns the string &#8220;not&#8221;, followed by zero to four tokens, followed by something annotated with Secret.</li>
</ul>
<p>An example of a result for the last query is:</p>
<p><em>Trandes averred nothing more than that it possessed secret.</em></p>
<p>In ANNIC, the result of the query appears as:</p>
<p><img src="http://wyner.info/research/L3S/Graphics/AnnicSecret02.png" alt="ANNIC search for negation and disclosure concept" /></p>
<p>One can write queries using the JAPE operators:  | (OR operator), +, and *.  ({A})+n means from one up to n occurrences of annotation {A}, and ({A})*n means from zero up to n occurrences of annotation {A}.</p>
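<p>The bounded repetition operators can be pictured with a small Python sketch that searches a token sequence for a first element, a bounded gap of tokens, and a last element.  It illustrates the semantics only and is not how ANNIC evaluates queries; the function name, the token representation, the sample sentence, and the simplification that each annotation covers exactly one token are all assumptions.</p>

```python
def find_matches(tokens, is_first, is_last, max_gap, min_gap=0):
    """Return (start, end) token index pairs for: first, gap, last.

    min_gap=0 corresponds to ({Token})*n and min_gap=1 to ({Token})+n,
    where n is max_gap.  Every admissible span is returned.
    """
    matches = []
    for i, token in enumerate(tokens):
        if not is_first(token):
            continue
        for gap in range(min_gap, max_gap + 1):
            j = i + 1 + gap
            if j >= len(tokens):
                break
            if is_last(tokens[j]):
                matches.append((i, j))
    return matches


# Invented sentence: each token is (string, set of annotation types).
TOKENS = [
    ("The", set()), ("company", set()), ("did", set()), ("not", set()),
    ("disclose", set()), ("the", set()), ("secret", {"Secret"}),
]


def is_not(token):
    return token[0] == "not"


def is_secret(token):
    return "Secret" in token[1]


# Analogue of {Token.string == "not"}({Token})*4{Secret}
print(find_matches(TOKENS, is_not, is_secret, max_gap=4))
# With at most one intervening token there is no match.
print(find_matches(TOKENS, is_not, is_secret, max_gap=1))
```

<p>Here two tokens separate <em>not</em> from the Secret annotation, so the query with a bound of four matches and the query with a bound of one does not.</p>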
<p><strong>Summary</strong></p>
<p>ANNIC is particularly useful in writing and refining one&#8217;s JAPE rules.  Finally, one&#8217;s results can be exported as HTML files.</p>
<p>By Adam Wyner<br />
Distributed under the Creative Commons<br />
<a href="http://creativecommons.org/licenses/by-nc-sa/2.0/">Attribution-Non-Commercial-Share Alike 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://wyner.info/LanguageLogicLawSoftware/index.php/2010/01/20/information-extraction-with-annic/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

