On the Way to Semantic Legal Knowledge Systems Erich Schweighofer http://rechtsinformatik.univie.ac.at NII Shonan Meeting Seminar 057 Towards Explanation Production Combining Natural Language Processing and Logical Reasoning Shonan-EXPCOLL2014 - November 26-30, 2014
On the Way to Semantic Legal Knowledge Systems
Erich Schweighofer
http://rechtsinformatik.univie.ac.at
NII Shonan Meeting Seminar 057 Towards Explanation Production Combining Natural Language Processing and Logical Reasoning
Shonan-EXPCOLL2014 - November 26-30, 2014
◦ 6 views of a legal information system
What lawyers need?
Semantic legal knowledge system
Some theory
Dynamic Electronic Legal Commentary
◦ Main tools
Conclusions
Legal knowledge challenge (1)
Knowledge is the main production factor for law
◦ Model of a legal system
◦ Huge (gigabytes (GB), millions of documents, about 100,000 rules, about 300,000 words, more than 10,000 legal concepts)
◦ All or nothing … every document may be relevant (no “toy system”)
◦ Highly relevant networks of documents
◦ Dynamic (daily changes!) – real time information system
◦ Complex (many document types, advanced structure, legal processes)
Problem: How to master the body of knowledge of the legal order?
◦ Media: papyrus, paper, hard disk, DVD, memory disk, etc.
◦ Representation: rolls, books, journals, DVDs, online services,
Data, information and knowledge
Concepts of data, information and knowledge are vaguely defined; different definitions exist
Data: syntactic representation; collection of numbers, characters and images in a (ICT) digital (binary) character set; everything that is not computer code
◦ Law: prints of books and journals of a library, source code of documents in a legal retrieval system or of web documents
Information: syntactic representation with semantic meaning, message, output, (sensory) input
◦ Law: laws, judgements, regulations, directives, decisions, facts, advisory opinions, etc. as structured documents in a printed or electronic text corpus
Knowledge: what is known; expertise & skills, either as an abstraction of all available knowledge or a personal capacity acquired through experience or education
◦ Law: team of highly qualified lawyers, e.g. high courts, law faculties, law firms, etc.; in the future: legal knowledge systems
EU Law-making and law-implementing process
www.laquadrature.net
Each step in the process creates particular documents.
Information retrieval (since 1958!)
◦ Text corpus
◦ Index (dictionary) of all words (without stop words)
◦ Boolean search with proximity operations
◦ Information need has to be represented as a Boolean query
◦ Good query: vocabulary & meta knowledge
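The classic retrieval set-up listed above (word index without stop words, Boolean search, proximity operations) can be sketched in a few lines. This is only a minimal illustration: the stop-word list, the three sample sentences and the function names `build_index`, `docs_with` and `near` are assumptions made for this example, not part of any particular legal retrieval system.

```python
import re
from collections import defaultdict

# Illustrative stop-word list (an assumption; real systems use longer lists).
STOP_WORDS = {"the", "of", "a", "an", "and", "or", "in", "to", "is", "are", "on"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(corpus):
    """Map every non-stop word to {doc_id: [token positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in corpus.items():
        # Positions are counted before stop words are dropped,
        # so proximity distances include stop words.
        for pos, word in enumerate(tokenize(text)):
            if word not in STOP_WORDS:
                index[word][doc_id].append(pos)
    return index


def docs_with(index, word):
    """Set of documents containing the word (building block for AND/OR/NOT)."""
    return set(index.get(word, {}))


def near(index, w1, w2, distance):
    """Documents in which w1 and w2 occur at most `distance` tokens apart."""
    hits = set()
    for doc_id in docs_with(index, w1) & docs_with(index, w2):
        p1, p2 = index[w1][doc_id], index[w2][doc_id]
        if any(abs(a - b) <= distance for a in p1 for b in p2):
            hits.add(doc_id)
    return hits


# Hypothetical mini corpus.
corpus = {
    "d1": "Measures having equivalent effect to quantitative restrictions on imports are prohibited",
    "d2": "The customs union covers all trade in goods",
    "d3": "Quantitative restrictions on exports and measures of equivalent effect",
}
index = build_index(corpus)

# Boolean AND is a set intersection, OR a union, NOT a difference.
both = docs_with(index, "measures") & docs_with(index, "equivalent")
close = near(index, "restrictions", "imports", 2)
```

The user still has to know the right vocabulary ("measures", "equivalent") to phrase a good query, which is the point of the last bullet.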
Legal Open Data
◦ Official Gazettes
◦ Public legal information systems (e.g. EUR-Lex)
◦ Legal Information Institutes (e.g. AustLII)
◦ High standard
◦ XML: Akoma Ntoso, Legal XML
Erich Schweighofer (2014) 8
What lawyers have … legal text retrieval (2)
Advanced information retrieval
◦ Vector Space Model (Smith, Schweighofer/Winiwarter etc.)
◦ Connectionist IR (Belew/Rose, Merkl/Schweighofer etc.)
◦ Probabilistic IR (Inference Networks) (Croft/Turtle etc.)
E-Discovery
◦ Extraction of relevant information from electronic text corpora (electronically stored information or ESI)
◦ Pre-trial discovery (USA)
◦ Analysis of unstructured data
◦ Electronic Discovery Reference Model (EDRM)
  document categorisation, predictive coding etc.
Conrad, E-Discovery revisited: the need for artificial intelligence beyond information retrieval, AI & Law (2010) 18:321–345
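To make the Vector Space Model mentioned above concrete, here is a minimal sketch using TF-IDF weights and cosine similarity. It is a textbook variant, not the specific models of the cited authors; the three one-line "documents", the smoothed IDF formula and the function names are assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return ({doc_id: {word: weight}}, idf) for a dict of texts."""
    tokenized = {d: tokenize(t) for d, t in docs.items()}
    n = len(docs)
    # Document frequency: in how many documents does each word occur?
    df = Counter(w for toks in tokenized.values() for w in set(toks))
    # Smoothed inverse document frequency (one common variant).
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    vectors = {
        d: {w: count * idf[w] for w, count in Counter(toks).items()}
        for d, toks in tokenized.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(w, 0.0) for w, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Rank documents by cosine similarity to the query, best first."""
    vectors, idf = tfidf_vectors(docs)
    q = {w: c * idf[w] for w, c in Counter(tokenize(query)).items() if w in idf}
    scores = {d: cosine(q, v) for d, v in vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical one-line summaries standing in for full documents.
docs = {
    "keck": "selling arrangements do not hinder trade between member states",
    "cassis": "minimum alcohol content for liqueur hinders trade in goods",
    "tax": "income tax rates for residents",
}
ranking = rank("alcohol content liqueur", docs)
```

Unlike a Boolean query, the result is an ordered list: every document gets a score, and documents sharing no term with the query score zero.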
„Google“ vs. legal search
„Google“:
Best information taken from the web
Method: information retrieval + ranking
Some redundancy
Recall
◦ Should be only sufficient; original information desired but not required
Easy vocabulary
◦ All (most) terms exist
Legal search:
Exact references to relevant norms, court decisions or literature
Method: Boolean search (proximity operators) information retrieval
No or uncontrollable redundancy
Recall
◦ Should be 100%; original information required
Difficult vocabulary
◦ Only legal concepts
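The recall requirement that separates the two kinds of search can be stated precisely. The sketch below computes precision and recall for two result lists; the lists and the set of relevant sources are invented purely for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a result list against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical set of sources a lawyer must find for one question.
relevant = {"Article 34 TFEU", "Keck", "Cassis de Dijon", "Dassonville"}

# Web-style result: useful, but half of the relevant sources are missing.
web_result = ["Cassis de Dijon", "blog post", "Article 34 TFEU"]

# Legal-search target: every relevant source is found, extra hits are tolerated.
legal_result = ["Article 34 TFEU", "Keck", "Cassis de Dijon", "Dassonville", "commentary"]

web_precision, web_recall = precision_recall(web_result, relevant)
legal_precision, legal_recall = precision_recall(legal_result, relevant)
```

With these made-up lists the web-style result reaches a recall of 0.5, which may be "sufficient" for a lay user, while the legal result reaches the required recall of 1.0.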
Status: text-corpus based approach
Text-corpus
◦ Task of LIIs (Legal Information Institutes) or publishers or official legal information providers to deliver a comprehensive legal text corpus (multimedia corpus)
◦ Identification and storage of all legal sources
◦ Bibliographic data
◦ Hybrid knowledge model (Schweighofer 1999)
Present: text of a legal commentary/legal handbook (mostly print, now also electronically available)
◦ Intellectual product of experienced legal writers
◦ Not updated regularly
Why not link these semantic representation techniques to text corpora and use knowledge acquisition techniques?
Idea of a Dynamic Electronic Legal Commentary
◦ Schweighofer (Festschrift Seipel 2006, AI & Law 2007)
Semantic legal knowledge system (1)
Machine has to do more …
◦ There are too many rules, statutes, court decisions, administrative decisions, literature texts, grey materials, soft information pieces …
◦ Retrieval is too difficult in a time of semantic retrieval by Google (too much training required, impossible trade-off of legal retrieval)
◦ Finding the document or document part within millions of documents: ranking problem
◦ Clients no longer accept that it is so difficult to know everything in the law; they also do legal search … with some results
Semantic legal knowledge system (2)
New co-operation model
Support
◦ Semantic representation
◦ Meta data
◦ Semi-automated tools of text analysis
Use of excellence of lawyers
◦ Determining relevant parts of a legal decision even if it changes over time or depends on a particular jurisdiction or court
◦ Respect and challenge of views of authorities (Haft)
Pragmatic approach of legal knowledge representation (1)
Legal text corpora & file archives
Textual structure
◦ Facts, rules and arguments
Cases
◦ Easy cases (standard cases, eligible for automation), hard cases (fight for the best legal solution, legal argumentation skills required), curious cases (legal theory)
Evidence
◦ Easy evidence, hard evidence, automatically
Some order with logic
◦ John F. Sowa, Knowledge Representation (2000), p. XII
Pragmatic approach of legal knowledge representation (2)
“Without logic, a knowledge representation is vague, with no criteria for determination whether statements are redundant or contradictory. Without ontology, the terms and symbols are ill-defined, confused, and confusing. And without computable models, the logic and ontology cannot be implemented in computer programs. Knowledge representation is the application of logic and ontology to the task of constructing computable models for some domain.”
Relations – a better logic model required
Hybrid model
◦ Being helpful in a man/machine co-operation using knowledge-based techniques
◦ Erich Schweighofer, Legal Knowledge Representation (1999)
Pragmatic approach of legal knowledge representation (3)
◦ “Knowledge representation in law is the challenge of how knowledge and information on legal norms, judgements and literature can be represented and how relevant information can be gained for concrete case solutions. This question is at this time above all pursued as special discipline of legal informatics where naturally the emphasis is on automated forms.”
◦ Multimedia representation of knowledge pieces
  Facts: text, all kinds of things, pictures, videos, intelligent forms, big data (electronic discovery)
  Rules: text, graphics, visualisations, computer
Constant improvement is an important goal of interpretation and of the dynamic development of the legal system
◦ Wilburg's “flexible system” („bewegliches System“) (Bydlinski et al.)
  Interaction of organic co-operative forces in law
◦ Human rights
  Proportionality between the goal of an action and its intrusion into other rights
  Fair and just procedure
◦ More use of legal logic and legal ontologies required but so far neglected or ignored
◦ Language use highly important as representation of thoughts of authorities
  Not many rules but established practice (like English language)
Dynamic Electronic Legal Commentary (1)
Abstract representation of law in a conceptual & logical-systematic structure; like printed commentary but in a machine-useable format
Legal information system
Conceptual structure
◦ Description of the world ([possible] facts)
◦ Description of the law ([possible] rules)
The core: links between possible facts (situations) and legal consequences
Strong use of knowledge acquisition techniques to ensure a daily update
◦ Long research practice in legal informatics
  Smith, Schweighofer/Winiwarter/Merkl/Dietenbach, Moens, Daniels, Brünninghaus, Wyner, Quaresma etc.
Dynamic Electronic Legal Commentary (2)
Challenge
◦ World ontologies still have some way to go to improve sufficiently; legal formalisation has to move from small environments to the real big world
Next step
◦ Tools like a navigator [time and document types, layers of the legal order, consolidated texts] (e.g. PreLex), citator or terminologist; e.g. a semantic representation of the 6 views
Near future
◦ Some automated support for legal subsumption, e.g. helping in the real game of applying legal provisions (could that also be called legal reasoning or a legal expert system?)
Tools of a Dynamic Electronic Legal Commentary
• Classification: document categorisation
• Thesaurus: semi-automatic generation of thesaurus descriptors (e.g. work of Madori Ikeda and Akihito Yamamoto)
• Citations: automatic generation of hypertext links
• Temporal relations: automatic generation of temporal relations
• Ranking: document vs. search request, document in the text corpus, document in the citations network, document in the time line
◦ Use of textual entailment (e.g. work of Bernardo Magnini, Yosuke Mayao) or Open Information Extraction (e.g. work of Ido Dragan)
• Text summarisation: semi-automatic generation of summaries of documents
• Multilingualism: automatic translation of documents (e.g. Google Translate)
• Free text search like in Westlaw, LexisNexis or in the work of Yu Asano
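As an illustration of the "Citations" tool, automatic link generation usually starts with pattern-based recognition of references. The sketch below is an assumption-laden example, not the commentary's actual tool: the two regular expressions only cover citation shapes of the kind used in this talk (such as "C-267/91" or "Article 34 TFEU"), and the link target format `kind:identifier` is hypothetical.

```python
import re

# Case numbers such as "C-267/91", "120/78" or "8/74" (illustrative pattern).
CASE_RE = re.compile(r"\b(?:[CT]-)?\d{1,4}/\d{2}\b")
# Treaty articles such as "Article 34 TFEU" (illustrative list of treaty names).
ARTICLE_RE = re.compile(r"\bArticle\s+\d+\s+(?:TFEU|TEU|EC|ECT|EECT)\b")


def extract_citations(text):
    """Return (kind, citation) pairs in the order they appear in the text."""
    found = [(m.start(), "case", m.group()) for m in CASE_RE.finditer(text)]
    found += [
        (m.start(), "article", re.sub(r"\s+", " ", m.group()))
        for m in ARTICLE_RE.finditer(text)
    ]
    return [(kind, cite) for _, kind, cite in sorted(found)]


def to_links(text):
    """Map each citation to a hypothetical link target 'kind:identifier'."""
    return {
        cite: f"{kind}:{cite.replace(' ', '_')}"
        for kind, cite in extract_citations(text)
    }


sample = (
    "Source: Article 34 TFEU, cases C-267/91 Keck and Mithouard, "
    "120/78 Cassis de Dijon, 8/74 Dassonville"
)
citations = extract_citations(sample)
links = to_links(sample)
```

Real legal corpora need many more patterns (national courts, official journals, abbreviations), which is why such tools are described as semi-automatic.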
Some formalisation (1)
Legal concept:
◦ Header: Measures of equivalent effect (L)
◦ Definition: Discriminatory and non-discriminatory rules of Member States hindering trade between Member States are illegal.
◦ Source: Article 34 TFEU, cases C-267/91 Keck and Mithouard, 120/78 Cassis de Dijon, 8/74 Dassonville
◦ Relations: BT customs, measures of equivalent effect (A), freedom of goods (A)
◦ Classification: 02.40
◦ Legal conceptual structure: customs union, freedom of goods
◦ Other information: none
Fact concept:
◦ Header: Liqueur in Germany (F)
◦ Definition: The minimum amount of alcohol which should exist in liqueurs was 25% (up to 1978).
◦ Relations: Measures having equivalent effect
Some formalisation (2)
◦ Source: DE Branntweinmonopolgesetz (German liquor monopoly act)
◦ Classification: 02.40
◦ Legal conceptual structure: customs
◦ Links: Measures having equivalent effect
◦ Other information: none
Anchor (link):
◦ Header: Measures having equivalent effect (A)
◦ Links: Liqueur in Germany (F), selling arrangements (F), Edam cheese in France (F), vinegar in Italy (F), beer in Germany (F), resale at a loss (F), advertising restrictions (F), distribution restrictions (F), measures having equivalent effect (L), Article 34 TFEU, Article 28 EC, Article 30 ECT, Article 30 EECT
◦ etc.
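The legal concept, fact concept and anchor shown above map naturally onto a small data structure. The following sketch is one possible encoding, assuming Python dataclasses; the class and function names are hypothetical, only a subset of the slide's fields and links is included, and the header wording is normalised to "Measures having equivalent effect" so that the keys match.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A legal concept (kind 'L') or a fact concept (kind 'F')."""
    header: str
    kind: str
    definition: str = ""
    source: str = ""
    classification: str = ""
    links: list = field(default_factory=list)  # headers of anchors


@dataclass
class Anchor:
    """An anchor (kind 'A') connecting fact concepts, legal concepts and norms."""
    header: str
    links: list = field(default_factory=list)


def legal_concepts_for(fact_header, concepts, anchors):
    """Follow fact -> anchor -> legal concept and return the legal headers."""
    result = []
    for anchor_name in concepts[fact_header].links:
        for target in anchors[anchor_name].links:
            concept = concepts.get(target)  # norms such as articles are skipped
            if concept is not None and concept.kind == "L":
                result.append(concept.header)
    return result


concepts = {
    "Measures having equivalent effect (L)": Concept(
        header="Measures having equivalent effect (L)",
        kind="L",
        definition="Discriminatory and non-discriminatory rules of Member "
                   "States hindering trade between Member States are illegal.",
        source="Article 34 TFEU",
        classification="02.40",
    ),
    "Liqueur in Germany (F)": Concept(
        header="Liqueur in Germany (F)",
        kind="F",
        definition="The minimum amount of alcohol which should exist in "
                   "liqueurs was 25% (up to 1978).",
        classification="02.40",
        links=["Measures having equivalent effect (A)"],
    ),
}
anchors = {
    "Measures having equivalent effect (A)": Anchor(
        header="Measures having equivalent effect (A)",
        links=[
            "Liqueur in Germany (F)",
            "Measures having equivalent effect (L)",
            "Article 34 TFEU",
        ],
    ),
}

consequences = legal_concepts_for("Liqueur in Germany (F)", concepts, anchors)
```

The anchor is what carries the "core" of the commentary: starting from a factual situation, the structure leads to the legal concept and its sources.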
A lot of work to be done
Powerful legal thesaurus (e.g. Switzerland)
Better knowledge model with more logic
Better extraction rules
◦ Probabilistic retrieval techniques not sufficient
◦ Textual entailment
◦ Open Information Extraction
◦ More NLP
Legal authors writing in semantic structure (e.g. better semantic representations that can be updated semi-automatically)
Conclusions
Example of Big Data research
Move to semantic knowledge systems requires more logic of text analysis and of knowledge representation
Knowledge model
Knowledge acquisition tool linking text corpora and knowledge model
Result: some sort of a Dynamic Electronic Legal Commentary
More research necessary to have a better data basis of unsolved practical problems
Stronger co-operation between logicians,
JURIX2014, The 27th International Conference on Legal Knowledge and Information Systems, 10-12 December 2014, Jagiellonian University, Kraków, PL
IRIS International Conference on Legal Informatics, 26-28 February 2015, Salzburg, AT
ICAIL 2015, The 15th International Conference on Artificial Intelligence and Law (ICAIL 2015), University of San Diego School of Law from Monday, June 8 to Friday, June 12, 2015, USA