Lecture Notes in Artificial Intelligence 10596
Subseries of Lecture Notes in Computer Science
LNAI Series Editors
Randy Goebel, University of Alberta, Edmonton, Canada
Yuzuru Tanaka, Hokkaido University, Sapporo, Japan
Wolfgang Wahlster, DFKI and Saarland University, Saarbrücken, Germany
LNAI Founding Series Editor
Joerg Siekmann, DFKI and Saarland University, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/1244
This Springer imprint is published by Springer Nature.
The registered company is Springer International Publishing AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Computational and Corpus-Based Phraseology: Recent Advances and Interdisciplinary Approaches
As the late and inspiring John Sinclair (1991, 2007) observed, knowledge of vocabulary and grammar is not sufficient for someone to express himself/herself idiomatically or naturally in a specific language. One has to have the knowledge and skill to produce effective and naturally phrased utterances, which are often based on phraseological units (the idiom principle). This is in contrast to the traditional assumption or open choice principle that lies at the heart of generative approaches to language. As Pawley and Syder (1983) stated more than three decades ago, the traditional approach cannot account for nativelike selection (idiomaticity) or fluency.
Language is indeed phraseological, and phraseology is the discipline that studies phraseological units (PUs) or their related concepts, referred to (and regarded as largely synonymous) by scholars as multiword units, multiword expressions (MWEs), fixed expressions, set expressions, phraseological units, formulaic language, phrasemes, idiomatic expressions, idioms, collocations, and/or polylexical expressions. PUs, or MWEs, are ubiquitous and pervasive in language. They are a fundamental linguistic concept that is central to a wide range of natural language processing and applied linguistics applications, including, but not limited to, phraseology, terminology, translation, language learning, teaching and assessment, and lexicography. Jackendoff (1977) observes that the number of MWEs in a speaker’s lexicon is of the same order of magnitude as the number of single words. Biber et al. (1999) argue that they constitute up to 45% of spoken English and up to 21% of academic prose in English. Sag et al. (2002) state that they are overwhelmingly present in terminology and that 41% of the entries in WordNet 1.7 are reported to be MWEs.
PUs play a crucial role not only in the computational treatment of natural languages. Terms are often MWEs (and not single words), which makes them highly relevant to terminology. Translation and interpreting are two other fields where phraseology plays an important role, as finding correct translation equivalents of PUs is a pivotal step in the translation process. Given their pervasive nature, PUs are absolutely central to the work carried out by lexicographers, who analyse and describe both single words and PUs. Last but not least, PUs are vital not only for language learning, teaching, and assessment, but also for more theoretical linguistic areas such as pragmatics, cognitive linguistics, and construction grammars. All the aforementioned areas are today aided by (and often driven by) corpora, which makes PUs particularly relevant for corpus linguists. Finally, PUs provide an excellent basis for inter- and multidisciplinary studies, fostering fruitful collaborations between researchers across different disciplines, which are, for the time being, unfortunately still largely unexplored.
This volume features a selection of papers written by the invited speakers as well as regular papers presented at the international conference “Computational and Corpus-Based Phraseology: Recent Advances and Interdisciplinary Approaches” (Europhras 2017). The conference, which is organised jointly by the European Association of Phraseology (Europhras) and the Research Institute in Information and Language Processing of the University of Wolverhampton, and sponsored by Europhras, the Sketch Engine, ELRA and the University of Wolverhampton, provides the perfect opportunity for researchers to present their work, fostering interaction and collaboration between scholars working in disciplines as diverse as natural language processing, translation, terminology, lexicography, language learning, teaching and assessment, and cognitive science, to name only a few. I organised the volume thematically into the following sections, which demonstrate the breadth of the topics represented at Europhras 2017: (1) Keynote and Invited Papers, (2) Phraseology in Translation and Contrastive Studies, (3) Lexicography and Terminography, (4) Exploitation of Corpora in Phraseological Studies, (5) Development of Corpora for Phraseological Studies, (6) Phraseology and Language Learning, (7) Cognitive and Cultural Aspects of Phraseology, (8) Theoretical and Descriptive Approaches to Phraseology, and (9) Computational Approaches to Phraseology. In fact, the variety of topics at Europhras 2017 is even more remarkable if we take into account other conference presentations that are not included in this volume – in addition to the regular papers, the conference also featured short papers and posters, which are published separately as e-proceedings with ISBN and DOI numbers assigned to every contribution.
Every submission to the conference was evaluated by three reviewers – i.e., members of the Programme Committee consisting of 46 scholars from 23 different countries, or 12 additional reviewers from eight countries, who were recommended by the Programme Committee. The conference contributions were authored by a total of 91 scholars from 24 different countries. These figures attest to the truly international dimension of Europhras 2017.
I would like to thank everyone who made this truly interdisciplinary and international event possible. I would like to start by thanking all colleagues who submitted papers to Europhras 2017 and travelled to London to attend the event. I am grateful to all members of the Programme Committee and the additional reviewers for carefully examining all submissions and providing substantial feedback on all papers, helping the authors of accepted papers to improve and polish the final versions of their papers. A special thanks goes to the invited speakers – both the keynote speakers of the main conference (Ken Church, Gloria Corpas, Dmitrij Dobrovol’skij, Patrick Hanks, Miloš Jakubíček) and the invited speakers of the two accompanying workshops (Carlos Ramisch and Jean-Pierre Colson). Words of gratitude go to our sponsors – Europhras, the Sketch Engine, ELRA, and the University of Wolverhampton.
Last but not least, I would like to use this paragraph to acknowledge the members of the Organising Committee, who worked very hard during the last 12 months and whose dedication and efforts made the organisation of this event possible. I would like to highlight (in alphabetical order) the following colleagues, who competently carried out numerous organisational tasks and were ready to step in and support the organisation of the conference whenever needed. My big thank you goes out to Amanda Bloore, Martina Cotella, Arianna Fabbri, April Harper, Sara Moze, Nikolai Nikolov, Ivelina Nikolova, Rocío Sánchez González, Andrea Silvestre Baquero, Shiva Taslimipoor, and Victoria Yaneva.
November 2017 Ruslan Mitkov
Organisation
Europhras 2017 was jointly organised by the European Association for Phraseology (EUROPHRAS), the University of Wolverhampton (Research Institute of Information and Language Processing), and the Association for Computational Linguistics, Bulgaria.
Programme Committee
Julio Bernal, Caro and Cuervo Institute, Colombia
Douglas Biber, Northern Arizona University, USA
Nicoletta Calzolari, Institute for Computational Linguistics, Italy
María Luisa Carrió-Pastor, Polytechnic University of Valencia, Spain
Sheila Castilho, Dublin City University, Ireland
Kenneth Church, IBM Research, USA
Jean-Pierre Colson, Université catholique de Louvain, Belgium
Gloria Corpas, University of Malaga, Spain
František Čermák, Charles University in Prague, Czech Republic
Anna Čermáková, Charles University, Czech Republic
Dmitrij Dobrovol’skij, Russian Academy of Sciences, Russian Language Institute, Russia
Jesse Egbert, Northern Arizona University, USA
Thierry Fontenelle, Translation Centre for the Bodies of the European Union, Luxembourg
Kleanthes K. Grohmann, University of Cyprus, Cyprus
Patrick Hanks, University of Wolverhampton, UK
Ulrich Heid, University of Hildesheim, Germany
Miloš Jakubíček, Lexical Computing and Masaryk University, Czech Republic
Kyo Kageura, University of Tokyo, Japan
Valia Kordoni, Humboldt University of Berlin, Germany
Simon Krek, University of Ljubljana, Slovenia
Pedro Mogorrón Huerta, University of Alicante, Spain
Johanna Monti, Naples Eastern University, Italy
Sara Moze, University of Wolverhampton, UK
Preslav Nakov, Qatar Computing Research Institute, HBKU, Qatar
Michael Oakes, University of Wolverhampton, UK
Marija Omazić, University of Osijek, Croatia
Petya Osenova, Sofia University, Bulgaria
Magali Paquot, Université catholique de Louvain, Belgium
Giovanni Parodi Sweis, Pontifical Catholic University of Valparaíso, Chile
Alain Polguère, University of Lorraine, France
Carlos Ramisch, Marseille Laboratory of Fundamental Computer Science, France
Ute Römer, Georgia State University, USA
Agata Savary, François Rabelais University, France
Barbara Schlücker, The University of Bonn, Germany
Violeta Seretan, University of Geneva, Switzerland
Kathrin Steyer, Institute of German Language, Germany
Yukio Tono, Tokyo University of Foreign Studies, Japan
Cornelia Tschichold, Swansea University, UK
Benjamin Tsou, City University of Hong Kong, SAR China
Agnès Tutin, University of Grenoble, France
Aline Villavicencio, Federal University of Rio Grande do Sul, Brazil
Eveline Wandl-Vogt, Austrian Academy of Sciences, Austria
Tom Wasow, Stanford University, USA
Eric Wehrli, University of Geneva, Switzerland
Stefanie Wulff, University of Florida, USA
Michael Zock, Marseille Laboratory of Fundamental Computer Science, France
Additional Reviewers
Verginica Barbu Mititelu, Romanian Academy, Research Institute for AI, Romania
Archna Bhatia, Language Technologies Institute, CMU, USA
Ismail El Maarouf, Adarga Limited, Oxford University Press, UK
Voula Giouli, Institute for Language and Speech Processing, Athena RIC, Greece
Václava Kettnerová, Charles University, Czech Republic
Rogelio Nazar, Pontifical Catholic University of Valparaíso, Chile
Irene Renau, Pontifical Catholic University of Valparaíso, Chile
Ioannis Saridakis, University of Athens, Greece
Inguna Skadina, University of Latvia, Latvia
Shiva Taslimipoor, University of Wolverhampton, UK
Veronika Vincze, Hungarian Academy of Sciences, Hungary
Victoria Yaneva, University of Wolverhampton, UK
Keynote Speakers Main Conference
Kenneth Church, Johns Hopkins University, USA
Gloria Corpas, University of Malaga, Spain
Dmitrij Dobrovol’skij, Russian Academy of Sciences, Russian Language Institute, Russia
Patrick Hanks, University of Wolverhampton, UK
Miloš Jakubíček, Lexical Computing and Masaryk University, Czech Republic
Invited Speakers of Europhras 2017 Workshops
Jean-Pierre Colson, Université catholique de Louvain, Belgium
Carlos Ramisch, Marseille Laboratory of Fundamental Computer Science, France
Organising Committee
Amanda Bloore, University of Wolverhampton, UK
Martina Cotella, University of Genoa, Italy
Arianna Fabbri, University of Genoa, Italy
April Harper, University of Wolverhampton, UK
Sara Moze, University of Wolverhampton, UK
Rocío Sánchez González, University of Malaga, Spain
Andrea Silvestre Baquero, Polytechnic University of Valencia, Spain
Shiva Taslimipoor, University of Wolverhampton, UK
Victoria Yaneva, University of Wolverhampton, UK
Conference Chair
Ruslan Mitkov, University of Wolverhampton, UK
Sponsors
EUROPHRAS
Sketch Engine
University of Wolverhampton
ELRA
Contents
Keynote and Invited Talks
Corpus Methods in a Digitized World
Kenneth Ward Church
Phrasal Settings in Which the Definite and Indefinite Articles Appear to Be Interchangeable in English: An Exploratory Study
Using Parallel Corpora to Study the Translation of Legal System-Bound Terms: The Case of Names of English and Spanish Courts