Bottom-up dialectometry using the GeoLing package

Simon Pickl, Aaron Spettl, Simon Pröll, Stephan Elspaß, Werner König, Volker Schmidt

Methods in Dialectology XV, Groningen, Friday, August 15, 2014




Page 2

A statistical software package for geolinguistic data

• developed in cooperation by statisticians (Ulm University) and dialectologists (Universities of Augsburg and Salzburg)

• funded by the Deutsche Forschungsgemeinschaft (DFG)

• multi-platform (written in Java)

• open source (GPLv3)

• tried and tested with data from the Sprachatlas von Bayerisch-Schwaben (SBS) and other geolinguistic corpora

• www.geoling.net

Page 3

A tool for bottom-up dialectometry (cf. Pickl/Rumpf 2012)

With GeoLing, you can

• produce probabilistic area-class maps of linguistic variables using intensity estimation

• find groups of maps that are spatially similar using cluster analysis

• identify and plot recurring spatial patterns using factor analysis

Page 4

Outline

• What can you do with GeoLing? → Simon Pickl

  • intensity estimation

  • factor analysis

• How do you use GeoLing? → Aaron Spettl

  • installing GeoLing

  • performing analyses

  • importing your data


Page 6

Testbed: Sprachatlas von Bayerisch-Schwaben (SBS)

• compiled 1984‒2009 at the University of Augsburg under the direction of Werner König

• approximately 2,700 maps in 14 volumes

• 272 sites

• for each map 0–3 records per site

[Figures: photo by Stefan Puchner; map labels: Germany, Bavaria]

Page 7

Intensity estimation

Cf. Rumpf/Pickl/Elspaß/König/Schmidt 2009; Pickl/Rumpf 2011; 2012

• Method for estimating the probabilistic distribution underlying the records

• Motivation: Individual records are not necessarily representative

• Records are treated as statistical samples from an underlying distribution

• Intensity estimation uses the geographical or linguistic relatedness between sites to infer local probabilities
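The idea on this slide can be pictured as a kernel-weighted relative frequency: every site borrows evidence from related sites, with closer sites counting more. The following Python sketch is not GeoLing's implementation (GeoLing is written in Java). The Gaussian kernel, the fixed bandwidth, the even split of a site's weight across its records, and all function and variable names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def gaussian_kernel(distance, bandwidth):
    """Weight that decays smoothly with distance (assumed kernel shape)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def estimate_intensities(records, coords, bandwidth):
    """Estimate local variant probabilities for every site.

    records: dict site -> list of recorded variants (may be empty)
    coords:  dict site -> (x, y) position
    Returns: dict site -> dict variant -> estimated probability
    """
    result = {}
    for target, (tx, ty) in coords.items():
        weighted = defaultdict(float)
        for site, variants in records.items():
            if not variants:
                continue  # sites without records contribute no evidence
            sx, sy = coords[site]
            weight = gaussian_kernel(math.hypot(tx - sx, ty - sy), bandwidth)
            # Assumption: a site's weight is split evenly across its records.
            for variant, count in Counter(variants).items():
                weighted[variant] += weight * count / len(variants)
        total = sum(weighted.values())
        result[target] = (
            {variant: value / total for variant, value in weighted.items()}
            if total
            else {}
        )
    return result


# Hypothetical toy data: three sites on a line, one of them without a record.
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.1, 0.0)}
records = {"A": ["x"], "B": ["y"], "C": []}
for site, probabilities in estimate_intensities(records, coords, 1.0).items():
    print(site, probabilities)
```

In this toy example, site C has no record of its own but still receives an estimate, dominated by the variant of its close neighbour A. A larger bandwidth would smooth the estimates further.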

Page 8

Intensity estimation

Cf. Rumpf/Pickl/Elspaß/König/Schmidt 2009; Pickl/Rumpf 2011; 2012

[Figure: words for ‘woodlouse’; panels: intensity estimation, continuous intensity estimation]

Page 9

Linguistic distances in intensity estimation

Cf. Pickl/Spettl/Pröll/Elspaß/König/Schmidt 2014

[Figure: words for ‘woodlouse’; panels: intensity estimation based on geographical distances, intensity estimation based on linguistic (in this case: lexical) distances]

Page 10

Linguistic distances in intensity estimation

Cf. Pickl/Spettl/Pröll/Elspaß/König/Schmidt 2014

• Intensity estimation with linguistic distances:

  • less “smooth” isoglosses and areas

  • more detail

  • preservation of language islands (e.g. towns) and dialect borders

  • continuous plot not possible with linguistic distances
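To make the notion of a linguistic distance between two sites concrete, here is a minimal Python sketch. The slides do not define the distance measure used in Pickl/Spettl/Pröll/Elspaß/König/Schmidt 2014, so the measure below (the share of maps on which two sites have different dominant variants) is a simple stand-in chosen for illustration, and all names are hypothetical.

```python
def lexical_distance(site_a, site_b, dominant):
    """Share of maps on which two sites differ (assumed stand-in measure).

    dominant: dict map_id -> dict site -> dominant variant at that site
    Returns a value in [0, 1], or None if the sites share no map.
    """
    compared = 0
    differing = 0
    for variants_by_site in dominant.values():
        a = variants_by_site.get(site_a)
        b = variants_by_site.get(site_b)
        if a is None or b is None:
            continue  # skip maps where one of the sites has no record
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else None


# Hypothetical toy data: three maps, two sites.
dominant = {
    "map1": {"A": "x", "B": "x"},
    "map2": {"A": "x", "B": "y"},
    "map3": {"A": "z"},
}
print(lexical_distance("A", "B", dominant))
```

A distance of this kind is only defined between pairs of surveyed sites, not between arbitrary points on the map, which is presumably why a continuous plot is not possible with linguistic distances.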

Page 11
Page 12
Page 13

Intensity estimation

Cf. Rumpf/Pickl/Elspaß/König/Schmidt 2009; Pickl/Rumpf 2011; 2012; Pickl/Spettl/Pröll/Elspaß/König/Schmidt 2014

• Further analysis:

  • statistical analysis of spatial characteristics (homogeneity, complexity)

    • Rumpf/Pickl/Elspaß/König/Schmidt 2010

  • cluster analysis to obtain groups of maps with similar spatial structure

    • Rumpf/Pickl/Elspaß/König/Schmidt 2010; Meschenmoser/Pröll 2012

Page 14

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

• statistical tool for dimensionality reduction

  Applications in dialectometry: Clopper/Paolillo 2006; Nerbonne 2006; Leinonen 2010; Grieve/Speelman/Geeraerts 2011

• condenses large numbers of variants with similar distributions into so-called “factors”

• provides a “summary” of predominant spatial patterns in the data

Page 15

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

Combined Factor Map: Dominant factors in the SBS (all 2,160 maps, 28,315 variants)

• the factors summarize 59.9 % of the data (equivalent of 16,961 variants)

• areas of similar variant distributions

• only the 10 locally dominant factors are visible in this map

• in total: 15 factors (Kaiser criterion)

• non-dominant factors are hidden but ‘latently’ present
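The “equivalent of … variants” figures on the factor slides appear to be the stated percentage applied to the 28,315 variants of the corpus and rounded to the nearest whole number. This reading is an assumption, not something the slides state, but the short Python check below shows that it reproduces all of the reported figures. The function name is hypothetical.

```python
TOTAL_VARIANTS = 28_315  # number of variants given on the combined factor map slide

# Percentages as reported on the slides.
reported_shares = {
    "factors combined": 59.9,
    "Factor 1": 14.58,
    "Factor 2": 12.40,
    "Factor 10": 0.71,
    "Factor 11": 0.62,
}


def variant_equivalent(share_percent, total=TOTAL_VARIANTS):
    """Number of variants corresponding to a percentage of the corpus."""
    return round(share_percent / 100 * total)


for label, share in reported_shares.items():
    print(f"{label}: {share} % -> {variant_equivalent(share)} variants")
```

The results are 16,961, 4,128, 3,511, 201 and 176, matching the figures given for the combined map and for Factors 1, 2, 10 and 11.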

Page 16
Page 17

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

Example: Factor 1

• summarizes 14.58 % of the data (equivalent of 4,128 variant distributions)

• area of tendential co-occurrence of variants

• fuzzy distribution

Page 18

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

Example: Factor 2

• summarizes 12.40 % of the data (equivalent of 3,511 variant distributions)

• area of tendential co-occurrence of variants

• fuzzy distribution

Page 19

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

Example: Factor 10

• summarizes 0.71 % of the data (equivalent of 201 variant distributions)

• area of tendential co-occurrence of variants

• fuzzy, discontinuous distribution

Page 20

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

Example: Factor 11

• summarizes 0.62 % of the data (equivalent of 176 variant distributions)

• area of tendential co-occurrence of variants

• fuzzy distribution

Page 21

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

[Figure: Factor 11; catchment area of market town Lauingen]

Page 22

Factor Analysis

Cf. Pröll/Pickl/Spettl (to appear)

• nuanced and detailed account of overall spatial patterns in the data

• useful for

  • a quick overview of major spatial structures

  • a differentiated division into graded, fuzzy dialect areas

  • an exploratory look into recurring spatial structures (even weak ones) that were hitherto unknown

Page 23

Outline

• What can you do with GeoLing? → Simon Pickl

  • intensity estimation

  • factor analysis

• How do you use GeoLing? → Aaron Spettl

  • installing GeoLing

  • performing analyses

  • importing your data


Page 25

Installing GeoLing

Simple installation:

• download: www.geoling.net

• GeoLing is ready to use after unzipping a single file; no installation is required.

• Sprachatlas von Bayerisch-Schwaben (SBS) is included for demonstration purposes

Live demonstration of GeoLing – some screenshots are supplied at the end of this presentation!

Page 26

Installing GeoLing

Requirements:

• Java 7 (or higher) must be installed

• a 64-bit Java is recommended on 64-bit operating systems

• processor and memory requirements depend on the database, e.g. number of locations

• SBS database: dual-core CPU and 2 GB RAM recommended


Page 27

Performing analyses

Main window of GeoLing:

• maps are hierarchically organized for easy navigation

• individual maps can be investigated directly

• but: most analyses are performed on ‘groups’ of maps

• example: groups in SBS database

  • full corpus

  • lexical sub-corpus

  • phonetic sub-corpus

  • morphological sub-corpus


Page 28

Performing analyses

With a group of maps, you can

• perform intensity estimations, plot maps to image files, calculate characteristics etc.

• perform factor analyses and cluster analyses

For factor and cluster analysis:

• results are visualized immediately

• results can be saved to CSV or XML files for further processing
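As an illustration of such further processing, the following Python sketch reads a CSV file and determines the dominant factor for each location. The slides do not describe the layout of the exported files, so the layout assumed here (one row per location, an identifier column named "location", one numeric column per factor) and all column and function names are hypothetical. Consult the user guide for the actual format.

```python
import csv
import io


def dominant_factor_per_location(csv_file, id_column="location"):
    """Return dict location -> name of the factor column with the highest value.

    Assumes (hypothetically) one row per location and one numeric column
    per factor, in addition to the identifier column.
    """
    dominant = {}
    for row in csv.DictReader(csv_file):
        location = row.pop(id_column)
        scores = {name: float(value) for name, value in row.items()}
        dominant[location] = max(scores, key=scores.get)
    return dominant


# Made-up sample in the assumed layout; not real GeoLing output.
SAMPLE = """location,factor_1,factor_2
site_a,0.8,0.1
site_b,0.2,0.7
"""

print(dominant_factor_per_location(io.StringIO(SAMPLE)))
```

For a real file, the `io.StringIO(SAMPLE)` argument would be replaced by a file object opened with `open(path, newline="")`.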


Page 29

Importing data

• “Create new database” / “Edit existing database”

• “Database management” dialog:

  • import your own data from simple text files (CSV), whose format is described in the user guide

  • import custom distances between locations

  • export/import database, e.g. for backup and exchange of data

  • computation of linguistic distances

  • computation of bandwidths for intensity estimations


Page 30

Bottom line

GeoLing provides

• several methods for the detection of spatial patterns in geolinguistic data

• easy installation and import of your own data

• open-source license allows modifications and custom extensions

You can start using it on your own data right now!

Page 31

References

• Clopper, C. G. / Paolillo, J. C. (2006): “North American English Vowels: A Factor-analytic Perspective”. Literary and Linguistic Computing 21/4, 445–462.

• Leinonen, T. (2010): An Acoustic Analysis of Vowel Pronunciation in Swedish Dialects. Groningen: Rijksuniversiteit Groningen.

• Meschenmoser, D. / Pröll, S. (2012): “Using fuzzy clustering to reveal recurring spatial patterns in corpora of dialect maps”. International Journal of Corpus Linguistics 17/2, 176–197.

• Nerbonne, J. (2006): “Identifying linguistic structure in aggregate comparison”. Literary and Linguistic Computing 21/4, 463–475.

• Pickl, S. / Rumpf, J. (2011): “Automatische Strukturanalyse von Sprachkarten. Ein neues statistisches Verfahren”. In: Glaser, E. / Schmidt, J. E. / Frey, N. (eds.): Dynamik des Dialekts – Wandel und Variation. Akten des 3. Kongresses der Internationalen Gesellschaft für Dialektologie des Deutschen (IGDD). Stuttgart: Steiner, 267–285.

• Pickl, S. / Rumpf, J. (2012): “Dialectometric Concepts of Space: Towards a Variant-Based Dialectometry”. In: Hansen, S. / Schwarz, C. / Stoeckle, P. / Streck, T. (eds.): Dialectological and folk dialectological concepts of space. Berlin: Walter de Gruyter, 199–214.

• Pickl, S. / Spettl, A. / Pröll, S. / Elspaß, S. / König, W. / Schmidt, V. (2014): “Linguistic distances in dialectometric intensity estimation”. Journal of Linguistic Geography 2, 25–40.

• Pröll, S. / Pickl, S. / Spettl, A. (to appear): “Latente Strukturen in geolinguistischen Korpora”. In: Elmentaler, M. / Hundt, M. / Schmidt, J. E. (eds.): Deutsche Dialekte. Konzepte, Probleme, Handlungsfelder. Akten des 4. Kongresses der Internationalen Gesellschaft für Dialektologie des Deutschen (IGDD) in Kiel. Stuttgart: Steiner.

• Rumpf, J. / Pickl, S. / Elspaß, S. / König, W. / Schmidt, V. (2009): “Structural analysis of dialect maps using methods from spatial statistics”. Zeitschrift für Dialektologie und Linguistik 76/3, 280–308.

• Rumpf, J. / Pickl, S. / Elspaß, S. / König, W. / Schmidt, V. (2010): “Quantification and statistical analysis of structural similarities in dialectological area-class maps”. Dialectologia et Geolinguistica 18, 73–98.

Page 32

Appendix: Screenshots

• www.geoling.net

• contents of extracted ZIP archive

• starting GeoLing

Page 33

www.geoling.net

Page 34
Page 35

double-click to start GeoLing

Page 36

Appendix: Screenshots

• main window after startup

• navigation by hierarchical categories to individual maps

• graded area-class maps by intensity estimation

Page 37
Page 38
Page 39
Page 40
Page 41
Page 42

Appendix: Screenshots

• “groups” for operations/analyses on many maps

• export function to generate e.g. graded area-class maps for all maps of a group

• factor analysis

Page 43
Page 44
Page 45
Page 46
Page 47

Appendix: Screenshots

• importing data

• example of file format required

Page 48

choose file name of database

Page 49