Mapping Lexical Spread in American English
Jack Grieve, Aston University
Research conducted with Diansheng Guo & Alice Kasakoff, University of South Carolina, and Andrea Nini, Aston University
American Dialect Society Annual Meeting, Portland, Oregon, January 8, 2015
Lexical Spread
New words are regularly identified by lexicographers,
linguists, and the media, but very little is known about
how new words spread across time and space.
This is primarily because we haven’t had access to
sufficiently large geo-coded and time-stamped corpora to
identify and map words as they spread (although see
Eisenstein et al., 2014).
Research Goals
Identify emerging words from 2014 based on an analysis
of a multi-billion word geocoded corpus of American
tweets.
Map the geographical spread of these new words across
the United States to identify common sources of new
words and common patterns of lexical spread.
The Corpus
The team at USC is in the process of compiling a
multi-billion word corpus using the Twitter API, consisting of
almost all geocoded Tweets from the US and the UK since
2013.
These tweets come from the approximately 2% of users
who are tweeting from a geo-tracking mobile device.
The analysis presented here is based on an 8.9 billion
word corpus of American Tweets from October 2013 to
November 2014.
The corpus contains approximately 980 million Tweets
from 7 million users from within the contiguous United
States.
Identifying Rising Words
We first extracted the 67,000 words that occur at least
1,000 times in the corpus and identified rising words by
correlating each word's relative frequency per day with the
day of the year using Spearman's rank correlation coefficient.
[Figure panels not reproduced; surviving labels: ρ = .044, ρ = .116, ρ = .044]
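The correlation step described on this slide can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the function names (average_ranks, spearman, rising_scores), the input layout (one list of daily counts per word plus a list of daily corpus totals), and the toy numbers are all assumptions made for the example. In practice a library routine such as scipy.stats.spearmanr would probably be used instead of the hand-rolled version.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined here; 0.0 is a convention of this sketch
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rising_scores(daily_counts, daily_totals, min_count=1000):
    """Correlate each word's daily relative frequency with the day index.

    daily_counts: dict mapping word -> list of counts, one per day (assumed layout)
    daily_totals: list of total word counts per day (assumed to be > 0)
    min_count:    overall frequency cut-off (1,000 in the talk)
    """
    days = list(range(len(daily_totals)))
    scores = {}
    for word, counts in daily_counts.items():
        if sum(counts) < min_count:
            continue
        rel_freq = [c / t for c, t in zip(counts, daily_totals)]
        scores[word] = spearman(rel_freq, days)
    return scores


# Toy data, invented purely to exercise the functions.
toy_counts = {
    "up": [1, 2, 3, 4, 5],
    "down": [5, 4, 3, 2, 1],
    "flat": [3, 3, 3, 3, 3],
    "rare": [0, 0, 1, 0, 0],
}
toy_totals = [1000] * 5
scores = rising_scores(toy_counts, toy_totals, min_count=10)
```

Sorting the resulting scores from highest to lowest gives a ranking of rising words, and sorting from lowest to highest gives the falling words.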
The Top 10 Rising Words on Twitter 2014
Word         ρ       Definition
fuckboy      0.947   Asshole, Jerk, Poser, Tool, etc.
rn           0.938   Right Now (Top Riser 2013)
hbd          0.928   Happy Birthday
fw           0.927   Fuck with
unbothered   0.926   Unconcerned & Disengaged
ft           0.925   Face time
gmfu         0.924   Get me fucked up
sm           0.919   So Much
squad        0.919   Squad
asf          0.918   As fuck
Identifying Emerging Words
Although measuring correlations allows for rising words
to be identified, most are far too common by 2014 to
show patterns of regional spread.
To identify emerging words we cross-referenced the list
of rising words against a list of rare words, defined as
words with low overall frequencies in the fourth quarter
of 2013, excluding proper nouns.
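The cross-referencing step can be sketched as a simple filter. The slide does not state how low a frequency counts as "rare" or how proper nouns were detected, so the max_baseline threshold, the proper_nouns set, and the function name emerging_words below are hypothetical. The ρ values in the usage example are taken from the tables in this talk, but the baseline frequencies are made-up placeholders.

```python
def emerging_words(rho_by_word, baseline_rel_freq, proper_nouns=frozenset(),
                   max_baseline=1e-7, top_n=10):
    """Keep rising words that were rare in the baseline period.

    rho_by_word:       dict word -> Spearman's rho against time
    baseline_rel_freq: dict word -> relative frequency in the baseline
                       period (Q4 2013 in the talk); missing words count as 0
    proper_nouns:      words to exclude (how these are found is not specified)
    max_baseline:      hypothetical rarity threshold, not from the source
    """
    candidates = [
        (word, rho)
        for word, rho in rho_by_word.items()
        if rho > 0
        and word not in proper_nouns
        and baseline_rel_freq.get(word, 0.0) <= max_baseline
    ]
    # Strongest risers first.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:top_n]


# rho values from the talk; baseline frequencies are invented for illustration.
rho = {"rn": 0.938, "unbothered": 0.926, "gmfu": 0.924}
baseline = {"rn": 1e-4, "unbothered": 1e-8, "gmfu": 2e-8}
result = emerging_words(rho, baseline)
```

With these placeholder frequencies, "rn" is filtered out because it was already common in the baseline period, which mirrors why it appears in the rising list but not the emerging list.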
Top 10 Emerging Words on Twitter 2014
Word         ρ       Definition
unbothered   0.926   Unconcerned & Disengaged
gmfu         0.924   Get Me Fucked Up
joggers      0.908   Jogging pants
fuckboys     0.902   Losers, wimps, posers, etc.
rekt         0.900   Wrecked
tfw          0.879   That feel when
xans         0.878   Benzodiazepine pills
baeless      0.875   To be without a bae
boolin       0.857   Hanging out, esp. young men
lordt        0.854   Lord, as exclamation
Top 11-20 Emerging Words on Twitter 2014
Word         ρ       Definition
celfie       0.852   selfie
slays        0.843   impresses, succeeds at, etc.
famo         0.840   family and friends
fuckboi      0.838   fuckboy
(on) fleek   0.838   on point, esp. eyebrows
faved        0.836   to favorite something
gainz        0.828   earnings
bruuh        0.817   bro
amirite      0.816   am I right
notifs       0.808   notifications, especially online
Mapping Emerging Words
Given a series of locations (e.g. counties), there are
several ways to map emerging words:
Date of first appearance (e.g. month)
Relative frequency (word frequency/total words)
Number of words until first (or second...) occurrence
etc. (e.g. see Eisenstein et al., 2014).
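The first three measures listed above could be computed per county along these lines. This is only a sketch under stated assumptions: the input is taken to be a chronologically ordered sequence of (county, month, tokens) tuples, and the function name county_summaries and the output field names are hypothetical. The toy tweets are invented for the example.

```python
from collections import defaultdict


def county_summaries(tweets, word):
    """Summarise one word per county.

    tweets: iterable of (county, month, tokens), assumed to be in
            chronological order so the first hit seen is the earliest.
    Returns dict county -> {"first_month", "rel_freq", "words_until_first"}.
    """
    totals = defaultdict(int)   # all tokens seen per county
    hits = defaultdict(int)     # occurrences of `word` per county
    first_month = {}            # month of first occurrence
    words_until_first = {}      # tokens seen before the first occurrence

    for county, month, tokens in tweets:
        for token in tokens:
            if token == word:
                hits[county] += 1
                if county not in first_month:
                    first_month[county] = month
                    words_until_first[county] = totals[county]
            totals[county] += 1

    return {
        county: {
            "first_month": first_month.get(county),
            "rel_freq": hits[county] / total if total else 0.0,
            "words_until_first": words_until_first.get(county),
        }
        for county, total in totals.items()
    }


# Invented toy tweets, already in chronological order.
toy_tweets = [
    ("A", "2013-10", ["so", "unbothered"]),
    ("B", "2013-11", ["hello", "world"]),
    ("A", "2013-12", ["unbothered", "today"]),
    ("B", "2014-01", ["still", "not", "unbothered"]),
]
summary = county_summaries(toy_tweets, "unbothered")
```

Each of the three fields could then be used as the value mapped for a county; counties where the word never occurs get None for the first-appearance fields.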
Conclusions
Top Rising Word of 2014: Fuckboy (WOTY?)
Top Emerging Word of 2014: Unbothered (WOTY?)
New Southern Words: Unbothered, Baeless, Boolin,
Lordt, Celfie, Bruuh
New Northern Words: Fuckboy/etc., TFW, Gainz
New Western Words: GMFU, Rekt
Mapping Lexical Spread in American English
Jack Grieve, Centre for Forensic Linguistics, Aston University
Word               ρ        Definition
haha/hahaha/etc.   -0.965   Laughter
fdb                -0.947   Fuck Dem Bitches
uoeno              -0.944   You don't even know
ooo/ooh/etc.       -0.940   Ooh
ratchet            -0.937   Ghetto
ohh                -0.932   Oh
yolo               -0.929   You Only Live Once
killem             -0.929   Kill them
ill                -0.925   Good
swangin            -0.922   Swinging, Driving, etc.