Studying online conversations in the Korean blogosphere: A network approach
Anatoliy Gruzd, Dalhousie University, Canada
Chung Joo Chung, State University of New York at Buffalo, USA
Han Woo PARK, YeungNam University, Korea
International Sunbelt Social Network Conference, Riva del Garda, Italy, July 3, 2010
1 난 (I)
2 사진쟁이 (Photographer)
3 그래서 (and, so)
4 테츠 (Tetz)
5 방짜 (Bangzza)
6 댓글 (comment)
7 녹두 (Nokdu)
8 ㅋㅋ (heehee, :))
9 좀 (a little, a bit)
10 사람 (people)
Among the 10 nodes, only 2, 4, 5 and 7 are names or IDs of participants in the Bangzza blog
Semi-automated network evaluation
Anatoliy Gruzd ([email protected]) Automated Discovery of Online Networks 12
Clues suggesting that a word is likely to be a nickname
context words such as "님" (an honorific) or "씨" (Mr./Ms.)
full name, which is almost always three characters
punctuation indicative of someone being addressed (e.g., “/” or “:”)
combination of characters (Korean, English and/or Chinese), symbols (e.g., underscores, hyphens) and numbers
patterns indicative of non-native words
phonetic Koreanization of English (e.g., "미디어몽골" = mediamogul = Media Mogul)
phonetic romanization of Korean (e.g., "jihwaja" = 지화자)
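The clues above can be sketched as a small rule-based checker. This is a minimal illustration, not the authors' algorithm: the function name `nickname_clues`, the `following` argument (the text right after the candidate word) and the clue labels are all assumptions made for the example. The two "non-native word" clues (Koreanization and romanization) are left out because they would need a pronunciation lexicon that the slides do not describe.

```python
import re

# Context markers taken from the slide: honorifics and addressing punctuation.
HONORIFICS = ("님", "씨")
ADDRESS_PUNCT = ("/", ":")

# Script classes used for the "combination of characters" clue.
SCRIPT_PATTERNS = {
    "hangul": re.compile(r"[\uac00-\ud7a3]"),
    "latin": re.compile(r"[A-Za-z]"),
    "chinese": re.compile(r"[\u4e00-\u9fff]"),
    "digit_or_symbol": re.compile(r"[0-9_-]"),
}


def nickname_clues(word, following=""):
    """Return the names of the clues that fire for a candidate word.

    `following` is the text that comes right after the word in the comment
    (a hypothetical input; the slides do not say how comments are tokenized).
    """
    clues = []
    rest = following.lstrip()
    if rest.startswith(HONORIFICS):
        clues.append("honorific")
    if rest.startswith(ADDRESS_PUNCT):
        clues.append("address_punctuation")
    # Korean full names are almost always three Hangul syllables.
    if len(word) == 3 and all(SCRIPT_PATTERNS["hangul"].match(ch) for ch in word):
        clues.append("three_char_name")
    # Two or more script classes in one token (e.g., Hangul plus digits).
    scripts = [name for name, pat in SCRIPT_PATTERNS.items() if pat.search(word)]
    if len(scripts) >= 2:
        clues.append("mixed_scripts")
    return clues
```

For instance, `nickname_clues("테츠", "님 말씀이")` reports the honorific clue, while a plain two-syllable word with no context reports nothing. How the clues would be weighted or combined is not stated in the slides, so the sketch only lists which ones fire.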
Words that are NOT likely to be used as a nickname
a word candidate is a phrase
e.g., if the “FROM” field is used more like a subject line (possible indicators include white spaces and length)
a word candidate consists of a single character (e.g., “a” or “ㄱ” )
a word candidate consists of netspeak
emoticons (e.g., "=_=")
slang and abbreviations (e.g., using "2MB" to refer to the former Korean president)
onomatopoeia (e.g., "ㅋㅋ" = heehee, "하하" = haha)
Words that are NOT likely to be used as a nickname (2)
a word candidate appears more than once in the comment
a word candidate consists of random characters (e.g., "ㅁㄴㅇㄹ" or "asdf")
a word candidate is a short, conversational word or phrase (e.g., "나" = me, "아이고" = oh no, "그래서" = so/therefore)
a word candidate is a common word or idea in the given context/topic (e.g., "대한민국" = Republic of Korea, "쥐체사상" = a newly coined word used to refer to political fanatics)
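The exclusion rules from these two slides could be applied as a filter along the following lines. This is only a sketch under stated assumptions: the word lists contain just the examples quoted on the slides, the function name `rejection_reasons` and the `max_len` cutoff of 15 characters are hypothetical, and repetition is checked by simple substring counting rather than proper tokenization.

```python
# Tiny illustrative word lists built from the slide examples only;
# a real filter would need much larger, topic-specific lists.
NETSPEAK = {"ㅋㅋ", "하하", "=_=", "2MB"}
RANDOM_CHARACTERS = {"ㅁㄴㅇㄹ", "asdf"}
CONVERSATIONAL = {"나", "아이고", "그래서"}
TOPIC_WORDS = {"대한민국", "쥐체사상"}


def rejection_reasons(candidate, comment, max_len=15):
    """Return the reasons a candidate is unlikely to be a nickname.

    `max_len` is an assumed cutoff for "too long to be a name";
    the slides mention length as an indicator but give no number.
    """
    word = candidate.strip()
    if not word:
        return ["empty"]
    reasons = []
    # White space or excessive length suggests a phrase or subject line.
    if " " in word or len(word) > max_len:
        reasons.append("phrase")
    if len(word) == 1:
        reasons.append("single_character")
    if word in NETSPEAK:
        reasons.append("netspeak")
    # Substring counting is a simplification of "appears more than once".
    if comment.count(word) > 1:
        reasons.append("repeated")
    if word in RANDOM_CHARACTERS:
        reasons.append("random_characters")
    if word in CONVERSATIONAL:
        reasons.append("conversational")
    if word in TOPIC_WORDS:
        reasons.append("topic_word")
    return reasons
```

A candidate with an empty list of reasons, such as `rejection_reasons("테츠", "테츠님 안녕하세요")`, would stay in the pool of possible nicknames, while "그래서" is rejected as a conversational word.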
Conclusion
A network representation of comments posted to a blog makes it much easier to analyze social interactions among online participants
Even in a blog dominated by mostly anonymous and argumentative commentators, a community can still emerge
Suggested future improvements to our network discovery algorithm.
Acknowledgments
Jaeeun Yoo at the University of Toronto for her help with the data analysis