Transcript
Slide 1
Jonathan Simon and Elizabeth Langdon, COM 633, Fall 2010
Slide 2
The function of the General Inquirer (GI) is to generate a count of words falling into various dictionary-supplied categories.
Uses categories from the Harvard IV-4 dictionary and the Lasswell dictionary, as well as five categories based on the social cognition work of Semin and Fiedler.
182 categories in all.
Each category is a list of words and word senses.
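The counting idea can be sketched in a few lines of Python. This is only an illustration, not GI's actual implementation: the two tiny word lists below are hypothetical stand-ins for the real categories, and the sketch ignores word senses and disambiguation entirely.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: category name -> set of words.
# The real GI categories are far larger and also distinguish word senses.
CATEGORIES = {
    "Pstv": {"good", "support", "win"},
    "Ngtv": {"bad", "attack", "lose"},
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_categories(text, categories):
    """Count how many tokens fall into each category (exact matches only)."""
    counts = Counter({name: 0 for name in categories})
    for token in tokenize(text):
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return counts

sample = "Supporters say the plan is good; critics attack it as bad policy."
result = count_categories(sample, CATEGORIES)
print(dict(result))
```

Because matching is exact, "Supporters" is not counted under "support" here; handling inflected forms is one of the things a real dictionary has to deal with.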
Slide 3
Examples of Harvard IV-4 categories:
Pstv: 1045 positive words, plus a subset of 557 words tagged Affil, indicating affiliation or supportiveness
Ngtv: 1160 negative words, plus a subset of 833 words tagged Hostile, indicating an attitude or concern with hostility or aggressiveness
Strong: 1902 words implying strength, plus a subset of 689 words tagged Power, indicating a concern with power, control or authority
Weak: 755 words implying weakness, plus a subset of 284 words tagged Submit, indicating submission to authority or power, dependence on others, vulnerability to others, or withdrawal
Slide 4
Examples of Lasswell categories:
PowGain: 65 words about power increasing
PowLoss: 109 words about power decreasing
PowEnds: 30 words about the goals of the power process
PowAren: 53 words referring to political places and environments
PowCon: 228 words for ways of conflicting
Slide 5
For names and basic descriptions of each category:
http://www.wjh.harvard.edu/~inquirer/homecat.htm
For a list of all words contained in each of the 182 categories:
http://www.webuse.umd.edu:9090/tags/
Slide 6
Users CAN add new categories.
Considerations for adding categories:
Somewhat comparable to producing a set of survey questions that everyone agrees has validity in measuring a well-specified construct.
Mapping categories accurately requires attention to word use, word senses, and disambiguation routines.
Slide 7
Purpose: Analyze the content of news articles from three different sources.
The articles are about the same Ted Strickland fundraiser.
They include a newscast (via closed captioning) from WKYC, an online article from FOX8, and an online article from The Plain Dealer.
Slide 8
Beginning Screens:
Slide 9
Input: Select the content you wish to analyze.
Use plain text format (.txt).
Analyze a single file or multiple files at one time.
To analyze multiple files simultaneously, save them to a directory (e.g. F:\NewsArticles).
In the output, each file will have its own line of data within your Excel file (one row for a single file, multiple rows for multiple files).
Slide 10
Output: Specify where you want the data output to be saved, name the file, and add the .xls extension.
Dictionary: You will not need to change this! GI will analyze your content using all of its 182 categories.
Slide 11
Tags: Output is a matrix of counts and percentages of words falling into the dictionary's semantic categories.
The Format column includes r (raw count, or simple count of words) and s (scaled count, or percentage of words in each category).
The Wordcount column is the total number of words in the file.
The Leftovers column shows words not found in any dictionary.
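To make the r and s rows concrete, here is a minimal Python sketch of how one file's pair of rows could be computed. The category word lists are hypothetical, and two details are assumptions rather than documented GI behavior: Leftovers is reported as a count of unmatched tokens, and the scaled value is the raw count divided by Wordcount, times 100.

```python
import re

# Hypothetical categories for illustration only.
CATEGORIES = {
    "Pstv": {"good", "strong"},
    "Ngtv": {"bad", "weak"},
}

def tag_rows(text, categories):
    """Build an 'r' (raw count) row and an 's' (percentage) row for one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    wordcount = len(tokens)
    known = set().union(*categories.values())
    # Assumption: leftovers are counted as tokens found in no category.
    leftovers = sum(1 for t in tokens if t not in known)
    raw = {name: sum(1 for t in tokens if t in words)
           for name, words in categories.items()}
    scaled = {name: (100.0 * n / wordcount if wordcount else 0.0)
              for name, n in raw.items()}
    r_row = {"Format": "r", "Wordcount": wordcount, "Leftovers": leftovers, **raw}
    s_row = {"Format": "s", "Wordcount": wordcount, "Leftovers": leftovers, **scaled}
    return r_row, s_row

r_row, s_row = tag_rows("A good plan beats a bad plan", CATEGORIES)
print(r_row)
print(s_row)
```

Scaling by Wordcount is what makes files of different lengths comparable, which matters when a short newscast transcript is set against longer print articles.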
Slide 12
Slide 13
Words: Output is a count of all words appearing in your file.
Rows are words, columns are file names.
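The layout of the Words output (words as rows, files as columns) can be sketched as follows. The file names and texts are made up for illustration, and the tokenization is a simple assumption, not GI's own.

```python
import re
from collections import Counter

def word_matrix(documents):
    """documents: dict of file name -> text.

    Returns (file_names, rows), where rows maps each word to a list of
    counts, one per file, in the same order as file_names.
    """
    counts = {name: Counter(re.findall(r"[a-z']+", text.lower()))
              for name, text in documents.items()}
    vocabulary = sorted(set().union(*counts.values()))
    file_names = list(documents)
    rows = {word: [counts[name][word] for name in file_names]
            for word in vocabulary}
    return file_names, rows

# Hypothetical file names and contents.
docs = {
    "wkyc.txt": "jobs and schools",
    "fox8.txt": "jobs jobs taxes",
}
file_names, rows = word_matrix(docs)
for word, values in rows.items():
    print(word, values)
```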
Slide 14
Slide 15
Overall, the WKYC article can be viewed as more positive and affiliative than the FOX and PD articles.
The WKYC story showed the highest percentages in all positively valenced categories.
FOX or The Plain Dealer showed higher percentages in all negatively valenced categories.
CATA / GI findings reflect the overall tone of the articles as experienced by readers (e.g. pulled quotes, emphasis on political / economic climates, etc.).
Slide 16
Slide 17
Yoshikoder provides a general word count, a custom dictionary word count, KWIC, and a reading highlight function.
The program can handle multiple documents and analyze them individually or side by side.
All dictionaries must be either custom built or downloaded from an external source; several dictionaries are available on the Yoshikoder website.
Slide 18
Dictionaries consist of 2 levels: Categories and Patterns.
Categories are concept words that fall into a larger construct.
Patterns are the individual words or phrases that fall into a category and are actually searched for.
Yoshikoder dictionaries allow wildcards (*).
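The category/pattern structure and the wildcard can be illustrated with a short Python sketch. The issue dictionary below is hypothetical, and treating * as "any run of characters" via a regular expression is an assumption about how such matching might work, not a description of Yoshikoder's internals. Multi-word phrase patterns are not handled here.

```python
import re

# Hypothetical issue dictionary: category -> list of patterns.
DICTIONARY = {
    "Jobs": ["job*", "employ*"],
    "Education": ["school*", "teacher*"],
}

def pattern_to_regex(pattern):
    """Turn a pattern with * wildcards into a compiled regular expression."""
    escaped = re.escape(pattern.lower()).replace(r"\*", ".*")
    return re.compile(escaped)

def count_patterns(text, dictionary):
    """Count whole-token matches for every pattern, grouped by category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    report = {}
    for category, patterns in dictionary.items():
        report[category] = {}
        for pattern in patterns:
            regex = pattern_to_regex(pattern)
            report[category][pattern] = sum(
                1 for t in tokens if regex.fullmatch(t))
    return report

report = count_patterns(
    "Jobs and employment matter; one job, two schools.", DICTIONARY)
print(report)
```

A category total is then just the sum of its pattern counts, for example `sum(report["Jobs"].values())`.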
Slide 19
Purpose: Analyze the content of news articles from three different sources.
The articles are about the same Ted Strickland fundraiser.
They include a newscast (via closed captioning) from WKYC, an online article from FOX8, and an online article from The Plain Dealer.
This analysis will identify which issues were most frequently mentioned in these stories, given a list of predetermined possible issues.
Slide 20
Beginning Screen:
Slide 21
Add Document: Documents must be .txt files
Slide 22
Multiple Documents can be uploaded
Slide 23
Slide 24
Slide 25
It is important to make sure that the proper level is highlighted when adding a category or pattern. Yoshikoder can nest categories within one another.
Slide 26
Pre-made or downloaded dictionaries can be imported
Slide 27
A Yoshikoder concordance is a KWIC (keyword in context) analysis.
Concordance > Make Concordance
Results can be exported to HTML or Excel.
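For readers unfamiliar with KWIC, the idea is to list every occurrence of a keyword together with a few words of context on each side. The Python sketch below is a generic illustration under simple assumptions (case-insensitive exact match, context measured in words). The function name and window parameter are hypothetical and do not correspond to Yoshikoder settings.

```python
import re

def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines

hits = kwic("The governor said jobs come first and more jobs will follow",
            "jobs", window=2)
for left, word, right in hits:
    print(f"{left:>20} | {word} | {right}")
```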
Slide 28
Report:
Document Word Frequencies reports the frequencies of all words in an individual document.
All Word Frequencies reports the frequencies of all words in all documents, sorted by document.
Unified Word Frequencies reports the frequencies of all words in all selected documents.
Slide 29
Report:
Dictionary Report shows the frequencies of dictionary words, by category or pattern, for an individual document.
A Unified Dictionary Report exports the category frequencies to an Excel spreadsheet.
Document Comparison compares any two documents.
Statistical Comparison Report compares any two documents in terms of percent difference.
Slide 30
Slide 31
Slide 32
The Channel 3 newscast contained more issue keywords than the FOX 8 and PD stories, with the biggest difference in focus being education issues. The Jobs issue was the most frequently mentioned; however, it was emphasized more in the FOX 8 and PD stories than in Channel 3's coverage. The remaining issue mentions were sporadic, with little overlap between the sources.