REPRESENTING LINGUISTIC DATA Maha Shouman
Feb 15, 2016
TEXTARC
Data target: raw text, medium-sized
Traditional techniques: structured word lists (indices, concordances), automatic summary generation
These exclude the original linearity of the text!
Index: http://www.i75online.com/FLAIndexPage1.html
Concordance: http://www.opensourceshakespeare.com
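A concordance lists every occurrence of a word together with its surrounding context. As a rough illustration only (the function name `kwic`, the `width` parameter, and the sample sentence are assumptions made for this sketch, not taken from any of the sites above), a minimal version could look like this:

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence.

    Hypothetical helper: tokenizes on word characters and lowercases,
    which is a simplification of what real concordancers do.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

# Made-up sample text, used only to exercise the function.
sample = "To be, or not to be, that is the question."
for left, kw, right in kwic(sample, "be", width=2):
    print(f"{left:>12} [{kw}] {right}")
```

Each row keeps only a small window around the keyword, so the reader sees local context but not where the occurrence sits in the full text.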
THEMERIVER
Data target: large text collections, temporal patterns, thematic changes
Traditional techniques: histogram
Other visualizations focus on documents
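Both a histogram and a river-style view start from the same kind of data: how often each theme occurs in each time slice. The sketch below shows one way that table could be computed; the function name `theme_counts`, the time labels, and the two sample documents are invented for illustration and do not come from the ThemeRiver system.

```python
from collections import Counter, defaultdict

def theme_counts(docs, themes):
    """Count theme words per time slice.

    docs:   iterable of (time_label, text) pairs (hypothetical input format)
    themes: set of lowercase theme words to track
    Returns {time_label: Counter({theme: occurrences})}.
    """
    counts = defaultdict(Counter)
    for time_label, text in docs:
        for word in text.lower().split():
            word = word.strip(".,;:!?\"'()")  # crude punctuation removal
            if word in themes:
                counts[time_label][word] += 1
    return dict(counts)

# Made-up documents, for illustration only.
docs = [
    ("week 1", "Oil prices rise. Oil embargo discussed."),
    ("week 2", "Embargo continues as sugar exports fall."),
]
result = theme_counts(docs, {"oil", "embargo", "sugar"})
for label in sorted(result):
    print(label, dict(result[label]))
```

Plotting each time slice as separate bars gives the histogram; drawing each theme as a continuous band whose width follows its count over time gives the river-style view.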
3D THEMERIVER?
www.cs.sunysb.edu/~vislab/papers/3DThemeriver.pdf
THE WORD TREE
Visualization + information retrieval
Graphical Key Word In Context (KWIC)
KWIC: format for concordance
KWIC + suffix tree
Click
Shift-Click
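The "KWIC + suffix tree" idea can be sketched by merging the right-hand contexts of a keyword into a tree whose branches share common prefixes. This is a simplified stand-in, assuming a plain trie over the words that follow the keyword rather than a full suffix tree; the names `build_word_tree` and `show`, the `depth` parameter, and the sample sentence are hypothetical, and the click interactions are not modeled.

```python
import re

def build_word_tree(text, keyword, depth=3):
    """Merge the words following each keyword occurrence into a trie.

    Each node is {"count": n, "children": {word: node}}, where count is
    the number of occurrences whose context passes through that node.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    root = {"count": 0, "children": {}}
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        root["count"] += 1
        node = root
        for word in tokens[i + 1:i + 1 + depth]:
            child = node["children"].setdefault(
                word, {"count": 0, "children": {}})
            child["count"] += 1
            node = child
    return root

def show(node, label, indent=0):
    """Print the tree, most frequent branches first."""
    print("  " * indent + f"{label} ({node['count']})")
    ordered = sorted(node["children"].items(),
                     key=lambda kv: -kv[1]["count"])
    for word, child in ordered:
        show(child, word, indent + 1)

# Made-up sample text, for illustration only.
sample = "the cat sat on the mat and the cat ran to the door"
tree = build_word_tree(sample, "the", depth=2)
show(tree, "the")
```

Contexts that begin with the same words share a branch, so frequent continuations can be given visual weight from their counts, while a flat KWIC listing would repeat them line by line.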