1 Email Viz Future Directions Marti Hearst UC Berkeley

Email Viz Future Directions

Marti Hearst, UC Berkeley

Outline

• Important Infoviz Principle
• Tough Data Mining Problem
  – The infrequent important thing
• Interfaces tailored to user goals
  – Intelligence Analysts
  – Investigative Reporters
• Promising Future Directions
  – Integration of task, viz, and content analysis
  – Mixed-Initiative Interaction

Important InfoViz Principle

Distinguish between:

PRESENTATION and ANALYSIS

Tough Data Mining Problem

• It’s easy to see the main trends
• But often we want the rare but unexpected and important event:
  – Russian oil company example
  – Schwarzenegger and Enron
  – Cigarettes and kids
  – Person on the periphery who is working stealthily to influence things
    • Deep throat

Intelligence Analysts

• Interviews with active counter-terrorist analysts

• Great diversity in
  – Goals
  – Computing environments

• Biggest problems are social/systemic

• Many mundane IT problems as well

Mundane IT Problems

• System incompatibilities
• Data reformatting
• Data cleaning
• Documenting sources
• Archiving materials

Intelligence Analysts: Problem 1

• Look at a series of reports, images, communication patterns

• Try to build a model of what is going on
  – Follow leads
  – Compare to previous situations

• Recent problem:
  – Groups are changing their behavior patterns quickly

• Very little use of sophisticated software tools

Intelligence Analysts: Problem 2

• Given a large collection
• “Roll around” in the data
  – See what has been “touched”
    • Tools should indicate which parts of the collection have been examined and which have yet to be looked at, and by whom
  – View data in several different ways
    • Data reduction methods such as MDS, SVD, and clustering often hide important trends.
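
The "touched" idea can be made concrete with a small bookkeeping structure. The sketch below is only an illustration, not a tool described in the talk: the `CoverageTracker` class, its method names, and the message ids are all hypothetical. It records who has opened which item and reports what is still unexamined.

```python
from collections import defaultdict


class CoverageTracker:
    """Records which documents each analyst has opened (hypothetical helper)."""

    def __init__(self, doc_ids):
        self.doc_ids = set(doc_ids)
        self.viewed_by = defaultdict(set)  # doc_id -> set of analyst names

    def mark_viewed(self, doc_id, analyst):
        if doc_id not in self.doc_ids:
            raise KeyError(f"unknown document: {doc_id}")
        self.viewed_by[doc_id].add(analyst)

    def touched(self):
        """Documents at least one analyst has examined, and who examined them."""
        return {d: sorted(a) for d, a in self.viewed_by.items() if a}

    def untouched(self):
        """Documents nobody has looked at yet."""
        return sorted(self.doc_ids - set(self.touched()))


# Made-up message ids and analyst names, for illustration only.
tracker = CoverageTracker(["msg-001", "msg-002", "msg-003"])
tracker.mark_viewed("msg-001", "alice")
tracker.mark_viewed("msg-001", "bob")
tracker.mark_viewed("msg-003", "alice")
print(tracker.touched())
print(tracker.untouched())
```

A visualization could then shade items by `touched()` and highlight `untouched()` so that unexplored regions of the collection stand out.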

Intelligence Analysts: Problem 2

  – Don’t show the obvious
    • e.g., Cheney is president
  – Don’t show what you’ve already shown
  – Only show the most recent version
  – Show which info is not present
    • Changes in the usual pattern
    • Something stops happening
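
One possible way to surface "something stops happening" in an email log is to compare each sender's current silence with their usual gap between messages. This is a rough sketch under assumed inputs: the function name, the `gap_factor` threshold, and the sample log are hypothetical and do not come from the talk.

```python
from datetime import date


def stopped_senders(messages, today, min_messages=3, gap_factor=3.0):
    """Flag senders whose current silence is much longer than their usual gap.

    messages: iterable of (sender, date) pairs.
    Returns a list of (sender, days_silent, typical_gap_in_days).
    """
    by_sender = {}
    for sender, day in messages:
        by_sender.setdefault(sender, []).append(day)

    flagged = []
    for sender, days in by_sender.items():
        days.sort()
        if len(days) < min_messages:
            continue  # too little history to call anything "usual"
        gaps = [(b - a).days for a, b in zip(days, days[1:])]
        typical_gap = sum(gaps) / len(gaps)
        silence = (today - days[-1]).days
        # gap_factor is an arbitrary illustrative threshold.
        if silence > gap_factor * max(typical_gap, 1):
            flagged.append((sender, silence, typical_gap))
    return flagged


# Made-up log: "ann" wrote weekly and then went quiet; "raj" writes monthly.
log = [
    ("ann", date(2004, 1, 1)), ("ann", date(2004, 1, 8)),
    ("ann", date(2004, 1, 15)), ("ann", date(2004, 1, 22)),
    ("raj", date(2004, 1, 5)), ("raj", date(2004, 2, 5)),
    ("raj", date(2004, 3, 5)),
]
flagged = stopped_senders(log, today=date(2004, 3, 10))
print(flagged)
```

Only the weekly sender is flagged here; the monthly sender's five days of silence is well within their normal rhythm.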

Intelligence Analysts: Problem 3

• Prepare a very short executive summary for the purposes of policy making
  – Really the culmination of a cascade of summaries
  – Reps from different agencies meet and “pow-wow” to form a view of the situation
  – Rarely, but crucially, must be able to refer back to original sources and reasoning process for purposes of accountability

Investigative Reporter Example

• Looking for trends in online literature

• Create, support, refute hypotheses

Investigative Reporter Example

What are the current main topics?

What are the new popular terms? How do they track with the news?

Clustering

Corpus-level statistics, Co-occurrence statistics

Contrasting collection statistics
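
As a minimal sketch of contrasting collection statistics, the code below ranks terms by how much more frequent they are in a recent collection than in a background collection, using an add-one smoothed log-ratio. The scoring choice, the function names, and the toy documents are assumptions made for illustration; the slides do not specify a particular statistic.

```python
import math
import re
from collections import Counter


def term_counts(docs):
    """Lowercased word counts over a list of document strings."""
    counts = Counter()
    for doc in docs:
        counts.update(re.findall(r"[a-z']+", doc.lower()))
    return counts


def rising_terms(recent_docs, background_docs, top_n=5, min_count=2):
    """Rank terms by smoothed log-ratio of relative frequency, recent vs. background."""
    recent = term_counts(recent_docs)
    background = term_counts(background_docs)
    vocab_size = len(set(recent) | set(background))
    recent_total = sum(recent.values())
    background_total = sum(background.values())

    scores = {}
    for term, count in recent.items():
        if count < min_count:
            continue
        # Add-one smoothing so terms unseen in the background do not divide by zero.
        p_recent = (count + 1) / (recent_total + vocab_size)
        p_background = (background[term] + 1) / (background_total + vocab_size)
        scores[term] = math.log(p_recent / p_background)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Toy collections, invented for the example.
background = ["the crew explored the planet", "the captain hailed the ship"]
recent = ["the borg attacked the ship", "the borg returned", "the crew feared the borg"]
print(rising_terms(recent, background, top_n=1))
```

Common words such as "the" score near zero because they are frequent in both collections, which is the point of contrasting against a background.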

Investigative Reporter Example

How long after a new Star Trek series comes on the air before characters from the series appear in stories?

How often do Klingons initiate attacks against Vulcans, vs. the converse?

Named-entity recognition
Creating a list of terms
Apply the list to a subcollection

Create regex rules with POS information
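
The steps above might be combined roughly as follows. This sketch assumes the text has already been POS-tagged into `word/TAG` form with Penn Treebank tags. The character lists and verb list are small illustrative stand-ins for what named-entity recognition and a curated term list would supply, and all function names are hypothetical.

```python
import re
from collections import Counter

# Illustrative term lists; a real list would come from named-entity recognition.
KLINGONS = {"Worf", "Gowron", "Martok"}
VULCANS = {"Spock", "Tuvok", "Sarek"}
ATTACK_VERBS = {"attacked", "ambushed", "raided"}

# Proper noun, past-tense verb, proper noun, in word/TAG form.
PATTERN = re.compile(r"(\w+)/NNP\s+(\w+)/VBD\s+(\w+)/NNP")


def species(name):
    if name in KLINGONS:
        return "Klingon"
    if name in VULCANS:
        return "Vulcan"
    return None


def count_attacks(tagged_text):
    """Count (attacker species, target species) pairs matched by the pattern."""
    counts = Counter()
    for subject, verb, obj in PATTERN.findall(tagged_text):
        if verb.lower() not in ATTACK_VERBS:
            continue
        pair = (species(subject), species(obj))
        if None not in pair:
            counts[pair] += 1
    return counts


# Hand-tagged toy input, invented for the example.
tagged = (
    "Worf/NNP attacked/VBD Tuvok/NNP ./. "
    "Spock/NNP greeted/VBD Gowron/NNP ./. "
    "Martok/NNP ambushed/VBD Sarek/NNP ./. "
    "Tuvok/NNP raided/VBD Gowron/NNP ./."
)
attack_counts = count_attacks(tagged)
print(attack_counts)
```

A rule this simple misses passives, pronouns, and longer noun phrases, so it is best read as a starting point for hypothesis testing rather than a reliable counter.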

Integration

• TAKMI, by Nasukawa and Nagano, IBM Systems Journal 40(4), 2001

• The system integrates:
  – Real tasks (CRM, patent analysis)
  – Content analysis
  – Information Visualization

TAKMI, by Nasukawa and Nagano, 2001
Docs containing “windows 98”

CRM

TAKMI, by Nasukawa and Nagano, 2001

Mixed-Initiative Interaction

• Balance control between user and agent
  – In Spotfire demo, system adjusts axes after “other” category hidden
  – EDA:
    • User selects a subset of data based on interesting-looking grouping
    • System then does stats on this subset in the background while user continues to work
    • Then system notifies user of interesting trends
• See the AIDE system:
  – St. Amant, R., Dinardo, M. D., and Buckner, N. (2003). Balancing Efficiency and Interpretability in an Interactive Statistical Assistant. Proceedings of IUI.
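
The EDA loop above (user selects, system computes in the background, system notifies) can be sketched with a worker thread and a callback. This is not how AIDE or Spotfire is implemented. The `BackgroundAnalyst` class, the one-standard-deviation threshold, and the sample data are assumptions chosen only to show the control flow.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, stdev


def summarize_subset(subset, full_data, threshold=1.0):
    """Return a note if the subset mean is far from the full-data mean, else None."""
    spread = stdev(full_data)
    shift = (mean(subset) - mean(full_data)) / spread
    if abs(shift) >= threshold:
        return f"subset mean differs from overall mean by {shift:+.2f} standard deviations"
    return None


class BackgroundAnalyst:
    """Runs statistics off the main thread and reports back through a callback."""

    def __init__(self, full_data, notify):
        self.full_data = full_data
        self.notify = notify
        self.pool = ThreadPoolExecutor(max_workers=1)

    def on_selection(self, subset):
        # Called when the user selects a group; returns immediately.
        future = self.pool.submit(summarize_subset, subset, self.full_data)
        future.add_done_callback(self._report)
        return future

    def _report(self, future):
        note = future.result()
        if note is not None:  # stay quiet unless there is something to say
            self.notify(note)


notes = []
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12, 13, 14]  # made-up values
analyst = BackgroundAnalyst(data, notify=notes.append)
pending = analyst.on_selection([12, 13, 14])  # user picks an interesting-looking group
# ... the user would keep working here ...
analyst.pool.shutdown(wait=True)  # demo only: wait for the background job to finish
print(notes)
```

The design choice worth noting is that the system speaks up only when the result crosses a threshold, which keeps the user in control of the session while the agent contributes in the background.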