Toward Digital Government: The Case of Government Statistics, by Gary Marchionini, University of North Carolina at Chapel Hill (www.ils.unc.edu/govstat). NSF Grants EIA 0131824 and EIA 0129978. Principal Investigators: Gary Marchionini, Stephanie Haas, Ben Shneiderman, Catherine Plaisant, and Carol Hert.

Dec 19, 2015

Page 1:

Toward Digital Government: The Case of Government Statistics

Gary Marchionini
University of North Carolina at Chapel Hill

www.ils.unc.edu/govstat

NSF Grants EIA 0131824 and EIA 0129978
Principal Investigators: Gary Marchionini, Stephanie Haas, Ben Shneiderman, Catherine Plaisant, and Carol Hert

Page 2:

Digital Government: Leveraging IT

• Government information dissemination
– Websites
– Other publications (no mass emailings yet)

• Transactions
– Registrations
– Census, regulatory filings
– Taxes

• Policy making
– E-voting
– E-rules

• Our work focuses on statistical information and agencies, because many important decisions by policy makers and citizens depend on statistics

Page 3:

Preliminary Work, 1996-2000

• Human needs
– Interviews (agencies, public)
– Transaction log analysis
– Email content analysis

• System development and testing
– Novel interfaces
– Information architecture
– Usability studies

Page 4:

Focus on Tables, 1998-2000

• Table browser
– Java applet
– DTD for tables (DC and DDI influence)
– XML protocol
– Mapping metadata elements to interface control mechanisms
– Piping data from large databases to applet
– User studies

• Metadata to aid understanding

Page 5:

Statistical Knowledge Network, 2003-2006

• Create SKN prototype with agency partners
• Integration
– Horizontal integration across federal agencies (BLS, EIA, NCHS, Census, SSA, NASS)
– Vertical integration from local/state
• Focus on non-specialists
– Help crucial
– Metadata drives help

• User interfaces are the intermediaries to link people and data

• Find what you need, understand what you find

Page 6:

Data Flow

[Diagram. Labels:
– Agency sources: agency data with integrated metadata; agency with multiple metadata repositories; agency backend data and metadata (shown twice)
– Firewall
– Distributed Public Intermediary: variable/concept level, XML-based, incorporating ISO 11179 and DDI, providing Java-based statistical literacy tools to user interfaces
– Statistical Ontology; Domain Ontologies; Domain Experts; End User Communities
– User Interfaces (membrane)
– End users: interact with data from information/concept perspective, not agency perspective]

Page 7:

Statistical Knowledge Network Architecture

[Diagram. Labels:
– Agencies
– SKN Registry
– SKN Consortium
– Ontology Rules & Constraints
– Actions: Contribute, Find, Display, Annotate, Understand, Manipulate, Collaborate, …
– Objects: Reports metadata, Tables metadata, People metadata, Glossary, Annotations
– Private Work Space (Objects, Actions), shown three times]

Page 8:

Interface Prototypes: Find, Display, Understand; Leverage Metadata, Glossary, Ontology

• Relation Browser

• Multilayered help: treemaps, video help

• Animated Glossary

• Contextualizer

• PairTrees

• Spatial audio for maps

• Missing Data

Page 9:

Use Case Scenarios to Guide Design

• Based on discussions with agency partners

• 20 scenarios

• 4 detailed, with in-depth resources located

• Used to ground ongoing work

Page 10:

Relation Browser++ displaying all EIA webpages

[Screenshot]

Page 11:

RB++ with Cursor Over Residential Sector

[Screenshot]

Page 12:

RB++ showing ‘hous’ typed in title field

[Screenshot]

Page 13:

Multi-layered interfaces

1 level:
– map + table + filters + scatterplot

3 levels of growing complexity:
– map + table
– map + table + filters
– map + table + filters + scatterplot

Page 14:

Animated Demonstration Features

Page 15:

Script Guidelines

• Base the script on a live demonstration (never on a written description)
– Focus on tasks (not tours of widgets or conceptual overviews)
– Act out the interaction (with minimum description), then describe results in context of task
– Start with a tour of main screen components (orient and introduce vocabulary), 5-10 sec. max
– Plan a linear sequence made of very short autonomous chunks (15-60 sec.)
• Map the chunks to existing online documentation
• Show text title at beginning of each chunk
• Carefully synchronize voice and visual (hard when alone)
• Provide duration and file size for each individual chunk

Page 16:

Interactive Glossary Development Tools

• Provide foundation for content development

• Separate content development from presentation development

• Reduce overall development time

• Maximize reuse of existing elements

• Create multiple presentations from a single content development effort

Page 17:

Animation Template

[Screenshot]

Page 18:

Content Foundation Template (SIG)

• Question: initial motivation
• Answer: overview, definition
• Process: explanation, equation
• Example
• Result: statistic, answer
• Review: summary, interpretation

Page 19:

Animation Template

• Consistent display and interaction for all animations

• Presents animation and explanatory text simultaneously

• Navigate (forward and back) through animation segments

• Complete review of text at any time

Page 20:

Animation Template

• Three pieces: text, animations, template

• Text is tagged with content section tags in a separate text file

• Animation consists of segments in individual animation files

• Text and animation segments coordinated by placement in template
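
The three pieces can be illustrated with a minimal sketch, which is not the project's actual tool. The `[section]` tag syntax, the animation file name, and both function names are hypothetical; only the section names (question, answer, process, example, result, review) come from the content foundation template.

```python
import re

# Section names from the content foundation template; the [tag] syntax is an assumption.
SECTIONS = ["question", "answer", "process", "example", "result", "review"]

def parse_tagged_text(text):
    """Split a tagged text file into {section: text}."""
    parts = {}
    # Each section is assumed to start with a line such as "[question]".
    pattern = r"\[(\w+)\]\n(.*?)(?=\n\[\w+\]\n|\Z)"
    for match in re.finditer(pattern, text, re.S):
        parts[match.group(1)] = match.group(2).strip()
    return parts

def build_presentation(text, animation_files):
    """Pair each section's text with its animation segment, in template order."""
    parts = parse_tagged_text(text)
    return [
        {"section": name, "text": parts[name], "animation": animation_files.get(name)}
        for name in SECTIONS
        if name in parts
    ]

# Hypothetical content and file name, for illustration only.
text = "[question]\nWhat is a median?\n[answer]\nThe middle value.\n"
slides = build_presentation(text, {"question": "q.swf"})
```

Because the text and the animation segments live in separate files, either can be revised or reused without touching the other; the template only decides where each one is placed.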

Page 21:

[Diagram relating the ontology, the DTD/XML Schema, and SKN components]

Ontology (semantic level)
• Classes
• Relationships
• Constraint rules

DTD/XML Schema (structural level)
• Elements
• Attributes
• Datatypes

SKN
• Ontology
• DTD / XML Schema
• Interface Tools
• Statistical Interactive Glossary (SIG)

Ontology Applications
• Knowledge organization
• Content and terminology control
• Data integration
• Query support
• Automatic classification support
• Reasoning mechanism
• Others

Other labels: modeling, implementation

Page 22:

[Diagram: concept map. Labels:
– Concepts: unit, aged unit, age, household, family, income, earning, salary, wage, benefit, poverty, poverty estimate, estimate, distribution
– Sources: SSA, Census Bureau, FIFARS
– Layers: Domain knowledge, Operational knowledge
– Markup attached to the aged unit concept:
<anlyUnit>aged unit</anlyUnit>
<universe>married couples living together, with husband or wife aged 65 or older</universe>]

Page 23:

Project DTD

• Investigate DDI and ISO 11179

• Leverage DDI and data cubes

• Mark up a set of objects
– Tables
– Reports/press releases

• Use markup to build added-value search (find what you need) and help (understand what you find) support into interfaces

Page 24:

The Basic Structure

entDscr_1: description of an entity within the marked-up document

docDscr: description of the markup (what is being marked up, who marked it up, etc.)

entDscr_2: description of an entity within the marked-up document

varDscr_1: description of each variable within an entity, study group or document

stdygrpDscr: describes the “group” to which an entity or document belongs, such as a survey program

nCubeDscr: used when the entity is an aggregated table

fileDscr: describes physical file structures for nCubes

varDscr_2: description of each variable within an entity, study group or document
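
As an illustration only, the sections listed above could be assembled into a skeleton document with Python's standard `xml.etree.ElementTree`. The root element name `sknMarkup`, the `ID` attributes, and the nesting are assumptions; only the section element names come from the slide.

```python
import xml.etree.ElementTree as ET

def build_skeleton():
    # "sknMarkup" is a hypothetical root name; the slide does not give one.
    root = ET.Element("sknMarkup")
    ET.SubElement(root, "docDscr")      # what is marked up, and by whom
    ET.SubElement(root, "stdygrpDscr")  # e.g. the survey program
    # Assumed nesting: variables sit inside the entity they describe.
    report = ET.SubElement(root, "entDscr", {"ID": "E1"})
    ET.SubElement(report, "varDscr")
    table = ET.SubElement(root, "entDscr", {"ID": "E2"})
    ET.SubElement(table, "varDscr")
    ET.SubElement(table, "nCubeDscr")   # only when the entity is an aggregated table
    ET.SubElement(root, "fileDscr")     # physical file structure for the nCubes
    return root

print(ET.tostring(build_skeleton(), encoding="unicode"))
```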

Page 25:

One Example of How the DTD Helps

The DTD can help bring the “expert knowledge” to the less expert user and bring relevant information together by enabling searching via variables as well as subjects/keywords

Page 26:

Median income, by age, 2001

<var name="age" dcml="0" intrvl="discrete" aggrMeth="count" measUnit="aged units" scale="x1" origin="0" nature="interval" additivity="" temporal="no" geog="no" geoVocab="" catQnty="4">
  <labl source="producer" level="variable">age</labl>
  <universe level="variable" clusion="I">persons</universe>
  <catgryGrp ID="CG1_1" catgry="C1_1 C1_2 C1_3 C1_4">
    <labl source="producer" level="catgryGrp">Age</labl>
  </catgryGrp>
  <catgry ID="C1_1">
    <catValu ID="CV1_1">1</catValu>
    <labl source="producer" level="catgry">65-69</labl>
  </catgry>
  <catgry ID="C1_2">
    <catValu ID="CV1_2">2</catValu>
    <labl source="producer" level="catgry">70-74</labl>
  </catgry>
  <catgry ID="C1_3">
    <catValu ID="CV1_3">3</catValu>
    <labl source="producer" level="catgry">75-79</labl>
  </catgry>
  <catgry ID="C1_4">
    <catValu ID="CV1_4">4</catValu>
    <labl source="producer" level="catgry">80 or older</labl>
  </catgry>
</var>
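
A minimal sketch of how an interface might use this markup, assuming the `<var>` element is available as a string. The function names and the shape of the `tables` index are hypothetical; the element and attribute names are those shown in the markup, abbreviated here to the attributes the sketch reads.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <var> markup above (most attributes omitted).
VAR_XML = """
<var name="age" measUnit="aged units" catQnty="4">
  <labl source="producer" level="variable">age</labl>
  <universe level="variable" clusion="I">persons</universe>
  <catgry ID="C1_1"><catValu ID="CV1_1">1</catValu><labl level="catgry">65-69</labl></catgry>
  <catgry ID="C1_2"><catValu ID="CV1_2">2</catValu><labl level="catgry">70-74</labl></catgry>
  <catgry ID="C1_3"><catValu ID="CV1_3">3</catValu><labl level="catgry">75-79</labl></catgry>
  <catgry ID="C1_4"><catValu ID="CV1_4">4</catValu><labl level="catgry">80 or older</labl></catgry>
</var>
"""

def describe_variable(xml_text):
    """Return the variable name, unit, universe, and category labels."""
    var = ET.fromstring(xml_text)
    return {
        "name": var.get("name"),
        "unit": var.get("measUnit"),
        "universe": var.findtext("universe"),
        # Map each category code to its human-readable label.
        "categories": {c.findtext("catValu"): c.findtext("labl") for c in var.findall("catgry")},
    }

def tables_with_variable(tables, name):
    """`tables` maps a table title to its list of variable descriptions."""
    return [title for title, variables in tables.items()
            if any(v["name"] == name for v in variables)]

info = describe_variable(VAR_XML)
index = {"Median income, by age, 2001": [info]}
print(tables_with_variable(index, "age"))
```

The same description can feed both goals named in the slides: searching by variable (find what you need) and showing the unit, universe, and category labels as help text (understand what you find).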

Page 27:

Discovering Metadata

• Hybrid machine learning approach
– Crawl website
– Create term document matrices
– Use k-means clustering with small K to fit on screen in RB++
– Revise

• Use structure in the existing sites to train a classifier

• For small n of concepts, classify site
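
The term-document matrix and k-means steps above can be sketched in plain Python. This is a toy illustration, not the project's pipeline: the four page texts are made up, raw term counts stand in for whatever weighting was actually used, and the starting centroids are fixed by hand (`seeds`) so the run is reproducible.

```python
import math
from collections import Counter

def term_document_matrix(docs):
    """Return (vocabulary, rows); each row holds raw term counts for one document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    rows = []
    for d in docs:
        counts = Counter(d.lower().split())
        rows.append([counts[w] for w in vocab])
    return vocab, rows

def _distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(rows, seeds, iterations=10):
    """Plain k-means; `seeds` lists the row indices used as starting centroids."""
    centroids = [list(rows[i]) for i in seeds]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each document joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: _distance(r, centroids[c]))
            for r in rows
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [r for r, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up page texts standing in for crawled pages.
PAGES = [
    "residential energy prices",
    "residential energy consumption",
    "crop production estimates",
    "crop acreage estimates",
]
vocab, rows = term_document_matrix(PAGES)
labels = kmeans(rows, seeds=[0, 2])
print(labels)
```

Keeping K small, as the slide suggests, means the resulting topic list stays short enough to show on one screen.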

Page 28:

Combining Machine Learning and Dynamic Interfaces

What should these topics be, and how do we know if we’ve found the right names for them?

Page 29:

Combining Machine Learning and Dynamic Interfaces

How do we assign thousands of documents to their respective topics?

Page 30:

Initial, Unstructured Approach

[Diagram: document nodes]

Page 31:

Initial, Unstructured Approach

[Diagram: document nodes]

Page 32:

Initial, Unstructured Approach: Clustering Based on Word Distributions

[Diagram: document nodes]

This approach yielded intuitively coherent clusters. But the clusters fall at too fine a level of granularity, while also wasting large portions of the data.

Page 33:

New Approach, Semi-Supervised

Page 34:

New Approach, Semi-Supervised

[Diagram: document nodes]

Page 35:

New Approach, Semi-Supervised

[Diagram: document nodes]

This approach capitalizes on the agencies’ efforts and expertise, and so far seems to yield superior results. However, the training data are very sparse, and the observed categories are highly correlated in some cases. Our current work addresses these tuning issues.
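
The idea of reusing the agencies' own site organization as training labels can be sketched as follows. The slides do not say which classifier was used; a simple nearest-centroid classifier with cosine similarity is swapped in here as a stand-in. The topic names and page texts are made up, and the assumption is that each labeled page's topic comes from the section of the agency site it sits in.

```python
import math
from collections import Counter, defaultdict

def tokens(text):
    return text.lower().split()

def train_profiles(labeled_pages):
    """labeled_pages: list of (topic, text). Returns {topic: Counter of term counts}."""
    profiles = defaultdict(Counter)
    for topic, text in labeled_pages:
        profiles[topic].update(tokens(text))
    return profiles

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def classify(profiles, text):
    """Assign text to the topic whose term profile is most similar to it."""
    query = Counter(tokens(text))
    return max(sorted(profiles), key=lambda topic: _cosine(query, profiles[topic]))

# Made-up training pages; topics are assumed to come from existing site sections.
TRAINING = [
    ("residential", "residential energy prices"),
    ("residential", "household heating fuel"),
    ("agriculture", "crop production estimates"),
]
profiles = train_profiles(TRAINING)
print(classify(profiles, "residential heating prices"))
print(classify(profiles, "crop acreage"))
```

With only a handful of labeled pages per topic, the profiles share few terms with new pages, which is one way the sparsity problem noted above shows up in practice.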

Page 36:

Vertical Integration: Agriculture

[Diagram. Labels:
– USDA / NASS
– State Statistical Office
– State Cooperative Agency (Dept. of Agriculture, etc.)
– Collection agents
– Farmers & Producers: supply data to agencies
– Statistical Consumers: obtain data from agencies]

Page 37:

Multiple Research Threads for the SKN

• Interfaces
• Metadata and Ontology
• Multi-leveled help
• Automatic slicing and dicing
• User needs and user testing
• Cross agency cooperation

• See www.ils.unc.edu/govstat