SEASR Analytics and Zotero University of Illinois at Urbana-Champaign

Dec 25, 2015


Bertha Bridges

SEASR Analytics and Zotero

University of Illinois at Urbana-Champaign


Outline

• Brief Zotero Introduction

• SEASR Analytics for Zotero Plugin

• Interaction between SEASR and VUE

• Zotero Flows

• Configuration Mechanism

• Web Service Components

• Zotero-enabled Flows

• VUE-enabled Flows

• Attendee Project Work


The Zotero Picture

[Figure: diagram with the labels "The Web" and "Zotero Store"]


What is Zotero? (from Zotero Quick Start Guide)

• A citation manager. It is designed to store, manage, and cite bibliographic references, such as books and articles. In Zotero, each of these references constitutes an item.

• An extension for the Firefox web browser, developed by the Center for History and New Media at George Mason University.

• Installed by visiting zotero.org and clicking the download button on the page.


Zotero Features (from zotero.org)

• Automatically capture citations

• Remotely back up and sync your library

• Store PDFs, images, and web pages

• Cite from within Word and OpenOffice

• Take rich-text notes in any language

• Wide variety of import/export options

• Free, open source, and extensible

• Collaborate with group libraries

• Organize with collections and tags

• Access your library from anywhere

• Automatically grab metadata for PDFs

• Use thousands of bibliographic styles

• Instantly search your PDFs and notes

• Advanced search and data mining tools

• Interface available in over 30 languages


The Zotero + SEASR Picture

[Figure: diagram with the labels "The Web", "Zotero Store", and "The Web"]


SEASR Analytics for Zotero

• An extension for the Firefox web browser, developed by the SEASR Team

• Uses your Zotero Collections

• Performs analysis using SEASR Services


SEASR Analytics for Zotero Interface


How to Set Up Your Machine

• Install/Open Firefox

• Install Zotero

– https://addons.mozilla.org/en-US/firefox/addon/3504

– http://zotero.org

• Install the SEASR Zotero plugin

– https://addons.mozilla.org/en-US/firefox/addon/10020

• The plugin points to the default services provided by SEASR (running on our server)


Zotero and SEASR

Tag Cloud Analysis

Readability Analysis

Automatic Summarization

Network Analysis

Date Entity to Simile Timeline

Example: Zotero, SEASR, Protovis, Google Maps, Simile

Location Entity to Google Map


Tag Cloud Examples

• Tag Cloud Viewer

– Creates a tag cloud for all submitted items that have a URL. Stop words and common tokens (such as punctuation) are filtered out, the remaining words are stemmed, and the top 100 words are displayed in the tag cloud viewer.

• NGram Tag Cloud Viewer

– Creates a tag cloud for all submitted items that have a URL. Stop words and common tokens (such as punctuation) are filtered out, 2-grams are formed, and the top 100 2-grams are displayed in the tag cloud viewer.
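The counting behind both viewers can be sketched in a few lines of Python. This is only an illustration, not the SEASR flow itself: the stop-word list is a tiny hypothetical one, stemming is left out, and the names tokenize and top_terms are made up for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real flow would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (drops punctuation)."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(text, n=1, limit=100):
    """Return the `limit` most frequent n-grams after stop-word filtering.

    Assumption: stop words are removed before n-grams are formed, and no
    stemming is applied.
    """
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams).most_common(limit)


sample = "The cat sat on the mat. The cat ran."
words = top_terms(sample)         # single words, as in the Tag Cloud Viewer
bigrams = top_terms(sample, n=2)  # 2-grams, as in the NGram Tag Cloud Viewer
```

Because stop words are dropped first, a 2-gram here can join two words that were separated by a stop word in the original text. Whether the SEASR flow does the same is not stated on the slide.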


Entity Extraction Examples

• Date Entities to Simile Timeline

– Extracts date entities from all items submitted (with a url), and plots these dates on the Simile Timeline

• Location Entities to Google Map

– Extracts location entities from all items submitted (with a url), and plots these on a Google Map

• Entities to Protovis Network

– Extracts entities, creates a relationship between entities that occur in the same sentence, and displays the result in a Protovis force-directed node-link graph
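The "same sentence" rule for the network can be sketched as follows. The entity extraction step is assumed to have run already, so the input is a hypothetical list of entity names per sentence; cooccurrence_edges is an illustrative name, not a SEASR component.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sentence_entities):
    """Count how often each pair of entities shares a sentence.

    `sentence_entities` holds one list of entity names per sentence
    (assumed to come from an earlier extraction step).
    """
    edges = Counter()
    for entities in sentence_entities:
        # Sort so each undirected pair gets a single, stable key.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges


edges = cooccurrence_edges([
    ["Lincoln", "Washington"],
    ["Lincoln", "Washington", "Richmond"],
    ["Richmond"],
])
```

The edge counts could then serve as link weights in a force-directed layout.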


Text Summarization

HITS Summarizer

– Finds top sentences and tokens from all items submitted (with a url) and displays them in a report


Flesch-Kincaid Readability Test

• Given: Zotero item(s)

• Results show scores for each item selected

– Designed to indicate comprehension difficulty when reading a passage of contemporary academic English

– Flesch Reading Ease: higher scores indicate material that is easier to read; lower numbers mark passages that are more difficult to read

– Flesch–Kincaid Grade Level: result is a number that corresponds with a grade level
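Both scores are simple functions of words per sentence and syllables per word. The sketch below uses the standard published formulas, but the sentence splitting and the syllable counter are rough assumptions (a vowel-group heuristic), so its numbers will not necessarily match the SEASR service.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return ease, grade


easy_ease, easy_grade = readability("The cat sat on the mat.")
hard_ease, hard_grade = readability("Comprehension difficulty increases considerably.")
```

With this heuristic the short sentence scores a higher reading ease and a lower grade level than the sentence of long words, which matches the interpretation given on the slide.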


Authorship Analysis

• Given: Zotero Collection (or multiple items selection) with Author/Co-Author Information

• Determine the importance of the given authors in this collection

– Each author is a vertex in the graph

– Authors are connected with an edge if they are co-authors of an item

– List of Authors ranked by the Betweenness Centrality Measure

– Betweenness is a centrality measure of a vertex within a graph. Vertices that occur on many shortest paths between other vertices have higher betweenness than those that do not.
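As an illustration of the ranking, the sketch below builds the co-author graph and computes unnormalized betweenness with Brandes' algorithm for unweighted graphs. The slide does not say how the SEASR flow computes the measure, so treat this as one possible approach; the author names and function names are invented for the example.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(items):
    """Authors are vertices; two authors share an edge if they co-wrote an item."""
    graph = {}
    for authors in items:
        unique = sorted(set(authors))
        for a in unique:
            graph.setdefault(a, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in graph}
        paths = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse breadth-first order.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was visited from both endpoints, so halve the totals.
    return {v: s / 2 for v, s in score.items()}


items = [["Ann", "Bob", "Cy"], ["Cy", "Dee"]]  # hypothetical author lists
scores = betweenness(coauthor_graph(items))
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here Cy ranks first because every shortest path between Dee and the other two authors passes through Cy.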


The Value Added

• Analytical results are saved as Zotero items (View Snapshot)

– Includes metadata

– Item naming strategy identifies the item or collection processed

– Creator indicates the menu label of the SEASR analysis

• Related tab links to the items processed in the analysis

• No need to install the analysis; it runs as a web service


The Zotero Plugin

• Open Firefox

• Install Zotero

– https://addons.mozilla.org/en-US/firefox/addon/3504

• Install the SEASR Zotero plugin

– https://addons.mozilla.org/en-US/firefox/addon/10020

• The plugin will point to the default services provided by SEASR

• You can develop and deploy your own (samples available)

• The SEASR plugin preferences allow you to point to other service providers


Zotero and VUE

The VUE team has integrated their tool with Zotero so that Zotero items can be exported into VUE


SEASR Support in VUE

• Goal: Provide functionality in VUE to use SEASR flows

• Implementations:

– Add content to map for top 10 words from the given url

– Get metadata for content

– Get information about content


SEASR and VUE

• Top words from 2 different web pages with nodes moved around to see overlap


SEASR and VUE


Demonstration

• We will be demonstrating how to install and use the SEASR Analytics extension for Zotero

• We will also showcase Tufts' Visual Understanding Environment (VUE) for Zotero and its integration with SEASR


Learning Exercises: Zotero Collection

Have participants run some of the Zotero-enabled flows

– Set up a Zotero collection (if you already have a collection you want to use, skip to the next step)

• Create a collection by right-clicking on "My Library" and selecting "New Collection"

– Give the collection a name (such as "DocSouth")

• Select this collection

• Use Firefox to navigate to http://docsouth.unc.edu/neh/aaron/aaron.html

• Open Zotero by clicking the Zotero icon in Firefox (bottom-right corner)

• Capture the current webpage as a Zotero item by clicking the "Create new item from current page" button (fifth from the left on the Zotero toolbar)

• Navigate to http://docsouth.unc.edu/neh/adams/adams.html and repeat the previous step

• You should now have two documents in your "DocSouth" collection


Learning Exercise: Access SEASR

• Select one or more items in Zotero and then right-click on one of the selected items and choose SEASR Analytics -> SEASR -> Tag Cloud Viewer to create a tag cloud from text extracted from your Zotero item(s)

• Do the same thing but select SEASR Analytics -> SEASR -> Hits Summarizer instead, to view a list of top tokens and sentences extracted from your item(s)

• Repeat the same procedure one more time, but this time select SEASR Analytics -> SEASR -> Date Entities to Simile Timeline to view a timeline containing dates extracted from your item(s)


Learning Exercise: Zotero and VUE

• Run an analysis on a Zotero collection through VUE. These steps create nodes in VUE for the documents, extract the words from those documents, and connect each word to its document with a link. By doing this for multiple documents, you will see which words (concepts) are mentioned in more than one document. Note that we could switch from words to an extracted entity, such as Person, and automatically build a social network around the selected documents.

– Note: This exercise requires the existence of a Zotero collection; you can create one by following step 1 in the previous exercise, if necessary


Learning Exercise: SEASR and VUE

– Open Zotero by clicking on the Zotero icon in Firefox (bottom-right corner)

– Click the Settings button on the Zotero toolbar (third from left) and select "Start VUE"

• Confirm the security prompt, if one is presented

– Right-click the collection (ex: "DocSouth") and select "Send to VUE and Add to Map"

– Select one of the boxes (documents) in the VUE workspace and then choose Analysis -> SEASR from the VUE menu

– In the new window that is displayed, select "Create new nodes" in step 2, and the "Resource Word Count" analysis in step 3 and press Analyze

– Repeat this for additional nodes in your graph to build a more complex network of words. You can now use the functionality of VUE to rearrange your graph to tell a story.


Discussion Questions

• What kinds of data assets would you be creating in Zotero?

• What other analysis would you like to use against this data?


Creating Zotero Flows


Outline

• Zotero Flow

• SEASR Configuration File

• VUE-SEASR Configuration File


SEASR Plugin Preferences

• Configuration files are managed in a list

• Each configuration file can be enabled or disabled

• Reload will refresh the plugin with the flows in the configuration files


Local Setup

• Copy config file to your machine from

– http://repository.seasr.org/Zotero/config/seasr.config

• In Zotero,

– Select Preferences from Menu

– Go to SEASR

• Click Add

– Specify a Provider Name

– Specify a URL for the config file (file:///Users/lauvil/Sites/zotero.config)

– Click box for Enabled

• Note: In the future, after editing the config you only need to click “Reload”


Extensible to Analysis that You Create

• You can deploy our flows on your own server or ask your university to host this analysis

• You can modify these flows and redeploy

• You can create new flows

– Perhaps you want to see only nouns or verbs

– Perhaps you want to see a list of extracted entities

• You can share these flows back to the community


Configuration File (XML or JSON)

• Contains 2 attribute-value pairs

– name: label to use in the Zotero drop-down display

– url: URL to which the post is sent

• XML

<seasr_analytics>
  <flows>
    <flow name="Author Centrality Analysis"
          url="http://services.seasr.org:10000/http://seasr.org/flows/zotero-social-network/instance/service-head-post/1"/>
  </flows>
</seasr_analytics>

• JSON

{"seasr_flows": [
  {"name": "Author Centrality Analysis",
   "url": "http://services.seasr.org:1718/meandre://seasr.org/components/zotero/service-head-post/instance/shp"},
  {"name": "Flesch-Kincaid Readability Test",
   "url": "http://services.seasr.org:1721/meandre://seasr.org/components/zotero/service-head-post/instance/shp"}
]}
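To check a configuration file before pointing the plugin at it, a short Python sketch can read both formats and list the name and url pairs. The sample strings copy the structure shown on this slide; the function names are illustrative, and how the plugin itself parses the file is not covered here.

```python
import json
import xml.etree.ElementTree as ET

XML_CONFIG = """<seasr_analytics>
  <flows>
    <flow name="Author Centrality Analysis"
          url="http://services.seasr.org:10000/http://seasr.org/flows/zotero-social-network/instance/service-head-post/1"/>
  </flows>
</seasr_analytics>"""

JSON_CONFIG = """{"seasr_flows": [
  {"name": "Author Centrality Analysis",
   "url": "http://services.seasr.org:1718/meandre://seasr.org/components/zotero/service-head-post/instance/shp"},
  {"name": "Flesch-Kincaid Readability Test",
   "url": "http://services.seasr.org:1721/meandre://seasr.org/components/zotero/service-head-post/instance/shp"}
]}"""


def flows_from_xml(text):
    """Return (name, url) pairs from every <flow> element."""
    root = ET.fromstring(text)
    return [(flow.get("name"), flow.get("url")) for flow in root.iter("flow")]


def flows_from_json(text):
    """Return (name, url) pairs from the "seasr_flows" array."""
    return [(flow["name"], flow["url"]) for flow in json.loads(text)["seasr_flows"]]


xml_flows = flows_from_xml(XML_CONFIG)
json_flows = flows_from_json(JSON_CONFIG)
```

A parse error from either function is a quick signal that the file is malformed, for example because of a missing quote or bracket.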


What does a Web Service Flow Look Like

Common components used for creating a web service flow

• Service Head Post

– Receives the http post and sends the data to the rest of the flow

• Service Tail Text

– Send the results back to the http request
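The head and tail roles can be imitated with Python's standard library to show the request/response shape. This is not Meandre code: analyze is a stand-in for the middle of a flow, the plain-text body is an assumption about what gets posted, and everything runs on a local port chosen by the operating system.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def analyze(text):
    """Stand-in for the middle of a flow: count whitespace-separated words."""
    return "words: %d" % len(text.split())


class FlowHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # "Head": read the posted body and hand it to the rest of the flow.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        result = analyze(body).encode("utf-8")
        # "Tail": write the result back on the same HTTP request.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def post_once(text):
    """Serve a single request on a free local port, post `text`, return the reply."""
    server = HTTPServer(("127.0.0.1", 0), FlowHandler)
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    conn.request("POST", "/", body=text.encode("utf-8"))
    reply = conn.getresponse().read().decode("utf-8")
    conn.close()
    worker.join()
    server.server_close()
    return reply


reply = post_once("one two three")
```

In a real flow the analysis components sit between the two service components, and the tail may return HTML or another format rather than plain text.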


Another Zotero Service Flow

Components that read Zotero data from the web service

• Zotero Author Extractor (previous slide)

– Extracts the author and co-author information from each item

• Zotero URL Extractor

– Extracts the url from each item


VUE-SEASR Configuration File (XML)

<?xml version="1.0" encoding="UTF-8"?>
<seasr_analytics>
  <flow_group label="Create New Nodes" input="one" output="map">
    <flow label="Resource Word Count"
          uri="http://vue.tufts.edu/word-counts-vuetokenizer/word-counts-for-vue-using-vue-tokenizer/"
          url="http://vue-dl.tccs.tufts.edu:1719/service/ping"
          duplicate="false">
      <input>location</input>
    </flow>
  </flow_group>
  <flow_group label="Add Metadata" input="one" output="map">
    <flow label="Resource Word Count"
          uri="http://vue.tufts.edu/word-counts-vuetokenizer/word-counts-for-vue-using-vue-tokenizer/"
          url="http://vue-dl.tccs.tufts.edu:1719/service/ping"
          duplicate="false">
      <input>location</input>
    </flow>
  </flow_group>
</seasr_analytics>


Demonstration

• We will go through an example of what a Zotero-enabled flow looks like and what's special about it

• We will show how to modify an existing Zotero-enabled flow and how to "deploy" it so that it can be leveraged within Zotero


Learning Exercises

1. Create a new flow (or adapt an existing flow) using the Meandre Workbench that performs some simple analysis and "deploy" it for access by Zotero

– We can use the flow we constructed in an earlier session as a base

2. Execute this flow

3. Change the configuration of SEASR plugin so that it knows how to access this flow

4. From Zotero, refresh the configuration file

5. Select some data to process through the updated SEASR flow


Discussion Questions

• What kinds of data assets would you be creating in Zotero?

• What other analysis would you like to use against this data?