Analyzing, Visualizing, and Navigating the Republic of Letters

Scott Weingart, http://www.scottbot.net
School of Library and Information Science, Department of History & Philosophy of Science, Indiana University, Bloomington, IN
Bodleian Digital Library Systems and Services at Osney Mead, Oxford, UK
14:00-16:00 on July 11, 2011


The Circulation of Knowledge

Analyzing, Visualizing, and Navigating the Republic of Letters

School of Library and Information Science
Department of History & Philosophy of Science
Indiana University, Bloomington, IN

Scott Weingart
http://www.scottbot.net

Bodleian Digital Library Systems and Services at Osney Mead Oxford, UK

14:00-16:00 on July 11, 2011

Schedule

14:05 - 14:15: Why Visualize?
14:15 - 14:30: Visualizations of the Republic of Letters
14:30 - 14:45: Future Possibilities
14:45 - 14:55: Questions

15 Minute Break

15:10 - 15:15: Data Conceptualizations
15:15 - 15:25: Data Formats
15:25 - 15:40: Visualization Packages
15:40 - 15:45: To-Do
15:45 - 16:00: Questions

Inspiration
Why Visualize?

Napoleon's March - Minard
Army Location, Direction, Split, Size | Temperature | Time

http://upload.wikimedia.org/wikipedia/commons/2/29/Minard.png

Flow map published 1869 by Minard.
On 24 June 1812, Napoleon's forces crossed the river Neman.
They headed east with nearly 500,000 troops.
Cut north, then south, then north again.
Some troops branched off, shrinking their numbers.
One such branch was destroyed at Polotsk; the remainder retreated south and eventually west.
By September of 1812 only 100,000 soldiers had reached Moscow.
Moscow was empty and burning, and Napoleon was forced to retreat.
Temperatures plummeted: -20 degrees, -24 degrees, -30 degrees. Many troops died.
Of the 422,000 men who marched to Moscow, a thin black line now stands for a mere 10,000 men back across the Neman.

The Many Uses
Why Visualize?

The Importance of Visualization

[Visualizations] aim at more than making the invisible visible. [They aspire] to all-at-once-ness, the condensation of laborious, step-by-step procedures into an immediate coup d'oeil ... What was a painstaking process of calculation and correlation (for example, in the construction of a table of variables) becomes a flash of intuition. And all-at-once intuition is traditionally the way that angels know, in contrast to the plodding demonstrations of humans.

Descartes's craving for angelic all-at-once-ness emerged forcefully in his mathematics, compressing the steps of mathematical proof into a single bright flare of insight: I see the whole thing at once, by intuition.

Lorraine Daston, On Scientific Observation

Daston's predictably embellished prose, but a good point.
See things clearly and quickly.

The Many Uses of Visualizations

Solidification of objects of inquiry
Summarizing data
Exploration/Navigation
Discovery
Trend-spotting
Evidence
Audience Engagement
Engaging public / funding agencies

Solidification: modern geology and rock strata; modern physics/chemistry and the atom.
Summary: the at-a-glanceness of quickly and easily seeing what you have, and possibly what you're missing.
Exploration/Navigation: looking for possible areas of inquiry and interesting outliers, or focusing on one curiosity.
Discovery: finding the difference that makes a difference. Note that the creation and conceptualization of the visualization itself can be useful in understanding what you have and what you do not have.
Trend-spotting: finding self-similarities over time and space that are too big to be found in careful, close study.
Evidence: supporting a historical argument, theory, or hypothesis.
Audience engagement: how do scientists read the average paper? Students? They look at the pictures.
Public/Funding: pretty pictures sell.

Previous Work
Visualizations of the Republic of Letters

Much work has been done here, at Stanford, at the Huygens Institute, in Italy, at Florida, and in many other places on digital, visual, and quantitative work with the Republic of Letters.

Peiresc Correspondence - Mandrou
Correspondents Per City | Geographic Spread

Mandrou (Annales historian), From Humanism to Science, 1979.
Also included examples of the letters of Erasmus, the number of printing presses, and the number of universities.
Number of correspondENTS per city; ability to compare with others for internationality.

Peiresc Correspondence - Hatch
Letters per Year | Letters per City | Geographic Spread
http://www.clas.ufl.edu/users/ufhatch/pages/11-ResearchProjects/peiresc/06rp-p-corr.htm

Improved on Mandrou (beginning in the 1980s): volume of letters over time (can see and ask about what is missing); number of correspondENCES per place.
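
The two counts noted above (letters per year, correspondences per place) can be sketched in a few lines. This is only an illustration: the records below are hypothetical placeholders, not Peiresc data, and the function names are invented for the example. Filling in empty years explicitly is what makes it possible to "see and ask about what is missing".

```python
from collections import Counter

# Hypothetical sample records (year, place); NOT real Peiresc data.
letters = [
    (1620, "Paris"), (1620, "Aix"), (1621, "Paris"),
    (1624, "Rome"), (1624, "Paris"), (1624, "Paris"),
]

def letters_per_year(records):
    """Count letters for every year in the covered span, keeping empty years."""
    counts = Counter(year for year, _ in records)
    first, last = min(counts), max(counts)
    return {year: counts.get(year, 0) for year in range(first, last + 1)}

def letters_per_place(records):
    """Count letters per place."""
    return Counter(place for _, place in records)

per_year = letters_per_year(letters)
gaps = [year for year, n in per_year.items() if n == 0]  # years with no letters
per_place = letters_per_place(letters)

print(per_year)
print("years with no surviving letters:", gaps)
print(per_place.most_common())
```

A bar chart of `per_year` would show the gap years as empty slots rather than silently skipping them.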

Republic of Letters - Hatch
Letters per Year | Correspondent Comparisons

Introduced overlaid comparisons, also visualizing more of the Republic of Letters over time.
Hubs of Europe: Mersenne was the Mailbox, Oldenburg the Clearing House of Science.
Hatch did many others as well, and continues to.

Grotius Correspondence - Weingart
Sender & Recipient Locations | Geographic Spread

Network science became popular in the 2000s (Barabasi, Watts), growing out of social network analysis in the 70s (Milgram, Wasserman, Granovetter).
With computer technology, we are now able to look at *Networks* and *Complex Systems* as objects of study.
For the first time the large scale is conceivable and doable.
Age of Google: it becomes easy to take a series of coordinates and drop them into Google Maps (Huygens Inst., CEN).

Republic of Letters - Stanford
S&R Locations | Comparisons | Time | Correspondents
https://republicofletters.stanford.edu/

Stanford group using Electronic Enlightenment data: Dan Edelstein, Paula Findlen, Nicole Coleman; conceived July 2008.
Large project, much of which is visible; this was the first go (2009).
Made It Big: New York Times, etc.
Ability to compare correspondence networks, see geographic spread, networks, and various comparative statistics.
Great for a first pass, but raised questions of uncertainty, what knowledge the user gains, etc.

Republic of Letters - Stanford
S&R Locations | Location Volume | Time | Uncertainty
https://republicofletters.stanford.edu/

2011. A next pass includes much the same information as before, but also visualizes uncertainty in bar charts.
Also included the ability to drill down into the letters themselves; now a tool for research.

Republic of Letters - Weingart
Communities | Time | Central Correspondents | Volume & Flow

February 2010, CEN database, using solely network-based visualization from network science.
Introducing the analytics of networks.
Social centrality rather than geographic.
Quantitatively determine if someone is a hub, a mailbox of Europe, a heavy producer, etc.
Visualization of THE WHOLE NETWORK, at-a-glance-ness (spaghetti?).
My, doesn't that look like the internet? Makes you wonder why this subject is becoming popular...
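
One simple way to "quantitatively determine" who is a hub or a heavy producer is to total the letters each person sent and received. The sketch below does only that. The talk does not say which centrality measure was used, so weighted degree is an assumption here, as is reading each triple as (sender, recipient, letters). The numbers are borrowed from the three-person example on the Data Formats slide.

```python
from collections import defaultdict

# (sender, recipient, letters): the direction of each edge is an assumption.
edges = [
    ("Newton", "Oldenburg", 13), ("Newton", "Flamsteed", 38),
    ("Oldenburg", "Newton", 24), ("Oldenburg", "Flamsteed", 45),
    ("Flamsteed", "Newton", 62), ("Flamsteed", "Oldenburg", 7),
]

def strengths(edge_list):
    """Return (letters sent, letters received) per person."""
    sent, received = defaultdict(int), defaultdict(int)
    for sender, recipient, count in edge_list:
        sent[sender] += count
        received[recipient] += count
    return dict(sent), dict(received)

sent, received = strengths(edges)
people = sorted(set(sent) | set(received))
total = {p: sent.get(p, 0) + received.get(p, 0) for p in people}

# Most connected first; a "heavy producer" ranks high on sent alone,
# a "mailbox" ranks high on received alone.
ranking = sorted(people, key=lambda p: total[p], reverse=True)
for person in ranking:
    print(person, "sent", sent.get(person, 0), "received", received.get(person, 0))
```

Measures such as betweenness would need a graph library; weighted degree is used here only because it needs nothing beyond the edge list.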

Grotius Correspondence - Weingart
Letters over Time | Correspondent Share | Location Share

At the same time, a slew of basic tools to visualize a SINGLE CORRESPONDENT, not a network.
Breakdown of correspondents: is someone's correspondence overwhelmingly dominated by one person, or by many? Tree-map.
Simple breakdown by place/time; there are better ways to do this.

Epistolarium - CKCC
Full Text | Senders & Recipients | Keywords | Time | Language

June 2010: first version of the Epistolarium.
Faceted browser, driven by the idea that historians would want to do both close and distant reading and comfortably switch between the two.
Simple visualizations of when and how many records appeared when selected.
Included Huygens (Christiaan & Constantijn) and Grotius; others were added later.

Epistolarium - CKCC
Time | Correspondent | Volume

Also included a basic timeline of who spoke to whom and when; can be improved.

Epistolarium - CKCC
Geographic Spread | Volume

Includes a similar geographic map of the spread of correspondences for purposes of comparison.
Could click on any and get to the letters themselves.

Epistolarium - CKCC
Communities | Correspondent Centrality | Volume

Also included a network layout.
Importantly, included the Giant Network Metrics of the CEN database: dots are sized by correspondent centrality in the overall network, which contextualizes the small ego-network appearing.
Green correspondents are ones that appear in our full text; blue ones are only in CEN.

Epistolarium - CKCC
Topics

Full-text topical analysis; more coming soon, based on visualizations.

Breaking Free of Graphs
Future Possibilities

Clustering
McKechnie et al. - http://informationr.net/ir/10-2/paper220.html
Hierarchical
Groups
Still Spaghetti

Circular Hierarchies
Holten - http://www.win.tue.nl/~dholten/papers/bundles_infovis.pdf
Re-interpreting the Network
Hierarchies
Clusters
Edge Bundling
Increased Dimensionality

Increasing Dimensionality
http://www.medialab.sciences-po.fr/index.php?mact=CGCalendar,cntnt01,default,0&cntnt01event_id=23&cntnt01display=event&cntnt01returnid=15
Graphs in 3.5 dimensions (Time? Space?)

Maps: Adding Advanced Networks
Meeks - http://dh2011network.stanford.edu/acercaDe.html

Bringing in the Old
David Rumsey - Google Earth
http://www.davidrumsey.com/
Visualizing the world as they saw it

Earth & Cosmos

Bringing in the Old
David Rumsey - Google Earth
http://www.davidrumsey.com/

Overlay very detailed historical maps; begin to see the paths letters took, the places people lived, etc. The historian can situate herself as much as possible within the context of the Republic of Letters.

Small Multiples
Andrew Gelman - http://www.juiceanalytics.com/writing/better-know-visualization-small-multiples/

Small multiples: simple but powerful, important.
Public support for vouchers broken down by region, ethnicity, and income.

Visualizing Narrative - XKCD
Randall Munroe - http://www.xkcd.com

XKCD: who has the ring?

Dimensionality Reduction - Last.FM
Biberstine - Indiana University

Self-organizing map, 2010.
Thousands/millions of dimensions of data reduced to a 2.5D map, like contour maps of geography.

Travel Time on Commuter Rails
New York Times - http://nyti.ms/irMnHS

Radial: how far you can go in 15-minute intervals.

Travel Time vs. Carbon Footprint in Paris
http://xiaoji-chen.com/blog/2010/map-of-paris-visualizing-urban-transportation/

Radial: how far you can go in 30-minute intervals.
Compare car, metro, and bicycle in time vs. carbon footprint.
Visualizing the same information in different ways.

New York Subway Ridership
http://diametunim.com/blog/?p=111

Edges on the map have meaning both in geographic space and geometric space (thickness is a quantitative dimension).
Also see ridership over time for the last 100 years on different lines.

Thank You
Analyzing, Visualizing, and Navigating the Republic of Letters
Scott Weingart

Implementation

Planning Early
Data Conceptualizations

How data is represented will inform how visualizations are created.
Try to think of what you want to visualize, analyze, etc. before going in. Can what you're putting in get that? Granularity of entries? Ease of data format? Portability?

Representing Uncertainty

Three kinds of uncertainty:
Uncertain fields within an entry
Missing entries
Unknown entries

Degrees of certainty
Ranges of certainty (time, space, quantity)
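
A record structure that leaves room for these distinctions might look like the sketch below. Everything in it is an assumption made for illustration: the class and field names, the 0-to-1 certainty scale, and the sample letters (which are placeholders, not real records). Only the first two kinds of uncertainty can be stored at all; entries nobody knows about have, by definition, no record.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DateRange:
    """A date known only to lie between two years (inclusive)."""
    earliest: int
    latest: int
    certainty: float = 1.0  # assumed scale: 0.0 = pure guess, 1.0 = certain

    @property
    def is_point(self) -> bool:
        return self.earliest == self.latest

    def overlaps(self, start: int, end: int) -> bool:
        return self.earliest <= end and self.latest >= start

@dataclass
class Letter:
    sender: Optional[str]      # None = uncertain field within an entry
    recipient: Optional[str]
    sent: Optional[DateRange]
    inferred: bool = False     # True = stand-in for a letter known to be missing

# Hypothetical placeholder records.
letters = [
    Letter("Grotius", "Correspondent A", DateRange(1630, 1630)),
    Letter("Grotius", None, DateRange(1631, 1634, certainty=0.5)),
    Letter(None, "Grotius", None, inferred=True),
]

def possibly_in(records, start, end):
    """Letters whose date range could fall inside [start, end]."""
    return [r for r in records if r.sent is not None and r.sent.overlaps(start, end)]

def uncertain_fields(letter):
    """Names of fields left unknown in a single entry."""
    return [name for name in ("sender", "recipient", "sent")
            if getattr(letter, name) is None]

print(len(possibly_in(letters, 1632, 1633)))
print(uncertain_fields(letters[1]))
```

Storing a range plus a degree of certainty, rather than a single best-guess year, keeps the choice of how to draw the uncertainty open until visualization time.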

That which we know we don't know; that which we don't know we don't know.

Representing Continuity

Digital vs. Analog, Discontinuous vs. Continuous, Points vs. Fields
Time (point vs. range)
Space
Granularity: town, city, county, country
Range: town, city, county, country
Authorship: how is it distributed?
What is a document? Can they be nested? Sent along? Continued?

Include all, because in visualizations sometimes one is more useful than the other: letters from Paris to London vs. letters within London.

Networks
Data Formats

Network Formats

Matrix

            Newton  Oldenburg  Flamsteed
Newton           0         13         38
Oldenburg       24          0         45
Flamsteed       62          7          0

Adjacency List

Newton     Oldenburg  13
Newton     Flamsteed  38
Oldenburg  Newton     24
Oldenburg  Flamsteed  45
Flamsteed  Newton     62
Flamsteed  Oldenburg   7

Node & Edge List

Nodes
1 Newton
2 Oldenburg
3 Flamsteed
Edges
1 2 13
1 3 38
2 1 24
2 3 45
3 1 62
3 2 7

Discuss storage vs. conceptualization.
Difference between a network and a visualization.
Increasing dimensionality.

NWB Format

*Nodes
id*int label*string totaldegree*int
16 Merwede van Clootwyck, Matthys van der (1613-1664) 1
36 Perrault, Charles 1
48 Bonius, Johannes 1
67 Surenhusius Gzn., Gulielmus 1
99 Anguissola, Giacomo 1
126 Johann Moritz, von Nassau-Siegen (1604-1679) 6
131 Steenberge, J.B. 1
133 Vosberghen Jr., Caspar van 1
151 Bogerman, Johannes (1576-1637) 25
*DirectedEdges
source*int target*int weight*float eyear*int syear*int
16 36 1 1640 1650
16 126 5 1641 1649
36 48 2 1630 1633
48 16 4 1637 1644
48 67 10 1645 1648
48 36 2 1632 1638
67 133 7 1644 1648
67 131 3 1642 1643
99 67 9 1640 1645
126 16 3 1641 1646
131 133 5 1630 1638
131 99 1 1637 1639
133 36 4 1645 1648
133 48 8 1632 1636
151 48 6 1644 1647
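
The three storage formats above hold the same network, so each can be derived from the others. The sketch below goes from the matrix to the adjacency list, then to a node and edge list, and finally to text laid out like the NWB example. The function names are invented for illustration. Wrapping labels in double quotes is an assumption about what NWB readers expect for string values; the transcript above shows the labels without quotes.

```python
names = ["Newton", "Oldenburg", "Flamsteed"]
matrix = [
    [0, 13, 38],
    [24, 0, 45],
    [62, 7, 0],
]

def matrix_to_adjacency(names, matrix):
    """One (row name, column name, weight) triple per non-zero cell."""
    return [(names[i], names[j], w)
            for i, row in enumerate(matrix)
            for j, w in enumerate(row) if w != 0]

def adjacency_to_node_edge(adjacency):
    """Assign integer ids in order of first appearance, then rewrite the edges."""
    ids = {}
    for source, target, _ in adjacency:
        for name in (source, target):
            ids.setdefault(name, len(ids) + 1)
    nodes = [(node_id, name) for name, node_id in ids.items()]
    edges = [(ids[s], ids[t], w) for s, t, w in adjacency]
    return nodes, edges

def to_nwb(nodes, edges):
    """Serialize in the section/header layout shown in the NWB example."""
    lines = ["*Nodes", "id*int label*string"]
    lines += [f'{node_id} "{label}"' for node_id, label in nodes]  # quoting: assumption
    lines += ["*DirectedEdges", "source*int target*int weight*float"]
    lines += [f"{s} {t} {w}" for s, t, w in edges]
    return "\n".join(lines)

adjacency = matrix_to_adjacency(names, matrix)
nodes, edges = adjacency_to_node_edge(adjacency)
nwb_text = to_nwb(nodes, edges)
print(nwb_text)
```

Extra columns such as the start and end years in the NWB example would be added as further typed attributes in the edge header, which is where the "increasing dimensionality" of the list formats comes from.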

GraphML Format
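
The slide's own GraphML example did not survive in the transcript. As a stand-in, the sketch below writes the Newton, Oldenburg, Flamsteed network as GraphML using only the Python standard library. The key ids (`d0`, `d1`), the node id prefix, and the function name are choices made for this illustration, not taken from the talk.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

def q(tag):
    """Qualify a tag name with the GraphML namespace."""
    return f"{{{GRAPHML_NS}}}{tag}"

def to_graphml(nodes, edges):
    """Build a directed GraphML document with node labels and edge weights."""
    ET.register_namespace("", GRAPHML_NS)  # serialize without a prefix
    root = ET.Element(q("graphml"))
    ET.SubElement(root, q("key"), {"id": "d0", "for": "node",
                                   "attr.name": "label", "attr.type": "string"})
    ET.SubElement(root, q("key"), {"id": "d1", "for": "edge",
                                   "attr.name": "weight", "attr.type": "int"})
    graph = ET.SubElement(root, q("graph"), {"id": "G", "edgedefault": "directed"})
    for node_id, label in nodes:
        node = ET.SubElement(graph, q("node"), {"id": f"n{node_id}"})
        ET.SubElement(node, q("data"), {"key": "d0"}).text = label
    for source, target, weight in edges:
        edge = ET.SubElement(graph, q("edge"),
                             {"source": f"n{source}", "target": f"n{target}"})
        ET.SubElement(edge, q("data"), {"key": "d1"}).text = str(weight)
    return ET.tostring(root, encoding="unicode")

nodes = [(1, "Newton"), (2, "Oldenburg"), (3, "Flamsteed")]
edges = [(1, 2, 13), (1, 3, 38), (2, 1, 24), (2, 3, 45), (3, 1, 62), (3, 2, 7)]
graphml_text = to_graphml(nodes, edges)
print(graphml_text)
```

Each attribute needs a `key` declaration plus a `data` element per node or edge, which is why the same six edges take far more characters here than in a plain edge list.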

Rather clunky and not often used in visualization packages; good because it's understandable, maybe as an intermediate step.

JSON Format

var json = [
  {
    "adjacencies": [
      "graphnode21",
      {
        "nodeTo": "graphnode1",
        "nodeFrom": "graphnode0",
        "data": { "$color": "#557EAA" }
      },
      {
        "nodeTo": "graphnode13",
        "nodeFrom": "graphnode0",
        "data": { "$color": "#909291" }
      },
      {
        "nodeTo": "graphnode14",
        "nodeFrom": "graphnode0",
        "data": { "$color": "#557EAA" }
      ...

Indiana University
Cyberinfrastructure for Network Science Center
Visualization Packages
Katy Börner

Microscopes, Telescopes, and Macroscopes

Just as the microscope empowered our naked eyes to see cells, microbes, and viruses thereby advancing the progress of biology and medicine or the telescope opened our minds to the immensity of the cosmos and has prepared mankind for the conquest of space, macroscopes promise to help us cope with another infinite: the infinitely complex. Macroscopes give us a vision of the whole and help us synthesize. They let us detect patterns, trends, outliers, and access details in the landscape of science. Instead of making things larger or smaller, macroscopes let us observe what is at once too great, too slow, or too complex for our eyes.

Desirable Features of Macroscopes

Core Architecture & Plugins/Division of Labor: Computer scientists need to design the standardized, modular, easy to maintain and extend core architecture. Dataset and algorithm plugins, i.e., the filling, are provided by those that care and know most about the data and developed the algorithms: the domain experts.
Ease of Use: As most plugin contributions and usage will come from non-computer scientists it must be possible to contribute, share, and use new plugins without writing one line of code. Users need guidance for constructing effective workflows from 100+ continuously changing plugins.
Modularity: The design of software modules with well defined functionality that can be flexibly combined helps reduce costs, makes it possible to have many contribute, and increases flexibility in tool development, augmentation, and customization.
Standardization: Adoption of (industry) standards speeds up development as existing code can be leveraged. It helps pool resources, supports interoperability, but also eases the migration from research code to production code and hence the transfer of research results into industry applications and products.
Open Data and Open Code: Lets anybody check, improve, or repurpose code and eases the replication of scientific studies.

Macroscopes are similar to Flickr and YouTube, but instead of sharing images or videos, you freely share datasets and algorithms with scholars around the globe.

Börner, Katy (in press) Plug-and-Play Macroscopes. Communications of the ACM.

Network Workbench

Talk about CIShell: shells around algorithms, written in Java, but with algorithms from other places.

Network Workbench

The NWB tool supports loading the following input file formats:
GraphML (*.xml or *.graphml)
XGMML (*.xml)
Pajek .NET (*.net) & Pajek .Matrix (*.mat)
NWB (*.nwb)
TreeML (*.xml)
Edge list (*.edge)
CSV (*.csv)
ISI (*.isi)
Scopus (*.scopus)
NSF (*.nsf)
Bibtex (*.bib)
Endnote (*.enw)

and the following network file output formats:
GraphML (*.xml or *.graphml)
Pajek .MAT (*.mat)
Pajek .NET (*.net)
NWB (*.nwb)
XGMML (*.xml)
CSV (*.csv)

Formats are documented at https://nwb.slis.indiana.edu/community/?n=DataFormats.HomePage.
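
One practical consequence of the input list above is that a file extension does not always identify the format: three of the listed formats share `*.xml`. The lookup below is a small illustration built only from that list. It is not how the NWB tool itself chooses a loader, and the function name is hypothetical.

```python
from pathlib import Path

# Extension -> candidate input formats, copied from the list above.
INPUT_FORMATS = {
    ".graphml": ["GraphML"],
    ".xml": ["GraphML", "XGMML", "TreeML"],
    ".net": ["Pajek .NET"],
    ".mat": ["Pajek .Matrix"],
    ".nwb": ["NWB"],
    ".edge": ["Edge list"],
    ".csv": ["CSV"],
    ".isi": ["ISI"],
    ".scopus": ["Scopus"],
    ".nsf": ["NSF"],
    ".bib": ["Bibtex"],
    ".enw": ["Endnote"],
}

def candidate_formats(filename):
    """Return the formats a file could be, judged by its extension alone."""
    return INPUT_FORMATS.get(Path(filename).suffix.lower(), [])

for name in ("grotius.nwb", "network.xml", "notes.txt"):
    print(name, "->", candidate_formats(name) or "unsupported")
```

When more than one candidate comes back, the extension is not enough and the file's contents have to be inspected to tell the formats apart.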

The Sci2 Tool

Horizontal Time Graphs

Sci Maps
GUESS Network

Web-Based
Visualization Packages

Flare
http://flare.prefuse.org/

Flash / ActionScript

Prefuse
http://www.prefuse.org/

Java (applet-based)

Protovis
http://mbostock.github.com/protovis/

JavaScript and SVG - Stanford

d3.js - Data Driven Documents
http://mbostock.github.com/d3/

Stanford - JavaScript library; uses CSS3, HTML5, SVG; supports CSV, JSON, etc.

JIT - JavaScript InfoVis Toolkit
http://thejit.org/

JavaScript; uses JSON, CSV, etc.

To-Do

Visualizations more seamlessly integrated with navigation & facets
Handle more data
Stream data of different types from different sources
Immersive environments as humanistic tools