The Circulation of Knowledge
Analyzing, Visualizing, and Navigating the Republic of Letters
School of Library and Information Science
Department of History & Philosophy of Science
Indiana University, Bloomington, IN
Scott Weingart
http://www.scottbot.net
Bodleian Digital Library Systems and Services at Osney Mead
Oxford, UK
14:00-16:00 on July 11, 2011

Schedule
14:05 - 14:15: Why Visualize?
14:15 - 14:30: Visualizations of the Republic of Letters
14:30 - 14:45: Future Possibilities
14:45 - 14:55: Questions
15 Minute Break
15:10 - 15:15: Data Conceptualizations
15:15 - 15:25: Data Formats
15:25 - 15:40: Visualization Packages
15:40 - 15:45: To-Do
15:45 - 16:00: Questions

Inspiration
Why Visualize?

Napoleon's March - Minard
Army Location, Direction, Split, Size | Temperature | Time
http://upload.wikimedia.org/wikipedia/commons/2/29/Minard.png
Flow map published in 1869 by Minard.
On 24 June 1812, Napoleon's forces crossed the river Neman. They headed east with nearly 500,000 troops.
They cut north, then south, then north again.
Some troops branched off, shrinking their numbers.
One such branch was destroyed at Polotsk; the remaining troops retreated south and eventually west.
By September of 1812 only 100,000 soldiers had reached Moscow.
Moscow was empty and burning, and Napoleon was forced to retreat.
Temperatures plummeted: -20 degrees, -24 degrees, -30 degrees. Many troops died. Of the 422,000 men who marched to Moscow, a thin black line stands now for a mere 10,000 men back across the Neman.

The Many Uses
Why Visualize?

The Importance of Visualization
"[Visualizations] aim at more than making the invisible visible. [They aspire] to all-at-once-ness, the condensation of laborious, step-by-step procedures into an immediate coup d'oeil ... What was a painstaking process of calculation and correlation - for example, in the construction of a table of variables - becomes a flash of intuition. And all-at-once intuition is traditionally the way that angels know, in contrast to the plodding demonstrations of humans.
Descartes's craving for angelic all-at-once-ness emerged forcefully in his mathematics, compressing the steps of mathematical proof into a single bright flare of insight: 'I see the whole thing at once, by intuition.'"
Lorraine Daston, On Scientific Observation
Daston's predictably embellished prose, but a good point: see things clearly and quickly.

The Many Uses of Visualizations
Solidification of objects of inquiry
Summarizing data
Exploration/Navigation
Discovery
Trend-spotting
Evidence
Audience Engagement
Engaging public / funding agencies
Solidification: modern geology and rock strata; modern physics/chemistry and the atom.
Summary: the at-a-glance-ness of quickly and easily seeing what you have, and possibly what you're missing.
Exploration/Navigation: looking for possible areas of inquiry and interesting outliers, or focusing on one curiosity.
Discovery: finding the difference that makes a difference. Note that the creation and conceptualization of a visualization can itself be useful in understanding what you have and what you do not have.
Trend-spotting: finding self-similarities over time and space that are too big to be found in careful, close study.
Evidence: supporting a historical argument, theory, or hypothesis.
Audience engagement: how do scientists read the average paper? Students? They look at the pictures.
Public/Funding: pretty pictures sell.

Previous Work
Visualizations of the Republic of Letters
Much work has been done here, at Stanford, at the Huygens Institute, in Italy, at Florida, and in many other places on digital, visual, and quantitative approaches to the Republic of Letters.

Peiresc Correspondence - Mandrou
Correspondents Per City | Geographic Spread
Mandrou (Annales historian), From Humanism to Science, 1979.
Also included examples of the letters of Erasmus, the number of printing presses, and the number of universities.
Number of correspondeNTS per city; ability to compare with others for internationality.

Peiresc Correspondence - Hatch
Letters per Year | Letters per City | Geographic Spread
http://www.clas.ufl.edu/users/ufhatch/pages/11-ResearchProjects/peiresc/06rp-p-corr.htm
Improved on Mandrou (beginning in the 1980s): volume of letters over time (can see and ask about what is missing), and number of correspondenCES per place.
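
The "can see and ask about what is missing" point can be made concrete by counting letters per year and listing the years that have none. The following is a minimal Python sketch; the years are invented sample values, not data from the Peiresc correspondence, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample: one year per surviving letter (not real Peiresc data).
letter_years = [1620, 1620, 1621, 1623, 1623, 1623, 1626]

def letters_per_year(years):
    """Count letters per year, keeping zero counts for years without letters."""
    counts = Counter(years)
    return {y: counts.get(y, 0) for y in range(min(years), max(years) + 1)}

def missing_years(per_year):
    """Years inside the covered range for which no letters survive."""
    return [y for y, n in per_year.items() if n == 0]

per_year = letters_per_year(letter_years)
for year, n in per_year.items():
    print(year, "#" * n)  # crude text bar chart of volume over time
print("gaps:", missing_years(per_year))
```

Keeping the empty years in the result is the design choice that matters here: a chart that silently skips them hides exactly the gaps a historian would want to ask about.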

Republic of Letters - Hatch
Letters per Year | Correspondent Comparisons
Introduced overlaid comparisons, also visualizing more of the Republic of Letters over time.
Hubs of Europe: Mersenne was the "Mailbox", Oldenburg the "Clearing House of Science".
Hatch did many others as well, and continues to.

Grotius Correspondence - Weingart
Sender & Recipient Locations | Geographic Spread
Network science became popular in the 2000s (Barabasi, Watts), growing out of social network analysis in the 70s (Milgram, Wasserman, Granovetter).
With computer technology, we are now able to look at *Networks* and *Complex Systems* as objects of study.
For the first time the large scale is conceivable and doable.
In the age of Google it becomes easy to take a series of coordinates and drop them into Google Maps (Huygens Inst., CEN).

Republic of Letters - Stanford
S&R Locations | Comparisons | Time | Correspondents
https://republicofletters.stanford.edu/
Stanford group using Electronic Enlightenment data: Dan Edelstein, Paula Findlen, Nicole Coleman; conceived July 2008.
Large project, much of which is visible; this was the first go (2009).
Made it big: New York Times, etc.
Ability to compare correspondence networks and to see geographic spread, networks, and various comparative statistics.
Great for a first pass, but raised questions of uncertainty, of what knowledge the user gains, etc.

Republic of Letters - Stanford
S&R Locations | Location Volume | Time | Uncertainty
https://republicofletters.stanford.edu/
2011
A next pass includes much the same information as before, but also visualizes uncertainty in bar charts.
Also included the ability to drill down into the letters themselves; now a tool for research.

Republic of Letters - Weingart
Communities | Time | Central Correspondents | Volume & Flow
February 2010, CEN database, using solely network-based visualizations from network science.
Introducing the analytics of networks.
Social centrality rather than geographic.
Quantitatively determine if someone is a hub, a "mailbox of Europe", a heavy producer, etc.
Visualization of THE WHOLE NETWORK, at-a-glance-ness (Spaghetti?).
My, doesn't that look like the internet? Makes you wonder why this subject is becoming popular...
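
One simple way to "quantitatively determine if someone is a hub" is to compute degree-style scores from a weighted list of letters. The sketch below uses plain Python and counts letters sent, letters received, and distinct correspondents per person. The names and counts are invented for illustration and are not taken from the CEN database, and the analysis behind the slide may well have used other centrality measures.

```python
from collections import defaultdict

# Hypothetical (sender, recipient, letter_count) triples, not CEN data.
letters = [
    ("Mersenne", "Descartes", 40),
    ("Descartes", "Mersenne", 35),
    ("Mersenne", "Huygens", 12),
    ("Huygens", "Mersenne", 9),
    ("Hobbes", "Mersenne", 5),
    ("Descartes", "Huygens", 3),
]

def degree_scores(edges):
    """Return {person: (letters_sent, letters_received, distinct_partners)}."""
    sent = defaultdict(int)
    received = defaultdict(int)
    partners = defaultdict(set)
    for sender, recipient, count in edges:
        sent[sender] += count
        received[recipient] += count
        partners[sender].add(recipient)
        partners[recipient].add(sender)
    people = set(sent) | set(received)
    return {p: (sent[p], received[p], len(partners[p])) for p in people}

scores = degree_scores(letters)
# Rank by number of distinct partners first, then by total letter volume.
ranking = sorted(
    scores,
    key=lambda p: (scores[p][2], scores[p][0] + scores[p][1]),
    reverse=True,
)
for person in ranking:
    out_w, in_w, k = scores[person]
    print(f"{person:10s} sent={out_w:3d} received={in_w:3d} partners={k}")
```

A "heavy producer" shows up as a high sent count, while a "mailbox" shows up as many distinct partners with balanced sent and received counts.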

Grotius Correspondence - Weingart
Letters over Time | Correspondent Share | Location Share
At the same time, a slew of basic tools to visualize a SINGLE CORRESPONDENT, not the network.
Breakdown of correspondents: is someone's correspondence overwhelmingly dominated by one person, or by many? Tree-map.
Simple breakdown by place/time; there are better ways to do this.

Epistolarium - CKCC
Full Text | Senders & Recipients | Keywords | Time | Language
June 2010: first version of the Epistolarium.
Faceted browser, driven by the idea that historians would want to do both close and distant reading and comfortably switch between the two.
Simple visualizations of when and how many records appeared when selected.
Included Huygens (Christiaan & Constantijn) and Grotius; others later added.

Epistolarium - CKCC
Time | Correspondent | Volume
Also included a basic timeline: who spoke to whom and when. Can be improved.

Epistolarium - CKCC
Geographic Spread | Volume
Includes a similar geographic map of the spread of correspondences, for purposes of comparison.
Could click on any and get to the letters themselves.

Epistolarium - CKCC
Communities | Correspondent Centrality | Volume
Also included a network layout.
Importantly, included the Giant Network Metrics of the CEN database: dots are sized by correspondent centrality in the overall network, which contextualizes the small ego-network appearing.
Green correspondents are ones that appear in our full text; blue ones are only in CEN.

Epistolarium - CKCC
Topics
Full-text topical analysis; more coming soon, based on visualizations.

Breaking Free of Graphs
Future Possibilities

Clustering
McKechnie et al.
http://informationr.net/ir/10-2/paper220.html
Hierarchical
Groups
Still Spaghetti

Circular Hierarchies
Holten - http://www.win.tue.nl/~dholten/papers/bundles_infovis.pdf
Re-interpreting the Network
Hierarchies
Clusters
Edge Bundling
Increased Dimensionality

Increasing Dimensionality
http://www.medialab.sciences-po.fr/index.php?mact=CGCalendar,cntnt01,default,0&cntnt01event_id=23&cntnt01display=event&cntnt01returnid=15
Graphs in 3.5 dimensions
(Time? Space?)

Maps Adding Advanced Networks
Meeks
http://dh2011network.stanford.edu/acercaDe.html

Bringing in the Old
David Rumsey Google Earth
http://www.davidrumsey.com/
Visualizing the world as they saw it
Earth & Cosmos
Overlay very detailed historical maps; begin to see the paths letters took, the places people lived, etc. The historian can situate herself as much as possible within the context of the Republic of Letters.

Small Multiples
Andrew Gelman - http://www.juiceanalytics.com/writing/better-know-visualization-small-multiples/
Small multiples: simple but powerful, important.
Public support for vouchers, broken down by region, ethnicity, and income.

Visualizing Narrative - XKCD
Randall Munroe http://www.xkcd.com
xkcd: who has the ring?

Dimensionality Reduction
Last.FM
Biberstine, Indiana University
Self-Organizing Map, 2010
Thousands/millions of dimensions of data reduced to a 2.5D map, like contour maps of geography.

Travel Time on Commuter Rails
New York Times - http://nyti.ms/irMnHS
Radial: how far you can go, in 15-minute intervals.

Travel Time vs. Carbon Footprint in Paris
http://xiaoji-chen.com/blog/2010/map-of-paris-visualizing-urban-transportation/
Radial: how far you can go, in 30-minute intervals.
Compare car, metro, and bicycle in time vs. carbon footprint.
Visualizing the same info in different ways.

New York Subway Ridership
http://diametunim.com/blog/?p=111
Edges on the map have meaning in both geographic space and geometric space (thickness is a quantitative dimension).
Also see ridership over time for the last 100 years on different lines.

Thank You
Analyzing, Visualizing, and Navigating the Republic of Letters
Scott Weingart

Implementation

Planning Early
Data Conceptualizations
How data is represented will inform how visualizations are created.
Try to think of what you want to visualize, analyze, etc. before going in. Can what you're putting in get you that? Granularity of entries? Ease of data format? Portability?

Representing Uncertainty
Three kinds of uncertainty:
Uncertain fields within an entry
Missing entries
Unknown entries
Degrees of certainty
Ranges of certainty (time, space, quantity)
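
Degrees and ranges of certainty can be stored directly in the record, instead of being flattened to a single value at entry time. The following Python sketch shows one possible shape. `Uncertain`, `DateRange`, and `LetterRecord` are hypothetical names, the 0.0 to 1.0 certainty scale is an assumption, and the example letter is invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

@dataclass(frozen=True)
class Uncertain(Generic[T]):
    """A field value plus a degree of certainty (hypothetical helper)."""
    value: Optional[T]        # None means the field is unknown
    certainty: float = 1.0    # assumed scale: 0.0 (guess) to 1.0 (certain)

@dataclass(frozen=True)
class DateRange:
    """A range of certainty in time: sent somewhere within [earliest, latest]."""
    earliest: date
    latest: date

    def overlaps(self, other: "DateRange") -> bool:
        return self.earliest <= other.latest and other.earliest <= self.latest

@dataclass(frozen=True)
class LetterRecord:
    sender: Uncertain[str]
    recipient: Uncertain[str]
    sent: DateRange
    origin: Uncertain[str]

# Invented example: the year is known but not the day, and the place is a guess.
letter = LetterRecord(
    sender=Uncertain("Grotius"),
    recipient=Uncertain("Peiresc"),
    sent=DateRange(date(1630, 1, 1), date(1630, 12, 31)),
    origin=Uncertain("Paris", certainty=0.6),
)

query = DateRange(date(1630, 6, 1), date(1631, 6, 1))
print(letter.sent.overlaps(query))      # could the letter fall in the query window?
print(letter.origin.certainty < 1.0)    # flag for drawing the place as uncertain
```

This covers the first kind of uncertainty (uncertain fields within an entry). Missing and unknown entries cannot be stored per record and need separate bookkeeping.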
That which we know we don't know, and that which we don't know we don't know.

Representing Continuity
Digital vs. Analog, Discontinuous vs. Continuous, Points vs. Fields
Time (point vs. range)
Space
Granularity: town, city, county, country
Range: town, city, county, country
Authorship: how is it distributed?
What is a document? Can they be nested? Sent along? Continued?
Include all, because in visualizations sometimes one is more useful than the other: letters from Paris to London vs. letters within London.
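
The advice to include every level of granularity can be illustrated with a small roll-up: if each place is stored at all levels, the same letters can be counted as flows between cities or as flows within one city. This is a sketch in Python under stated assumptions; the gazetteer, the place assignments, and the letters are all invented for illustration.

```python
from collections import Counter

# Hypothetical gazetteer: each place is stored at every level of granularity.
PLACES = {
    "Oxford": {"town": "Oxford", "city": "Oxford", "country": "England"},
    "Westminster": {"town": "Westminster", "city": "London", "country": "England"},
    "Southwark": {"town": "Southwark", "city": "London", "country": "England"},
    "Saint-Germain": {"town": "Saint-Germain", "city": "Paris", "country": "France"},
}

# Invented letters as (origin, destination) pairs at the finest level.
letters = [
    ("Saint-Germain", "Westminster"),
    ("Saint-Germain", "Southwark"),
    ("Westminster", "Southwark"),
    ("Southwark", "Oxford"),
]

def flows(letters, level):
    """Count letters between places after rolling both ends up to `level`."""
    return Counter(
        (PLACES[src][level], PLACES[dst][level]) for src, dst in letters
    )

by_city = flows(letters, "city")
print(by_city[("Paris", "London")])    # letters from Paris to London
print(by_city[("London", "London")])   # letters within London
print(flows(letters, "country"))
```

If only the city had been recorded, the letter within London would still be countable, but the town-level view could never be recovered later.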

Networks
Data Formats

Network Formats

Matrix
            Newton  Oldenburg  Flamsteed
Newton           0         13         38
Oldenburg       24          0         45
Flamsteed       62          7          0

Adjacency List
Newton     Oldenburg  13
Newton     Flamsteed  38
Oldenburg  Newton     24
Oldenburg  Flamsteed  45
Flamsteed  Newton     62
Flamsteed  Oldenburg   7

Node & Edge List
Nodes
1  Newton
2  Oldenburg
3  Flamsteed
Edges
1  2  13
1  3  38
2  1  24
2  3  45
3  1  62
3  2   7

Discuss storage vs. conceptualization.
Difference between a network and a visualization.
Increasing dimensionality.

NWB Format
*Nodes
id*int  label*string  totaldegree*int
16   Merwede van Clootwyck, Matthys van der (1613-1664)  1
36   Perrault, Charles  1
48   Bonius, Johannes  1
67   Surenhusius Gzn., Gulielmus  1
99   Anguissola, Giacomo  1
126  Johann Moritz, von Nassau-Siegen (1604-1679)  6
131  Steenberge, J.B.  1
133  Vosberghen Jr., Caspar van  1
151  Bogerman, Johannes (1576-1637)  25
*DirectedEdges
source*int  target*int  weight*float  eyear*int  syear*int
16   36   1   1640  1650
16   126  5   1641  1649
36   48   2   1630  1633
48   16   4   1637  1644
48   67   10  1645  1648
48   36   2   1632  1638
67   133  7   1644  1648
67   131  3   1642  1643
99   67   9   1640  1645
126  16   3   1641  1646
131  133  5   1630  1638
131  99   1   1637  1639
133  36   4   1645  1648
133  48   8   1632  1636
151  48   6   1644  1647
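
The matrix, adjacency list, and node & edge list are three storage layouts for the same network, so each can be derived from the others. The Python sketch below converts the Newton/Oldenburg/Flamsteed example between all three and checks the round trip. The function names are hypothetical, and reading matrix rows as senders and columns as recipients is an assumption.

```python
names = ["Newton", "Oldenburg", "Flamsteed"]
# Assumed orientation: rows are senders, columns are recipients.
matrix = [
    [0, 13, 38],
    [24, 0, 45],
    [62, 7, 0],
]

def matrix_to_adjacency(names, matrix):
    """Matrix -> list of (source, target, weight), skipping zero cells."""
    return [
        (names[i], names[j], weight)
        for i, row in enumerate(matrix)
        for j, weight in enumerate(row)
        if weight != 0
    ]

def adjacency_to_node_edge(adjacency):
    """Adjacency list -> (nodes, edges) with 1-based integer node ids."""
    ids = {}
    for source, target, _ in adjacency:
        for name in (source, target):
            ids.setdefault(name, len(ids) + 1)  # ids in order of first appearance
    nodes = [(node_id, name) for name, node_id in ids.items()]
    edges = [(ids[s], ids[t], w) for s, t, w in adjacency]
    return nodes, edges

def node_edge_to_matrix(nodes, edges):
    """Node & edge list -> square matrix ordered by node id."""
    size = len(nodes)
    result = [[0] * size for _ in range(size)]
    for source, target, weight in edges:
        result[source - 1][target - 1] = weight
    return result

adjacency = matrix_to_adjacency(names, matrix)
nodes, edges = adjacency_to_node_edge(adjacency)
print(adjacency)
print(nodes)
print(edges)
print(node_edge_to_matrix(nodes, edges) == matrix)
```

The matrix stores every possible pair, including the zeros, while the two list forms store only the pairs that exist, which is why list forms are usually preferred for large, sparse correspondence networks.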

GraphML Format
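
For a sense of what a GraphML file contains, the following Python sketch writes the small Newton/Oldenburg/Flamsteed network as GraphML using only the standard library. The element layout (`graphml`, `key`, `graph`, `node`, `edge`, `data`) follows the GraphML format; the ids (`n1`, `label`, `weight`, `G`) and the function name are illustrative choices, not anything prescribed by the slides.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

# Illustrative data: (id, label) nodes and (source, target, weight) edges.
nodes = [(1, "Newton"), (2, "Oldenburg"), (3, "Flamsteed")]
edges = [(1, 2, 13), (1, 3, 38), (2, 1, 24), (2, 3, 45), (3, 1, 62), (3, 2, 7)]

def to_graphml(nodes, edges):
    """Serialize a directed, weighted network as a GraphML string."""
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    # <key> elements declare the attributes carried by <data> elements below.
    ET.SubElement(root, "key", {"id": "label", "for": "node",
                                "attr.name": "label", "attr.type": "string"})
    ET.SubElement(root, "key", {"id": "weight", "for": "edge",
                                "attr.name": "weight", "attr.type": "double"})
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})
    for node_id, label in nodes:
        node = ET.SubElement(graph, "node", {"id": f"n{node_id}"})
        ET.SubElement(node, "data", {"key": "label"}).text = label
    for source, target, weight in edges:
        edge = ET.SubElement(graph, "edge",
                             {"source": f"n{source}", "target": f"n{target}"})
        ET.SubElement(edge, "data", {"key": "weight"}).text = str(weight)
    return ET.tostring(root, encoding="unicode")

xml_text = to_graphml(nodes, edges)
print(xml_text)
```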
Rather clunky and not often used in visualization packages, but good because it's understandable; maybe useful as an intermediate step.

JSON Format
var json = [
  {
    "adjacencies": [
      "graphnode21",
      {
        "nodeTo": "graphnode1",
        "nodeFrom": "graphnode0",
        "data": { "$color": "#557EAA" }
      },
      {
        "nodeTo": "graphnode13",
        "nodeFrom": "graphnode0",
        "data": { "$color": "#909291" }
      },
      {
        "nodeTo": "graphnode14",
        "nodeFrom": "graphnode0",
        "data": { "$color": "#557EAA" }
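
The JSON excerpt nests each node's outgoing edges under an "adjacencies" key, with "nodeFrom", "nodeTo", and "data" on each edge. The Python sketch below builds a structure of that general shape from a plain edge list. The "id" and "name" keys, the "weight" entry under "data", and the function name are assumptions for illustration; they are not visible in the excerpt and may not match what a given visualization package expects.

```python
import json

# Illustrative (source, target, weight) edge list.
edges = [
    ("Newton", "Oldenburg", 13),
    ("Newton", "Flamsteed", 38),
    ("Oldenburg", "Newton", 24),
    ("Oldenburg", "Flamsteed", 45),
    ("Flamsteed", "Newton", 62),
    ("Flamsteed", "Oldenburg", 7),
]

def to_adjacency_json(edges):
    """Group edges under their source node, one dict per node."""
    nodes = {}
    for source, target, weight in edges:
        for name in (source, target):
            # "id" and "name" are assumed keys, not taken from the excerpt.
            nodes.setdefault(name, {"id": name, "name": name, "adjacencies": []})
        nodes[source]["adjacencies"].append(
            {"nodeFrom": source, "nodeTo": target, "data": {"weight": weight}}
        )
    return list(nodes.values())

text = json.dumps(to_adjacency_json(edges), indent=2)
print(text)
```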

Indiana University Cyberinfrastructure for Network Science Center
Visualization Packages

Microscopes, Telescopes, and Macroscopes
Katy Börner
Just as the microscope empowered our naked eyes to see cells, microbes, and viruses, thereby advancing the progress of biology and medicine, or the telescope opened our minds to the immensity of the cosmos and has prepared mankind for the conquest of space, macroscopes promise to help us cope with another infinite: the infinitely complex. Macroscopes give us a vision of the whole and help us synthesize. They let us detect patterns, trends, outliers, and access details in the landscape of science. Instead of making things larger or smaller, macroscopes let us observe what is at once too great, too slow, or too complex for our eyes.

Desirable Features of Macroscopes
Core Architecture & Plugins/Division of Labor: Computer scientists need to design the standardized, modular, easy to maintain and extend core architecture. Dataset and algorithm plugins, i.e., the filling, are provided by those that care and know most about the data and developed the algorithms: the domain experts.
Ease of Use: As most plugin contributions and usage will come from non-computer scientists, it must be possible to contribute, share, and use new plugins without writing one line of code. Users need guidance for constructing effective workflows from 100+ continuously changing plugins.
Modularity: The design of software modules with well-defined functionality that can be flexibly combined helps reduce costs, makes it possible to have many contribute, and increases flexibility in tool development, augmentation, and customization.
Standardization: Adoption of (industry) standards speeds up development, as existing code can be leveraged. It helps pool resources, supports interoperability, and also eases the migration from research code to production code and hence the transfer of research results into industry applications and products.
Open Data and Open Code: Lets anybody check, improve, or repurpose code and eases the replication of scientific studies. Macroscopes are similar to Flickr and YouTube, but instead of sharing images or videos, you freely share datasets and algorithms with scholars around the globe.
Börner, Katy (in press). Plug-and-Play Macroscopes. Communications of the ACM.

Network Workbench
Talk about CIShell: shells around algorithms; written in Java, but with algorithms from other places.

Network Workbench
The NWB tool supports loading the following input file formats:
GraphML (*.xml or *.graphml)
XGMML (*.xml)
Pajek .NET (*.net) & Pajek .Matrix (*.mat)
NWB (*.nwb)
TreeML (*.xml)
Edge list (*.edge)
CSV (*.csv)
ISI (*.isi)
Scopus (*.scopus)
NSF (*.nsf)
Bibtex (*.bib)
Endnote (*.enw)
and the following network file output formats:
GraphML (*.xml or *.graphml)
Pajek .MAT (*.mat)
Pajek .NET (*.net)
NWB (*.nwb)
XGMML (*.xml)
CSV (*.csv)
Formats are documented at https://nwb.slis.indiana.edu/community/?n=DataFormats.HomePage.

The Sci2 Tool
Horizontal Time Graphs
Sci Maps
GUESS Network

Web-Based
Visualization Packages

Flare
http://flare.prefuse.org/
Flash / ActionScript

Prefuse
http://www.prefuse.org/
Java (applet-based)

Protovis
http://mbostock.github.com/protovis/
JavaScript and SVG - Stanford

d3.js - Data Driven Documents
http://mbostock.github.com/d3/
Stanford - JavaScript library; uses CSS3, HTML5, SVG; supports csv, json, etc.

JIT - JavaScript InfoVis Toolkit
http://thejit.org/
JavaScript; uses json, csv, etc.

To-Do
Visualizations more seamlessly integrated with navigation & facets
Handle more data
Stream data of different types from different sources
Immersive environments as humanistic tools

Thank You
Analyzing, Visualizing, and Navigating the Republic of Letters
Scott Weingart