Lecture/Studio: Culturomics Prof. Alvarado MDST 3703/7703 1 November 2012
Page 1: Mdst3703 culturomics-2012-11-01

Lecture/Studio: Culturomics

Prof. Alvarado
MDST 3703/7703
1 November 2012


Business

• Everyone’s families and friends OK?


Review

• The New Epistemology
  – Rise of Big Data: massive, available, social
  – Shifts our relationship to primary sources
  – From reading to quantitative methods and visualizations
  – Example of media determinism

• Manovich
  – Consistent with database logic
  – Applies spirit of Big Data methods to art


Review

• Rationalization Effects
  – What are we looking at?
  – What is theory?
  – What are models?
  – What is culture?
  – What are the humanities?


Overview

• Combined Studio and Lecture
• Lecture
  – Google’s NGram Viewer
  – Culturomics
• Studio
  – Collaborative Topic Index


Google Does the Humanities


Google NGrams

• Google Books comprises 11% of the corpus of published books, about 2 trillion words

• NGrams uses 5.2 million books (4% of the corpus)

• 500 billion words

• Published between 1500 and 2008

• In English, French, Spanish, German, Chinese and Russian (Hebrew too)


Erez Lieberman Aiden and Jean-Baptiste Michel


What’s an NGram?


A space-delimited string

N = number of strings

Case sensitive

Purely syntactic

Very hard to index
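
To make the definition concrete, here is a minimal Python sketch of n-gram extraction under the slide's description: split on whitespace only, keep case as-is, and treat N as the number of strings in each gram. The function name `ngrams` and the sample sentence are illustrative assumptions, not part of Google's actual pipeline.

```python
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive whitespace-delimited strings in text.

    Hypothetical helper for illustration: no lowercasing, no stemming,
    no punctuation handling. Tokens are purely "syntactic" strings.
    """
    tokens = text.split()  # split on whitespace only
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Made-up sample sentence, chosen to show case sensitivity.
sample = "The cat saw the Cat and the cat ran"

unigrams = Counter(ngrams(sample, 1))  # N = 1
bigrams = Counter(ngrams(sample, 2))   # N = 2

# "the", "The" and "Cat" are counted as different 1-grams.
print(unigrams["the"], unigrams["The"], unigrams["Cat"])
print(bigrams["the cat"], bigrams["the Cat"])
```

Because each gram is just a string, "Cat" and "cat" never merge, and the number of distinct grams grows quickly as N increases, which is one reason a corpus of this size is hard to index.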


Culturomics

• A method more than a model (as Anderson argues)

• Analogy is to genomics
  – Does this make sense?
  – What is the analog to the gene?


Parallel

Crossing

Convergent/Divergent


American

British


“There’s not even a historian of the book connected to the project,” Mr. Menand noted.


Anthony Grafton, History, Princeton


Studio

• We are now at the point where we have all the pieces in place
  – HTML markup, CSS, JavaScript
  – Structured data (table in Google Docs)
  – Visualization tools

• Create Character Index
  – We will use everything we have done so far: notes, network visualizations, etc.
  – Today we begin to collaboratively create the Character Index (a subset of a full topic index)