EmpireJS: Hacking Art with Node.js and Image Analysis

Post on 08-Sep-2014


DESCRIPTION

Talk at EmpireJS, May 6th, 2014.

Transcript

Analyzing Japanese Art with Node.js and Computer Vision

John Resig

Lot 55: 20 Japanese Woodblock Prints

Each depicting a female/Geisha figure with calligraphy throughout each print. Prints measure 13.75" H x 9.375" W. Toning to each print, some losses around edges.

Estimated Price: $400 - $600

Step 1: Acquire and read tons of expensive books.

Step 2: Learn to read Japanese. *

* Japanese from the 17th to 19th century.

You’re not going to learn this from Rosetta Stone.

Step 3: Learn to read Japanese calligraphy.

Solution: A fast-loading, responsive, internationalized (i18n) website: Ukiyo-e.org

https://github.com/jeresig/i18n-node-2

var greeting = i18n.__('Hello %s, how are you today?', 'Marcus');

i18n.__n('%s cat', '%s cats', 3);

Node i18n 2 (npm install i18n-2)

setLocaleFromSubdomain([request])
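A minimal sketch of the idea behind choosing a locale from the subdomain (for example `ja.ukiyo-e.org`). The helper name `localeFromHost`, the list of supported locales, and the fallback are assumptions made for illustration; this is not the i18n-2 implementation.

```javascript
// Hypothetical helper: derive a locale from the first label of the Host header.
function localeFromHost(host, supported, fallback) {
  // Drop any port, then take the left-most DNS label.
  const hostname = String(host || "").split(":")[0].toLowerCase();
  const label = hostname.split(".")[0];
  return supported.includes(label) ? label : fallback;
}

const supported = ["en", "ja"]; // assumed list of locales

console.log(localeFromHost("ja.ukiyo-e.org", supported, "en"));   // "ja"
console.log(localeFromHost("ukiyo-e.org:8080", supported, "en")); // "en"
```

Unknown or missing subdomains fall back to the default locale, so the bare domain keeps working.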


{
  "Hello": "Hello",
  "Hello %s, how are you today?": "Hello %s, how are you today?",
  "weekend": "weekend",
  "Hello %s, how are you today? How was your %s.": "Hello %s, how are you today? How was your %s.",
  "Hi": "Hi",
  "Howdy": "Howdy",
  "%s cat": {
    "one": "%s cat",
    "other": "%s cats"
  },
  "There is one monkey in the %%s": {
    "one": "There is one monkey in the %%s",
    "other": "There are %d monkeys in the %%s"
  },
  "tree": "tree"
}
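A rough sketch of how a catalog in this shape could drive singular and plural lookups. The function names `translate` and `translatePlural`, and the simple "one"/"other" rule, are assumptions for illustration rather than the library's internals.

```javascript
// Hypothetical catalog in the same shape as the locale file above.
const catalog = {
  "Hello %s, how are you today?": "Hello %s, how are you today?",
  "%s cat": { one: "%s cat", other: "%s cats" }
};

// Replace each %s or %d with the next argument, in order.
function format(template, args) {
  let i = 0;
  return template.replace(/%[sd]/g, () => String(args[i++]));
}

// Singular lookup: fall back to the key itself when no entry exists.
function translate(key, ...args) {
  const entry = catalog[key];
  return format(typeof entry === "string" ? entry : key, args);
}

// Plural lookup: choose "one" or "other" by count (English-style rule).
function translatePlural(singular, plural, count) {
  const entry = catalog[singular];
  const forms = entry && typeof entry === "object"
    ? entry
    : { one: singular, other: plural };
  return format(count === 1 ? forms.one : forms.other, [count]);
}

console.log(translate("Hello %s, how are you today?", "Marcus"));
console.log(translatePlural("%s cat", "%s cats", 3));
```

Using the English string as the lookup key means an untranslated string still renders, just in English.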


[Diagram: site architecture. Labels: Digital Ocean; Amazon S3; Amazon CloudFront; data (HTML, XML, JSON); images, JS, CSS; nginx (w/ cache); node.js + express (two instances); naught; MongoDB; ElasticSearch; scraper.]

https://github.com/jeresig/jquery-imgscrubber

Collecting Tons of Woodblock Print Data

[Diagram: Search → Page, Page, Page → HTML and Image for each page.]

Queue-based Crawling using PhantomJS

[Diagram: crawl pipeline. Labels, in order: Processing Queue; Some Website; WebKit, PhantomJS, CasperJS, SpookyJS; Save Data; XML Files; Mongo Log; libxml (+ XPath); MongoDB; Extract Data; Process Data; Artists; Images; Correct Artist and Date; Add to Site!]
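Queue-based crawling can be sketched as a loop that takes a URL off a queue, visits it, and enqueues any links not seen before. In this sketch `fetchPage` is a hypothetical stand-in for the headless-browser visit, and the three-page `fakeSite` exists only so the example runs without a network.

```javascript
// Breadth-first crawl driven by a queue. `fetchPage` is a hypothetical
// function that resolves to { data, links } for a given URL.
async function crawl(startUrl, fetchPage) {
  const queue = [startUrl];
  const seen = new Set(queue);
  const results = [];

  while (queue.length > 0) {
    const url = queue.shift();
    const page = await fetchPage(url);
    results.push({ url, data: page.data });

    // Enqueue links that have not been visited or queued yet.
    for (const link of page.links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return results;
}

// Fake three-page site used only for this example.
const fakeSite = {
  "/search": { data: "results", links: ["/print/1", "/print/2"] },
  "/print/1": { data: "print 1", links: ["/search"] },
  "/print/2": { data: "print 2", links: ["/print/1"] }
};

crawl("/search", async (url) => fakeSite[url]).then((pages) => {
  console.log(pages.map((p) => p.url));
});
```

The `seen` set is what keeps pages that link back to each other from being crawled twice.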

module.exports = function() {
  return {
    scrape: [
      {
        start: "http://ukiyo-e.org/search",
        visit: "//a[@class='img']",
        next: "//a[contains(@rel,'next')]"
      },
      {
        extract: {
          "title": "//p[contains(@class, 'title')]//span",
          "dateCreated": "//p[contains(@class, 'date')]//span",
          "artists[]": "//p[contains(@class, 'artist')]//a",
          "images[]": "//div[contains(@class,'imageholder')]//a/@href"
        }
      }
    ]
  };
};

      "surname" : "Hashimoto",
      "surname_kana" : "はしもと",
      "name" : "Hashimoto Okiie",
      "ascii" : "Hashimoto Okiie",
      "plain" : "Hashimoto Okiie",
      "kana" : "はしもとおきいえ",
      "_id" : ObjectId("530c0825d9a80976b2000437")
    }
  ],
  "names" : [
    {
      "original" : "Hashimoto Okiie (橋本興家)",
      "locale" : "ja",
      "kanji" : "橋本興家",
      "given" : "Okiie",
      "given_kana" : "おきいえ",
      "surname" : "Hashimoto",
      "surname_kana" : "はしもと",
      "given_kanji" : "興家",
      "surname_kanji" : "橋本",
      "name" : "Hashimoto Okiie",
      "ascii" : "Hashimoto Okiie",
      "plain" : "Hashimoto Okiie",
      "kana" : "はしもとおきいえ",
      "_id" : ObjectId("530c0825d9a80976b2000439")
    }
  ],
  "extract" : [ "53dfc997cbf9fa7501d78e4820b24a9c" ],
  "created" : ISODate("2014-02-25T03:04:05Z"),
  "__v" : 0
}

“Stack Scraper”

https://github.com/jeresig/stack-scraper

https://github.com/jeresig/ukiyoe-scrapers
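The `extract` map in the scraper configuration above pairs field names with XPath expressions. One plausible reading, sketched here, is that keys ending in `[]` collect every match while other keys keep the first. That interpretation, the `extractRecord` helper, and the stubbed `query` function are all assumptions for illustration, not stack-scraper's actual code.

```javascript
// Build a record from an extract map. Keys ending in "[]" collect every
// match; other keys keep only the first. `query` is a hypothetical function
// that returns all string matches for an XPath expression.
function extractRecord(extract, query) {
  const record = {};
  for (const [key, xpath] of Object.entries(extract)) {
    const matches = query(xpath);
    if (key.endsWith("[]")) {
      record[key.slice(0, -2)] = matches;
    } else if (matches.length > 0) {
      record[key] = matches[0];
    }
  }
  return record;
}

// Stubbed query results standing in for a parsed page.
const fakeMatches = {
  "//p[contains(@class, 'title')]//span": ["Example title"],
  "//p[contains(@class, 'artist')]//a": ["Artist A", "Artist B"]
};

const record = extractRecord(
  {
    "title": "//p[contains(@class, 'title')]//span",
    "artists[]": "//p[contains(@class, 'artist')]//a"
  },
  (xpath) => fakeMatches[xpath] || []
);

console.log(record);
```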

Image Similarity

https://github.com/jeresig/node-matchengine

Image Similarity Search

Idyll: Offline Image Cropping

• https://github.com/jeresig/idyll

• Crop images offline and on a mobile device.

• Saves the selections back to a server.

• Data is synced and saved using the HTML5 Application Cache (AppCache).

• https://github.com/jeresig/node-appcache-glob

by David Chester at Shutterstock

https://github.com/dchester/perl-image-crop-calibration-target

http://www.ersatzlabs.com/

Aiding Woodblock Print Studies with Image Analysis

Correcting Print Data

Japanese Names

• Utagawa Hiroshige

• Ando Hiroshige

• Andō Hiroshige

• Hiroshige

• 歌川広重

• 広重

Kanji that can be read as Andō: 安土 安堂 安島 安東 安籐 安藤 安道 安達 阿藤

Possible readings of 安藤: andō, antō, anzō, yasuzuka

A many-to-many mapping!
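A many-to-many mapping like this can be held as two indexes built from (reading, kanji) pairs, so lookups work in either direction. The pairs below are taken from the slide on the assumption that it shows kanji read as Andō and readings of 安藤; the `indexPairs` helper is hypothetical.

```javascript
// (reading, kanji) pairs; a subset of the examples on the slide.
const pairs = [
  ["andō", "安藤"],
  ["antō", "安藤"],
  ["anzō", "安藤"],
  ["yasuzuka", "安藤"],
  ["andō", "安東"],
  ["andō", "安道"]
];

// Build both directions of the mapping as Map<string, Set<string>>.
function indexPairs(list) {
  const byReading = new Map();
  const byKanji = new Map();
  for (const [reading, kanji] of list) {
    if (!byReading.has(reading)) byReading.set(reading, new Set());
    if (!byKanji.has(kanji)) byKanji.set(kanji, new Set());
    byReading.get(reading).add(kanji);
    byKanji.get(kanji).add(reading);
  }
  return { byReading, byKanji };
}

const { byReading, byKanji } = indexPairs(pairs);

console.log([...byReading.get("andō")]); // kanji candidates for one reading
console.log([...byKanji.get("安藤")]);    // reading candidates for one kanji
```

Neither direction has a unique answer, which is why a single name needs several sources to resolve.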

Sharaku Toshusai

東洲斎写楽


Is this the family name?

Where are the stress marks?

How do you “split” this name?

Which name parts correlate?

Tools (all are Node modules!)

• https://github.com/lovell/hepburn

• https://github.com/jeresig/node-enamdict

• https://github.com/jeresig/node-ndlna

• https://github.com/jeresig/node-romaji-name

[Diagram labels: ndlna, hepburn, enamdict; romaji-name]

Hepburn

• https://github.com/lovell/hepburn

• Takes in the romanized (Latin-alphabet) form of a Japanese word.

• Returns it written in Hiragana or Katakana (the phonetic Japanese scripts).


Utagawa Hiroshige → うたがわひろしげ
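The conversion from romanized text to Hiragana can be sketched as a greedy longest-match lookup over a syllable table. The table below covers only the syllables needed for this one name, and `toHiragana` is a hypothetical helper, not the hepburn module's API.

```javascript
// Tiny romaji-to-hiragana table covering only the syllables in this example.
const KANA = {
  u: "う", ta: "た", ga: "が", wa: "わ",
  hi: "ひ", ro: "ろ", shi: "し", ge: "げ"
};

function toHiragana(romaji) {
  const text = romaji.toLowerCase().replace(/\s+/g, "");
  let out = "";
  let i = 0;
  while (i < text.length) {
    let matched = false;
    // Try the longest candidate first (3, then 2, then 1 characters).
    for (let len = 3; len >= 1; len--) {
      const chunk = text.slice(i, i + len);
      if (Object.prototype.hasOwnProperty.call(KANA, chunk)) {
        out += KANA[chunk];
        i += chunk.length;
        matched = true;
        break;
      }
    }
    if (!matched) throw new Error("No kana for: " + text.slice(i));
  }
  return out;
}

console.log(toHiragana("Utagawa Hiroshige")); // うたがわひろしげ
```

Trying three characters before two is what lets "shi" win over a shorter partial match.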

Enamdict

• https://github.com/jeresig/node-enamdict

• Downloads and queries the ENAMDICT database

• (A mapping of Japanese proper names to Hiragana and English.)

• Used to correct typos and figure out surname/given name.


NDLNA

• https://github.com/jeresig/node-ndlna

• Queries the NDLNA database

• Finds the correct Kanji for an English name.

• Or the correct English for a Kanji name.


{
  "original" : "Sharaku Toshusai (東洲斎写楽 )",
  "locale" : "ja",
  "kanji" : "東洲斎写楽",
  "given" : "Sharaku",
  "given_kana" : "しゃらく",
  "surname" : "Tōshūsai",
  "surname_kana" : "とおしゅうさい",
  "surname_kanji" : "東洲斎",
  "given_kanji" : "写楽",
  "name" : "Tōshūsai Sharaku",
  "ascii" : "Tooshuusai Sharaku",
  "plain" : "Toshusai Sharaku",
  "kana" : "とおしゅうさいしゃらく"
}
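The "ascii" and "plain" fields in the record above appear to be two macron-free spellings of the same name: one doubles long vowels, the other drops the length mark. A sketch of how such forms could be derived follows; `toAscii` and `toPlain` are hypothetical helpers and not necessarily how romaji-name produces them.

```javascript
const LONG_VOWELS = { "ā": "aa", "ī": "ii", "ū": "uu", "ē": "ee", "ō": "oo" };

// "ascii" form: spell each macron vowel as a doubled vowel.
function toAscii(name) {
  return name.normalize("NFC").replace(/[āīūēōĀĪŪĒŌ]/g, (ch) => {
    const lower = ch.toLowerCase();
    const doubled = LONG_VOWELS[lower];
    // Preserve capitalisation of the first letter only (Ō becomes Oo).
    return ch === lower ? doubled : doubled[0].toUpperCase() + doubled.slice(1);
  });
}

// "plain" form: decompose, drop the combining macron (U+0304), recompose.
function toPlain(name) {
  return name.normalize("NFD").replace(/\u0304/g, "").normalize("NFC");
}

console.log(toAscii("Tōshūsai Sharaku")); // Tooshuusai Sharaku
console.log(toPlain("Tōshūsai Sharaku")); // Toshusai Sharaku
```

Storing both forms lets a search for either "Toshusai" or "Tooshuusai" reach the same artist.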

Dates

• https://github.com/jeresig/node-yearrange

var yr = require("yearrange");

yr.parse("1877")
// {"start": 1877, "end": 1877}

yr.parse("1847-48")
// {"start": 1847, "end": 1848}

yr.parse("ca. 1810-20s")
// {"start": 1810, "end": 1829, "circa": true}

yr.parse("18th–19th century")
// {"start": 1700, "end": 1899}

yr.parse("Meiji era")
// {"start": 1868, "end": 1912}
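To illustrate the simplest of these cases, here is a sketch that handles only a single year or an abbreviated range such as "1847-48". `parseSimpleRange` is a hypothetical helper written for this example; it does not reproduce yearrange's parser and returns `null` for everything else (circa, centuries, eras).

```javascript
// Parse "1877", "1847-48", or "1847-1848". Returns null for other input.
function parseSimpleRange(text) {
  const m = /^(\d{4})(?:\s*[-–]\s*(\d{2}|\d{4}))?$/.exec(text.trim());
  if (!m) return null;

  const start = Number(m[1]);
  if (m[2] === undefined) return { start, end: start };

  // A two-digit end borrows the century from the start year.
  const end = m[2].length === 2
    ? Math.floor(start / 100) * 100 + Number(m[2])
    : Number(m[2]);

  // Reject ranges that would run backwards.
  return end >= start ? { start, end } : null;
}

console.log(parseSimpleRange("1877"));
console.log(parseSimpleRange("1847-48"));
console.log(parseSimpleRange("Meiji era"));
```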

Artist Rectification

Miyagawa Shuntei

Printed in 1897

Sold for: $550

Prints sell for $100-$400 individually

True Estimate: $2100 - $8400 *

* You just have to find someone willing to buy them!

• http://ejohn.org/research/

• http://ukiyo-e.org/

• https://github.com/jeresig
