Semantic Search on the Public Web with Creative Commons, 2006.03.07, Mike Linksvayer
Page 1: Semantic Search on the Public Web with Creative Commons

Semantic Search on the Public Web with Creative Commons

2006.03.07

Mike Linksvayer

Page 2: Semantic Search on the Public Web with Creative Commons

Billion$ (0)

Let's get the hype out of the way....

Page 3: Semantic Search on the Public Web with Creative Commons

Billion$ (1)

Let's get the hype out of the way....

Page 4: Semantic Search on the Public Web with Creative Commons

Billion$ (2)

Let's get the hype out of the way....

Page 5: Semantic Search on the Public Web with Creative Commons

Billion$ (3)

This calls for a mashup...

Page 6: Semantic Search on the Public Web with Creative Commons

Billion$ (4)

Page 7: Semantic Search on the Public Web with Creative Commons

Billion$ (5)

Fortunately CC's founders thought of that from the beginning...

Page 8: Semantic Search on the Public Web with Creative Commons

Billion$ (6)

Page 9: Semantic Search on the Public Web with Creative Commons

Billion$ (7)

Page 10: Semantic Search on the Public Web with Creative Commons

About Creative Commons

Page 11: Semantic Search on the Public Web with Creative Commons
Page 12: Semantic Search on the Public Web with Creative Commons

Core Licensing Suite: Creator/Licensor chooses license options

Every Creative Commons license allows the world to copy and distribute a work, provided that the licensee credits the creator/licensor.

In addition, the creator/licensor may apply the following conditions:

NonCommercial

No Derivatives

ShareAlike
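
As a rough illustration of how the optional conditions combine into a license, here is a minimal Python sketch. The function names are hypothetical and not part of any Creative Commons tool; the URL pattern and the default version 2.5 are taken from the license URLs shown elsewhere in this deck. It assumes No Derivatives and ShareAlike are not combined, since ShareAlike only has meaning for derivative works.

```python
def license_code(non_commercial=False, no_derivatives=False, share_alike=False):
    """Compose a license code such as 'by-nc-sa' from the chosen options."""
    if no_derivatives and share_alike:
        # ShareAlike governs derivative works, so it cannot be paired
        # with a condition that forbids derivatives.
        raise ValueError("No Derivatives and ShareAlike cannot be combined")
    parts = ["by"]  # crediting the creator/licensor applies to every license
    if non_commercial:
        parts.append("nc")
    if no_derivatives:
        parts.append("nd")
    if share_alike:
        parts.append("sa")
    return "-".join(parts)


def license_url(code, version="2.5"):
    """Build a license URL following the pattern used in this deck."""
    return f"http://creativecommons.org/licenses/{code}/{version}/"


if __name__ == "__main__":
    code = license_code(non_commercial=True, share_alike=True)
    print(code, license_url(code))
```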

Page 13: Semantic Search on the Public Web with Creative Commons
Page 14: Semantic Search on the Public Web with Creative Commons

Simple License Generator

Page 15: Semantic Search on the Public Web with Creative Commons

Internet Archive

Free Hosting for CC works

http://www.archive.org/

Page 16: Semantic Search on the Public Web with Creative Commons

Creative Commons Metadata

Page 17: Semantic Search on the Public Web with Creative Commons

Creative Commons Metadata Example

<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

<Work rdf:about="http://example.com/article.html">
  <dc:title>An Example Article</dc:title>
  <dc:date>2003-10-01</dc:date>
  <dc:type rdf:resource="http://purl.org/dc/dcmitype/Text" />
  <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
</Work>

<License rdf:about="http://creativecommons.org/licenses/by-nc-sa/2.5/">
  <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
  <permits rdf:resource="http://web.resource.org/cc/Distribution" />
  <requires rdf:resource="http://web.resource.org/cc/Notice" />
  <requires rdf:resource="http://web.resource.org/cc/Attribution" />
  <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
  <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
  <requires rdf:resource="http://web.resource.org/cc/ShareAlike" />
</License>

</rdf:RDF>
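
To show how a tool might read metadata like the example on this slide, here is a minimal Python sketch using only the standard library. It treats the RDF/XML as plain XML, which works for this particular serialization but is not a general RDF parser. The function name `describe` and the trimmed `SAMPLE` string are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

CC = "http://web.resource.org/cc/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed copy of the slide's example (dc:date and dc:type omitted).
SAMPLE = """<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<Work rdf:about="http://example.com/article.html">
  <dc:title>An Example Article</dc:title>
  <license rdf:resource="http://creativecommons.org/licenses/by-nc-sa/2.5/" />
</Work>
<License rdf:about="http://creativecommons.org/licenses/by-nc-sa/2.5/">
  <permits rdf:resource="http://web.resource.org/cc/Reproduction" />
  <permits rdf:resource="http://web.resource.org/cc/Distribution" />
  <requires rdf:resource="http://web.resource.org/cc/Notice" />
  <requires rdf:resource="http://web.resource.org/cc/Attribution" />
  <prohibits rdf:resource="http://web.resource.org/cc/CommercialUse" />
  <permits rdf:resource="http://web.resource.org/cc/DerivativeWorks" />
  <requires rdf:resource="http://web.resource.org/cc/ShareAlike" />
</License>
</rdf:RDF>"""


def describe(rdf_xml):
    """Return (work URL -> license URL, license URL -> terms) from CC RDF/XML."""
    root = ET.fromstring(rdf_xml)
    works = {}
    for work in root.findall(f"{{{CC}}}Work"):
        lic = work.find(f"{{{CC}}}license")
        url = lic.get(f"{{{RDF}}}resource") if lic is not None else None
        works[work.get(f"{{{RDF}}}about")] = url
    licenses = {}
    for lic in root.findall(f"{{{CC}}}License"):
        terms = {"permits": [], "requires": [], "prohibits": []}
        for kind in terms:
            for el in lic.findall(f"{{{CC}}}{kind}"):
                # Keep only the last path segment, e.g. "Reproduction".
                terms[kind].append(el.get(f"{{{RDF}}}resource").rsplit("/", 1)[-1])
        licenses[lic.get(f"{{{RDF}}}about")] = terms
    return works, licenses


if __name__ == "__main__":
    works, licenses = describe(SAMPLE)
    print(works)
    print(licenses)
```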

Page 18: Semantic Search on the Public Web with Creative Commons

Rights Description Use Cases

Discovery

Expression

Commerce

Management(1)

Page 19: Semantic Search on the Public Web with Creative Commons

Rights Description vs. Rights Management(2)

Copy/Use promotion vs. Copy/Use protection

Encourage fans vs. Discourage casual pirates

Resource management vs. Customer management

Web content model vs. 20th century content model

Not mutually exclusive in theory.

Page 20: Semantic Search on the Public Web with Creative Commons

Why Semantic Web?

Small organization, no central registration for every license

Decentralization: Let a thousand search engines bloom; web as API

Existing RDF tools could take advantage of CC RDF

Page 21: Semantic Search on the Public Web with Creative Commons

Why RDF-in-HTML comments? (yuck)

Considered:
• Robots.txt-like
• HTML meta tags
• LINK to external RDF file

RDF-in-HTML comments wins because
• Metadata colocated with human visible HTML, only single copy & paste for licensors
• Full power of RDF
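
A crawler reading RDF-in-HTML comments has to pull the RDF block out of the comment before it can parse it. The following is a minimal Python sketch of that step, not CC's actual implementation; the names `CommentCollector`, `licenses_in_html`, and the `PAGE` sample are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

CC = "http://web.resource.org/cc/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical licensed page with the metadata pasted in as a comment.
PAGE = """<html><body>
<p>This work is licensed under a Creative Commons license.</p>
<!--
<rdf:RDF xmlns="http://web.resource.org/cc/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<Work rdf:about="">
  <license rdf:resource="http://creativecommons.org/licenses/by/2.5/" />
</Work>
</rdf:RDF>
-->
</body></html>"""


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a page."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data)


def licenses_in_html(html):
    """Return license URLs declared in RDF blocks inside HTML comments."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    found = []
    for comment in collector.comments:
        match = re.search(r"<rdf:RDF.*?</rdf:RDF>", comment, re.DOTALL)
        if not match:
            continue
        try:
            root = ET.fromstring(match.group(0))
        except ET.ParseError:
            continue  # skip malformed metadata rather than fail the crawl
        for lic in root.iter(f"{{{CC}}}license"):
            found.append(lic.get(f"{{{RDF}}}resource"))
    return found


if __name__ == "__main__":
    print(licenses_in_html(PAGE))
```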

Page 22: Semantic Search on the Public Web with Creative Commons

CC Search History I

Postgresql/tsearch2/python prototype (early 2004)

Sloooowwwww, but did what a prototype should do

Page 23: Semantic Search on the Public Web with Creative Commons

CC Search History II

CC-Nutch (late 2004)

Nutch aims to be open source search engine comparable to commercial web scale search engines

Built on top of Lucene full text index

CC plugin only ~500 lines of code (not counting UI, CC-required additions to Nutch core)

http://search.creativecommons.org uses Nutch, >1m CC-licensed pages indexed

Page 24: Semantic Search on the Public Web with Creative Commons
Page 25: Semantic Search on the Public Web with Creative Commons

CC Search History III

Yahoo! Search for Creative Commons (early 2005)

Search CC-licensed subset of Yahoo!’s index (~15m* pages)

*very rough guesstimate

Page 26: Semantic Search on the Public Web with Creative Commons
Page 27: Semantic Search on the Public Web with Creative Commons
Page 28: Semantic Search on the Public Web with Creative Commons
Page 29: Semantic Search on the Public Web with Creative Commons

CC Search History IV

Google CC search (November 2005)

Search CC-licensed subset of Google’s index (~45m* pages)

*very rough guesstimate

Page 30: Semantic Search on the Public Web with Creative Commons
Page 31: Semantic Search on the Public Web with Creative Commons
Page 32: Semantic Search on the Public Web with Creative Commons
Page 33: Semantic Search on the Public Web with Creative Commons

CC Search History V (the future)

Better metadata formats

Image and Video search

Derivatives search

Content commerce search

“Live” web search

“Management” (desktop, workgroup)

Semantic mashups

Page 34: Semantic Search on the Public Web with Creative Commons

Future CC metadata formats

“Semantic XHTML” AKA “lowercase semantic web” AKA “microformats” (now)

<a rel="license" href="http://creativecommons.org/licenses/by/2.5/">

RDF/A AKA XHTML2 metadata (in working group)

GRDDL (gleaning resource descriptions from dialects of languages)
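
The rel="license" form is simple enough that a crawler can pick it up without any RDF tooling. Here is a minimal Python sketch of that extraction using the standard library's HTML parser. The class and function names are hypothetical, and accepting `link` elements as well as `a` elements is an assumption of this sketch.

```python
from html.parser import HTMLParser


class RelLicenseParser(HTMLParser):
    """Collects href values of elements whose rel includes "license"."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated list, e.g. rel="nofollow license".
        rel_values = (attr.get("rel") or "").lower().split()
        if "license" in rel_values and attr.get("href"):
            self.licenses.append(attr["href"])


def rel_licenses(html):
    """Return every license URL asserted via rel="license" in the page."""
    parser = RelLicenseParser()
    parser.feed(html)
    parser.close()
    return parser.licenses


if __name__ == "__main__":
    page = '<a rel="license" href="http://creativecommons.org/licenses/by/2.5/">CC BY</a>'
    print(rel_licenses(page))
```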

Page 35: Semantic Search on the Public Web with Creative Commons
Page 36: Semantic Search on the Public Web with Creative Commons
Page 37: Semantic Search on the Public Web with Creative Commons
Page 38: Semantic Search on the Public Web with Creative Commons
Page 39: Semantic Search on the Public Web with Creative Commons

Image and Video search

Better metadata formats

Image and Video search

Derivatives search

Content commerce search

“Live” web search

“Management” (desktop, workgroup)

Semantic mashups

Page 40: Semantic Search on the Public Web with Creative Commons

Searching for Derivative Works

Page 41: Semantic Search on the Public Web with Creative Commons

Creative Commons (0)

Page 42: Semantic Search on the Public Web with Creative Commons

Creative Commons (0)

Page 43: Semantic Search on the Public Web with Creative Commons

Creative Commons (0)

Page 44: Semantic Search on the Public Web with Creative Commons

Creative Commons (0)

Page 45: Semantic Search on the Public Web with Creative Commons

Derivatives search

RDF/XML snippet:

<dc:source rdf:resource="http://ccmixter.org/media/files/victor/3385"/>

Query like Yahoo! link: search or Technorati Cosmos search

source:http://ccmixter.org/media/files/victor/3385

“Who sampled this” as the new “who linked to this”
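
A source: query of this kind amounts to a reverse index over dc:source statements. Here is a minimal Python sketch of such an index, assuming each crawled work carries CC RDF/XML with a dc:source element. The helper names and the example.org remix URLs are hypothetical; only the ccmixter.org URL comes from the slide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

CC = "http://web.resource.org/cc/"
DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def work_rdf(about, source):
    """Build a tiny RDF/XML document for a work that cites one source."""
    return (
        f'<rdf:RDF xmlns="{CC}" xmlns:dc="{DC}" xmlns:rdf="{RDF}">'
        f'<Work rdf:about="{about}"><dc:source rdf:resource="{source}"/></Work>'
        "</rdf:RDF>"
    )


def index_sources(rdf_documents):
    """Map each source URL to the set of works that declare it via dc:source."""
    derived_from = defaultdict(set)
    for doc in rdf_documents:
        root = ET.fromstring(doc)
        for work in root.iter(f"{{{CC}}}Work"):
            about = work.get(f"{{{RDF}}}about")
            for src in work.findall(f"{{{DC}}}source"):
                derived_from[src.get(f"{{{RDF}}}resource")].add(about)
    return derived_from


def who_sampled(index, url):
    """Answer a source:URL query: which works list this URL as a source?"""
    return sorted(index.get(url, ()))


if __name__ == "__main__":
    original = "http://ccmixter.org/media/files/victor/3385"
    docs = [
        work_rdf("http://example.org/remix-a", original),  # hypothetical remix
        work_rdf("http://example.org/remix-b", original),  # hypothetical remix
    ]
    print(who_sampled(index_sources(docs), original))
```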

Page 46: Semantic Search on the Public Web with Creative Commons

Content commerce search

Transaction costs should be low even if rights are reserved

Commercial terms and other commerce described by metadata associated with a work

Find me work I can use at a price I can pay for
• usage rights
• warranty/paper trail (even if rights not reserved)

Reintermediate consumer and creator

Page 47: Semantic Search on the Public Web with Creative Commons

“Live” web search (feeds)

Feeds are explicitly metadata-rich (unlike typical web page)

Existing blog search ignores metadata

Web search will become more like blog search, vice versa?
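
As one illustration of feed metadata a search engine could use, here is a minimal Python sketch that filters RSS items by license. It assumes the feed uses the creativeCommons RSS module (a `creativeCommons:license` element in the namespace shown below). The sample feed, its URLs, and the function name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the creativeCommons RSS module (assumed to be used by the feed).
CC_RSS = "http://backend.userland.com/creativeCommonsRssModule"

FEED = f"""<rss version="2.0" xmlns:creativeCommons="{CC_RSS}">
<channel>
  <title>Example feed</title>
  <item>
    <title>Post one</title>
    <creativeCommons:license>http://creativecommons.org/licenses/by/2.5/</creativeCommons:license>
  </item>
  <item>
    <title>Post two</title>
    <creativeCommons:license>http://creativecommons.org/licenses/by-nc-sa/2.5/</creativeCommons:license>
  </item>
</channel>
</rss>"""


def items_with_license(feed_xml, wanted_license):
    """Return titles of items licensed under wanted_license.

    An item without its own license falls back to the channel-level one.
    """
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        return []
    channel_license = channel.findtext(f"{{{CC_RSS}}}license")
    titles = []
    for item in channel.findall("item"):
        lic = item.findtext(f"{{{CC_RSS}}}license") or channel_license
        if lic is not None and lic.strip() == wanted_license:
            titles.append(item.findtext("title"))
    return titles


if __name__ == "__main__":
    print(items_with_license(FEED, "http://creativecommons.org/licenses/by/2.5/"))
```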

Page 48: Semantic Search on the Public Web with Creative Commons

“Management” (desktop, workgroup)

Desktop search (OS-level)

Content creation and media player integration

XMP

Semantic Wikis

Page 49: Semantic Search on the Public Web with Creative Commons

Semantic mashups

Page 50: Semantic Search on the Public Web with Creative Commons

Issues for Semantic Search on the Public Web

Metadata quality

Trust

Scalability

Usability

Compatibility

Critical mass

State of the art IR works very well – high expectations!

Page 51: Semantic Search on the Public Web with Creative Commons

Semantic Search on the Public Web with Creative Commons

2006.03.07

Mike Linksvayer

Questions, feedback, flames:

[email protected]

http://developer.creativecommons.org