rNews: Embedded Data For The News Industry
Nov 18, 2014
IPTC Document:
DIR1004.2-AuMschedule.doc Page 1 of 2 © 2010 International Press Telecommunications Council | www.iptc.org
Document history [Document URN: urn:iptc:workdoc:dir:1004:2 ]
Revision Issue Date Pages Author (revised by) Remark
1 2010-09-22 2 Michael Steidl
2 2010-10-11 2 Michael Steidl Revised after Chairs call
Hello!
§ Stuart Myles – @smyles, Lead of the IPTC Semantic Web WG & Deputy Director of Schema Standards, The Associated Press
§ Evan Sandhaus – @kansandhaus, Lead Architect, Semantic Platforms, The New York Times Company
§ Andreas Gebhard – @agebhard, Managing Editor, Getty Images
...And 50 Others
[Figure: a news page with its STORY and PHOTO regions labeled]
Story components which are obvious to a person...
...are not so obvious to a machine.
The Problem of Structured Data
§ Modern web sites are built with a 3-tier architecture
• Data Tier: database where content lives.
• Presentation Tier: HTML document that is sent to the user.
• Logic Tier: software that reads from the Data Tier and outputs the Presentation Tier.
Data Tier -> Logic Tier -> Display Tier
The Problem Of Structured Data: Continued
Data Tier:

Label     Type    Value
id        number  1248069162607
Headline  text    New Web Code Draws Concern...
Byline    text    By TANZINA VEGA
Date      date    20101010
Body      text    In the next few years, a powerful...
Length    number  1123
Tag       text    Privacy
Tag       text    Computers and the Internet
Tag       text    Web Browsers

Display Tier:

<html>
  <head>
    <title>New Web Code Draws Concern...</title>
  </head>
  <body>
    <div>New Web Code Draws Concern...</div>
    <div>By TANZINA VEGA</div>
    <div>October 10, 2010</div>
    <div>In the next few years, a powerful...</div>
  </body>
</html>

Data Tier -> Logic Tier -> Display Tier
§ Content is very well structured on the Data Tier, but all of this structure is lost in translation to the Presentation Tier.
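The loss of structure can be sketched in a few lines of Python. This is only an illustration: the `record` dictionary mirrors the Data Tier table on this slide, and `render` is a hypothetical stand-in for a Logic Tier, not any real publishing system.

```python
from datetime import datetime
from html import escape

# Hypothetical Data Tier record, mirroring the table on this slide.
record = {
    "id": 1248069162607,
    "Headline": "New Web Code Draws Concern...",
    "Byline": "By TANZINA VEGA",
    "Date": "20101010",
    "Body": "In the next few years, a powerful...",
    "Length": 1123,
    "Tag": ["Privacy", "Computers and the Internet", "Web Browsers"],
}

def render(rec):
    """Toy Logic Tier: turn a labelled record into anonymous <div> elements."""
    date = datetime.strptime(rec["Date"], "%Y%m%d").strftime("%B %d, %Y")
    shown = [rec["Headline"], rec["Byline"], date, rec["Body"]]
    divs = "".join(f"<div>{escape(v)}</div>" for v in shown)
    title = escape(rec["Headline"])
    return f"<html><head><title>{title}</title></head><body>{divs}</body></html>"

page = render(record)
print(page)
```

The values survive in `page`, but the labels, the types, the id, the length and the tags do not.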
The Problem Of Structured Data: Continued
Display Tier:

<html>
  <head>
    <title>New Web Code Draws Concern...</title>
  </head>
  <body>
    <div>New Web Code Draws Concern...</div>
    <div>By TANZINA VEGA</div>
    <div>October 10, 2010</div>
    <div>In the next few years, a powerful...</div>
  </body>
</html>

Display Tier = ?
§ Search engines, social networks, aggregators and other sites only see the Display Tier, and cannot leverage the underlying structure of the data.
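To make the consumer's view concrete, here is a minimal sketch, using Python's standard `html.parser`, of what a naive crawler can recover from the Display Tier. The `DivText` class is a hypothetical helper written for this illustration.

```python
from html.parser import HTMLParser

# The Display Tier markup from this slide, as a single string.
DISPLAY_TIER = (
    "<html><head><title>New Web Code Draws Concern...</title></head><body>"
    "<div>New Web Code Draws Concern...</div>"
    "<div>By TANZINA VEGA</div>"
    "<div>October 10, 2010</div>"
    "<div>In the next few years, a powerful...</div>"
    "</body></html>"
)

class DivText(HTMLParser):
    """Collect the text inside each <div>; nothing says what each text means."""

    def __init__(self):
        super().__init__()
        self.in_div = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.in_div = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_div = False

    def handle_data(self, data):
        if self.in_div and data.strip():
            self.texts.append(data.strip())

crawler = DivText()
crawler.feed(DISPLAY_TIER)
print(crawler.texts)
```

The result is a list of four unlabelled strings. Telling the headline from the byline or the date would require guessing from position or wording.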
Semantic Markup Standards
§ Microformats: first, simple, rigid
§ RDFa: official, complex, OpenGraph
§ Microdata: unofficial, flexible, Schema.org
§ JSON: official, developers, external
rNews
rNews Defined
rNews is a data model for embedding machine-readable publishing metadata in web documents and a set of suggested implementations.
rNews is a data model
[Diagram: the rNews data model]
Classes: NewsItem, Article, Comment, ImageObject, VideoObject, AudioObject, Organization, Person, Location, Concept, PostalAddress, GeoCoordinates
Properties: comment, associatedMedia, associatedArticle, about, mentions, address, geoCoordinates, name, creator, editor, contributor, provider, copyrightHolder, accountablePerson, sourceOrganization
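As a rough illustration of what "data model" means here, a few of the classes and properties named in the diagram can be sketched as Python dataclasses. The class and property names come from the diagram and the later slides, but how they connect is an assumption made for this sketch: that Article, ImageObject and Comment specialise NewsItem, and which property sits on which class. The rNews specification, not this sketch, is the authority.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

@dataclass
class Person:
    name: str

@dataclass
class Organization:
    name: str

@dataclass
class Concept:
    name: str

@dataclass
class NewsItem:
    # Assumed placement of properties; see the rNews spec for the real model.
    creator: Optional[Union[Person, Organization]] = None
    about: List[Concept] = field(default_factory=list)
    mentions: List[Concept] = field(default_factory=list)

@dataclass
class ImageObject(NewsItem):
    pass

@dataclass
class Comment(NewsItem):
    text: str = ""  # hypothetical field, not named in the diagram

@dataclass
class Article(NewsItem):
    headline: str = ""
    associatedMedia: List[ImageObject] = field(default_factory=list)
    comment: List[Comment] = field(default_factory=list)

article = Article(
    headline="Allies Are Split on Goal and Exit Strategy in Libya",
    creator=Person("STEVEN LEE MYERS"),
    associatedMedia=[ImageObject(creator=Person("Goran Tomasevic"))],
)
print(article.creator.name, "/", article.associatedMedia[0].creator.name)
```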
for embedding machine-readable publishing metadata in web documents
Headline, Byline, Tags, Creator...
and a set of suggested implementations
§ RDFa: Today
§ Microdata: Very Soon
§ JSON: Maybe?
rNews - Working Example
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head></head>
<body>
  <div>
    <div>
      <div>Allies Are Split...</div>
      <div>NATO Takes Command</div>
      <div>
        <img src="img/libya_sample_reuters.jpg"/>
        <div>Credit: Goran Tomasevic/Reuters</div>
        <div>Rebel fighters take...</div>
      </div>
      <div>By STEVEN LEE MYERS</div>
      <div>WASHINGTON | March 24, 2011</div>
      <div>
        <p>Having largely succeeded...</p>
      </div>
      <div>
        <p><a href="http://www.nytimes.com/content/help/rights/copyright/copyright-notice.html">
          © Copyright 2011
        </a><span>The New York Times Company</span></p>
        <p><a href="http://www.nytimes.com/ref/membercenter/help/agree.html">
          Disclaimer
        </a></p>
      </div>
    </div>
    <div>
      <div>
        <div>Section</div>
        <div>World</div>
      </div>
      <div>Tags</div>
      <div>
        <div>
          <div>People</div>
          <div>Qaddafi, Muammar el-</div>
        </div>
      </div>
      <div>
        <div>Discussion (3)</div>
        <div>
          <div>So the question is..."</div>
          <div>
            <a href="http://timespeople.nytimes.com/view/user/27242827/activities.html">Chuck</a></div>
          <div>March 25th, 2011 8:27 am</div>
        </div>
      </div>
    </div>
  </div>
</body>
</html>
HTML 5 Microdata
<!DOCTYPE HTML>
<html itemscope itemtype="http://schema.org/NewsArticle">
<head>
  <style type="text/css">@import url(css/iptc_times2.css);</style>
  <meta itemprop="dateCreated" content="2011-03-23"/>
  <meta itemprop="description" content="The questions about the command..."/>
  <meta itemprop="inLanguage" content="en-US"/>
  <meta itemprop="thumbnailUrl" content="http://graphics8.nytimes.com/images/common/icons/t_wb_75.gif"/>
  <meta itemprop="genre" content="Current"/>
  <meta itemprop="id" content="1248069687395"/>
  <meta itemprop="version" content="2"/>
  <meta itemprop="publishingPrinciples" content="http://www.nytco.com/press/ethics.html"/>
  <meta itemprop="wordCount" content="879"/>
</head>
<body>
  <div style="height:900px" class="article">
    <div class="a_column">
      <div itemprop="headline" class="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
      <div itemprop="alternativeHeadline" class="rider">NATO Takes Command</div>
      <div itemprop="associatedMedia" itemscope itemtype="http://schema.org/ImageObject">
        <img itemprop="URL" class="image" src="img/libya_sample_reuters.jpg"/>
        <div class="image_credit">Credit:
          <span itemprop="creator" itemscope itemtype="http://schema.org/Person">
            <span itemprop="name">Goran Tomasevic</span>
          </span>
          /
          <span itemprop="sourceOrganization" itemscope itemtype="http://schema.org/Organization">
            <span itemprop="name">Reuters</span>
            <meta itemprop="tickerSymbol" content="NYSE TRI"/>
          </span>
        </div>
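A consumer could read these annotations back out with very little code. The sketch below is a deliberately naive scan for `itemprop` attributes using Python's standard `html.parser`; it is not a conforming microdata parser (it ignores item nesting), and `SAMPLE` is a shortened adaptation of the slide's markup, with the scope moved to a `<div>` for brevity.

```python
from html.parser import HTMLParser

# Shortened adaptation of the Microdata example on this slide.
SAMPLE = """
<div itemscope itemtype="http://schema.org/NewsArticle">
  <meta itemprop="dateCreated" content="2011-03-23"/>
  <div itemprop="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
  <div itemprop="alternativeHeadline">NATO Takes Command</div>
</div>
"""

class ItempropScanner(HTMLParser):
    """Collect (itemprop, value) pairs as a flat list, ignoring nesting."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property still waiting for its text value

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        if prop is None:
            return
        if "content" in a:
            # <meta itemprop=... content=...> carries its value inline.
            self.pairs.append((prop, a["content"]))
        elif "itemscope" not in a:
            # Otherwise the value is the element's text content.
            self._pending = prop

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None

scanner = ItempropScanner()
scanner.feed(SAMPLE)
for prop, value in scanner.pairs:
    print(prop, "=", value)
```

Unlike the unannotated page, every value now arrives with its label attached.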
RDFa
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns:rnews="http://dec.iptc.org/rnews/0.1/">
<head>
  <style type="text/css">@import url(css/iptc_times2.css);</style>
</head>
<body>
  <div class="article" style="height:623px">
    <div class="a_column">
      <div property="rnews:headline" class="headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
      <div class="rider">NATO Takes Command</div>
      <div class="main_image">
        <img class="image" src="img/libya_sample_reuters.jpg"/>
        <div class="image_credit">Credit: Goran Tomasevic/Reuters</div>
        <div class="image_caption">
          Rebel fighters take cover during a shelling near Ajdabiyah, Libya on Thursday.
        </div>
      </div>
      <div rel="rnews:createdBy" class="byline">By
        <span about="http://demo.iptc.org/per/steven_lee_myers" typeof="rnews:Person">
          <span property="rnews:name">STEVEN LEE MYERS</span>
        </span>
      </div>
      <div class="publication_date">
        <span property="rnews:dateline">WASHINGTON</span>
        |
        <span property="rnews:dateCreated" content="2011-03-24">March 24, 2011</span>
      </div>
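The RDFa annotations can be read back in a similar way. The sketch below, again built on Python's standard `html.parser`, scans for `property` attributes and expands the `rnews:` prefix using the `xmlns:rnews` declaration. It is a simplification for illustration, not an RDFa processor: it ignores `about`, `rel` and `typeof`, so it yields predicate/value pairs rather than full triples. `SAMPLE` is a shortened adaptation of the slide's markup.

```python
from html.parser import HTMLParser

# Shortened adaptation of the RDFa example on this slide.
SAMPLE = """
<div xmlns:rnews="http://dec.iptc.org/rnews/0.1/">
  <div property="rnews:headline">Allies Are Split on Goal and Exit Strategy in Libya</div>
  <span property="rnews:dateline">WASHINGTON</span>
  <span property="rnews:dateCreated" content="2011-03-24">March 24, 2011</span>
</div>
"""

class RdfaPropertyScanner(HTMLParser):
    """Collect (predicate IRI, literal) pairs from `property` attributes."""

    def __init__(self):
        super().__init__()
        self.prefixes = {}
        self.pairs = []
        self._pending = None  # predicate still waiting for its text value

    def _expand(self, curie):
        # Turn "rnews:headline" into a full IRI via the declared prefix.
        prefix, _, local = curie.partition(":")
        return self.prefixes.get(prefix, prefix + ":") + local

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        for name, value in a.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        prop = a.get("property")
        if prop is None:
            return
        if "content" in a:
            # A machine-readable value overrides the displayed text.
            self.pairs.append((self._expand(prop), a["content"]))
        else:
            self._pending = self._expand(prop)

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None

scanner = RdfaPropertyScanner()
scanner.feed(SAMPLE)
for predicate, value in scanner.pairs:
    print(predicate, "=", value)
```

Note how `content="2011-03-24"` gives the machine an unambiguous date while the reader still sees "March 24, 2011".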
The Way to rNews
2010
§ June: Genesis of rNews - Evan at SemTech 2010
§ November 5 - Rome: chartered
§ internal discussions about NYT draft
2011
§ March 9 - Dubai: rNews 0.1
§ lots of feedback, changes and additions
§ June 9 - Berlin: rNews 0.5
§ June 28: rNews 0.6
§ September 6: rNews 0.7 [aligned w/ schema.org]
§ October 7 - Vienna: rNews 1.0
Engaging Our Community
Feedback we incorporated...
§ In Person
• 3 Meetups: New York, Berlin, London
• Over a dozen one-on-one meetings with leading media and technology companies.
§ Online
• rnews.org forum
• Numerous blog posts
§ In the Standards Community
• W3C Community Group
• Media Standards Trust
Feedback we incorporated...
[Diagram: geographic additions to the data model]
Classes: Location, GeoCoordinates
Properties: point, circle, elevation, polygon, box, line, latitude, longitude, altitude
Feedback we incorporated...
[Diagram: NewsItem with an editor property pointing to Person]
rNews Benefits
Or Why You Should Care About rNews
Benefit #1: Better Links
[Figure: two panels, "With Structured Data" and "No Structured Data"]
Benefit #2: Better Analytics
§ JavaScript can extract richer news metadata
§ Analytics per item, not just per page
Benefit #3: Better Ad Placement
§ Leverage metadata, not just text
§ Avoid unfortunate juxtapositions
rNews as a news API
§ Level the Playing Field
§ Encourage Open Innovation
How Can You Help Us Get to rNews 1.0?
§ Check out the rNews 0.7 spec
§ Mark up some pages using rNews
§ Extract rNews properties using your favourite distiller
§ Dream up The Next Metadata Killer App™
Let us know what you think
Let us know how we can help
@smyles • @agebhard • @kansandhaus
rNews
Thank You