The Internet Is Your New Database: An Introduction To The Semantic Web

An Introduction to the Semantic Web

Will Strinz

The Internet Is Your New Database

What’s this all about then?

Today we have the World Wide Web


It's

Distributed

Accessible using all sorts of devices and software

Document Based

This is very flexible. But

Hard to search

Unstructured and context-less - hard to consume automatically

Easy to share, hard to compose and remix

Full of all sorts of ‘homebrew’ databases and ad-hoc schema


Database software is powerful but rarely interacted with directly

Often accessed through the web, but indirectly

Still ultimately siloed

Hard to compose and remix

What if the internet acted as one big distributed database?

Wat Do?

Move from a web of documents to a web of data

Needs to be

Structured, but still flexible

Distributed

Accessible to machine and human alike

What is this Semantic Web Thing?

Represents information using Subject Predicate Object

Subject       Predicate   Object

Will Strinz   years old   24

Bendyworks    is a        Company

Will Strinz   works at    Bendyworks

What is this Semantic Web Thing?

With URIs and typed literals

Subject                   Predicate               Object

example.org/Will_Strinz   example.org/years_old   24

example.org/Bendyworks    example.org/is_a        example.org/Company

example.org/Will_Strinz   example.org/works_for   example.org/Bendyworks

• Called the Resource Description Framework (RDF)
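To make the triple model concrete, here is a minimal Python sketch of the three statements above. The class names (IRI, Literal, Triple), the helper objects_of, and the http:// scheme on the example.org names are assumptions made for illustration; real RDF libraries have their own types.

```python
from dataclasses import dataclass
from typing import Union

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
EX = "http://example.org/"  # assumed namespace for the example.org names


@dataclass(frozen=True)
class IRI:
    """A resource identified by a URI."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain value plus the datatype that says how to read it."""
    lexical: str
    datatype: str = XSD_STRING


@dataclass(frozen=True)
class Triple:
    subject: IRI
    predicate: IRI
    object: Union[IRI, Literal]


WILL = IRI(EX + "Will_Strinz")

triples = [
    Triple(WILL, IRI(EX + "years_old"), Literal("24", XSD_INTEGER)),
    Triple(IRI(EX + "Bendyworks"), IRI(EX + "is_a"), IRI(EX + "Company")),
    Triple(WILL, IRI(EX + "works_for"), IRI(EX + "Bendyworks")),
]


def objects_of(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]
```

Because every statement has the same three-part shape, a lookup such as objects_of(WILL, IRI(EX + "works_for")) needs no schema beyond the triples themselves.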

Example

Let's start with a resource, or “thing”

We’ll call it http://example.org/Fresh, or ex:Fresh for short

Example

What can we say about ex:Fresh?

S          P           O

ex:Fresh   ex:enjoys   “B-Ball”

ex:Fresh   ex:aunt     ex:Vivian_Banks

[Graph: ex:Fresh -ex:enjoys-> “B-Ball”; ex:Fresh -ex:aunt-> ex:Vivian_Banks]

Example

ex:Vivian_Banks is also a resource

S          P         O

ex:Fresh   ex:aunt   ex:Vivian_Banks

Example

So now we can say things about her too!

S                 P             O

ex:Fresh          ex:aunt       ex:Vivian_Banks

ex:Vivian_Banks   ex:nickname   “Aunt Viv”

[Graph: ex:Fresh -ex:enjoys-> “B-Ball”; ex:Fresh -ex:aunt-> ex:Vivian_Banks; ex:Vivian_Banks -ex:nickname-> “Aunt Viv”]

Vocabularies

We’ve been defining our own predicates and objects so far

Could add details about each predicate

Isn’t this a waste of time?

Yes! Use RDF Vocabularies

Define new namespaces, terms, and objects

‘Imported’ simply by reference

Are described in RDF

FOAF Vocabulary

“Friend Of A Friend”

Located at http://xmlns.com/foaf/0.1/

foaf:name a rdf:Property, owl:DatatypeProperty;
    rdfs:label "name";
    rdfs:comment "A name for some thing.";
    rdfs:domain owl:Thing;
    rdfs:isDefinedBy foaf:;
    rdfs:range rdfs:Literal;
    rdfs:subPropertyOf rdfs:label;
    sw_ns:term_status "testing" .

http://xmlns.com/foaf/0.1/
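"Imported simply by reference" comes down to string concatenation: a prefixed name such as foaf:name is shorthand for the namespace URI followed by the local name. A minimal Python sketch of that expansion follows; the function name expand and the PREFIXES table are assumptions for illustration, not part of any RDF library.

```python
# Prefix table: the FOAF namespace is the one given above, and
# ex: is the example namespace used throughout these slides.
PREFIXES = {
    "ex": "http://example.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI by plain concatenation."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


# Pitfall: the namespace must end in '/' or '#'. Without it the
# local name is glued straight onto the host name.
broken = expand("ex:Fresh", {"ex": "http://example.org"})
```

Here expand("foaf:name") gives http://xmlns.com/foaf/0.1/name, while the broken value is http://example.orgFresh, which is why namespace declarations normally end with a slash or hash.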

Example

S          P                 O

ex:Fresh   foaf:based_near   ex:West_Philadelphia

ex:Fresh   foaf:age          22

ex:Fresh   foaf:name         “Will ‘The Fresh Prince’ Smith”

[Graph: ex:Fresh now has foaf:based_near, foaf:age, and foaf:name edges alongside ex:enjoys and ex:aunt; ex:Vivian_Banks keeps ex:nickname “Aunt Viv”]


Let's look at what we have

Human readable

Simple and flexible

Rigid when necessary

Atomic statements

Dual representation

Ideally

Dereferenceable

Structured, Machine understandable


Serialization

Multiple formats

All of which are

Standardized

Interoperable

Information preserving

File or Triple Store

Serialization

N-Triples

<http://example.org/Fresh> <http://xmlns.com/foaf/0.1/name> "Will 'The Fresh Prince' Smith" .
<http://example.org/Fresh> <http://example.org/enjoys> "B-Ball" .
<http://example.org/Fresh> <http://xmlns.com/foaf/0.1/age> "22"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://example.org/Fresh> <http://example.org/aunt> <http://example.org/Vivian_Banks> .
<http://example.org/Fresh> <http://xmlns.com/foaf/0.1/based_near> <http://example.org/West_Philadelphia> .
<http://example.org/Vivian_Banks> <http://example.org/nickname> "Aunt Viv" .

Turtle

@prefix ex: <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

ex:Fresh foaf:name "Will 'The Fresh Prince' Smith" ;
    ex:enjoys "B-Ball" ;
    foaf:age 22 ;
    ex:aunt ex:Vivian_Banks ;
    foaf:based_near ex:West_Philadelphia .

ex:Vivian_Banks ex:nickname "Aunt Viv" .

Serialization

JSON-LD

{
  "@context": {
    "ex": "http://example.org/",
    "foaf": "http://xmlns.com/foaf/0.1/"
  },
  "@graph": [
    {
      "@id": "ex:Vivian_Banks",
      "ex:nickname": "Aunt Viv"
    },
    {
      "@id": "ex:Fresh",
      "ex:aunt": { "@id": "ex:Vivian_Banks" },
      "ex:enjoys": "B-Ball",
      "foaf:age": {
        "@value": "22",
        "@type": "http://www.w3.org/2001/XMLSchema#integer"
      },
      "foaf:based_near": { "@id": "ex:West_Philadelphia" },
      "foaf:name": "Will 'The Fresh Prince' Smith"
    }
  ]
}

Serialization

RDF/XML

<?xml version='1.0' encoding='utf-8' ?>
<rdf:RDF xmlns:ex='http://example.org/'
         xmlns:foaf='http://xmlns.com/foaf/0.1/'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:xsd='http://www.w3.org/2001/XMLSchema#'>
  <rdf:Description rdf:about='http://example.org/Vivian_Banks'>
    <ex:nickname>Aunt Viv</ex:nickname>
  </rdf:Description>
  <rdf:Description rdf:about='http://example.org/Fresh'>
    <ex:aunt rdf:resource='http://example.org/Vivian_Banks' />
    <ex:enjoys>B-Ball</ex:enjoys>
    <foaf:age rdf:datatype='http://www.w3.org/2001/XMLSchema#integer'>22</foaf:age>
    <foaf:based_near rdf:resource='http://example.org/West_Philadelphia' />
    <foaf:name>Will 'The Fresh Prince' Smith</foaf:name>
  </rdf:Description>
</rdf:RDF>

Serialization

Other Formats

TriG / TriX

RDFa

N3

N-Quads

Mappings onto SQLite, Mongo, etc
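"Information preserving" can be checked with a round trip: write the graph out, read it back, and compare. The Python sketch below does this for a small subset of N-Triples (IRIs, plain literals, typed literals, and only the \" and \\ escapes). The tagged-tuple term model and every function name are assumptions for illustration; a real parser handles much more of the grammar.

```python
import re

EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def iri(value):
    return ("iri", value)


def lit(lexical, datatype=None):
    return ("lit", lexical, datatype)


def term_to_nt(term):
    """Serialize one term in N-Triples syntax."""
    if term[0] == "iri":
        return f"<{term[1]}>"
    _, lexical, datatype = term
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    quoted = f'"{escaped}"'
    return f"{quoted}^^<{datatype}>" if datatype else quoted


def to_ntriples(triples):
    """One statement per line; sorted so the output is stable."""
    lines = [f"{term_to_nt(s)} {term_to_nt(p)} {term_to_nt(o)} ."
             for s, p, o in triples]
    return "\n".join(sorted(lines))


# Subject IRI, predicate IRI, then either an IRI or a quoted literal
# with an optional ^^<datatype>, then the closing dot.
LINE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"(?:\^\^<([^>]*)>)?)\s*\.$'
)


def parse_ntriples(text):
    """Parse the subset written by to_ntriples back into a set of triples."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"unsupported N-Triples line: {line!r}")
        s, p, o_iri, o_lexical, o_datatype = match.groups()
        if o_iri is not None:
            obj = iri(o_iri)
        else:
            # Undo the two escapes the serializer produces.
            obj = lit(re.sub(r"\\(.)", r"\1", o_lexical), o_datatype)
        triples.add((iri(s), iri(p), obj))
    return triples


DATA = {
    (iri(EX + "Fresh"), iri(FOAF + "name"), lit("Will 'The Fresh Prince' Smith")),
    (iri(EX + "Fresh"), iri(FOAF + "age"), lit("22", XSD_INTEGER)),
    (iri(EX + "Fresh"), iri(EX + "aunt"), iri(EX + "Vivian_Banks")),
    (iri(EX + "Vivian_Banks"), iri(EX + "nickname"), lit("Aunt Viv")),
}
```

With these definitions, parse_ntriples(to_ntriples(DATA)) == DATA holds: the text form carries the same statements as the in-memory set, regardless of line order.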

Querying

What new language do I have to learn just to query?

SPARQL!

But stick with me, it's not so bad

Syntax similar to SQL, but joins are free!

Much more consistent across endpoints and triple stores

Querying

How old is Will ‘The Fresh Prince’ Smith?

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex: <http://example.org/>

SELECT ?age WHERE {
  ex:Fresh foaf:age ?age
}

=> 22

Querying

What is Will ‘The Fresh Prince’ Smith’s Aunt’s nickname?

SELECT ?nick WHERE {
  ex:Fresh ex:aunt ?aunt .
  ?aunt ex:nickname ?nick .
}

=> “Aunt Viv”

Querying

What do we know about Will ‘The Fresh Prince’ Smith?

SELECT ?prop WHERE {
  ex:Fresh ?prop ?value .
}

=> foaf:name, ex:enjoys, foaf:age, ex:aunt, foaf:based_near
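The three queries above are all triple patterns with variables, and "joins are free" because a variable shared between two patterns must take the same value in both. The Python sketch below shows the idea on the example data. It is a toy matcher, not how a SPARQL engine is implemented, and the names match_pattern and query are assumptions for illustration; prefixed names are kept as plain strings and the IRI/literal distinction is ignored.

```python
DATA = {
    ("ex:Fresh", "foaf:name", "Will 'The Fresh Prince' Smith"),
    ("ex:Fresh", "ex:enjoys", "B-Ball"),
    ("ex:Fresh", "foaf:age", 22),
    ("ex:Fresh", "ex:aunt", "ex:Vivian_Banks"),
    ("ex:Fresh", "foaf:based_near", "ex:West_Philadelphia"),
    ("ex:Vivian_Banks", "ex:nickname", "Aunt Viv"),
}


def is_var(term):
    """Variables are strings starting with '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend a binding so the pattern equals the triple, or return None."""
    extended = dict(binding)
    for wanted, actual in zip(pattern, triple):
        if is_var(wanted):
            if wanted in extended and extended[wanted] != actual:
                return None  # conflicts with an earlier pattern
            extended[wanted] = actual
        elif wanted != actual:
            return None
    return extended


def query(patterns, triples):
    """Match patterns one after another, carrying bindings forward."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


age = query([("ex:Fresh", "foaf:age", "?age")], DATA)
nick = query([("ex:Fresh", "ex:aunt", "?aunt"),
              ("?aunt", "ex:nickname", "?nick")], DATA)
props = {b["?prop"] for b in query([("ex:Fresh", "?prop", "?value")], DATA)}
```

The second query is the join: ?aunt is bound by the first pattern and reused as the subject of the second, with no join condition written out.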


Querying

Many other features in SPARQL

• Update, Insert, and Delete

• Logical / Regex filters

• Subqueries

• Order and Offset

• Aggregates

• Data Types

• Construct, Describe, and Ask modes

• Property Paths

• Math functions

Getting Connected

So, RDF has some cool features

• Flexible yet structured

• Atomic

• Graph Based

• W3C Backed

• Human Friendly

• Queryable

• Serializable

• Extensible

And I can say things about Will Smith with it

But how different is that really from any other database?

Connections!

Getting Connected

Remember our ex:Fresh URI?

http://example.org/Fresh

It's not dereferenceable.

What if instead we used

http://dbpedia.org/resource/Will_Smith_(character)
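Dereferencing means asking the URI itself for data. A common way to do that is HTTP content negotiation: send an Accept header naming an RDF format and let the server pick the representation. The Python sketch below assumes the server honours Accept: text/turtle and that the resource still exists at this address; the function names are hypothetical, and fetch_rdf needs network access.

```python
from urllib.request import Request, urlopen

RESOURCE = "http://dbpedia.org/resource/Will_Smith_(character)"


def build_rdf_request(uri, media_type="text/turtle"):
    """Prepare a GET request that asks for RDF instead of an HTML page."""
    return Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri, media_type="text/turtle", timeout=10):
    """Dereference the URI and return the response body as text.

    Assumes the server supports content negotiation for media_type.
    """
    with urlopen(build_rdf_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_rdf_request(RESOURCE)
```

The same URI opened in a browser asks for HTML instead, which is the "dual representation" idea: one name, one view for people and one for machines.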



Getting Connected

Suddenly we have more information

dbpedia:Will_Smith_(character) a yago:FictionalCharacter,
        dbpedia-owl:FictionalCharacter,
        yago:FictionalVersionsOfRealPeople,
        yago:ImaginaryBeing109483738,
        dbpedia-owl:Person,
        dbpedia-owl:Agent,
        yago:SitcomCharacters,
        owl:Thing,
        foaf:Person;
    rdfs:label "Will Smith (character)"@en;

Getting Connected

A lot more information

    dbpedia-owl:abstract "William \"Will\" Smith (born July 3, 1973) is a fictional character in the NBC television series, The Fresh Prince of Bel-Air."@en;
    dbpedia-owl:birthDate "1973-07-02+02:00"^^xsd:date;
    dbpedia-owl:portrayer dbp:Will_Smith;
    dbpedia-owl:series dbp:The_Fresh_Prince_of_Bel-Air;
    dbpedia-owl:wikiPageExternalLink <http://www.imdb.com/character/ch0020905/>;
    dbpprop:born "1973-07-02+02:00"^^xsd:date;
    dbpprop:family "Janice Smith"@en,
        "Hilary Banks"@en,
        "Carlton Banks"@en,
        "Vy Smith-Wilkes"@en,
        "Lisa Wilkes"@en,
        "Lou Smith"@en,
        "Ashley Banks"@en,
        "Helen Smith"@en,
        "Fred Wilkes"@en,
        "Phillip Banks"@en,
        "Vivian Banks"@en;

Getting Connected

    dbpprop:first "\"The Fresh Prince Project\""@en;
    dbpprop:hasPhotoCollection informatik:Will_Smith_(character);
    dbpprop:name "Will Smith"@en;
    dbpprop:nicknames "Master William, Prince, Fresh Prince, Will"@en;
    dbpprop:portrayer dbp:Will_Smith;
    dbpprop:series dbp:The_Fresh_Prince_of_Bel-Air;
    dbpprop:wordnet_type wn:synset-character-noun-4;
    dcterms:subject category:Fictional_characters_introduced_in_1990,
        category:Fictional_African-American_people,
        category:Fictional_versions_of_real_people,
        dbpcategory:Fictional_characters_from_Philadelphia,_Pennsylvania,
        category:Sitcom_characters;
    rdfs:comment "William \"Will\" Smith (born July 3, 1973) is a fictional character in the NBC television series, The Fresh Prince of Bel-Air."@en;
    owl:sameAs <http://rdf.freebase.com/ns/m.0417_vv>,
        <http://yago-knowledge.org/resource/Will_Smith_(character)>,
        <http://dbpedia.org/resource/Will_Smith_(character)>;
    foaf:isPrimaryTopicOf <http://en.wikipedia.org/wiki/Will_Smith_(character)>;
    foaf:name "Will Smith"@en .

Getting Connected

Human AND Machine Readable

Getting Connected

DBpedia has a SPARQL Endpoint

Lots of fun queries

SELECT * WHERE {
  ?episode dc:subject dbpcategory:The_Simpsons_%28season_14%29_episodes .
  ?episode dbpedia2:blackboard ?chalkboard_gag .
}
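A SPARQL endpoint is queried over plain HTTP: the query text goes in a query parameter and the Accept header selects the result format. The Python sketch below builds such a request and flattens a result document in the standard SPARQL JSON results layout. The endpoint address https://dbpedia.org/sparql, the generic query, and the function names are assumptions for illustration, and SAMPLE is a hand-written placeholder, not real DBpedia output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"  # assumed endpoint address
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_request(endpoint, query):
    """Encode the query into the URL and ask for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(results_json):
    """Turn a SPARQL JSON results document into a list of plain dicts."""
    document = json.loads(results_json)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in document["results"]["bindings"]
    ]


# Hand-written placeholder in the SPARQL JSON results layout.
SAMPLE = json.dumps({
    "head": {"vars": ["s"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/Fresh"}},
    ]},
})

request = build_sparql_request(DBPEDIA_ENDPOINT, QUERY)
sample_rows = rows(SAMPLE)
```

Sending the request with urllib.request.urlopen and passing the body to rows would give one dict per result row, keyed by variable name.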

Getting Connected

Larger goal of connecting the whole web

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A "Semantic Web", which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The "intelligent agents" people have touted for ages will finally materialize. — Sir Tim Berners-Lee (1999)

http://dgallery.s3.amazonaws.com/lod-cloud_colored.png

Going Further

Templating with RDFa or XSLT

Ontologies / OWL

Reasoning

Libraries

Ruby-RDF

Spira

Publisci and Publisci Server

Tools and interfaces - Build them!

Disclaimers

Relatively young

Less engineering time

SPARQL changing and not fully implemented in all triple stores

Flexibility has its downsides; Garbage in Garbage out

No agreed-upon method for schema constraints

End

Thanks Bendyconf Attendees and Organizers!

Questions?
