Big Linked Data
Presented by: Barry Norton, Ontotext AD
Aims for curriculum
1. Show realistic solutions
2. Use real data
3. Use real tools
4. Show scalable solutions
5. Eat the dog’s food
EUCLID – a Curriculum for Big Linked Data
Realistic Solution
[Figure: architecture of the example application, with these labelled parts:
• Data acquisition: metadata and musical content from streaming providers and downloads; physical wrapper, LD wrapper, R2R transformation
• Integrated dataset: interlinking, cleansing, vocabulary mapping
• LD dataset access: SPARQL endpoint, publishing, RDFa, RDF/XML
• Application: visualization module, analysis & mining module
• Other content]
Real Data
EUCLID - Providing Linked Data
• No pizza
• No wine
• No Protégé
• MusicBrainz dataset:
• Music Ontology:
Real Tools
• Admittedly simple start
• Industry-strength by Module 2
• All tools explained by screencast (for example, one screencast explains how Exercise 1 was created)
• The data of interest may be stored in a wide range of formats:
  Spreadsheets or tabular data
  Databases
  Text
• Several tools support the process of mining data from different repositories, for example:
  R2RML
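As a rough illustration of lifting tabular data into Linked Data, the sketch below turns CSV rows into N-Triples lines. It is not one of the tools covered in the curriculum: the sample rows, the `http://example.org/artist/` namespace and the choice of `foaf:name` as the property are all assumptions made for this example.

```python
import csv
import io

# Hypothetical sample data and namespace, for illustration only.
SAMPLE_CSV = "id,name\n1,Example Artist\n2,Another Artist\n"
BASE = "http://example.org/artist/"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text):
    """Turn each CSV row into one N-Triples line giving the artist's name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        subject = f"<{BASE}{row['id']}>"
        literal = '"' + escape_literal(row["name"]) + '"'
        lines.append(f"{subject} <{FOAF_NAME}> {literal} .")
    return lines


if __name__ == "__main__":
    for line in rows_to_ntriples(SAMPLE_CSV):
        print(line)
```

A mapping language such as R2RML expresses the same row-to-triple idea declaratively, so the mapping can be maintained separately from the code that executes it.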
Scalable Solutions
• MusicBrainz RDF derived via R2RML:
lb:artist_member a rr:TriplesMap ;
  rr:logicalTable [ rr:sqlQuery
    """SELECT a1.gid, a2.gid AS band
       FROM artist a1
       INNER JOIN l_artist_artist ON a1.id = l_artist_artist.entity0
       INNER JOIN link ON l_artist_artist.link = link.id
       INNER JOIN link_type ON link_type = link_type.id
       INNER JOIN artist a2 ON l_artist_artist.entity1 = a2.id
       WHERE link_type.gid='5be4c609-9afa-4ea0-910b-12ffb71e3821'""" ] ;
  rr:subjectMap [ rr:template "http://musicbrainz.org/artist/{gid}#_" ] ;
  rr:predicateObjectMap [
    rr:predicate mo:member_of ;
    rr:objectMap [ rr:template "http://musicbrainz.org/artist/{band}#_" ;
                   rr:termType rr:IRI ] ] .
300M Triples
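To make the mapping above concrete, here is a minimal sketch of what an R2RML processor does with each row returned by the SQL query: it fills the `{gid}` and `{band}` placeholders of the two templates and emits one `mo:member_of` triple. This is not a full R2RML engine. The two GIDs in the sample row are made up, and the expansion of the `mo:` prefix to `http://purl.org/ontology/mo/` is an assumption, since the slide does not show its prefix declarations.

```python
import re
from urllib.parse import quote

# Templates copied from the mapping on the slide.
SUBJECT_TEMPLATE = "http://musicbrainz.org/artist/{gid}#_"
OBJECT_TEMPLATE = "http://musicbrainz.org/artist/{band}#_"
# Assumption: mo: is the Music Ontology namespace.
MEMBER_OF = "http://purl.org/ontology/mo/member_of"


def expand_template(template, row):
    """Replace each {column} placeholder with the percent-encoded column value."""
    def substitute(match):
        return quote(str(row[match.group(1)]), safe="")
    return re.sub(r"\{([^{}]+)\}", substitute, template)


def row_to_triple(row):
    """Build one N-Triples line from a result row with 'gid' and 'band' columns."""
    subject = expand_template(SUBJECT_TEMPLATE, row)
    obj = expand_template(OBJECT_TEMPLATE, row)
    return f"<{subject}> <{MEMBER_OF}> <{obj}> ."


if __name__ == "__main__":
    # Hypothetical row; real rows would come from the SQL query in the mapping.
    sample_row = {
        "gid": "11111111-1111-1111-1111-111111111111",
        "band": "22222222-2222-2222-2222-222222222222",
    }
    print(row_to_triple(sample_row))
```

Because every row is handled independently, this kind of mapping can be run in bulk over the whole relational database.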
Dog Food
• EUCLID output, topics and engagement monitored
  public-lod, public-vocabs, semantic-web
• Offered as public SPARQL endpoint
• Will be used as basis of analysis examples
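A public SPARQL endpoint is normally queried over HTTP by passing the query text in a `query` parameter, as defined by the SPARQL Protocol. The sketch below only builds such a request URL and sends nothing over the network. The endpoint address is a placeholder, because the slides do not give the real one, and the query is deliberately generic because the vocabulary of the monitoring dataset is not shown either.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the actual EUCLID endpoint URL is not given in the slides.
ENDPOINT = "http://example.org/sparql"

# Generic query that makes no assumption about the dataset's vocabulary.
QUERY = """\
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Return a SPARQL Protocol GET URL with the query percent-encoded."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
```

A client would fetch this URL with an `Accept` header such as `application/sparql-results+json` and parse the returned bindings.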
Results
• Achieving ~100 live viewers
• Set to exceed 1000 post hoc views per channel (Webinar platform, Vimeo, Slideshare)
For exercises, quizzes and further material visit our website:
http://www.euclid-project.eu
eBook, Course
Other channels:
@euclid_project, EUCLID project, EUCLIDproject