Semantic Web Technologies
Lecture Blog: http://semweb2013.blogspot.com/
This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)
Lecture: Dr. Harald Sack, Hasso-Plattner-Institut für IT Systems Engineering, University of Potsdam
Winter Semester 2012/13
Thursday, 18 October 2012
Semantic Web Technologies - 01 - From Internet to the Web of Data
First lecture of this year's lecture series on 'Semantic Web Technologies' at the Hasso Plattner Institute for IT Systems Engineering, University of Potsdam, Germany, winter 2012/13
Semantic Web Technologies , Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam
Semantic Web Technologies: Content
1. Introduction
2. Semantic Web - Basic Architecture
   Languages of the Semantic Web - Part 1
3. Knowledge Representation and Logics
   Languages of the Semantic Web - Part 2
4. Applications in the 'Web of Data'
Semantic Web Technologies: Content
1. Introduction
• From Internet to Web 2.0 ... from a historical point of view
• Quo Vadis WWW? ... the limits of the WWW
• Semantic Web ... towards an 'intelligent' WWW
• Semantic Web Applications ... the semantic WWW begins
ARPANET, 29 October 1969
J.C.R. Licklider, Robert Taylor: "The Computer as a Communication Device". Science and Technology 76, pp. 21-31, April 1968.
"[...] we are entering a technological age in which we will be able to interact with the richness of living information - not merely in the passive way that we have been accustomed to using books and libraries, but as active participants in an ongoing process, bringing something to it through our interaction with it, and not simply receiving something from it by our connection to it."
Once upon a time ... the Internet
Larry Roberts, ARPA IPTO Chief Scientist (1966-1973)
First Generation: The Internet (Computer-Centered Processing)
• How did the user get the information?
1. open terminal
2. connect to remote computer
3. retrieve file system data from remote computer
4. download file from remote to local computer
5. read file on local computer
Problem:
• Information access requires expert knowledge
• Information access is expensive...
• Information retrieval is very expensive...
The World Wide Web was born at CERN, the European nuclear research center, in 1990...
Tim Berners-Lee
Robert Cailliau
Second Generation: The Web (Document-Centered Processing)
• How did the user get the information?
1. open browser
2. load document
3. click on next hyperlink
4. ...
Advantages:
• No expert knowledge required
• Simple information access
• Information retrieval via search engines
[Figure: documents connected to one another by hyperlinks]
• But the original idea behind the WWW is much older...
Denis Diderot (1713-1784)
Jean-Baptiste le Rond d'Alembert (1717-1783)
Agostino Ramelli (1588), Le diverse et artificiose machine; composte in lingua Italiana et Francese
Usability...?
Vannevar Bush (1890-1974)
Vannevar Bush proposed the first hypertext system, "MEMEX", in 1945.
cf. Vannevar Bush, "As We May Think", The Atlantic Monthly, July 1945
Why was the Web such a big success?
• Lynx, 1993
• NCSA Mosaic, 1994
• iPad Safari, 2010
There seem to be no Limits of Growth...
Source: http://www.isc.org/ (as of 10/2012)
From Web 1.0 to Web 2.0: Web Content and Applications are Changing
• Information Consumption
• Information Production
• Interactive Participation
908,585,739
Web Content and Roles are Changing...
"Web 2.0 is the business revolution in the computer industry caused by the move to the Internet as platform, and an attempt to understand the rules for success on that new platform" -- Tim O'Reilly, 2006
Semantic Web Technologies: Content
1. Introduction
• From Internet to Web 2.0 ... from a historical point of view
• Quo Vadis WWW? ... the limits of the WWW
• Semantic Web ... towards an 'intelligent' WWW
• Semantic Web Applications ... the semantic WWW begins
How can we find something on the Web?
current solution:
The Web is big. Really big. You just won't believe how vastly, hugely, mind-bogglingly big it is. (...according to Douglas Adams)
The Web is really big...
• ca. 25 × 10^9 indexed documents in search engines (TNL Blog: Google has 24 billion items index, considers MSN search nearest competitor, September 2005)
• Web crawler: > 10^12 (1 trillion) documents (The Official Google Blog: We knew the Web was Big....., July 25, 2008)
• Google's search index Caffeine comprises ca. 100 million gigabytes, i.e. 10^17 bytes (SMX Video: Google's Matt Cutts On Caffeine Launch, June 9, 2010, http://searchengineland.com/smx-video-googles-matt-cutts-on-caffeine-launch-43933)
• Deep Web (Darkweb) estimated to be about 550 times bigger than the Surface Web (Bergman, 2001)
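The figures above mix plain numbers and powers of ten, so a quick check of the arithmetic may help. The sketch below assumes decimal gigabytes (10^9 bytes); the helper name `order_of_magnitude` is only illustrative.

```python
GIGABYTE = 10**9  # assumption: decimal gigabyte, in bytes

# "ca. 100 million gigabytes" for the Caffeine index
caffeine_index_gb = 100 * 10**6
caffeine_index_bytes = caffeine_index_gb * GIGABYTE


def order_of_magnitude(n: int) -> int:
    """Return k such that 10**k <= n < 10**(k + 1), for n >= 1."""
    return len(str(n)) - 1


indexed_documents = 25 * 10**9   # ca. 25 x 10^9 indexed documents
crawled_documents = 10**12       # more than 1 trillion documents

print(caffeine_index_bytes == 10**17)
print(order_of_magnitude(indexed_documents))
print(order_of_magnitude(crawled_documents))
```

Under the decimal-gigabyte assumption, 10^8 gigabytes times 10^9 bytes per gigabyte gives exactly the 10^17 bytes quoted on the slide.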
Quo vadis WWW?
Multimedia, Real-Time Data, Sensor Data, ...
and all the things will connect with social networks...
Information in the WWW
• what is important and how do you know?
• what is information, what is advertisement?
• what does the information mean?
• how credible/trustworthy is the information?
• what belongs together?
• what is redundant?
• Humans have contextual knowledge, world knowledge and experience to solve the problem
• The Web is supposed to be used by humans.
• The Web is based on the markup language HTML
• HTML describes
  • how information is presented
  • how information is linked
  • but not what the information means
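To make the point about HTML concrete, here is a minimal sketch using Python's standard `html.parser`. The page fragment `SAMPLE` and the class name `LinkAndMarkupCollector` are made up for illustration. A program can recover the presentation tags and the link targets, but nothing in the markup says that the bold text is a date of birth.

```python
from html.parser import HTMLParser


class LinkAndMarkupCollector(HTMLParser):
    """Collects what HTML actually encodes: tags, link targets, raw text."""

    def __init__(self):
        super().__init__()
        self.tags = []    # presentation/structure markup, in document order
        self.links = []   # hyperlink targets
        self.text = []    # text fragments, without any attached meaning

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Hypothetical page fragment, invented for this example.
SAMPLE = (
    "<h1>Fabio Capello</h1>"
    '<p>Born <b>1946-06-18</b> in <a href="/wiki/San_Canzian">San Canzian</a>.</p>'
)

collector = LinkAndMarkupCollector()
collector.feed(SAMPLE)
collector.close()

print(collector.tags)   # how the information is presented
print(collector.links)  # how the information is linked
print(collector.text)   # plain strings: the meaning is left to the human reader
```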
Semantics
Semantics (Greek σημαντικός = pertaining to signs; the study of meaning) is the part of linguistics concerned with
• sense and
• meaning
of language or of the symbols of language. It is the study of the interpretation of signs or symbols as used by agents or communities within particular circumstances and contexts.
Semantics asks how the sense and meaning of complex concepts can be derived from simple concepts based on the rules of syntax.
The semantics of a message depends on its context and pragmatics.
Syntax (Greek σύνταξις = arrangement, ordering) denotes, in grammar, the study of the principles and processes by which sentences are constructed in particular languages.
• In formal languages, syntax is just a set of rules by which well-formed expressions can be created from a fundamental set of symbols (alphabet).
• In computer science, syntax defines the normative structure of data.
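As a small illustration of syntax in a formal language, the sketch below checks well-formedness for a toy grammar that is assumed here purely for the example: the alphabet is {x, +, (, )} and the rules are E -> "x" and E -> "(" E "+" E ")". The function names are illustrative. Note that the check says nothing about what an expression means.

```python
ALPHABET = {"x", "+", "(", ")"}  # fundamental set of symbols (assumed toy alphabet)


def _parse_expr(s: str, i: int) -> int:
    """Parse one expression E starting at index i.

    Returns the index just after the expression, or -1 if none starts at i.
    Rules: E -> "x" | "(" E "+" E ")"
    """
    if i < len(s) and s[i] == "x":
        return i + 1
    if i < len(s) and s[i] == "(":
        j = _parse_expr(s, i + 1)
        if j == -1 or j >= len(s) or s[j] != "+":
            return -1
        k = _parse_expr(s, j + 1)
        if k == -1 or k >= len(s) or s[k] != ")":
            return -1
        return k + 1
    return -1


def is_well_formed(s: str) -> bool:
    """True if s uses only alphabet symbols and is exactly one expression E."""
    if any(ch not in ALPHABET for ch in s):
        return False
    return _parse_expr(s, 0) == len(s)


for candidate in ["x", "(x+x)", "((x+x)+x)", "(x+)", "x+x", "(x+y)"]:
    print(candidate, is_well_formed(candidate))
```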
Context
Context (Latin contextus = interwoven) denotes the surroundings of a symbol (concept) in an expression, i.e. its relationship with surrounding expressions (concepts) and further related elements.
Context denotes all elements of any sort of communication that define the interpretation of the communicated content, e.g.
• general contexts: place, time, interrelation of actions in a message
• personal or social contexts: relation between sender and receiver of a message
Pragmatics
Pragmatics (Greek πρᾶγμα = action) reflects the intention with which language is used to communicate a message.
In linguistics, pragmatics denotes the study of applying language in different situations. Pragmatics studies the ways in which context contributes to meaning.
In the (traditional) Web there is no explicit semantics.
Problem 1: Information Retrieval
• traditional keyword-based search leads to many irrelevant results
  • different meanings
  • polysemy
  • different contexts
• traditional keyword-based search does not find all results
  • synonyms and metaphors
  • missing context definition
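Both failure modes can be reproduced with a toy keyword matcher. The three documents and the function `keyword_search` below are invented for illustration and are not taken from the lecture. The query "jaguar" returns the animal and the car (polysemy), while the query "car" misses the document that says "automobile" (synonymy).

```python
# Hypothetical mini-corpus, invented for this example.
DOCS = {
    "d1": "the jaguar is a big cat living in south america",
    "d2": "the new jaguar is a fast car with a powerful engine",
    "d3": "this automobile has a powerful engine",
}


def keyword_search(query: str, docs: dict) -> list:
    """Return the sorted ids of documents containing every query term as a whole word."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.split() for term in terms)
    )


# Polysemy: both meanings of "jaguar" match, whichever one the user had in mind.
print(keyword_search("jaguar", DOCS))

# Synonymy: the document about an "automobile" is not found for "car".
print(keyword_search("car", DOCS))
```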
Problem 2: Information Extraction
• can only be solved 'correctly' by a human agent
• heterogeneous distribution and order of information
• a software agent does not have sufficient
  • knowledge of contexts,
  • world knowledge, and
  • experience
  to solve the problem
• implicit knowledge, i.e. information that is not specified explicitly but must be derived via logical deduction from available information
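A minimal sketch of such a deduction, assuming a single hand-written rule (transitivity of a "located in" relation) and two example facts chosen for illustration. The derived fact is never stated explicitly; it only follows from the rule.

```python
def transitive_closure(facts: set) -> set:
    """Apply the rule (a, b) and (b, c) => (a, c) until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(derived):
            for c, d in list(derived):
                if b == c and (a, d) not in derived:
                    derived.add((a, d))
                    changed = True
    return derived


# Explicit "located in" facts (illustrative example data).
EXPLICIT = {
    ("Potsdam", "Brandenburg"),
    ("Brandenburg", "Germany"),
}

all_facts = transitive_closure(EXPLICIT)
implicit = all_facts - EXPLICIT

print(implicit)  # knowledge that was only implicit in the explicit facts
```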
Problem 3: Maintenance
• the more complex and voluminous a website, the more complicated the maintenance of its only weakly structured data
• Problems:
  • syntactic and semantic (link) consistency
  • correctness
  • timeliness
Problem 4: Personalization
• Adaptation of the presented information content to personal requirements
• Problems:
  • where do we get the required (personal) information from?
  • personalization vs. data security
GAME OVER
Semantic Web Technologies: Content
1. Introduction
• From Internet to Web 2.0 ... from a historical point of view
• Quo Vadis WWW? ... the limits of the WWW
• Semantic Web ... towards an 'intelligent' WWW
• Semantic Web Applications ... the semantic WWW begins
Tim Berners-Lee, Semantic Web Roadmap, Sept 1998
"The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help…"
"Understanding" Content on the Web
Text: "Why snub me, Fab?"
Entity mapping: text -> Fabio Capello (entity)
Fabio Capello (entity) is a Soccer Manager (class)
Soccer Manager (class) is subclass of Person (class)
• The meaning (semantics) of entities and classes must be defined explicitly.
"Understanding" Content on the Web
Fabio Capello (entity) is a Soccer Manager (class): class membership (has class)
Soccer Manager (subclass) is subclass of Person (superclass)
• The meaning (semantics) is expressed with the help of well-suited knowledge representations (ontologies)
"Understanding" Content on the Web (III)
Classes:
• Soccer Manager is subclass of Person
• Person birthPlace Place
• Person birthDate Date
Entities:
• Fabio Capello is a Soccer Manager
• Fabio Capello birthPlace San Canzian (is a Place)
• Fabio Capello birthDate 1946-06-18 (is a Date)
• The meaning (semantics) is expressed with the help of well-suited knowledge representations (ontologies)
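The entities, classes and properties on this slide can be written down as subject-predicate-object triples. The following is only a plain-Python sketch of that idea: the triple layout and the helper names (`superclasses`, `classes_of`, `values_of`) are assumptions made for illustration, not the RDF/OWL machinery introduced later in the lecture.

```python
# Statements read off the slide, as (subject, predicate, object) triples.
TRIPLES = {
    ("Fabio Capello", "is a", "Soccer Manager"),
    ("Soccer Manager", "is subclass of", "Person"),
    ("Fabio Capello", "birthPlace", "San Canzian"),
    ("San Canzian", "is a", "Place"),
    ("Fabio Capello", "birthDate", "1946-06-18"),
    ("1946-06-18", "is a", "Date"),
}


def superclasses(cls: str, triples: set) -> set:
    """All classes reachable from cls via 'is subclass of' (cls itself excluded)."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "is subclass of" and o not in found:
                found.add(o)
                todo.append(o)
    return found


def classes_of(entity: str, triples: set) -> set:
    """Direct classes of an entity plus those inherited through the class hierarchy."""
    direct = {o for s, p, o in triples if s == entity and p == "is a"}
    inherited = set()
    for cls in direct:
        inherited |= superclasses(cls, triples)
    return direct | inherited


def values_of(entity: str, prop: str, triples: set) -> set:
    """All objects of the given property for the given entity."""
    return {o for s, p, o in triples if s == entity and p == prop}


print(classes_of("Fabio Capello", TRIPLES))
print(values_of("Fabio Capello", "birthPlace", TRIPLES))
print(values_of("Fabio Capello", "birthDate", TRIPLES))
```

Because the class hierarchy is explicit, a program can conclude that Fabio Capello is a Person even though no triple states this directly.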
What is the Semantic Web?
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
Tim Berners-Lee, James Hendler, Ora Lassila: The Semantic Web, Scientific American, 284(5), pp. 34-43 (2001)
What is so special about the BBC Music Website?
• Information is dynamically aggregated from external, publicly available data (Wikipedia, MusicBrainz, ...)
• no screen scraping
• no specialized API
• data available as Linked Open Data
• data access via simple HTTP request
• data is always up to date without manual interaction
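"Data access via simple HTTP request" usually means an ordinary GET with an Accept header asking for a machine-readable format. The sketch below uses only Python's standard `urllib.request`. The URI is a placeholder on example.org, and the assumption that a given server answers `text/turtle` requests with RDF has to be checked per data source.

```python
import urllib.request


def build_lod_request(resource_uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Build a plain HTTP GET that asks for a machine-readable representation."""
    return urllib.request.Request(
        resource_uri,
        headers={"Accept": media_type},
        method="GET",
    )


def fetch_lod(resource_uri: str, media_type: str = "text/turtle", timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    request = build_lod_request(resource_uri, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder URI for illustration only; not a real data source.
EXAMPLE_URI = "http://example.org/music/artists/some-artist"

request = build_lod_request(EXAMPLE_URI)
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Only `build_lod_request` is exercised here; `fetch_lod` would be called with the URI of an actual Linked Open Data resource.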
Search Engines - Document Retrieval
• General Problems:
  • correct interpretation of query string
  • correct identification of entities
  • automatic disambiguation
  • usability
  • personalization
Search Engines - Fact Retrieval
• Query String: "Where can I fill up the tank at a considerable discount?"
Answer:
- Hohenfelden, xy-Str. 32 -> Super leaded, 1.99 €
- fuel-efficient route will be passed to navigation
- drive only at half throttle for saving fuel…
Search Engines - Exploratory Search
[Figure: graph connecting dbpedia:Neil_Armstrong, dbpedia:Buzz_Aldrin, dbpedia:Michael_Collins, dbpedia:Apollo_11, dbpedia:Apollo_13, dbpedia:Space_Shuttle_Challenger, category:Apollo_program and yago:Space_accidents_and_incidents via dbpedia-owl:mission, dcterms:subject and rdf:type edges]
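Exploratory search over such a graph amounts to walking from a start entity to related entities along shared links. The sketch below is a breadth-first walk in plain Python. The edge list is an assumed reading of the figure (which node each labelled edge connects is a guess from the node and edge names), and the function name `explore` is illustrative.

```python
from collections import deque

# Assumed reading of the figure: which nodes each edge connects is a guess.
EDGES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "rdf:type", "yago:Space_accidents_and_incidents"),
    ("dbpedia:Space_Shuttle_Challenger", "rdf:type", "yago:Space_accidents_and_incidents"),
]


def explore(start: str, edges: list, max_hops: int) -> dict:
    """Breadth-first walk ignoring edge direction; maps each reached node to its hop count."""
    neighbours = {}
    for subject, _predicate, obj in edges:
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    del distance[start]
    return distance


related = explore("dbpedia:Neil_Armstrong", EDGES, max_hops=3)
for node, hops in sorted(related.items(), key=lambda item: (item[1], item[0])):
    print(hops, node)
```

With a larger hop limit, the walk continues from Apollo 13 through the shared yago type to the Space Shuttle Challenger, which is the kind of serendipitous connection exploratory search is meant to surface.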
Intelligent Agents in the Semantic Web
[Figure: in the WWW, the user reaches WWW documents through a retrieval service (e.g. Google) and a presentation service (e.g. Firefox); in the Semantic Web, the user reaches WWW documents through a personal assistant and intelligent infrastructure services]
3 Generations of Web Documents
1st Generation: static web pages (HTML / CSS)
2nd Generation: interactive web pages (JavaScript / Applets); dynamic web pages (database access, template-based generation)