Open Web Data for Education - Linked Data technologies for connecting open educational data
Post on 27-Jan-2015
Open Web Data for Education
Linked Data technologies for connecting
open educational data
Mathieu d’Aquin, Philippe Cudré-Mauroux, Besnik Fetahu, Marieke Guy
The Open University, University of Fribourg, L3S Hanover, Open Knowledge Foundation
@mdaquin @FetahuBesnik @mariekeguy
Slides at: http://slideshare.net/mdaquin
Primary School
Secondary School
Higher Education
The way it used to be… (Excessively simplifying)
Primary School
Secondary School
Higher Education
Now… (Still simplifying, I guess)
Other institutions through online courses
Open Universities
coursera
edX
Udacity
MIT OCW
OpenLearn
MOOCs and OER
Siri, I want to become a
professional photographer.
What should I do?
“I want to be a photographer,
what should I do?”
I found this Open University
course (T189), which you can
enrol in at the regional centre
2 miles from here (cost £427).
“OK, anything free I can try
first?”
There is an Introduction to
Photography course on MIT
OCW, and a Computational
Photography course on coursera
starting soon.
Needs data from everybody, contributed to one
common data space (… linked data maybe?)
[Diagram: coursera, edX, Udacity, MIT OCW and OpenLearn each contributing data on courses, requirements, topics, learning outcomes, locations, results and assessment]
Outline of the talk(s)/tutorial
I- The state of open/linked data in education
II- How to contribute to open/linked data in education
III- Case study - The Bowlogna Ontology
IV- Making things with open/linked data in education
V- Open Education – more than just open data
State of open data in education
Universities Repositories
Publishers Thesaurus, vocabularies, etc.
Government bodies
Historically, mostly open educational
resources, i.e., these guys
But more and
more of them
and them now!
And hopefully, very soon, them?
Loosely based on http://data.linkededucation.org/linkedup/catalog/
LinkedUp Catalogue of Web Data for Education
http://data.linkededucation.org/linkedup/catalog/
Pause
What are we missing?
How to contribute
In other words:
How to represent
data in education for
sharing
Examples of sharing
linked open data in
education
Bias: We like Open and Linked Data
Open University
Website
Open University
VLE
KMi Website
Mathieu’s
Homepage
Mathieu’s
List of
Publications
Mathieu’s
The Web
M366 Course
page
Person: Mathieu
Publication: Pub1
Organisation:
The Open University
Course: M366
Country: Belgium
Book: Mechatronics
author
workFor
availableIn
offers
setBook
The Web of Linked Data
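A minimal sketch of the graph above as subject-predicate-object triples in plain Python. The node and edge labels come from the slide; the direction of each edge is an assumption, and the helper names `objects` and `courses_offered_by_employer` are hypothetical.

```python
# Triples read off the slide; edge directions are assumptions.
TRIPLES = [
    ("Pub1", "author", "Mathieu"),
    ("Mathieu", "workFor", "The Open University"),
    ("The Open University", "offers", "M366"),
    ("M366", "availableIn", "Belgium"),
    ("M366", "setBook", "Mechatronics"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object o such that (subject, predicate, o) is asserted."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def courses_offered_by_employer(person, triples=TRIPLES):
    """Follow workFor, then offers: two hops that used to span separate web pages."""
    return [
        course
        for org in objects(person, "workFor", triples)
        for course in objects(org, "offers", triples)
    ]
```

Once every source publishes its facts in this shape, a question that crosses sources becomes a short chain of lookups rather than a scraping exercise.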
Need for common vocabularies
VIVO
LRMI
DataCube TEACH
Geo
Ontology
Dublin
Core FOAF
DOAP
SIOC
BIBO
Media
Ontology
AIISO
SKOS
17/11/13 LinkedUp – Author Name 15
From LinkedUp data catalogue
Example: AIISO
foaf:Organization
aiiso:School
rdfs:subClassOf
aiiso:College
aiiso:Course
aiiso:Department
rdfs:subClassOf
rdfs:subClassOf
aiiso:Institution
rdfs:subClassOf
aiiso:Faculty
rdfs:subClassOf
aiiso:KnowledgeGrouping
rdfs:subClassOf
aiiso:Module
rdfs:subClassOf
aiiso:Programme
rdfs:subClassOf
aiiso:part_of
foaf:Agent
aiiso:responsibleFor
aiiso:responsibleFor
aiiso:teaches
Example: BIBO
bibo:Article
bibo:AcademicArticle
bibo:Document
bibo:Book
bibo:AudioVisualDocument
bibo:DocumentPart
bibo:BookSection
bibo:Chapter
bibo:EditedBook
bibo:Issue
bibo:Journal
rdfs:subClassOf
rdfs:subClassOf rdfs:subClassOf
rdfs:subClassOf
rdfs:subClassOf
rdfs:subClassOf rdfs:subClassOf
bibo:partOf
All bibo:partOf
<=1 bibo:partOf
<=1 bibo:partOf
Example: LRMI
A common metadata framework for describing or “tagging” learning resources on the web, with Schema.org
http://www.lrmi.net/the-specification
Schema.org/CreativeWork “e.g. assignment” educationalUse
timeRequired
Schema.org/Duration learningResourceType
“e.g. presentation”
Schema.org/URL useRightsUrl
Schema.org/Audience
LRMI/EducationalAudience
subClass
audience
educationalRole “e.g. HE student”
Case-Study: Bowlogna Ontology
Fostering Open Curricula and Agile Knowledge Bases for Europe’s Higher Education Landscape
• The Bowlogna ontology
• Extending & managing Bowlogna data
– Entity-centric data management
The Bologna Reform
• Started in June 1999
• Framework for higher education systems
• 47 Countries
• Common academic degrees
• Common study structure
• Common terminology
The university setting after Bologna
• A lot of data is available
– Not following standard schemas
– Comprehensive and available data is a success factor
• Shared data
– Erasmus exchanges
– Courses in a given language
• Analytic tools may help monitor university performance
An ontology about Bologna
• A Lexicon for the Bologna Reform
– Basic set of terms for the new system
– Stable across time and institutions
– Developed by a professional terminologist
The ontology creation process
• The Bowlogna Ontology
– 29 top classes (67 in total)
– Classes: student, professor, evaluation, teaching
unit, ECTS credit, semester, etc.
– Concept definitions in English, French, German
Bowlogna Ontology
• Private / Public parts
– Public data can be shared with other universities (e.g.,
course descriptions)
– Private data is sensitive (e.g., evaluation results)
• Private data might contain more instances
• Aggregations over private data may be shared
(e.g., number of enrolled students)
Managing Bowlogna Data
• Entity-Centric Data Management
– Searching for entities
– Linking entities
– Typing entities
– Storing entities
Entities as Mediation
• Rising paradigm
– Store information at the entity granularity
– Integrate information by inter-linking entities
• Advantages?
– Coarser granularity compared to keywords
• More natural, e.g., brain functions similarly (or is it the other way around?)
• Easier to integrate 3rd party information
– Denormalized information compared to RDBMSs
• Schema-later, heterogeneity, sparsity
• Pre-computed joins, “Semantic” linking
• Drawbacks?
Searching for Entities (1)
The Descendants
TheDescendants
type
title
GeorgeClooney
George Clooney
name
May 6, 1961
dateOfBirth
type
ShaileneW
Shailene Woodley
name
Nov. 15, 1991
dateOfBirth
type
playsIn
playsIn
• Main idea: combine unstructured and structured search
– Inverted index to locate first candidates
– Graph queries to refine the results
• Graph traversals (queries on object properties)
• Graph neighborhoods (queries on data type properties)
Inverted Index
Keywords
HTTP
DBMS
SPARQL
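The main idea above (inverted index first, graph query second) can be sketched on the slide's own toy graph. This is only an illustration under simple assumptions: whitespace tokenisation, an in-memory index, and hypothetical function names `build_inverted_index` and `search`. It is not the system described in the talk.

```python
from collections import defaultdict

# Toy graph from the slide. Datatype properties (literals) and object
# properties (entity-to-entity edges) are stored separately.
LITERALS = {
    "TheDescendants": {"title": "The Descendants"},
    "GeorgeClooney": {"name": "George Clooney", "dateOfBirth": "May 6, 1961"},
    "ShaileneW": {"name": "Shailene Woodley", "dateOfBirth": "Nov. 15, 1991"},
}
EDGES = [
    ("GeorgeClooney", "playsIn", "TheDescendants"),
    ("ShaileneW", "playsIn", "TheDescendants"),
]


def build_inverted_index(literals):
    """Map each lower-cased token of every literal value to the entities using it."""
    index = defaultdict(set)
    for entity, props in literals.items():
        for value in props.values():
            for token in value.lower().split():
                index[token].add(entity)
    return index


def search(keywords, index, edges, follow=None):
    """Step 1: keyword lookup gives candidate entities (all tokens must match).
    Step 2 (optional): refine by traversing one object property from the candidates."""
    tokens = keywords.lower().split()
    if not tokens:
        return set()
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    if follow is None:
        return candidates
    return {o for s, p, o in edges if s in candidates and p == follow}
```

Calling `search("clooney", index, EDGES, follow="playsIn")` first finds the actor through the index and then walks the `playsIn` edge to reach the film.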
Searching for Entities (2)
[Architecture diagram. Components: LOD Cloud, index(), User, query(), Keyword Query, Query Annotation and Expansion (WordNet, 3rd party search engines, Pseudo-Relevance Feedback), Entity Search over an Inverted Index with Ranking Functions, intermediate top-k results, RDF Store, Graph Traversals (queries on object properties), Neighborhoods (queries on datatype properties), Structured Inverted Index, Final Ranking Function, Graph-Enriched Results]
Linking Entities (1)
• ZenCrowd: linking textual content to entities
• Uses sets of algorithmic matchers to match entities to online concepts
• Uses dynamic templating to create micro-matching-tasks and publish them on MTurk
• Combines both algorithmic and human matchers using probabilistic networks
Linking Entities (2)
[Architecture diagram of ZenCrowd. Input: HTML pages. Output: HTML + RDFa pages. Components: Entity Extractors, Algorithmic Matchers, LOD Index (Get Entity) over the LOD Open Data Cloud, Micro-Task Manager sending Micro Matching Tasks to the Crowdsourcing Platform, Workers' Decisions, Probabilistic Network, Decision Engine]
Storing Entities (1)
• Fundamental impedance mismatch between
graphs of entities and…
– N-ary / decomposition storage model
– Inverted Indices
– Key-value paradigms
Storing Entities (2)
• dipLODocus[RDF]
– Materialize the joins!
– Dense-pack the values
– Provide new indices
– Co-locate
– Co-locate
– Co-locate
Typing Entities
[Pipeline diagram: Text extraction (BoilerPipe) → Named Entity Recognition (Stanford NER) → list of entity labels → Entity linking (inverted index: DBpedia labels ⟹ resource URIs) → list of entity URIs → for each entity, Type retrieval (inverted index: resource URIs ⟹ type URIs) → list of type URIs → Type ranking → ranked list of types]
TRank
• Input: a knowledge base G, an entity e, a context c in which e appears.
• Output: e’s types ranked by relevance with respect to the context c.
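To make the input/output contract concrete, here is a deliberately naive sketch. It is not the ranking used by TRank: it swaps in a simple heuristic that scores each type of e by how many other entities in the context share it, and breaks ties in favour of more specific types. The knowledge base, the type hierarchy and all `ex:` identifiers are invented for illustration.

```python
from collections import Counter

# Hypothetical knowledge base G: entity URI -> set of type URIs.
KB = {
    "ex:Alice": {"ex:Student", "ex:Tutor", "ex:Person"},
    "ex:Bob": {"ex:Tutor", "ex:Person"},
    "ex:Carol": {"ex:Tutor", "ex:Person"},
}
# Hypothetical type hierarchy: type -> parent type (None marks a root).
PARENT = {"ex:Student": "ex:Person", "ex:Tutor": "ex:Person", "ex:Person": None}


def depth(type_uri, parent=PARENT):
    """Number of steps from a type up to its root; deeper means more specific."""
    steps = 0
    while parent.get(type_uri) is not None:
        type_uri = parent[type_uri]
        steps += 1
    return steps


def rank_types(kb, entity, context, parent=PARENT):
    """Return the types of `entity`, most relevant first.
    Relevance here = number of other context entities sharing the type,
    then specificity (depth), then name for a deterministic order."""
    own_types = kb.get(entity, set())
    support = Counter()
    for other in context:
        if other != entity:
            support.update(kb.get(other, set()) & own_types)
    return sorted(own_types, key=lambda t: (-support[t], -depth(t, parent), t))
```

In a context full of tutors, `ex:Tutor` comes out ahead of both the generic `ex:Person` and the unsupported `ex:Student`.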
References
• The Bowlogna ontology: Semantic Web J. 2013
• Searching for entities: SIGIR 2012
• Linking entities: WWW 2012, VLDB J. 2013
• Storing entities: ISWC 2011
• Typing entities: ISWC 2013
Pause
What else needs representing in educational
data?
What to do with it
Resource
Discovery
Research
Exploration
Social
Example: UK HESA/UNISTAT Key Information Set
http://www.hesa.ac.uk/unistatsdata
“Unistats, which incorporates the KIS, provides course level information on all undergraduate higher education courses provided in the UK, which are of at least one year’s duration and consist of 120 or more credits of study” [1]
Includes statistics about the success rate of degrees (courses), the type of assessment, and what students do afterwards (further study, jobs).
[1] http://www.hesa.ac.uk/includes/C13061_resources/Unistats_checkdoc_definitions.pdf?v=1.12
Simple application:
Tell me the job you
want to do, I tell you
what degree (in the
UK) you might want
to study
Currently: It is Open Data (kind of)
Building an application on top of this?
Need to download the data, unzip it, parse the XML, re-interpret it into your own model, store the data, provide a querying facility, and finally build the application.
Doing it as linked data with a SPARQL endpoint does that once for everybody!
http://data.linkededucation.org/linkedup/catalog/browse/
90 lines of HTML/Javascript, written in a couple of hours, using this SPARQL query:
select distinct ?course ?label ?link ?perc where {
  ?o <http://purl.org/linked-data/cube#dataSet> <http://data.linkedu.eu/kis/dataset/commonJobs> .
  ?o <http://data.linkedu.eu/kis/ontology/job> <http://data.linkedu.eu/kis/job/354> .
  ?o <http://data.linkedu.eu/kis/ontology/course> ?course .
  ?course <http://purl.org/dc/terms/title> ?label .
  ?course <http://data.linkedu.eu/kis/ontology/courseUrl> ?link .
  ?o <http://data.linkedu.eu/kis/ontology/percentage> ?perc .
  filter ( ?perc > 0 )
} order by desc(?perc)
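The query above is fixed to job 354. A hedged sketch of how an application might parameterise it and read the answer, using only the Python standard library: the function names are hypothetical, the endpoint address is left to the caller because the slides do not give one, and `rows` assumes the endpoint returns the standard SPARQL JSON results format.

```python
import json
from urllib.parse import urlencode

KIS = "http://data.linkedu.eu/kis/"


def build_query(job_id):
    """Return the slide's query for one job identifier (an integer such as 354)."""
    return f"""select distinct ?course ?label ?link ?perc where {{
  ?o <http://purl.org/linked-data/cube#dataSet> <{KIS}dataset/commonJobs> .
  ?o <{KIS}ontology/job> <{KIS}job/{int(job_id)}> .
  ?o <{KIS}ontology/course> ?course .
  ?course <http://purl.org/dc/terms/title> ?label .
  ?course <{KIS}ontology/courseUrl> ?link .
  ?o <{KIS}ontology/percentage> ?perc .
  filter ( ?perc > 0 )
}} order by desc(?perc)"""


def request_url(endpoint, job_id):
    """Build a GET URL carrying the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": build_query(job_id)})


def rows(results_json):
    """Flatten a SPARQL JSON results document into (label, link, percentage) tuples."""
    doc = json.loads(results_json)
    return [
        (b["label"]["value"], b["link"]["value"], float(b["perc"]["value"]))
        for b in doc["results"]["bindings"]
    ]
```

Forcing `job_id` through `int()` keeps arbitrary text out of the query string, which matters as soon as the job comes from user input.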
[Architecture diagram. Semantic indexing: podcasts, OpenLearn units and articles from data.open.ac.uk pass through Named Entity Recognition, and the resulting semantic entities (DBpedia) are stored in a semantic index. Search: the synopsis of a BBC Programme or iPlayer page is matched against the indexes by similarity-based search, and the interface shows resource descriptions, resource URIs and common topics.]
[API/Service view. A JavaScript interface, injected with a bookmarklet, calls three services: Named Entity Recognition (programme URI in, scored semantic entities out), Indexing and Similarity Search (scored semantic entities and programme URI in, resource URIs out), and Common Topic Extraction (programme URI and resource URI in, common semantic entities out).]
Same thing, with just text (discou.info/alfa)
And on course material (open + closed data)
17/11/13 LinkedUp – Besnik Fetahu 49
Source: http://lod-cloud.net/state, September 2011
Domain                  Datasets  Triples         %        (Out-)Links  %
Media                   25        1,841,852,061   5.82 %   50,440,705   10.01 %
Geographic              31        6,145,532,484   19.43 %  35,812,328   7.11 %
Government              49        13,315,009,400  42.09 %  19,343,519   3.84 %
Publications            87        2,950,720,693   9.33 %   139,925,218  27.76 %
Cross-domain            41        4,184,635,715   13.23 %  63,183,065   12.54 %
Life sciences           41        3,036,336,004   9.60 %   191,844,090  38.06 %
User-generated content  20        134,127,413     0.42 %   3,449,143    0.68 %
Total                   295       31,634,213,770           503,998,829
Example: Topic Exploration
What is the data about?
The Big Picture: What is the data about?
and many
more
languages
(16)…
and many more organisations (184)…
The Big Picture: How to find the right information?
How to find information
about “renewable
energy”?
search into individual
resources in all these
sources?
338 sources of information
~300 million individual
resources
- Manual inspection costly!
- Current infrastructure is not
reliable for such large scale
queries!
now what? Generate representative topics
for the individual data sources
Topics linking the data sources
into a central and interlinked
graph
Explore the graph for specific
concepts e.g. “renewable
energy”
Constructing Topic Profiles
The types of information existing in the data source: book; thesis; proceedings; series; audio document; manuscript; newspaper; report
individual resources
organization
“British Association for Biofuels and Oils”
The prime objective of the Association is to persuade Government to modify the tax
on Biodiesel so as to give this splendidly 'green' fuel a chance to establish itself
to the advantage of the environment. This means a tax structure which
ensures that the pump price of Biodiesel is at least competitive with fossil diesel.
A second objective is to see established in Britain a Biodiesel plant of sufficient
size to get the appropriate economies of scale in production costs.
Linux in wenigen Stunden beherrschen ; absolut keine
Vorkenntnisse nötig! ; ideal für Einsteiger und Umsteiger;
Animationen, Videos und Sprachausg. erklären
LINUX Schritt für Schritt.
http://de.dbpedia.org/page/Linux
category-de:Freies_Betriebssystem
category-de:Linux
category-de:Unixoides_Betriebssystem
http://de.dbpedia.org/page/Animation
category-de:Animation
http://de.dbpedia.org/page/Videoclip
category-de:Video
http://dbpedia.org/page/Biodiesel
category:Biodiesel
category:Biofuels
category:Liquid_fuels
http://dbpedia.org/page/Economy
category:Economics
category:Economic_systems
http://dbpedia.org/page/Price
category:Pricing
category:Marketing
http://dbpedia.org/page/Biofuel
category:Bioenergy
category:Biomass
category:Fuels
category:Renewable_fuels
Constructing Topic Profiles (I)
individual resources
Linux in wenigen Stunden beherrschen; absolut keine Vorkenntnisse nötig!; ideal für Einsteiger und Umsteiger; Animationen, Videos und Sprachausg. erklären LINUX Schritt für Schritt.
“British Association for Biofuels and Oils” The prime (…) to persuade Government to modify the tax on Biodiesel so as to give (…) to the advantage of the environment. This means a tax (…) that the pump price of (…) A second objective is to see established in Britain a Biodiesel plant of (…) appropriate economies of scale in production costs.
topic profiles from the individual sources: biodiesel, biofuel, liquid fuels, economy, economic systems, bioenergy, fuel, biomass, linux, video
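The profile construction shown on these slides (resource text, then linked entities, then their categories) can be sketched as follows. The entity-to-category table is copied from the slide; the `annotate` function is only a stand-in for real named entity recognition (a case-insensitive substring match), and all function names are hypothetical.

```python
from collections import Counter

# Entity -> categories, copied from the slide.
CATEGORIES = {
    "Biodiesel": ["Biodiesel", "Biofuels", "Liquid_fuels"],
    "Biofuel": ["Bioenergy", "Biomass", "Fuels", "Renewable_fuels"],
    "Economy": ["Economics", "Economic_systems"],
    "Price": ["Pricing", "Marketing"],
}


def annotate(text, categories=CATEGORIES):
    """Stand-in for entity linking: report each known entity whose name occurs in the text."""
    lowered = text.lower()
    return [entity for entity in categories if entity.lower() in lowered]


def topic_profile(resources, categories=CATEGORIES):
    """Count, over all resources of one dataset, how often each category is reached.
    An entity contributes at most once per resource."""
    profile = Counter()
    for text in resources:
        for entity in annotate(text, categories):
            profile.update(categories[entity])
    return profile
```

The resulting counter is the dataset's topic profile: a small, comparable summary that can stand in for millions of individual resources.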
Exploring topics: Finding the right information?
How to find information about “renewable energy”?
search individual resources from all information sources?
explorable topic graph: biodiesel, biofuel, liquid fuels, economy, economic systems, bioenergy, fuel, biomass, linux, video
• Searching for topics about “renewable energy”, we find the following:
• 5 datasets
• data-gov-uk, clean-energy-reegle, educationalprograms_sisvu,…
• Thousands of resources talking about: biodiesel, biofuel, wind farms, hydroelectricity, solar power, sugar cane, etc.
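A small sketch of the exploration step: once each dataset has a topic profile, inverting the profiles gives a topic-to-dataset graph that can be searched by concept instead of by resource. The dataset names are from the slide, but the topics assigned to them here are invented for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Dataset names from the slide; the topic sets are hypothetical.
PROFILES = {
    "data-gov-uk": {"Renewable_fuels", "Economics"},
    "clean-energy-reegle": {"Renewable_fuels", "Solar_power"},
    "educationalprograms_sisvu": {"Solar_power", "Linux"},
}


def topic_graph(profiles):
    """Invert dataset -> topics into topic -> datasets (topics link the datasets)."""
    graph = defaultdict(set)
    for dataset, topics in profiles.items():
        for topic in topics:
            graph[topic].add(dataset)
    return graph


def datasets_about(keyword, graph):
    """Return, sorted, every dataset reachable from a topic whose label contains the keyword."""
    needle = keyword.lower().replace(" ", "_")
    hits = set()
    for topic, datasets in graph.items():
        if needle in topic.lower():
            hits |= datasets
    return sorted(hits)
```

The lookup touches only topic labels, so its cost depends on the number of topics rather than on the number of resources in the sources.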
Finding resources about “Renewable Energy”
• From millions of resources from all information
sources to top matching ranked resources
about “Renewable Energy”
• Resources with “Renewable Energy” as a
topic convey information about different forms
of renewable energy:
• Solar Energy
• Wind-farms
• Biogas
• Hydroelectricity etc.
http://enipedia.tudelft.nl/wiki/Windmar_Renewable_Energy
http://enipedia.tudelft.nl/data/page/eGRID/Plant/57050
http://enipedia.tudelft.nl/wiki/Us_Energy_Biogas_Corp
http://www.reegle.info/profiles/JP
Topic Profiling: Applications!
http://data-observatory.org/lod-profiles/profile-explorer/
http://data-observatory.org/lod-profiles/sparql-endpoint
http://data-observatory.org/lod-profiles/
More examples: Data mining, knowledge
discovery, analytics
Learning
Analytics
Exploring communities
Course
management
Pause
What applications for educational Web data?
Open Education
Removing barriers to education
Open Education
Food for thought
More minds online
• Around 2.7 billion people (40% of the world's population)
will be connected to the Internet by the end of 2013 – UN
sources
• Several billion more in the forthcoming years – from
developing countries, many with disabilities
• Worldwide demand for higher education
• New pedagogies needed for large-scale student teaching
Open Data in Education
Overview
Open data in education
• All open data that can be used for educational purposes
(e.g. research data, GLAM data etc.): data exploited/used
by education.
Open data that comes out of education institutions
• Administrative data created by educational institutions
that can improve efficiency, allow students to make
informed decisions etc.
Both relevant to the LinkedUp Project
What type of data?
http://www.slideshare.net/louiscrusoe/open-education-data
How can we use open data
…to meet educational needs?
By supporting students
• Through creation of new tools that enable new ways to analyse and access data, e.g. maps of disabled access, tools for disciplines
• By enriching resources, making them easier to share and find, and personalizing the way they are presented
• By allowing students to explore resources, concepts, ideas and objects in various areas
• By helping students make informed choices on education, e.g. by comparing scores, course data etc.
How can we use open data
…to meet educational needs?
By supporting schools and institutions
• Learning analytics data can help retain students
• Usage data can enable efficiencies in practice, e.g. library data can help support book purchasing
• Benchmarking and performance measuring
By supporting governments and policy
• Open data can lead to change in policy
• Open data can support transparency & enable efficiency
• Data on equity and equality issues (3rd world countries)
• Education reform
Education & Development
How can open data help?
• Data is crucial for planning, managing budgets and spending,
and evaluation
• Transparency of data is essential
• Interesting work going on to build tools to analyse data,
building capacity etc.
• Global Partnership for Education Open Data Project (57 key
education indicators from 29 countries)
• The data revolution in education and development:
http://bit.ly/data-development
• School of data: http://schoolofdata.org
Keep an eye on…
Working Group
Overview
• Binds together people to promote open data, open
educational resources (OER) and open educational
practices
• First activity: Writing the Open Education Handbook
• Mailing list, Twitter feed
• Want to see the discussions around open data in
education pulled into the wider debates around open
education
• http://education.okfn.org
Open Education Handbook
Overview
• First activity of Working Group
• Deliverable for LinkedUp Project
• Collaboratively authored
• Booksprint #1 London
• Booksprint #2 Berlin
• Open Ed Timeline event
• Now on Booktype
• Looking at synergies between
areas
Check out:
Linkeduniversities.org Linkededucation.org education.okfn.org
Linkedup-project.eu linkedup-challenge.org
data.linkededucation.org/linkedup/catalog
data.linkededucation.org/linkedup/devtalk
Picture credits
• http://www.flickr.com/photos/colorblindpicaso/2902713219/
• http://www.flickr.com/photos/army_arch/2860392346/
• http://www.flickr.com/photos/tulanesally/5198784680/
• http://www.flickr.com/photos/erfgoed/6743262901/
• http://www.flickr.com/photos/melystu/4984029996/
• http://www.flickr.com/photos/tulanesally/5202279590/
• http://www.flickr.com/photos/75905404@N00/4152885782/
• http://www.flickr.com/photos/jeffozvold/2253932630/
• http://www.flickr.com/photos/75905404@N00/3482204217/
• http://www.flickr.com/photos/soutra/4254200381/
• http://www.flickr.com/photos/70832171@N07/7911285000/
• http://www.flickr.com/photos/37996583811@N01/7354910368/
• http://www.flickr.com/photos/dbc-photography/4466855461/
• http://www.flickr.com/photos/pnnl/3638446615/
Case Study: data.open.ac.uk
Course information: 600 modules / description of the course, information about the levels and number of credits associated with it, topics, and conditions of enrolment.
Research publications: 25,000 academic articles / information about authors, dates, abstract and venue of the publication.
Podcasts: 2220 video podcasts and 1500 audio podcasts / short description, topics, link to a representative image and to a transcript if available, information about the course the podcast might relate to, and license information regarding the content of the podcast.
Open Educational Resources: 640 OpenLearn Units / short description, topics, tags used to annotate the resource, its language, the course it might relate to, and the license that applies to the content.
YouTube videos: 900 videos / short description of the video, tags that were used to annotate the video, collection it might be part of, and link to the related course if relevant.
University buildings: 100 buildings / address, a picture of the building, and the sub-divisions of the building into floors and spaces.
Library catalogue: 12,000 books / topics, authors, publisher and ISBN, as well as the related course.
Others…
AIISO
BIBO
MEDIA
FOAF
DC
GEO
Deployment
Original systems / Databases
Dedicated extractors
daily
updates
Triple store
RDF
SPARQL endpoint
URI Resolver Web server
owl:sameAs
mlo:offers
mlo:location
http://data.open.ac.uk/course/m366
http://sws.geonames.org/2963597/ (Ireland)
http://data.open.ac.uk/organization/the_open_university
http://education.data.gov.uk/id/school/133849
select distinct ?q (count(distinct ?t) as ?n) where {
  ?q a <http://purl.org/net/mlo/qualification> .
  ?q <http://data.open.ac.uk/saou/ontology#hasPathway> ?p .
  ?p <http://data.open.ac.uk/saou/ontology#hasStage> ?s .
  { { ?s <http://data.open.ac.uk/saou/ontology#includesCompulsoryCourse> ?c }
    union
    { ?s <http://data.open.ac.uk/saou/ontology#includesOptionalCourse> ?c } } .
  ?c <http://purl.org/dc/terms/subject> ?t .
  [] <http://www.w3.org/2004/02/skos/core#hasTopConcept> ?t .
} group by ?q order by desc(?n)
List of courses (degrees, etc.) at The Open University, with number of
topics they cover
Example:
data.open.ac.uk/query
URI of the query:
http://data.open.ac.uk/query?query=select%20distinct%20...
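A query URI of the form shown above can be produced with the Python standard library. This is a minimal sketch: the helper name `query_uri` is hypothetical, and passing `quote_via=quote` is what yields `%20` for spaces (the default would produce `+`).

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

ENDPOINT = "http://data.open.ac.uk/query"


def query_uri(sparql, endpoint=ENDPOINT):
    """Percent-encode a SPARQL query into the endpoint's `query` parameter."""
    return endpoint + "?" + urlencode({"query": sparql}, quote_via=quote)


def query_from_uri(uri):
    """Recover the original query text from such a URI."""
    return parse_qs(urlsplit(uri).query)["query"][0]
```

Because the whole query lives in the URI, a result set can be bookmarked, shared, or embedded in a page like any other web resource.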
Example: Map of buildings
Interactive map of Open University Buildings in the UK
Built in 1 hour
Connected to Ordnance Survey for location based on post-codes
Allowed us to find out about issues in the data.
[Diagram. In data.open.ac.uk: building bat1 (name “Berrill building”) has address bat1-address with post-code mk76aa, and is divided into floors and spaces. In data.ordnancesurvey.co.uk: postcode mk76aa is in district Milton Keynes and county Buckinghamshire, with location lat 52.024924, long -0.709726.]
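The join behind the map can be sketched with two small dictionaries, one per data source. The Berrill building values are from the slide; the second building and its postcode are invented to show how a failed join exposes a problem in the data, and the function name `locate` is hypothetical.

```python
# From data.open.ac.uk (bat2 is a made-up entry with a postcode that will not resolve).
BUILDINGS = {
    "bat1": {"name": "Berrill building", "postcode": "mk76aa"},
    "bat2": {"name": "Hypothetical building", "postcode": "zz99zz"},
}
# From data.ordnancesurvey.co.uk, keyed by postcode.
POSTCODES = {
    "mk76aa": {
        "lat": 52.024924,
        "long": -0.709726,
        "district": "Milton Keynes",
        "county": "Buckinghamshire",
    },
}


def locate(buildings, postcodes):
    """Join buildings to coordinates through the shared postcode.
    Returns (located, missing): map markers, and building ids whose postcode is unknown."""
    located, missing = {}, []
    for building_id, building in buildings.items():
        place = postcodes.get(building["postcode"])
        if place is None:
            missing.append(building_id)
        else:
            located[building_id] = (building["name"], place["lat"], place["long"])
    return located, missing
```

The `missing` list is the by-product mentioned on the slide: linking to an external source surfaces records whose postcode is wrong or absent.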