Scientific Publishing Services (P) Ltd. Semantic Technology Opportunities Avinash Punekar – Scientific Publishing Services

Semantic Technology - CII · Semantic Technology It is… ² A Web-scale architecture ² A metadata technology ² A layer of meaning on the existing Web ² In use TODAY! Semantic

Jun 27, 2020


•Scientific Publishing Services (P) Ltd.

Semantic Technology

Opportunities

Avinash Punekar – Scientific Publishing Services


April 2011 | 2•Scientific Publishing Services (P) Ltd.

Semantic Technology


What is Semantic Technology?

• Semantic Web
• Web 3.0
• Linked Open Data / Linked Enterprise Data
• Web of Data
• Web of Things
• GGG – Giant Global Graph
• It is about using software to improve our understanding and use of information
• …And more!!!


Semantic Technology

It is all about DATA

• Semantic data is not only machine READABLE.
• It is machine UNDERSTANDABLE!

It is not…

• A software package
• Something that will ever “be complete”
• A replacement for the current Web
• A pipe dream
• A silver bullet


Semantic Technology

It is…

• A Web-scale architecture
• A metadata technology
• A layer of meaning on the existing Web
• In use TODAY!

Semantic enrichment is a process whereby text within a research or scholarly document is annotated with semantic metadata. It enables free text to be converted into a database of knowledge by extracting the concepts and linking them to related knowledge bases.
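
The enrichment step described above (extract concepts, then link them to a knowledge base) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the `annotate` function and the example.org URIs are hypothetical, and a real service would rely on a curated ontology and proper entity recognition rather than plain string matching.

```python
import re

# Hypothetical mini knowledge base: surface term -> concept URI (illustrative URIs only).
KNOWLEDGE_BASE = {
    "aspirin": "http://example.org/concept/aspirin",
    "headache": "http://example.org/concept/headache",
}


def annotate(text):
    """Return (matched term, start offset, concept URI) for each known term in text."""
    annotations = []
    for term, uri in KNOWLEDGE_BASE.items():
        # Whole-word, case-insensitive match of the term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            annotations.append((match.group(0), match.start(), uri))
    # Report annotations in reading order.
    return sorted(annotations, key=lambda a: a[1])


sample = "Aspirin is often taken to relieve a headache."
for term, offset, uri in annotate(sample):
    print(offset, term, uri)
```

Each annotation keeps the character offset, so the original text stays untouched and the metadata can be stored alongside it.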


Semantic Technology

Machine Understanding - How?

• By uniquely identifying THINGS
• By uniquely identifying RELATIONSHIPS
• By using TRIPLES


Semantic Technology

What is a THING?

A THING is anything that can be uniquely identified by a URI or a literal (string).

• Me → http://twitter.com/ericaxel
• My postal code → http://www.city-data.com/zips/90043.html
• The White House → Lat: 38.89859 Long: -77.035971
• L.A. County’s sales tax rate → 9.750 %
• [Image] → http://ericfranzon.com/operator.jpg


Semantic Technology

What is a RELATIONSHIP?

Something that uniquely connects two THINGS.

Person --- isFatherOf ---→ Person

<owl:ObjectProperty rdf:ID="isFatherOf">
  <rdfs:domain rdf:resource="#Person"/>
  <rdfs:range rdf:resource="#Person"/>
</owl:ObjectProperty>
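
The OWL fragment above gives the property a domain and a range, both the class Person. In OWL these are used for inference rather than validation: asserting the relationship implies that both ends are Persons. The Python sketch below illustrates that idea only; the `ObjectProperty` class, the `infer_types` function and the individuals "Anil" and "Ravi" are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectProperty:
    """A named relationship with a domain (subject class) and a range (object class)."""
    name: str
    domain_class: str
    range_class: str


def infer_types(prop, subject, obj):
    """Return the class memberships implied by asserting `subject prop obj`."""
    return {subject: prop.domain_class, obj: prop.range_class}


# Mirrors the declaration above: both ends of the relationship are Persons.
IS_FATHER_OF = ObjectProperty("isFatherOf", domain_class="Person", range_class="Person")

# "Anil" and "Ravi" are made-up individuals used only for illustration.
print(infer_types(IS_FATHER_OF, "Anil", "Ravi"))
```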


Semantic Technology

What is a TRIPLE?

book has title

This a

Thing --- Relationship ---→ Thing

Subject --- Predicate ---→ Object
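
A triple maps naturally onto a three-field record. The sketch below is a minimal illustration, assuming plain strings for URIs and literals; the `Triple` type, the `objects_of` helper, the example.org identifiers and the title literal are all made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate (the relationship), object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers under example.org; the title literal is invented.
BOOK = "http://example.org/book/1"
HAS_TITLE = "http://example.org/vocab/hasTitle"
HAS_AUTHOR = "http://example.org/vocab/hasAuthor"

triples = [
    Triple(BOOK, HAS_TITLE, "An Example Title"),            # object is a literal
    Triple(BOOK, HAS_AUTHOR, "http://example.org/person/1"),  # object is another THING
]


def objects_of(statements, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in statements if t.subject == subject and t.predicate == predicate]


print(objects_of(triples, BOOK, HAS_TITLE))
```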


Semantic Technology

Where is it now?


Semantic Technology

Technologies

• RDBMS – Data, Schema, Query Language
• Semantic – Data, Schema (Vocabularies), Query Language
• Data Language – Resource Description Framework (RDF)
• RDF is good for distributing data across the Web and pretending it’s in one place

http://plushbeautybar.com dc:creator http://www.ericaxel.com/foaf.rdf

http://www.geonames.org/maps/google_34.021_-118.396.html dc:location "N 34°1'16'' W 118°23'47''"

http://twitter.com/ericaxel foaf:knows "Dave McComb"
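
The "pretending it's in one place" point can be illustrated with sets of triples: because statements share global identifiers, data from separate sources can simply be unioned. The sketch below is an assumption-laden illustration; the split of the statements into `site_a` and `site_b` and the `merge` helper are hypothetical, and prefixed names are kept as plain strings.

```python
def merge(*graphs):
    """Union several sets of (subject, predicate, object) triples into one graph."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


# Two hypothetical sources; the statements themselves are taken from the examples above.
site_a = {
    ("http://twitter.com/ericaxel", "foaf:knows", "Dave McComb"),
}
site_b = {
    ("http://plushbeautybar.com", "dc:creator", "http://www.ericaxel.com/foaf.rdf"),
    ("http://twitter.com/ericaxel", "foaf:knows", "Dave McComb"),  # same statement again
}

combined = merge(site_a, site_b)
for triple in sorted(combined):
    print(triple)
```

A statement repeated by both sources collapses into one entry, since identical triples are the same fact.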


Semantic Technology

Vocabularies

• Ontologies
• Taxonomies
• Folksonomies


Semantic Technology

Some are ways of describing vocabularies:

• RDF: property – triple RELATIONSHIPS
• RDFS (RDF Schema)
• OWL (Web Ontology Language)

Some are controlled vocabularies like:

• Dublin Core
• SKOS (Simple Knowledge Organization System)
• SIOC (Semantically-Interlinked Online Communities)

Reuse or make up your own!


Semantic Technology

Query Language: SPARQL

• SPARQL
• Protocol
• And
• RDF
• Query
• Language


Semantic Components

The semantic web comprises the standards and tools of HTML5, XML, XML Schema, RDF, RDF Schema and OWL that are organized in the Semantic Web Stack. The OWL Web Ontology Language Overview describes the function and relationship of each of these components of the semantic web:

• XML provides an elemental syntax for content structure within documents, yet associates no semantics with the meaning of the content contained within.
• XML Schema is a language for providing and restricting the structure and content of elements contained within XML documents.
• RDF is a simple language for expressing data models, which refer to objects ("resources") and their relationships. An RDF-based model can be represented in XML syntax.
• RDF Schema extends RDF and is a vocabulary for describing properties and classes of RDF-based resources, with semantics for generalized hierarchies of such properties and classes.
• OWL adds more vocabulary for describing properties and classes: among others, relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes.
• SPARQL is a protocol and query language for semantic web data sources.
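
At its core, a SPARQL query matches triple patterns containing variables against the data. The Python sketch below imitates that idea for illustration only. It is not SPARQL and not any library's API: the `match` and `query` functions, the `ex:` identifiers and the sample graph are all assumptions. The patterns correspond roughly to `SELECT ?friend ?name WHERE { ex:alice foaf:knows ?friend . ?friend foaf:name ?name }`.

```python
def match(pattern, triple, bindings):
    """Extend `bindings` so that `pattern` matches `triple`; return None if impossible."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant in the pattern must match exactly.
            return None
    return extended


def query(graph, patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical data: prefixed names are kept as plain strings.
GRAPH = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:bob", "foaf:name", "Bob"),
]

results = query(GRAPH, [
    ("ex:alice", "foaf:knows", "?friend"),
    ("?friend", "foaf:name", "?name"),
])
print(results)
```

The shared variable `?friend` is what joins the two patterns, much as a join column would in an RDBMS query.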


Semantic Projects

DBpedia - DBpedia is an effort to publish structured data extracted from Wikipedia: the data is published in RDF and made available on the Web for use under the GNU Free Documentation License, thus allowing Semantic Web agents to provide inferencing and advanced querying over the Wikipedia-derived dataset and facilitating interlinking, re-use and extension in other data sources.

FOAF - A popular application of the semantic web is Friend of a Friend (FOAF), which uses RDF to describe the relationships people have to other people and the "things" around them. FOAF is an example of how the Semantic Web attempts to make use of the relationships within a social context.

GoodRelations for e-commerce - A huge potential for Semantic Web technologies lies in adding data structure and typed links to the vast amount of offer data, product model features, and tendering / request for quotation data. The GoodRelations ontology is a popular vocabulary for expressing product information, prices, payment options, etc. It also allows expressing demand in a straightforward fashion. GoodRelations has been adopted by BestBuy, Yahoo, OpenLink Software, O'Reilly Media, the Book Mashup, and many others.

NextBio - A database consolidating high-throughput life sciences experimental data tagged and connected via biomedical ontologies. NextBio is accessible via a search engine interface. Researchers can contribute their findings for incorporation into the database. The database currently supports gene or protein expression data and is steadily expanding to support other biological data types.


Web Ontology Language (OWL)

The Web Ontology Language (OWL) is a family of knowledge representation languages for authoring ontologies, endorsed by the World Wide Web Consortium. The languages are characterised by formal semantics and RDF/XML-based serializations for the Semantic Web. OWL has attracted academic, medical and commercial interest.

• Basic Formal Ontology, a formal upper ontology designed to support scientific research
• BioPAX, an ontology for the exchange and interoperability of biological pathway (cellular processes) data
• BMO, an e-Business Model Ontology based on a review of enterprise ontologies and business model literature
• CCO (Cell-Cycle Ontology), an application ontology that represents the cell cycle
• CContology, an e-business ontology to support online customer complaint management
• CIDOC Conceptual Reference Model, an ontology for cultural heritage
• COSMO, a Foundation Ontology designed to contain representations of all of the primitive concepts needed to logically specify the meanings of any domain entity. It is intended to serve as a basic ontology that can be used to translate among the representations in other ontologies or databases. It started as a merger of the basic elements of the OpenCyc and SUMO ontologies, and has been supplemented with other ontology elements (types, relations) so as to include representations of all of the words in the Longman dictionary defining vocabulary.


• Cyc, a large Foundation Ontology for formal representation of the universe of discourse
• Disease Ontology, designed to facilitate the mapping of diseases and associated conditions to particular medical codes
• DOLCE, a Descriptive Ontology for Linguistic and Cognitive Engineering
• Dublin Core, a simple ontology for documents and publishing
• Foundational, Core and Linguistic Ontologies
• Foundational Model of Anatomy, an ontology for human anatomy
• Gene Ontology for genomics
• GUM (Generalized Upper Model), a linguistically-motivated ontology for mediating between client systems and natural language technology
• NIFSTD Ontologies from the Neuroscience Information Framework: a modular set of ontologies for the neuroscience domain. See http://neuinfo.org


• OBO Foundry, a suite of interoperable reference ontologies in biomedicine
• Ontology for Biomedical Investigations, an open access, integrated ontology for the description of biological and clinical investigations
• OMNIBUS Ontology, an ontology of learning, instruction, and instructional design
• Plant Ontology for plant structures and growth/development stages, etc.
• POPE, Purdue Ontology for Pharmaceutical Engineering
• PRO, the Protein Ontology of the Protein Information Resource, Georgetown University
• Program abstraction taxonomy
• Protein Ontology for proteomics
• Systems Biology Ontology (SBO), for computational models in biology
• Many more… (ONIX, MARC, Dublin Core)


Semantic Technology

Why is it important to us?

• It is the future
• All major governments have made adoption mandatory
• All big businesses have adopted it
• The scope in all areas and especially in publishing is huge
• Fundamentally changes what we are doing
• Our customers have adopted it
• Presents new opportunities


Market

The market for semantic enrichment will be much larger. Some of the industries/sectors are:

• Publishing – Books, Journals
• Media & Entertainment
• Banking, Finance, Insurance
• Pharmaceutical
• Medical
• Government
• Legal


Semantic Opportunities

• Content Abstraction
• Technical Data Extraction
• Keyword/Semantic Indexing
• Bibliographic Data Management
• Editorial Services
• Taxonomy, Thesaurus, Ontology, Terminology
• Annotation, Recommendation Creation
• Semantic Tagging
• Semantic Linking
• Researched Linking
• Resource Repurposing


Content Abstraction

Content abstraction is the process of creating a condensed version of a full-text article or other technical or research document. An abstract gives the reader an indication of the core themes discussed in the full text. Publishers use it as a document surrogate to promote the delivery and sales of full-text documents.

• Indicative Abstracts - Describe what the article covers in terms of topic and methodology, without providing the key content present in the article. Examples: product reviews, book abstracts, etc.
• Informative Abstracts - Provide a condensed view of the entire content of the full-text document, drawing out the key topics and concepts covered. Examples: abstracts of technical articles, technical standards and specifications.
• Structured Abstracts - Abstracts created in a structured format with pre-defined headings that represent the way the full text is organized. Examples: abstracts of clinical trials and medical case reports. Here the abstracts follow the typical structure of Introduction/Background, Scope/Methods, Results/Discussion/Conclusion, based on the specific house style followed by the publisher or information provider.
• Enhanced/Value-Added Abstracts - Abstracts that pick out the key knowledge that is helpful for decision making, using domain expertise and inferences. Examples: English abstracts of patents in multiple languages that extract the key patentability parameters such as novelty, use and advantage; bottom-line summaries; clinical pearls, etc.


Technical Data Extraction

Technical data extraction is the process of extracting properties, attributes, metadata and conceptual entities from unstructured technical documents such as patents and non-patent technical literature. A few examples of data that can be extracted from typical chemical and life science documents are:

• Systematic Chemical Names (IUPAC Nomenclature), including variants in spelling and in the use of commas, periods, hyphens, parentheses, apostrophes, plusses, minuses and Greek symbols
• Common or Generic Names
• Trade Names
• Company Codes
• Abbreviations
• Fragmented Descriptors
• Molecular Formula
• Genetic Information
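
As a rough illustration of one item in the list above, molecular formulas can be picked out of free text with a pattern. The sketch below is deliberately naive and rests on assumptions: the `FORMULA` pattern and the `extract_formulas` function are hypothetical, the pattern does not check that tokens are real element symbols, and it ignores formulas without digits (such as NaCl) to avoid matching ordinary acronyms.

```python
import re

# Naive pattern: two or more element-like tokens (capital letter, optional lowercase
# letter, optional count), with at least one digit somewhere in the candidate.
FORMULA = re.compile(r"\b(?=[A-Za-z]*\d)(?:[A-Z][a-z]?\d*){2,}\b")


def extract_formulas(text):
    """Return candidate molecular formulas in order of appearance."""
    return FORMULA.findall(text)


sample = "Aspirin (C9H8O4) dissolves in water (H2O)."
print(extract_formulas(sample))
```

Systematic names, trade names and company codes generally need dictionaries and domain rules rather than a single pattern, so this covers only the simplest case.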


Keyword/Semantic Indexing

Indexing is a process in which the key descriptors that represent the core theme of an article or document are extracted and the article or document is tagged with those descriptors. The descriptors can be keywords that are actually present in the document (keyword indexing) or descriptors that represent the key concepts elaborated in the article but are not necessarily present in the document (semantic indexing). Some of the areas are:

• Journal Indexing
• Subject Category Indexing
• Image Indexing
• Medical Indexing and Coding/Evidence Based Rating
• Drug Indexing
• Chemical Structure Drawing and Indexing
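
The difference between keyword indexing and semantic indexing can be shown with a small sketch. Everything here is an assumption made for illustration: the `CONCEPTS` vocabulary, its descriptors and signal phrases, and the two functions. A real indexing service would use a maintained thesaurus or ontology and human review.

```python
# Hypothetical controlled vocabulary: descriptor -> phrases that signal the concept.
CONCEPTS = {
    "Myocardial Infarction": ["heart attack", "myocardial infarction"],
    "Hypertension": ["high blood pressure", "hypertension"],
}


def keyword_index(text, keywords):
    """Keyword indexing: keep only the keywords that literally occur in the text."""
    lowered = text.lower()
    return [k for k in keywords if k.lower() in lowered]


def semantic_index(text, concepts=CONCEPTS):
    """Semantic indexing: assign a descriptor when any of its signal phrases occurs."""
    lowered = text.lower()
    return [d for d, phrases in concepts.items() if any(p in lowered for p in phrases)]


sample = "The patient had a heart attack after years of high blood pressure."
print(keyword_index(sample, ["heart attack", "hypertension"]))
print(semantic_index(sample))
```

In the sample sentence the word "hypertension" never appears, so keyword indexing misses it, while semantic indexing still assigns the Hypertension descriptor.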


Bibliographic Data Management

It includes developing, validating, updating and editing bibliographic databases based on the cataloguing rules of leading bibliographic sources such as ISSN, OCLC and other leading catalogs. It should also include ONIX and RSS feeds.


Editorial Services

• Editorial Workflow Administration - Manage the entire manuscript-handling process, from peer reviewer selection and manuscript tracking to reminders to peer reviewers and style checking of manuscripts.
• Developmental Editing - Work in tandem with authors in editing and fine-tuning their manuscripts. Provide services such as fact checking and content enrichment to enhance the authenticity and readability of the manuscript.
• Content Editing
• Language Editing
• Technical Editing

Editorial Services for Business and Commercial News Services:

• Technical Editing
• Proofreading
• News Summaries
• Press Report Analysis
• Newsletters
• Media Monitoring
• Product and Service Descriptions


Taxonomy, Thesaurus, Ontology, Terminology

The offerings should include the following:

• Taxonomy development and maintenance
• Taxonomy Mapping/Integration
• Taxonomy expansion
• Semantic labeling of taxonomy nodes through ontology
• Development of niche taxonomies for medical specialties
• Automated content mining and vertical search solutions through the deployment of taxonomy and ontology
• Lexicon development - Word variants, Spelling variants, Morphological variants, Language variants
• Thesaurus development - Multilevel Broader and Narrower Terms Hierarchical Displays, Construction of Equivalent Terms (Synonyms), Construction of Associated Terms (Related Terms)
• Ontology Development - Conceptual definition for each node, Disambiguation of homonyms, Deconstruction of existing taxonomies and semantic labeling of taxonomy nodes
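
The thesaurus relations named above (broader, narrower, equivalent and associated terms) can be modelled with a small data structure. The sketch below is one possible shape, not a prescribed design: the `Term` class, the `add_narrower` and `ancestors` helpers and the sample medical terms are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry with hierarchical, equivalent and associated terms."""
    label: str
    broader: list = field(default_factory=list)     # more general terms
    narrower: list = field(default_factory=list)    # more specific terms
    equivalent: list = field(default_factory=list)  # synonyms
    related: list = field(default_factory=list)     # associated terms


def add_narrower(thesaurus, parent, child):
    """Link child under parent, keeping the broader/narrower relation reciprocal."""
    thesaurus.setdefault(parent, Term(parent)).narrower.append(child)
    thesaurus.setdefault(child, Term(child)).broader.append(parent)


def ancestors(thesaurus, label):
    """Walk broader terms upward to the top of the hierarchy."""
    found = []
    for parent in thesaurus[label].broader:
        found.append(parent)
        found.extend(ancestors(thesaurus, parent))
    return found


thesaurus = {}
add_narrower(thesaurus, "Disease", "Cardiovascular Disease")
add_narrower(thesaurus, "Cardiovascular Disease", "Myocardial Infarction")
thesaurus["Myocardial Infarction"].equivalent.append("Heart Attack")

print(ancestors(thesaurus, "Myocardial Infarction"))
```

Storing both directions of the hierarchy makes multilevel displays straightforward, at the cost of keeping the two lists in sync.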


Annotation/Recommendation Creation

Annotation Creation – During this process, the data from the databases is annotated semantically. The process makes the heterogeneous collection data syntactically and semantically interoperable.

Recommendation Creation - Rules that define more associative relations between different metadata items need to be created. These rules are based on the domain ontologies, the collection item annotations, and expert knowledge.
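
A recommendation rule of the kind described above can be sketched as follows. This is a minimal illustration under assumptions: the `ANNOTATIONS` and `ASSOCIATED` tables, the item names and the two functions are hypothetical, and the single rule used here (items are related when their concepts are identical or associated) stands in for whatever rules domain experts would actually define.

```python
# Hypothetical annotations: collection item -> set of concept labels.
ANNOTATIONS = {
    "article-1": {"Aspirin"},
    "article-2": {"Headache"},
    "article-3": {"Plant Anatomy"},
}

# Hypothetical expert rule drawn from a domain ontology: concept -> associated concepts.
ASSOCIATED = {"Aspirin": {"Headache"}}


def concepts_related(a, b):
    """True when two concepts are identical or associated in either direction."""
    return a == b or b in ASSOCIATED.get(a, set()) or a in ASSOCIATED.get(b, set())


def recommend(item, annotations=ANNOTATIONS):
    """Return the other items whose annotations are related to those of `item`."""
    mine = annotations[item]
    return sorted(
        other
        for other, theirs in annotations.items()
        if other != item and any(concepts_related(a, b) for a in mine for b in theirs)
    )


print(recommend("article-1"))
```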


Semantic Services @ SPS

• Semantic Tagging Services – We can offer our services for content transformation with semantic tagging.
• Semantic Linking Services – We can offer our services for semantic linking of the semantic tags with external objects, resources or databases.
• Researched Linking Services – In addition to the above service, we can also offer the services of our teams, which can research disparate information on the internet, consisting of the above objects, resources and databases, which can then be linked to the content.
• Resource Repurposing/Rebuilding Services – We can also offer our services for repurposing/rebuilding of resources and objects such as images, graphs, charts, tables, animations, audio, videos, etc.


Thank You

Avinash Punekar

[email protected]

Phone: + 91 – 91766 50335

Scientific Publishing Services