Tech Talk BIBFRAME Working Group 17 November 2015 XPath, XSLT, and XQuery Notes from the Library Juice Academy courses, “Introduction to XML” and “Transforming and Querying XML with XSLT and XQuery” Allison Jai O’Dell | [email protected] || Hikaru Nakano | [email protected] Douglas Smith | [email protected] || Gerald Langford | [email protected]

Notes from the Library Juice Academy courses on XPath, XSLT, and XQuery: University of Florida Libraries, BIBFRAME Working Group, Tech Talk, 17 November 2015

Jan 22, 2018


Tech Talk

BIBFRAME Working Group

17 November 2015

XPath, XSLT, and XQuery

Notes from the Library Juice Academy courses, “Introduction to XML” and “Transforming and Querying XML with XSLT and XQuery”

Allison Jai O’Dell | [email protected] || Hikaru Nakano | [email protected]

Douglas Smith | [email protected] || Gerald Langford | [email protected]


XML

“The Extensible Markup Language (XML) is a simple text-based format for representing structured information: documents, data, configuration, books, transactions, invoices, and much more.” -- http://www.w3.org/standards/xml/core


XML Example

<?xml version="1.0" encoding="UTF-8"?>
<bunnies xmlns:food="http://www.example.com/food">
  <bunny>
    <name>Frances</name>
    <breed>mini lop</breed>
    <gender>female</gender>
    <color>white with brown spots</color>
    <birth>January 10, 2009</birth>
    <food:fave>strawberries, parsley, cilantro, carrots</food:fave>
  </bunny>
  <bunny status="RBB">
    <name>Howard</name>
    <breed>mixed, dwarf</breed>
    <gender>male</gender>
    <color>light brown agouti</color>
    <birth>March 15, 2009</birth>
    <death>September 1, 2012</death>
  </bunny>
</bunnies>

• opening and closing tag

• case sensitive

• properly nested

• quoted attribute values

• opening XML declaration

• character encoding

• root element

• namespace declaration
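Several of the rules listed above can be checked mechanically. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree parser; the helper name is_well_formed and the short test strings are illustrative and do not come from the course notes.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative documents (assumptions, modeled on the bunnies example).
GOOD = '<bunnies><bunny status="RBB"><name>Howard</name></bunny></bunnies>'
BAD_CASE = "<bunnies><bunny><name>Howard</Name></bunny></bunnies>"     # tags are case sensitive
BAD_NESTING = "<bunnies><bunny><name>Howard</bunny></name></bunnies>"  # tags overlap
BAD_QUOTES = "<bunnies><bunny status=RBB/></bunnies>"                  # attribute value not quoted
BAD_PREFIX = "<bunnies><food:fave>carrots</food:fave></bunnies>"       # prefix never declared

for label, doc in [
    ("good", GOOD),
    ("case mismatch", BAD_CASE),
    ("bad nesting", BAD_NESTING),
    ("unquoted attribute", BAD_QUOTES),
    ("undeclared namespace prefix", BAD_PREFIX),
]:
    print(f"{label}: {is_well_formed(doc)}")
```

Only the first document parses; each of the others breaks one of the rules in the list.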


XPath

Selects nodes from an XML document

http://www.w3schools.com/xsl/xpath_intro.asp


XPath Examples

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

Select all the title nodes:

/bookstore/book/title

Select all the year nodes, regardless of path:

//year

Select the title of the first book:

/bookstore/book[1]/title

Select price nodes with price>35:

/bookstore/book[price>35]/price

Select the titles of books whose category attribute is "WEB":

/bookstore/book[@category="WEB"]/title

And you can use regular expressions (in XPath 2.0 and later, through functions such as matches())!

http://www.w3schools.com/xsl/xpath_examples.asp
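To try the path expressions above without an XML editor, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. This is an assumption about tooling, not something the course prescribed. ElementTree supports only a subset of XPath and evaluates paths relative to an element, so the leading /bookstore is dropped and the price>35 comparison is done in Python instead. The abbreviated BOOKSTORE string omits the author elements for brevity.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the bookstore example (authors omitted).
BOOKSTORE = """<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><year>2005</year><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><year>2005</year><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><year>2003</year><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><year>2003</year><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)  # root is the <bookstore> element

# /bookstore/book/title
all_titles = [t.text for t in root.findall("book/title")]

# //year
all_years = [y.text for y in root.findall(".//year")]

# /bookstore/book[1]/title
first_title = root.find("book[1]/title").text

# /bookstore/book[price>35]/price
# ElementTree has no numeric comparison in predicates, so filter in Python.
high_prices = [
    b.findtext("price") for b in root.findall("book")
    if float(b.findtext("price")) > 35
]

# /bookstore/book[@category="WEB"]/title
web_titles = [t.text for t in root.findall("book[@category='WEB']/title")]

print(all_titles)
print(all_years)
print(first_title)
print(high_prices)
print(web_titles)
```

A full XPath engine would accept the original expressions unchanged; the rewrites here only work around the limits of the standard-library subset.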


XSLT

eXtensible Stylesheet Language Transformations

Transforms XML documents into other documents

http://www.w3schools.com/xsl/default.asp


XSLT Overview

XSLT transforms an XML document into other kinds of documents: another XML document, a web document (for example, HTML, XHTML, or HTML5), or even plain text.

How it works: The XSLT processor uses XPath to navigate the source tree and identify nodes in it. It then checks whether each identified node matches a template that the user has defined. If a node matches a template, the transformation defined in that template is performed.
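The match-then-transform idea can be imitated in a few lines of Python. This is only an analogy, not how an XSLT processor is implemented: the TEMPLATES dictionary, the bunny_template function, and the sample data are hypothetical, and element names stand in for XPath match patterns.

```python
import xml.etree.ElementTree as ET


def bunny_template(node: ET.Element) -> str:
    """Hypothetical 'template': the transformation applied to each <bunny>."""
    return f"{node.findtext('name')} ({node.findtext('breed')})"


# Element name -> template function (a stand-in for xsl:template match="...").
TEMPLATES = {"bunny": bunny_template}


def apply_templates(node: ET.Element) -> list:
    """Walk the source tree; run a template wherever a node matches one."""
    template = TEMPLATES.get(node.tag)
    if template is not None:
        return [template(node)]
    # No template matched: keep descending into the children.
    results = []
    for child in node:
        results.extend(apply_templates(child))
    return results


SOURCE = (
    "<bunnies>"
    "<bunny><name>Frances</name><breed>mini lop</breed></bunny>"
    "<bunny><name>Howard</name><breed>mixed, dwarf</breed></bunny>"
    "</bunnies>"
)

output = apply_templates(ET.fromstring(SOURCE))
print(output)  # ['Frances (mini lop)', 'Howard (mixed, dwarf)']
```

The root element has no template of its own, so the walk continues into its children until the bunny elements match.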


Transformations using XSL

“A transformation in the XSLT language is expressed in the form of a stylesheet, whose syntax is well-formed XML …

“The term stylesheet reflects the fact that one of the important roles of XSLT is to add styling information to an XML source document, by transforming it into a document consisting of XSL formatting objects (see [Extensible Stylesheet Language (XSL)]), or into another presentation-oriented format such as HTML, XHTML, or SVG. However, XSLT is used for a wide range of transformation tasks, not exclusively for formatting and presentation applications.”

-- http://www.w3.org/TR/xslt20/#what-is-xslt


How is XSLT used?

“If you make a purchase on eBay, or buy a book at Amazon, chances are that pretty much everything you see on every Web page has been processed with XSLT. Use XSLT to process multiple XML documents and to produce any combination of text, HTML and XML output. XSLT support is shipped with all major computer operating systems today, as well as being built in to all major Web browsers.”

-- http://www.w3.org/standards/xml/transformation


XSLT Example: XML to XML

Input:

\\ad.ufl.edu\uflib\deptdata\Cataloging\Authorities_&_Metadata_Quality\BibFrame\Meeting20151117\pubmed_sample_xml_rev.docx

Output:

\\ad.ufl.edu\uflib\deptdata\Cataloging\Authorities_&_Metadata_Quality\BibFrame\Meeting20151117\pubmed_sample_xslt_out.docx


XSLT Example: XML to XML

XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <!-- Rebuild the ArticleSet root with simplified elements for each Article -->
  <xsl:template match="/ArticleSet">
    <xsl:element name="ArticleSet">
      <xsl:for-each select="Article">
        <xsl:element name="title"><xsl:value-of select="ArticleTitle"/></xsl:element>
        <!-- Copy the author list unchanged -->
        <xsl:copy-of select="AuthorList"/>
        <!-- Combine first and last page into a single pages element -->
        <xsl:element name="pages">
          <xsl:text>pages </xsl:text>
          <xsl:value-of select="FirstPage"/>
          <xsl:text>-</xsl:text>
          <xsl:value-of select="LastPage"/>
        </xsl:element>
        <!-- Fill in a DOI link only when the article has a DOI identifier -->
        <xsl:element name="link">
          <xsl:if test="ArticleIdList/ArticleId[@IdType='doi']">
            <xsl:text>http://dx.doi.org/</xsl:text>
            <xsl:value-of select="ArticleIdList/ArticleId[@IdType='doi']"/>
          </xsl:if>
        </xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
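For readers who want to trace the logic of the stylesheet step by step, the following Python sketch approximates the same steps with the standard-library xml.etree.ElementTree. It is not the course's method. The SAMPLE record is invented for illustration (the actual PubMed input file is not reproduced in these notes); only the element names are taken from the stylesheet.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical input: element names come from the stylesheet, values are made up.
SAMPLE = """<ArticleSet>
  <Article>
    <ArticleTitle>Hypothetical article title</ArticleTitle>
    <AuthorList><Author><LastName>Doe</LastName></Author></AuthorList>
    <FirstPage>10</FirstPage>
    <LastPage>19</LastPage>
    <ArticleIdList><ArticleId IdType="doi">10.0000/example</ArticleId></ArticleIdList>
  </Article>
</ArticleSet>"""


def transform(source: ET.Element) -> ET.Element:
    """Approximate the stylesheet: build a simplified ArticleSet."""
    out = ET.Element("ArticleSet")
    for article in source.findall("Article"):          # xsl:for-each select="Article"
        ET.SubElement(out, "title").text = article.findtext("ArticleTitle", default="")
        for author_list in article.findall("AuthorList"):
            out.append(copy.deepcopy(author_list))      # xsl:copy-of select="AuthorList"
        first = article.findtext("FirstPage", default="")
        last = article.findtext("LastPage", default="")
        ET.SubElement(out, "pages").text = f"pages {first}-{last}"
        link = ET.SubElement(out, "link")               # always emitted, like xsl:element
        doi = article.find("ArticleIdList/ArticleId[@IdType='doi']")
        if doi is not None:                             # xsl:if test="..."
            link.text = "http://dx.doi.org/" + (doi.text or "")
    return out


result = transform(ET.fromstring(SAMPLE))
print(ET.tostring(result, encoding="unicode"))
```

As in the stylesheet, the link element is always created and is left empty when an article has no DOI.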


XSLT Example: XML to HTML

Input:

<?xml-stylesheet type="text/xsl" href="quiz1.xsl"?>
<catalog>
  <type>Image Catalog</type>
  <image>
    <id>entry.0001</id>
    <preview>http://upload.wikimedia.org/wikipedia/commons/9/93/Waterhouse-sleep_and_his_half-brother_death-1874.jpg</preview>
    <title>Hypnos and Thanatos</title>
    <artist>John William Waterhouse</artist>
    <country>UK</country>
    <medium>Painting</medium>
    <year>1874</year>
    <subject>Greek Mythology</subject>
  </image>
</catalog>


XSLT Example: XML to HTML

XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <xsl:apply-templates select="catalog"/>
  </xsl:template>
  <xsl:template match="catalog">
    <html>
      <head>
        <title>Quiz 1</title>
        <style>
          body {background-color: #000000}
          h1 {color: #ffffff;
              font-family: verdana}
          h2 {color: #F6CEEC;
              font-family: verdana}
          p {color: #F6CEEC;
             font-family: verdana}
        </style>
      </head>
      <body>
        <xsl:for-each select="image">
          <p>
            <xsl:variable name="preview" select="preview"></xsl:variable>
            <img src="{$preview}" width="400px"/>
          </p>
          <h1>
            <xsl:value-of select="title"/>
          </h1>
          <h2>
            <b><xsl:value-of select="artist"/></b>
          </h2>
          <p>
            Country: <xsl:value-of select="country" /><br />
            Medium: <xsl:value-of select="medium" /><br />
            Date: <xsl:value-of select="year" /><br />
            Subject: <xsl:value-of select="subject" /><br />
          </p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>


XSLT Example: XML to HTML

Output:

http://allisonjai.com/lja/quiz1.xml


XQuery

Queries XML data

http://www.w3schools.com/xsl/xquery_intro.asp


XQuery Overview

XQuery has many uses, including:

• querying XML documents and data sources that can output XML documents

• combining data from multiple sources

• transforming data

• generating reports from XML data

• building web and application services over XML data

XSLT and XQuery overlap in utility, but in general XQuery is more effective at querying large structured and unstructured data sets and at deriving data from them.


FLWOR

XQuery combines path expressions (XPath), which access parts or fragments of XML data, with FLWOR ("for", "let", "where", "order by", "return") expressions, which process, join, and return data.

http://www.w3ctutorial.com/xquery-basic/xquery-flwor
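The five FLWOR clauses map closely onto an ordinary loop. The sketch below is a Python analogy rather than XQuery itself, run against an abbreviated copy of the bookstore data from the XPath slide; the query in the comment and the price threshold of 30 are illustrative assumptions, not the query from the talk.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the bookstore example (authors and years omitted).
BOOKSTORE = """<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>"""

root = ET.fromstring(BOOKSTORE)

# Roughly the XQuery being imitated (illustrative):
#   for $b in /bookstore/book
#   let $p := number($b/price)
#   where $p > 30
#   order by $b/title
#   return $b/title/text()
titles = []
for book in root.findall("book"):            # for:   iterate over the books
    price = float(book.findtext("price"))    # let:   bind a value to a variable
    if price > 30:                           # where: keep only matching items
        titles.append(book.findtext("title"))
titles.sort()                                # order by: sort the results
print(titles)                                # return: ['Learning XML', 'XQuery Kick Start']
```

Everyday Italian is excluded because its price is exactly 30.00, which is not greater than 30.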


XQuery Example

Input:


XQuery Example

Query:


XQuery Example

Output: