Consuming eXtensible Markup Language (XML) feeds
Page 1: Consuming eXtensible Markup Language (XML) feeds.

Consuming eXtensible Markup Language (XML) feeds

Page 2: Consuming eXtensible Markup Language (XML) feeds.

What is XML?

XML stands for eXtensible Markup Language.

XML is designed to transport and store data, with a focus on what the data is. HTML, by contrast, was designed to display data, with a focus on how the data looks.

XML tags are not predefined; you define your own. HTML docs use only the tags defined in the HTML standard.

XML does not do anything by itself. It was created to structure, store, and transport information.

XML is a software- and hardware-independent tool for carrying information.

<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 3: Consuming eXtensible Markup Language (XML) feeds.

How can XML be used?

Separate data from HTML: useful when displaying dynamic data in your HTML document.

Simplify data sharing and transport: XML is stored in plain text format, which makes data sharing software- and hardware-independent.

This greatly reduces the complexity of transporting data between incompatible applications.

Page 4: Consuming eXtensible Markup Language (XML) feeds.

XML tree

XML documents form a tree structure, starting at the root and branching to the leaves.

Example XML document:

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Page 5: Consuming eXtensible Markup Language (XML) feeds.

Tree representation of an XML doc: Example

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
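
To make the tree shape concrete, here is a minimal sketch that parses a shortened copy of the bookstore document with Java's built-in DOM parser (introduced later in these slides) and prints one line per element, indented by depth. The class name XmlTreeSketch and the helper names walk and outline are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlTreeSketch {
    // Shortened copy of the bookstore document: one book, two child elements.
    static final String XML =
        "<bookstore>"
      + "<book category=\"COOKING\">"
      + "<title lang=\"en\">Everyday Italian</title>"
      + "<year>2005</year>"
      + "</book>"
      + "</bookstore>";

    // Appends one line per element node, indented two spaces per tree level.
    static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
        }
        // Recurse into the children (text nodes have none).
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    // Parses the XML text and returns the indented element outline.
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints bookstore, then book, then title and year, each level indented further.
        System.out.print(outline(XML));
    }
}
```

The root element (bookstore) sits at depth 0, its book child one level down, and the leaves (title, year) at the deepest level.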

Page 6: Consuming eXtensible Markup Language (XML) feeds.

XML elements

An element can contain other elements, text, attributes, or a mix of the above.

<bookstore>
  <book category="CHILDREN">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>

Page 7: Consuming eXtensible Markup Language (XML) feeds.

Well-formed XML docs: the syntax rules

XML docs must have a root element.

XML elements must have a closing tag.

XML tags are case sensitive.

XML elements must be properly nested.

XML attribute values must be quoted.

Properly nested elements:

<b><i>This text is bold and italic</i></b>

Opening and closing tags that differ in case (not well-formed):

<Message>This is incorrect</message>

Quoted attribute value:

<note date="12/11/2007">
  <to>Tove</to>
  <from>Jani</from>
</note>
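
One way to try these rules out is to hand a snippet to a parser and see whether it is accepted. The following is a minimal sketch using Java's built-in DOM parser; the class name WellFormedSketch and the method isWellFormed are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedSketch {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them;
            // the JDK's built-in handler typically also logs them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input: a syntax rule was violated.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Quoted attribute, matching tags, single root: accepted.
        System.out.println(isWellFormed(
            "<note date=\"12/11/2007\"><to>Tove</to><from>Jani</from></note>"));
        // Tag names differ in case: rejected.
        System.out.println(isWellFormed("<Message>This is incorrect</message>"));
        // Improperly nested elements: rejected.
        System.out.println(isWellFormed("<b><i>text</b></i>"));
        // Unquoted attribute value: rejected.
        System.out.println(isWellFormed("<note date=12/11/2007></note>"));
    }
}
```

Note that this only checks well-formedness (syntax); it does not validate the document against a DTD or schema.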

Page 8: Consuming eXtensible Markup Language (XML) feeds.

Document Object Model (DOM)

DOM is a tree structure in which each node contains one of the components of an XML structure.

The two most common node types are element nodes and text nodes.

DOM provides an API for processing XML files:

1. Instantiate the factory
2. Create a document builder
3. Get a parser and parse the file
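
The three steps above can be sketched with the JAXP classes that ship with Java. This is a minimal illustration, not the only way to do it; the class name DomStepsSketch is made up, and the XML is parsed from a string so the example needs no input file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomStepsSketch {
    static Document parse(String xml) {
        try {
            // Step 1: instantiate the factory.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Step 2: create a document builder (the DOM parser).
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Step 3: parse the input into an in-memory DOM tree.
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<note><to>Tove</to><from>Jani</from></note>");
        // The root element of the parsed tree.
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "note"
    }
}
```

To read from disk instead, DocumentBuilder also offers parse(File), for example builder.parse(new File("note.xml")), where note.xml is a hypothetical file name.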

Page 9: Consuming eXtensible Markup Language (XML) feeds.

DOM Nodes

Node                   nodeName                   nodeValue                            attributes
Attr                   Name of attribute          Value of attribute                   null
CDATASection           #cdata-section             Content of the CDATA section         null
Comment                #comment                   Content of the comment               null
Document               #document                  null                                 null
DocumentFragment       #document-fragment         null                                 null
DocumentType           Document type name         null                                 null
Element                Tag name                   null                                 NamedNodeMap
Entity                 Entity name                null                                 null
EntityReference        Name of entity referenced  null                                 null
Notation               Notation name              null                                 null
ProcessingInstruction  Target                     Entire content excluding the target  null
Text                   #text                      Content of the text node             null
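
A few rows of this table can be checked directly. The sketch below parses a tiny document and prints nodeName and nodeValue for a Document, an Element, an Attr, a Comment, and a Text node. The class name NodeTableSketch, the helper describe, and the sample comment text are assumptions made for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeTableSketch {
    // One element with an attribute, a comment child, and a text child.
    static final String XML = "<note date=\"12/11/2007\"><!--hi-->Tove</note>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Formats a node as "nodeName = nodeValue".
    static String describe(Node node) {
        return node.getNodeName() + " = " + node.getNodeValue();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Element note = doc.getDocumentElement();
        System.out.println(describe(doc));                  // #document = null
        System.out.println(describe(note));                 // note = null
        System.out.println(describe(
            note.getAttributes().getNamedItem("date")));    // date = 12/11/2007
        System.out.println(describe(note.getFirstChild())); // #comment = hi
        System.out.println(describe(note.getLastChild()));  // #text = Tove
    }
}
```

The element's own nodeValue is null; its text content lives in the child Text node.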

Page 10: Consuming eXtensible Markup Language (XML) feeds.

Classes for Processing XML files

Document
Represents the entire XML document, providing primary access to the document's data.
Methods:
  getElementsByTagName(String tagname): returns a NodeList of all elements with the given tag name.

Node
Represents a single node in the document tree.
Methods:
  getNodeName() / getNodeValue(): return the name / value of the node as a string, depending on its type.
  getFirstChild() / getLastChild() / getChildNodes(): return the first child, the last child, or a NodeList of all children.
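
As a rough illustration of these Document and Node methods, the sketch below collects the text of every title element in a shortened bookstore document. The class name TitleListSketch and the method titles are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TitleListSketch {
    // Shortened bookstore document with two books.
    static final String XML =
        "<bookstore>"
      + "<book category=\"CHILDREN\"><title>Harry Potter</title></book>"
      + "<book category=\"WEB\"><title>Learning XML</title></book>"
      + "</bookstore>";

    // Returns the text of every <title> element, in document order.
    static List<String> titles(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> result = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagName("title");
            for (int i = 0; i < nodes.getLength(); i++) {
                Node title = nodes.item(i);
                // The element's nodeValue is null; the text is in its first child.
                result.add(title.getFirstChild().getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles(XML)); // [Harry Potter, Learning XML]
    }
}
```

This sketch assumes every title element has a text child; an empty title would make getFirstChild() return null.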

Page 11: Consuming eXtensible Markup Language (XML) feeds.

Classes for Processing XML files (continued)

NodeList
Ordered collection of nodes, where items are accessible via an integral index.
Methods:
  Node item(int index): returns the Node at index.

NamedNodeMap
Collection of nodes that can be accessed by name.
Methods:
  Node item(int index)
  Node getNamedItem(String name)
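
The difference between the two collections can be sketched as follows: a NodeList is walked by index, while a NamedNodeMap (here, an element's attributes) supports both index and name lookup. The class name NodeCollectionsSketch and the helper attributesOf are made up, and putting a lang attribute on book is an assumption for illustration only (the slides put it on title).

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeCollectionsSketch {
    static final String XML =
        "<bookstore>"
      + "<book category=\"CHILDREN\" lang=\"en\"/>"
      + "<book category=\"WEB\" lang=\"en\"/>"
      + "</bookstore>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Copies an element's attributes into a map sorted by attribute name.
    // A NamedNodeMap has no guaranteed order, so the TreeMap fixes one.
    static Map<String, String> attributesOf(Node element) {
        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i); // access by index
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) {
        NodeList books = parse(XML).getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            System.out.println(attributesOf(books.item(i)));
        }
        // Access by name on the second book's attribute map.
        Node category = books.item(1).getAttributes().getNamedItem("category");
        System.out.println(category.getNodeValue()); // WEB
    }
}
```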

Page 12: Consuming eXtensible Markup Language (XML) feeds.

Classes for Processing XML files (cont’d)

Element
Represents an element in XML that may have attributes associated with it.
Has methods to retrieve attributes by name, either as a value or as a node:
  String getAttribute(String name): retrieves an attribute value by name.
  Attr getAttributeNode(String name): retrieves an attribute node by name.

Attr
Represents an attribute in an Element object.
Methods:
  String getName()
  String getValue()
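
A short sketch of the Element and Attr methods, again using the built-in DOM parser. The class name BookAttrSketch and the helper firstBook are hypothetical; the attribute name "missing" is simply one that does not exist in the sample.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class BookAttrSketch {
    static final String XML =
        "<bookstore><book category=\"COOKING\"><year>2005</year></book></bookstore>";

    // Parses the XML and returns the first <book> element.
    static Element firstBook(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Nodes returned for a tag name are elements, so the cast is safe.
            return (Element) doc.getElementsByTagName("book").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element book = firstBook(XML);

        // Attribute value by name.
        System.out.println(book.getAttribute("category")); // COOKING

        // Attribute node by name, then its name and value.
        Attr category = book.getAttributeNode("category");
        System.out.println(category.getName() + "=" + category.getValue()); // category=COOKING

        // An absent attribute yields an empty string, or null for the node form.
        System.out.println(book.getAttribute("missing").isEmpty());    // true
        System.out.println(book.getAttributeNode("missing") == null);  // true
    }
}
```

Because getAttribute returns an empty string for an absent attribute, getAttributeNode (or hasAttribute) is the safer way to tell "absent" from "present but empty".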

Page 13: Consuming eXtensible Markup Language (XML) feeds.

JDOM

Page 14: Consuming eXtensible Markup Language (XML) feeds.

What is JDOM?

JDOM is a third-party Java library for accessing, manipulating, and outputting XML data.

It has an easy-to-use API. HTML documentation for the API can be found at:

http://www.jdom.org/docs/apidocs/index.html

It is lightweight and fast compared to the DOM parsing library offered by Java.

It can be downloaded from: http://www.jdom.org/downloads/index.html

Page 15: Consuming eXtensible Markup Language (XML) feeds.

JDOM: example

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

public class ReadXMLFile {
    public static void main(String[] args) {
        SAXBuilder builder = new SAXBuilder();
        File xmlFile = new File("staff.xml");

        try {
            // Parse the file and get the root element of the document.
            Document document = builder.build(xmlFile);
            Element rootNode = document.getRootElement();

            // Each <staff> child of the root describes one staff member.
            List list = rootNode.getChildren("staff");

            for (int i = 0; i < list.size(); i++) {
                Element node = (Element) list.get(i);

                // getChildText returns the text of the named child element.
                System.out.println("First Name : " + node.getChildText("firstname"));
                System.out.println("Last Name : " + node.getChildText("lastname"));
                System.out.println("Nick Name : " + node.getChildText("nickname"));
                System.out.println("Salary : " + node.getChildText("salary"));
            }
        } catch (IOException io) {
            // The file could not be read.
            System.out.println(io.getMessage());
        } catch (JDOMException jdomex) {
            // The file could not be parsed as XML.
            System.out.println(jdomex.getMessage());
        }
    }
}