
XML DotNet Lecture Notes

Apr 11, 2015


Department of Computer Science and Engineering
University of South Carolina
Columbia, SC 29208

CSCE 547
Windows Programming

XML Support

CSCE 547, Fall 2002, Ch 13

Why XML?

XML stands for eXtensible Markup Language.

XML is not an extension of HTML; both derive from SGML. XML is designed to express the structure of data, and leaves information about how to render the data to stylesheets such as XSL.

Some organizations are defining standards that use XML to express the semantics of their domains (healthcare, automotive, security and the military).

Why XML? Because:

1. It is just text, readable by humans and by any OS (Linux, Mac OS, WinTel, etc.)

2. It has become the de facto standard, adopted by everybody who is somebody wishing to communicate data over the WWW

This chapter discusses .NET support for XML.


XML encourages the separation of interface from structured data, allowing the seamless integration of data from diverse sources, and providing the infrastructure to create N-tier architectures.


XML Documents

XML documents can be described in terms of their logical and physical structure.

The logical structure is a function of the XML elements and attributes contained in the document.

The physical structure is the set of storage units in which the document actually exists. These units, called entities, could be a stream of characters or a file (or set of files).

XML documents contain two parts, called the header and the content.

Typically, the header contains declarations or processing instructions (commands for the XML processor).


XML Documents Can Contain

• Processing Instructions (aka PIs), delimited by <? . . . ?>
• Declarations, in the form <! aDeclaration >
• Elements
• Attributes
• Entities
• Comments

Typically, you will include declarations and/or processing instructions in the header.


Processing Instructions and Declarations

Processing instructions (<?xml . . . ?> and similar):

<?xml version="1.0"?>
<?xml-stylesheet href="XSL\DotNet.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="XSL\DotNet.wml.xsl" type="text/xsl" media="wap"?>
<?cocoon-process type="xslt"?>

<?xml-stylesheet type="text/xsl" href="Guitars.xsl"?>
<?xml version="1.0" encoding="UTF-16"?>

Declarations:

<!DOCTYPE DotNetXML:Book SYSTEM "DTD\DotNetXML.dtd">
<!NOTATION PNG SYSTEM "program.exe">
<!ATTLIST . . . >

<!ENTITY AGRAPH SYSTEM "file.png" NDATA PNG>
<!ENTITY memoText "blablabla">
<memo> &memoText; </memo>

& is the reference notation: &memoText; refers to the entity declared above.


XML Elements

XML elements are made up of a start tag, an end tag, and data in between. The start and end tags describe the data or value of the element:

<Student> Anita Donut </Student>
<CarDriver> Anita Donut </CarDriver>
<BloodDonor> Anita Donut </BloodDonor>

Elements can be empty, e.g.,

<memo> </memo>

An empty element mostly makes sense as a carrier for attributes. The preferred way to write it is:

<memo />

Attributes define properties for an element. XML elements can contain one or more attributes.


XML Elements

The XML tree in Figure 13-1 was produced by the code below:

<?xml version="1.0"?>
<Guitars>
  <Guitar>
    <Make>Gibson</Make>
    <Model>SG</Model>
    <Year>1977</Year>
    <Color>Tobacco Sunburst</Color>
    <Neck>Rosewood</Neck>
  </Guitar>
  <Guitar>
    <Make>Fender</Make>
    <Model>Stratocaster</Model>
    <Year></Year>
    <Color>Black</Color>
    <Neck>Maple</Neck>
  </Guitar>
</Guitars>

Using attributes:

<Guitar Year="1977">
  <Make>Gibson</Make>
  <Model>SG</Model>
  <Color>Tobacco Sunburst</Color>
  <Neck>Rosewood</Neck>
</Guitar>

<Guitar Image="MySG.jpeg">
  <Make>Gibson</Make>
  <Model>SG</Model>
  <Year>1977</Year>
  <Color>Tobacco Sunburst</Color>
  <Neck>Rosewood</Neck>
</Guitar>


Name Spaces

XML uses name spaces to avoid name collisions, such that, e.g., gibson:Color and fender:Color may refer to different elements.

<?xml version="1.0"?>
<win:Guitars
    xmlns:win="http://www.wintellect.com/classic-guitars"
    xmlns:gibson="http://www.gibson.com/finishes"
    xmlns:fender="http://www.fender.com/finishes">
  <win:Guitar>
    <win:Make>Gibson</win:Make>
    <win:Model>SG</win:Model>
    <win:Year>1977</win:Year>
    <gibson:Color>Tobacco Sunburst</gibson:Color>
    <win:Neck>Rosewood</win:Neck>
  </win:Guitar>
  <win:Guitar>
    <win:Make>Fender</win:Make>
    <win:Model>Stratocaster</win:Model>
    <win:Year>1990</win:Year>
    <fender:Color>Black</fender:Color>
    <win:Neck>Maple</win:Neck>
  </win:Guitar>
</win:Guitars>
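
A minimal sketch of how namespace-qualified elements might be queried from C#, assuming the System.Xml DOM classes covered later in these notes. The class name NamespaceDemo, the method FirstGibsonColor and the short prefixes "w" and "g" are hypothetical; only the namespace URIs have to match the document.

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Returns the text of the first gibson:Color element, or null if there is none.
    public static string FirstGibsonColor(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // The prefixes registered here are local to the query (hypothetical names);
        // what is matched against the document is the namespace URI.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("w", "http://www.wintellect.com/classic-guitars");
        ns.AddNamespace("g", "http://www.gibson.com/finishes");

        XmlNode node = doc.SelectSingleNode("/w:Guitars/w:Guitar/g:Color", ns);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        string xml =
            "<win:Guitars xmlns:win='http://www.wintellect.com/classic-guitars' " +
            "xmlns:gibson='http://www.gibson.com/finishes'>" +
            "<win:Guitar><win:Make>Gibson</win:Make>" +
            "<gibson:Color>Tobacco Sunburst</gibson:Color></win:Guitar>" +
            "</win:Guitars>";
        Console.WriteLine(FirstGibsonColor(xml));
    }
}
```

Because the query matches on URIs, the same code should also work on a document that uses a default name space instead of the win: prefix.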


Default Name Spaces

A default name space is declared with no prefix (xmlns="..."). Unprefixed elements then belong to it, so the XML below has the same content as the XML in the previous slide.

<?xml version="1.0"?>
<Guitars
    xmlns="http://www.wintellect.com/classic-guitars"
    xmlns:gibson="http://www.gibson.com/finishes"
    xmlns:fender="http://www.fender.com/finishes">
  <Guitar>
    <Make>Gibson</Make>
    <Model>SG</Model>
    <Year>1977</Year>
    <gibson:Color>Tobacco Sunburst</gibson:Color>
    <Neck>Rosewood</Neck>
  </Guitar>
  <Guitar>
    <Make>Fender</Make>
    <Model>Stratocaster</Model>
    <Year>1990</Year>
    <fender:Color>Black</fender:Color>
    <Neck>Maple</Neck>
  </Guitar>
</Guitars>

The first xmlns attribute, the one without a prefix, declares the default name space.


Document Validation

"Well-formed" documents satisfy XML syntactic rules. Well-formed documents may be validated against schema documents, which define in great detail how elements in the document must be written.

<?xml version="1.0"?>
<xsd:schema id="Guitars" xmlns=""
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Guitars">
    <xsd:complexType>
      <xsd:choice maxOccurs="unbounded">
        <xsd:element name="Guitar">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Make" type="xsd:string" />
              <xsd:element name="Model" type="xsd:string" />
              <xsd:element name="Year" type="xsd:gYear" minOccurs="0" />
              <xsd:element name="Color" type="xsd:string" minOccurs="0" />
              <xsd:element name="Neck" type="xsd:string" minOccurs="0" />
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

• The xsd:schema root element marks this document as a schema.
• http://www.w3.org/2001/XMLSchema: as of 2001, this was the mother of all schemas.
• The names prefixed with xsd: (shown in red on the slide) come from the XMLSchema document.


Parsing XML

There are two main APIs for XML parsers: DOM and SAX. The differences are significant. DOM parsers assume that the entire document resides in memory, while SAX parsers do their work under an event-driven model.

DOM offers the advantage of random access, while SAX offers advantages derived from the event-driven style of processing.

Microsoft offers a DOM-based parser, MSXML.dll, as part of IE in Windows.

The DOM tree of Figure 13-2 can be produced by:

<?xml version="1.0"?>
<Guitars>
  <Guitar Image="MySG.jpeg">
    <Make>Gibson</Make>
    <Model>SG</Model>
    <Year>1977</Year>
    <Color>Tobacco Sunburst</Color>
    <Neck>Rosewood</Neck>
  </Guitar>
</Guitars>

<?xml version="1.0"?>
<Guitars>
  <Guitar Image="MySG.jpeg">
    <Make>Gibson</Make>
    <Model>SG</Model>
    <Year>1977</Year>
    <Color>Tobacco Sunburst</Color>
    <Neck>Rosewood</Neck>
  </Guitar>
  <Guitar Image="MyStrat.jpeg" PreviousOwner="Eric Clapton">
    <Make>Fender</Make>
    <Model>Stratocaster</Model>
    <Year>1990</Year>
    <Color>Black</Color>
    <Neck>Maple</Neck>
  </Guitar>
</Guitars>


ReadXML.CPP

This sample code reads XML using MSXML.dll.

Although the code is great fun to decipher, not everyone enjoys doing so :(

The crucial code is:

// Create a COM object to host the parser in the memory of this process
hr = CoCreateInstance (CLSID_DOMDocument, NULL, CLSCTX_INPROC_SERVER,
    IID_IXMLDOMDocument, (void**) &pDoc);

// Use the parser to load the XML document from a file
hr = pDoc->load (var, &success);

// Get the elements with the given tag into pNodeList
hr = pDoc->getElementsByTagName (tag, &pNodeList);


ReadXML.CS

The code below also reads the Guitars.xml file, and writes to the console the make and model of each "Guitar" element.

The entire code is:

using System;
using System.Xml;

class MyApp
{
    static void Main ()
    {
        XmlDocument doc = new XmlDocument ();
        doc.Load ("Guitars.xml");
        XmlNodeList nodes = doc.GetElementsByTagName ("Guitar");
        foreach (XmlNode node in nodes) {
            Console.WriteLine ("{0} {1}", node["Make"].InnerText,
                node["Model"].InnerText);
        }
    }
}


XmlDocument Class

This class is compatible with DOM level 2. Using the class is straightforward, even for discovering the contents of the nodes in the document.

XmlDocument doc = new XmlDocument ();
doc.Load ("Guitars.xml");
OutputNode (doc.DocumentElement);
...

void OutputNode (XmlNode node)
{
    Console.WriteLine ("Type={0}\tName={1}\tValue={2}",
        node.NodeType, node.Name, node.Value);

    if (node.HasChildNodes) {
        XmlNodeList children = node.ChildNodes;
        foreach (XmlNode child in children)
            OutputNode (child);
    }
}

• XmlNode is a class that contains type, name and value information.
• The XML classes used here are defined in the System.Xml name space.
• doc.DocumentElement points to the root element once the document is loaded.
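
One detail of the type/name/value triple is easy to miss: for an element node, Value is null, and the text lives in a child node of type Text. The sketch below illustrates this on an inline document; the class name NodeValueDemo and the helper Describe are hypothetical, and the "(null)" placeholder is only there to make the missing value visible.

```csharp
using System;
using System.Xml;

public static class NodeValueDemo
{
    // Formats a node as Type:Name:Value, printing "(null)" when Value is null.
    public static string Describe(XmlNode node)
    {
        return string.Format("{0}:{1}:{2}",
            node.NodeType, node.Name, node.Value ?? "(null)");
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<Guitar><Make>Gibson</Make></Guitar>");

        XmlNode make = doc.DocumentElement.FirstChild;
        Console.WriteLine(Describe(make));            // the <Make> element itself
        Console.WriteLine(Describe(make.FirstChild)); // the text node inside it
    }
}
```

This is why the earlier ReadXML.CS listing uses InnerText rather than Value to get the text of an element.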


Inspecting Attributes

A node may have a collection named Attributes, which may contain XmlAttribute items, which in turn may contain type, name and value.

void OutputNode (XmlNode node)
{
    Console.WriteLine ("Type={0}\tName={1}\tValue={2}",
        node.NodeType, node.Name, node.Value);

    // Attributes and XmlAttribute
    if (node.Attributes != null) {
        foreach (XmlAttribute attr in node.Attributes)
            Console.WriteLine ("Type={0}\tName={1}\tValue={2}",
                attr.NodeType, attr.Name, attr.Value);
    }

    // HasChildNodes and ChildNodes
    if (node.HasChildNodes) {
        foreach (XmlNode child in node.ChildNodes)
            OutputNode (child);
    }
}


XmlTextReader

This class is a forward-only reader which, like the ADO.NET DataReader class, provides a fast mechanism for traversing an XML document.

XmlTextReader reader = null;
try {
    reader = new XmlTextReader ("Guitars.xml");
    reader.WhitespaceHandling = WhitespaceHandling.None;
    while (reader.Read ()) {
        if (reader.NodeType == XmlNodeType.Element &&
            reader.Name == "Guitar" &&
            reader.AttributeCount > 0) {
            while (reader.MoveToNextAttribute ()) {
                if (reader.Name == "Image") {
                    Console.WriteLine (reader.Value);
                    break;
                }
            }
        }
    }
}
finally {
    if (reader != null)
        reader.Close ();
}
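
The same forward-only traversal can be sketched a little more compactly. This version is an assumption about how one might write it on later versions of .NET: it uses XmlReader.Create and a using block instead of XmlTextReader with try/finally, and GetAttribute instead of the MoveToNextAttribute loop. The class name ReaderDemo and the method ImageNames are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderDemo
{
    // Collects the Image attribute of every Guitar element, in document order.
    public static List<string> ImageNames(TextReader input)
    {
        var images = new List<string>();
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };

        // The using block closes the reader even if Read() throws.
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Guitar")
                {
                    string image = reader.GetAttribute("Image"); // null when absent
                    if (image != null)
                        images.Add(image);
                }
            }
        }
        return images;
    }

    public static void Main()
    {
        string xml =
            "<Guitars>" +
            "<Guitar Image='MySG.jpeg'><Make>Gibson</Make></Guitar>" +
            "<Guitar><Make>Fender</Make></Guitar>" +
            "</Guitars>";
        foreach (string name in ImageNames(new StringReader(xml)))
            Console.WriteLine(name);
    }
}
```

Taking a TextReader rather than a file name keeps the sketch testable without a Guitars.xml file on disk.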


XmlValidatingReader

Hopefully you guessed it: this class performs validation while reading. Validation can be against schemas of type DTD, XSD or XDR.

using System;
using System.Xml;
using System.Xml.Schema;

class MyApp
{
    static void Main (string[] args)
    {
        if (args.Length < 2) {
            Console.WriteLine ("Syntax: VALIDATE xmldoc schemadoc");
            return;
        }

        XmlValidatingReader reader = null;
        try {
            XmlTextReader nvr = new XmlTextReader (args[0]);
            nvr.WhitespaceHandling = WhitespaceHandling.None;
            reader = new XmlValidatingReader (nvr);
            reader.Schemas.Add (GetTargetNamespace (args[1]), args[1]);
            reader.ValidationEventHandler +=
                new ValidationEventHandler (OnValidationError);
            while (reader.Read ());
        }
        catch (Exception ex) {
            Console.WriteLine (ex.Message);
        }
        finally {
            if (reader != null)
                reader.Close ();
        }
    }

    // GetTargetNamespace and OnValidationError are not shown on the slide.
}

Without a ValidationEventHandler, the reader throws an exception when invalid elements are found; with the handler attached, as here, errors are reported to OnValidationError instead.
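
Since the listing on the slide depends on helpers that are not shown, here is a self-contained sketch of XSD validation. It is not the chapter's code: it assumes a later version of .NET, where validation is configured through XmlReaderSettings and an XmlSchemaSet rather than XmlValidatingReader, and it uses a cut-down inline schema. The class name ValidateDemo and the method CountErrors are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidateDemo
{
    // A reduced version of the Guitars schema, inlined so the sketch needs no files.
    const string Schema =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>" +
        "<xsd:element name='Guitars'><xsd:complexType><xsd:sequence>" +
        "<xsd:element name='Guitar' maxOccurs='unbounded'><xsd:complexType><xsd:sequence>" +
        "<xsd:element name='Make' type='xsd:string'/>" +
        "<xsd:element name='Year' type='xsd:gYear' minOccurs='0'/>" +
        "</xsd:sequence></xsd:complexType></xsd:element>" +
        "</xsd:sequence></xsd:complexType></xsd:element></xsd:schema>";

    // Reads the whole document and returns how many validation errors were reported.
    public static int CountErrors(string xml)
    {
        int errors = 0;
        var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };

        // A null target namespace means "use the one declared in the schema itself".
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Schema)));

        // With a handler attached, errors are counted instead of thrown.
        settings.ValidationEventHandler += (sender, e) => errors++;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        string good = "<Guitars><Guitar><Make>Gibson</Make><Year>1977</Year></Guitar></Guitars>";
        string bad = "<Guitars><Guitar><Make>Gibson</Make><Year>seventy-seven</Year></Guitar></Guitars>";
        Console.WriteLine("good: {0} error(s)", CountErrors(good));
        Console.WriteLine("bad: {0} error(s)", CountErrors(bad));
    }
}
```

Validation happens as a side effect of Read(), which is why the loop body is empty, just like the while (reader.Read ()); statement in the listing above.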


XmlTextWriter

This class has methods for writing elements, attributes, comments, etc., to an XML document.

XmlTextWriter writer = null;
try {
    writer = new XmlTextWriter ("Guitars.xml", System.Text.Encoding.Unicode);
    writer.Formatting = Formatting.Indented;

    writer.WriteStartDocument ();
    writer.WriteStartElement ("Guitars");
    writer.WriteStartElement ("Guitar");
    writer.WriteAttributeString ("Image", "MySG.jpeg");
    writer.WriteElementString ("Make", "Gibson");
    writer.WriteElementString ("Model", "SG");
    writer.WriteElementString ("Year", "1977");
    writer.WriteElementString ("Color", "Tobacco Sunburst");
    writer.WriteElementString ("Neck", "Rosewood");
    writer.WriteEndElement ();
    writer.WriteEndElement ();
}
finally {
    if (writer != null)
        writer.Close ();
}

The resulting document:

<?xml version="1.0" encoding="utf-16"?>
<Guitars>
  <Guitar Image="MySG.jpeg">
    <Make>Gibson</Make>
    <Model>SG</Model>
    <Year>1977</Year>
    <Color>Tobacco Sunburst</Color>
    <Neck>Rosewood</Neck>
  </Guitar>
</Guitars>
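
One practical reason to prefer a writer class over string concatenation is that it escapes reserved characters. The sketch below writes to an in-memory StringWriter instead of a file and passes a value containing an ampersand; the class name WriterDemo, the method BuildGuitarXml and the sample make are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WriterDemo
{
    // Builds a one-guitar document in memory and returns it as a string.
    public static string BuildGuitarXml(string make, string model)
    {
        var text = new StringWriter();
        using (var writer = new XmlTextWriter(text))
        {
            writer.Formatting = Formatting.Indented;
            writer.WriteStartDocument();
            writer.WriteStartElement("Guitars");
            writer.WriteStartElement("Guitar");
            writer.WriteElementString("Make", make);   // reserved characters are escaped
            writer.WriteElementString("Model", model);
            writer.WriteEndElement();                  // </Guitar>
            writer.WriteEndDocument();                 // closes any element still open
        }
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildGuitarXml("Fender & Co", "Stratocaster"));
    }
}
```

In the output, the make should appear as Fender &amp;amp; Co in the raw text, so the document stays well-formed.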


XPath

XPath is a query language that can be used to get elements or attributes from an XML document, using "path expressions." Since these expressions are a bit arcane, the World Wide Web Consortium (W3C) is working on XQuery, a more SQL-like query language built on top of XPath.

In the meantime, .NET offers XPath support via a class named XPathNavigator, which contains a number of features (methods, properties, etc.) that make querying a document quite simple, as seen in XPathDemo.cs:

using System;
using System.Xml.XPath;

class MyApp
{
    static void Main ()
    {
        XPathDocument doc = new XPathDocument ("Guitars.xml");
        XPathNavigator nav = doc.CreateNavigator ();
        // "/Guitars/Guitar" is the query expression
        XPathNodeIterator iterator = nav.Select ("/Guitars/Guitar");
        while (iterator.MoveNext ()) {
            XPathNodeIterator it = iterator.Current.Select ("Make");
            it.MoveNext ();
            string make = it.Current.Value;
            it = iterator.Current.Select ("Model");
            it.MoveNext ();
            string model = it.Current.Value;
            Console.WriteLine ("{0} {1}", make, model);
        }
    }
}
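
Path expressions can do more than walk the tree: they can filter with predicates and select attributes. The sketch below wraps XPathNavigator.Select in a small helper so different expressions can be tried against an inline document. The class name XPathQueryDemo and the method Query are hypothetical; the two expressions are standard XPath 1.0.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class XPathQueryDemo
{
    // Evaluates an XPath expression and returns the string value of each selected node.
    public static List<string> Query(string xml, string expression)
    {
        var results = new List<string>();
        var doc = new XPathDocument(new StringReader(xml));
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator it = nav.Select(expression);
        while (it.MoveNext())
            results.Add(it.Current.Value);
        return results;
    }

    public static void Main()
    {
        string xml =
            "<Guitars>" +
            "<Guitar Image='MySG.jpeg'><Model>SG</Model><Year>1977</Year></Guitar>" +
            "<Guitar Image='MyStrat.jpeg'><Model>Stratocaster</Model><Year>1990</Year></Guitar>" +
            "</Guitars>";

        // Predicate: models of guitars made after 1980.
        foreach (string s in Query(xml, "/Guitars/Guitar[Year > 1980]/Model"))
            Console.WriteLine(s);

        // Attribute axis: the Image attribute of every guitar.
        foreach (string s in Query(xml, "/Guitars/Guitar/@Image"))
            Console.WriteLine(s);
    }
}
```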


Expressalyzer.cs

This application, shown in Figure 13-12, illustrates the power of XPath.

You can load a document and make queries dynamically (provided that you are familiar with XPath expressions).

The crucial methods in this application are OnExecuteExpression, where a navigator is built, and AddNodeAndChildren, where, depending on the type of item found, nodes are added to the TreeView.


XSL Transformations

XSL is a language that can be used to transform a document from one format into another. XSL stands for eXtensible Stylesheet Language, and was probably the main reason XML became so popular, as it was a crucial factor in the early success of EDI (Electronic Data Interchange).

Organizations use XSL to convert the documents they exchange with other organizations, e.g., just in the healthcare sector:

Humana <-> Kaiser Permanente
BlueCross BlueShield <-> HCA

XSLT is at the heart of MS BizTalk Server, a set of B2B tools that facilitate converting all kinds of business forms (invoices, paychecks, purchase orders, etc.) from one format to another.

Figure 13-13 illustrates this concept.


XML -> HTML

1. Copy Figure 13-16's Guitars.xml and Guitars.xsl into a directory.

2. Comment out the following statement in Guitars.xml:

<?xml-stylesheet type="text/xsl" href="Guitars.xsl"?>

3. Open Guitars.xml in IE (Figure 13-14).

4. Uncomment the statement.

5. Open Guitars.xml again in IE (Figure 13-15).

The xml-stylesheet statement points to Guitars.xsl, which contains instructions to transform the XML file into an HTML table at the client side.


Guitars.XSL

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:template match="/">
    <html>
      <body>
        <h1>My Guitars</h1>
        <hr />
        <table width="100%" border="1">
          <tr bgcolor="gainsboro">
            <td><b>Make</b></td>
            <td><b>Model</b></td>
            <td><b>Year</b></td>
            <td><b>Color</b></td>
            <td><b>Neck</b></td>
          </tr>
          <xsl:for-each select="Guitars/Guitar">
            <tr>
              <td><xsl:value-of select="Make" /></td>
              <td><xsl:value-of select="Model" /></td>
              <td><xsl:value-of select="Year" /></td>
              <td><xsl:value-of select="Color" /></td>
              <td><xsl:value-of select="Neck" /></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>


XSLT at the server

.NET provides a class named XslTransform that can convert a document from one format to another at the server side, using ASP.NET.

The chapter illustrates how this can be done in three files:

Quotes.aspx, Quotes.xml, Quotes.xsl

The result is shown in Figure 13-17.

Note that the key to getting this done is a good understanding of XSL specifics.


XslTransform in CS

The code below shows how easy it is to work with XslTransform.

Again, as long as you know the details of XSL, transforming a document to another format is quite easy.

using System;
using System.Xml.XPath;
using System.Xml.Xsl;

class MyApp
{
    static void Main (string[] args)
    {
        if (args.Length < 2) {
            Console.WriteLine ("Syntax: TRANSFORM xmldoc xsldoc");
            return;
        }

        try {
            XPathDocument doc = new XPathDocument (args[0]);
            XslTransform xsl = new XslTransform ();
            xsl.Load (args[1]);
            xsl.Transform (doc, null, Console.Out);
        }
        catch (Exception ex) {
            Console.WriteLine (ex.Message);
        }
    }
}
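
On later versions of .NET, XslTransform is marked obsolete in favor of XslCompiledTransform. The sketch below is an assumption about how the same Load/Transform steps might look with that class, using an inline stylesheet that emits plain text rather than HTML so the result is easy to inspect. The class name TransformDemo, the method Apply and the inline stylesheet are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class TransformDemo
{
    // Applies the stylesheet text in xsl to the document text in xml.
    public static string Apply(string xml, string xsl)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader style = XmlReader.Create(new StringReader(xsl)))
            transform.Load(style);

        var output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
            transform.Transform(input, null, output);
        return output.ToString();
    }

    public static void Main()
    {
        // Lists each guitar's make, followed by a semicolon, as plain text.
        string xsl =
            "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>" +
            "<xsl:output method='text'/>" +
            "<xsl:template match='/'>" +
            "<xsl:for-each select='Guitars/Guitar'><xsl:value-of select='Make'/>;</xsl:for-each>" +
            "</xsl:template></xsl:stylesheet>";
        string xml =
            "<Guitars><Guitar><Make>Gibson</Make></Guitar>" +
            "<Guitar><Make>Fender</Make></Guitar></Guitars>";
        Console.WriteLine(Apply(xml, xsl));
    }
}
```

The structure mirrors the listing above: load the stylesheet once, then transform one or more documents with it.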