Unit 10 Schema Data Processing. Key Concepts DOM basics DOM components Nodes Node types Node hierarchy.
Post on 17-Jan-2016
Unit 10
Schema Data Processing
Key Concepts
• DOM basics
• DOM components
• Nodes
• Node types
• Node hierarchy
Document Object Model
• In-memory data storage.
• Hierarchical data storage model.
• Allows navigation between nodes.
• Allows data insertion and data retrieval from the DOM.
• Language-neutral representation of data.
• Supported by .NET languages.
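The points above can be sketched in Java with the standard JAXP classes. This is a minimal illustration, not part of the original slides: the class name DomBasics, the helper parse, and the sample XML string are assumptions made for the example. It loads a document into memory, navigates from the root to a child, inserts a new element, and reads the result back.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical example class; not from the slides.
class DomBasics {

    // Parses an XML string into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document document =
            parse("<name><first>Joe</first><last>Smith</last></name>");

        // Navigation: document -> root element -> first child
        Element root = document.getDocumentElement();
        Node first = root.getFirstChild();
        System.out.println(first.getNodeName() + " = " + first.getTextContent());

        // Insertion: add a <middle> element before the last child
        Element middle = document.createElement("middle");
        middle.setTextContent("Q");
        root.insertBefore(middle, root.getLastChild());

        // Retrieval: the tree now holds three child elements
        System.out.println(root.getChildNodes().getLength() + " child nodes");
    }
}
```

The sample string has no whitespace between tags, so every child of the root is an element. With indented XML, the parser would also report whitespace-only Text nodes between the elements.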
Some DOM-based parsers
Parser | Description
JAXP | Sun Microsystems' Java API for XML Parsing (JAXP) is available at no charge from java.sun.com/xml.
XML4J | IBM's XML Parser for Java (XML4J) is available at no charge from www.alphaworks.ibm.com/tech/xml4j.
Xerces | Apache's Xerces Java Parser is available at no charge from xml.apache.org/xerces.
msxml | Microsoft's XML parser (msxml) version 2.0 is built into Internet Explorer 5.5. Version 3.0 is also available at no charge from msdn.microsoft.com/xml.
4DOM | 4DOM is a parser for the Python programming language and is available at no charge from fourthought.com/4Suite/4DOM.
XML::DOM | XML::DOM is a Perl module that we use in Chapter 17 to manipulate XML documents using Perl. For additional information, visit www-4.ibm.com/software/developer/library/xml-perl2.
DOM Components
Class/Interface | Description
Document interface | Represents the XML document's top-level node, which provides access to all the document's nodes, including the root element.
Node interface | Represents an XML document node.
NodeList interface | Represents a read-only list of Node objects.
Element interface | Represents an element node. Derives from Node.
Attr interface | Represents an attribute node. Derives from Node.
CharacterData interface | Represents character data. Derives from Node.
Text interface | Represents a text node. Derives from CharacterData.
Comment interface | Represents a comment node. Derives from CharacterData.
ProcessingInstruction interface | Represents a processing instruction node. Derives from Node.
CDATASection interface | Represents a CDATA section. Derives from Text.
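The "Derives from" relationships in the table can be checked directly against the org.w3c.dom interfaces that ship with Java. The sketch below is only an illustration; the class name DomInterfaces and the helper derivesFrom are assumptions, not part of the slides.

```java
import org.w3c.dom.Attr;
import org.w3c.dom.CDATASection;
import org.w3c.dom.CharacterData;
import org.w3c.dom.Comment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;

// Hypothetical example class; not from the slides.
class DomInterfaces {

    // True when every 'child' object can also be used as a 'parent'.
    static boolean derivesFrom(Class<?> child, Class<?> parent) {
        return parent.isAssignableFrom(child);
    }

    public static void main(String[] args) {
        System.out.println(derivesFrom(Element.class, Node.class));          // true
        System.out.println(derivesFrom(Text.class, CharacterData.class));    // true
        System.out.println(derivesFrom(Comment.class, CharacterData.class)); // true
        System.out.println(derivesFrom(CDATASection.class, Text.class));     // true
        System.out.println(derivesFrom(Attr.class, CharacterData.class));    // false
    }
}
```

One practical consequence is that code written against Text also accepts a CDATASection, and code written against Node accepts any of the types in the table.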
XML Nodes
<root>
   <name>
      <first>Joe</first>
      <middle/>
      <last>Smith</last>
   </name>
</root>
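To see the node hierarchy of a document like the one above, the tree can be walked recursively and each element printed at its depth. This is a sketch under stated assumptions: the class name NodeHierarchy and the helpers parse and outline are hypothetical, and the XML is embedded as a string for convenience.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical example class; not from the slides.
class NodeHierarchy {

    static final String XML =
        "<root>\n"
        + "  <name>\n"
        + "    <first>Joe</first>\n"
        + "    <middle/>\n"
        + "    <last>Smith</last>\n"
        + "  </name>\n"
        + "</root>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Appends one line per element, indented two spaces per nesting level.
    // Text nodes (including whitespace between tags) are visited but skipped.
    static void outline(Node node, int depth, StringBuilder out) {
        boolean isElement = node.getNodeType() == Node.ELEMENT_NODE;
        if (isElement) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            outline(child, isElement ? depth + 1 : depth, out);
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        outline(parse(XML), 0, out);
        System.out.print(out);
    }
}
```

The walk starts at the Document node, which contains the root element; root contains name, and name contains first, middle, and last as siblings.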
Node Types
Node type | Description
Document | The document root, which is a container for all document nodes.
DocumentFragment | Temporary container holding a subset of document nodes.
DocumentType | A <!DOCTYPE…> node.
EntityReference | Entity reference text.
Element | An element node.
Attr | An attribute of an element, represented as a node.
Node Types (cont'd)
Node type | Description
ProcessingInstruction | Node containing processing instruction information.
Comment | Comment node.
Text | Text value of an element or attribute.
CDATASection | Node representing a CDATA section.
Entity | DTD <!ENTITY…> declaration.
Notation | DTD notation declaration.
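In Java, the node type is reported by Node.getNodeType() as one of the constants defined on org.w3c.dom.Node. The sketch below maps those constants to the names used in the two tables; the class name NodeTypes and the helpers newDocument and typeName are assumptions made for the illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical example class; not from the slides.
class NodeTypes {

    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Maps the numeric node type to the name used in the tables above.
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:               return "Document";
            case Node.DOCUMENT_FRAGMENT_NODE:      return "DocumentFragment";
            case Node.DOCUMENT_TYPE_NODE:          return "DocumentType";
            case Node.ENTITY_REFERENCE_NODE:       return "EntityReference";
            case Node.ELEMENT_NODE:                return "Element";
            case Node.ATTRIBUTE_NODE:              return "Attr";
            case Node.PROCESSING_INSTRUCTION_NODE: return "ProcessingInstruction";
            case Node.COMMENT_NODE:                return "Comment";
            case Node.TEXT_NODE:                   return "Text";
            case Node.CDATA_SECTION_NODE:          return "CDATASection";
            case Node.ENTITY_NODE:                 return "Entity";
            case Node.NOTATION_NODE:               return "Notation";
            default:                               return "Unknown";
        }
    }

    public static void main(String[] args) {
        Document document = newDocument();
        Node[] samples = {
            document,
            document.createDocumentFragment(),
            document.createElement("contact"),
            document.createAttribute("gender"),
            document.createTextNode("Sue"),
            document.createComment("a comment"),
            document.createCDATASection("<raw>"),
            document.createProcessingInstruction("target", "data")
        };
        for (Node sample : samples) {
            System.out.println(typeName(sample));
        }
    }
}
```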
Creating an XML Document
// Fig. 8.14 : BuildXml.java
// Creates element node, attribute node, comment node,
// processing instruction and a CDATA section.

import java.io.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import javax.xml.parsers.*;
import com.sun.xml.tree.XmlDocument;

public class BuildXml {
    private Document document;

    public BuildXml()
    {

        DocumentBuilderFactory factory =
            DocumentBuilderFactory.newInstance();

        try {

            // get DocumentBuilder
            DocumentBuilder builder =
                factory.newDocumentBuilder();

Creating an XML Document (cont'd)
            // create root node
            document = builder.newDocument();
        }
        catch ( ParserConfigurationException pce ) {
            pce.printStackTrace();
        }

        Element root = document.createElement( "root" );
        document.appendChild( root );

        // add a comment to XML document
        Comment simpleComment = document.createComment(
            "This is a simple contact list" );
        root.appendChild( simpleComment );

        // add a child element
        Node contactNode = createContactNode( document );
        root.appendChild( contactNode );

        // add a processing instruction
        ProcessingInstruction pi =
            document.createProcessingInstruction(
                "myInstruction", "action silent" );
        root.appendChild( pi );

Creating an XML Document (cont'd)
        // add a CDATA section
        CDATASection cdata = document.createCDATASection(
            "I can add <, >, and ?" );
        root.appendChild( cdata );

        try {

            // write the XML document to a file
            ( (XmlDocument) document ).write( new FileOutputStream(
                "myDocument.xml" ) );
        }
        catch ( IOException ioe ) {
            ioe.printStackTrace();
        }
    }

    public Node createContactNode( Document document )
    {

        // create FirstName and LastName elements
        Element firstName = document.createElement( "FirstName" );
        Element lastName = document.createElement( "LastName" );

        firstName.appendChild( document.createTextNode( "Sue" ) );
        lastName.appendChild( document.createTextNode( "Green" ) );

Creating an XML Document (cont'd)
        // create contact element
        Element contact = document.createElement( "contact" );

        // create an attribute
        Attr genderAttribute = document.createAttribute( "gender" );
        genderAttribute.setValue( "F" );

        // append attribute to contact element
        contact.setAttributeNode( genderAttribute );
        contact.appendChild( firstName );
        contact.appendChild( lastName );
        return contact;
    }

    public static void main( String args[] )
    {
        BuildXml buildXml = new BuildXml();
    }
}
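The listing writes the document by casting it to com.sun.xml.tree.XmlDocument, a class from Sun's early parser that current JDKs do not include. One alternative, shown here as a swapped-in technique rather than the slides' method, is the standard javax.xml.transform API. The class name WriteXml and the helpers sampleDocument and serialize are assumptions made for this sketch.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; not from the slides.
class WriteXml {

    // Builds a small document similar to the contact node in the listing.
    static Document sampleDocument() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element contact = document.createElement("contact");
            contact.setAttribute("gender", "F");
            Element firstName = document.createElement("FirstName");
            firstName.appendChild(document.createTextNode("Sue"));
            contact.appendChild(firstName);
            document.appendChild(contact);
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes a DOM tree to a string with an identity transform.
    static String serialize(Document document) {
        try {
            Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(
                OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(
                new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(sampleDocument()));
    }
}
```

To write to a file such as myDocument.xml instead of a string, the StreamResult can be constructed from a java.io.File.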
XML Document Result
// Fig. 8.15 : TraverseDOM.java
// Traverses DOM and prints various nodes.

import java.io.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import javax.xml.parsers.*;
import com.sun.xml.tree.XmlDocument;

public class TraverseDOM {
    private Document document;

    public TraverseDOM( String file )
    {
        try {

            // obtain the default parser
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            factory.setValidating( true );
            DocumentBuilder builder = factory.newDocumentBuilder();

            // set error handler for validation errors
            builder.setErrorHandler( new MyErrorHandler() );

XML Document Result (cont'd)
            // obtain document object from XML document
            document = builder.parse( new File( file ) );
            processNode( document );
        }
        catch ( SAXParseException spe ) {
            System.err.println(
                "Parse error: " + spe.getMessage() );
            System.exit( 1 );
        }
        catch ( SAXException se ) {
            se.printStackTrace();
        }
        catch ( FileNotFoundException fne ) {
            System.err.println( "File \'"
                + file + "\' not found. " );
            System.exit( 1 );
        }
        catch ( Exception e ) {
            e.printStackTrace();
        }
    }

    public void processNode( Node currentNode )
    {
        switch ( currentNode.getNodeType() ) {

XML Document Result (cont'd)
            // process a Document node
            case Node.DOCUMENT_NODE:
                Document doc = ( Document ) currentNode;

                System.out.println(
                    "Document node: " + doc.getNodeName() +
                    "\nRoot element: " +
                    doc.getDocumentElement().getNodeName() );
                processChildNodes( doc.getChildNodes() );
                break;

            // process an Element node
            case Node.ELEMENT_NODE:
                System.out.println( "\nElement node: " +
                    currentNode.getNodeName() );
                NamedNodeMap attributeNodes =
                    currentNode.getAttributes();

                for ( int i = 0; i < attributeNodes.getLength(); i++ ) {
                    Attr attribute = ( Attr ) attributeNodes.item( i );

                    System.out.println( "\tAttribute: " +
                        attribute.getNodeName() + " ; Value = " +
                        attribute.getNodeValue() );
                }

                processChildNodes( currentNode.getChildNodes() );
                break;

XML Document Result (cont'd)
            // process a text node and a CDATA section
            case Node.CDATA_SECTION_NODE:
            case Node.TEXT_NODE:
                Text text = ( Text ) currentNode;

                if ( !text.getNodeValue().trim().equals( "" ) )
                    System.out.println( "\tText: " +
                        text.getNodeValue() );
                break;
        }
    }

    public void processChildNodes( NodeList children )
    {
        if ( children.getLength() != 0 )

            for ( int i = 0; i < children.getLength(); i++ )
                processNode( children.item( i ) );
    }

    public static void main( String args[] )
    {
        if ( args.length < 1 ) {
            System.err.println(
                "Usage: java TraverseDOM <filename>" );
            System.exit( 1 );
        }

        TraverseDOM traverseDOM = new TraverseDOM( args[ 0 ] );
    }
}
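TraverseDOM installs a MyErrorHandler, but that class does not appear in the slides. The sketch below is a hypothetical stand-in, assuming the handler only needs to report warnings and rethrow errors so that the catch blocks in TraverseDOM see them. ErrorHandlerDemo and its parses helper are likewise assumptions. The demo does not enable validation, so it exercises only well-formedness errors.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not from the slides.
class ErrorHandlerDemo {

    // Returns true when the XML parses without the handler rejecting it.
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new MyErrorHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("<a><b/></a>"));   // well-formed
        System.out.println(parses("<a><b></a>"));    // mismatched tags
    }
}

// Hypothetical stand-in for the MyErrorHandler used by TraverseDOM.
class MyErrorHandler implements ErrorHandler {

    // Warnings are reported and parsing continues.
    public void warning(SAXParseException e) {
        System.err.println("Warning: " + e.getMessage());
    }

    // Validation errors are rethrown so the caller can handle them.
    public void error(SAXParseException e) throws SAXException {
        throw e;
    }

    // Well-formedness errors are rethrown as well.
    public void fatalError(SAXParseException e) throws SAXException {
        throw e;
    }
}
```

Rethrowing from error matters when validation is on: without it, a document that violates its DTD would still be returned to the caller as if it were valid.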