Beginning XSLT and XPath: Transforming XML Documents and Data
Ian Williams
Wrox Programmer to Programmer™
Updates, source code, and Wrox technical support at www.wrox.com
$49.99 USA / $59.99 CANADA
Recommended Computer Book Categories: Programming Languages / XML
ISBN: 978-0-470-47725-0
Wrox Beginning guides are crafted to make learning programming languages and technologies easier than you think, providing a structured, tutorial format that will guide you through all the techniques involved.
Extensible Stylesheet Language Transformations (XSLT) is a language for transforming XML documents and data from one format to another. Answering the demand for an introductory book on XSLT processing, Ian Williams presents a clear, concise resource on XSLT concepts and methods and explains how and why XSLT relies on the XML Path language (XPath).
As you gain a solid foundation in XSLT processing, you’ll learn the basic node tree structure that is used in the data model and discover how XSLT differs from the approach used in other programming languages. Example-laden chapters cover features of both versions 1.0 and 2.0 and demonstrate how to transform one XML data format to another. The book covers the key structural elements of an XSLT file and shows you how to use simple XPath expressions to match and select source file content. Along the way, you’ll uncover a rich set of XPath functions that will benefit you again and again as you develop your XSLT skills.
What you will learn from this book
● How to define templates, the basic building blocks of XSLT
● The way to construct XPath expressions and use a range of powerful XPath and XSLT functions
● The role of variables and parameters in XSLT
● Making use of control structures and iteration
● How to generate and format numbers, dates, and times
● Methods for working with multiple source and stylesheet documents
● Ways to debug XSLT, validate types in XSLT, and document your stylesheets
● Tips for indexing and linking items using identifiers and keys
● Techniques for controlling whitespace and processing plain text
Who this book is for
This book is for web developers, authors, and designers who understand XML basics, and are interested in gaining a solid understanding of XSLT processing.
Williams ffirs.tex V2 - 07/03/2009 2:47pm Page i
Beginning XSLT and XPath
Introduction
Chapter 1: First Steps with XSLT
Chapter 2: Introducing XPath
Chapter 3: Templates, Variables, and Parameters
Chapter 4: Using Logic
Chapter 5: Sorting and Grouping
Chapter 6: Strings, Numbers, Dates, and Times
Chapter 7: Multiple Documents
Chapter 8: Processing Text
Chapter 9: Identifiers and Keys
Chapter 10: Debugging, Validation, and Documentation
Chapter 11: A Case Study
Appendix A: Answers to Exercises
Appendix B: Extending XSLT
Appendix C: XSLT Processing Model
Appendix D: XSLT 2.0 Quick Reference
Appendix E: XSLT 2.0 Schema
Appendix F: XPath 2.0 Function Reference
Appendix G: References
Glossary
Index
Beginning
XSLT and XPath
Transforming XML Documents and Data
Ian Williams
Wiley Publishing, Inc.
Beginning XSLT and XPath: Transforming XML Documents and Data
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2009 by Ian Williams
Published by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-47725-0
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.
For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Library of Congress Control Number: 2009929458
Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Wrox Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.
About the Author
Ian Williams is an information designer specializing in XML technologies, and a software technical writer. He worked in the U.K. publishing industry before getting involved in information technology at OWL International, developers of one of the first commercial hypertext products. Ian was a product manager there, and later a consultant working with large corporate customers.
Since 1998 Ian has worked on technical writing and information-design projects, most recently for Nokia, Reuters, and Volantis. He is co-author with Pierre Greborio of Professional InfoPath 2003, also from Wrox Press.
Ian lives with his wife, Helen, in Kent, in a converted lifeboat station overlooking the English Channel.
Credits
Executive Editor: Carol Long
Development Editor: Tom Dinse
Technical Editor: Dan Squier
Production Editor: Eric Charbonneau
Copy Editor: Luann Rouff
Editorial Director: Robyn B. Siesky
Editorial Manager: Mary Beth Wakefield
Production Manager: Tim Tate
Vice President and Executive Group Publisher: Richard Swadley
Vice President and Executive Publisher: Barry Pruett
Associate Publisher: Jim Minatel
Project Coordinator, Cover: Lynsey Stanford
Proofreader: Candace English
Indexer: Johnna VanHoose Dinse
Contents
Introduction
Chapter 1: First Steps with XSLT
Transforming an XML Document to a Web Page
Using a Browser
Transforming Locally
Transforming XML Data to XML
Atom and RSS Elements
Developing the Stylesheet
Summary
Chapter 2: Introducing XPath
Nodes
Node Types
Node Properties
Data Model
Path Expressions
Using an XPath Analyzer
Axes
Node Tests
Predicates
Operators
XPath Functions
Strings
Dates, Times, and Durations
Nodes and Documents
Numbers
Summary
Exercises
Chapter 3: Templates, Variables, and Parameters
About Templates
Template Rules
Invoking a Rule
Using Modes
Setting Priorities
Built-in Rules
Named Templates
Variables
Parameters
Global Parameters
Template Parameters
Summary
Exercises
Chapter 4: Using Logic
Conditional Processing
A Simple Choice
Multiple Choices
Using XPath for Conditional Tests
Iteration
Using Attribute Sets
Monitoring the Context
Processing XML Code
Summary
Exercises
Chapter 5: Sorting and Grouping
Sorting Content
Perform a Sort
Grouping
Common Values
Adjacent Items
Starting and Ending Conditions
Summary
Exercises
Chapter 6: Strings, Numbers, Dates, and Times
String Processing
About Collations
General Functions
Codepoints
Comparison
Concatenation
Simple Substrings
Using Regular Expressions
Normalizing Values
Escaping URIs
Numbers
Generating Numbers
Formatting Source Numbers
Dates and Times
Contextual Dates
Formatting
Combining and Converting Values
Durations
Time Zones
Summary
Exercises
Chapter 7: Multiple Documents
Modular Stylesheets
Including Modules
Imported Stylesheets
Source Documents
Using the document() Function
XPath Alternatives
Setting or Changing Context
Output Documents
Preparing a Feed Update
Splitting a Document
Summary
Exercises
Chapter 8: Processing Text
Controlling Whitespace
Stripping Space
Preserving Space
Using <xsl:text>
XML to Text
Text to XML
Loading Unparsed Text
Analyzing the Input
Alternatives to XSLT
XML Maps
XML Data in Excel
Summary
Exercises
Chapter 9: Identifiers and Keys
ID Datatypes
Using the id() Function
Keys
The key() Function
Generating Identifiers
Indexing Lines
Census to GEDCOM XML
Summary
Exercises
Chapter 10: Debugging, Validation, and Documentation
Debugging XSLT
Profiling
Verifying XHTML Output
Using Messages
Commenting Output
Using the error() Function
Type and Schema Validation
Types in XSLT
Using a Schema-Aware Processor
Documenting Your Stylesheets
Summary
Exercises
Chapter 11: A Case Study
Schema Overview
Common Elements and Attributes
Common Attributes
Block Elements
Inline Elements
The Quick-Reference Schema
Link Container Elements
Property Elements
Link Verification
Metadata Schemas
Resource Metadata
Subject Metadata
Reference Stylesheets
Link Module
Link Parameters
Function Module
Term Module
Term Parameters
Displaying Inline Terms
Building the Site
Generating the Reference Pages
Landing and Glossary Pages
Creating a Sitemap
Summary
Appendix A: Answers to Exercises
Chapter 1
Chapter 2
Question 1
Question 2
Question 3
Chapter 3
Question 1
Question 2
Chapter 4
Question 1
Question 2
Chapter 5
Question 1
Question 2
Question 3
Chapter 6
Question 1
Question 2
Question 3
Chapter 7
Question 1
Question 2
Question 3
Question 4
Chapter 8
Question 1
Question 2
Question 3
Chapter 9
Question 1
Question 2
Question 3
Chapter 10
Question 1
Question 2
Chapter 11
Appendix B: Extending XSLT
Stylesheet Functions
Calling an Extension Function
Function Libraries
EXSLT
FunctX
Vendor Extensions
User-Defined Extensions
Appendix C: XSLT Processing Model
The Data Model
Transforming
Parsing Inputs
Template Rules
Variables and Parameters
Controlling Processing
Outputs and Serialization
Appendix D: XSLT 2.0 Quick Reference
Elements
Attribute Groups
Types
Functions
XSLT Elements
xsl:analyze-string
xsl:apply-imports
xsl:apply-templates
xsl:attribute
xsl:attribute-set
xsl:call-template
xsl:character-map
xsl:choose
xsl:comment
xsl:copy
xsl:copy-of
xsl:decimal-format
xsl:declaration
xsl:document
xsl:element
xsl:fallback
xsl:for-each
xsl:for-each-group
xsl:function
xsl:if
xsl:import
xsl:import-schema
xsl:include
xsl:instruction
xsl:key
xsl:matching-substring
xsl:message
xsl:namespace
xsl:namespace-alias
xsl:next-match
xsl:non-matching-substring
xsl:number
xsl:otherwise
xsl:output
xsl:output-character
xsl:param
xsl:perform-sort
xsl:preserve-space
xsl:processing-instruction
xsl:result-document