XML in eBusiness Spring 2005 Dave Hollander Chief Technology Officer Contivo, Inc
Dec 27, 2015
XML Background
Information Age
XML Origins
Markup
Questions: Origins
• Why is XML such an important development?
• What are the benefits of XML compared to HTML?
• What was the most challenging thing during implementing XML?
• What was the creative process in envisioning XML from SGML? What was the spark/moment of clarity that started the process rolling?
• What was the goal behind creating XML?
Information Age – or is it?
• Gutenberg
• Industrial Age
• The Web
• InfoGlut
• Information
– It is 3° outside.
• Knowledge = info + action
– It is 3° out, put on a coat.
• Wisdom = knowledge + context
– Why bother, I am just jumping in the hot tub!
The Promise of XML
XML is the standard platform for information convergence
• XML Enables Information Reuse
– Global interchange
– Machine processing
– New uses for documents
• Values of XML
– Feature/Complexity balance
– Enables semantic processing
– User defined semantics
[Figure: data-set size (small to large) plotted against structure (unstructured to structured). Regions are labeled Word Processing, Publishing (Full-Text), Transactional, Database, and XML.]
Interchangeable Parts drove the Industrial Age
Reusable Information drives the Information Age
Origins of XML
• 1986 ISO 8879 SGML
• 1996 April (WWW2)
– XML vision written in a taxi by Jon Bosak and Dave Hollander
• 1996 November - introduced to SGML Community
• 1997 March - First press articles
• 1997 April (WWW6) - introduced to Web Community
• 1998 February - XML 1.0
• 1999 January - XML Namespaces
• 2001 May - XML Schema
• 2001 October - XSL Recommendation
• 2002 February - XML Digital Signatures
“I didn’t actually build it, but it was based on my idea.”
Why XML?
• XML was designed to manage documents on the web
– Team included architects of HP.COM and DOCS.SUN.COM
– Reuse content made for print in multiple web pages: data sheets, white papers, etc.
– Present a more organized view of information
• We faced significant differences in how our organizations structured information
• So, the answer was to create XML to
– Interchange document information between groups
– Make it easy to publish content standards
– Separate content from presentation, which makes it easy to build tools that reuse information
The design goals for XML
1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.
XML
• XML is the eXtensible Markup Language
• Evolved from ISO Standard SGML
• Designed to
– Add structure to Web documents
– Be simple (25 pages)
• XML has expanded well beyond its original goals
But what is it?
• XML is a meta-language for creating markup languages
– Markup – information that computers use
• XML makes it easy and reliable for computers (and humans) to identify markup in documents.
– Meta-language – a language to create languages
XML allows you to design a markup language that describes what is important to you.
Markup
• Simple syntax that makes it easy to separate “data” from “meta-data”
• Markup includes
– Elements
– Attributes
– Comments
– Entity references
– Processing instructions
– CDATA sections
– Document type declarations
[Figure: the element <tag> Content </tag>, with its opening tag, content, and closing tag labeled.]
Meta-Language
• Meta-language
– A language to create languages
– User defined semantics (meaning)
– HTML has fixed semantics
• Each language created with it
– Defined by a schema
• May be implicitly defined
– Referred to as a dialect
<?xml version="1.0"?>
<ShoppingCart>
<ProductList> Dave’s Order </ProductList>
<Part> 00000-99999 </Part>
</ShoppingCart>
XML is a meta-language for creating markup languages
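As a minimal sketch of what "designing your own markup language" can look like in practice, the Python snippet below builds the ShoppingCart document from this slide with the standard library's xml.etree.ElementTree. The function name build_cart is an illustrative assumption; only the element names come from the slide.

```python
import xml.etree.ElementTree as ET


def build_cart(order_name, part_numbers):
    """Build a <ShoppingCart> element using the slide's own vocabulary."""
    cart = ET.Element("ShoppingCart")
    # The element names are chosen by the author of the vocabulary,
    # not fixed by XML itself.
    ET.SubElement(cart, "ProductList").text = order_name
    for number in part_numbers:
        ET.SubElement(cart, "Part").text = number
    return cart


cart = build_cart("Dave's Order", ["00000-99999"])
xml_text = ET.tostring(cart, encoding="unicode")
print(xml_text)
```

Nothing in the parser or serializer knows about shopping carts; the vocabulary lives entirely in the names the author picked, which is the sense in which XML is a meta-language.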
XML is Descriptive Markup
• XML is Descriptive; HTML is procedural
– Describe and assign a name to a class of data
• Multiple behaviors can be assigned to each class
– Examples: layout, search, database, eCommerce
– “find the part numbers in all shopping carts”
• Markup is only valuable if you know what it means!
<?xml version="1.0"?>
<ShoppingCart>
<List>Dave’s Order</List>
<Part> 00000-99999</Part>
…..
</ShoppingCart>

<html>
<H1>Dave’s Order</H1>
<P><font size="6">00000-99999</font>
…..
</html>
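The query "find the part numbers in all shopping carts" is easy to express against descriptive markup. The sketch below uses Python's xml.etree.ElementTree; the Orders wrapper element and the second cart are hypothetical additions made only so there is more than one cart to search.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the <Orders> wrapper and the second cart
# are assumptions added for illustration.
CARTS = """<Orders>
  <ShoppingCart><List>Dave's Order</List><Part> 00000-99999</Part></ShoppingCart>
  <ShoppingCart><List>Second Order</List><Part>11111-22222</Part><Part>33333-44444</Part></ShoppingCart>
</Orders>"""


def part_numbers(xml_text):
    """Return the text of every <Part> that sits directly in a <ShoppingCart>."""
    root = ET.fromstring(xml_text)
    return [part.text.strip() for part in root.findall(".//ShoppingCart/Part")]


print(part_numbers(CARTS))
```

The same question asked of the HTML version would have to guess that a part number is "whatever sits inside a font tag of size 6", which is the practical difference between describing data and describing its appearance.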
Questions: Origins
• What was the creative process in envisioning XML from SGML? What was the spark/moment of clarity that started the process rolling?
– Laziness
– Community of practice w/ 100+ years of experience
– Walk through, feature by feature, asking “is this necessary for success”
• What was the goal behind creating XML?
– My favorite: information reuse
Questions: Origins
• Why is XML such an important development?
– XML is foundation of information interchange
• What are the benefits of XML compared to HTML?
– User defined markup
– More extensive application space
• What was the most challenging thing during implementing XML?
– Agreeing on “is it necessary”
XML Applications
XML Specifications
XML Processors
Applications
Questions: Applications
• I know XML is very compatible with many modern languages; is it compatible with older languages like COBOL?
• Compare XML to EDI and explain the different industry-specific dialects or standards that exist today.
• What are the recent trends and forecasts for corporate use of XML in integrating the enterprise both internally and externally?
XML Specifications
• XML Instance Document = the Data
• Schemas = the contract
• Style-sheets = user interface
• XQuery = finding data
• Web Services = interchange of data
• Protocol
– n 1: (computer science) rules determining the format and transmission of data
The W3C XML Family
• XML Coordination Group
– XML Core
• errata, XInclude, Information Set
– XML Schema
• Parts 0, 1, 2, 3
– XML Linking WG
• XML Base, XPath, XLink, XPointer
– XML Query WG
• Data Model, Algebra, Language
– XML Namespaces
• XML Protocols WG
• XSL WG
– XSL, XSLT
• XML DSIG
– XML Signature
– Canonical XML
• DOM (Levels 1, 2, 3)
• Others
– XML-Encryption
– VoiceXML
– XForms WG
– SMIL, SVG
– XHTML
– RDF …
More than 20 horizontal XML specifications!
XQuery
• XQuery 1.0: An XML Query Language
– W3C Working Draft 04 April 2005
– Still at least 3 months from Recommendation
• Covers
– XPATH: addressing single elements in an XML document
– Query: like SQL
• Limitations
– No semantics
– No mechanism to normalize multiple data resource results
W3C XML Schemas
• Schema defined in a .xsd file (usually)
• Schemas
– Define classes of documents
– Define structure, constraints and datatypes
– Validation
– Schemas can only express part of the semantics.
• RELAX NG is a schema specification similar to W3C XML Schema
• Schemas are a contract to interchange information.
XML Schemas
• Many different ways to mark up data
1) <BUYER_NAME> JOHN SMITH </BUYER_NAME>
2) <BUYER_NAME> <LAST> SMITH </LAST> <FIRST> JOHN </FIRST> </BUYER_NAME>
3) <NAME role="BUYER"> <SURNAME BSR_CODE="NAM-01"> SMITH </SURNAME> <GIVEN BSR_CODE="NAM-02"> JOHN </GIVEN> </NAME>
• Which is right?
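None of the three is "right" on its own; a receiver has to know which shape it is getting. The Python sketch below is one hedged illustration of the cost: a hypothetical buyer_name function that has to special-case each of the three markups above to recover the same name. The "GIVEN SURNAME" output convention is an assumption.

```python
import xml.etree.ElementTree as ET

# The three markups from the slide, written compactly.
VARIANTS = [
    "<BUYER_NAME> JOHN SMITH </BUYER_NAME>",
    "<BUYER_NAME><LAST> SMITH </LAST><FIRST> JOHN </FIRST></BUYER_NAME>",
    '<NAME role="BUYER"><SURNAME BSR_CODE="NAM-01"> SMITH </SURNAME>'
    '<GIVEN BSR_CODE="NAM-02"> JOHN </GIVEN></NAME>',
]


def buyer_name(xml_text):
    """Return 'GIVEN SURNAME' for any of the three markups (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag == "BUYER_NAME":
        last = root.findtext("LAST")
        if last is None:
            # Style 1: the whole name is plain text inside the element.
            return " ".join(root.text.split())
        # Style 2: separate LAST and FIRST child elements.
        return f"{root.findtext('FIRST').strip()} {last.strip()}"
    if root.tag == "NAME" and root.get("role") == "BUYER":
        # Style 3: generic NAME element qualified by a role attribute.
        return f"{root.findtext('GIVEN').strip()} {root.findtext('SURNAME').strip()}"
    raise ValueError(f"unrecognized buyer markup: {root.tag}")


for variant in VARIANTS:
    print(buyer_name(variant))
```

A schema agreed between partners removes the need for this kind of guessing, which is why the slides describe schemas as a contract.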
Schemas Reflect Business Models
• Prescriptive vs. permissive
– who pays to make the data right?
• Loose vs. tight
– how many semantics are expressed?
– easy to author vs. reuse
• Interchange model
– blind or pre-defined partners?
• Extensibility
– kept up to date vs. predictability
[Figure: example schemas labeled DocBook, Shopping Cart, RosettaNet, Pinnacles, and CALS, with the captions The “Waterloo” Model and Author’s Intent.]
Validate your data against a business model.
XML has Namespaces
How is software to recognize markup it knows how to process, and avoid confusing it with markup designed for the use of some other software? [1]
• Namespaces allow documents to be merged without name collisions.
• Can be used to identify an authority for the element type
<?xml version="1.0"?>
<MyDoc xmlns="http://mhxml.com/ns1" xmlns:hp="http://hp.com/ns2">
<part> 00000-99999</part> <!-- from default namespace -->
<hp:part> 00000-99999-hp</hp:part> <!-- from HP namespace -->
</MyDoc>
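To see how software tells the two part elements apart, the sketch below parses the document above with Python's xml.etree.ElementTree. The prefix "d" in the lookup table is an arbitrary local choice (an assumption of this example); only the namespace URIs matter.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<MyDoc xmlns="http://mhxml.com/ns1" xmlns:hp="http://hp.com/ns2">
  <part> 00000-99999</part>
  <hp:part> 00000-99999-hp</hp:part>
</MyDoc>"""

# Prefixes used for lookup are local to this code; "d" stands in for
# the document's default namespace.
NS = {"d": "http://mhxml.com/ns1", "hp": "http://hp.com/ns2"}

root = ET.fromstring(DOC)
default_part = root.find("d:part", NS).text.strip()
hp_part = root.find("hp:part", NS).text.strip()

# ElementTree stores names in {namespace-uri}local-name form.
print(root.tag)
print(default_part, hp_part)
```

Both elements have the local name part, but their expanded names differ, so a processor that only understands the HP vocabulary can ignore the other one.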
XML Tools
• XML enabled reuse of core technology
– Parsers
• DOM, SAX, others
– Processors
• App servers, Java, .NET
– Databases
• Native and Enabled
• Free, or at least inexpensive:
– http://www.xml.com/programming/
[Figure: three example messages compared under the labels Lexical, Syntactic, and Semantic.]
<ShoppingCart><ProductList> Dave’s Order</ProductList><Part> 00000-99999 </Part></ShoppingCart>
ISA~00~ ~00~ ~01~0819405530010 BEG~00~DS~20-P1-749833~~000114.NTE~ORI~SHIP ASAP.
<Order><PL> Dave’s Order </PL><Part> 00000-99999 </Part></Order>
XML as Data Model
• Relational
– Entity Relation Model
– Normalization Plan
• BLOBs/CLOBs
– Queries
• Grievances
• Signers and states
• Declarations
• Hierarchical (XML)
– Elements, attributes
– Structure
– Constraints
Now, tell me who’s proudest?
Legacy
• XML does not support non-XML data resources
– COBOL
– EDI
– Others
• It is possible, and often a good idea, to use XML to harmonize data.
Semantic Harmonization
[Figure: lexical reconciliation, schema reconciliation, and semantic reconciliation applied to EDI, legacy, flat-file, and XML sources, yielding harmonized XML. Other labels: Lexical, Syntax, Semantic.]
Beyond XML and B2B
[Figure: performance metric plotted against time. Curves are labeled EDI Technology, XML Technology, eCommerce Technology, and Technology Z; demand lines are labeled demand for EDI, demand for XML, and demand for technology Z.]
• Volume of transactions
• Security, Reliability, Predictability
• Reduced Cost of Procurement
• Interoperability
• Flexibility and Agility
• Number of trading partners
• Global supply chains
• Reduced setup and TCO
• One-to-one marketing
• Reuse, leverage and communities
• Semantics
• Cost of new product deployment
• One-to-one business
• Security, Reliability, Predictability?
• Completeness?
Ref: The Innovator’s Dilemma; Clayton Christensen
Compare to EDI
ISA~00~ ~00~ ~01~0819405530010 ~01~153734900 ~000114~0927~U~00302~000160473~0~P~|.GS~PO~COMDEX~D710-850~000114~0927~161441~X~003020.ST~850~290267.BEG~00~DS~20-P1-749833~~000114.NTE~ORI~SHIP ASAP.FOB~CC~OR.DTM~002~000114.N1~ST~LUCENT TECHNOLOGIES~92~99.N3~67 WHIPPANY RD~CAHNDANG.N4~WHIPPANY~NJ~07981.
I have no idea what this might mean!
• EDI error rates can approach 85%.
• HTML parsing requires up to 50% of the code in your favorite browser!
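Part of why the EDI message is hard to read is that meaning depends on position, not on names. The Python sketch below splits a short fragment of the message above into segments and elements. The delimiters ("." to end a segment, "~" between elements) are assumptions inferred from the sample, and parse_segments is a hypothetical helper, not a real EDI library.

```python
# Fragment copied from the sample message above.
FRAGMENT = "BEG~00~DS~20-P1-749833~~000114.NTE~ORI~SHIP ASAP."


def parse_segments(message, segment_end=".", element_sep="~"):
    """Split an EDI-style message into (segment tag, [elements]) pairs.

    The default delimiters are inferred from the sample, not taken
    from a standard.
    """
    segments = []
    for raw in message.split(segment_end):
        if raw:  # skip the empty string after the final terminator
            tag, *elements = raw.split(element_sep)
            segments.append((tag, elements))
    return segments


for tag, elements in parse_segments(FRAGMENT):
    print(tag, elements)
```

The parser recovers the structure, but nothing in the data says that the third BEG element is an order number; that knowledge lives in the standard and the trading partner agreement, whereas XML carries element names with the data.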
EDI Values
The 20-80 Rule
• Build for the 20% who do 80% of the business
Throughput
• Primary Design Metric
• Information Design Metric
– Low Character Count
– Move context data to TPA
Interoperability
– Between trading partners
Semantics
– Defined in Standards and TPA
[Figure: axes labeled Interoperability, Semantics, and Throughput.]
XML Values
SGML for the Web
• Make it easy to interchange documents on the web
Interoperability
• Primary Design Metric
• Loosely coupled systems
• Information Design Metrics
– Self-describing messages
– Ease of processing
Semantics
– User defined
– Machine and Human
Throughput
– Not primary metric
[Figure: axes labeled Interoperability, Semantics, and Throughput.]
New Metrics?
• eCommerce
– Process Effectiveness
– Agility: flexibility, adaptability
– Strategic Business Relationships
– Evolution in Marketplace Dynamics
– Technology (Hubs, I-servers, portals)
• XML
– Technical Maturity
– Standards (Schemas, XSL, Query)
– Interoperability (New TLAs)
– Products (Contivo)
• New Axis?
[Figure: axes labeled Interoperability, Semantics, and Throughput.]
It is hard for me to believe that anything will replace XML for information interoperability.
New Technology Adoption
New, disruptive technology succeeds when performance metrics change
• B2B – EDI
– Build for the 20% who do 80% of the business
• EAI
– Connectivity between high value, internal business applications
• Web Services
– standards describing interoperability
– detailed enough to be definitive
– flexible enough to describe any system
– scalable to be pervasive
– easy to implement
[Figure: successive technology curves labeled EDI / VAN, XML / Content, and Technology Z.]
Clayton Christensen; Innovator’s Dilemma
Integration and Information Silos
• Businesses face a challenge to sustain competencies built around their systems and, at the same time, to integrate the systems to create new business solutions.
• Requirements
ROI and TCO
Flexible – built to integrate
Evolvable – support legacy and change
Loosely Coupled - able to support independent development efforts
Integration
• Developers using middleware need three answers
– How are messages moved?
• Physical infrastructure selection
– What messages are exchanged and in what order?
• Delivery, Workflow and/or Collaboration
– What do the messages mean?
• Logical and Conceptual understanding
Messaging
[Figure: two applications connected through a messaging system; each side has interfaces and adapters. The diagram is annotated 95% and 5%.]
Interoperability requires interfaces between applications to be standardized.
The Fundamental Challenge
Only 5% is a function of the middleware choice.
The remaining 95% is a function of application semantics.
(Gartner Group)
Taming the “Integration Hairball”
• Physical
– multiple interconnect technologies
– deploy middleware
• Logical
– no messaging standards
– deploy “canonicals”
• Conceptual
– no centralized design model
– deploy semantic modeling
Vocabulary Driven
[Figure: SAP and Oracle systems, each with an API, an XML wrapper, a service interface (XML Schema), and an RMD, linked through a shared dictionary. Process steps shown: Business Process Requirements, Interface Modeling, Domain Modeling, Gap Analysis, Mapping, Vocabulary Enrichment, Code Generation, Transform, Transport Services.]
Dictionary
• Cust Account Number
• Cust DUNS Number
• PO Number
• PO Issue Date
• Routing Instrument
• Ship Date
• Transit Direction Code
• Transport Terms
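One way to read the vocabulary-driven approach is that each application's fields are mapped once to a shared dictionary term, and gaps are found by comparing the mappings with the dictionary. The Python sketch below illustrates that idea only; the field names sap_po_no, ora_po_num and the others are hypothetical, and the functions are not taken from any product.

```python
# A few terms from the slide's dictionary.
DICTIONARY = {"PO Number", "PO Issue Date", "Ship Date", "Cust Account Number"}

# Hypothetical application field names, invented for this sketch.
SAP_TO_VOCAB = {"sap_po_no": "PO Number", "sap_po_date": "PO Issue Date"}
ORACLE_TO_VOCAB = {"ora_po_num": "PO Number", "ora_ship_dt": "Ship Date"}


def to_canonical(record, mapping):
    """Rename an application record's fields to dictionary terms."""
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def gap_analysis(mapping, dictionary):
    """List dictionary terms that the application mapping does not cover."""
    return sorted(dictionary - set(mapping.values()))


sap_record = {"sap_po_no": "20-P1-749833", "sap_po_date": "2000-01-14"}
print(to_canonical(sap_record, SAP_TO_VOCAB))
print(gap_analysis(ORACLE_TO_VOCAB, DICTIONARY))
```

With a shared vocabulary, adding a third application means writing one mapping to the dictionary instead of one mapping per existing application.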
Questions: Applications
• I know XML is very compatible with many modern languages; is it compatible with older languages like COBOL?
– Not directly.
– There are some tools for converting their data and some tools for programming XML w/ COBOL
Security
Security
• Security is mostly a business problem, not a technical one.
– Risk, trust and security depend on human management of relationships and assessment.
– Technology just adds more risk.
• Are there any security challenges on XML?
– Visibility
Data Interchange – the UUP example
• I had a question when it comes to XML in the use of the Universal User Profile (UUP). In light of today's security and privacy concerns, do you think this concept will take off at some point?
– I already trust eBay, PayPal, etc. with some of this info
• With Passport not being used by as many people as Microsoft had hoped and no other visible vendor pushing an alternative, its acceptance seems doubtful.
– The trick will be if someone emerges with a model that adds more value than risk, to me, the consumer.
Security
• How can we embed security functionality in an XML environment?
– Encryption is just changing the visibility.
• In Schwab's case, it is mentioned that there is no security mechanism in XML or SOAP. How can we overcome this pitfall in an e-business environment?
– It will take additional technology and industry initiatives.
• And as a follow-up question (to UUP), are there any security safeguards in place within XML, or is that left solely up to those writing the applications that use XML?
“Chaord”
The Emergence of the “Chaord”
“Any self-organizing, adaptive, nonlinear,
complex community or system, whether physical,
biological or social, the behavior of which
exhibits characteristics of both order and chaos.
Or, more simply stated, a Chaord is any
chaotically ordered complex system.”
Dee Hock
Founder and CEO Emeritus - VISA
Oct. 22, 1994
Futures
Questions: XML Web Services and eComm
• What is the future of XML in e-business?
• Are there ways to apply XML that are not currently being done?
• Do you think they will be realized?
• What are the threats to XML to remain open source?
• What are some upcoming developments in Web Services and XML?
• How is security implemented in XML vs. Web Services?
• Any risk in deployment of XML-related technologies in e-Commerce and any solutions to overcome?
• What are the disadvantages and advantages of XML in e-commerce?
XML Breaks “The Camel’s Hump”
[Figure: activity plotted against time, with phases labeled Research, Standards, and Billion dollar investments.]
Source: Dr. David Clark, head of Advanced Network Architecture research group, MIT’s Laboratory for Computer Science
XML’s History
Activity does not decline; its rate of growth does.
• Industry Investment in standards
• Multiple phase Standards Development
• Investment during standards development
• Activity during lagging phase stays longer due to costs
Kann ich bitte ein Glas Wasser haben?
• Presentment
– Again, louder
– Reword
– Reduction
– Gesture
– Translate
• Fulfillment
– Guess
– Look Up
– Partial Understanding
– Full Understanding
• Shared context
Secondary Factors
– Trust
– Policy
– Ability
– Anticipation
– Motivation
Wasser bitte!
WASSER!!!!!!
Can I please a glass of water have?
Wuerden Sie mir bitte ein Glas Wasser reichen?
The Evolution of e-Commerce
Web services promise to bring these all together and make networks of computers useful and ubiquitous
• Silicon chips made computers ubiquitous
• GUIs made using computers ubiquitous
• The Web made accessing content ubiquitous
• XML made understanding content ubiquitous
1980s: custom applications
early 1990s: ERP systems
mid 1990s: fax, phone, EDI
late 1990s: B2C, B2B
2000s: Web Services
1975: FedEx installs the first drop box
1991: Crossing the Chasm and Virtual Corporation published
1994: The Web carries commercial messages anywhere in the world.
1999: e-Everything, ad nauseam
2001: Crossroads -- “The P.T. Barnum Era of B2B is over.”
Business Case
Tight coupling
• Resources
– high set up costs
– management attention
• Small numbers of partners
– lengthy negotiations
– detailed contracts
– extensive monitoring of performance
Coordination costs are steep
Loose coupling
Thousands of partners in a loose network
• Focus on outcomes
– not on the way the job gets done
• Set standards
– manage interfaces between specialists’ activities
– orchestrate the process
Loosely Coupled Applications
• What company can ignore the benefits of
– partnering with business specialists?
– outsourcing non-core activities to focused providers?
The virtues of collaboration are clear: innovation and efficiency.
• In the quest for higher performance, companies at the cutting edge of process management:
– gain flexibility and improve performance
– handle critical cross-company processes as networks rather than production lines
– swap their tightly coupled processes for loosely coupled ones.
Text excerpts from the McKinsey Quarterly; 2002 Number 2
Graphics from Enterprise E-Commerce
Data at the Edge
• In 1869 the transcontinental railroad enabled and accelerated the migration westward.
• In the 40’s and 50’s, the interstate system enabled and accelerated migration to the suburbs.
• In the 80’s and 90’s computing became less centralized
– Accelerated by PCs, relational databases, SQL, the Web
– Data migrated out of “glass houses” and closer to the user
• Web Services, XQuery, XML
– The latest technologies to help people get better control over data and processes that help them in their daily activities
– and in doing so, data will migrate closer to the edge
New technologies do not create chaos, they expose and accelerate it.
Sources of Semantic Chaos
• Data at the edge enables different processes for
– different payment history and methods
– different customers and partners
– different legal jurisdictions
• Data at the edge is more personalized
– “Call Sally”
• My cell phone knows who I mean
• A centralized corporate directory does not
• With personalization comes differences
– with differences comes semantic chaos
Don’t blame my phone.
The Information Continuum
[Figure: level of investment (low to high) plotted against process complexity (low to high). Stages: Capture, Organize, Synthesize, Evaluate. Bands: Data Mgmt., Information Mgmt., Knowledge Mgmt. Value labels: Managing Assets, Adding Context, Generating Intellectual Capital, Increasing Value.]
• Putting information into managed locations
• Classifying documents, creating classification schemes
• Collecting information about the quality and usefulness of the information
• Driving business processes with knowledge
• Creating new knowledge from existing knowledge
eXtreme Semantics
Semantics = Data + Behavior [1]
• Semantic Interoperability
– Adaptive systems sharing semantic descriptions
– The future requires new standards and systems
– Semantics is the traction point between data and processes
– Practical: When a friend says cool... don’t put on a coat
– System: Purchase orders trigger processes to manufacture, package, ship and bill
[1] Semantics in Business Systems: The Savvy Manager’s Guide; Dave McComb; Morgan Kaufmann; September 2003; ISBN: 1-55860-917-2
Questions: XML Web Services and eComm
• What is the future of XML in e-business?
– XML will remain a foundation of information interchange, and information interchange is a foundation of e-business.
• Are there ways to apply XML that are not currently being done?
– Validation
• Do you think they will be realized?
– Yes, there is now technology that makes it affordable.
• What are the threats to XML to remain open source?
– Ourselves. We are making it so complex that someone can come in and “just fix it” for us.
– Complexity is creating a new priesthood
– The very reason we broke SGML
XML Challenge
• Because XML was so stripped down, it was easy to adopt and extend; because it was so stripped down, adopters almost had to extend it.
• And we did. And other people did, too. You now have
– XML + XLINK + XSL + Namespaces + Infoset + XML Linking + XPointer Framework + XPointer namespaces + XPointer xptr() + XSLT + XPath + XSL FO + DOM + SAX + stylesheet linking PI + XML Schema + XQuery + XML Encryption + XML Canonicalization + XML Signature + DOM Level 2 + DOM Level 3.
• But it grew. It grew more complex. It grew confusing.
http://www.mhxml.com/projects/w3c/xml-is-five-final.htm
Learning XML
• Should XML be a required class for all Information Technology graduates?
– Yes. Information is the core of IT
– NO. It would be difficult to assure the quality of that education.
• What is the best way to go about learning XML?
– Reliable Books
• Modeling Business Objects with XML Schema
• System Architecture with XML
– Just Do It
– www.xpriori.com
• See Also:
– http://www.xml.com/axml/testaxml.htm
Thank You
Dave’s Web Sites:
Contivo - http://www.contivo.com
Personal - http://www.mhxml.com
First Sheep’s Law of the universe:
“Everything has both intended and unintended consequences. The intended consequences may or may not happen; the unintended consequences always do!”