SNU OOPSLA Lab.
XML Documents 2: Additional Issues
The ubiquitous XML (3)
© copyright 2001 SNU OOPSLA Lab.

Additional Issues
  XML Link
  White space
  Name space

Contents of XML Link
  HTML Link vs. XML Link
  Concept of XML Link
  Simple link
  Extended link
  XPointer
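
HTML Link vs. XML Link
  HTML Link
    A URL can only point to a whole document; there is no way to point to a part within a document.
    There is no concept of relationships between documents.
    Only one-directional links are possible.
  XML Link
    Provides a way to point to an arbitrary location in a document.
    Many different elements can act as links (cf. the 'A' element in HTML).
    Multi-directional links are possible.
    References, annotations, footnotes, and the like are easy to handle.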

Concept of XLink (1/12)
  XLL (eXtensible Linking Language)
    XLink defines how one document links to another document (in fact, by URL/URI)
    XPointer defines how one document links to a component of a document

Concept of XLink (2/12)
  A simple link
    contains only one resource locator
    requires only the href and xml-link attributes
    may use any name for the linking element
  See <simple href="…">book 9</simple> for details
  <!ATTLIST simple xml-link CDATA #FIXED "simple">
  <simple href="http://ProcMan.xml#Sec9" xml-link="simple">See Section 9 of the Procedures Manual</simple>
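
As a rough illustration of the slide above, the following Python sketch finds linking elements by their xml-link attribute rather than by element name, since any name may be chosen. It assumes the attribute is written out in the document (Python's ElementTree does not apply #FIXED defaults from a DTD); the surrounding para and emph elements are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample document; only the <simple> element comes from the slide,
# the <para> and <emph> wrappers are hypothetical.
DOC = """<para>
  <simple href="http://ProcMan.xml#Sec9" xml-link="simple">See Section 9 of the Procedures Manual</simple>
  <emph>not a link</emph>
</para>"""

def simple_links(root):
    """Return (href, link text) for every element marked xml-link="simple"."""
    return [(el.get("href"), el.text)
            for el in root.iter()
            if el.get("xml-link") == "simple"]

links = simple_links(ET.fromstring(DOC))
print(links)
```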

Concept of XLink (3/12)
  Source
    a phrase that directs the reader's attention to other information
  Target
    is located at the start of the required text

Concept of XLink (4/12)
  Resource: the target object
  Linking element: the source
  Traversal: the act of moving from the linking element to the resource
  Linking element:
    See <link>More Information</link> for details.
  Resource:
    <chapter><title>More Information</title><p>The details are …</p></chapter>
  Moving from the linking element to the resource is the traversal.

Concept of XLink (5/12)
  Objects are identified using the URL mechanism
    See <link target="MyServe.MyCorp.com/xml/Doc9#X123">Details</link>
    …</chapter><chapter ident="X123"><title>Details</title><p>The details are …</chapter>...
  [Figure labels: MyCorp, MyServe (server), Doc9 (XML document)]

Concept of XLink (6/12)
  Title
    it is useful for simple links to be labeled, so that the user can decide whether it would be worthwhile to follow the link
    <!ATTLIST link … title CDATA #IMPLIED>
    … are you going to <link href="#X123" title="Location">Scarborough</link> fair?
  Role
    is used to create categories of link that can be accessed by specialized browsers
    <!ATTLIST link … role CDATA #IMPLIED>
    … are you going to <link href="#X123" role="describe">Scarborough</link> fair?

Concept of XLink (7/12)
  Content role and content title
    locators in extended links are labeled with the title attribute
    the extended link itself, if it is an in-line link, should also have a title and a role (content-title and content-role)

Concept of XLink (8/12)
  <!ATTLIST extend …
      content-role CDATA #IMPLIED
      content-title CDATA #IMPLIED>
  <song><title>Are you going to
    <extend content-title="song" content-role="reference">Scarborough
      <locator title="location" role="explain" href="…"/>
      <locator title="history" role="explain" href="…"/>
    </extend> fair?</title>…</song>
  [Figure: the sentence "Are you going to Scarborough fair?" is labeled "song"; the link on "Scarborough" offers the choices "location" and "history", one of which leads to the text "A popular seaside town in Yorkshire is Scarborough".]

Concept of XLink (9/12)
  Locating an entire document
    <link href="/xml/myfiles/detail.xml">See details</link>
  Linking to a specific element in the target file ('#' connector: the entire document is delivered)
    <link href="../myfiles/detail.xml#part3">See details, part 3</link>
  Indicating that only the referenced part of the document is required ('|' connector)
    <link href="../myfiles/detail.xml|part3">See details, part 3</link>
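
The three href forms above differ only in the connector character. A minimal Python sketch of how an application might tell them apart is shown below; the function names split_href and delivery are hypothetical, and the '|' connector belongs to the draft linking syntax described on the slide, not to ordinary URL handling.

```python
def split_href(href):
    """Split a link target into (document, connector, fragment).

    The connector is '#' or '|' (whichever comes first), or None when
    the href addresses a whole document.
    """
    positions = [i for i in (href.find("#"), href.find("|")) if i != -1]
    if not positions:
        return href, None, None
    i = min(positions)
    return href[:i], href[i], href[i + 1:]

def delivery(connector):
    """Describe what is delivered for a connector, following the slide."""
    if connector == "|":
        return "referenced part only"
    return "entire document"

for target in ("/xml/myfiles/detail.xml",
               "../myfiles/detail.xml#part3",
               "../myfiles/detail.xml|part3"):
    doc, conn, frag = split_href(target)
    print(doc, conn, frag, "->", delivery(conn))
```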

Concept of XLink (10/12)
  Link Behavior (1/2)
  Actuate
    user: the link is only traversed when explicitly selected by the user
    auto: the link is activated automatically as soon as the linking element is presented to the user
  Show
    replace: the browser replaces the source text with the required resource
    embed: the resource is fetched and embedded in the source text
    new: the browser opens a new window to display the resource, leaving the original window on-screen

Concept of XLink (11/12)
  Link Behavior (2/2)
  The show and actuate attributes also appear in extended links
    <extend show="new">
      <locate href="…" />
      <locate href="…" show="embed" />
    </extend>
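
One reading of the example above is that a locator without its own show attribute takes the value given on the enclosing extended link. The sketch below works through that reading in Python; the inheritance rule is an assumption drawn from the example rather than a statement of the specification, and the href values a.xml and b.xml are placeholders for the elided targets.

```python
import xml.etree.ElementTree as ET

# href values are placeholders; the slide elides them.
EXTEND = """<extend show="new">
  <locate href="a.xml"/>
  <locate href="b.xml" show="embed"/>
</extend>"""

def effective_show(extend):
    """Map each locator's href to its own show value, falling back to
    the value on the enclosing extended link (assumed inheritance)."""
    inherited = extend.get("show")
    return {loc.get("href"): loc.get("show", inherited)
            for loc in extend.findall("locate")}

shows = effective_show(ET.fromstring(EXTEND))
print(shows)
```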

Concept of XLink (12/12)
  Simple link
    the primitive one-directional linking scheme, which nevertheless makes it possible to traverse links between documents
  Extended link
    a multi-directional linking scheme
    an extended link contains a number of locator elements, each of which points to a resource

Simple Link
  Attributes in the linking element can influence
    the means by which a link is activated
      directly by the person ('user' link)
      directly by the application ('auto' link)
    the presentation technique required once it has been activated
      the application may jump to the specified resource ('replace')
      display the resource in another window ('new')
      insert the resource into the original text ('embed')

Extended Link (1/8)
  resources can be cross-related
  an extended link contains a number of locator elements, each of which points to a resource

Extended Link (2/8)
  extended links refer to a number of resources by including embedded resource locators
  each locator is stored in a locator element
  all related locator elements are grouped within an extended element
  the DTD author must ensure that the extended element can contain the locator element, as well as any DTD-specific elements appropriate at this point

Extended Link (3/8)
  <!ELEMENT para (#PCDATA | extend | emph)*>
  <!ELEMENT extend (#PCDATA | locate | emph)*>
  <!ATTLIST extend xml-link CDATA #FIXED "extended" … >
  <!ELEMENT locate (#PCDATA)>
  <!ATTLIST locate xml-link CDATA #FIXED "locator" … >
  <para>Here are <extend>some <emph>extended</emph> links: <locate href="…">Locator 1</locate>, <locate href="…">Locator 2</locate></extend></para>
  Rendered: Here are some extended links: Locator 1, Locator 2.

Extended Link (4/8)
  In-line links
    a link that serves as one of its own resources
    a link source that is embedded within the text
  Out-of-line links
    a link that does not serve as one of its own resources
    should be considered when
      a read-only document is involved
      different links are required for different groups of people, where seeing others' links is confusing

Extended Link (5/8)
  Out-of-line link
    provides the facility of separating the extended link from all the resources it defines
    may physically appear in-line, in the sense that it is placed in the flow of text
    a more obvious place to put out-of-line links is at the top of the document

Extended Link (6/8)
  The inline attribute must be set to 'false' to identify an out-of-line link
    <extend inline="false">
      <locator href="…">Locator 1</locator>
      <locator href="…">Locator 2</locator>
    </extend>
  [Figure: the out-of-line link connects a resource in Document A with a resource in Document B]

Extended Link (7/8)
  Extended link group (1/2)
    a number of extended document pointers are used to identify all the inter-linked documents
    they are contained in an extended group element
    => all the documents concerned are deemed to be pointers to each other

Extended Link (8/8)
  Extended link group (2/2)
  Group element
    identifies the other documents in a group of inter-linked documents
    its steps attribute contains a value stating how many steps to take
    it contains document elements, each with an href attribute
    <group steps="2"><document href="DocumentHub">Document Hub</document></group>

XPointer (1/13)
  a mechanism for identifying a designated resource by its location
  Instructions (location terms) in an XPointer
    refer to the element hierarchy tree
    include references to siblings, children and ancestors
    are read from left to right

XPointer (2/13)
  Examples
    http://MyServe.MyCorp.com/xml/doc9#ROOT()CHILD(3,chap)STRING(7,"Napoleon",0)
    http://MyServe.MyCorp.com/xml/doc9?XML-XPTR=ROOT()CHILD(3,chap)STRING(7,"Napoleon",0)

XPointer (3/13)
  Absolute Locations (1/2)
  HERE()
    identifies the current element (the linking element itself)
  ID()
    specifies an element containing an attribute of type ID
    ID(sec17)
      <section target="sec17">…</section>
  HTML()
    specifies the name of an anchor element in an HTML document
    HTML(para3)
      <p><a name="para3">The third paragraph.</a>…</p>

XPointer (4/13)
  Absolute Locations (2/2)
  ROOT()
    identifies the entire document as the container of the target resource
  DITTO()
    specifies the result of the first search as the starting point for a second search

XPointer (5/13)
  Relative Locations (1/9)
  CHILD()
    specifies a child of the current element
    CHILD(3, .)
      <para>…</para>          <- 1
      <list>…</list>          <- 2
      <para>…</para>          <- 3
    CHILD(3, *)
      <number>13</number>     <- 1
      High Str.,              <- 2
      <town>NewTown</town>    <- 3

XPointer (6/13)
  Relative Locations (2/9)
    CHILD(1, *CDATA)
      <number>13</number>
      High Str.,              <- 1
      <town>NewTown</town>
    CHILD(3, para)
      <para>…</para>          <- 1
      <list>…</list>
      <para>…</para>          <- 2
      <table>…</table>
      <para>…</para>          <- 3

XPointer (7/13)
  Relative Locations (3/9)
    CHILD(3, para, status, secret)
      <para status="secret">…</para>      <- 1
      <para status="SECRET">…</para>      <- 2
      <para status="normal">…</para>
      <para status="normal">…</para>
      <para status="Secret">…</para>      <- 3
    CHILD(2, para, author, "D. Adams")
      <para author="D.Adams">…</para>
      <para author="Dikens">…</para>
      <para author="D. Adams">…</para>    <- 1
      <para author="d. adams">…</para>
      <para author="D. Adams">…</para>    <- 2

XPointer (8/13)
  Relative Locations (4/9)
    CHILD(3, para, status, *IMPLIED)
      <para status="secret">…</para>
      <para>…</para>                      <- 1
      <para>…</para>                      <- 2
      <para status="normal">…</para>
      <para>…</para>                      <- 3
    CHILD(3, PARA, STATUS, *)
      <para status="secret">…</para>      <- 1
      <para>…</para>
      <para status="normal">…</para>      <- 2
      <para status="normal">…</para>      <- 3
      <para>…</para>
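
The counting behavior in the CHILD() examples can be approximated in a few lines of Python. This is only a sketch of the draft syntax shown on the slides, not an XPointer implementation: the helper name child, its exact flag (standing in for a quoted value, which the slides appear to match case-sensitively), and the id attributes in the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

def child(parent, n, name, attr=None, value=None, exact=False):
    """Return the n-th (1-based) child element called `name`.

    If `attr` is given, only children whose attribute equals `value`
    are counted. Unquoted values on the slides match regardless of
    case; pass exact=True to imitate a quoted, case-sensitive value.
    """
    count = 0
    for el in parent:
        if el.tag != name:
            continue
        if attr is not None:
            actual = el.get(attr)
            if actual is None:
                continue
            same = actual == value if exact else actual.lower() == value.lower()
            if not same:
                continue
        count += 1
        if count == n:
            return el
    return None

# id attributes are added only so the result can be inspected.
SEC = ET.fromstring(
    '<sec>'
    '<para id="a" status="secret"/><para id="b" status="SECRET"/>'
    '<para id="c" status="normal"/><para id="d" status="normal"/>'
    '<para id="e" status="Secret"/>'
    '</sec>')
AUTH = ET.fromstring(
    '<sec>'
    '<para id="1" author="D.Adams"/><para id="2" author="Dikens"/>'
    '<para id="3" author="D. Adams"/><para id="4" author="d. adams"/>'
    '<para id="5" author="D. Adams"/>'
    '</sec>')

third_secret = child(SEC, 3, "para", "status", "secret")
second_adams = child(AUTH, 2, "para", "author", "D. Adams", exact=True)
print(third_secret.get("id"), second_adams.get("id"))
```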

XPointer (9/13)
  Relative Locations (5/9)
  ANCESTOR()
    specifies a search through enclosing elements
  FSIBLING()
    identifies a following sibling of the current element
    to select the next element:
      <para>…</para>
      <para>…</para>          <- 1
    to select the penultimate element:
      <chapter>
      <para>…</para>
      <para>…</para>
      <para>…</para>          <- -2
      <para>…</para>          <- -1
      </chapter>

XPointer (10/13)
  Relative Locations (6/9)
  PSIBLING()
    identifies a previous sibling of the current element
    to select the previous element:
      <para>…</para>          <- 1
      <para>…</para>
    to select the second element in the enclosing element:
      <chapter>
      <para>…</para>          <- -1
      <para>…</para>          <- -2
      <para>…</para>
      <para>…</para>
      </chapter>

XPointer (11/13)
  Relative Locations (7/9)
  DESCENDANT()
    indicates an element anywhere within the location source
  FOLLOWING()
    has a similar effect to DESCENDANT(), except that it is not bounded by the current element's end-tag, but searches on to the end of the document
  PRECEDING()
    initiates a search back through the document, ignoring document hierarchies

XPointer (12/13)
  Relative Locations (8/9)
  STRING()
    locates a given letter, word, phrase or other string of text, such as 'Napoleon the Emperor'
    the first parameter is an occurrence counter
    the second parameter is the string to find
    STRING(1, 'Napoleon the Emperor', 0)
      Using the N element for name and Occ for occupation, "Napoleon the Emperor" is coded…    <- 1

XPointer (13/13)
  Relative Locations (9/9)
  STRING()
    tags are transparent to the search:
    STRING(1, 'Napoleon the Emperor', 0)
      Using the N element for name and Occ for occupation, "Napoleon the Emperor" is coded…    <- 1
      <n>Napoleon</n> the <occ>Emperor</occ>    <- 2
    the third parameter is a value specifying an offset from the start of the search text:
    STRING(1, 'Napoleon the Emperor', 7)
      Using the N element for name and Occ for occupation, "Napoleon the Emperor" is coded…    <- 1
      <n>Napoleon</n> the <occ>Emperor</occ>    <- 2
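
The idea that tags are transparent to STRING() can be illustrated by searching the concatenated character data of an element. The Python sketch below does this; the function name string_location and the sample sentence are hypothetical, and treating the third parameter as a character offset from the start of the match is an assumption based on the slide's description.

```python
import xml.etree.ElementTree as ET

def string_location(root, occurrence, text, offset=0):
    """Return the character position of the given occurrence of `text`
    in the element's character data (markup ignored), plus `offset`.
    Returns None when there are fewer occurrences than requested."""
    flat = "".join(root.itertext())   # tags disappear, text remains
    start = -1
    for _ in range(occurrence):
        start = flat.find(text, start + 1)
        if start == -1:
            return None
    return start + offset

P = ET.fromstring(
    '<p>The name <n>Napoleon</n> the <occ>Emperor</occ> is coded.</p>')
FLAT = "".join(P.itertext())

at_start = string_location(P, 1, "Napoleon the Emperor", 0)
with_offset = string_location(P, 1, "Napoleon the Emperor", 7)
print(at_start, with_offset, FLAT[with_offset])
```

The phrase is found even though it is split across the n and occ elements, because the search runs over the flattened text.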

Contents of White space
  Concept
  Line-end normalization
  White space in markup
  Element content space
  Preserved space
  Ambiguous space

Concept of white space
  'White space' describes a number of miscellaneous characters that have no visual appearance, but in some way affect the formatting of a document

Line-end normalization
  The ASCII standard includes two special characters for ending lines: CR and LF
  An XML processor uses the LF character to terminate lines
  Each code in a sequence of identical line-end codes is treated separately
  The same text with each line-end convention:
    A Macintosh [CR]Data file.[CR]
    A Macintosh [LF]Data file.[LF]
    A Macintosh [CR][LF]Data file.[CR][LF]
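
The normalization described above can be sketched in Python as two replacements: a CR LF pair becomes a single LF first, then any remaining CR becomes LF. The function name is hypothetical; an XML parser performs this step itself, so the code is only meant to make the rule concrete.

```python
def normalize_line_ends(text):
    """Translate CR LF pairs and lone CR characters to a single LF."""
    # Order matters: handle the two-character pair before lone CRs.
    return text.replace("\r\n", "\n").replace("\r", "\n")

samples = [
    "A Macintosh \rData file.\r",
    "A Macintosh \nData file.\n",
    "A Macintosh \r\nData file.\r\n",
]
normalized = [normalize_line_ends(s) for s in samples]
print(normalized)
```

All three inputs come out identical, while a run such as CR CR still yields two separate line ends.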

White space in markup
  The two examples below are deemed to be equivalent
    <book issue="3" date="15/3/97">
    <book   issue = "3"
            date = "15/3/97" >
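
A quick way to check that white space inside a tag is not significant is to parse both spellings and compare the results. The sketch below uses Python's standard ElementTree; the tags are written as empty elements so that each string is a complete document.

```python
import xml.etree.ElementTree as ET

compact = ET.fromstring('<book issue="3" date="15/3/97"/>')
spaced = ET.fromstring('<book   issue = "3"\n        date = "15/3/97"   />')

# Same element name and same attribute values, whatever the spacing.
same = compact.tag == spaced.tag and compact.attrib == spaced.attrib
print(same)
```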

Element content space
  Document authors may choose to insert white space in order to improve the presentation.
  For example, the second document fragment below is easier to read than the first:
    <sec><auth><first>Neil</first><second>Bradley</second></auth>…

    <sec>
      <auth>
        <first>Neil</first>
        <second>Bradley</second>
      </auth>
    …

Preserved space
  A distinction is made between leaving all white space characters intact and normalizing white space down to a single character.
  When left intact, the white space is said to be preserved.
  When normalized, it is said to have collapsed.
  The document author has some control over normalization of white space in the text, using a reserved attribute named 'xml:space'.
    <para xml:space="preserve">Mrs White
    Newtown
    England</para>
  is presented as:
    Mrs White
    Newtown
    England
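
The xml:space attribute is only a signal to the application; the parser passes the text through unchanged either way. A minimal Python sketch of an application honoring it might look like this. The function name text_of and the collapsing policy (runs of white space become one space, ends trimmed) are assumptions, and inheritance of xml:space from ancestor elements is not handled.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree reports xml:space under the reserved XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def text_of(el):
    """Return the element's text, collapsed unless xml:space="preserve"."""
    text = el.text or ""
    if el.get(XML_SPACE) == "preserve":
        return text
    return re.sub(r"\s+", " ", text).strip()

kept = ET.fromstring('<para xml:space="preserve">Mrs White\nNewtown\nEngland</para>')
collapsed = ET.fromstring('<para>Mrs White\nNewtown\nEngland</para>')

print(repr(text_of(kept)))
print(repr(text_of(collapsed)))
```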

Ambiguous space (1/2)
  Ambiguities may arise as to whether some white space is intended to be part of the document, or is just present to make the data file more readable
  The problem is deciding whether the line-end code after the paragraph start-tag, or the line-end code at the end of the text, should be omitted or retained.
    Is this line of text : [CR]To be kept separate from this one?
  With the line end retained:
    Is this line of text :
    To be kept separate from this one?
  With the line end omitted:
    Is this line of text : To be kept separate from this one?

Ambiguous space (2/2)
  HTML
    All white space between block elements is ignored
  SGML
    The focus is on line-end codes rather than on white space in general.
    The RS (record start) and RE (record end) characters take the roles of record delimiters.

Contents of Name space
  Compound Documents
  The Standard
  Namespace identification
  Using name spaces
  Simplification techniques
  DTD issues

Compound Documents
  It is possible for a single XML document to contain fragments that are defined in different DTDs.
  The well-formed nature of all XML structures makes it relatively simple to embed 'foreign' structures in documents.
  But there are two problems:
    how to identify which schema a particular element belongs to
    how to avoid duplication of element and attribute names

The Standard
  The Namespaces standard was produced by the W3C, and became a Recommendation in January 1999.
  This standard focuses on two issues:
    it provides a mechanism for identifying the namespaces used in the document
    it identifies which namespace a particular element or attribute belongs to

Namespace identification
  Most standards can now be identified with a specific location on the Web.
  The Namespaces standard uses URLs to identify each namespace.

Using name spaces
  Namespaces are defined using attributes. The attribute name 'xmlns' is used to declare a namespace, and at the same time to declare the prefix that will stand in for the full URL in element and attribute names.
  Attributes from one namespace can be used in elements from another.
    <X:html xmlns:X="http://www.w3.org/TR/REC-html40">
      …<X:p>An HTML paragraph.</X:p>
    </X:html>
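
To see that the prefix merely stands in for the URL, the example can be parsed with Python's ElementTree, which reports each element name in the form {namespace-URL}local-name. The second document, using the made-up prefix H, is added here only for comparison.

```python
import xml.etree.ElementTree as ET

HTML40 = "http://www.w3.org/TR/REC-html40"

with_x = ET.fromstring(
    '<X:html xmlns:X="%s"><X:p>An HTML paragraph.</X:p></X:html>' % HTML40)
# Same content with a different, arbitrary prefix.
with_h = ET.fromstring(
    '<H:html xmlns:H="%s"><H:p>An HTML paragraph.</H:p></H:html>' % HTML40)

tags_x = [el.tag for el in with_x.iter()]
tags_h = [el.tag for el in with_h.iter()]
print(tags_x)
print(tags_x == tags_h)
```

The prefix itself does not survive parsing; only the namespace URL and the local name identify the element.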

Simplification techniques
  When every element and attribute has a prefix, the document can become difficult to read, and the extra characters certainly add to its size. Fortunately, the standard includes the concept of a default namespace.
  The default namespace can be changed at any point in the document hierarchy.
    <book xmlns="file:/DTDs/book.dtd"
          xmlns:X="http://www.w3.org/TR/REC-html40">
      … <X:td></X:td> …
      … <html xmlns="http://www.w3.org/TR/REC-html40">
          … <td> … </td> ...
        … </html>
      <para> . . . </para>
    </book>
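
The scoping of the default namespace in the example above can be checked by parsing a cut-down version of it and listing the namespace each element ends up in. This Python sketch fills the elided parts with empty elements; the helper name split_tag is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<book xmlns="file:/DTDs/book.dtd"
      xmlns:X="http://www.w3.org/TR/REC-html40">
  <X:td/>
  <html xmlns="http://www.w3.org/TR/REC-html40"><td/></html>
  <para/>
</book>"""

def split_tag(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    uri, _, local = tag[1:].partition("}")
    return uri, local

scopes = [split_tag(el.tag) for el in ET.fromstring(DOC).iter()]
for uri, local in scopes:
    print(local, "->", uri)
```

The unprefixed td inside html belongs to the HTML namespace, while para, outside the html element, falls back to the book namespace.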

DTD issues
  • In order to parse documents against a DTD, it is necessary to include the prefixes in the element definitions:
      <!ELEMENT document (shoe|boot|slipper|veh:bonnet|veh:boot|veh:wheel)*>
  • The namespace definition can also be included in the DTD.
  • The DTD must also include references to all allowed children in the element content models, regardless of the namespace they may belong to.