Optimizing your DITA Content Model for Translation
Amber Swope
DITA Strategies, Inc.
Nov 07, 2014
About the Speaker
• Over 20 years of experience in the industry at multiple companies of varying sizes and industries
• Author of numerous papers/presentations on information development and information architecture, including the “DITA Maturity Model” with Michael Priestley
copyright DITA Strategies, Inc. 2012
Process overview
1. Know what DITA provides
2. Indicate when to translate content
3. Use appropriate DITA elements
4. Remove ambiguity from content model
5. Avoid inline content or key references
DITA knowledge poll
1. Have implemented DITA and sent content through multiple rounds of translation
2. Have implemented DITA and sent content through first translation
3. Have implemented DITA but not yet sent content through first translation
4. Have some theoretical DITA knowledge, but no implementation experience
5. Know what the acronym means
DITA overview
• Darwin Information Typing Architecture (DITA)
• Modular, structured, XML framework based on a topic-based architecture
• Open-source standard approved and supported by OASIS
• Implemented by companies in many industries around the world
Know what DITA provides
• DITA translation support
• Best practices
DITA translation support
Attributes that you can specify on each instance of an element:
• @translate attribute
• @xml:lang attribute
• @dir attribute
@translate attribute
• Indicates whether the content of the element should be translated or not.
• Default value is “yes”.
Example:
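A minimal sketch of the attribute in use. The command name and the sentence are hypothetical; only the @translate attribute itself comes from the slide.

```xml
<!-- Illustrative fragment: the command text is an assumption. -->
<!-- The <ph> content is flagged so that translators leave it unchanged. -->
<p>To start the service, enter
  <ph translate="no">startsvc all</ph>
  at the command prompt.</p>
```

The surrounding sentence carries no attribute, so it keeps the default and is translated.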
@xml:lang attribute
• Specifies the language of the element content.
• Values are from W3C (http://www.w3.org/TR/REC-xml/)
Example:
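A short sketch, assuming a hypothetical concept topic written in US English that quotes one French word. The topic id, title, and text are illustrative.

```xml
<!-- Illustrative topic: id, title, and text are assumptions. -->
<concept id="greeting" xml:lang="en-us">
  <title>Greeting customers</title>
  <conbody>
    <!-- The nested xml:lang marks a phrase in a different language. -->
    <p>In French-speaking regions, open with
      <q xml:lang="fr-fr" translate="no">Bonjour</q>.</p>
  </conbody>
</concept>
```

Setting the language on the topic root covers all of its content; an inner element needs the attribute only where the language differs.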
@dir attribute
Specifies the directionality of text.
Values:
• ltr – left-to-right (processing default)
• rtl – right-to-left
Example:
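A small sketch, assuming an English paragraph that embeds one Arabic label. The sentence and the label are illustrative.

```xml
<!-- Illustrative fragment: the sentence and the Arabic label are assumptions. -->
<!-- The paragraph keeps the default direction; only the phrase is right-to-left. -->
<p xml:lang="en-us">The label
  <ph xml:lang="ar" dir="rtl">مرحبا</ph>
  appears on the welcome screen.</p>
```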
Best practices
• Update files only with changed text
• Translate reused or common content first
• Provide translations for generated output
• Provide full source language text for verification
• Use language-specific stylesheets
Goals
• Avoid translators changing elements
• Automate formatting with language-specific stylesheets
Indicate when to translate content
• Use @translate attribute on an element
• Identify specific elements to not be translated
Use @translate attribute on an element
• Pro: can control translation for each instance of an element
• Con: must specify for each instance of an element
• Common elements for which to indicate translation: <term>, <ph>, <keyword>, <q>
Example
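A sketch of per-instance control, assuming a hypothetical product in which one keyword is a literal setting name and another is ordinary vocabulary. All names and text are illustrative.

```xml
<!-- Illustrative fragment: keywords and sentence are assumptions. -->
<p>Set <keyword translate="no">MaxRetry</keyword> to limit how often the
  <keyword>scheduler</keyword> repeats a failed job.</p>
<!-- First keyword: a literal name, so this instance is marked translate="no". -->
<!-- Second keyword: ordinary vocabulary, so it keeps the default and is translated. -->
```

This shows both the benefit and the cost named above: each instance can differ, but an author has to remember the attribute every time.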
Identify specific elements to not be translated
• Pro: can globally indicate that content is not to be translated
• Con: no flexibility
• Elements that are not usually translated:
  • All elements in the programming domain (<codeblock>, <codeph>, <parmname>, …) because they present code, which is usually in English
  • <tm> because trademarks are not usually translated
Non-translated element example
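The original example is not preserved here. As a stand-in, the sketch below assumes a translation workflow that has been told to skip <codeph> and <tm> globally; how that rule is configured depends on the translation tool and is not shown. The command and the product name are hypothetical.

```xml
<!-- Illustrative fragment: command and trademark are assumptions. -->
<!-- No translate attribute appears: under the assumed global rule, the -->
<!-- contents of <codeph> and <tm> are skipped and the rest is translated. -->
<p>Run <codeph>backup /data</codeph> before you upgrade
  <tm tmtype="reg">Acme</tm> Server.</p>
```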
Use appropriate DITA elements
• Elements to use
• Glossary element support for alternative forms of a word or phrase
Elements to use
• <menucascade><uicontrol> for menu option selection
• <fn> for footnotes
• <note> with appropriate @type attribute value
• <prereq> for prerequisites
• Any element for which you generate a label
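The elements listed above can be sketched together in one small task topic. The topic id, menu names, and wording are hypothetical; the point is that separators, labels, and footnote markers come from the stylesheet rather than from typed text.

```xml
<!-- Illustrative task: ids, menu names, and wording are assumptions. -->
<task id="save-report" xml:lang="en-us">
  <title>Saving a report</title>
  <taskbody>
    <!-- The "Prerequisites" label is generated, not typed. -->
    <prereq>Open the report that you want to save.</prereq>
    <steps>
      <step>
        <!-- The separator between menu items is generated. -->
        <cmd>Click <menucascade><uicontrol>File</uicontrol>
          <uicontrol>Save As</uicontrol></menucascade>.</cmd>
        <info>
          <!-- The "Tip" label is generated from the type value. -->
          <note type="tip">Reports are saved as PDF by default.</note>
          <p>Large reports can take several minutes to save.<fn>Timing
            depends on the number of charts.</fn></p>
        </info>
      </step>
    </steps>
  </taskbody>
</task>
```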
Glossary support
• Glossary topic provides the full definition of a term, including alternatives for the primary term defined in the <glossterm> element
• The alternatives are nested within the <glossAlt> element:
  • <glossAbbreviation> – abbreviated form of the primary term
  • <glossShortForm> – shorter alternative to the primary term
  • <glossAcronym> – acronym for the primary term
• <glossSurfaceForm> (a child of <glossBody> rather than <glossAlt>) – proper presentation for first instance of term in output
• Reference glossary content with the <term> or <abbreviated-form> element using key referencing
Glossary usage
1. Define all information for a term in a glossary topic in the source language.
2. Create a key reference to the glossary topic that defines the term.
   • If you want to reuse the primary term, use the <term> element.
   • If you want to reuse an acronym or the surface form, use the <abbreviated-form> element.
3. Translate all elements in the glossary topic as applicable in each target language; leave empty all inapplicable elements.
The DITA-OT processing resolves the <abbreviated-form> element to the <glossterm> element if <glossAcronym> and <glossSurfaceForm> are empty.
Glossary example
[Figure: glossary topic, concept topic, and generated output]
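The original figure is not preserved here. The sketch below is a stand-in that follows the same three parts, assuming a hypothetical term, file name (abs.dita), and key (abs). The three fragments belong to separate files and are shown in one block only for convenience.

```xml
<!-- 1. Glossary topic (abs.dita): term, file name, and ids are assumptions. -->
<glossentry id="abs" xml:lang="en-us">
  <glossterm>Anti-lock Braking System</glossterm>
  <glossdef>A braking system that prevents the wheels from locking.</glossdef>
  <glossBody>
    <glossSurfaceForm>Anti-lock Braking System (ABS)</glossSurfaceForm>
    <glossAlt>
      <glossAcronym>ABS</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>

<!-- 2. Map: binds the hypothetical key "abs" to the glossary topic. -->
<keydef keys="abs" href="abs.dita"/>

<!-- 3. Concept topic body: the author writes only the key reference. -->
<p>The <abbreviated-form keyref="abs"/> improves steering control.
  Check the <abbreviated-form keyref="abs"/> warning light regularly.</p>
```

With a glossary topic like this, a processor can present the surface form where the term is introduced and the acronym elsewhere. In a target language that has no acronym for the term, the translator leaves <glossAcronym> and <glossSurfaceForm> empty, and the reference falls back to the translated <glossterm> as described in the usage steps.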
Remove ambiguity from content model
• Guidelines
• Element usage
• Single purpose for each element
• Manual formatting
• Specialization or @outputclass attribute
Guidelines
Avoid
• Using the formatting elements
• Using an element for multiple purposes
• Typing formatting, such as quotation marks
• Adding unnecessary formatting that processing can handle
• Relying on @outputclass attribute values for element identification
Instead
• Use element that identifies the content
• Clearly indicate the proper usage for each element
• Use proper element and update stylesheets
• Specialize to create elements if necessary
Element usage
Content purpose            Ambiguous           Clear
User interface item        <b>                 <uicontrol>
Citation of resource       <i> or “…”          <cite>
Presentation of new term   <i>                 <term>
Quotation                  “…”                 <q> or <lq>
Directory path             <codeph> or <ph>    <filepath>
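The table can be read as a before-and-after rewrite. The sketch below applies it to one hypothetical sentence; the button name, book title, message, and path are all illustrative.

```xml
<!-- Ambiguous: formatting elements and typed quotation marks. -->
<p>Click <b>Save</b>. The message "Report saved" appears. See the
  <i>Installation Guide</i> for the contents of <codeph>C:\app\config</codeph>.</p>

<!-- Clear: each element identifies what the content is. -->
<p>Click <uicontrol>Save</uicontrol>. The message <q>Report saved</q> appears.
  See the <cite>Installation Guide</cite> for the contents of
  <filepath>C:\app\config</filepath>.</p>
```

In the second version the quotation marks are generated, so a language-specific stylesheet can supply the marks that each target language expects.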
Single purpose for elements
Guidelines
• Be reasonable – find the balance between clarity and complexity
• Use elements for their intended purpose
• Clearly define usage for content authors
Examples
• <filepath> – if the formatting for directory paths and file names is the same, then use for both purposes
• <pre> versus <codeblock> versus <screen> versus <systemoutput> – if the formatting is the same, then use <codeblock>
Manual formatting to avoid
• Quotation marks
• Table headings
• Titles
• Terms
• Labels
Specialization versus @outputclass attribute
Specialization
• Allows you to create new element types and attributes that are explicitly and formally derived from existing types
• Provides selectable elements or attributes for authors
@outputclass attribute
• Names a role that the element is playing
• Used primarily to provide styling instructions during generation
Specialization versus @outputclass attribute (continued)
Specialize element
• No DITA element properly identifies the content
• Authors need to use frequently and consistently
• Authors must specify usage
Use @outputclass
• You need to indicate a variation on output formatting for existing element
• Expert needs to use infrequently
• You can incorporate into templates (no author specification)
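To make the contrast concrete, here is a sketch of the same sentence marked up both ways. The @outputclass value and the <emphasis> element are hypothetical; <emphasis> does not exist in base DITA and would have to be created through specialization.

```xml
<!-- Option 1: @outputclass on an existing element. The value "emphasis" is -->
<!-- a hypothetical name that a stylesheet would have to recognize. -->
<p>Do <ph outputclass="emphasis">not</ph> disconnect the cable.</p>

<!-- Option 2: a hypothetical element specialized from <ph>. Its DTD would -->
<!-- supply a class value along the lines of "+ topic/ph emph-d/emphasis ". -->
<p>Do <emphasis>not</emphasis> disconnect the cable.</p>
```

Option 1 depends on authors typing the value consistently, which suits infrequent or template-driven use. Option 2 gives authors a selectable element that the editor can validate.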
Specialization considerations
• When authors must have control over processing, such as collapsible/expandable substeps
• When authors must manually type a value
Specialization examples
• Sidebar support to provide sidebars for articles
• Specific table types to support consistency
• Collapsible/expandable elements to allow authors to control display
• Emphasis element to eliminate <b> or <i> usage
• Foreign word to identify non-translated foreign words
• Custom list structures to support consistency
Avoid inline content or key references
• Definitions
• Referencing issues
• Best practices
• Strategies
Definitions
Content references allow you to directly reuse or include elements into topics
Key references allow you to indirectly reuse content (like a placeholder)
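A minimal sketch of both mechanisms, assuming a hypothetical shared file (shared.dita), element ids, key name, and product name. The fragments belong to different files and are shown together only for convenience.

```xml
<!-- Content reference: pulls in the element with id "power-warning" from the -->
<!-- topic with id "shared" in shared.dita (file and ids are assumptions). -->
<note conref="shared.dita#shared/power-warning"/>

<!-- Key reference: "product-name" is a placeholder that the map resolves. -->
<p>Install <ph keyref="product-name"/> before you continue.</p>

<!-- In the map: one possible definition of the hypothetical key. -->
<keydef keys="product-name">
  <topicmeta>
    <keywords><keyword>Acme Widget</keyword></keywords>
  </topicmeta>
</keydef>
```

The content reference names its target directly, while the key reference can be given a different value by swapping the map.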
Referencing issues
• Article agreement of reused words or phrases
  • Gender
  • Singular v. plural
• Capitalization
  • First word in a sentence
  • Expansion of abbreviated forms
• Inflection in translated content
  • Word changes by role in sentence
References best practices
Reference
• Complete units of content
  • Block elements
  • Full sentences
• Non-translated text
• Proper nouns (when subject of sentence)
Do not reference
• Common nouns
• Translated text
Strategies
• Consider including the article in the reference
• Avoid using references as the first word in a sentence
• For commands, do not include the noun
[Figure: “No” and “Yes” examples]
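The original examples are not preserved here. The sketch below is one possible reading of the last strategy, not the speaker's own example: the key names, the command, and the sentences are hypothetical.

```xml
<!-- Map: two hypothetical keys. The first bundles a translatable noun with -->
<!-- the command name; the second holds only the non-translated name. -->
<keydef keys="cmd-copy-with-noun">
  <topicmeta><keywords><keyword>copy command</keyword></keywords></topicmeta>
</keydef>
<keydef keys="cmd-copy">
  <topicmeta><keywords><keyword translate="no">copy</keyword></keywords></topicmeta>
</keydef>

<!-- No: the referenced text includes the noun, which may need a different -->
<!-- article, inflection, or position in the target language. -->
<p>Run the <cmdname keyref="cmd-copy-with-noun"/> to duplicate the file.</p>

<!-- Yes: only the command name is referenced; the noun stays in the -->
<!-- sentence, where the translator can adjust it. -->
<p>Run the <cmdname keyref="cmd-copy"/> command to duplicate the file.</p>
```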
Summary
1. Know what DITA provides
2. Indicate when to translate content
3. Use appropriate DITA elements
4. Remove ambiguity from content model
5. Avoid inline content or key references
Resources
OASIS DITA Translation Subcommittee
• “Best Practice for Managing Acronyms and Abbreviations in DITA”
• “Translation Best Practice for Leveraging Translation Memory”
• “Best Practice for Indexing DITA Topics for Translation”
• “Best Practice for Using the DITA CONREF Attribute for Translation”
http://dita.xml.org/wiki/optimizing-dita-for-translations