Page 1: epicenter2010   Open Xml
Page 2: epicenter2010   Open Xml

Trinity College, Dublin: 8 – 11 June 2010

An Introduction to Open XML

CRAIG MURPHY

Page 3: epicenter2010   Open Xml


Housekeeping

Mobile ‘phones

Fire Exits

Toilets

Page 4: epicenter2010   Open Xml

Session Overview

This session will provide an explanation and demonstration of how we can programmatically create and use WordML and ExcelML documents

I will be using the Open XML SDK to make life easier

No manual creation and management of .zip files / containers

Let System.IO.Packaging, etc. take care of that

Avoids a discussion about code bloat, XML bloat and performance (which is actually very good)

It won’t be a political view of the “document wars” debate

There will be no XPS vs PDF vs Open XML vs ODF / OpenDocument content!

Page 5: epicenter2010   Open Xml


If you learn one thing from my session…

On this day…June 8th…

1978: Woman takes world sailing record

Yachtswoman Naomi James breaks the solo round-the-world sailing record by two days.

Page 6: epicenter2010   Open Xml


Office 2010 – First Run

Page 7: epicenter2010   Open Xml

Disclaimer

This session includes some content from Microsoft slide decks

Not going to be an in-depth look at the Open XML API

Code and demonstrations to get you started

Simplified version of the methods I use to generate custom reports in a non-production version of a production application!

I’m a developer, not a designer!

No flashy graphics or fancy documents

Let’s ignore the i4i injunction that a judge in Texas imposed on Microsoft Word!

Page 8: epicenter2010   Open Xml

About Me

60+ presentations delivered:

IMTC 2008, epicenter 2009

NRW06, NRW07

DeveloperDeveloperDeveloper (UK / Ireland Community Events)

Scottish Developers

Agile Scotland

British Computer Society (BCS)

UK Borland User Group (DDG)

Visual Basic User Group (VBUG)

VBUG .net Winter 2001 conference

XML One 2001

60+ articles/book reviews published:

The Delphi Magazine

developers’ magazine (Dotnet Developers’ Group - DDG)

ASPToday.com (now Wiley, previously Wrox)

ASP.NET Pro, International Developer

CSharpCorner, DeveloperFusion

Open XML

XML

XSLT

XQuery

XML Schema

SOAP

WML

IntraWeb

Web Services

C# InterOp with Delphi

RUP

UML

TDD in C#, VB.net and Delphi 8

Scrum

Page 9: epicenter2010   Open Xml


Agenda

Motivation

The Tools

What: Open XML SDK 2, API Design

How: Demos, Code Generation, Injection, Content Controls

Why: Summary

Resources

Page 10: epicenter2010   Open Xml


Motivation

There are times when we are too focused on application development

New/useful tools and techniques are passed by

60-90 minute sessions like these, personally, help me save time by:

Identifying new/useful tools and techniques

Demonstrating new/useful tools and techniques

Your takeaway: is Open XML something you should be investigating further, or not, as the case may be?

I have been using Excel automation (COM type libraries) for report creation…since 1999

Gone through the “macro dilemma” – to use macros or not?

For Win32 Borland Delphi applications

For Win32 .net C# applications

Page 11: epicenter2010   Open Xml


The Tools

Visual Studio 2010 Professional

Open XML SDK 2 RTM (March 2010)

Sits inside the .NET 3.5 SP1 space (more about this later on); SDK makes use of LINQ

Office 2010 Standard

Only required for viewing documents

Unlike COM-based automation, an Office client is not required

A boon if you are preparing reports server-side

Previously used:

Visual Studio 2008 Professional

Office 2007

Open XML SDK CTPs

Page 12: epicenter2010   Open Xml


Agenda

Motivation

The Tools

What: Open XML SDK 2, API Design

How: Demos - Manual, Code Generation, Injection

Why: Summary

Resources

Page 13: epicenter2010   Open Xml


Open XML SDK 2

Productivity Tool

DocumentReflector for code generation

OpenXMLClassExplorer to explore the Open XML markup and the ECMA-376 specification

OpenXMLDiff to graphically compare Open XML files

OpenXMLValidator to validate entire documents or “document parts” against the Office 2007 or Office 2010 file formats

Page 14: epicenter2010   Open Xml

What is Open XML?

…an open standard for word-processing documents, presentations, and spreadsheets that can be freely implemented by multiple applications on different platforms

…faithful representation of existing word-processing documents, presentations, and spreadsheets that are encoded in binary formats defined by Microsoft® Office applications, i.e. tightly coupled

…purpose of the Open XML standard is to de-couple documents created by Microsoft Office applications so that they can be manipulated by other applications independent of proprietary formats and without the loss of data

https://connect.microsoft.com/content/content.aspx?ContentID=9521&SiteID=589&wa=wsignin1.0

Page 15: epicenter2010   Open Xml

Before…Open XML SDK V2

Namespaces, element names and attributes were irksome to remember and to get right

Generally, constants were used to make managing namespaces, etc. that bit easier

Lack of strong typing

Code would compile

May produce incorrect results at run-time

<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'><w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p></w:body></w:document>
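The string-based style described on this slide can be sketched with Python's standard library. This is an illustration of the problem only, not the SDK's C# API; the helper name `paragraph_texts` and the deliberately misspelled tag `para` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, held in a constant as the slide suggests.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"

DOCUMENT_XML = (
    "<w:document xmlns:w='" + W_NS + "'>"
    "<w:body><w:p><w:r><w:t>some text</w:t></w:r></w:p></w:body>"
    "</w:document>"
)


def paragraph_texts(document_xml, paragraph_tag):
    """Return the text of each paragraph, locating elements by string name."""
    root = ET.fromstring(document_xml)
    body = root.find(W + "body")
    return [
        "".join(t.text or "" for t in paragraph.iter(W + "t"))
        for paragraph in body.findall(W + paragraph_tag)
    ]


correct = paragraph_texts(DOCUMENT_XML, "p")
# A misspelled element name is still a valid string, so nothing fails early;
# the query simply matches no elements.
misspelled = paragraph_texts(DOCUMENT_XML, "para")

print(correct)
print(misspelled)
```

The second call runs without error and quietly returns an empty list, which is the "compiles but may produce incorrect results at run-time" risk the slide describes.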

Page 16: epicenter2010   Open Xml


Now…Open XML SDK V2

Strongly Typed Object Model

Node identification using strings is a thing of the past

Loosely typed System.Xml.Linq.XElement usage can be replaced

e.g. DocumentFormat.OpenXml.Wordprocessing.Paragraph

Spelling mistakes are caught by compile-time type checking

Obviously strong typing is preferable

BEFORE

var paragraphs = doc.MainDocumentPart
    .GetXDocument()
    .Element(w + "document")
    .Element(w + "body")
    .Elements(w + "p")
    .Select

AFTER

var paragraphs = doc.MainDocumentPart.Document.Body.Elements<Paragraph>().Select

Page 17: epicenter2010   Open Xml


API Design

Page 18: epicenter2010   Open Xml

API Design: System Support

.NET Framework 3.5 – The Open XML SDK leverages the advanced technology provided by .NET Framework 3.5, especially LINQ to XML, which makes manipulating XML much easier and more intuitive

System.IO.Packaging – The Open XML SDK needs to be able to add/remove parts contained within the Open XML Format packages. Included as part of .NET Framework 3.0 was a set of generic packaging APIs capable of adding and removing parts of OPC (Open Packaging Conventions) conforming packages. Given that Open XML Formats are based on OPC, the SDK uses System.IO.Packaging APIs to open, edit and save Open XML Packages

Open XML Schemas – The Open XML SDK is based on Open XML Formats, which are represented and described as schemas. These schemas make up the foundation of the Open XML SDK, since the SDK enables Open XML developers to build solutions on top of Open XML Formats

Page 19: epicenter2010   Open Xml


API Design: Open XML File Format Base Level

Stream Reading/Writing

includes stream reader and writer interfaces targeting Open XML elements and attributes

similar to XmlReader/XmlWriter, easier to use as the interfaces are Open XML aware

Open XML Low Level DOM

Manipulate the Open XML tree directly by working with strongly typed objects and classes instead of traditional XML nodes

Less need to remember namespaces and element/attribute names

Intellisense for properties, etc.

Leverages LINQ

Open XML Packaging API

Sits above System.IO.Packaging (.NET 3.0)

allows developers to manipulate Open XML parts with strongly typed classes and objects

Shipped in Open XML SDK v1.0

Page 20: epicenter2010   Open Xml


API Design: Validation & Helpers

Validation Layer

Open XML base layer does not guarantee creation of valid Open XML documents!

Our reliance on XML Schema, XSD files, is reduced if not removed

The SDK takes care of it on our behalf

Helper Functions

Work directly on the XML elements and are functionally limited by the file format standard

e.g. deletion of a WordML paragraph – a helper function may ensure that all additional steps are taken to leave the document in a valid state…

Page 21: epicenter2010   Open Xml


The Importance of Validation

http://blogs.msdn.com/brian_jones/archive/2009/04/08/announcing-the-release-of-the-open-xml-sdk-version-2-april-2009-ctp.aspx

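Valid: the text element sits inside a run

<w:body>
  <w:p>
    <w:r>
      <w:t>hello world</w:t>
    </w:r>
  </w:p>
  ...
</w:body>

Invalid: the run is missing, so the text element is a direct child of the paragraph

<w:body>
  <w:p>
    <w:t>hello world</w:t>
  </w:p>
  ...
</w:body>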
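The two fragments on this slide differ only in whether `w:t` is wrapped in a `w:r`. As a rough sketch of what a validation layer has to catch, the Python below checks that single rule. It is a toy check written for this example, not the SDK's validator, and the function name `misplaced_text_elements` is an assumption.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def misplaced_text_elements(body_xml):
    """Count w:t elements whose parent is not a run (w:r).

    Simplified rule for illustration: real validation covers the whole
    schema, not just this one parent/child constraint.
    """
    root = ET.fromstring(body_xml)
    count = 0
    for parent in root.iter():
        if parent.tag == W + "r":
            continue  # text inside a run is where it belongs
        count += sum(1 for child in parent if child.tag == W + "t")
    return count


VALID = (
    "<w:body xmlns:w='" + W_NS + "'>"
    "<w:p><w:r><w:t>hello world</w:t></w:r></w:p></w:body>"
)
INVALID = (
    "<w:body xmlns:w='" + W_NS + "'>"
    "<w:p><w:t>hello world</w:t></w:p></w:body>"
)

valid_errors = misplaced_text_elements(VALID)
invalid_errors = misplaced_text_elements(INVALID)
print(valid_errors, invalid_errors)
```

Both fragments are well-formed XML, so an ordinary XML parser accepts either one; only a format-aware check tells them apart.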

Page 22: epicenter2010   Open Xml


Agenda

Motivation

The Tools

What: Open XML SDK 2, API Design

How: Demos, Code Generation, Injection, Content Controls

Why: Summary

Resources

Page 23: epicenter2010   Open Xml

WordML Document Structure

Take a .docx, an .xlsx or a .pptx file, rename it as a .zip file

Open using Compressed Folders or your favourite zip utility

Very readable, but without the SDK, difficult to manage, especially in code
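To make the "it is a zip of XML files" point concrete, here is a minimal Python sketch that writes a package with three parts and reads the paragraph text back. The part names and content types follow the Open Packaging Conventions as I understand them; the helper names `build_minimal_docx` and `read_paragraphs` are assumptions, and the result has not been checked against Word here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

PACKAGE_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

DOCUMENT_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="{ns}"><w:body><w:p><w:r><w:t>{text}</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def build_minimal_docx(text):
    """Return the bytes of a package holding one paragraph of text."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", PACKAGE_RELS)
        package.writestr(
            "word/document.xml",
            DOCUMENT_TEMPLATE.format(ns=W_NS, text=escape(text)),
        )
    return buffer.getvalue()


def read_paragraphs(docx_bytes):
    """Open the package as a zip and return the text of each paragraph."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    w = "{" + W_NS + "}"
    return ["".join(t.text or "" for t in p.iter(w + "t")) for p in root.iter(w + "p")]


docx_bytes = build_minimal_docx("hello world")
with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
    part_names = package.namelist()
paragraphs = read_paragraphs(docx_bytes)
print(part_names)
print(paragraphs)
```

Even this small case needs three cooperating files, which hints at why managing a realistic document by hand becomes difficult and why the SDK is worth having.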

Page 24: epicenter2010   Open Xml


Document Parts

A document part is…

analogous to a file on the file system

stored inside the package in a specific location reachable via a URI

stored with a specific content type

mainly XML but other native types as well

Images, sounds, video, OLE objects

Content type is enforced

Example: cannot tag JPEG part as GIF

[Open Excel - sample file – look for the image]
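Content types are declared in the package's `[Content_Types].xml` part, either per file extension (`Default`) or per part name (`Override`). The Python sketch below resolves a part's content type from such a file. The sample XML and the function name `content_type_of` are assumptions for illustration, and the lookup ignores the case-insensitive part-name matching that a full implementation would need.

```python
import xml.etree.ElementTree as ET

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Hypothetical content-types part for a workbook that embeds a JPEG image.
CONTENT_TYPES_XML = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Default Extension="jpeg" ContentType="image/jpeg"/>'
    '<Override PartName="/xl/workbook.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>'
    '</Types>'
)


def content_type_of(part_name, content_types_xml=CONTENT_TYPES_XML):
    """Resolve a part's content type: Override entries win over Default ones."""
    root = ET.fromstring(content_types_xml)
    for override in root.findall(CT_NS + "Override"):
        if override.get("PartName") == part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in root.findall(CT_NS + "Default"):
        if default.get("Extension").lower() == extension:
            return default.get("ContentType")
    return None  # no declared content type for this part


image_type = content_type_of("/xl/media/image1.jpeg")
workbook_type = content_type_of("/xl/workbook.xml")
undeclared_type = content_type_of("/xl/media/image1.gif")
print(image_type, workbook_type, undeclared_type)
```

In this sample no GIF type is declared, so a `.gif` part resolves to nothing; a consumer that honours content types would have no basis for treating it as an image.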

Page 25: epicenter2010   Open Xml

ExcelML Document Structure

Relationships are stored in XML streams in the package

Ties elements inside the package to each other

Allows navigation of document without parsing parts

Package relationships stream URI: /_rels/.rels

Part relationships stream URI: _rels/[partname].rels
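The two URI patterns above can be sketched in Python: one helper works out where a relationships stream lives, another reads its entries. The function names and the sample stream are assumptions for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

# Hypothetical package-level relationships stream pointing at a workbook.
SAMPLE_RELS = (
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="xl/workbook.xml"/>'
    '</Relationships>'
)


def relationships_path(part_name=None):
    """Return the relationships stream URI for the package or for one part."""
    if part_name is None:
        return "/_rels/.rels"
    folder, name = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", name + ".rels")


def read_relationships(rels_xml):
    """Map each relationship Id to its (Type, Target) pair."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): (rel.get("Type"), rel.get("Target"))
        for rel in root.findall(REL_NS + "Relationship")
    }


package_rels_path = relationships_path()
workbook_rels_path = relationships_path("/xl/workbook.xml")
relationships = read_relationships(SAMPLE_RELS)
print(package_rels_path, workbook_rels_path)
print(relationships["rId1"][1])
```

Following targets from `/_rels/.rels` outward is how a consumer can walk a document's structure without opening the parts themselves.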

Page 26: epicenter2010   Open Xml


demo

WordML and ExcelML

Page 27: epicenter2010   Open Xml


Content Controls

New in Word 2007

Manageable via the Word Content Control Toolkit

Programmatic access to specific “fields” within a document

“Bindable”

Can be bound to XML nodes

Makes use of the customXML folder
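In the document markup a content control is a `w:sdt` element whose properties carry a tag and whose content holds the value. The Python sketch below reads tagged controls from a document body. The sample markup and the function name `content_control_values` are assumptions, and data binding to the customXML folder is not covered.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"

# Hypothetical body with one block-level content control tagged "customer".
SAMPLE_BODY = (
    "<w:body xmlns:w='" + W_NS + "'>"
    "<w:sdt>"
    "<w:sdtPr><w:alias w:val='Customer name'/><w:tag w:val='customer'/></w:sdtPr>"
    "<w:sdtContent><w:p><w:r><w:t>Contoso</w:t></w:r></w:p></w:sdtContent>"
    "</w:sdt>"
    "<w:p><w:r><w:t>Plain paragraph outside any control</w:t></w:r></w:p>"
    "</w:body>"
)


def content_control_values(body_xml):
    """Return {tag: text} for every content control that has a tag."""
    root = ET.fromstring(body_xml)
    values = {}
    for control in root.iter(W + "sdt"):
        tag = control.find(W + "sdtPr/" + W + "tag")
        content = control.find(W + "sdtContent")
        if tag is None or content is None:
            continue  # untagged controls cannot be addressed by name
        values[tag.get(W + "val")] = "".join(
            t.text or "" for t in content.iter(W + "t")
        )
    return values


controls = content_control_values(SAMPLE_BODY)
print(controls)
```

The tag is what gives code a stable handle on a "field", independent of where the control sits in the document.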

Page 28: epicenter2010   Open Xml

Enabling the Developer ribbon – Word 2007

Page 29: epicenter2010   Open Xml

Enabling the Developer ribbon – Word 2010

Page 30: epicenter2010   Open Xml

Why Use Content Controls?

In situations where small amounts of information are collected from many users:

How often have you seen a spreadsheet being e-mailed to hundreds of users, asking them to fill in “some” cells?

Give them a Word document with Content Controls

Use a custom-written .NET application that aggregates the information in the Content Controls into an Excel spreadsheet
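The aggregation idea can be sketched as follows. This is a Python stand-in for the .NET application the slide mentions, with CSV text standing in for the Excel spreadsheet; the tags `name` and `days`, the helper names, and the sample responses are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def make_response(name, days):
    """Build a stand-in document body with two tagged inline content controls."""
    def control(tag, value):
        return (
            "<w:sdt><w:sdtPr><w:tag w:val='" + tag + "'/></w:sdtPr>"
            "<w:sdtContent><w:r><w:t>" + value + "</w:t></w:r></w:sdtContent>"
            "</w:sdt>"
        )
    return (
        "<w:body xmlns:w='" + W_NS + "'><w:p>"
        + control("name", name) + control("days", days)
        + "</w:p></w:body>"
    )


def read_controls(body_xml):
    """Return {tag: text} for the tagged content controls in one response."""
    root = ET.fromstring(body_xml)
    row = {}
    for control in root.iter(W + "sdt"):
        tag = control.find(W + "sdtPr/" + W + "tag")
        if tag is not None:
            row[tag.get(W + "val")] = "".join(
                t.text or "" for t in control.iter(W + "t")
            )
    return row


def aggregate_to_csv(bodies, columns):
    """Write one CSV row per response, one column per content control tag."""
    output = io.StringIO()
    writer = csv.DictWriter(
        output, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for body in bodies:
        writer.writerow(read_controls(body))
    return output.getvalue()


responses = [make_response("Ann", "3"), make_response("Bob", "5")]
report = aggregate_to_csv(responses, ["name", "days"])
print(report)
```

Each respondent only ever edits the controls, so the collected values line up by tag regardless of what else they do to the document.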

Page 31: epicenter2010   Open Xml

demo

Content Controls

CustomXML

in Word 2007 / Word 2010

Page 32: epicenter2010   Open Xml


Deployment

All that you need to deploy are:

Your OpenXML-enabled application

DocumentFormat.OpenXml.dll

WindowsBase.dll

.NET (VPC test…)

c:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\WindowsBase.dll

http://blogs.msdn.com/dmahugh/archive/2006/12/14/finding-windowsbase-dll.aspx

Page 33: epicenter2010   Open Xml

Agenda

Motivation

The Tools

What: Open XML SDK 2, API Design

How: Demos - Manual, Code Generation, Injection

Why: Summary

Resources

Page 34: epicenter2010   Open Xml


Summary

Open XML is little more than a moderately complex XML document

XML is readily accessible

in the .NET framework

in VB6

in Java

in Python, etc.

An Office installation is not required

Office client not required on the server

Enables Office document creation from non-Microsoft platforms

“…it’s just zip, it’s just XML…” - Doug Mahugh

http://channel9.msdn.com/posts/AdamKinney/Open-XML-File-Formats

Page 35: epicenter2010   Open Xml

Summary

Start from a template document

Easy replication of existing [client] documents

Use the DocumentReflector to generate Open XML code

Refactor your report data into the generated code

Learn from the reflected / generated code

Open XML code is cleaner, more readable and more maintainable than its COM counterpart

Open XML documents can be consumed using applications and platforms from vendors other than Microsoft
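The "start from a template" advice can be sketched in Python as copying every part of a template package and swapping placeholder text in the main document part. This is not the DocumentReflector workflow or the SDK's API. The placeholder `{{CUSTOMER}}`, the helper names, and the stand-in template (which holds only the main document part and is not a complete .docx) are assumptions for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"
ET.register_namespace("w", W_NS)  # keep the usual w: prefix when re-serialising


def make_stand_in_template():
    """Build a tiny package whose document contains one placeholder run."""
    document = (
        "<w:document xmlns:w='" + W_NS + "'><w:body>"
        "<w:p><w:r><w:t>Customer:</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>{{CUSTOMER}}</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


def fill_template(template_bytes, values):
    """Copy all parts, replacing w:t elements whose text is a known placeholder."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template_bytes)) as source:
        with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
            for name in source.namelist():
                data = source.read(name)
                if name == "word/document.xml":
                    root = ET.fromstring(data)
                    for text in root.iter(W + "t"):
                        if text.text in values:
                            text.text = values[text.text]
                    data = ET.tostring(root, encoding="utf-8")
                target.writestr(name, data)
    return output.getvalue()


template = make_stand_in_template()
filled = fill_template(template, {"{{CUSTOMER}}": "Contoso"})
with zipfile.ZipFile(io.BytesIO(filled)) as package:
    filled_root = ET.fromstring(package.read("word/document.xml"))
filled_texts = [t.text for t in filled_root.iter(W + "t")]
print(filled_texts)
```

One caveat with this approach: Word may split typed text across several runs, so a placeholder is only found if it sits whole inside a single `w:t`. Content controls avoid that problem by giving the value a tagged container.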

Page 36: epicenter2010   Open Xml

Resources (web-sites & blogs)

Open XML Format SDK 2.0 http://url.ie/tik

Microsoft’s Open XML portal http://www.openxmldeveloper.org/

If you are interested in Open XML / ODF conversion http://sourceforge.net/projects/odf-converter

http://www.twitter.com/openxml

Microsoft folks: Brian Jones http://blogs.msdn.com/brian_jones/

Doug Mahugh http://blogs.msdn.com/dmahugh/

Kevin Boske http://blogs.msdn.com/kevinboske/

Erika Ehrli http://blogs.msdn.com/erikaehrli/

Eric White http://blogs.msdn.com/ericwhite/

Page 37: epicenter2010   Open Xml


Resources (web-sites & blogs)

Word 2007 Content Control Toolkit on CodePlex http://www.codeplex.com/dbe

Matthew Scott’s Content Controls and CustomXML Channel 9 video http://url.ie/u05

Wouter van Vugt http://blogs.code-counsel.net/Wouter/default.aspx

A collection of Open XML resources: http://www.craigmurphy.com/blog/?p=871

Including these slides and C# source code

Page 38: epicenter2010   Open Xml

Resources (Books)

Open XML Explained

Wouter van Vugt

http://openxmldeveloper.org/articles/1970.aspx

Page 39: epicenter2010   Open Xml


Contact Information

Craig Murphy

http://www.twitter.com/CAMURPHY

Updated slides, notes and source code:

http://www.CraigMurphy.com

http://www.CraigMurphy.com/blog

Page 40: epicenter2010   Open Xml

Questions