Francesco Ganora DataWeave A functional data transformation language from MuleSoft

MuleSoft DataWeave data transformation language

Jan 20, 2017


fganora
Transcript
Page 1: MuleSoft DataWeave data transformation language

Francesco Ganora

DataWeave A functional data transformation language from MuleSoft

Page 2: MuleSoft DataWeave data transformation language

The data mapping challenge

Source formats: JSON, XML, CSV, Fixed Width, POJO

Target formats: JSON, XML, CSV, Fixed Width, POJO

Mapping tasks: structural transformation, value transformation, conditional mapping, filtering, grouping

Best practice: always define the mapping in terms of the desired target data structure

Page 3: MuleSoft DataWeave data transformation language

The old programmatic approach

❖ Map the target message from the source message programmatically (e.g., via a script or Java method)

❖ Sequence of procedural steps that incrementally build the target message from the source message

❖ Typical example: loop on elements of a source sequence and for each element instantiate a target sub-structure, then attach it to the overall target structure

❖ This approach is neither concise nor expressive; if implemented incorrectly, it is also inefficient
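As a rough illustration of the procedural style described above, the following Python sketch loops over a source sequence and builds the target structure step by step. The field names (`id`, `orderItems`, `sku`, `qty`) and the function name `map_order` are hypothetical and chosen only for this example.

```python
def map_order(source):
    """Build the target message procedurally, one step at a time."""
    target = {"orderId": source["id"], "lines": []}
    # Loop on the source sequence; for each element instantiate a
    # target sub-structure and attach it to the overall target.
    for item in source["orderItems"]:
        line = {}
        line["product"] = item["sku"]
        line["quantity"] = int(item["qty"])
        target["lines"].append(line)
    return target


# Hypothetical sample input.
source = {
    "id": "DO1234",
    "orderItems": [
        {"sku": "1233244", "qty": "20"},
        {"sku": "1233255", "qty": "50"},
    ],
}
result = map_order(source)
```

The shape of the target only emerges from reading the whole loop, which is the lack of expressiveness the slide refers to.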

Page 4: MuleSoft DataWeave data transformation language

The templating approach

❖ Template engines can be used as data mapping engines:

❖ We define the target structure (template)

❖ We define how each part of the template is generated dynamically from source data

❖ The template consists of a semi-literal expression with placeholders, e.g. $() in this example

❖ More constructs are necessary to instantiate repetitive structures (looping), for conditional mapping, etc.

JSON

{"user":
  {"id": "$(sourceData.userID)",
   "firstName": "$(sourceData.givenName)",
   "lastName": "$(sourceData.lastName)",
   "contacts": {
     "phone": "$(sourceData.phoneNumber)",
     "email": "$(sourceData.emailAddress)"
   }}}

XML

<?xml version="1.0"?>
<user>
  <id> $(sourceData.userID) </id>
  <firstName> $(sourceData.givenName) </firstName>
  <lastName> $(sourceData.lastName) </lastName>
  <contacts>
    <phone> $(sourceData.phoneNumber) </phone>
    <email> $(sourceData.emailAddress) </email>
  </contacts>
</user>
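To make the placeholder idea concrete, here is a minimal Python sketch using the standard library's `string.Template`. Its `${...}` syntax stands in for the `$()` placeholders of the slide, and the sample values are hypothetical.

```python
import json
from string import Template

# Semi-literal target structure; ${...} marks the placeholders.
# (string.Template is a stand-in for the $() syntax on the slide.)
USER_TEMPLATE = Template(
    '{"user": {"id": "${userID}", "firstName": "${givenName}", '
    '"contacts": {"email": "${emailAddress}"}}}'
)

# Hypothetical source data.
source_data = {
    "userID": "42",
    "givenName": "Ada",
    "emailAddress": "ada@example.com",
}

rendered = USER_TEMPLATE.substitute(source_data)
parsed = json.loads(rendered)  # the rendered text is valid JSON here
```

The template is tied to JSON syntax, and the substitution performs no escaping, so a source value containing a double quote would produce invalid output.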

Page 5: MuleSoft DataWeave data transformation language

Issues with standard templating

❖ Template depends on the concrete syntax of the target message (separate templates for XML, JSON etc.)

❖ Placeholder syntax depends on the type of source message (e.g., XPath for XML, JSONPath for JSON, non-standard syntax for other media types)

❖ Placeholder syntax may clash with target message syntax (cannot use for example <> as placeholder markers with XML)

❖ Looping constructs of traditional template engines mix engine syntax with generated content (“PHP-like”)

❖ XSLT is a very powerful templating and transformation language, but it has drawbacks (verbose XML syntax, inability to process source messages that are not tree-structured and cannot be rendered as XML, etc.)

Page 6: MuleSoft DataWeave data transformation language

DataWeave (DW)

❖ Data mapping and transformation tool from MuleSoft

❖ Tightly integrated with Anypoint Studio IDE

❖ Non-procedural expression language

❖ Applies functional programming constructs (lambdas)

❖ Uses an internal canonical data format (application/dw)

Page 7: MuleSoft DataWeave data transformation language

Canonical data representation

1. DW parses the source message into application/dw canonical format using supplied metadata / DataSense capability

2. A DW expression is used to transform the source message (result still in canonical application/dw format)

3. DW renders the canonical target message into the target MIME type specified as a “header” to the DW expression (e.g. %output application/json)

This decouples the transformation from the concrete syntax of source and target messages!

Diagram: source message (<source MIME type>) -> parser -> source message (canonical, application/dw) -> DW expression -> target message (canonical, application/dw) -> renderer -> target message (<target MIME type>)
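The parse / transform / render separation can be sketched in Python as three independent functions. This is only an analogy: plain lists and dicts play the role of the canonical format, and the function names and sample data are hypothetical.

```python
import csv
import io
import json


def parse_csv(text):
    """Parser: source syntax -> canonical structure (list of dicts)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transformation: canonical -> canonical, no concrete syntax involved."""
    return [{"id": r["id"], "name": r["name"].upper()} for r in records]


def render_json(data):
    """Renderer: canonical structure -> target syntax."""
    return json.dumps(data)


source = "id,name\n1,alice\n2,bob\n"
target = render_json(transform(parse_csv(source)))
```

Because `transform` only sees the canonical structure, swapping the parser or the renderer for another format leaves it unchanged.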

Page 8: MuleSoft DataWeave data transformation language

The DW canonical format

❖ Only 3 kinds of data in DW:

• Simple (String, Number, Boolean, Date types)

• Array

• Objects (key:value pairs)

❖ The canonical application/dw format is shown in a JSON-like concrete syntax in Anypoint Studio

❖ Parsing and rendering between application/json and application/dw is straightforward

[
  {
    "order_nr": "DO1234",
    "order_date": "2016-03-12T13:30:23+8.00",
    sku: "1233244",
    "sku_description": "Product A",
    qty: "20"
  },
  {
    "order_nr": "DO1234",
    "order_date": "2016-03-12T13:30:23+8.00",
    sku: "1233255",
    "sku_description": "Product B",
    qty: "50"
  }
]

Page 9: MuleSoft DataWeave data transformation language

XML Parsing

❖ repeated XML elements -> repeated object keys

❖ XML attributes -> special @() object
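The two rules above can be imitated in Python with `xml.etree.ElementTree`. In this sketch an object is modelled as a list of (key, value) pairs, since a Python dict cannot hold repeated keys, and attributes are collected under an "@" entry. This layout is an assumption made for illustration, not DataWeave's actual internal representation.

```python
import xml.etree.ElementTree as ET


def to_pairs(element):
    """Convert an element's content into a list of (key, value) pairs.

    Repeated child elements become repeated keys; attributes are
    collected under a separate "@" entry. The text of elements that
    have attributes or children is ignored in this simplified sketch.
    """
    pairs = []
    if element.attrib:
        pairs.append(("@", dict(element.attrib)))
    for child in element:
        if len(child) == 0 and not child.attrib:
            value = child.text  # leaf element: keep its text
        else:
            value = to_pairs(child)  # nested element: recurse
        pairs.append((child.tag, value))
    return pairs


# Hypothetical sample document.
doc = ET.fromstring(
    '<order id="DO1234"><item>Product A</item><item>Product B</item></order>'
)
pairs = to_pairs(doc)
```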

Page 10: MuleSoft DataWeave data transformation language

CSV parsing

❖ Array of records (lines)

❖ Record (line) -> array element of type Object

❖ Field in record: object field (key is taken from CSV header line or configured metadata)

❖ Reader configuration to set field separator, etc.
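A comparable result can be obtained in Python with `csv.DictReader`, shown here as an analogy rather than as DataWeave's own reader. The `delimiter` argument plays the role of the field-separator setting, and the sample lines are hypothetical.

```python
import csv
import io

csv_text = (
    "OrderId;City;Quantity\n"
    "000001;London;120\n"
    "000002;Paris;60\n"
)

# Keys are taken from the header line; each record becomes one
# object (dict) in the resulting array (list).
reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
records = [dict(row) for row in reader]
```

All field values arrive as strings, so numeric fields such as `Quantity` still need an explicit conversion.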

Page 11: MuleSoft DataWeave data transformation language

DW transform structure

%dw 1.0
%input payload application/csv
%output application/json
%type sapDate = :string { format: "YYYYMMDD" }
%var unitOfMeasure = 'EA'
%var doubleNumber = (nr) -> [nr * 2.0]
%namespace xsi http://www.w3.org/2001/XMLSchema-instance
%function fname(name) {firstName: upper name}
---
order: {
  ID: payload.orderID ++ " dated " ++ payload.orderDate,
  nrLines: (sizeOf payload.orderItems) + 1,
  totalOrderAmount: payload.*orderItems reduce
    $$ + (($.orderQuantity as :number) * ($.unitPrice as :number))
}

Optional header contains:

• transformation directives

• reusable declarations

Body contains the DW transformation expression
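For readers more familiar with general-purpose languages, the body expression reads roughly like the following Python sketch. The sample payload is hypothetical, and supplying an explicit initial value of 0.0 to `reduce` is a choice made for this sketch.

```python
from functools import reduce

# Hypothetical payload with the fields used by the body expression.
payload = {
    "orderID": "DO1234",
    "orderDate": "2016-03-12",
    "orderItems": [
        {"orderQuantity": "20", "unitPrice": "1.5"},
        {"orderQuantity": "50", "unitPrice": "2"},
    ],
}

order = {
    # String concatenation, like the ++ operator.
    "ID": payload["orderID"] + " dated " + payload["orderDate"],
    # Mirrors (sizeOf payload.orderItems) + 1 from the slide.
    "nrLines": len(payload["orderItems"]) + 1,
    # Accumulator plus quantity * price for each item; the explicit
    # initial value 0.0 is an assumption of this sketch.
    "totalOrderAmount": reduce(
        lambda acc, item: acc
        + float(item["orderQuantity"]) * float(item["unitPrice"]),
        payload["orderItems"],
        0.0,
    ),
}
```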

Page 12: MuleSoft DataWeave data transformation language

Case study: introduction

Transforming a list of order items into a corresponding list of delivery routes.

The source payload is an unsorted list of items in CSV format:

OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductA;100
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductD;15
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductD;30
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
000004;2016-09-15;Customer4;2016-09-20;London;ProductE;30
000005;2016-09-16;Customer4;2016-09-20;London;ProductB;20
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductD;7
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductE;30
000007;2016-09-16;Customer5;2016-09-22;Berlin;ProductB;12

The target structure (described in the following slide) is a multi-level JSON structure.

This case study focuses on the structural transformation capabilities of DW, but DW offers a wide range of value and formatting capabilities, conditional mapping, and much more!

Page 13: MuleSoft DataWeave data transformation language

Case study: target format

[
  {
    city: "<City>",
    deliveryDate: "<DeliveryDate>",
    stops: [
      {
        customer: "<CustomerId>",
        orderitems: [
          {
            ordernr: "<OrderId>",
            orderdate: "<OrderDate>",
            product: "<ProductId>",
            qty: "<Quantity>"
          }
        ]
      }
    ]
  }
]

JSON document with a sequence of delivery routes by delivery date and city:

❖ Sort CSV order lines by city and delivery date

❖ Within each delivery date and city, group order lines by customer

❖ Render the structure as JSON

Nesting levels of the target structure: by city / delivery date, then by customer, then by order item

Page 14: MuleSoft DataWeave data transformation language

Case study: step 1

Source message parsed as application/dw:

The DW expression payload evaluates to the entire message payload (see the earlier slide "CSV parsing")

NOTE: the DW transformer Preview functionality in MuleSoft Anypoint Studio maps the sample source in real time as you type the transformation!

Page 15: MuleSoft DataWeave data transformation language

Case study: step 2

Sorting and grouping by combination of city and delivery date:

A composite key, built with the string concatenation operator (++), is used for sorting and grouping.

The groupBy operator creates an object with the group values as keys.
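The effect of sorting and grouping on a composite key can be mimicked in Python as follows. This is an analogy to the DW operators, not DW code; the "|" separator inside the key and the helper name `route_key` are assumptions of the sketch.

```python
# A few hypothetical rows in the shape of the case-study data.
rows = [
    {"City": "Paris", "DeliveryDate": "2016-09-20", "CustomerId": "Customer2"},
    {"City": "London", "DeliveryDate": "2016-09-20", "CustomerId": "Customer1"},
    {"City": "Paris", "DeliveryDate": "2016-09-22", "CustomerId": "Customer2"},
    {"City": "London", "DeliveryDate": "2016-09-20", "CustomerId": "Customer4"},
]


def route_key(row):
    # Composite key built by string concatenation; the "|" separator
    # is an assumption of this sketch.
    return row["City"] + "|" + row["DeliveryDate"]


# Sort by the composite key, then collect rows into an object (dict)
# whose keys are the group values.
groups = {}
for row in sorted(rows, key=route_key):
    groups.setdefault(route_key(row), []).append(row)
```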

Page 16: MuleSoft DataWeave data transformation language

Case study: step 3

Iterating over the group values (city/delivery date combination) to generate the 1st level of the target structure:

The pluck operator maps an object into an array. $$ is the key in the current iteration, $ is the value.

City and delivery date are mapped from the composite key by String manipulation.
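A Python analogue of this step is sketched below. The helper `pluck` is hypothetical and only imitates the behaviour described above (value first, key second); the "|" separator in the composite key and the `lineCount` field are assumptions added for illustration.

```python
def pluck(obj, fn):
    """Map each entry of an object (dict) to one array element.

    fn receives the value first and the key second, like $ and $$.
    """
    return [fn(value, key) for key, value in obj.items()]


# Hypothetical grouped input, keyed by "City|DeliveryDate".
groups = {
    "London|2016-09-20": [
        {"CustomerId": "Customer1"},
        {"CustomerId": "Customer4"},
    ],
    "Paris|2016-09-22": [{"CustomerId": "Customer2"}],
}

routes = pluck(
    groups,
    lambda value, key: {
        # City and delivery date recovered from the composite key.
        "city": key.split("|")[0],
        "deliveryDate": key.split("|")[1],
        "lineCount": len(value),  # illustrative field only
    },
)
```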

Page 17: MuleSoft DataWeave data transformation language

Case study: step 4

Within each route group, group by customer and generate 2nd (inner) level of target structure:

In the inner pluck the context for $ and $$ changes (e.g., $$ is now the CustomerID key).
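The inner grouping can be pictured with this Python sketch, in which the comprehension variables `customer` and `lines` play the roles of the inner $$ and $. The helper `group_by` and the sample rows are hypothetical.

```python
def group_by(items, key_fn):
    """Group items into a dict of lists, keeping first-seen key order."""
    groups = {}
    for item in items:
        groups.setdefault(key_fn(item), []).append(item)
    return groups


# Hypothetical order lines belonging to one route group.
route_lines = [
    {"CustomerId": "Customer1", "ProductId": "ProductA"},
    {"CustomerId": "Customer4", "ProductId": "ProductC"},
    {"CustomerId": "Customer1", "ProductId": "ProductB"},
]

# One stop per customer; the key is now the customer ID.
stops = [
    {"customer": customer, "orderitems": lines}
    for customer, lines in group_by(
        route_lines, lambda r: r["CustomerId"]
    ).items()
]
```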

Page 18: MuleSoft DataWeave data transformation language

Case study: (final) step 5

Within each customer group, generate the 3rd (innermost) level of the target structure via the map operator:

Also get the JSON rendering by changing the %output directive.
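For comparison, the whole case study can be approximated in Python as below. This is a sketch of the same sort / group / map pipeline, not the DW expression from the slides: it groups on a (city, date) tuple instead of a concatenated string, and the helper names and the reduced sample input are assumptions.

```python
import csv
import io
import json

# A few rows in the shape of the case-study data, deliberately unsorted.
CSV_TEXT = """OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
"""


def group_by(items, key_fn):
    """Group items into a dict of lists, keeping first-seen key order."""
    groups = {}
    for item in items:
        groups.setdefault(key_fn(item), []).append(item)
    return groups


def route_of(row):
    """Grouping key for a route: (city, delivery date)."""
    return (row["City"], row["DeliveryDate"])


def to_routes(csv_text):
    rows = list(csv.DictReader(io.StringIO(csv_text), delimiter=";"))
    rows.sort(key=route_of)  # sort by city, then delivery date
    routes = []
    for (city, date), lines in group_by(rows, route_of).items():
        stops = []
        for customer, cust_lines in group_by(
            lines, lambda r: r["CustomerId"]
        ).items():
            # Innermost level: one entry per order line.
            items = [
                {
                    "ordernr": line["OrderId"],
                    "orderdate": line["OrderDate"],
                    "product": line["ProductId"],
                    "qty": line["Quantity"],
                }
                for line in cust_lines
            ]
            stops.append({"customer": customer, "orderitems": items})
        routes.append({"city": city, "deliveryDate": date, "stops": stops})
    return routes


routes = to_routes(CSV_TEXT)
routes_json = json.dumps(routes, indent=2)  # JSON rendering
```

The Python version needs explicit loops and a helper function for what the DW steps express as a single nested expression.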

Page 19: MuleSoft DataWeave data transformation language

Thanks!

This is just a “taste” of the innovative DataWeave transformation language.

Find out more at:

https://docs.mulesoft.com/mule-user-guide/v/3.8/dataweave