Zementis © - Confidential PMML Overview

PMML - Ningapi.ning.com/.../PMML.pdf · PMML Predictive Model Markup Language Transformations processing) • PMML is an XML-based language used to define statistical and data mining

Mar 07, 2018


hoangnhan
Page 1: PMML Overview

Page 2: PMML (Predictive Model Markup Language)

PMML defines a standard not only to represent data-mining models, but also data handling and data transformations (pre- and post-processing).

Diagram labels: Models, Transformations

• PMML is an XML-based language used to define statistical and data mining models and to share these between compliant applications.

• It is a mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.

• PMML allows for the clear separation of tasks: model development vs. model deployment. As a consequence, scientists can focus on building the best model.

• PMML eliminates the need for custom model deployment and ensures scalability and reliability.
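Because PMML is plain XML, any XML parser can read a model file. The following is a minimal sketch, not taken from the slides: the document content (field names `x` and `y`, the coefficients, the header text) is invented for illustration, and the element names are assumed to follow the DMG PMML 4.0 schema. It parses a small PMML document with Python's standard library and lists its top-level sections.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal PMML document; field names and numbers are illustrative only.
PMML_DOC = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative linear model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize(pmml_text):
    """Return the PMML version, the top-level sections, and the declared data fields."""
    root = ET.fromstring(pmml_text)
    sections = [local_name(child.tag) for child in root]
    fields = [el.get("name") for el in root.iter() if local_name(el.tag) == "DataField"]
    return {"version": root.get("version"), "sections": sections, "fields": fields}


summary = summarize(PMML_DOC)
print(summary)
```

The same file could be handed unchanged to any PMML-compliant scoring engine, which is the separation of model development from model deployment that the slide describes.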

Page 3: PMML Industry Support

Matured and Supported by Industry

• Data Mining Group (http://www.dmg.org)
    • Mature standard
    • Current version 4.0 (just released)
    • Active group and constant enhancements
    • Vendor independent consortium

• Industry supporters
    • Major Players: IBM, Oracle, SAP, Microsoft
    • Analytics: KXEN, SAS, Salford, SPSS, Zementis
    • BI: Microstrategy, Teradata, Tibco
    • Open Source: KNIME, R, Rapid-I
    • Others: Equifax, FICO, Open Data Group, Visa, Pervasive

Page 4: PMML Components

• A Data Dictionary defines all the raw data fields (including missing value strategy and outlier treatment).

• Several Data Transformation strategies allow for intelligent extraction of feature detectors from raw data (“data massaging”).

• A comprehensive list of Data-Mining Models offers power and flexibility.

• Post-processing of results allows for tailored decisions.

• Model Explanation allows for performance evaluation.

Page 5: PMML Files

1) Pre-Processing
PMML elements Transformations, Mining Schema and Functions allow for effective pre-processing.

2) Models
PMML allows for several predictive modeling techniques to be fully expressed.

3) Post-Processing
Scaling of model outputs can be performed with PMML element Targets.

Page 6: PMML: Data Pre-Processing

1) Data Pre-Processing

• Data Dictionary: Allows for the explicit specification of valid, invalid and missing values.

• Mining Schema: Used to define the appropriate treatment to be applied to missing and invalid values.

• Transformations: Allow for variable discretization, normalization, and mapping with handling of missing and default values.

• Built-in Functions: Arithmetic expressions, handling of date and time as well as strings. Also used for implementing IF-THEN-ELSE logic and Boolean operations.
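The division of labor between the Data Dictionary (which values count as valid, invalid, or missing) and the Mining Schema (what to do about them) can be sketched in a few lines. This is an illustrative sketch only, not a PMML engine: the function name `prepare_value`, the example field, and the use of a Python exception to stand in for an "invalid" scoring result are all assumptions. The treatment names `returnInvalid`, `asIs`, and `asMissing` are assumed to match the `invalidValueTreatment` options of the PMML `MiningField` element.

```python
def prepare_value(raw, valid_values, missing_markers,
                  missing_replacement=None, invalid_treatment="returnInvalid"):
    """Classify a raw input value, then apply the configured treatment.

    valid_values / missing_markers play the role of the Data Dictionary;
    missing_replacement / invalid_treatment play the role of the Mining Schema.
    """
    # Step 1: missing values are swapped for the replacement (which may be None).
    if raw is None or raw in missing_markers:
        return missing_replacement
    # Step 2: valid values pass through unchanged.
    if raw in valid_values:
        return raw
    # Step 3: anything else is invalid; the treatment decides what happens.
    if invalid_treatment == "asIs":
        return raw
    if invalid_treatment == "asMissing":
        return missing_replacement
    if invalid_treatment == "returnInvalid":
        # Assumption: an exception stands in for returning an invalid result.
        raise ValueError(f"invalid input value: {raw!r}")
    raise ValueError(f"unknown treatment: {invalid_treatment!r}")


# Hypothetical categorical field used only for this illustration.
VALID = {"red", "green", "blue"}
MISSING = {"", "N/A"}

print(prepare_value("green", VALID, MISSING, "red"))                # valid value
print(prepare_value("N/A", VALID, MISSING, "red"))                  # missing value
print(prepare_value("purple", VALID, MISSING, "red", "asMissing"))  # invalid value
```

Keeping the classification and the treatment as separate inputs mirrors the slide's point: the same field definition can be reused by models that choose different treatments.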

Page 7: Data Pre-Processing: PMML Example

Arbitrary Piecewise Linear Function

This PMML code implements:
Var_b := interpolate(Var_a, ((100,0),(200,1),(800,3),(900,4)))

See http://www.dmg.org/v3-2/Transformations.html - look for element NormContinuous.

[PMML listing not preserved in this transcript]
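The piecewise linear mapping named on this slide can be sketched as follows. The XML fragment is a reconstruction under assumptions, not the slide's original listing: it uses the `NormContinuous` and `LinearNorm` elements with `orig`/`norm` attributes as described in the DMG Transformations documentation, and the `DerivedField` wrapper is a guess at how the slide declared `Var_b`. The Python function `interpolate` and its `outliers` parameter are illustrative; the option names `asIs` and `asExtremeValues` are assumed to match the PMML `outliers` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the slide's transformation; layout is an assumption.
PMML_FRAGMENT = """
<DerivedField name="Var_b" optype="continuous" dataType="double">
  <NormContinuous field="Var_a">
    <LinearNorm orig="100" norm="0"/>
    <LinearNorm orig="200" norm="1"/>
    <LinearNorm orig="800" norm="3"/>
    <LinearNorm orig="900" norm="4"/>
  </NormContinuous>
</DerivedField>
"""


def load_knots(fragment):
    """Read the (orig, norm) pairs from the LinearNorm elements, in document order."""
    root = ET.fromstring(fragment)
    return [(float(n.get("orig")), float(n.get("norm"))) for n in root.iter("LinearNorm")]


def interpolate(x, knots, outliers="asIs"):
    """Piecewise linear interpolation through knots sorted by their x coordinate.

    outliers="asIs" extends the first/last segment beyond the knot range;
    outliers="asExtremeValues" clamps to the first/last norm value instead.
    """
    if outliers == "asExtremeValues":
        if x <= knots[0][0]:
            return knots[0][1]
        if x >= knots[-1][0]:
            return knots[-1][1]
    # Pick the segment containing x; fall back to the last one for large x.
    for i in range(len(knots) - 1):
        if x <= knots[i + 1][0] or i == len(knots) - 2:
            (x0, y0), (x1, y1) = knots[i], knots[i + 1]
            return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


KNOTS = load_knots(PMML_FRAGMENT)
for var_a in (150, 500, 850, 1000):
    print(var_a, interpolate(var_a, KNOTS))
```

For example, Var_a = 500 lies between the knots (200,1) and (800,3), halfway along the segment, so Var_b is 2.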

Page 8: Modeling Elements

2) Easy Expression of Predictive Models

• PMML allows for several predictive modeling techniques to be expressed directly. Supported techniques which have their own elements are:
    • Regression and General Regression
    • Neural Networks
    • Support Vector Machines
    • Decision Trees
    • Naïve Bayes
    • Clustering
    • Sequences
    • Rule Sets
    • Association Rules
    • Time-Series (as of PMML 4.0)
    • Text Models

• Support for Multiple Models
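To illustrate what "expressed directly" means for one of these techniques, here is a minimal sketch that scores a regression model straight from its XML. The fragment is hypothetical: the field names, coefficients, and record are invented, and the element and attribute names (`RegressionTable`, `NumericPredictor`, `intercept`, `coefficient`, `exponent`) are assumed to follow the DMG regression schema. A real PMML engine handles far more (categorical predictors, normalization, namespaces); this only shows the basic idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical regression model fragment; names and numbers are illustrative only.
REGRESSION_FRAGMENT = """
<RegressionModel functionName="regression">
  <MiningSchema>
    <MiningField name="age"/>
    <MiningField name="income"/>
    <MiningField name="score" usageType="predicted"/>
  </MiningSchema>
  <RegressionTable intercept="10.0">
    <NumericPredictor name="age" exponent="1" coefficient="0.5"/>
    <NumericPredictor name="income" exponent="2" coefficient="0.25"/>
  </RegressionTable>
</RegressionModel>
"""


def score_regression(fragment, record):
    """Compute intercept + sum(coefficient * value ** exponent) over the predictors."""
    table = ET.fromstring(fragment).find("RegressionTable")
    result = float(table.get("intercept"))
    for term in table.findall("NumericPredictor"):
        exponent = int(term.get("exponent", "1"))  # exponent defaults to 1 if omitted
        value = record[term.get("name")]
        result += float(term.get("coefficient")) * value ** exponent
    return result


# 10.0 + 0.5 * 40 + 0.25 * 2**2 = 31.0
print(score_regression(REGRESSION_FRAGMENT, {"age": 40.0, "income": 2.0}))
```

Everything needed to score is in the document itself, so the tool that built the model and the tool that executes it need not share any code.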

Page 9: Modeling Elements: PMML Example for Neural Network

[PMML listing not preserved in this transcript]

Page 10: Data Post-Processing: PMML Example

3) Post-Processing

The PMML code on this slide implements score post-processing. It uses the PMML element Targets for checking boundaries (min and max) and to rescale (rescaleConstant and rescaleFactor) the original score generated by the model.

See http://www.dmg.org/v3-2/Targets.html

[PMML listing not preserved in this transcript]
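The boundary check and rescaling described here amount to a clamp followed by a linear map. The sketch below is illustrative only: the function name `apply_target` and the example numbers are hypothetical, and the order of operations (apply min and max first, then multiply by rescaleFactor, then add rescaleConstant) is assumed from the DMG Targets documentation and should be checked against it.

```python
def apply_target(score, minimum=None, maximum=None,
                 rescale_factor=1.0, rescale_constant=0.0):
    """Clamp a raw model score to [minimum, maximum], then rescale it linearly.

    Assumed order: min/max first, then score * rescale_factor + rescale_constant.
    """
    if minimum is not None:
        score = max(score, minimum)
    if maximum is not None:
        score = min(score, maximum)
    return score * rescale_factor + rescale_constant


# Hypothetical setup: map a raw score in [0, 1] onto a 300-400 point scale.
for raw in (-0.2, 0.5, 1.3):
    print(raw, apply_target(raw, minimum=0.0, maximum=1.0,
                            rescale_factor=100.0, rescale_constant=300.0))
```

With these settings a raw score of 1.3 is first capped at 1.0 and then mapped to 400, so out-of-range model outputs can never leave the published score range.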

Page 11: One Standard, One Process

Diagram labels: Applications, Service Providers, External Vendors, Divisions

Page 12: PMML = Easy Model Deployment

Diagram labels: Model Building, PMML, Model Deployment

Page 13: PMML - Zementis Contributions

• ADAPA: A decision engine that deploys models expressed in PMML and executes them in real-time. Now available as a service on the Amazon Cloud.

• PMML Converter: Validates, converts, and corrects old and new PMML code. Available at the DMG website and at http://www.zementis.com/pmml.htm.

• Contributing Member of the DMG: Submitted several proposals for PMML 4.0 and already working with other members on PMML 4.1.

• Code contributor for the R PMML package (available on CRAN).

• PMML Articles: R Journal and SIGKDD Explorations Newsletter. Available for downloading at http://www.zementis.com/manual.htm

• PMML Blogs: Several blogs on PMML topics (http://adapasupport.zementis.com and http://www.predictive-analytics.info).

Page 14: Thank You!

E-mail: info@zementis.com

U.S.A Headquarters
6125 Cornerstone Court East
Suite 250
San Diego, CA, 92121
Tel: +1 619 330-0780
Fax: +1 858 535-0227

Asia Office
19/F., Unit A
Ho Lee Commercial Building
38-44 D’Aguilar Street
Central, Hong Kong (S.A.R.)
Tel: +852 2868-0878
Fax: +852 2845-6027