
Spectrum Miner™

Version 8.0

Data-Build Command and TML Reference Guide

Copyright © 2017 Pitney Bowes Software Inc. All rights reserved. MapInfo and Group 1 Software are trademarks of Pitney Bowes Software Inc. All other marks and trademarks are property of their respective holders.

USPS® Notices

Pitney Bowes Inc. holds a non-exclusive license to publish and sell ZIP + 4® databases on optical and magnetic media. The following trademarks are owned by the United States Postal Service: CASS, CASS Certified, DPV, eLOT, FASTforward, First-Class Mail, Intelligent Mail, LACSLink, NCOALink, PAVE, PLANET Code, Postal Service, POSTNET, Post Office, RDI, SuiteLink, United States Postal Service, Standard Mail, United States Post Office, USPS, ZIP Code, and ZIP + 4. This list is not exhaustive of the trademarks belonging to the Postal Service.

Pitney Bowes Inc. is a non-exclusive licensee of USPS® for NCOALink® processing.

Prices for Pitney Bowes Software's products, options, and services are not established, controlled, or approved by USPS® or United States Government. When utilizing RDI™ data to determine parcel-shipping costs, the business decision on which parcel delivery company to use is not made by the USPS® or United States Government.

Data Provider and Related Notices

Data Products contained on this media and used within Pitney Bowes Software applications are protected by various trademarks and by one or more of the following copyrights:

© Copyright United States Postal Service. All rights reserved.

© 2014 TomTom. All rights reserved. TomTom and the TomTom logo are registered trademarks of TomTom N.V.

© 2016 HERE

Fuente: INEGI (Instituto Nacional de Estadística y Geografía)

Based upon electronic data © National Land Survey Sweden.

© Copyright United States Census Bureau

© Copyright Nova Marketing Group, Inc.

The Geocode Address World data set contains data licensed from the GeoNames Project (www.geonames.org) provided under the Creative Commons Attribution License ("Attribution License") located at http://creativecommons.org/licenses/by/3.0/legalcode. Your use of the GeoNames data is governed by the terms of the Attribution License, and any conflict between your agreement with Pitney Bowes Software, Inc. and the Attribution License will be resolved in favor of the Attribution License solely as it relates to your use of the GeoNames data.

ICU Notices

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, provided that the above copyright notice(s) and this permission notice appear in all copies of the Software and that both the above copyright notice(s) and this permission notice appear in supporting documentation.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization of the copyright holder.


Table of Contents

1 - Introduction
Spectrum Miner Overview 10
Who should read this book 10

2 - Spectrum Miner data-build commands
About Spectrum Miner data-build commands 12
Standard command-line options 18
The -force command-line option 20
The -macro command-line option 21

3 - Managing a Spectrum Miner data build
qsbuild 24

4 - Commands for importing and exporting data
qsdbaccess 28
qsimportdb 29
qsdbcreatetable 31
qsdbinsert 33
qsdbupdate 35
qsgenfdd 37
qsimportflat 39
qsexportflat 41
qsimportstat 44
qsexportstat 46
qsimportfocus 48
How Spectrum Miner imports database types 50
Date formats 52

5 - Commands for processing foci
qssort 59
qsderive 60
qsmeasure 63
qstrack 67
qsselect 70
qsrenamefields 73
qsexportmetadata 76
qsimportmetadata 76
qsupdate 81

6 - Commands for combining foci
About combining foci 83
qsjoin 83
qsmerge 87

7 - Commands for managing foci
qscopy 91
qslink 92
qsmove 93
qsremove 93
qsarchive 95
qsunzip 95

8 - Commands for producing reports
qssettings 98
qsaudit 98
qsdescribe 102
qsdescribestat 104
qshtmlunpack 105
qsdtsnapshot, qsscsnapshot 105
qsxt 108
qsinfo 110
qsdescribelicense 111

9 - Commands for building models
About the Scorecard Wizard 114
qsscorecardwizard 115
qsdecisiontree 120
qsscorecard 122
About the Association Rule Wizard 123
qsruleminer 125

10 - Commands for working with QMML files
qsqmmlview 129
qsqmmledit 129
qslt 132
qsqmml2sas 134
qsqmml2sql 134
qsqsfmtosas 135

11 - Other commands
qsmapgen 137

12 - Transaction Measurement Language
About Transaction Measurement Language 142
TML syntax 142
Reserved words in TML 143

13 - TML statements
Field definition: the create statement 146
Using aggregation functions and the where and default clauses 148
Splitting aggregations: the by clause 150
Evaluating focus statistics: the calculate statement 153

14 - Aggregation functions
Aggregation functions for measurements and derivations 156
any 159
confintlower 160
confintupper 160
count 161
countnonnull/countnonnulls 162
countnull/countnulls 163
countunique 164
countuniquenonnull 164
first 165
last 166
max (one argument) 167
mean (one argument) 168
median 168
min (one argument) 169
mode 170
moderatio 171
percentage 172
percentagerate 172
segindex 173
significance 174
stdev 175
sum (one argument) 175
variance 176

15 - Field Derivation Language
About Field Derivation Language 179

16 - FDL syntax
Datatypes 181
Expressions 183
Conditional expressions 185
Variables 187
User-defined functions 190
Arithmetic operators 193
Relational operators 194
Logical operators 196
Operator precedence 197
Built-in functions 198
Reserved words in FDL 206

17 - Conditional functions
clamp 209
cond 210
iff 211
ifnull, nvl 212
isnull 213
isselected 214
replace 214

18 - Datatype-conversion functions
todate 217
tointeger 218
toreal 219
tostring 220

19 - Functions for working with dates
addcenturies, addcenturiescountbackwards 223
adddays 224
addhours 225
addminutes 226
addmonths, addmonthscountbackwards 227
addseconds 229
addweeks 230
addyears, addyearscountbackwards 231
countcenturies 232
countdays 233
counthours 234
countminutes 235
countseconds 236
countweeks 237
countwholecenturies, countwholecenturiesbackwards 238
countwholedays 240
countwholehours 241
countwholeminutes 242
countwholemonths, countwholemonthsbackwards 243
countwholeseconds 245
countwholeweeks 246
countwholeyears, countwholeyearsbackwards 247
countyears 248
day 249
dayofweek 250
gmt2edt 250
hour 251
minute 252
month 253
now 253
second 254
today 255
weekofyear 255
year 256

20 - Functions for working with strings
concat 259
endswith 260
find 261
left 262
mid 263
right 264
soundex 265
startswith 266
strlen 267
strmember 268
substitute 269
substr 270
tolower 272
toupper 273
trim 273

21 - Regular expressions and associated functions
Regular expressions 276
Basic components of a regular expression 276
Regular-expression operators 279
match 280
replaceall 282
replacefirst 283

22 - Mathematical and statistical functions
abs 286
ceil 286
cos 287
exp 288
floor 288
log 289
log10 290
logbase 290
max (two or more arguments), maxnonnull 291
mean (two or more arguments), meannonnull 292
min (two or more arguments), minnonnull 293
normalize 294
pow 294
product, productnonnull 295
round 296
sgn 297
sin 297
sqrt 298
sum (two or more arguments), sumnonnull 299
tan 300

23 - Data-sampling functions
numericTestTrainSplit 302
numericTestTrainValidateSplit 302
sampleEqualSize 303
sampleExactNumber 304
sampleExactPercentage 305
sampleStratified 306
testTrainSplit 307
testTrainValidateSplit 308

24 - Random-number functions
About random-number functions in FDL 311
rndBinomial 311
rndBool 312
rndExp 312
rndGamma 313
rndNormal 314
rndPoisson 315
rndUniform 315

25 - Return-on-investment functions
ActionROI 318
ActionROIAnnualized 319
OfferROI 320
OfferROIAnnualized 322
RetentionActionROI 323
RetentionActionROIAnnualized 324
RetentionOfferROI 326
RetentionOfferROIAnnualized 327

26 - Miscellaneous functions
dblookup 330
member 332
rankOrder, rankOrderApprox 333
rankOrderMean, rankOrderApproxMean 335
rownum 336

27 - Binnings
bin 338
Boolean 339
DayFrom, WeekFrom, MonthFrom, YearFrom 340
DayMultipleFrom, WeekMultipleFrom, MonthMultipleFrom, YearMultipleFrom 341
DayMultipleNumBins, WeekMultipleNumBins, MonthMultipleNumBins, YearMultipleNumBins 342
DayMultiplePrePost, WeekMultiplePrePost, MonthMultiplePrePost, YearMultiplePrePost 343
DayMultipleTo, WeekMultipleTo, MonthMultipleTo, YearMultipleTo 344
DayMultipleWidth, WeekMultipleWidth, MonthMultipleWidth, YearMultipleWidth 345
DayPrePost, WeekPrePost, MonthPrePost, YearPrePost 346
DayRange, WeekRange, MonthRange, YearRange 347
DayTo, WeekTo, MonthTo, YearTo 348
EqualRange 349
EqualRangeWidth 350
NegativeNonNegative 351
PreDuringPost 352
PrePost 353
Sign 354

28 - XML in Spectrum Miner
XML in Spectrum Miner 356
Metadata specification for qsimportmetadata 357
Aggregation specification for qsmeasure 366
Derivation specification for qsderive 368
Selection specification for qsselect 369
Crosstab specification for qsxt 371
Field name mapping specification for qsrenamefields 374
Decision-tree build specification for qsdecisiontree 376
Scorecard build specification for qsscorecard 379
Binning specifications 381
Attribute values 390

1 - Introduction

In this section

Spectrum Miner Overview 10
Who should read this book 10

Spectrum Miner Overview

Spectrum Miner is a powerful predictive analytics solution that enables customer insight professionals and business users alike to achieve a clear picture of their customers for the purpose of greater customer understanding, uncovering areas of opportunity, achieving optimal segmentation and predicting future behavior.

Bridging the gap between standard Business Intelligence tools with a limited scope for exploring data, and number-crunching solutions which require statistical programmers to build queries and produce models, Spectrum Customer Analytics is a next-generation solution designed for unparalleled ease of use and fast, actionable insight.

The solution utilizes powerful 3D data visualization and rapid modeling automation to uncover important data relationships and deliver propensity scores at the push of a button, boosting predictive model accuracy and increasing the speed of analytic results.

Spectrum Miner can be used to predict profit-impacting behaviors and propensities, including customer churn, cross sell and up sell opportunities, campaign planning and segmentation, customer satisfaction and loyalty, and customer lifetime value.

Who should read this book

This book is intended for Spectrum Miner users who are comfortable with the command line and want to create or modify data builds. Some of it (in particular, the chapter describing the commands that are available for managing foci) may also be of interest to Spectrum Miner administrators.

This document is designed for reference and is not a tutorial.

2 - Spectrum Miner data-build commands

In this section

About Spectrum Miner data-build commands 12
Standard command-line options 18
The -force command-line option 20
The -macro command-line option 21

About Spectrum Miner data-build commands

Spectrum Miner data-build commands provide key Spectrum Miner data-build, data-management, and reporting functionality from the command line. While you can enter commands individually to perform one-off tasks, the real advantage of using data-build commands is that you can combine them in the likes of UNIX shell scripts and Windows batch files to perform complex data builds.

To manage data builds that contain multiple interdependent steps, you can also use Spectrum Miner's Data Build Manager tool or its command-line equivalent, qsbuild. Using a build plan to express the sequence of steps that is required saves time and minimizes human error. If you run a regular data-build or deployment process unattended, perhaps overnight, and something goes wrong (such as running out of disk space), you probably want to fix the problem and carry on from where you left off, rather than starting all over again. Likewise, if the operations needed for successive builds overlap to some extent, you may wish to save on processing time by performing such operations once only. Using the Data Build Manager or qsbuild makes it easy to keep track of dependencies between operations, and remember which operations have already been performed.

Each data-build command requires certain command-line arguments and may accept command-specific optional arguments as well as standard options [see Standard command-line options on page 18]. In the section describing each command is a synopsis of the command, which uses some special syntactic notation:

• Items in angle brackets are non-literal. For example, where you see <source focus>, you should enter the filename of an existing focus.

• Items in square brackets are optional. For example, where you see [-overwrite], you can enter -overwrite or nothing at all.

• Items separated by vertical bars are alternatives: you should choose just one of them. For example, where you see XML | HTML | Full, you can enter XML, HTML, or Full.

• An item followed by an ellipsis can be repeated as required. For example, where you see <field> [, <field> ...], you can enter Income (ignoring the optional item), or Income,Gender, or Income,Gender,Age, and so on.

• Braces {...} are sometimes used to delineate non-optional items. (What counts as an item for the purposes of alternation and repetition extends to the nearest vertical bar, square bracket, or brace.) For example, where you see -input <source focus> {-input <source focus> ...}, you can enter -input one.ftr -input two.ftr or -input one.ftr -input two.ftr -input three.ftr but not simply -input one.ftr.

There are data-build commands for importing and exporting data, for processing, joining, and managing data in foci, and for reporting on foci and models.

Commands for importing and exporting data

Command           Purpose
qsdbaccess        Create, test, or delete a User Database Connection, for access to a database.
qsimportdb        Import data from a database table or from the result of a database query.
qsdbcreatetable   Create an empty table in a database based on the fields in a focus.
qsdbinsert        Insert records from a focus into a database table.
qsdbupdate        Update existing records in a database table using data from a focus.
qsgenfdd          Create a flat-data description file for data in a text file.
qsimportflat      Import data from a flat file.
qsexportflat      Export data to a flat file.
qsimportstat      Import data from an Excel dataset.
qsexportstat      Export data to an Excel file.
qsimportfocus     Import data (without metadata) from an existing focus.

Commands for processing foci

Command            Purpose
qssort             Sort the records in a focus.
qsderive           Derive fields in a focus.
qsmeasure          Aggregate records in a focus to produce a new focus.
qstrack            Derive fields in a focus using state-tracking derivations.
qsselect           Add a record selection to a focus.
qsrenamefields     Rename fields in a focus.
qsexportmetadata   Export metadata from a focus.
qsimportmetadata   Import metadata from a file to a focus.
qsupdate           Copy metadata from one focus to another.

Commands for combining foci

Command   Purpose
qsjoin    Join fields from one or more secondary foci to a primary focus, matching records using key fields.
qsmerge   Merge records from similar foci to create a new focus, keeping records sorted by key fields.

Commands for managing foci

Command     Purpose
qscopy      Copy a focus to create a new, independent focus.
qslink      Copy a focus, sharing the underlying data.
qsmove      Move or rename a focus.
qsremove    Delete foci.
qsarchive   Create an archive file from foci or folders.
qsunzip     Extract from an archive file.

Commands for producing reports

Command             Purpose
qssettings          Set common formatting options for various reports.
qsaudit             Produce a Profile and Audit of a focus.
qsdescribe          Display summary information about a focus.
qsdescribestat      Display summary information about an Excel dataset.
qshtmlunpack        Unpack an HTML or XML archive produced by qsaudit, qsdtsnapshot, or qsscsnapshot.
qsdtsnapshot        Produce a Model Snapshot of a decision tree.
qsscsnapshot        Produce a Model Snapshot of a scorecard.
qsxt                Apply a crosstab specification to a focus to produce a new crosstab.
qsinfo              Display information on a focus's constituent files and its relationships to other foci.
qsdescribelicense   Display information about a Spectrum Miner license.

Commands for building models

Command             Purpose
qsdecisiontree      Apply a decision tree specification to a focus to produce a new decision tree model report.
qsscorecard         Apply a scorecard specification to a focus to produce a new scorecard model report.
qsscorecardwizard   Build a scorecard on a specified focus by using a parameters file.

Commands for working with QMML files

Command       Purpose
qsqmmlview    Show the contents of a QMML rules file.
qsqmmledit    Modify a QMML rules file.
qslt          Transform an FDL, QMML, or generic XML file to a different format.
qsqmml2sas    Convert a QMML file to a SAS file.
qsqmml2sql    Convert a QMML file to an SQL file.
qsqsfmtosas   Convert derivations from a metadata file to a SAS file.

Other commands

Command    Purpose
qsbuild    Execute a build plan.
qsmapgen   Create a set of Decision Studio maps from the binnings in one or more categorical hierarchy files.

Because of the potential interdependence of foci, you should not attempt to copy, move, rename, or delete foci using standard operating-system utilities. Instead, you should always use the commands for managing foci listed above (or Spectrum Miner).

Data-build commands that generate files generally overwrite any existing files of the same name. Take care not to overwrite foci that are dependent on other foci or that have dependants themselves, as you may render those other foci unusable.

Note: • To use the data-build commands, you must have command-line access to Spectrum Miner. See the Spectrum Miner Administration Guide for information on setting up your command path for using data-build commands (or ask your Spectrum Miner administrator).

• Command shells generally require you to quote command-line arguments that contain certain characters, notably spaces and special punctuation (such as single and double quotation marks, "<", ">", "*", "=", and "#").

Some options take arguments that are comma-separated lists, for example, the -fields option in several commands. Such a list is in fact a single command-line argument, so you should either quote it or ensure that there are no spaces following the commas (or anywhere else in the list).

• When you use the -verbose or -logfile option [see Standard command-line options on page 18], the output of a data-build command (on screen or in the log file, according to the option used) includes a line similar to the following:

11:55:21: {P1/4/4Hm4096Pd2047Pi63Pw27Jm64Um-1Fs991Fh1784Fc0Fp0}

The cryptic string of characters encodes information about the memory and processor resources used by the command; if you are having performance problems with data-build commands, this information may help Spectrum Miner Support to resolve matters for you.

• The data-build commands qssort, qsjoin and qsmerge display or record data-specific messages when you use the -verbose or -logfile option. These can help you to understand exactly how the command has acted on your data.

• The exit status returned by each command is zero if it succeeds or a non-zero value if it fails. When a command fails, it always produces an error message of the following form:

*** Error: <message>

Error messages appear on standard error, and also in a log file if you use the -logfile option [see Standard command-line options on page 18]. Commands may also produce warning messages of the following form:

*** Warning: <message>

Warning messages appear either on standard error or in a log file (but not both).

• If a command fails because a focus is read-only, copy the read-only focus to a new location using qslink or qscopy and run the command again.

• When creating UDCs, you use the command qsdbaccess with the -add argument. Because it prompts you for information, it is not appropriate to use this form of the command in a shell script or batch file.

• By default, data-build commands that read foci use the default subfocus if there is one, and the top-level ("root") focus otherwise. You can override this behavior for a given command by using the common -subfocus option.

• Many of the data-build commands take -field, -xfield, -tag and -xtag options to specify lists of fields. Fields can be specified by using combinations of -field and -xfield or -tag and -xtag, but the field options and tag options cannot be used in combination.
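The notes above on quoting, comma-separated lists, exit status, and message formats can be sketched as a small wrapper for use in build scripts. This is an illustrative Python sketch only, not part of Spectrum Miner: the helper names field_list, extract_messages, and run_command are hypothetical. It assumes that a command is launched as an argument list without a shell, so each list element reaches the command as exactly one argument and needs no shell quoting.

```python
import subprocess
import sys

ERROR_PREFIX = "*** Error: "
WARNING_PREFIX = "*** Warning: "


def field_list(fields):
    """Join field names into ONE comma-separated argument with no spaces after commas."""
    cleaned = [f.strip() for f in fields]
    if any(not f or "," in f for f in cleaned):
        raise ValueError("field names must be non-empty and must not contain commas")
    return ",".join(cleaned)


def extract_messages(stderr_text):
    """Split captured standard error into (errors, warnings) using the documented prefixes."""
    errors, warnings = [], []
    for line in stderr_text.splitlines():
        line = line.strip()
        if line.startswith(ERROR_PREFIX):
            errors.append(line[len(ERROR_PREFIX):])
        elif line.startswith(WARNING_PREFIX):
            warnings.append(line[len(WARNING_PREFIX):])
    return errors, warnings


def run_command(argv):
    """Run a command without a shell; return (succeeded, errors, warnings).

    Success is judged by the exit status alone: zero means success.
    """
    result = subprocess.run(argv, capture_output=True, text=True)
    errors, warnings = extract_messages(result.stderr)
    return result.returncode == 0, errors, warnings
```

For example, run_command(["qssort", "-output", "RetailTransAprilSorted.ftr", "-input", "RetailTransApril.ftr", "-keys", "CustomerID"]) would run the sort shown later in this chapter, and field_list(["Income", "Gender", "Age"]) yields the single argument Income,Gender,Age. Because warnings may be written to a log file instead of standard error, an empty warnings list does not prove that none were issued.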

Standard command-line options

Standalone options: use one of the following options in place of a data-build command's normal arguments:

Option    Effect

-help
    Show usage text, including a summary of all valid command-line options other than standard options.

-helpall
    Show usage text, including a summary of all valid command-line options.

-version
    Display the version of Spectrum Miner and associated license information.

Other options common to all commands

Option    Effect

-config <preferences file>
    Use the specified preferences file (see the Spectrum Miner Administration Guide).

    Settings in this file override those in the standard system-wide and user-specific preferences files.

-email
    On completion of the command, send an e-mail message to the recipient specified in the Email preference emailaddress (see the Spectrum Miner Administration Guide).

    The e-mail message includes an HTML-formatted status summary, like that generated by the -statusfile option.

-logfile <log file>
    Write detailed progress information complete with operation timings, and other command-specific information about execution including any warning and error messages, to the specified file, overwriting any existing file of that name. Continue to show error messages on standard output. This option implies the -verbose option.

-memory <number of megabytes>
    Use at most the specified amount of memory. The command will try to keep within the limit if possible, but will definitely not exceed it. In cases where a command requires more memory than the maximum amount, the command fails. This option overrides the setting of the focus memory soft limit preference (see the Spectrum Miner Administration Guide).

-parallel <number of processors>
    Use the specified number of processors (if available), for operations that can make use of more than one processor. This overrides the setting of the parallelism preference (see the Spectrum Miner Administration Guide).

-progress
    Show simple progress information, as a line of . and + characters.

-settings <settings file>
    Use the settings in the specified file to control numeric and date formatting in a generated report. This option has an effect only for the report-generating commands, namely qsaudit, qsdtsnapshot, and qsscsnapshot.

    In the user-specific configuration directory on a Spectrum Miner client PC, Decision Studio creates a file called settings.xml when you make changes to settings using Edit->View Preferences. You can copy this file to the server for use with the -settings option. Or you can create a settings file on the server using the command qssettings.

    By default, the date format is European, the number of decimal places is 2, trailing zeros are not stripped, no locale-specific thousand separator is used, and the time format is 24-hour.

-statusfile <status file>
    Produce an HTML-formatted command-status summary in the status file.

    If you use the -email option, the status e-mail message includes a similar summary.

-verbose
    Show detailed progress information complete with operation timings, and other command-specific information about execution.

    This option overrides the -progress option.

Note: • If you specify the use of more than one processor, using the -parallel option or the corresponding preference, the available memory, as specified using the -memory option or the corresponding preference, is shared between the processors.

• Command-specific information about execution may include, for example: the number of records that were input, output, or matched; information about database datatype conversions; or diagnostic codes of interest to Spectrum Miner Support.

Examples

Display the help text for qsremove:

qsremove -help

Given the preferences file, audit.ini, containing the following:

[Audits and Snapshots]
imageheight = 200
barcolor = #FF0000

Use these preferences to create a Profile and Audit with images of height 200 pixels and red bars:

qsaudit -config audit.ini -input RetailCustApril.ftr
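A preferences file such as audit.ini can also be generated by a script before a report run. The Python sketch below is illustrative only: it assumes that the preferences file is plain INI text of the kind that Python's configparser module writes, and render_preferences is a hypothetical helper rather than anything supplied with Spectrum Miner. The section and option names are those from the example above.

```python
import configparser
import io


def render_preferences(sections):
    """Render {section: {option: value}} as INI text (assumed preferences format)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as given (no lower-casing)
    for section, options in sections.items():
        parser[section] = {name: str(value) for name, value in options.items()}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


AUDIT_PREFERENCES = render_preferences(
    {"Audits and Snapshots": {"imageheight": 200, "barcolor": "#FF0000"}}
)
```

Writing AUDIT_PREFERENCES to a file named audit.ini and passing that file with -config would then correspond to the qsaudit example above. Whether Spectrum Miner accepts every variation that configparser can produce is not confirmed here, so generated files should be checked against the Spectrum Miner Administration Guide.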

Use 512 megabytes of memory when creating a new focus, RetailTransAprilSorted.ftr, by sorting the RetailTransApril.ftr focus:

qssort -memory 512 -output RetailTransAprilSorted.ftr -input RetailTransApril.ftr -keys CustomerID

Use four processors (and default memory settings) when creating a new focus, RetailTransMaySorted.ftr, by sorting the RetailTransMay.ftr focus:

qssort -parallel 4 -output RetailTransMaySorted.ftr -input RetailTransMay.ftr -keys CustomerID

The -force command-line option

This option applies to commands that create foci, namely:

qscopy, qsderive, qsimportdb, qsimportflat, qsimportfocus, qsimportmetadata, qsimportstat, qsjoin, qslink, qsmeasure, qsmerge, qsrenamefields, qsselect, qssort, qstrack, qsupdate, qsscorecard, and qsdecisiontree.

Option    Effect

-force
    Allow a new focus specified using the -output or -to option to overwrite an existing focus.

    Without the -force option, if the focus specified by -output or -to already exists, the command does nothing (except issue a warning).

Note: • The -force option should only be used when absolutely necessary, and you should ensure that you no longer require either the focus that you are overwriting or any foci that are linked to it [see qsinfo on page 110]. In most cases, it would be better to move the original focus out of the way using qsmove before using a data-build command to create a new focus.

• The -force option only has an effect on explicitly specified output foci. If you are using a data-build command to modify an existing focus implicitly, there is no need for -force.

Examples

Create a new focus RetailTransAprilSorted.ftr, overwriting any existing focus of the same name, containing all the records from the focus RetailTransApril.ftr, sorted by the field CustomerID:

qssort -force -output RetailTransAprilSorted.ftr -input RetailTransApril.ftr -keys CustomerID

The -macro command-line option

This option applies to the commands:

qsderive, qsmeasure, qstrack, and qsselect

Command    Purpose

-macro <name>=<value>
    Before parsing a TML or corresponding XML input file, replace all occurrences of $<name> with the given value.

    You can use this option repeatedly on the same command line to specify expansions for multiple macros.

-macro @<macros file>
    Perform macro substitution in TML and corresponding XML input files, reading <name>=<value> pairs from successive lines of the macros file.

    The macros file may also be in XML format.

Note: • Names for macros must begin with a letter and contain only letters, digits, and underscore ("_").

• The Data Build Manager provides similar, but independent, support for macros.

Examples

Given the selections file, selections-macro.tml, containing the following:

create ${monthName}Trans${year} := month(PurchaseDate) = ${month} and year(PurchaseDate) = ${year};

Use macro substitutions to select records for April 1999 from the RetailTransApril.ftr focus:

qsselect -macro monthName=April -macro year=1999 -macro month=4 -selections selections-macro.tml -input RetailTransApril.ftr -output RetailTransAprilSelection1.ftr

Or, alternatively, given the macros file, macros.tml, containing the following:

monthName=April
year=1999
month=4

Apply these macros to achieve the same result:

qsselect -macro @macros.tml -selections selections-macro.tml -input RetailTransApril.ftr -output RetailTransAprilSelection2.ftr
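To preview what a TML file looks like after macro substitution, the expansion can be imitated outside Spectrum Miner. The Python sketch below is only an approximation under stated assumptions: it handles the ${name} reference form used in the example above and the plain-text <name>=<value> macros-file format (not the XML format), it trims whitespace around names and values, and it leaves references to undefined macros untouched. How Spectrum Miner itself treats undefined macros is not stated here. The function names parse_macros and expand are hypothetical.

```python
import re

# Macro names must begin with a letter and contain only letters, digits, and "_".
MACRO_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MACRO_REF = re.compile(r"\$\{([A-Za-z][A-Za-z0-9_]*)\}")


def parse_macros(text):
    """Read <name>=<value> pairs from successive lines of a plain-text macros file."""
    macros = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or not MACRO_NAME.fullmatch(name):
            raise ValueError(f"not a valid <name>=<value> line: {raw!r}")
        macros[name] = value.strip()
    return macros


def expand(text, macros):
    """Replace each ${name} reference with its value; unknown names are left as-is."""
    return MACRO_REF.sub(lambda m: macros.get(m.group(1), m.group(0)), text)
```

Applied to the contents of selections-macro.tml with the three macros above, expand returns the statement with AprilTrans1999 as the field name and the literal values 4 and 1999 in the condition, which is a convenient check before running qsselect.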

See also

XML in Spectrum Miner on page 356

3 - Managing a Spectrum Miner data build

In this section

qsbuild 24

qsbuild

Synopsis qsbuild

[-input <build plan>]

[-targets <target> [, <target> ...]]

[-skip <target> [, <target> ...]]

[-D<name>=<value>]

[-propertyfile <properties file>] [-describe] [-dryrun]

[-validate fail | -validate warn | -validate none]

[-schema <schema file>]

[-lib <class path>]

[-preprocess] [-warn]

Description: execute the build plan qsbuild.qsb in the current directory (unless a different build plan is specified using the -input option).

Optional arguments

Option    Effect

-D<name>=<value>
    Assign the given value to the named parameter or other property. (Note that there is no space after the -D.)

-describe
    Instead of executing the build plan, describe it, listing the main targets and their descriptions.

-dryrun
    Execute the build plan as usual, but display data-build commands instead of executing them. (Any tasks that are not data-build commands are executed as usual.)

-input <build plan>
    Use the specified build plan instead of the file qsbuild.qsb in the current directory.

-lib <class path>
    Use the specified class path as an additional search path for Java classes.

-preprocess
    Validate the build plan, and generate an intermediate build file.

-propertyfile <properties file>
    Assign values to parameters or other properties, by reading <name>=<value> pairs from successive lines of the properties file.

-schema <schema file>
    Validate the build plan against the specified schema (in RELAX NG syntax) instead of the schema referred to in the build plan. You can use a URI to specify the location of a non-local schema file.

-skip <target> [, <target> ...]
    Do not build the specified targets, even if other targets depend on them.

-targets <target> [, <target> ...]
    Build the specified targets instead of the default target.

-validate fail
    Validate the build plan against its schema, executing the plan only if it is valid (the default if you do not specify a -validate option).

-validate none
    Do not validate the build plan against its schema.

-validate warn
    Validate the build plan against its schema; if there are errors, issue warnings, but attempt to execute the plan regardless of its validity.

-warn
    Rather than aborting the build plan when a target fails to build, issue a warning, and attempt to build other targets that do not depend on the failed target.

Besides these command-line options, qsbuild accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • The build plan describes how to build an analysis dataset or deploy a model. It contains a number of "targets" with dependencies between them, and the sequences of operations required to build the targets. The file uses an XML-based file format. For more information on the Data Build Manager and the format of build plans, see the Spectrum Miner Data Build Manager Reference Guide.

• You can also access most of the functionality of qsbuild through the Data Build Manager, available from Spectrum Miner.

Examples

Execute a data build using the build plan qsbuild-starter.qsb:

qsbuild -input qsbuild-starter.qsb

Only execute the Clean task within the build plan qsbuild.qsb:

qsbuild -targets Clean -input qsbuild.qsb

Execute the build plan qsbuild.qsb with the Month parameter set to June:

qsbuild -DMonth=June -input qsbuild.qsb
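The -propertyfile option reads <name>=<value> pairs, one per line, so the Month setting above could also be supplied from a file. The following is a minimal sketch only: the file name build.properties and the second parameter, Region, are illustrative assumptions and do not come from this guide. The build is attempted only where qsbuild and the build plan are actually present.

```shell
# Write one <name>=<value> pair per line.
# (build.properties and Region are hypothetical names used for illustration.)
cat > build.properties <<'EOF'
Month=June
Region=North
EOF

# Run the build only where qsbuild is installed and the plan exists.
if command -v qsbuild >/dev/null 2>&1 && [ -f qsbuild.qsb ]; then
  qsbuild -propertyfile build.properties -input qsbuild.qsb
else
  echo "qsbuild or qsbuild.qsb not available; properties file written only"
fi
```

Keeping parameters in a file may be convenient when the same values are reused across several build runs.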


4 - Commands for importing and exporting data

In this section

qsdbaccess 28
qsimportdb 29
qsdbcreatetable 31
qsdbinsert 33
qsdbupdate 35
qsgenfdd 37
qsimportflat 39
qsexportflat 41
qsimportstat 44
qsexportstat 46
qsimportfocus 48
How Spectrum Miner imports database types 50
Date formats 52

qsdbaccess

Synopsis qsdbaccess {-add | -list | -test <UDC> | -delete <UDC>} [-system]

Description: interactively create a UDC (User Database Connection), list UDCs, test a UDC, or delete a UDC.

Using -add, qsdbaccess prompts you to enter a DSN, username, and password.

UDCs are used in database-related data-build commands and other parts of Spectrum Miner to connect to databases. See the Spectrum Miner Administration Guide for further information.

Optional arguments

Effect / Option

Apply the operation (add, list, test, or delete) to the system UDCs instead of your personal UDCs. When you use -add with -system, qsdbaccess additionally prompts you for a URL, a class, and a DRL. You should normally accept the defaults for these three options. The -system option is intended for use only by system administrators.
-system

Besides this command-line option, qsdbaccess accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • A UDC is encrypted: if it contains user-authentication details, only you can use them to access the database in question.

• When you create a UDC, qsdbaccess overwrites any existing UDC of the same name (and therefore overwrites the associated user-authentication details).

• The name of a UDC is case-significant (fred and Fred are distinct), even when the DSN is not case-significant to the ODBC driver.

• Using Spectrum Miner, you can access the functionality of qsdbaccess through the Edit Database Connection dialog box.

Examples

Create a UDC for the user user to access the retail database DSN:

qsdbaccess -add
Please enter required text at the ">" prompt:
DSN> retail
User> user
Password>
Updating connection user@retail
Update OK
Testing connection user@retail
ODBC OK
JDBC OK
Connection OK

Create a system UDC (as the administrator user) for any user to access the retail database DSN:

qsdbaccess -add -system
Please enter required text at the ">" prompt:
DSN> retail
User>
[Just pressing ENTER will use the system default]
URL>
Class>
DRL>
Updating connection dbuser@dbserver:db
Update OK

See also

qsdbcreatetable on page 31

qsdbinsert on page 33

qsdbupdate on page 35

qsimportdb on page 29

qsimportdb

Synopsis qsimportdb -udc <UDC>

-table <database table or view>

[-fields <column> [, <column> ...] | -fields @<fields file>]

[-xfields <column> [, <column> ...] | -xfields @<fields file>]

[-catalog <catalog name>] [-schema <schema name>]

-output <destination focus> [-force]

qsimportdb -udc <UDC>

-sql <SQL file>

-output <destination focus> [-force]

Description: create the destination focus from the specified database table or view, which is in the database with the given UDC [see qsdbaccess on page 28].

Alternatively, create the focus from the result of an SQL SELECT statement (given in an SQL file).

Optional arguments

Effect / Option

Locate the table in this database catalog.
-catalog <catalog name>

Create fields in the destination focus corresponding only to the specified columns in the table.
-fields <column> [, <column> ...]

Create fields in the destination focus corresponding only to columns in the table that are listed (one per line) in the fields file.
-fields @<fields file>

See The -force command-line option on page 20.
-force

Locate the table in this database schema.
-schema <schema name>

Create fields in the destination focus corresponding to all columns in the table except the specified columns.
-xfields <column> [, <column> ...]

Create fields in the destination focus corresponding to all columns in the table except the columns that are listed (one per line) in the fields file.
-xfields @<fields file>

Besides these command-line options, qsimportdb accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you specify an identifier (for example, a field or table name) as a command-line argument, qsimportdb passes it to the ODBC driver in quoted form, and your database may treat the identifier as case-significant.

• SQL specified with the -sql option is passed as-is to the database as a single statement; a terminating semicolon is not required (and may be invalid syntax for some database systems).

• Using qsimportdb, it is possible to create a focus that does not have Spectrum Miner-compliant field names. To convert the focus into a form that you can use in Decision Studio or with other data-build commands, you can rename fields using qsrenamefields.

• Using Spectrum Miner, you can access most of the database-query functionality of qsimportdb through the Import from Query dialog box.

Examples

Create a focus, RetailCustApril.ftr, containing all the records from the RetailCustApril table, in the DSN referred to by the user@retail UDC:

qsimportdb -table RetailCustApril -udc user@retail -output RetailCustApril.ftr

Create a focus, RetailTransAprilCA.ftr, containing only records from the RetailTransApril table for which the value of the PaymentMethod column equals CA:

Given the SQL query file, select-statement.sql, containing the following:

select *
from RetailTransApril
where (PaymentMethod = 'CA')

Apply this query to create a new focus, RetailTransAprilCA.ftr:

qsimportdb -sql select-statement.sql -udc user@retail -output RetailTransAprilCA.ftr
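The -fields @<fields file> form takes column names from a file, one per line. A minimal sketch of that form follows. It assumes that the RetailCustApril table has columns with the same names as the flat-file fields used elsewhere in this guide (CustomerID, StartDate, Age, Postcode, Gender); the file name cust-fields.txt is hypothetical. The import is attempted only where qsimportdb is installed.

```shell
# One column name per line (column names are assumed to exist in the table).
printf '%s\n' CustomerID StartDate Age Postcode Gender > cust-fields.txt

# Import only the listed columns, if qsimportdb is available on this machine.
if command -v qsimportdb >/dev/null 2>&1; then
  qsimportdb -udc user@retail -table RetailCustApril \
    -fields @cust-fields.txt -output RetailCustAprilSomeFields.ftr
else
  echo "qsimportdb not available; fields file written only"
fi
```

Because identifiers may be treated as case-significant by the database, the names in the fields file should match the column names exactly.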


qsdbcreatetable

Synopsis qsdbcreatetable -udc <UDC>

-focus <template focus> [-subfocus <subfocus>]

-table <destination table>

[-fields <field> [, <field> ...] | -fields @<fields file> | -tags <tag> [, <tag> ...]]

[-xfields <field> [, <field> ...] | -xfields @<fields file> | -xtags <tag> [, <tag> ...]]

[-catalog <catalog name>]

[-schema <schema name>]

[-output <SQL output file>]

qsdbcreatetable -udc <UDC>

-sql <SQL file>

Description: use the given template focus to define the field names and types for a new, empty database table with the given table name, in the database with the given UDC [see qsdbaccess on page 28].

Alternatively, create a new table using an SQL CREATE TABLE statement (given in an SQL file).

Optional arguments

Effect / Option

Create the table in this database catalog.
-catalog <catalog name>

Create table columns corresponding only to the specified fields in the focus.
-fields <field> [, <field> ...]

Create table columns corresponding only to fields in the focus that are listed (one per line) in the fields file.
-fields @<fields file>

Instead of creating a table in the database, create an SQL output file containing the appropriate CREATE TABLE statement. You can later use the SQL file as input to qsdbcreatetable (with the -sql option).
-output <SQL output file>

Create the table in this database schema.
-schema <schema name>

Use the specified subfocus of the template focus.
-subfocus <subfocus>

Create table columns corresponding only to the fields in the focus that have the specified tags.
-tags <tag> [, <tag> ...]

Create table columns corresponding to all fields in the focus except the specified fields.
-xfields <field> [, <field> ...]

Create table columns corresponding to all fields in the focus except the fields that are listed (one per line) in the fields file.
-xfields @<fields file>

Create table columns corresponding to all fields in the focus except the fields that have the specified tags.
-xtags <tag> [, <tag> ...]

Besides these command-line options, qsdbcreatetable accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you specify an identifier (for example, a field or table name) as a command-line argument, qsdbcreatetable passes it to the ODBC driver in quoted form, and your database may treat the identifier as case-significant.

• SQL specified with the -sql option is passed as-is to the database as a single statement; a terminating semicolon is not required (and may be invalid syntax for some database systems).

Example

Create an empty database table, RETAILAGGREGATIONS, from the RetailAggregationsApril.ftr focus, connecting via the retail UDC:

qsdbcreatetable -focus RetailAggregationsApril.ftr -table RETAILAGGREGATIONS -udc user@retail
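The -output and -sql options can be combined into a two-step workflow: first write the CREATE TABLE statement to a file, then create the table from that file once it has been reviewed. The sketch below illustrates this under stated assumptions: the SQL file name create-table.sql and the run helper are hypothetical, and the helper only prints the command when qsdbcreatetable is not installed.

```shell
# Run a command if it is installed; otherwise just print what would be run.
# (run is a hypothetical helper, not part of Spectrum Miner.)
run() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "would run: $*"
  fi
}

# Step 1: write the CREATE TABLE statement to a file instead of creating the table.
run qsdbcreatetable -udc user@retail -focus RetailAggregationsApril.ftr \
  -table RETAILAGGREGATIONS -output create-table.sql

# Step 2: after reviewing the SQL file, create the table from it.
run qsdbcreatetable -udc user@retail -sql create-table.sql
```

Separating the two steps leaves room to inspect or adjust the generated statement before anything is created in the database.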

See also

qsimportdb on page 29

qsdbinsert

Synopsis qsdbinsert -udc <UDC>

-input <source focus> [-subfocus <subfocus>]

Description: insert records from the source focus into the database table (using an SQL INSERT statement). Locate the table in the database with the given UDC [see qsdbaccess on page 28].

Optional arguments

Effect / Option

Insert data only from the specified fields in the focus.
-fields <field> [, <field> ...]

Insert data only from fields in the focus that are listed (one per line) in the fields file.

Use the specified subfocus of the source focus.
-subfocus <subfocus>

Insert data only from fields in the focus that have the specified tags.

Insert data from all fields in the focus except the specified fields.

Insert data from all fields in the focus except the fields that are listed (one per line) in the fields file.

Insert data from all fields in the focus except the fields that have the specified tags.

Besides these command-line options, qsdbinsert accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • To create an empty table into which you can insert records, use qsdbcreatetable.

• The qsdbinsert command inserts additional records into a database table rather than updating existing records. It doesn't check for duplicate records (or records with duplicate keys), though your database may be configured to enforce such checking.

• The fields in the focus (or those specified using the -fields or -xfields options) must correspond in name and be type-compatible with the columns in the database table.

• If you specify an identifier (for example, a field or table name) as a command-line argument, qsdbinsert passes it to the ODBC driver in quoted form, and your database may treat the identifier as case-significant.

Example

Insert data from the RetailAggregationsApril.ftr focus into the RETAILAGGREGATIONS database table:

qsdbinsert -input RetailAggregationsApril.ftr -table RETAILAGGREGATIONS -udc user@retail


qsdbupdate

Synopsis qsdbupdate -udc <UDC>

{-keys <key field> [, <key field>] [, <key field>] | -key @<key file>}

Description: update records in the database table with data from the specified fields in the source focus (using an SQL UPDATE statement). Each record in the focus updates all records with the same key-field values in the database table. Locate the table in the database with the given UDC [see qsdbaccess on page 28].

Optional arguments

Effect / Option

Update the table using only data from the specified fields in the focus.

Update the table using only data from fields in the focus that are listed (one per line) in the fields file.

Update the table using only data from the fields in the focus that have the specified tags.

Update the table using data from all fields in the focus except the specified fields.

Update the table using data from all fields in the focus except the fields that are listed (one per line) in the fields file.

Update the table using data from all fields in the focus except the fields that have the specified tags.

Besides these command-line options, qsdbupdate accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • The qsdbupdate command does not insert any additional records into a database table.

• The fields in the focus (or those specified using the -fields or -xfields options) must correspond in name and be type-compatible with columns in the database table. However, you don't have to update the values in all columns of the table.

• If you specify an identifier (for example, a field or table name) as a command-line argument, qsdbupdate passes it to the ODBC driver in quoted form, and your database may treat the identifier as case-significant.

Example

Update only the fields numberPurchases, averagePurchase, and totalPointsRedeemed in the RETAILAGGREGATIONS table using the RetailAggregationsMay.ftr focus:

qsdbupdate -udc user@retail -keys CustomerID -input RetailAggregationsMay.ftr -table RETAILAGGREGATIONS -fields "numberPurchases, averagePurchase, totalPointsRedeemed"


qsgenfdd

Synopsis qsgenfdd -input <flat file>

[-template <template FDD file>]

[-headers]

[-separator <field separator>]

[-null <null marker>]

[-stringmarker <string marker>]

[-datemarker <date marker>]

[-dateformat <date format>]

[-defaultday <day number>]

[-defaultmonth <month number>]

[-comment <table comment>]

Description: create a flat-data description file (.fdd file) for the flat file, by examining a sample of the records to work out the file format. Name the flat-data description file using the basename of the flat file and create it in the same directory as the flat file.

Optional arguments

Effect / Option

Include this table comment in the flat-data description file.
-comment <table comment>

Override the automatically chosen date format [see Date formats on page 52].
-dateformat <date format>

Override the automatically chosen date-marker character.
-datemarker <date marker>

Include this day number in dates that don't have a day component.
-defaultday <day number>

Include this month number in dates that don't have a month component.
-defaultmonth <month number>

Include a description of the data as having an initial headers line, which contains field names.
-headers

Override the automatically chosen null-marker string.
-null <null marker>

Override the automatically chosen field-separator character.
-separator <field separator>

Override the automatically chosen string-marker character.
-stringmarker <string marker>

Base the flat-data description file on this existing template flat-data description file.
-template <template FDD file>

See Spectrum Miner Online Help for further information on flat-file options.

Besides these command-line options, qsgenfdd accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If a valid flat-data description file already exists, qsgenfdd does nothing.

• In most cases, qsgenfdd automatically chooses appropriate values for the metadata describing a flat file. However, because it scans only a sample of the data in the flat file, and because there may be ambiguities in the flat-file format, it can guess wrongly. Even so, you normally only have to override one or two parameters to arrive at a correct interpretation of the data file.

• You can control the size of samples used in automatic format detection using the Flat File Database preferences data format sample size and data sample size (see Spectrum Miner Online Help).

• For automatic format detection to work, fields in the flat file must be delimited by a separator character rather than being distinguished by position (that is, the file cannot be in fixed-width format).

• If you use a template FDD file, it must contain a description of all the fields in a table component.

• Formatting options specified on the command line override global formatting options in a template FDD file, which in turn override automatically detected global formatting options. Command-line options do not override field-specific options.

• You can also access the functionality of qsgenfdd through the New Focus wizard available from Decision Studio or Spectrum Miner.

Examples

Create a flat-data description file for the RetailCustMay.txt text file:

qsgenfdd -input RetailCustMay.txt

Use the resulting FDD file to create a new focus, RetailCustMay.ftr:

qsimportflat -input RetailCustMay.fdd -output RetailCustMay.ftr

Use a custom date-format string to create an FDD file for the RetailCustMay.txt text file:

qsgenfdd -dateformat "%d-%b-%Y %H:%M:%S" -input RetailCustMay.txt
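Because qsgenfdd does nothing when a valid flat-data description file already exists, correcting a wrong guess means removing or renaming the old FDD file before running the command again with overrides. The following sketch shows one way this might be scripted. The function name regen_fdd, the .bak suffix, and the particular override values (a "|" separator and a "###" null marker) are illustrative assumptions; the FDD file name is derived from the basename of the flat file, as described above.

```shell
# Regenerate the FDD for a flat file, overriding automatically detected settings.
# (regen_fdd is a hypothetical helper, not part of Spectrum Miner.)
regen_fdd() {
  flat_file=$1
  fdd_file="${flat_file%.*}.fdd"
  # qsgenfdd does nothing if a valid FDD already exists, so set the old one aside.
  if [ -f "$fdd_file" ]; then
    mv "$fdd_file" "$fdd_file.bak"
  fi
  if command -v qsgenfdd >/dev/null 2>&1; then
    qsgenfdd -headers -separator "|" -null "###" -input "$flat_file"
  else
    echo "qsgenfdd not available; would regenerate $fdd_file"
  fi
}

regen_fdd RetailCustMay.txt
```

Renaming rather than deleting the previous FDD file keeps any manual edits available for comparison.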

See also

qsexportflat on page 41

qsimportflat on page 39

qsimportflat

Synopsis qsimportflat

-input <flat file>

-encoding <encoding>

[-fields <field> [, <field> ...] | -fields @<fields file>]

[-maxwarnings <number of warnings>] | [-nowarnings]

[-maxerrors <number of errors>]

Description: create the destination focus from data in the flat file (which may be a text file or a flat-data description file).

Optional arguments

Effect / Option

Specify the input file encoding (for example UTF-8, Shift-JIS, LATIN1, EUC_JP, UTF16-LE, or UTF_BE).
-encoding <encoding>

Create fields in the destination focus corresponding only to the specified fields in the flat file.
-fields <field> [, <field> ...]

Create fields in the destination focus corresponding only to fields in the flat file that are listed (one per line) in the fields file.
-fields @<fields file>

Abort the flat-file import if the process generates more than the specified number of errors regarding the format of the flat file.
-maxerrors <number of errors>

Abort the flat-file import if the process generates more than the specified number of warnings regarding the format of the flat file.
-maxwarnings <number of warnings>

Do not generate any warnings regarding the format of the flat file.
-nowarnings

Create fields in the destination focus corresponding to all fields in the flat file except the specified fields.

Create fields in the destination focus corresponding to all fields in the flat file except the fields that are listed (one per line) in the fields file.

Besides these command-line options, qsimportflat accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • You should specify a flat-data description (FDD) file in preference to a text file as the input to qsimportflat. You can create an FDD file from a text file using qsgenfdd.

• By default, the Flat File Database preference permit data format errors is set to true (see Spectrum Miner Online Help). This means that all flat-file format errors are treated as warnings and the -maxerrors option has no effect.

• The -maxerrors, -maxwarnings, and -nowarnings options override the settings of the Flat File Database preferences extraction max errors, extraction max warnings, and ignore extraction warnings respectively.

• You can also access the functionality of qsimportflat through the New Focus wizard available from Decision Studio or Spectrum Miner.

Examples

Create a new focus, RetailTransApril.ftr, containing all the fields from the RetailTransApril.txt flat file:

qsimportflat -input RetailTransApril.txt -output RetailTransApril.ftr

Create a new focus, RetailCustAprilSomeFields.ftr, containing fields CustomerID, StartDate, Age, Postcode and Gender from the RetailCustApril.txt flat file, while also creating an FDD file:

qsgenfdd -input RetailCustApril.txt
qsimportflat -fields "CustomerID, StartDate, Age, Postcode, Gender" -input RetailCustApril.fdd -output RetailCustAprilSomeFields.ftr

Then create another focus, RetailCustAprilOtherFields.ftr, containing all fields except CustomerID and Postcode, from the flat file RetailCustApril.txt, using the same FDD file:

qsimportflat -xfields "CustomerID, Postcode" -input RetailCustApril.fdd -output RetailCustAprilOtherFields.ftr
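When several flat files share a format, the generate-then-import sequence shown above can be repeated in a loop. The following is a rough sketch under stated assumptions: the helper name import_flat and the warning limit of 100 are arbitrary choices for illustration, and the commands run only where both qsgenfdd and qsimportflat are installed.

```shell
# Generate an FDD for one flat file, then import it into a focus of the same basename.
# (import_flat is a hypothetical helper; the limit of 100 warnings is arbitrary.)
import_flat() {
  flat_file=$1
  base="${flat_file%.txt}"
  if command -v qsgenfdd >/dev/null 2>&1 && command -v qsimportflat >/dev/null 2>&1; then
    qsgenfdd -input "$flat_file" &&
      qsimportflat -maxwarnings 100 -input "$base.fdd" -output "$base.ftr"
  else
    echo "would import $flat_file via $base.fdd into $base.ftr"
  fi
}

for flat_file in RetailCustApril.txt RetailCustMay.txt; do
  import_flat "$flat_file"
done
```

Importing through the FDD file rather than the text file follows the recommendation in the notes above.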

See also

qsexportflat on page 41

qsexportflat

Synopsis qsexportflat

-output <flat file>

[-alwaysquotestrings]

[-records <FDL expression> | -records @<FDL file>]

[-template <template FDD file>]

[-fdd]

[-headers]

[-separator <field separator>]

[-null <null marker>]

[-stringmarker <string marker>]

[-datemarker <date marker>]

[-dateformat <date format>]

[-defaultday <day number>]

[-defaultmonth <month number>]

[-fixedformat]

Description create the flat file from data in the source focus.

Optional arguments

Effect / Option

Quote all string values in the flat file, rather than only quoting strings that contain special characters (such as the separator).
-alwaysquotestrings

Use this format [see Date formats on page 52] for dates in the flat file.
-dateformat <date format>

Use this marker character to introduce dates in the flat file.
-datemarker <date marker>

Include this day number in any generated flat-data description file (to describe dates formatted without a day component).
-defaultday <day number>

Include this month number in any generated flat-data description file (to describe dates formatted without a month component).
-defaultmonth <month number>

As well as creating the flat file, create a flat-data description file (.fdd file) to describe the data format.
-fdd

Create fields in the flat file corresponding only to the specified fields in the source focus.

Create fields in the flat file corresponding only to fields in the source focus that are listed (one per line) in the fields file.

Use a fixed number of characters for each field (chosen to accommodate all possible values of that datatype). You can use this in conjunction with an empty field separator to create a flat file in which fields are distinguished by position instead of using a separator character.
-fixedformat

Include an initial headers line in the flat file, containing field names.
-headers

Use this null-marker string instead of a blank to indicate occurrences of the null value in the flat file.
-null <null marker>

Create records in the flat file corresponding only to records in the source focus for which the numeric FDL expression is non-zero ("true").
-records <FDL expression>

Create records in the flat file corresponding only to records in the source focus for which the FDL expression in the specified file is non-zero ("true").
-records @<FDL file>

Use this field-separator character instead of a comma.
-separator <field separator>

Use this string-marker character instead of double quotation marks.
-stringmarker <string marker>

Create fields in the flat file corresponding only to the fields in the focus that have the specified tags.

Base the flat-file format on this existing template flat-data description file.
-template <template FDD file>

Create fields in the flat file corresponding to all fields in the source focus except the specified fields.

Create fields in the flat file corresponding to all fields in the source focus except the fields that are listed (one per line) in the fields file.

Create fields in the flat file corresponding to all fields in the source focus except the fields that have the specified tags.

See Spectrum Miner Online Help for further information on flat-file options.

Besides these command-line options, qsexportflat accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you use an FDL selection expression (to export only a subset of the records in the focus), you must export any fields used in the expression.

• If you specify a flat file with the filename extension .fdd, qsexportflat treats it as a flat-data description file (regardless of whether you specify the -fdd option) and creates a corresponding flat file with the filename extension .txt.

• If you use a template FDD file, it must contain a description of all the fields in a table component.

• Formatting options specified on the command line override global formatting options in a template FDD file. Command-line options do not override field-specific options.

• If a flat file of the same name as the output file already exists, qsexportflat makes a backup copy of the existing file (with "~" appended to the filename) rather than simply overwriting it.

• You can access most of the functionality of qsexportflat through the Export Focus dialog box in Decision Studio, or the Export Text wizard available from Spectrum Miner.

Examples

Create a new flat file, RetailTransApril.csv, that contains all the records from the RetailTransApril.ftr focus:

qsexportflat -input RetailTransApril.ftr -output RetailTransApril.csv

Create a new flat file, RetailTransAprilDates.csv, that contains only the fields CustomerID and PurchaseDate from the RetailTransApril.ftr focus, using a particular date format:

qsexportflat -fields "CustomerID, PurchaseDate" -dateformat "%e %h %Y" -input RetailTransApril.ftr -output RetailTransAprilDates.csv

Given the template flat-data description file, template.fdd, containing the following:

separator |
null (###)
dateformat ((%e %h %Y))
skip header
table \
CustomerID string(18) \
PurchaseDate date \
Store integer \
Amount real \
PaymentMethod string(2) \
PointsRedeemed integer

Apply this template to create a flat file, RetailTransAprilDates.txt:

qsexportflat -template template.fdd -input RetailTransApril.ftr -output RetailTransAprilDates.txt
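The -records @<FDL file> form reads the selection expression from a file, which avoids escaping the inner double quotation marks on the command line. A minimal sketch follows. The expression Store = "800" is the one used in the -records examples for other commands in this guide; the file names store800.fdl and RetailTransAprilStore800.csv are hypothetical. Store is included in the exported fields because any field used in the selection expression must be exported.

```shell
# Put the FDL selection expression in a file (file name is hypothetical).
cat > store800.fdl <<'EOF'
Store = "800"
EOF

# Export the matching records and also write an FDD file describing the format.
if command -v qsexportflat >/dev/null 2>&1; then
  qsexportflat -input RetailTransApril.ftr -output RetailTransAprilStore800.csv \
    -fields "CustomerID, PurchaseDate, Store, Amount" \
    -records @store800.fdl -fdd
else
  echo "qsexportflat not available; FDL file written only"
fi
```

Keeping the expression in a file may also make longer selections easier to review and reuse.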

See also

qsgenfdd on page 37

qsimportflat on page 39

qsimportstat

Synopsis qsimportstat

-input <source dataset>

[-type <dataset type>]

[-fields <field> [, <field> ...] | -fields @<fields file>]

[-metadata <metadata file>]

Description: create the destination focus from Excel datasets.

Optional arguments

Effect / Option

Create fields in the destination focus corresponding only to the specified fields in the source dataset.
-fields <field> [, <field> ...]

Create fields in the destination focus corresponding only to fields in the source dataset that are listed (one per line) in the fields file.
-fields @<fields file>

Create the metadata file from the source Excel dataset.
-metadata <metadata file>

Interpret the Excel dataset as the specified dataset type, overriding the default interpretation (which depends on the filename extension).
-type <dataset type>

Create fields in the destination focus corresponding to all fields in the source dataset except the specified fields.

Create fields in the destination focus corresponding to all fields in the source dataset except the fields that are listed (one per line) in the fields file.

Besides these command-line options, qsimportstat accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you attempt to import from an Excel dataset and the metadata file already exists, the operation fails. To avoid this, use the -force argument.

• Using Spectrum Miner, you can access most of the functionality of qsimportstat through the Focus Import dialog box.

Examples

Create a new focus RetailTransApril.ftr, containing all the fields from the Excel file RetailTransApril.xlsx:

qsimportstat -input RetailTransApril.xlsx -output RetailTransApril.ftr

Create a new focus RetailCustAprilSomeFields.ftr, containing the fields CustomerID, StartDate, Age, Postcode, and Gender from the Excel file RetailCustApril.xlsm:

qsimportstat -fields "CustomerID, StartDate, Age, Postcode, Gender" -input RetailCustApril.xlsm -type xls -output RetailCustAprilSomeFields.ftr

See also

qsdescribestat on page 104

qsexportstat on page 46

qsexportstat

Synopsis qsexportstat

-output <destination dataset> [-overwrite]

[-xfields <field> [, <field> ...] | -xfields @<fields file> | -xtags <tag> [, <tag> ...]]

[-records <FDL expression> | -records @<FDL file>]

Description: create the destination Excel dataset from the source focus.

Optional arguments

Effect / Option

Create fields in the destination dataset corresponding only to the specified fields in the source focus.

Create fields in the destination dataset corresponding only to fields in the source focus that are listed (one per line) in the fields file.

Allow a new dataset specified using the -output option to overwrite an existing file. Without the -overwrite option, if the dataset specified using -output already exists, the command does nothing (except issue a warning).
-overwrite

Create records in the destination dataset corresponding only to records in the source focus for which the numeric FDL expression is non-zero ("true").
-records <FDL expression>

Create records in the destination dataset corresponding only to records in the source focus for which the FDL expression in the specified file is non-zero ("true").
-records @<FDL file>

Create fields in the destination dataset corresponding only to the fields in the source focus that have the specified tags.

Create an Excel dataset of the specified dataset type, overriding the default interpretation (which depends on the filename extension).

Create fields in the destination dataset corresponding to all fields in the source focus except the specified fields.
-xfields <field> [, <field> ...]

Create fields in the destination dataset corresponding to all fields in the source focus except the fields that are listed (one per line) in the fields file.
-xfields @<fields file>

Create fields in the destination dataset corresponding to all fields in the source focus except the fields that have the specified tags.
-xtags <tag> [, <tag> ...]

Besides these command-line options, qsexportstat accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • Using Spectrum Miner, you can access most of the functionality of qsexportstat through the Export for Another Program dialog box.

Examples

Create a new Excel file RetailTransAprilStore800.xlsx, that contains a subset of records (only purchases from store 800) and only spend fields, from the focus RetailTransApril.ftr:

qsexportstat -input RetailTransApril.ftr -output RetailTransAprilStore800.xlsx -fields "CustomerID, PurchaseDate, Store, Amount" -records "Store = \"800\""

See also

qsdescribestat on page 104

qsimportstat on page 44

qsimportfocus

Synopsis qsimportfocus

[-xfields <field> [, <field> ...] | -xfields @<fields file> | -xtags <tag> [, <tag> ...]]

[-records <FDL expression> | -records @<FDL file>] [-preservetypes]

Description: create the destination focus from the data in the source focus. Do not copy themetadata.

Optional arguments

Effect / Option

Create fields in the destination focus corresponding only to the specified fields in the source focus.

Create fields in the destination focus corresponding only to fields in the source focus that are listed (one per line) in the fields file.

Preserve legacy datatypes from a focus created using an earlier version of Spectrum Miner.
-preservetypes

Create records in the destination focus corresponding only to records in the source focus for which the numeric FDL expression is non-zero ("true").
-records <FDL expression>

Create records in the destination focus corresponding only to records in the source focus for which the FDL expression in the specified file is non-zero ("true").
-records @<FDL file>

Create fields in the destination focus corresponding only to the fields in the source focus that have the specified tags.

Create fields in the destination focus corresponding to all fields in the source focus except the specified fields.
-xfields <field> [, <field> ...]

Create fields in the destination focus corresponding to all fields in the source focus except the fields that are listed (one per line) in the fields file.
-xfields @<fields file>

Create fields in the destination focus corresponding to all fields in the source focus except the fields that have the specified tags.
-xtags <tag> [, <tag> ...]

Besides these command-line options, qsimportfocus accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • To create a copy of a focus including metadata, use qscopy.

• The destination focus is independent of the source focus, that is, it does not share underlying data.

• If the source focus was created using an earlier version of Spectrum Miner, it may contain fields with legacy datatypes. By default, qsimportfocus converts such fields into fields with the standard integer, real, date, and string datatypes. You can force qsimportfocus to retain the original datatypes in the destination focus, by using the -preservetypes option.

• Using Spectrum Miner, you can access most of the functionality of qsimportfocus through the Export as Focus dialog box.

Examples

Create a new focus RetailTransAprilStandardTypes.ftr, that contains all the records from the focus RetailTransApril.ftr, but converts fields with legacy datatypes to the standard datatypes:

qsimportfocus -input RetailTransApril.ftr -output RetailTransAprilStandardTypes.ftr

Create a new focus RetailTransAprilStore800.ftr, that contains a subset of records (only purchases from store 800) and only spend fields, from the focus RetailTransApril.ftr:

qsimportfocus -input RetailTransApril.ftr -output RetailTransAprilStore800.ftr -fields "CustomerID, PurchaseDate, Store, Amount" -records "Store = \"800\""

Create a new focus, RetailTransApril2Customers.ftr, containing only the transactions for customers 20450000000036004 and 20450000000043009:

qsimportfocus -records "strmember(CustomerID, \"20450000000036004\", \"20450000000043009\")" -input RetailTransApril.ftr -output RetailTransApril2Customers.ftr

How Spectrum Miner imports database types

Importing to Spectrum Miner from a database

Spectrum Miner maps the following database types into its datatypes, based on the size of the database type and the Databases preference settings.

Database type    Scale    Precision    represent large integers as strings    preserve numeric types on import    Spectrum Miner field type

char, char2, varchar, varchar2, nchar, nchar2, nvarchar, nvarchar2    string

date, time, timestamp, datetime, smalldatetime    date

int, integer, tinyint, smallint, byteint, bit    integer

string

bigint

string

< 0decimal, dec,numeric, number

string

integer

> 10 15

0decimal, dec,numeric, number

realfalsefalse

true> 15

string

realfloat, double, real

realtrue

> 0decimal, dec,numeric, number

string

realsmallmoney

Note: • In rows marked with , there may be some loss of precision.• In rows marked with , values too large to store as integer will be converted to Null.

Exporting from Spectrum Miner to a database

The database types that Spectrum Miner exports depend on both the target database and the ODBC driver. Spectrum Miner suggests a suitable type for the database that you specify, while the ODBC driver interprets that suggestion into one of the types in that database.

Date formats

When working with flat files and FDL expressions, you can specify how Spectrum Miner should interpret or export dates and times, using either a standard date format [see Standard date formats on page 52] or a customized date format [see Customized date formats on page 52].

Standard date formats

Name    Date format
American    MM/DD/YYYY:hh:mm:ss, for example, 12/31/2000:13:36:59
European    DD/MM/YYYY:hh:mm:ss, for example, 31/12/2000:13:36:59
YMD    YYYY/MM/DD:hh:mm:ss, for example, 2000/12/31:13:36:59

Note:
• All three of the standard date formats also apply to dates without a time component and to pure times ("dates" with only a time component). For date-only and time-only dates, the time and date parts respectively are omitted.
• American, European, and YMD formats also encompass data in the forms MMDDYYYY, DDMMYYYY, and YYYYMMDD respectively (for input only).
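As a rough illustration of the three standard formats, the following Python sketch tries the date-and-time, date-only, compact (input-only), and time-only forms in turn. It is not Spectrum Miner code: the function name parse_standard and the mapping to Python strptime patterns are assumptions made for this example, and Python fills in a placeholder date of 1900-01-01 for time-only values.

```python
from datetime import datetime

# Assumed mapping from the standard format names to Python strptime
# patterns: (form with separators, compact input-only form).
STANDARD_FORMATS = {
    "American": ("%m/%d/%Y", "%m%d%Y"),
    "European": ("%d/%m/%Y", "%d%m%Y"),
    "YMD": ("%Y/%m/%d", "%Y%m%d"),
}
TIME_PATTERN = "%H:%M:%S"


def parse_standard(text, name):
    """Try date+time, date-only, compact date, then time-only patterns."""
    separated, compact = STANDARD_FORMATS[name]
    candidates = [
        f"{separated}:{TIME_PATTERN}",  # e.g. 12/31/2000:13:36:59
        separated,                      # e.g. 12/31/2000
        compact,                        # e.g. 12312000
        TIME_PATTERN,                   # e.g. 13:36:59
    ]
    for pattern in candidates:
        try:
            return datetime.strptime(text, pattern)
        except ValueError:
            continue
    raise ValueError(f"{text!r} does not match the {name} format")


print(parse_standard("12/31/2000:13:36:59", "American"))
print(parse_standard("20001231", "YMD"))
```

The same instant written in each of the three formats should parse to the same value, which is a convenient check when preparing flat files.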

Customized date formats

You can specify a customized date format by using special character codes, each beginning with a percent sign %:

Code    Component of formatted date
%%    The percent sign itself: %
%?    Matches any single character
%*    Matches any number of characters; for example, to ignore trailing characters, use %Y/%m/%d%*
%a    The day name, in abbreviated form (Mon, Tue, ..., Sun)
%A    The day name, in full (Monday, Tuesday, ..., Sunday)
%b    The month name, in abbreviated form (Jan, Feb, ..., Dec)
%B    The month name, in full (January, February, ..., December)
%c    (Input) Equivalent to %a %b %e %T %Y; (Output) the date and time in the standard representation for the current locale
%C    Equivalent to %a %b %e %T %Z %Y
%d    The day of the month, using two digits (01, 02, ..., 31)
%D    Equivalent to %m/%d/%y
%e    The day of the month, with single digits preceded by a space ( 1,  2, ..., 31)
%h    Equivalent to %b
%H    The hour of the day, using the 24-hour clock (00, 01, ..., 23)
%I    The hour of the day, using the 12-hour clock (01, 02, ..., 12)
%j    The day of the year, using three digits (Julian dates) (001, 002, ..., 366)
%m    The month number, using two digits (01, 02, ..., 12)
%M    The minute of the hour, using two digits (00, 01, ..., 59)
%p    AM or PM, according to the time of day
%r    Equivalent to %I:%M:%S %p
%R    Equivalent to %H:%M
%S    The second of the minute, using two digits (00, 01, ..., 61)
%T    Equivalent to %H:%M:%S
%x    The date formatted according to the x ordering preference
%X    Equivalent to %x
%y    The year of the century, using two digits (00, 01, ..., 99)
%Y    The year, using four digits
%Z    The name of the current locale's time zone

• To use %D or %y, you must also specify either a pivot year or a maximum future year (but not both).

If you set a pivot equal to a valid year Y, Spectrum Miner interprets two-digit years as years between Y - 99 and Y. For example, if you set the preference to 2077, Spectrum Miner interprets the two-digit years 76, 77, and 78 as 2076, 2077, and 1978 respectively.

If you set a maximum future year equal to a positive value n and the current year is Y, Spectrum Miner interprets all two-digit years as years between Y + n - 99 and Y + n. For example, if you set the preference to 50 and the current year is 2002, Spectrum Miner interprets the two-digit years 51, 52, and 53 as 2051, 2052, and 1953 respectively.

• If you set the write in locale-specific format preference, the codes %a, %A, %b, %B, %C, %p, %x, %X, and %Z may produce different results when displaying dates (but not when interpreting them).

• As well as /, any other character (except for the field separator or null marker) may be used as a separator within the date.

• For a date format that lacks an explicit day or month component, you can specify a default by using the "day" or "month" qualifier.

• The special character codes are those of the standard C library function strftime.
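The pivot-year and maximum-future-year rules described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not Spectrum Miner code; the function names expand_with_pivot and expand_with_max_future are hypothetical.

```python
def expand_with_pivot(two_digit_year, pivot):
    """Map a two-digit year into the 100-year window [pivot - 99, pivot]."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    century = pivot - pivot % 100          # e.g. 2000 for a pivot of 2077
    candidate = century + two_digit_year
    # Years that would fall after the pivot belong to the previous century.
    return candidate if candidate <= pivot else candidate - 100


def expand_with_max_future(two_digit_year, max_future, current_year):
    """A maximum future year n behaves like a pivot of current_year + n."""
    return expand_with_pivot(two_digit_year, current_year + max_future)


# The two worked examples from the text above.
print([expand_with_pivot(y, 2077) for y in (76, 77, 78)])
print([expand_with_max_future(y, 50, 2002) for y in (51, 52, 53)])
```

Treating the maximum future year as a pivot that moves with the current year keeps both rules on one code path, which is why the text allows only one of the two settings at a time.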

Example R–1: Date Formatting Strings

Date format string    Date examples
European    02/04/2003; 27/10/1940
American    04/02/2003; 10/27/1940
YMD    2003/04/02; 1940/10/27
%d-%b-%Y    02-Apr-2003; 27-Oct-1940
((%d %b %Y))    02 Apr 2003; 27 Oct 1940
((%Y-%m-%d %H:%M))    2003-04-02 13:25; 1940-10-27 08:23
%Y-%m-%d:%H:%M    2003-04-02:13:25; 1940-10-27:08:23
((%e %b %Y))    2 Apr 2003; 27 Oct 1940
((%B %e, %Y))    April 2, 2003
((%d/%m/%Y %H:%M))    02/04/2003 13:25; 27/10/1940 08:23
((%Y-%m-%d %H:%M:%S%*))    2003-04-02 13:25:15000000
((%B %d, %Y))    April 02, 2003
%e-%b-%Y    2-Apr-2003; 27-Oct-1940
(European pivot=2010)    02/04/03; 27/10/40
(American pivot=2010)    04/02/03; 10/27/40
(YMD pivot=2010)    03/04/02; 40/10/27
(%d-%b-%y pivot=2010)    02-Apr-03; 27-Oct-40
((%B %d, %y) pivot=2010)    April 02, 03
(%e-%b-%y pivot=2010)    2-Apr-03; 27-Oct-40
((%B %e, %y) pivot=2010)    April 2, 03
((%I:%M %p))    01:25 PM; 08:23 AM
%H:%M    13:25

5 - Commands for processing foci

In this section

qssort 59
qsderive 60
qsmeasure 63
qstrack 67
qsselect 70
qsrenamefields 73
qsexportmetadata 76
qsimportmetadata 76
qsupdate 81

qssort

Synopsis qssort

-input <source focus> [-subfocus <subfocus>]

{-keys <key field> [, <key field>] [, <key field>] | -key @<key file>}

qssort

-input <source focus> [-subfocus <subfocus>]

-check

{-keys <key field> [, <key field>] [, <key field>] | -key @<key file>}

Description: sort the records of the source focus by the key fields, to produce the destination focus.

Alternatively, check that the source focus is already sorted.

Optional arguments

Option    Effect

Besides this command-line option, qssort accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note:
• You can use at most three key fields with qssort.
• Spectrum Miner collation order, as used by qssort, may differ in detail from the collation order used by databases or third-party file-sorting utilities. Where it is important for a focus to be sorted, you should check the sort order, and if necessary re-sort, using qssort.
• It is, in general, much faster to check the sort order of a focus than to sort it.
• It is, in general, slightly faster to sort a focus if it is already nearly sorted.
• Even when the focus (or subfocus) contains only a selection of records, the destination focus contains all the records from the source focus.
• If you use the -verbose option, qssort reports the number of unique key values, and the number of records containing the most frequent duplicate key value.
• The command qssort requires temporary disk space comparable to the size of the source focus as well as space for the destination focus.
• The destination focus is independent of the source focus, that is, it does not share underlying data.
• Using Spectrum Miner, you can access most of the functionality of qssort through the Sort Focus wizard.
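To see why checking the sort order is generally cheaper than sorting, consider this Python sketch. It is a conceptual illustration only: the function names are hypothetical, records are modeled as dictionaries, and Python's comparison order stands in for the Spectrum Miner collation order, which may differ.

```python
from collections import Counter


def key_of(record, keys):
    """Build the composite sort key for one record."""
    return tuple(record[k] for k in keys)


def is_sorted(records, keys):
    """Single linear pass: every key must be >= the key before it."""
    previous = None
    for record in records:
        current = key_of(record, keys)
        if previous is not None and current < previous:
            return False
        previous = current
    return True


def key_statistics(records, keys):
    """Return (number of unique keys, size of the largest duplicate group)."""
    counts = Counter(key_of(r, keys) for r in records)
    return len(counts), max(counts.values(), default=0)


records = [{"CustomerID": 1}, {"CustomerID": 1}, {"CustomerID": 2}]
print(is_sorted(records, ["CustomerID"]))
print(key_statistics(records, ["CustomerID"]))
```

The check reads each record once and can stop at the first out-of-order pair, whereas a full sort needs more comparisons and extra working space. The key_statistics function mirrors the two figures that the -verbose note above describes.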

Examples

Check that the RetailCustApril.ftr focus is sorted by the CustomerID field:

qssort -check -input RetailCustApril.ftr -keys CustomerID

Create a new focus, RetailTransAprilSorted.ftr, containing all the records from the RetailTransApril.ftr focus, sorted by the CustomerID field:

qssort -output RetailTransAprilSorted.ftr
    -input RetailTransApril.ftr -keys CustomerID

qsderive

Synopsis qsderive -derivations <derivations file>

[-macro <name>=<value>[-macro <name>=<value> ...] | -macro @<name>=<macrofile>]

[-output <destination focus>] [-force]

[-random <integer seed>]

[-warn] [-savexml]

Description: copy all fields (by default) from the source focus to the destination focus; derive fields in the destination focus according to the field definitions in the derivations file, which contains create statements or corresponding XML representations. By default, append the derived fields to the source focus.

Optional arguments

Command    Purpose
-fields <field> [, <field> ...]    Derive only the specified fields.
-fields @<fields file>    Derive only fields that are listed (one per line) in the fields file.
-macro <name>=<value>    See The -macro command-line option on page 21.
-macro @<macros file>    See The -macro command-line option on page 21.
-output <destination focus>    Instead of appending derived fields to the source focus, copy the source focus to the destination focus and append the derived fields to the destination focus.
-random <integer seed>    Use the integer seed, instead of 0, for any FDL random-number functions occurring in derivation expressions.
-savexml    In addition to deriving one or more fields, write the derivation expression(s) in TML and XML formats to files <output>.tml and <output>.xml, in the same directory as the destination focus, where <output> is the basename of the destination focus.
-warn    If a field derivation fails, display a warning and continue to derive other fields (instead of stopping).
-xfields <field> [, <field> ...]    Derive all fields in the derivations file with the exception of the specified fields.
-xfields @<fields file>    Derive all fields in the derivations file with the exception of the fields that are listed (one per line) in the fields file.

Besides these command-line options, qsderive accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note:
• The order in which derived fields appear in the destination focus is the order in which they appear in the derivations file.
• If you specify a destination focus, it shares underlying data with the source focus.
• XML for qsderive takes the form of a <derivations> element, containing <field> elements, each of which must in turn contain an FDL expression in an <fdl> element. Each <fdl> element optionally includes an integer seed attribute to specify the random-number seed for that FDL expression.
• You can use the XML file saved by qsderive using the -savexml option as a derivations file.
• Using Spectrum Miner, you can access most of the functionality of qsderive through the Derive Fields wizard.
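If you generate derivations files from a script, the XML structure described in the notes above can be assembled with a standard XML library. The following Python sketch is one possible approach, not part of Spectrum Miner: the function name build_derivations_xml is hypothetical, and the namespace URI is the one that appears in the XML examples in this guide.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the XML examples in this guide.
NAMESPACE = "http://www.quadstone.com/xml"


def build_derivations_xml(fields):
    """Build a <derivations> document.

    fields is an iterable of (name, fdl_expression, seed_or_None) tuples.
    """
    ET.register_namespace("", NAMESPACE)  # emit xmlns="..." without a prefix
    root = ET.Element(f"{{{NAMESPACE}}}derivations")
    for name, expression, seed in fields:
        field = ET.SubElement(root, f"{{{NAMESPACE}}}field", {"name": name})
        fdl = ET.SubElement(field, f"{{{NAMESPACE}}}fdl")
        if seed is not None:
            fdl.set("seed", str(seed))    # optional per-expression seed
        fdl.text = expression             # special characters are escaped
    return ET.tostring(root, encoding="unicode")


print(build_derivations_xml([
    ("TransMonth", "month(PurchaseDate)", None),
    ("Random", "rndUniform()", 123),
]))
```

Building the document through a library rather than by string concatenation means that characters such as < in an FDL expression are escaped automatically.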

Examples

Given the derivations file, derivations-purchases.tml, containing the following:

create TransMonth := month(PurchaseDate);
create CashRecd := if PaymentMethod = "CA" then Amount else 0;

apply these derivations to the RetailTransApril.ftr focus, to create a new focus, RetailTransAprilDeriv.ftr, containing all the fields of the RetailTransApril.ftr focus, plus two derived fields:

qsderive -derivations derivations-purchases.tml
    -input RetailTransApril.ftr -output RetailTransAprilDeriv.ftr

Apply all but the first of these derivations to the focus RetailTransApril.ftr, to create a new focus, RetailTransAprilDerivAllButFirstOne.ftr, containing all the fields of the RetailTransApril.ftr focus plus all but the first derived field:

qsderive -xfields TransMonth -derivations derivations-purchases.tml
    -input RetailTransApril.ftr
    -output RetailTransAprilDerivAllButFirstOne.ftr

Create a new focus, RetailCustAprilScored.ftr, containing all the fields of the RetailCustApril.ftr focus, plus the score fields created in Decision Studio's Scorecard Builder and saved as a QMML file, derivations-score.qmml:

qsderive -derivations derivations-score.qmml
    -input RetailCustApril.ftr
    -output RetailCustAprilScored.ftr

Given the XML derivations file, derivations-purchases.xml, containing the following:

<?xml version="1.0" encoding="UTF-8"?>
<derivations xmlns="http://www.quadstone.com/xml">
  <field name="TransMonth">
    <fdl>month(PurchaseDate)</fdl>
  </field>
  <field name="CashRecd">
    <fdl>if PaymentMethod = "CA" then Amount else 0</fdl>
  </field>
  <field name="Random">
    <fdl seed="123">rndUniform()</fdl>
  </field>
</derivations>

apply these derivations to the focus RetailTransApril.ftr, to create a new focus, RetailTransAprilDeriv.ftr, containing all the fields of the focus RetailTransApril.ftr plus three derived fields:

qsderive -derivations derivations-purchases.xml
    -input RetailTransApril.ftr -output RetailTransAprilDeriv.ftr

See also

Derivation specification for qsderive on page 368

qstrack on page 67

qsmeasure

Synopsis qsmeasure

-aggregations <aggregations file> [, <aggregations file> ...]

-input <source focus>

[-statistics <statistics file> [, <statistics file> ...]]

[-library <FDL functions file> [, <FDL functions file> ...]]

[-macro <name>=<value> [-macro <name>=<value> ...] | -macro @<name>=<macrofile>]

Description: aggregate records in the source focus, to produce fields in the destination focus according to the field definitions in the aggregations files. The aggregations files contain create statements, or corresponding XML representations. Use the key fields to identify groups of records for aggregation [see Using aggregation functions and the where and default clauses on page 148].

Optional arguments

Option    Effect
-fields <field> [, <field> ...]    Consider only the specified fields from the source focus (to optimize performance of the command by avoiding consideration of unused fields).
-fields @<fields file>    Consider only fields from the source focus that are listed (one per line) in the fields file.
-library <FDL functions file> [, <FDL functions file> ...]    Include FDL function definitions [see User-defined functions on page 190]. Expressions in the aggregations files may involve these functions.
-statistics <statistics file> [, <statistics file> ...]    Compute statistics for the input focus according to the definitions in the statistics files, which contain calculate statements [see Evaluating focus statistics: the calculate statement on page 153] or corresponding XML representations. Expressions in the aggregations files may refer to these statistics.
-tags <tag> [, <tag> ...]    Consider only the fields from the source focus that have the specified tags (to optimize performance of the command by avoiding consideration of unused fields).
-xfields <field> [, <field> ...]    Consider all fields from the source focus, except the specified fields (to optimize performance of the command by avoiding consideration of unused fields).
-xfields @<fields file>    Consider all fields from the source focus, except the fields that are listed (one per line) in the fields file.
-xtags <tag> [, <tag> ...]    Consider all fields from the source focus, except the fields that have the specified tags (to optimize performance of the command by avoiding consideration of unused fields).

Besides these command-line options, qsmeasure accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note:
• The source focus must be sorted by the key fields [see qssort on page 59]. However, qsmeasure does not check that this is the case, and does not issue a warning or error if the focus is not sorted correctly.
• The destination focus contains key fields corresponding to the key fields in the source focus, followed by the aggregated fields (in the same order in which they appear in the aggregations file), followed by any derived fields (again, in the same order in which they appear in the aggregations file). To force an aggregated field to appear after a derived field in the destination focus, you can use a temporary statement to create the aggregated field, and then derive a field from that.
• The destination focus is independent of the source focus, that is, it does not share underlying data.
• To compute aggregations involving the states between transactions rather than just the values in the transaction records themselves, derive suitable intermediate fields using qstrack before using qsmeasure.
• XML for qsmeasure takes the form of an <aggregations> element, containing <field context="aggregation"> elements, each of which must in turn contain a TML expression in an <fdl> element. You can use a <by> element to specify groupings, and a <where> element to filter the input records.
• Using Spectrum Miner, you can access most of the functionality of qsmeasure through the Aggregate Records wizard.
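The requirement that the source focus be sorted by the key fields can be illustrated with a streaming group-by in Python. This is a conceptual sketch only: the function name aggregate_by_key is hypothetical, records are modeled as dictionaries, and the guide does not state how qsmeasure itself behaves on unsorted input beyond the fact that it does not check.

```python
from itertools import groupby
from statistics import mean


def aggregate_by_key(records, key):
    """Aggregate each run of consecutive records that share a key value."""
    results = []
    for key_value, group in groupby(records, key=lambda r: r[key]):
        amounts = [r["Amount"] for r in group]
        results.append({
            key: key_value,
            "numberPurchases": len(amounts),
            "averagePurchase": mean(amounts),
        })
    return results


sorted_records = [
    {"CustomerID": "A", "Amount": 10},
    {"CustomerID": "A", "Amount": 30},
    {"CustomerID": "B", "Amount": 5},
]
unsorted_records = [sorted_records[0], sorted_records[2], sorted_records[1]]

print(aggregate_by_key(sorted_records, "CustomerID"))    # one row per customer
print(aggregate_by_key(unsorted_records, "CustomerID"))  # customer A is split
```

A streaming aggregator of this kind only ever sees neighbouring records, so an unsorted input silently yields several partial rows for the same key. Checking the input with qssort -check first avoids that situation.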

Examples

Given the aggregations file, aggregations-purchases.tml, containing the following:

create numberPurchases := count();
create averagePurchase := mean(Amount);
create totalPointsRedeemed := sum(PointsRedeemed);
create averagePointsPerPurchase := totalPointsRedeemed / numberPurchases;

apply these aggregations and the derivation to the RetailTransAprilSorted.ftr focus to create the RetailAggregationsApril.ftr focus containing the fields CustomerID, numberPurchases, averagePurchase, totalPointsRedeemed and averagePointsPerPurchase:

qsmeasure -aggregations aggregations-purchases.tml
    -input RetailTransAprilSorted.ftr
    -output RetailAggregationsApril.ftr
    -keys CustomerID

Given the FDL functions file, fdl-functions-storeSplit.fdl, containing the following:

function StoreSplitFunction( Store )
[
    element_names = "0,800,600,700,300,400,100,500,900,Other"
]
{
    case
        Store = "0000" : 1;
        Store = "0800" : 2;
        Store = "0600" : 3;
        Store = "0700" : 4;
        Store = "0300" : 5;
        Store = "0400" : 6;
        Store = "0100" : 7;
        Store = "0500" : 8;
        Store = "0900" : 9;
        default : 10;
}

Given the aggregations file, aggregations-mostCommon.tml, containing the following:

create mostCommonStore := mode(Store);
create averageSpendInStore_ := mean(Amount) by StoreSplitFunction(Store);

apply these aggregations to the RetailTransAprilSorted.ftr focus, to create the RetailStoreSplitsApril.ftr focus, containing the fields CustomerID, mostCommonStore, averageSpendInStore_0, averageSpendInStore_800, averageSpendInStore_600, averageSpendInStore_700, averageSpendInStore_300, averageSpendInStore_400, averageSpendInStore_100, averageSpendInStore_500, averageSpendInStore_900 and averageSpendInStore_Other:

qsmeasure -library fdl-functions-storeSplit.fdl
    -aggregations aggregations-mostCommon.tml
    -input RetailTransAprilSorted.ftr
    -output RetailStoreSplitsApril.ftr -keys CustomerID

Alternatively, given the XML aggregations file, aggregations-mostCommon.xml, containing the following:

<?xml version="1.0" encoding="UTF-8"?>
<aggregations xmlns="http://www.quadstone.com/xml">
  <field name="mostCommonStore" context="aggregation">
    <fdl>mode(Store)</fdl>
  </field>
  <field name="averageSpendInStore_" context="aggregation">
    <fdl>mean(Amount)</fdl>
    <by>StoreSplitFunction(Store)</by>
  </field>
</aggregations>

apply these aggregations to the focus RetailTransAprilSorted.ftr, to create the focus RetailStoreSplitsApril.ftr, containing the same fields as the example above:

qsmeasure -library fdl-functions-storeSplit.fdl
    -aggregations aggregations-mostCommon.xml
    -input RetailTransAprilSorted.ftr
    -output RetailStoreSplitsApril.ftr -keys CustomerID

Given the statistics file, statistics-amount.tml, containing the following:

calculate averageAmount := mean(Amount);

and given the aggregations file, aggregations-statistic.tml, containing the following:

create averageSpend := mean(Amount);
create bigSpender := averageSpend > STATISTIC.averageAmount;

apply this aggregation and derivation to the RetailTransAprilSorted.ftr focus to create the RetailRankApril1.ftr focus, containing the fields CustomerID, averageSpend, and bigSpender:

qsmeasure -statistics statistics-amount.tml
    -aggregations aggregations-statistic.tml
    -input RetailTransAprilSorted.ftr
    -output RetailRankApril1.ftr
    -keys CustomerID

Alternatively, speed up the aggregation by importing only the fields CustomerID and Amount from the RetailTransAprilSorted.ftr focus:

qsmeasure -fields "CustomerID, Amount"
    -statistics statistics-amount.tml
    -aggregations aggregations-statistic.tml
    -input RetailTransAprilSorted.ftr
    -output RetailRankApril2.ftr
    -keys CustomerID

See also

Aggregation specification for qsmeasure on page 366

qstrack

Synopsis qstrack

-trackers <trackers file> [, <trackers file> ...]

-input <source focus>

-output <destination focus> [-force]

-key <key field> | -key @<keyfile>

[-statistics <statistics file> [, <statistics file> ...]]

[-library <FDL functions file> [, <FDL functions file> ...]]

[-macro <name>=<value> [-macro <name>=<value> ...] | -macro @<name>=<macrofile>]

[-fields <field> [, <field> ...] | -fields @<fields file> | -tags <tag> [, <tag> ...]]

[-xfields <field> [, <field> ...] | -xfields @<fields file> | -xtags <tag> [, <tag> ...]]

Description: copy all fields (by default) from the source focus to the destination focus; derive fields in the destination focus according to the field definitions in the trackers file, which contains create statements or corresponding XML representations. These definitions typically involve state variables [see Variables on page 187]. Use the key field to identify groups of records for tracking state: qstrack resets all state variables at the start of a new group.

Optional arguments

Option    Effect
-fields <field> [, <field> ...]    Consider only the specified fields from the source focus (to optimize performance of the command by avoiding consideration of unused fields).
-fields @<fields file>    Consider only fields from the source focus that are listed (one per line) in the fields file.
-library <FDL functions file> [, <FDL functions file> ...]    Include FDL function definitions [see User-defined functions on page 190]. Expressions in the trackers files may involve these functions.
-statistics <statistics file> [, <statistics file> ...]    Compute statistics for the input focus according to the definitions in the statistics files, which contain calculate statements [see Evaluating focus statistics: the calculate statement on page 153] or corresponding XML representations. Expressions in the trackers files may refer to these statistics.
-tags <tag> [, <tag> ...]    Consider only the fields from the source focus that have the specified tags (to optimize performance of the command by avoiding consideration of unused fields).
-xfields <field> [, <field> ...]    Consider all fields from the source focus, except the specified fields (to optimize performance of the command by avoiding consideration of unused fields).
-xfields @<fields file>    Consider all fields from the source focus, except the fields that are listed (one per line) in the fields file.
-xtags <tag> [, <tag> ...]    Consider all fields from the source focus, except the fields that have the specified tags (to optimize performance of the command by avoiding consideration of unused fields).

Besides these command-line options, qstrack accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note:
• The order in which derived fields appear in the destination focus is the order in which they appear in the trackers file.
• The source focus must be sorted by the key field [see qssort on page 59]. However, qstrack does not check that this is the case, and does not issue a warning or error if the focus is not sorted correctly.
• The destination focus is independent of the source focus, that is, it does not share underlying data.
• XML for qstrack takes the form of a <trackers> element, containing <field context="tracker"> elements, each of which must in turn contain an FDL expression in an <fdl> element.
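The idea of state that carries over from record to record, but is reset at the start of each key group, can be sketched in Python. This is an illustration of the concept rather than of qstrack's implementation: the function name track_monthly_spend is hypothetical, and the field names PurchaseDate and Amount are borrowed from the retail examples in this guide.

```python
from datetime import date


def track_monthly_spend(records, key):
    """Yield each record with a running per-month spend.

    State is reset whenever the key value changes, so the input must
    already be sorted by the key.
    """
    current_key = object()   # sentinel that never equals a real key value
    current_period = None
    month_spend = 0
    for record in records:
        if record[key] != current_key:
            # Start of a new group: reset all state.
            current_key = record[key]
            current_period = None
            month_spend = 0
        period = (record["PurchaseDate"].year, record["PurchaseDate"].month)
        if period == current_period:
            month_spend += record["Amount"]
        else:
            month_spend = record["Amount"]
        current_period = period
        yield {**record, "monthlySpend": month_spend}


transactions = [
    {"CustomerID": "A", "PurchaseDate": date(2003, 4, 1), "Amount": 10},
    {"CustomerID": "A", "PurchaseDate": date(2003, 4, 5), "Amount": 20},
    {"CustomerID": "A", "PurchaseDate": date(2003, 5, 1), "Amount": 7},
    {"CustomerID": "B", "PurchaseDate": date(2003, 5, 2), "Amount": 3},
]
print([row["monthlySpend"] for row in track_monthly_spend(transactions, "CustomerID")])
```

Without the reset, the first May purchase of customer B would be added to the May total of customer A, which is the kind of error that the per-group reset prevents.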

Examples

Given the trackers file, trackers-monthlySpend.tml, containing the following:

create monthlySpend := (
    state currentMonth := null;
    state currentYear := null;
    state monthSpend := 0;
    if (currentMonth = month(PurchaseDate) and currentYear = year(PurchaseDate))
    then (monthSpend := monthSpend + Amount)
    else (monthSpend := Amount);
    currentMonth := month(PurchaseDate);
    currentYear := year(PurchaseDate);
    monthSpend;
);

apply these derivations to the focus RetailTransAprilSorted.ftr to create a new focus, RetailTransAprilRunningSpend.ftr:

qstrack -trackers trackers-monthlySpend.tml -key CustomerID
    -input RetailTransAprilSorted.ftr
    -output RetailTransAprilRunningSpend.ftr

Alternatively, given the XML trackers file, trackers-monthlySpend.xml, containing the following:

<?xml version="1.0" encoding="UTF-8"?>
<trackers xmlns="http://www.quadstone.com/xml">
  <field name="monthlySpend" context="tracker">
    <fdl>
      state currentMonth := null;
      state currentYear := null;
      state monthSpend := null;
      if (currentMonth = month(PurchaseDate) and currentYear = year(PurchaseDate))
      then (monthSpend := monthSpend + Amount)
      else (monthSpend := Amount);
      currentMonth := month(PurchaseDate);
      currentYear := year(PurchaseDate);
      monthSpend;
    </fdl>
  </field>
</trackers>

apply these derivations to the focus RetailTransAprilSorted.ftr to create a new focus, RetailTransAprilRunningSpend.ftr:

qstrack -trackers trackers-monthlySpend.xml -key CustomerID
    -input RetailTransAprilSorted.ftr
    -output RetailTransAprilRunningSpend.ftr

See also

qsderive on page 60

qsselect

Synopsis qsselect -selections <selections file>

[-macro <name>=<value> [-macro <name>=<value> ...] | -macro @<name>=<macrofile>]

[-output <destination focus>] [-force] [-random <integer seed>] [-savexml] [-selection <field>]

Description: copy all fields (by default) from the source focus to the destination focus; derive a numeric field in the destination focus according to the first field definition (by default) in the selections file, which is a file containing create statements or a corresponding XML file. By default, append the derived field to the source focus.

Apply a record selection to the new field, selecting only records with the value 1.

Optional arguments

Option    Effect
-output <destination focus>    Instead of appending derived fields to the source focus, copy the source focus to the destination focus and append the derived fields to the destination focus.
-random <integer seed>    Use the integer seed, instead of 0, for any FDL random-number functions occurring in derivation expressions.
-savexml    In addition to deriving one or more fields and applying a selection, write the derivation expression(s) from the selections file in TML and XML formats to files <output>.tml and <output>.xml, in the same directory as the destination focus, where <output> is the basename of the destination focus.
-selection <field>    Instead of using the first field definition in the selections file, use the definition of the specified field.

Besides these command-line options, qsselect accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note:
• Typically, the derivation expressions in the selections file are applications of relational or logical operators, which produce the values 1 for "true" and 0 for "false." In such cases, qsselect selects those records for which the logical expression is true.
• If you specify a destination focus, it shares underlying data with the source focus.
• XML for qsselect takes the form of a <selections> element, containing <field context="selection"> elements, each of which must in turn contain an FDL expression in an <fdl> element.
• You can use the XML file saved by qsselect using the -savexml option as a selections file.
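The effect of fixing the random-number seed, as the -random option and the seed attribute do, can be demonstrated with a small Python sketch. It uses Python's own generator, not the FDL rndUniform() function, so the particular records chosen will not match those chosen by qsselect; only the reproducibility principle carries over. The function name select_sample is hypothetical.

```python
import random


def select_sample(records, fraction, seed):
    """Keep each record with probability `fraction`, reproducibly."""
    rng = random.Random(seed)  # private generator with a known seed
    return [record for record in records if rng.random() < fraction]


records = list(range(1000))
first = select_sample(records, 0.1, 12345678)
second = select_sample(records, 0.1, 12345678)
print(first == second)  # the same seed selects the same records
```

Because each record draws one number from the generator in input order, the selection is repeatable only if the records are also presented in the same order on every run.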

Examples

Given the selections file, selections-sixmonth.tml, containing the following:

create lastSixMonths := countwholemonths(PurchaseDate, #1999/07/01) < 6;

apply this selection to the RetailTransApril.ftr focus:

qsselect -selections selections-sixmonth.tml
    -input RetailTransApril.ftr
    -output RetailTransApril6Months.ftr

Given the selections file, selections-random.tml, containing the following:

create random := rndUniform() < 0.1;

apply this selection to select 10% of the records in the RetailTransApril.ftr focus, with a known seed (12345678), to allow the same series of records to be selected on subsequent applications:

qsselect -selections selections-random.tml -random 12345678
    -input RetailTransApril.ftr
    -output RetailTransAprilSampleSelect.ftr

Alternatively, given the XML selections file, selections-random.xml, containing the following:

<?xml version="1.0" encoding="UTF-8"?>
<selections xmlns="http://www.quadstone.com/xml">
  <field name="random" context="selection">
    <fdl seed="12345678">rndUniform() &lt; 0.1</fdl>
  </field>
</selections>

apply this selection to select the same 10% of the records in the RetailTransApril.ftr focus, using the same seed:

qsselect -selections selections-random.xml
    -input RetailTransApril.ftr
    -output RetailTransAprilSampleSelect.ftr

See also

Selection specification for qsselect on page 369

qsrenamefields

Synopsis qsrenamefields

{-map <old name>=<new name> | -map @<input mapping file>} [{-map <old name>=<new name> | -map @<input mapping file>} ...]

[-invert]

[-mapping <output mapping file>]

qsrenamefields

[-map QSCompliant]

[-invert]

[-mapping <output mapping file>]

Description: rename fields in the source focus, applying <old name>=<new name> mappings and mappings in input mapping files.

Alternatively, rename fields in the source focus to unique, Spectrum Miner-compliant equivalents, using a built-in algorithm.

Optional arguments

Command    Purpose
-invert    Reverse the sense of the field-name mappings.
-mapping <output mapping file>    Create an XML mapping file, describing the actual mapping used from old to new field names. You can use this file as an input mapping file for qsrenamefields. In conjunction with the -invert option, you can then use a reverse field-name mapping even in the case where the original field-name mapping was algorithmic (and not necessarily one-to-one).
-output <destination focus>    Instead of directly renaming the fields in the source focus, copy the source focus to the destination focus and rename the fields in the destination focus.

Besides these command-line options, qsrenamefields accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note:
• Omitting the -map argument altogether has the same effect as specifying -map QSCompliant.
• An input mapping file is either in XML format or contains <old name>=<new name> pairs on successive lines of the file.
• If you use more than one -map argument, qsrenamefields applies the field-name mappings in the order that they appear on the command line.
• If you try to rename fields so that more than one field has the same name, qsrenamefields warns you and does nothing.
• If you specify a destination focus, it shares underlying data with the source focus.
• You cannot rename a field more than once in a single invocation of the command.
• If the source focus includes subfoci, qsrenamefields renames fields across the entire hierarchy of subfoci.
• XML for qsrenamefields takes the form of a <mappingset> element containing <map> elements, each of which must contain a <name> and <alias> element.
• Given a focus produced by qsimportdb, which may include field names that are not Spectrum Miner-compliant, you can use qsrenamefields (with -map QSCompliant) to convert the focus into a form that you can use in Decision Studio or with other data-build commands.
• You can access some of the functionality of qsrenamefields through the Table Viewer in Decision Studio, or the Rename Fields dialog box available from Spectrum Miner.
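The mapping rules in the notes above (ordered mappings, optional inversion, no field renamed twice, and no two fields ending up with the same name) can be sketched in Python. This is a hypothetical model, not the command's implementation: the function name apply_mappings is invented for this example, and where qsrenamefields warns and does nothing, the sketch raises an exception instead.

```python
def apply_mappings(field_names, mappings, invert=False):
    """Return the renamed field list.

    mappings is a list of (old name, new name) pairs, applied in order.
    With invert=True, each pair is used in the reverse direction.
    """
    lookup = {}
    for old, new in mappings:
        if invert:
            old, new = new, old
        if old in lookup:
            raise ValueError(f"field {old!r} is renamed more than once")
        lookup[old] = new
    renamed = [lookup.get(name, name) for name in field_names]
    if len(set(renamed)) != len(renamed):
        raise ValueError("renaming would give more than one field the same name")
    return renamed


fields = ["CustomerID", "StartDate", "Age", "Gender"]
mappings = [("StartDate", "Initialized"), ("Age", "CurrentAge"), ("Gender", "Sex")]
renamed = apply_mappings(fields, mappings)
print(renamed)
print(apply_mappings(renamed, mappings, invert=True))  # restores the originals
```

Validating the whole mapping before changing anything matches the all-or-nothing behaviour that the notes describe for name collisions.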

Examples

Create a focus, RetailCustAprilRenamed1.ftr, with the StartDate field renamed as Initialized, Age renamed as CurrentAge, and Gender renamed as Sex:

qsrenamefields -map StartDate=Initialized
    -map Age=CurrentAge -map Gender=Sex
    -input RetailCustApril.ftr -output RetailCustAprilRenamed1.ftr

Alternatively, given the input mapping file, mapping-file.txt, with the following contents:

StartDate=Initialized
Age=CurrentAge
Gender=Sex

apply this mapping file to create the focus RetailCustAprilRenamed2.ftr, with field StartDate renamed as Initialized, Age renamed as CurrentAge, and Gender renamed as Sex:

qsrenamefields -map @mapping-file.txt
    -input RetailCustApril.ftr -output RetailCustAprilRenamed2.ftr

or, alternatively, given the XML input mapping file mapping-file.xml, with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<mappingset xmlns="http://www.quadstone.com/xml">
  <map>
    <name>StartDate</name>
    <alias>Initialized</alias>
  </map>
  <map>
    <name>Age</name>
    <alias>CurrentAge</alias>
  </map>
  <map>
    <name>Gender</name>
    <alias>Sex</alias>
  </map>
</mappingset>

apply this mapping file to create the focus RetailCustAprilRenamed2.ftr, with field StartDate renamed as Initialized, Age renamed as CurrentAge, and Gender renamed as Sex:

qsrenamefields -map @mapping-file.xml
    -input RetailCustApril.ftr -output RetailCustAprilRenamed2.ftr
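If the mapping is held in a script, an XML mapping file of the shape shown above could be generated rather than written by hand. The following Python sketch uses only the standard library; the element names and namespace are taken from the example above, while the function name `build_mappingset` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://www.quadstone.com/xml"  # namespace used in the example above


def build_mappingset(pairs):
    """Build a <mappingset> document from (old name, new name) pairs."""
    ET.register_namespace("", NS)  # serialize with a default namespace
    root = ET.Element(f"{{{NS}}}mappingset")
    for old, new in pairs:
        entry = ET.SubElement(root, f"{{{NS}}}map")
        ET.SubElement(entry, f"{{{NS}}}name").text = old
        ET.SubElement(entry, f"{{{NS}}}alias").text = new
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_mappingset([("StartDate", "Initialized"),
                            ("Age", "CurrentAge"),
                            ("Gender", "Sex")]))
```

The resulting text would be saved as, for example, mapping-file.xml and passed with -map @mapping-file.xml as in the command above.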

Create a focus, RetailCustAprilQSCompliant.ftr, with Spectrum Miner-compliant field names:

qsrenamefields
    -input RetailCustApril.ftr
    -output RetailCustAprilQSCompliant.ftr

See also

Field name mapping specification for qsrenamefields on page 374

qsexportmetadata

Synopsis qsexportmetadata -input <source focus> [-output <metadata file>]

Description: export metadata from the source focus, including focus history, binnings, comments, derivations, interpretations, record selections, subfocus structure, and default subfocus. By default, write the information to standard output.

Optional arguments

-output <metadata file>
    Create this metadata file instead of writing to standard output.

Besides this command-line option, qsexportmetadata accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • The metadata output by qsexportmetadata is in an XML-based file format [see Metadata specification for qsimportmetadata on page 357].

• Using Spectrum Miner, you can access the functionality of qsexportmetadata through the Export Metadata dialog box.

Examples

Create a new XML file, RetailTransApril.xml, that contains the metadata from the focus RetailTransApril.ftr:

qsexportmetadata -input RetailTransApril.ftr
    -output RetailTransApril.xml

See also

qsimportmetadata on page 76

qsimportmetadata

Synopsis qsimportmetadata

-input <source focus> -metadata <metadata file>

[-output <destination focus>] [-force]

[-details <kind> [, <kind> ...]]

[-fields <field> [, <field> ...] | -fields @<fields file>]

[-xfields <field> [, <field> ...] | -xfields @<fields file>]

[-dryrun] [-warn]

Description: import metadata, including binnings, comments, derivations, interpretations, record selections, subfocus structure and default subfocus, from the metadata file to the source focus.

Optional arguments

-details <kind> [, <kind> ...]
    Import the specified kinds of metadata, which can include binnings, comments, derivations, history, interpretations, selections, launch (default subfocus) and subfoci (subfocus structure). Categorical interpretations are classified as binnings, not interpretations. In the absence of this option, all kinds of metadata except history are imported.

-dryrun
    Do not import metadata, but display information about the metadata that would have been imported.

-fields <field> [, <field> ...]
    Instead of importing field metadata from all fields, import metadata from just the specified fields.

-fields @<fields file>
    Instead of importing field metadata from all fields, import metadata from just the fields listed (one per line) in the fields file.

-output <destination focus>
    Instead of directly applying the metadata to the source focus, copy the source focus to the destination focus and apply the metadata to the destination focus.

-warn
    Rather than aborting the operation if import of some metadata fails, issue a warning, and attempt to import the remaining metadata.

-xfields <field> [, <field> ...]
    Import field metadata from all fields except the specified fields.

-xfields @<fields file>
    Import field metadata from all fields except the fields that are listed (one per line) in the fields file.

Besides these command-line options, qsimportmetadata accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you import history, and the metadata file includes non-empty history metadata, qsimportmetadata replaces the history metadata in the focus; if the history metadata in the metadata file is missing or blank, qsimportmetadata does not replace the history metadata in the focus.

• If you specify a destination focus, it shares underlying data with the source focus.

• The metadata file uses an XML-based file format [see Metadata specification for qsimportmetadata on page 357]. Metadata XML for qsimportmetadata takes the form of a <metadata> element, containing <focus> elements, which can contain <comment>, <history>, nested <focus> and <field> elements. The <field> elements can contain <comment>, <binning>, <recordselection> and <fdl> elements.

• If you import a categorical binning to a field that does not have a categorical interpretation, qsimportmetadata first interprets the field as categorical, creating base categories from the values contained in the field (and a binning called "unnamed node"). If the base categories in the imported binning refer to values that are not referred to by the base categories of the field, qsimportmetadata adds these as base categories. Finally, it adds the categories (if any) from the imported binning directly below the base categories of the field (unless the imported binning has the same set of base categories as the field, in which case qsimportmetadata omits the first level of the imported binning, adding the other levels directly below the base categories of the field).

• If you import a categorical binning to a field that has a categorical interpretation, and the imported binning contains at least one <category> element, qsimportmetadata clears the categorical interpretation on the field (deleting all existing categorical binnings on the field) before it imports the binning. (If the imported binning contains no <category> elements, qsimportmetadata does nothing.)

• If you import a subfocus structure, the newly created subfocus automatically inherits field attributes (such as analysis candidates and binnings) from the parent subfocus. This occurs even if you do not explicitly specify the attributes in the subfocus definition, because the attributes are set in the process of applying the subfocus structure.

• A subfocus defined in the metadata file takes precedence over a subfocus of the same name in the source file, so any existing metadata associated with such a subfocus is overwritten.

• Using Spectrum Miner, you can access most of the functionality of qsimportmetadata through the Import Metadata to Focus dialog box.

Examples

Given the focus metadata file metadata-fieldfocus.xml, with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://www.quadstone.com/xml">
  <focus>
    <comment>This focus was created for the Retail Analysis project.</comment>
    <field name="Age">
      <comment>Imported from database field DOB in CUSTINFO.</comment>
    </field>
    <field name="Gender">
      <comment>Imported from CUSTINFO:SEX, mapping 1="M" and 2="F".</comment>
    </field>
  </focus>
</metadata>

import this metadata file to create a focus RetailCustAprilCommented2.ftr, containing field comments and a focus comment:

qsimportmetadata -metadata metadata-fieldfocus.xml
    -input RetailCustApril.ftr
    -output RetailCustAprilCommented2.ftr

Alternatively, import the same metadata file to create a focus RetailCustAprilCommented3.ftr, with a field comment on the field Age only:

qsimportmetadata -fields Age -metadata metadata-fieldfocus.xml
    -input RetailCustApril.ftr
    -output RetailCustAprilCommented3.ftr
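Before importing, it can be useful to see which fields a metadata file actually carries comments for, for example to decide what to pass to -fields. The Python sketch below reads a document shaped like metadata-fieldfocus.xml with the standard library. It looks only at <field> elements directly under a top-level <focus> (nested <focus> elements are not handled), and the function name `field_comments` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"qs": "http://www.quadstone.com/xml"}  # namespace from the example file

EXAMPLE = """<metadata xmlns="http://www.quadstone.com/xml">
  <focus>
    <comment>This focus was created for the Retail Analysis project.</comment>
    <field name="Age">
      <comment>Imported from database field DOB in CUSTINFO.</comment>
    </field>
    <field name="Gender">
      <comment>Imported from CUSTINFO:SEX, mapping 1="M" and 2="F".</comment>
    </field>
  </focus>
</metadata>"""


def field_comments(metadata_xml):
    """Map each field name to its comment text (whitespace normalized)."""
    root = ET.fromstring(metadata_xml)
    comments = {}
    for field in root.iterfind("qs:focus/qs:field", NS):
        comment = field.find("qs:comment", NS)
        if comment is not None:
            text = "".join(comment.itertext())
            comments[field.get("name")] = " ".join(text.split())
    return comments


if __name__ == "__main__":
    for name, text in field_comments(EXAMPLE).items():
        print(f"{name}: {text}")
```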

Given the focus metadata file metadata-focushtml.xml, with the following contents:

<div>This focus was created for the Retail Analysis project.<br/>
An audit is available on the
<a href="http://intranet.company.com/audits">intranet</a>.
</div></comment>
</focus></metadata>

import this metadata file to create a focus RetailCustAprilCommented4.ftr, containing a focus comment that has HTML formatting:

qsimportmetadata -metadata metadata-focushtml.xml
    -input RetailCustApril.ftr
    -output RetailCustAprilCommented4.ftr

Given the focus metadata file metadata-catbinning1.xml, with the following contents:

<category name="Single" levelname="Summarized"><category name="Single" value="1"

levelname="Detail"/></category>

</category><category name="Other">

</categories></categorical>

</binning></field>

</metadata>

import this metadata file to create a focus Lion1.ftr, in which MaritalStatus has a categorical binning with two levels (one of which names the base categories, and the other of which merges three categories):

qsimportmetadata -metadata metadata-catbinning1.xml
    -input Lion.ftr -output Lion1.ftr

Given the focus metadata file metadata-catbinning2.xml, with the following contents:

</binning></field>

</metadata>

import this metadata file to create a focus Lion2.ftr, in which SRVOverallSatisfaction has a categorical binning with a particular ordering of the categories, from "Very Dissatisfied" through "Very Satisfied":

qsimportmetadata -metadata metadata-catbinning2.xml
    -input Lion1.ftr -output Lion2.ftr

See also

qsexportmetadata on page 76

qsupdate

Synopsis qsupdate -from <template focus> -to <destination focus> [-force]

Description: apply the focus metadata in the template focus to the destination focus, including definitions of subfoci and derived fields, as well as any field interpretations, binnings and record selections.

Optional arguments

-force
    See The -force command-line option on page 20.

Besides this command-line option, qsupdate accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • The destination focus cannot have any existing field interpretations, binnings, or record selections.

• If the template focus and destination focus have differing field names or datatypes, qsupdate may be unable to copy some aspects of the template focus's metadata.

• This data-build command is deprecated: use qsexportmetadata and qsimportmetadata instead.

Example

Apply derivations, interpretations, binnings, and record selections from the RetailCustApril.ftr preprocessed focus to the RetailCustMay.ftr newly-imported focus:

qslink -from RetailCustMay.ftr -to RetailCustMayUpdated.ftr
qsupdate -from RetailCustApril.ftr -to RetailCustMayUpdated.ftr
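Because qsupdate is deprecated in favor of qsexportmetadata and qsimportmetadata, a script might chain those two commands instead. The Python sketch below only builds the two argument lists, using options documented in this guide (-input, -output, -metadata); it does not run anything. The function name and the intermediate file name are hypothetical, and whether the result matches qsupdate exactly (for example, in which kinds of metadata are transferred by default) is an assumption to verify on your own foci.

```python
import shlex


def metadata_transfer_commands(template_focus, source_focus, output_focus,
                               metadata_file="template-metadata.xml"):
    """Return argument lists for an export-then-import metadata transfer.

    metadata_file is a hypothetical intermediate file name.
    """
    export_cmd = ["qsexportmetadata",
                  "-input", template_focus,
                  "-output", metadata_file]
    import_cmd = ["qsimportmetadata",
                  "-input", source_focus,
                  "-metadata", metadata_file,
                  "-output", output_focus]
    return [export_cmd, import_cmd]


if __name__ == "__main__":
    commands = metadata_transfer_commands("RetailCustApril.ftr",
                                          "RetailCustMay.ftr",
                                          "RetailCustMayUpdated.ftr")
    for cmd in commands:
        print(shlex.join(cmd))  # pass each list to subprocess.run to execute
```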

6 - Commands for combining foci

In this section

About combining foci 83
qsjoin 83
qsmerge 87

About combining foci

Spectrum Miner provides two ways of combining the fields from multiple foci to create a new focus. The first, focus join, is a variant of a left join. You can use focus join in Spectrum Miner or via the qsjoin data-build command.

The second method uses Decision Studio to import fields from a focus in a variant of a two-table left join.

You can also merge the records of multiple foci, in Spectrum Miner or via the qsmerge data-build command.

See also

dblookup on page 330

qsjoin

Synopsis qsjoin

-input <primary focus> [-subfocus <subfocus>]

[-equalnulls] [-importmeta] [-onetoone]

[-match <prefix>] [-unmatched <prefix>]

-join <secondary focus> [-subfocus <subfocus>]
[-fields <field> [, <field> ...] | -fields @<fields file> | -tags <tag> [, <tag> ...]]
[-xfields <field> [, <field> ...] | -xfields @<fields file> | -xtags <tag> [, <tag> ...]]

[-join <secondary focus> [-subfocus <subfocus>]
[-fields <field> [, <field> ...] | -fields @<fields file> | -tags <tag> [, <tag> ...]]
[-xfields <field> [, <field> ...] | -xfields @<fields file> | -xtags <tag> [, <tag> ...]] ...]

Description: join fields from the secondary foci to the primary focus, matching records using the key fields. By default, join all non-key fields from the secondary foci.

Unless you use the -onetoone option, qsjoin effectively performs a series of left outer joins (where the primary focus is the left table), incorporating the fields from each of the secondary foci in turn. The key values in a record in the primary focus uniquely identify a corresponding record in each secondary focus (where one exists).
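The left outer join described above can be illustrated with a small Python sketch over in-memory records. It is not how qsjoin is implemented (qsjoin works on sorted foci, while this sketch uses a dictionary lookup); the function name, the `matched` flag (loosely analogous to the field created by -match) and the `equal_nulls` parameter (mirroring -equalnulls) are hypothetical names chosen for illustration.

```python
def left_outer_join(primary, secondary, key, fields, equal_nulls=False):
    """Join the named fields from secondary onto every primary record.

    Records are dicts; None stands in for the null value.
    """
    lookup = {}
    for record in secondary:
        k = record[key]
        if k is None and not equal_nulls:
            continue  # by default a null key never matches anything
        if k in lookup:
            raise ValueError(f"duplicate key in secondary focus: {k!r}")
        lookup[k] = record

    joined = []
    for record in primary:
        k = record[key]
        match = None if (k is None and not equal_nulls) else lookup.get(k)
        row = dict(record)
        for name in fields:
            # unmatched primary records receive nulls for the joined fields
            row[name] = match[name] if match is not None else None
        row["matched"] = 1 if match is not None else 0
        joined.append(row)
    return joined


if __name__ == "__main__":
    customers = [{"CustomerID": 1, "Age": 30}, {"CustomerID": 2, "Age": 41}]
    aggregations = [{"CustomerID": 1, "totalAmount": 99.5}]
    for row in left_outer_join(customers, aggregations, "CustomerID", ["totalAmount"]):
        print(row)
```

Every primary record appears exactly once in the result, which is the property that distinguishes this join from an inner join.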

Optional arguments

-equalnulls
    When matching records, treat the null value in a key field as an ordinary value (that is, consider all occurrences of the null value to be equal to one another but distinct from all other values).

-fields <field> [, <field> ...]
    Instead of joining all fields from the secondary focus, join just the specified fields.

-fields @<fields file>
    Instead of joining all fields from the secondary focus, join just the fields listed (one per line) in the fields file.

-force
    See The -force command-line option on page 20. This also affects foci created using the -unmatched option.

-importmeta
    Import field metadata from the secondary foci.

-match <prefix>
    For each secondary focus, create an additional integer field in the output focus, containing the value 1 for records that include data from the secondary focus and the value 0 for all other records. Use the given prefix together with the name of the secondary focus to name the field.

-onetoone
    Instead of performing a left outer join, attempt to perform a one-to-one match of groups of records with the same keys.

-output <destination focus>
    Instead of joining fields to the primary focus, copy the primary focus to the destination focus and then join fields to the destination focus.

-subfocus <subfocus>
    Use the specified subfocus of the primary or secondary focus last mentioned on the command line (using -input or -join).

-tags <tag> [, <tag> ...]
    Instead of joining all fields from the secondary focus, join just the fields that have the specified tags.

-unmatched <prefix>
    For each secondary focus, create an alternative output focus preserving those records from the secondary focus whose key fields do not match records in the primary focus. Use the given prefix together with the name of the secondary focus to name the alternative output focus.

-xfields <field> [, <field> ...]
    Join all fields in the focus except the specified fields.

-xfields @<fields file>
    Join all fields in the focus except the fields that are listed (one per line) in the fields file.

-xtags <tag> [, <tag> ...]
    Join all fields in the focus except the fields that have the specified tags.

Besides these command-line options, qsjoin accepts the options common to all data-build commands [see Standard command-line options on page 18].

• For each named key field, the fields of that name in the primary and secondary foci must all be of compatible type, that is, all string fields, all date fields, or all numeric fields.

• The primary and secondary foci must be sorted by the key fields [see qssort on page 59]. If the foci are not sorted appropriately, qsjoin fails with an error message to that effect.

• For a left outer join (the default join type for qsjoin), no two records in a secondary focus can share the same keys.

• For a left outer join (the default join type for qsjoin), if a secondary focus contains no record corresponding to a given record in the primary focus, qsjoin uses a null record instead.

• For a left outer join (the default join type for qsjoin), if more than one record in the primary focus has the same set of keys, copies of the corresponding records (if any) from the secondary foci are joined to each of these records.

• If you use the -onetoone option, for each set of records in the primary focus that share a given combination of key-field values, qsjoin attempts to join fields from successive records in a secondary focus that share the same combination of key-field values, in such a way that records are matched one-to-one.

If, for a given key, there are fewer records in a secondary focus than in the primary focus, qsjoin uses null records in place of missing records in the secondary focus.

If, for a given key, there are more records in a secondary focus than in the primary focus, qsjoin treats surplus records in the secondary focus as unmatched. If you use the -unmatched option, qsjoin preserves these records in an alternative output focus.

• If the join operation would otherwise result in duplicate field names, qsjoin renames fields as needed, by adding numeric suffixes.

• If you specify a destination focus, it shares underlying data with the primary focus (but is independent of the secondary focus).

• When matching records, qsjoin normally treats each occurrence of the null value in a key field as a new, distinct value: a record with the null value in a key field never corresponds to a record in another focus. You can override this behavior using the -equalnulls option.

• For backward compatibility, if you specify only one secondary focus, and use the -match or -unmatched option, the name of the secondary focus is not used to name the match field or alternative output focus.

• Using Spectrum Miner, you can access most of the functionality of qsjoin through the Join Fields to Focus wizard.
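The one-to-one matching described in the notes above (pair successive records that share a key, pad with nulls when the secondary focus runs short, and treat surplus secondary records as unmatched) can be sketched as follows. This is an illustration only, assuming non-null keys and in-memory lists of dicts; the function name `one_to_one_join` is hypothetical.

```python
from collections import defaultdict, deque


def one_to_one_join(primary, secondary, key, fields):
    """Pair records with equal keys in order; return (joined, unmatched)."""
    queues = defaultdict(deque)
    for record in secondary:
        queues[record[key]].append(record)

    joined = []
    for record in primary:
        queue = queues.get(record[key])
        # take the next unused secondary record for this key, if any remain
        match = queue.popleft() if queue else None
        row = dict(record)
        for name in fields:
            row[name] = match[name] if match is not None else None
        joined.append(row)

    # whatever is left was never paired with a primary record
    unmatched = [record for queue in queues.values() for record in queue]
    return joined, unmatched


if __name__ == "__main__":
    primary = [{"id": 1, "p": "a"}, {"id": 1, "p": "b"}, {"id": 2, "p": "c"}]
    secondary = [{"id": 1, "s": "x"}, {"id": 2, "s": "y"}, {"id": 2, "s": "z"}]
    joined, unmatched = one_to_one_join(primary, secondary, "id", ["s"])
    print(joined)
    print(unmatched)
```

The `unmatched` list plays the role of the alternative output focus that -unmatched would create.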

Examples

Create a focus, RetailAprilAnalysis1.ftr, by combining records from the RetailCustApril.ftr and RetailAggregationsApril.ftr foci:

qsjoin -input RetailCustApril.ftr -join RetailAggregationsApril.ftr
    -keys CustomerID -output RetailAprilAnalysis1.ftr

Alternatively, add only the fields numberPurchases and totalAmount from the RetailAggregationsApril.ftr focus:

qsjoin -fields numberPurchases,totalAmount
    -input RetailCustApril.ftr -join RetailAggregationsApril.ftr
    -keys CustomerID -output RetailAprilAnalysis2.ftr

Create the focus RetailAprilAnalysis3.ftr, by combining records from the foci RetailCustApril.ftr, RetailAggregationsApril.ftr, and RetailAggregationsMay.ftr:

qsjoin -input RetailCustApril.ftr -join RetailAggregationsApril.ftr
    -join RetailAggregationsMay.ftr
    -keys CustomerID -output RetailAprilAnalysis3.ftr

Repeat the above, and write any unmatched transaction records to the foci unmatchedRetailAggregationsApril.ftr or unmatchedRetailAggregationsMay.ftr:

qsjoin -unmatched unmatched
    -input RetailCustApril.ftr -join RetailAggregationsApril.ftr
    -join RetailAggregationsMay.ftr
    -keys CustomerID -output RetailAprilAnalysis4.ftr

Repeat the above, joining only a subset of the fields from the secondary foci:

qsjoin -unmatched unmatched
    -input RetailCustApril.ftr
    -join RetailAggregationsApril.ftr
    -fields numberPurchasesApr,totalAmountApr
    -join RetailAggregationsMay.ftr
    -fields numberPurchasesMay,totalAmountMay
    -keys CustomerID -output RetailAprilAnalysis5.ftr

See also

qsmerge on page 87

qsmerge

Synopsis qsmerge

-input <primary focus> [-subfocus <subfocus>]

[-equalnulls]

[-nodups]

-merge <secondary focus> [-subfocus <subfocus>]

[-merge <secondary focus> [-subfocus <subfocus>] ...]

Description: merge records from the primary and secondary foci, interleaving them to create the destination focus, in such a way that it remains sorted by the key fields [see qssort on page 59].

Optional arguments

-equalnulls
    When testing for duplicate keys, treat the null value in a key field as an ordinary value (that is, consider all occurrences of the null value to be equal to one another but distinct from all other values).

-nodups
    Avoid duplicating key-field values: if two or more records share key-field values, retain the first such record from the first focus to contain such a record (treating foci in the order: primary, earliest-mentioned secondary, etc.); discard the remainder of the records with the same key-field values.

-subfocus <subfocus>
    Use the specified subfocus of the primary or secondary focus last mentioned on the command line (using -input or -merge).

Besides these command-line options, qsmerge accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • The primary and secondary foci must be sorted by the key fields [see qssort on page 59]. If the foci are not sorted appropriately, qsmerge fails with an error message to that effect.

• Corresponding fields in the primary and secondary foci (identified by name) must be of the same exact type, but they do not need to be arranged in the same order.

• The destination focus contains the same set of fields as the primary focus, arranged in the same order. If a secondary focus contains fields that are not in the primary focus, qsmerge ignores them. If the primary focus contains fields that are not in a secondary focus, qsmerge uses the null value in place of missing values in the secondary focus.

• If two or more records share the same key-field values, those from the primary focus appear first in the destination focus, followed by those from each secondary focus in turn. Unless you use the -nodups option, the merge operation does not remove any records with duplicate keys.

• By using the same focus twice on the command line (as the argument to both -input and -merge) and using the -nodups option, you can use qsmerge to remove records with duplicate key values from a focus.

• The destination focus is independent of the primary and secondary foci, that is, it does not share underlying data.

• When ordering records or testing for duplicate keys, qsmerge normally treats the null value in a key field as a new, distinct value: a record with the null value in a key field never corresponds to a record in another focus. You can override this behavior using the -equalnulls option.

• Using Spectrum Miner, you can access most of the functionality of qsmerge through the Merge Records dialog box.

• If the merged foci contain legacy datatypes, the legacy types are converted to the most appropriate supported datatype in the output focus.

If conversion fails because the datatype does not have sufficient storage, the field contains the null value rather than truncated content. In this case, a summary message showing the affected field and the number of entries is displayed on completion.

• In order to merge two foci, the fields to be merged must contain compatible data types.
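The merge rules in the notes above can be illustrated in Python with `heapq.merge`, which interleaves already-sorted inputs and, on equal keys, yields records from earlier inputs first. This is a sketch rather than the qsmerge implementation: it assumes non-null keys, in-memory lists of dicts, and hypothetical names (`merge_foci`, `nodups`).

```python
import heapq

_NO_KEY = object()  # sentinel: no record has been emitted yet


def merge_foci(primary, secondaries, key, fields, nodups=False):
    """Merge sorted record lists, keeping only the primary focus's fields."""
    def project(record):
        # extra secondary fields are ignored; missing ones become null (None)
        return {name: record.get(name) for name in fields}

    merged = heapq.merge(primary, *secondaries, key=lambda r: r[key])
    result = []
    last_key = _NO_KEY
    for record in merged:
        if nodups and record[key] == last_key:
            continue  # keep only the first record for each key value
        last_key = record[key]
        result.append(project(record))
    return result


if __name__ == "__main__":
    april = [{"CustomerID": 1, "Amount": 10}, {"CustomerID": 3, "Amount": 30}]
    may = [{"CustomerID": 1, "Amount": 11, "Store": "X"},
           {"CustomerID": 2, "Amount": 20, "Store": "Y"}]
    print(merge_foci(april, [may], "CustomerID", ["CustomerID", "Amount"]))
    print(merge_foci(april, [may], "CustomerID", ["CustomerID", "Amount"], nodups=True))
```

Passing the same list as both the primary and the only secondary, with `nodups=True`, mirrors the deduplication trick described in the notes.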

Examples

Create a focus, RetailTransAprilMayJune.ftr, containing all the records in the foci RetailTransAprilSorted.ftr, RetailTransMaySorted.ftr, and RetailTransJuneSorted.ftr:

qsmerge -input RetailTransAprilSorted.ftr
    -merge RetailTransMaySorted.ftr
    -merge RetailTransJuneSorted.ftr
    -keys CustomerID -output RetailTransAprilMayJune.ftr

Alternatively, merge the RetailTransAprilSorted.ftr and RetailTransMaySorted.ftr foci, keeping only the first record for each combination of key values (records from RetailTransAprilSorted.ftr take precedence):

qsmerge -nodups
    -input RetailTransAprilSorted.ftr
    -merge RetailTransMaySorted.ftr
    -keys CustomerID -output RetailTransAprilMayNodups.ftr

See also

qsjoin on page 83

7 - Commands for managing foci

In this section

qscopy 91
qslink 92
qsmove 93
qsremove 93
qsarchive 95
qsunzip 95

qscopy

Synopsis qscopy -from <source focus>

{-to <destination focus> | -to <directory>} [-force]

Description: copy the source focus to the destination focus or to a focus in the specified directory (preserving the name of the source focus). Copy the underlying data so that the new focus is independent of other foci.

Optional arguments

-force
    See The -force command-line option on page 20.

Besides this command-line option, qscopy accepts the options common to all data-build commands [see Standard command-line options on page 18].

Because of the potential interdependence of foci, you should not attempt to copy a focus using standard operating-system utilities. Instead, you should always use qscopy or qslink (or Spectrum Miner).

Note: • If you try to copy a focus on top of an existing focus, qscopy warns you and does nothing.

• To create a linked copy of a focus, in which the copy shares the underlying data with the original focus, use qslink.

• To create a copy of a focus, or of a subset of a focus, not including any metadata, use qsimportfocus.

• You can also access the functionality of qscopy through Spectrum Miner.

Example

Create a new focus, RetailArchiveMay.ftr, archiving the results of the May analysis session in the RetailAnalysisMay.ftr focus:

qscopy -from RetailAnalysisMay.ftr -to RetailArchiveMay.ftr

See also

qsmove on page 93

qsremove on page 93

qslink

Synopsis qslink -from <source focus>

{-to <destination focus> | -to <directory>} [-force]

Description: copy the source focus to the destination focus or to a focus in the specified directory (preserving the name of the source focus). Share the underlying data with the original focus. (The new focus is dependent on the original focus.)

Optional arguments

-force
    See The -force command-line option on page 20.

Besides this command-line option, qslink accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you try to copy a focus on top of an existing focus, qslink warns you and does nothing.

• To create an independent copy of a focus, in which the copy does not share the underlying data with the original focus, use qscopy.

• You can also access the functionality of qslink through Spectrum Miner.

Example

Create a focus, RetailLoyaltyProjectMay.ftr, that shares underlying data with the RetailAnalysisMay.ftr focus:

qslink -from RetailAnalysisMay.ftr
    -to RetailLoyaltyProjectMay.ftr

See also

qsimportfocus on page 48

qsmove on page 93

qsremove on page 93

qsmove

Synopsis qsmove -from <source focus>

{-to <destination focus> | -to <directory>}

Description: rename the source focus to the destination focus, or move the source focus to the specified directory, taking into account possible data dependencies.

Optional arguments

Besides the required arguments, qsmove accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you try to rename or move a focus on top of an existing focus, qsmove warns you and does nothing.

• You can also access the functionality of qsmove through Spectrum Miner.

Because of the potential interdependence of foci, you should not attempt to rename or move a focus using standard operating-system utilities. Instead, you should always use qsmove (or Spectrum Miner).

Examples

Rename the RetailTransAprilSorted.ftr focus to RetailTransAprilSort.ftr:

qsmove -from RetailTransAprilSorted.ftr
    -to RetailTransAprilSort.ftr

Move the RetailArchiveMay.ftr focus to the Retail directory:

qsmove -from RetailArchiveMay.ftr -to Retail

See also

qscopy on page 91

qslink on page 92

qsremove on page 93

qsremove

Synopsis qsremove -focus <focus file>

[-focus <focus file> ...]

[-force]

[-recursive]

Description: delete one or more focus files, taking into account possible data dependencies.

Optional arguments

-force
    Ignore any errors caused by failure to update links with related foci.

-recursive
    Delete the focus and all dependent foci.

In addition to these command-line options, qsremove accepts the options common to all data-build commands [see Standard command-line options on page 18].

Because of the potential interdependence of foci, you should not attempt to delete a focus using standard operating-system utilities. Instead, you should always use qsremove (or Spectrum Miner).

Note: • If there is a circular dependency in the chain, or the parent of the focus chain is not writeable, an error is displayed. In this case, you need to use both -force and -recursive to delete the focus.

• If you use more than one -focus argument and qsremove fails to delete one of the foci, it nevertheless continues to delete subsequent foci (but produces an error message and returns a non-zero exit status).

• Removing a focus with qsremove fails if you do not have write permission to the .ftr focus file.

• After removing a focus using qsremove, there may be a backup .ftr file left behind. You can safely delete this as you would delete any file.

• You can also access the functionality of qsremove through Spectrum Miner.

Examples

Delete the RetailTransApril6Months.ftr focus and all dependents:

qsremove -focus RetailTransApril6Months.ftr

Delete two foci RetailTransAprilCA.ftr and RetailTransApril2Customers.ftr:

qsremove -focus RetailTransAprilCA.ftr
    -focus RetailTransApril2Customers.ftr

See also

qscopy on page 91

qslink on page 92

qsmove on page 93

qsarchive

Synopsis qsarchive -input <focus or folder>

[-input <focus or folder> ...]

-output <archive file>

Description: create an archive file from a list of files or folders.

Optional arguments

Besides the required arguments, qsarchive accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you create an archive file from a folder that contains foci that link to data outside the folder, the external data is copied into the archive.

• You can access most of the functionality of the data-build command qsarchive by using the Archive dialog box in Spectrum Miner.

Example

Create an archive file, FirstQuarter.zip, containing the January.ftr, February.ftr, and March.ftr foci:

qsarchive -input January.ftr -input February.ftr -input March.ftr
    -output FirstQuarter.zip

See also

qsunzip on page 95

qsunzip

Synopsis qsunzip -input <archive file> [-output <output directory>]

[-overwrite]

Description: extract files and folders from an archive file.

Optional arguments

-output <output directory>
    Extract to an alternative directory to the one in which the archive file is located. If you don't provide an -output option, the files and folders will be extracted to the current directory.

-overwrite
    Overwrite files and folders if you extract to a location that contains the same content as the zip archive.

In addition to these command-line options, qsunzip accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you don't use the -overwrite option and extract to a location that contains the same source files and folders, the extract will fail.

• You can access most of the functionality of the data-build command qsunzip by using the Extract dialog box in Spectrum Miner.

Example

Extract from the archive file firstquarter.zip into the same location as firstquarter.zip:

qsunzip -input firstquarter.zip

See also

qsarchive on page 95

8 - Commands for producing reports

In this section

qssettings 98
qsaudit 98
qsdescribe 102
qsdescribestat 104
qshtmlunpack 105
qsdtsnapshot, qsscsnapshot 105
qsxt 108
qsinfo 110
qsdescribelicense 111

qssettings

Synopsis qssettings [<property>=<value> ...]

Description: set the value of each listed property, to control aspects of the output of the report-generating commands qsaudit, qsdtsnapshot, and qsscsnapshot.

You can set values of the following properties:

• DateFormat, with 0 corresponding to European, 1 corresponding to American, and 2 corresponding to YMD

• NumDecimalPlaces

• StripTrailingZeroes

• ThousandSeparators

• TwelveHour

Optional arguments

Besides the required arguments, qssettings accepts the options common to all data-build commands [see Standard command-line options on page 18].

The qssettings command creates or modifies the file settings.xml in your user-specific Spectrum Miner configuration directory. Use this file in conjunction with qsaudit, qsdtsnapshot, or qsscsnapshot by specifying the -settings option [see Standard command-line options on page 18].

Example

Create a settings file to display dates in YMD format, times in 12-hour format, and numbers with trailing zeros kept, four decimal places, and a locale-specific thousand separator:

qssettings DateFormat=2 TwelveHour=true StripTrailingZeroes=false
    NumDecimalPlaces=4 ThousandSeparators=true
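To make the number-related properties concrete, the Python sketch below shows one plausible reading of how NumDecimalPlaces, StripTrailingZeroes and ThousandSeparators could interact when a number is displayed. The property names come from this guide, but the exact formatting rules are an assumption, the comma separator stands in for the locale-specific one, and `format_number` is a hypothetical helper, not part of Spectrum Miner.

```python
def format_number(value, num_decimal_places=4,
                  strip_trailing_zeroes=False, thousand_separators=True):
    """Format a number under assumed report-setting semantics."""
    grouping = "," if thousand_separators else ""
    text = format(value, f"{grouping}.{num_decimal_places}f")
    if strip_trailing_zeroes and "." in text:
        # drop zeros after the decimal point, then a dangling point
        text = text.rstrip("0").rstrip(".")
    return text


if __name__ == "__main__":
    # Settings from the example above: four places, zeros kept, separators on
    print(format_number(1234567.5))
    print(format_number(1234567.5, strip_trailing_zeroes=True))
```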

See also

Date formats on page 52

qsaudit on page 98

qsdtsnapshot, qsscsnapshot on page 105

qsaudit

Synopsis qsaudit [-generate Full]

-input <focus> [-subfocus <subfocus>]

[-reference <archived HTML report> | -reference <archived XML report>]

-output <archived HTML report>

[-records <FDL expression> | -records @<FDL file>]

[-targets <field> [, <field> ...] | -targets @<fields file> | -notarget]

[-overwrite]

[-htmlimages largepng | -htmlimages smallpng | -htmlimages svg | -htmlimages none]

[-paginate]

[-nopartition]

[-partitionfield <field>]

qsaudit -generate XML

-input <focus> [-subfocus <subfocus>]

[-reference <archived HTML report> | -reference <archived XML report>]

-output <archived XML report>

[-records <FDL expression> | -records @<FDL file>]

[-targets <field> [, <field> ...] | -targets @<fields file> | -notarget]

[-overwrite]

[-nopartition]

[-partitionfield <field>]

qsaudit -generate HTML

{-input <archived HTML report> | -input <archived XML report>}

[-overwrite]

[-htmlimages largepng | -htmlimages smallpng | -htmlimages svg | -htmlimages none]

[-paginate]

Description: create a Profile and Audit report on the specified focus, or a comparison report describing differences between the specified focus and the focus described in a reference report (with the -reference option). By default, audit all the fields in the focus, in relation to the focus objective if there is one. This is a two-stage process: first the command generates an intermediate XML file (packaged in a .qsxml archive) and then from this intermediate file it creates the finished HTML-formatted report (packaged in a .qshtml archive).

Alternatively, create just the intermediate .qsxml archive, or create a .qshtml archive from an existing .qshtml or .qsxml archive.

To unpack .qshtml and .qsxml archives, for viewing or reuse outside Spectrum Miner, use qshtmlunpack.
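The two stages described above can also be run as separate qsaudit invocations. The following Python sketch only assembles the two command lines and prints them; it does not run Spectrum Miner. The helper name two_stage_audit_commands and the stem argument are hypothetical, and the use of -output with -generate HTML follows the worked examples in this section rather than the synopsis.

```python
import shlex


def two_stage_audit_commands(focus, stem):
    """Return the two qsaudit command lines for a split XML-then-HTML run.

    `focus` is the path of the focus to audit; `stem` is a hypothetical
    base name used here for both archives.
    """
    xml_archive = stem + ".qsxml"
    html_archive = stem + ".qshtml"
    # Stage 1: build only the intermediate XML archive from the focus.
    stage1 = ["qsaudit", "-generate", "XML",
              "-input", focus, "-output", xml_archive]
    # Stage 2: render the finished HTML archive from that XML archive.
    stage2 = ["qsaudit", "-generate", "HTML",
              "-input", xml_archive, "-output", html_archive]
    return stage1, stage2


if __name__ == "__main__":
    for command in two_stage_audit_commands("RetailCustApril.ftr", "RetailCustApril"):
        print(shlex.join(command))
```

Keeping the .qsxml archive means the HTML can later be regenerated (for example with -paginate) without re-auditing the focus.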

Optional arguments

Option
    Effect

-fields <field> [, <field> ...]
    Audit only the specified fields (except any fields excluded using -xfields).

-fields @<fields file>
    Audit only the fields listed (one per line) in the fields file.

-generate HTML
    Create an HTML report from an XML or HTML report.

-generate Full
    Create an HTML report directly (the default if you do not specify a -generate option).

-generate XML
    Create an XML report.

-htmlimages largepng
    Include printer- and screen-optimized PNG bitmaps and SVG images.

-htmlimages none
    Do not include any images.

-htmlimages smallpng
    Include screen-optimized PNG bitmaps and SVG images (the default if you do not specify a -htmlimages option).

-htmlimages svg
    Include only SVG images.

-nopartition
    Create a non-uplift Profile and Audit, ignoring any partition interpretation in the focus.

-notarget
    Don't audit fields in the focus in relation to any target fields, even if the focus has an objective.

-overwrite
    Overwrite any existing report. Otherwise, if there is already a report of the same name, qsaudit warns you and does nothing.

-paginate
    Create a separate HTML page for each audited field.

-partitionfield <field>
    Use the specified field as the partition field.

-records <FDL expression>
    Report only on the records for which the numeric FDL expression is non-zero ("true").

-records @<FDL file>
    Report only on the records for which the FDL expression in the specified file is non-zero ("true").

-reference <archived HTML report>
    Use the previously created archived HTML report as a reference report, and create a comparison report.

-subfocus <subfocus>
    Use the specified subfocus of the focus.

-tags <tag> [, <tag> ...]
    Audit only the fields that have the specified tags (except any fields excluded using -xtags).

-targets <field> [, <field> ...]
    Audit fields in the focus in relation to the specified target fields, instead of the focus objective.

-targets @<fields file>
    Audit fields in the focus in relation to the target fields listed (one per line) in the fields file.

-xfields <field> [, <field> ...]
    Do not audit the specified fields.

-xfields @<fields file>
    Do not audit the fields listed (one per line) in the fields file.

-xtags <tag> [, <tag> ...]
    Do not audit the fields that have the specified tags.

Besides these command-line options, qsaudit accepts the -settings option, as well as options common to all data-build commands [see Standard command-line options on page 18].

Note: • By default, a comparison report includes the same fields as in the reference report. If you explicitly choose fields to include in the comparison report (using the -fields option), they are added to the previously profiled fields.

• A comparison report uses the same target fields as in the reference report. Any explicitly specified target fields are ignored.

• Additional report-formatting options are available through Audits and Snapshots preferences (see Spectrum Miner Online Help).

• If the focus has no objective and you do not specify the -targets option, qsaudit will fail to audit the focus.

• Using Spectrum Miner, you can access most of the functionality of qsaudit through the Create Profile and Audit dialog box.

Examples

Create a new Profile and Audit RetailCustApril.qshtml from the focus RetailCustApril.ftr:

qsaudit -input RetailCustApril.ftr -output RetailCustApril.qshtml

Create a new Profile and Audit RetailCustMay.qshtml from the focus RetailCustMay.ftr, auditing all fields by Age:

qsaudit -targets Age -input RetailCustMay.ftr -output RetailCustMay.qshtml

Overwrite this to create a Profile and Audit in which each field audit is on a separate HTML page:

qsaudit -paginate -targets Age -overwrite -input RetailCustMay.ftr -output RetailCustMay.qshtml

Create a Profile and Audit RetailCustAprilnoimages.qshtml without images, from the focus RetailCustApril.ftr, only for the fields StartDate, Age, and Gender:

qsaudit -fields "StartDate, Age, Gender" -htmlimages none -input RetailCustApril.ftr -output RetailCustAprilnoimages.qshtml

Re-create the HTML output, with one field audit per HTML page (and without recreating the underlying audit data):

qsaudit -generate HTML -paginate -input RetailCustApril.qshtml -output RetailCustAprilpaginate.qshtml

Create a comparison report RetailCustMayApril.qshtml from the RetailCustMay.ftr focus in comparison with the reference report RetailCustApril.qshtml:

qsaudit -reference RetailCustApril.qshtml -input RetailCustMay.ftr -output RetailCustMayApril.qshtml

See also

XML in Spectrum Miner on page 356

qsdescribe on page 102

qsdtsnapshot, qsscsnapshot on page 105

qssettings on page 98

qsdescribe

Synopsis qsdescribe

[-output <report file>]

[-fields [-detail]]

Description: display summary information about the focus in plain-text format, including the number of fields and records and the focus history.

Optional arguments

Option
    Effect

-detail
    Report more detailed information on the fields in the focus, including field statistics, derivation expressions, and seeds for derived fields.

-fields
    Report information on each of the fields in the focus.

-output <report file>
    Write the report to the specified file instead of standard output.

Besides these command-line options, qsdescribe accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • By default, qsdescribe reports on the entire hierarchy of subfoci within a focus. If you use the -subfocus option, qsdescribe only reports on the subfocus that you specify.

• Using Spectrum Miner, you can access most of the functionality of qsdescribe through the Display Focus Information dialog box.

Examples

Report the number of fields, number of records, and history of the RetailCustApril.ftr focus:

qsdescribe -input RetailCustApril.ftr

Additionally, report the field names and types:

qsdescribe -fields -input RetailCustApril.ftr

Finally, report the field statistics, selections, comments, and derivations on the fields:

qsdescribe -detail -fields -input RetailCustApril.ftr

See also

qsaudit on page 98

qssettings on page 98

qsdescribestat

Synopsis qsdescribestat -input <source dataset>

[-detail]

Description: display a list of the field names in the Excel dataset in plain-text format.

Optional arguments

Option
    Effect

-detail
    List the datatypes of fields alongside their names.

Interpret the Excel dataset as the specified dataset type, overriding the default interpretation (which depends on the filename extension).

Besides these command-line options, qsdescribestat accepts the options common to all data-build commands [see Standard command-line options on page 18].

Examples

Report the field names of the Excel dataset RetailCustApril.xlsx:

qsdescribestat -input RetailCustApril.xlsx

Additionally, report the field types:

qsdescribestat -detail -input RetailCustApril.xlsx

See also

qsexportstat on page 46

qsimportstat on page 44

qshtmlunpack

Synopsis qshtmlunpack <archived report> <output directory> [<HTML filename>]

Description: unpack a Spectrum Miner-generated report in the form of a .qshtml or .qsxml archive, creating the specified output directory to contain the components of the report. In the case of an archived HTML report, the HTML file itself (within the output directory) is by default named qsreport.html.

Optional arguments

Option
    Effect

<HTML filename>
    Use this filename for the HTML file in an unpacked HTML report, instead of qsreport.html. When unpacking an archived XML report, qshtmlunpack ignores this argument.

Using Spectrum Miner, you can access most of the functionality of qshtmlunpack using the right-click option Unpack to Folder.

Example

Unpack the archive RetailCustApril.qshtml as an HTML Profile and Audit RetailCustApril.html in subdirectory www:

qshtmlunpack RetailCustApril.qshtml www RetailCustApril.html

See also

qsaudit on page 98

qsdtsnapshot, qsscsnapshot on page 105

qsdtsnapshot, qsscsnapshot

Synopsis

{qsdtsnapshot -input <decision tree> | qsscsnapshot -input <scorecard>}

[-generate Full]

[-focus <focus> [-subfocus <subfocus>]]

[-audit modeled | -audit all | -audit none]

[-description <text file>]

[-overwrite]

{qsdtsnapshot -input <decision tree> | qsscsnapshot -input <scorecard>}

-generate XML

-output <archived XML report>

[-focus <focus> [-subfocus <subfocus>]]

[-audit modeled | -audit all | -audit none]

[-overwrite]

{qsdtsnapshot | qsscsnapshot}

-generate HTML

{-input <archived HTML report> | -input <archived XML report>}

[-description <text file>]

[-overwrite]

Description: create a Model Snapshot of the specified decision tree (.qsdt file) or scorecard (.qssc file). By default, audit all the fields used in the model. This is a two-stage process: first the command generates an intermediate XML file (packaged in a .qsxml archive), and then from this intermediate file it creates the finished HTML-formatted report (packaged in a .qshtml archive).

Alternatively, create just the intermediate .qsxml archive, or create a .qshtml archive from an existing .qshtml or .qsxml archive.

To unpack .qshtml and .qsxml archives, for viewing or reuse outside Spectrum Miner, use qshtmlunpack.

Optional arguments

Option
    Effect

-audit all
    Audit all the fields in the focus.

-audit modeled
    Audit the fields used in the model (the default if you do not specify an -audit option).

-audit none
    Do not audit any fields.

-description <text file>
    Include the contents of this text file as a description of the model in the "User Supplied Metadata" section of the report.

-focus <focus>
    Report on the model as applied to the specified focus, rather than the focus used to build the model.

-generate HTML
    Create an HTML report from an XML or HTML report.

-generate Full
    Create an HTML report directly (the default if you do not specify a -generate option).

-generate XML
    Create an XML report.

-htmlimages largepng
    Include printer- and screen-optimized PNG bitmaps and SVG images.

-htmlimages none
    Do not include any images.

-htmlimages smallpng
    Include screen-optimized PNG bitmaps and SVG images (the default if you do not specify a -htmlimages option).

-htmlimages svg
    Include only SVG images.

-overwrite
    Overwrite any existing report. Otherwise, if there is already a report of the same name, qsdtsnapshot or qsscsnapshot warns you and does nothing.

Besides these command-line options, qsdtsnapshot and qsscsnapshot accept the -settings option, as well as options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you do not use the -focus option, the focus used to build the model must still exist.

• Additional report-formatting options are available through Audits and Snapshots preferences (see Spectrum Miner Online Help).

• Using Spectrum Miner, you can access most of the functionality of qsdtsnapshot and qsscsnapshot through the Decision Tree Snapshot or Scorecard Snapshot dialog box.

Examples

Create a Model Snapshot from the decision-tree file early-adopter-modelApril.qsdt, built on the focus RetailCustApril.ftr, and applied to the same focus:

qsdtsnapshot -input early-adopter-modelApril.qsdt -output early-adopter-modelApril.qshtml

Apply this model to the RetailCustMay.ftr focus, but this time auditing over all fields, rather than only those involved in defining the model:

qsdtsnapshot -focus RetailCustMay.ftr -audit all -input early-adopter-modelApril.qsdt -output early-adopter-modelMay.qshtml

See also

qsaudit on page 98

qssettings on page 98

qsxt

Synopsis qsxt -focus <focus> [-subfocus <subfocus>]

[-spec <crosstab file>] [-comparable]

[-output <crosstab file>]

[-description <text file>]

Description: apply a crosstab specification (supplied by default on standard input) to the specified focus, to create a new crosstab (by default on standard output).

Optional arguments

Option
    Effect

-comparable
    Where relevant, use binnings from the result part of the supplied crosstab rather than from the specification part.

-description <text file>
    Include the contents of this text file as a description of the crosstab.

-output <crosstab file>
    Create this crosstab file instead of writing to standard output.

-spec <crosstab file>
    Use this crosstab file instead of reading from standard input.

Besides these command-line options, qsxt accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • In the output crosstab, qsxt bins each field using one of the following (in decreasing order of preference): a binning for the field in the supplied crosstab specification (or result part if you use the -comparable option); a binning for the field in the supplied focus; the default binning parameters, as configured by binning preferences (see Spectrum Miner Online Help).

• Using Spectrum Miner, you can access most of the functionality of qsxt through the Apply Crosstab dialog box.
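The order of preference in the first note can be pictured as a simple fallback lookup. The Python sketch below is illustrative only: the function choose_binning and its dictionary arguments are hypothetical, and it assumes that with -comparable the result part simply takes the place of the specification part, which is one reading of the note rather than confirmed behavior.

```python
def choose_binning(field, spec_binnings, result_binnings, focus_binnings,
                   default_binning, comparable=False):
    """Pick a binning for `field` in decreasing order of preference.

    All arguments except `field` are hypothetical stand-ins:
    dictionaries mapping field names to binnings, plus a default.
    """
    # With -comparable, prefer the result part of the supplied crosstab
    # (assumption: it replaces the specification part entirely).
    crosstab_binnings = result_binnings if comparable else spec_binnings
    if field in crosstab_binnings:
        return crosstab_binnings[field]
    # Next, fall back to a binning stored in the focus.
    if field in focus_binnings:
        return focus_binnings[field]
    # Finally, use the default binning parameters.
    return default_binning


if __name__ == "__main__":
    spec = {"Age": "spec-age"}
    result = {"Age": "result-age"}
    focus = {"Age": "focus-age", "Gender": "focus-gender"}
    print(choose_binning("Age", spec, result, focus, "default"))
    print(choose_binning("Age", spec, result, focus, "default", comparable=True))
    print(choose_binning("Income", spec, result, focus, "default"))
```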

Examples

Create a new crosstab, ageGenderMay.qstv, from the RetailCustMay.ftr focus, using the ageGenderApril.qstv crosstab specification file (generated from Decision Studio):

qsxt -focus RetailCustMay.ftr -spec ageGenderApril.qstv -output ageGenderMay.qstv

Additionally, use the binning contained in the ageGenderApril.qstv crosstab file to create the ageGenderMay-binbyApril.qstv result:

qsxt -comparable -focus RetailCustMay.ftr -spec ageGenderApril.qstv -output ageGenderMay-binbyApril.qstv

See also

Crosstab specification for qsxt on page 371

qsinfo

Synopsis qsinfo -input <source focus>

[-generate HTML | -generate XML]

Description: display information on the locations and sizes of the files that constitute the source focus, including any data files shared with other foci. Display the source focus's relationships to other foci (listing both foci that depend on it and foci on which it depends). By default, write the report to standard output in an XML-based file format.

Optional arguments

Option
    Effect

-generate HTML
    Create the report in HTML format.

-generate XML
    Create the report in XML-based file format (the default if you do not specify a -generate option).

Besides these command-line options, qsinfo accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • A focus is stored on disk as a .ftr file (the "focus file") together with a collection of data files and other files in two folders, with names ending in .rdx ("Raw Data eXtract") and .xtr ("eXTRa").

A new focus that is created by copying an existing focus [see qscopy on page 91] or by importing data from another kind of dataset is independent of other foci. On the other hand, a focus that is saved from Decision Studio with a new name, or one that is created by making a linked copy [see qslink on page 92], is dependent on data folders belonging to other foci.

• Using Spectrum Miner, you can access most of the functionality of qsinfo through the Display Focus Properties dialog box.

Example

Create a new HTML focus properties report RetailAprilAnalysis.html, from the focus RetailAprilAnalysis.ftr:

qsinfo -input RetailAprilAnalysis.ftr -generate HTML -output RetailAprilAnalysis.html

See also

qsdescribelicense

Synopsis qsdescribelicense

[-input <license file>]

Description: display a report about the installed Spectrum Miner license in plain-text format.

Optional arguments

Option
    Effect

-input <license file>
    Report on the specified license file instead of the installed license.

Besides these command-line options, qsdescribelicense accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: The report may include the following components:

• the name of the license holder

• the IP address or subnet for which the license is valid

• the maximum focus size for which the license is valid

• start and expiration dates for the license

• a list of licensed Spectrum Miner components, with possible per-component overrides for IP address/subnet, maximum focus size, and dates of validity

Example

Show the license keys in the current Spectrum Miner license file:

qsdescribelicense

9 - Commands for building models

In this section

About the Scorecard Wizard 114
qsscorecardwizard 115
qsdecisiontree 120
qsscorecard 122
About the Association Rule Wizard 123
qsruleminer 125

About the Scorecard Wizard

The Scorecard Wizard automates scorecard modeling using best-practice methodology: performing optimized binning, then variable reduction, and finally building a sequence of models on test and training data to establish the best "model size." The modeling methodology consists of the following stages.

Test/training methodology: part of the source data is used for training, and part for testing. This split can be generated at random or using a designated binary indicator field.
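As an illustration of a random split, the Python sketch below assigns each record an indicator of 1 (training) or 0 (test), matching the convention documented for the testTrainingField parameter. The function name and the per-record random draw are assumptions for illustration; the wizard's actual sampling procedure is not described here.

```python
import random


def split_train_test(n_records, training_proportion=0.5, seed=None):
    """Return a list of 1 (training) / 0 (test) flags, one per record.

    Each record is assigned independently, so the realized training
    share only approximates `training_proportion`.
    """
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    return [1 if rng.random() < training_proportion else 0
            for _ in range(n_records)]


if __name__ == "__main__":
    flags = split_train_test(10, training_proportion=0.5, seed=42)
    print(flags, "training records:", sum(flags))
```

Reusing the same seed reproduces the same split, which is the idea behind replicating a previous run with its reported random seed.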

Optimized binning: the training data is used to select a binning for each analysis candidate that maximizes its univariate predictive power against the objective.

For non-categorical fields, the optimized binning wizard is used iteratively to choose the best binning. The number of bins is increased until the number of "turning points" in the plot of the average objective value by bin exceeds a configurable threshold (a "turning point" being a change of sign in the slope of the graph). Intuitively, this ensures that the profile of average objective value as a function of each analysis candidate is relatively "smooth," with not too many ups and downs.
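The turning-point count can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the wizard's implementation: count_turning_points treats flat steps (equal consecutive means) as having no slope sign, and choose_bin_count takes a hypothetical dictionary mapping each candidate number of bins to the per-bin mean objective values.

```python
def count_turning_points(bin_means):
    """Count sign changes in the slope of the mean objective across bins."""
    diffs = [b - a for a, b in zip(bin_means, bin_means[1:])]
    # Ignore flat steps: they have no slope sign (an assumption of this sketch).
    signs = [1 if d > 0 else -1 for d in diffs if d != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def choose_bin_count(means_by_bin_count, max_turning_points=2):
    """Increase the bin count until the turning-point limit is exceeded.

    `means_by_bin_count` is a hypothetical {number_of_bins: [bin means]}
    mapping; returns the last bin count within the limit, or None.
    """
    best = None
    for k in sorted(means_by_bin_count):
        if count_turning_points(means_by_bin_count[k]) > max_turning_points:
            break
        best = k
    return best


if __name__ == "__main__":
    print(count_turning_points([0.10, 0.30, 0.20, 0.40]))
    candidates = {2: [1, 2], 3: [1, 3, 2], 4: [1, 3, 2, 4], 5: [1, 3, 2, 4, 3]}
    print(choose_bin_count(candidates, max_turning_points=2))
```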

For categorical fields, categories where the average objective value is not significantly different from the overall mean are merged together. A standard difference of means test [see significance on page 174] is used, with a Bonferroni correction applied based on the total number of categories, c, that appear for the field. Thus, all categories whose difference from the overall mean fails to reach the Bonferroni-corrected confidence level are merged (with a configurable significance threshold).
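To make the Bonferroni step concrete, the Python sketch below divides the significance threshold by the number of categories c and merges every category whose p-value stays above the corrected threshold. It is a sketch under assumptions: a two-sided z-test stands in for the "standard difference of means test" (the product's exact statistic is documented separately), and the function names and the (mean, standard error) input format are hypothetical.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def categories_to_merge(stats, overall_mean, p_value=0.001):
    """Return the categories that are not significantly different.

    `stats` maps each category to (mean objective, standard error),
    with standard error > 0 (hypothetical input format).
    """
    c = len(stats)
    corrected = p_value / c  # Bonferroni correction over c categories
    merged = []
    for name, (mean, std_error) in stats.items():
        z = (mean - overall_mean) / std_error
        if two_sided_p_value(z) > corrected:
            merged.append(name)  # difference is not significant
    return merged


if __name__ == "__main__":
    stats = {"A": (0.50, 0.01), "B": (0.30, 0.01), "C": (0.31, 0.05)}
    print(categories_to_merge(stats, overall_mean=0.30))
```

Dividing by c makes the test stricter as the number of categories grows, so more categories end up merged for high-cardinality fields.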

During this phase, fields with suspiciously high correlation to the objective are excluded from further analysis (with a warning message). A warning is also issued for fields where any individual bin contains a small number of records. Both thresholds are configurable.

Variable Reduction: the variable reduction phase reduces the pool of analysis candidates (independent predictors) from those initially marked to a smaller set of n, in order to decrease run-time of the final "right-sizing" phase.

The process partitions the analysis candidates into groups of up to 2n fields and builds a scorecard (using the training data) on each group, discarding the worst fields by stepwise exclusion until only the best n fields of the group remain. By repeating this procedure, eventually the n best overall fields are identified.
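The group-and-repeat structure of this phase can be sketched as follows. This Python example is illustrative only: rank_group is a hypothetical callback that returns a group's fields ordered best first, standing in for building a scorecard on the group and applying stepwise exclusion, which is not reproduced here.

```python
def reduce_variables(fields, n, rank_group):
    """Repeatedly keep the best n fields from groups of up to 2n fields.

    `rank_group(group)` is a hypothetical callback returning the fields
    of `group` ordered from best to worst. Requires n >= 1.
    """
    pool = list(fields)
    while len(pool) > n:
        survivors = []
        for start in range(0, len(pool), 2 * n):
            group = pool[start:start + 2 * n]
            survivors.extend(rank_group(group)[:n])  # keep the best n
        pool = survivors
    return pool


if __name__ == "__main__":
    # Toy quality scores: a higher number means a better predictor.
    scores = {f"f{i}": i for i in range(10)}

    def rank(group):
        return sorted(group, key=scores.get, reverse=True)

    print(reduce_variables(scores, 2, rank))
```

Each pass at least halves every full group, so the pool shrinks until at most n fields remain.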

Right-sizing: the right-sizing phase chooses the scorecard that best balances complexity (number of predictors) against generalizability (as measured by test-set performance). The training data is used to build a sequence of n scorecards with the first containing all n analysis candidates, and each subsequent model excluding the worst predictor. The performance of each model is then assessed on the test data, with the best model size chosen using a configurable criterion. As a final step, the weights in the selected model are refit based on all data (i.e. both test and training) to improve accuracy. All interim models can be saved along with the final model if desired.

Results: various outputs are (optionally) generated, including:

• An HTML report summarizing how the methodology was applied, including the input parameters and links to the detailed execution log. This includes a table of performance data that can be easily charted within Excel.

• The final recommended model (in .qssc and QMML formats).

• A Model Snapshot report for the recommended model.

• The final dataset including the selected analysis candidates, optimized binnings, and model predictions. Intermediate datasets can also be saved if desired, containing the metadata generated by the methodology to that point. These are often useful to drill down and explore the decisions made during the process.

See also

qsscorecardwizard on page 115

qsscorecardwizard

Synopsis qsscorecardwizard <parameters file>

Description: build a scorecard on a specified focus by using a parameter file.

You can also access the functionality of the data-build command qsscorecardwizard by using the Scorecard Wizard dialog box in Spectrum Miner.

Modeling parameters: the modeling process executed by qsscorecardwizard is controlled by the parameter file you specify as a command-line argument. For example, with a file called parameters.ini containing:

[globals]
inputFocus=D:/Data/DirectBank/DirectBank.ftr
objectiveField=CardVisa
xmlReport=D:/Data/DirectBank/DirectBank.xml
[variable reduction]
maxVariables=5
[model development]
finalModel=D:/Data/DirectBank/DirectBank_auto.qssc
finalFocus=D:/Data/DirectBank/DirectBank_auto_3rightsized.ftr

the following command

qsscorecardwizard parameters.ini

would create a scorecard model DirectBank_auto.qssc along with a new focus DirectBank_auto_3rightsized.ftr and a model report DirectBank.xml.

The parameter file has a similar format to a Windows .ini file:

• Each section is indicated by a section heading enclosed in square brackets, e.g. [globals]. The four valid section names are listed below.

• Parameters within each section are defined by specifying the parameter name, an equal sign and the parameter value, e.g. maxBins = 20.

• Parameter names are not case-sensitive, and leading and trailing whitespace on each line is ignored.

• Blank lines and lines starting with the comment characters # or ; (along with optional leading whitespace) are ignored.
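The format rules above are simple enough to sketch as a small reader. The Python function below is a hypothetical illustration of those rules, not the parser that qsscorecardwizard uses: it lowercases parameter names to model case-insensitivity, keeps section names as written, and raises an error for lines it cannot interpret (an assumption, since error handling is not described here).

```python
def parse_parameters(text):
    """Parse .ini-style parameter text into {section: {name: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()  # leading and trailing whitespace is ignored
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            sections.setdefault(current, {})
            continue
        if current is None or "=" not in line:
            raise ValueError(f"cannot parse line: {raw!r}")
        name, value = line.split("=", 1)
        # Parameter names are not case-sensitive, so store them lowercased.
        sections[current][name.strip().lower()] = value.strip()
    return sections


if __name__ == "__main__":
    sample = """
    [globals]
      inputFocus = D:/Data/DirectBank/DirectBank.ftr
    # a comment
       ; another comment
    [optimized binning]
    maxBins = 20
    """
    print(parse_parameters(sample))
```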

There are default values for several of the parameters, which can be modified on a site-wide or per-user basis, but these are always overridden by parameters specified in the command-line .ini file. Specifying a parameter value in any of the following locations overrides any setting in a prior location (where <smhome> is the Spectrum Miner installation directory):

• The required standard defaults in <smhome>/ext/scorecardwizard/defaults.ini

• Optional site-wide defaults in <smhome>/ext/scorecardwizard/scorecardwizard.ini

• Optional per-user defaults in <smhome>/shared/users/<user>@<domain>/scorecardwizard.ini

• The required parameter file specified on the command line.
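The override order can be pictured as merging layers of settings, with later layers winning. The Python sketch below assumes each location has already been read into a {section: {parameter: value}} dictionary; the function name and sample values are hypothetical.

```python
def resolve_parameters(*layers):
    """Merge parameter layers; later layers override earlier ones.

    Pass layers in the documented order: standard defaults, site-wide
    defaults, per-user defaults, then the command-line parameter file.
    """
    merged = {}
    for layer in layers:
        for section, params in layer.items():
            merged.setdefault(section, {}).update(params)
    return merged


if __name__ == "__main__":
    standard = {"optimized binning": {"maxbins": "20", "pvalue": "0.001"}}
    site = {"optimized binning": {"maxbins": "30"}}
    user = {}
    command_line = {"optimized binning": {"maxbins": "10"}}
    print(resolve_parameters(standard, site, user, command_line))
```

A parameter set only in an earlier layer (pvalue here) survives, while one set in several layers takes its value from the last of them.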

Parameters in the [globals] section

Parameter
    Default
    Description

inputFocus
    Default: (required)
    The location of the input focus.

objectiveField
    Default: (none)
    The objective field name. Required unless the focus contains an objective interpretation. If set, this setting overrides any focus interpretation.

useFields
    Default: (none)
    A comma-separated list of analysis candidate fields. If set, any analysis candidate interpretations in the focus are ignored.

ignoreFields
    Default: (none)
    A comma-separated list of fields to exclude from the list of analysis candidates.

testTrainingField
    Default: (none)
    The field to use as the test-training indicator. The field must be a numeric binary field with 1 indicating training data, 0 indicating test. If not set, a split will be generated automatically.

trainingProportion
    Default: 0.5
    The proportion of records (between 0 and 1) to select for training, unless testTrainingField is specified.

randomSeed
    Default: (none)
    The random seed to use. If not set, a random seed will be generated automatically. You can replicate a previous run by using the seed reported by that run (in the "parameters" section).

verbose
    Default: false
    Set to true to output additional diagnostic information to the console log. This extra detail is always logged to the file log; see logFile.

logFile
    Default: (automatic)
    The location to log execution progress. If not specified, the log is created in <params>-<YYYY-mm-dd-HH-MM-SS>.log alongside the <params>.ini parameter file specified on the command line. (When launched via Spectrum Miner, the log is created in <smhome>/shared/logs/qsscorecardwizard-<user>-<YYYY-mm-dd-HH-MM-SS>.log.)

xmlReport
    Default: (automatic)
    The location to create the XML summary report. By default the report is saved at <params>_report.xml alongside the <params>.ini parameter file specified on the command line.

Parameters in the [optimized binning] section

doOptimizedBinning
    Default: true
    Set to false to skip optimized binning.

minBinSize
    Default: 0
    The minimum number of records that must appear in a non-null bin during optimized binning of non-categorical fields. Any bin (including null and categorical bins) with fewer records will trigger a warning in the final report.

maxBins
    Default: 20
    The maximum number of bins that should be created for non-categorical fields. The selected number of bins might be lower, based on the "turning-point" constraint (see below).

maxTurningPoints
    Default: 2
    The maximum number of changes of sign in the difference of the mean of the objective field between consecutive bins (excluding null, unclassified). A value of 0 forces the final binning to be monotone in the mean of the objective field. A value of maxBins (or greater) removes this constraint. Typically a value of 1 or 2 is reasonable.

maxQuality
    Default: 95
    The maximum "believable" Gini (or R^2) percentage. Fields with predictive power above this threshold will be excluded from further analysis. Note this check is always carried out (even when optimized binning is disabled), unless a value of 100 is specified.

pValue
    Default: 0.001
    The significance level that determines whether a categorical bin has a response rate significantly different from the overall population. The value is specified on a 0 to 1 scale, typically near 0. Note that a Bonferroni correction is also applied.

binnedFocus
    Default: (none)
    The location to save the interim focus after optimized binning, if desired.

Parameters in the [variable reduction] section

doVariableReduction
    Default: true
    Set to false to skip variable reduction (and use all analysis candidates in the model right-sizing phase).

maxVariables
    Default: 100
    The maximum number of analysis candidates to allow in the final model (typically 20 to 100).

reducedFocus
    Default: (none)
    The location to save the interim focus after variable reduction, if desired.

Parameters in the [model development] section

doModelDevelopment
    Default: true
    Set to false to skip model right-sizing.

modelType
    Default: response
    Specifies the type of model, for labeling purposes. Valid values are: response, risk, churn, satisfaction, dissatisfaction.

regressionMethod
    Default: logistic
    Specifies whether scorecard models should be fit with logistic regression or a (faster) linear approximation.

regressionLevel
    Default: field
    Specifies either field or bin level parameter estimation in the scorecard. Bin-level weighting can be more accurate on training data but is more likely to over-fit on test data.

rightSizeCriteria
    Default: min
    Specifies the right-sizing criterion used to choose the best number of fields. Valid values are: BIC for Bayesian Information Criterion, AIC for Akaike Information Criterion (typically weaker than BIC in the sense of allowing more fields), delta for diminishing returns in quality measure for binary outcomes (see minQualityIncreasePct below), min or max for the minimum (or maximum) of all recommended criteria (excluding delta in the continuous case).

minQualityIncreasePct
    Default: 0.5
    This is a parameter used by the delta form of rightSizeCriteria: the recommended model is the one with the most fields such that the worst field adds at least the specified amount to the test-set Gini (or R^2) percentage.

saveIntermediateModels
    Default: false
    Set to true to save intermediate models as .qssc and .qmml files; otherwise only the final recommendation will be saved.

predictionField
    Default: (automatic)
    Specify the name of the prediction field for the final model. If not set, a unique field name based on the model type is used, e.g. scorecardwizard__response.

finalFocus
    Default: (none)
    The location to save the focus containing the final analysis candidates and recommended model, if desired.

finalModel
    Default: (none)
    The location stem to save the final scorecard model file (in both .qssc and .qmml format), if desired.

runModelSnapshot
    Default: true
    If both finalModel and finalFocus are specified, and the option is true, a Model Snapshot (.qshtml file) is created alongside the finalModel files.
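The delta form of rightSizeCriteria can be illustrated with a short Python sketch. It is one reading of the rule stated for minQualityIncreasePct, under assumptions: test_quality is a hypothetical list whose entry i holds the test-set Gini (or R^2) percentage of the model with i+1 fields, the gain of the worst field in a k-field model is taken as the difference from the (k-1)-field model, and a one-field model is returned when no larger model qualifies.

```python
def delta_right_size(test_quality, min_quality_increase_pct=0.5):
    """Return the recommended number of fields under the delta rule.

    `test_quality[i]` is the test-set quality (%) of the model with
    i + 1 fields (hypothetical input format).
    """
    best = 1  # assumption: fall back to the one-field model
    for size in range(2, len(test_quality) + 1):
        # Gain contributed by the worst field of the `size`-field model.
        gain = test_quality[size - 1] - test_quality[size - 2]
        if gain >= min_quality_increase_pct:
            best = size  # keep the largest qualifying model
    return best


if __name__ == "__main__":
    quality = [40.0, 45.0, 47.0, 47.2, 47.3]
    print(delta_right_size(quality, min_quality_increase_pct=0.5))
```

In the sample data the fourth and fifth fields each add less than 0.5 percentage points, so the sketch recommends three fields.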

See also

About the Scorecard Wizard on page 114

qsdecisiontree

Synopsis qsdecisiontree -input <source focus> [-subfocus <subfocus>]

-build <decision-tree build specification> -result <decision-tree build report> [-output <destination focus>] [-force]

Description: use a decision-tree build specification to build a decision tree on the specified focus, and create a new decision-tree build report (containing decision-tree statistics, a decision-tree prediction score and a calibration crosstab) in an XML-based file format.

Optional arguments

Option
    Effect

-output <destination focus>
    Create this focus, containing an additional derived field that applies the new decision-tree prediction score to the input focus.

Besides these command-line options, qsdecisiontree accepts the options common to all data-build commands [see Standard command-line options on page 18].

Example: given the decision-tree build specification, decisiontree-specification.xml, with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<decisiontree xmlns="http://www.quadstone.com/xml">

</analysiscandidates>
<resultfield>

</specification>
</decisiontree>

use this specification to create a decision-tree build report decisiontree-report.xml, and create a new focus RetailCustAprilPredictedAge.ftr, containing the derived decision-tree prediction field PredictedAge:

qsdecisiontree -build decisiontree-specification.xml -input RetailCustApril.ftr -result decisiontree-report.xml -output RetailCustAprilPredictedAge.ftr

See also

Decision-tree build specification for qsdecisiontree on page 376

qsscorecard on page 122

qsscorecard

Synopsis qsscorecard -input <source focus> [-subfocus <subfocus>]

-build <scorecard build specification> -result <scorecard build report> [-output <destination focus>] [-force]

Description: use a scorecard build specification to build a scorecard on the specified focus, and create a new scorecard build report (containing scorecard statistics, a scorecard prediction score and a calibration crosstab) in an XML-based file format.

Optional arguments

Option
    Effect

-output <destination focus>
    Create this focus, containing an additional derived field that applies the new scorecard prediction score to the input focus.

Besides these command-line options, qsscorecard accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • For a binary objective, qsscorecard uses logistic regression; for a continuous objective, it uses linear regression.

Example

Given the scorecard build specification, scorecard-specification.xml, with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<scorecard xmlns="http://www.quadstone.com/xml">

</objectivefield>
<analysiscandidates>

</specification>
</scorecard>

use this specification to create a scorecard build report scorecard-report.xml, and create a new focus RetailCustAprilPredictedAge.ftr, containing the derived scorecard prediction field PredictedAge:

qsscorecard -build scorecard-specification.xml -input RetailCustApril.ftr -result scorecard-report.xml -output RetailCustAprilPredictedAge.ftr

See also

qsdecisiontree on page 120

Scorecard build specification for qsscorecard on page 379

About the Association Rule Wizard

The Association Rule Wizard automates discovery of association and sequencing rules. This is typically used to explore detailed customer transaction, basket or event data for applications such as:

Market basket analysis: identify combinations of products that commonly appear (or don't appear) together within a single purchase transaction (basket of items).

Best next product: determine the most common sequence in which products or services are purchased; for example, banking customers are more likely to open an investment/trading account after opening a high-interest savings account.

Marketing impact analysis: identify patterns of marketing communications that are more likely to occur before a purchase.

Data preparation: Association Rule Wizard runs directly against transaction or event data stored in a focus dataset, benefitting from Spectrum Miner's high-performance data access.

• For Market Basket analysis, you should use an item-level dataset containing one record per item. The dataset should contain a basket identifier field that groups items in the same basket, as well as a product identifier field (binned and categorical) that indicates the product name or code for each item. The dataset should be sorted by the basket identifier key field.

• For Next Product analysis, you should pick a dataset with one record per product or service. The dataset should contain a basket identifier (often the customer identifier) that groups products together, along with the product identifier (binned and categorical) and a product timestamp (integer or date) that indicates the order in which products were acquired. The dataset should be sorted by the basket identifier key field.

Algorithm parameters: Association Rule Wizard is based on the industry-standard apriori algorithm and includes a broad range of functionality to control rule discovery.

You can restrict rule discovery to particular products of interest, and control various parameters that define what an interesting rule is (support, confidence, rule complexity, extended rule selection measures). Note that these settings can impact execution time: in particular, reducing the minimum support value will increase run time unless a maximum rule complexity (number of antecedents) is also specified.

Terminology: Association Rule Wizard uses a variety of terms that might be unfamiliar, including:

Rule: in general, a rule asserts that if a set of conditions is satisfied (the left-hand side, or antecedent of the rule), then some other condition will occur (the right-hand side, or consequent of the rule). In association and sequencing rules, the conditions in the rule correspond to the presence of particular items in the basket. For example, a simple rule might be: if a basket contains bread, then it will also contain jam.

Antecedent: the left-hand side of a rule, i.e. the set of items that must appear in the basket for the rule to be relevant.

Consequent: the right-hand side of a rule, i.e. the set of items that the rule predicts will be present when the left-hand side (antecedent) is satisfied.

Support: the (absolute) support of an item set is the number of baskets that contain all items in the set. Relative support is the same number expressed as a percentage of the total number of baskets.

Rule Support: the support of a rule is the support of the rule's antecedent, i.e. the proportion of baskets for which the left-hand side of the rule applies. This measures how often the rule is relevant.

Rule Confidence: the confidence of a rule is the support of the set of all items that appear in the rule (on left or right), divided by the support of the rule's antecedent. It measures the likelihood that the rule's consequent (right-hand side) is true given that the antecedent of the rule is satisfied, i.e. the likelihood that the rule is correct.

Absolute Confidence Difference to Prior: the difference between the posterior and the prior confidence measures (with and without the antecedent items). This measures how much more (or less) likely the rule's consequent is when the left-hand side of the rule is true. For example, the rule "if bread, then jam" is not very interesting if its confidence is similar to that of the rule "if (anything) then jam".

Lift Value: the ratio of the posterior and the prior confidence measures (with and without the antecedent items).
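The measures defined above can be illustrated with a small worked example. The following Python sketch is not part of Spectrum Miner; the function names and the toy baskets are hypothetical, and it simply applies the definitions of support, rule support, confidence, confidence difference to prior, and lift to the rule "if bread, then jam".

```python
from typing import Dict, FrozenSet, Iterable, List

Basket = FrozenSet[str]


def support(baskets: List[Basket], items: Iterable[str]) -> int:
    """Absolute support: number of baskets that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for basket in baskets if wanted <= basket)


def rule_metrics(baskets: List[Basket], antecedent: Iterable[str],
                 consequent: Iterable[str]) -> Dict[str, float]:
    """Apply the terminology definitions to one rule (illustrative only)."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    total = len(baskets)
    antecedent_support = support(baskets, antecedent)
    if total == 0 or antecedent_support == 0:
        raise ValueError("rule is never relevant in this data")
    # Prior confidence: confidence of "if (anything) then consequent".
    prior = support(baskets, consequent) / total
    # Posterior confidence: support of all items in the rule / antecedent support.
    confidence = support(baskets, antecedent | consequent) / antecedent_support
    return {
        "rule_support": antecedent_support / total,
        "confidence": confidence,
        "confidence_diff": abs(confidence - prior),
        "lift": confidence / prior,
    }


# Hypothetical toy data: five baskets.
baskets = [frozenset(b) for b in (
    {"bread", "jam"},
    {"bread", "jam", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"jam"},
)]
metrics = rule_metrics(baskets, {"bread"}, {"jam"})
```

Here bread appears in 3 of 5 baskets and jam in 3 of 5, so the prior confidence of jam is 0.6 while the rule's confidence is 2/3; the lift is therefore only slightly above 1, which matches the remark that such a rule is not very interesting.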

Results: various outputs are (optionally) generated, including:

• An interactive HTML model report summarizing the rules created and their metrics, with the option to filter on interesting rules.

• A new scored dataset, aggregated to one record per basket identifier, indicating whether each selected rule is satisfied for that group of records.

• The rules as deployable code in Transaction Measurement Language format.

qsruleminer

Synopsis qsruleminer -ini <configuration file> -output <rules file> [-force]

Description: find association or sequencing rules.

Parameters: the configuration file controls how the rule mining process executes. Each line of the file can specify one of the options listed below. Lines starting with # or lacking an equals sign (=) are ignored. (Note that trailing comments are not supported.) A line starting with eof causes the rest of the file to be ignored.

For example, the following ini file would seek rules displaying lift as the additional measure with at least three and at most 10 items, with minimum confidence of 20% and minimum support of 5%, and minimum lift of 25%:

focus = C:/association/MarketBasket.ftr

task = market basket

key fields = TRANS_NO

product field = ProductName

options = -tr -el -d25 -s5.0 -c20.0 -n10 -m3
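The line-handling rules described above (ignore lines starting with # or lacking an equals sign, stop at a line starting with eof) can be sketched in Python as follows. This is an illustration only, not the parser used by qsruleminer: the function name is hypothetical, and trimming the whitespace around keys and values is an assumption.

```python
from typing import Dict


def parse_ruleminer_ini(text: str) -> Dict[str, str]:
    """Collect 'key = value' settings using the documented line rules."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        if line.startswith("eof"):
            break                      # the rest of the file is ignored
        if line.startswith("#") or "=" not in line:
            continue                   # comment line, or no equals sign
        # Split on the first '=' only, so everything after it is kept
        # (the options value is passed on as a whole).
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()   # assumption: whitespace trimmed
    return settings


example = """\
# market basket run
focus = C:/association/MarketBasket.ftr
task = market basket
key fields = TRANS_NO
product field = ProductName
options = -tr -el -d25 -s5.0 -c20.0 -n10 -m3
eof
task = segment
"""
settings = parse_ruleminer_ini(example)
```

In this sketch the final task line is never read because it follows the eof line, so the task remains market basket.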

focus (required): The location of the input focus.

task (required): Specifies the discovery task. Valid values are: segment, market basket, next product.

key fields (required): Specifies a key field or composite key (comma-separated list of fields) that define a basket. The input dataset should be in sorted order with respect to the key.

product field (required): Specifies the product field identifier. The field should be binned and categorical. Each bin defines a separate product (so a categorical binning hierarchy can be used to work at different levels of detail on the same underlying product data). Composite keys are not supported.

rhs (none): Comma-separated list of items that restricts the items that can appear on the right-hand side (consequent) of a rule. Note that no excess whitespace is permitted: the labels itemA, itemB and itemC need to match exactly the corresponding bin labels for the product field.

timestamp field (none): (next product task only) Specifies the field to use as the timestamp for each item. Must be integer or date type. Composite time-stamp keys are not supported.

segment fields (none): (segment task only) Specifies a comma-separated list of fields to use in the segment discovery task. Bin labels within the specified fields will correspond to items in the discovery task.

options (none): Pass everything after the equal sign directly to the apriori algorithm.

Supported apriori options include (see also http://www.borgelt.net/doc/apriori/apriori.html#options):

Option | Description

-tr | Discover association rules

-s<#> | Minimum support value (default: 10%)

Maximum support value: positive: percentage; negative: absolute number (default: 100%)

-c<#> | Minimum confidence value (default: 80%)

-m<#> | Minimum number of items per rule (default: 1)

-n<#> | Maximum number of items per rule (default: no limit)

-ex | Do not use an additional rule evaluation measure

-d<#> | Minimum value for the additional rule evaluation measure (default: 10%)

To specify the additional rule evaluation measure:

-ec | Rule Confidence

-ed | Absolute Confidence Difference to Prior

-el | Lift Value

-ea | Difference of Lift Value to 1

-eq | Difference of Lift Quotient to 1

-ev | Conviction

-ee | Difference of Conviction to 1

-er | Difference of Conviction Quotient to 1

-ef | Certainty Factor

-en | Normalized Chi Squared Measure

-ep | P-value from Chi Squared Measure

-ei | Information Difference to Prior

-eg | P-value from G Statistic

-eb | Binary Log of Support Quotient

10 - Commands for working with QMML files

In this section

qsqmmlview 129
qsqmmledit 129
qslt 132
qsqmml2sas 134
qsqmml2sql 134
qsqsfmtosas 135

qsqmmlview

Synopsis qsqmmlview -input <source QMML file>

[-prettyprint | -rulesets | -ruleset <ruleset>]

Description: display information about the source QMML file: show the contents as neatly-formatted XML (the -prettyprint option); show summary information for all the rulesets in the file (the -rulesets option); or show detailed information for a specified ruleset (the -ruleset option).

By default, qsqmmlview shows detailed information for all rulesets in the file.

Optional arguments: besides the required arguments, qsqmmlview accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • If you specify the -rulesets or the -ruleset option, and the QMML file is not of the qmml:rules form, qsqmmlview fails.

• If you use the -ruleset option, and the QMML file does not contain a ruleset with the specified name, qsqmmlview fails.

• For information on QMML rules and rulesets, see Quadstone Metadata Markup Language.

Examples

Check which (if any) of the rulesets in the QMML rules document model.qmml are live:

qsqmmlview -input model.qmml -rulesets

View the names and types of the input and output fields for the (compiled) ruleset OutcomeRules in the QMML rules document model.qmml:

qsqmmlview -input model.qmml -ruleset OutcomeRules

See also

qsqmmledit on page 129

qsqmmledit

Synopsis qsqmmledit -input <source QMML file> [-output <destination QMML file>] -addrule <rules file>

qsqmmledit -input <source QMML file> [-output <destination QMML file>] -addruleset <ruleset> <rule> [<rule> ...]

qsqmmledit -input <source QMML file> [-output <destination QMML file>] -deleteruleset <ruleset>

qsqmmledit -input <source QMML file> [-output <destination QMML file>] -parse

qsqmmledit -input <source QMML file> [-output <destination QMML file>] {-compile | -compile <ruleset> [<ruleset> ...]} [-l] [-s]

qsqmmledit -input <source QMML file> [-output <destination QMML file>] {-setlive | -setlive <ruleset> | -setlive <ruleset> true | -setlive <ruleset> false}

Description: modify the source QMML file (if the -output option is not used), performing one of the following operations:

• Add a rule from the rules file, which is either a standard XML file containing <field> elements, or a QMML FDL interchange file containing rules in <fdl-derivation> and <fdl-function> elements.

An unnamed rule is given the name Rule. If the source QMML file is not of the qmml:rules form, or already contains a rule with the same name as any of the rules in the rules file, qsqmmledit fails.

• Add a ruleset containing one or more rules (already present as <field> elements in the source QMML file).

If the source QMML file is not of the qmml:rules form, or already contains a ruleset with the specified ruleset name, or does not contain rules with all of the specified names, qsqmmledit fails.

• Delete a ruleset (but not the rules that it contains).

If the source QMML file is not of the qmml:rules form, or does not contain a ruleset with the specified name, qsqmmledit fails.

• Parse the file.

If you use File->Export as QMML in Decision Studio to create the source QMML file, Spectrum Miner parses the QMML file automatically. In this case, you do not need to use the -parse command. Otherwise, the -parse command is a prerequisite for compilation.

If the source QMML file is not of the qmml:rules form, qsqmmledit does nothing (unless you use the -s option, in which case it fails). If the source QMML file is of the qmml:rules form, qsqmmledit attempts to parse all unparsed rules. If a rule cannot be parsed, qsqmmledit produces a warning message and continues with the next rule (unless you use the -s option, in which case it fails).

• Compile all rulesets or the specified rulesets (a prerequisite for deployment).

If the source QMML file is not of the qmml:rules form, qsqmmledit does nothing (unless you use the -s option, in which case it fails). If a rule is not parsed, qsqmmledit fails.

• Mark all rulesets as live (ready for deployment), or mark a specified ruleset as live (true/default) or not (false).

If the source QMML file is not of the qmml:rules form, or does not contain a ruleset with the specified name, or the specified ruleset is not compiled, qsqmmledit fails.
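The ordering constraints among these operations (parse before compile, compile before marking live) can be pictured as a small state model. The Python sketch below is purely illustrative: the classes and method names are hypothetical, it does not read or write QMML, and it assumes that every rule parses successfully.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Ruleset:
    rules: List[str]
    compiled: bool = False
    live: bool = False


@dataclass
class RulesDocument:
    """Toy stand-in for a QMML rules document (not the real file format)."""
    parsed: Dict[str, bool] = field(default_factory=dict)  # rule name -> parsed yet?
    rulesets: Dict[str, Ruleset] = field(default_factory=dict)

    def add_ruleset(self, name: str, rule_names: List[str]) -> None:
        # Fails if the ruleset name is taken or a named rule is absent.
        if name in self.rulesets:
            raise ValueError(f"ruleset {name!r} already exists")
        missing = [r for r in rule_names if r not in self.parsed]
        if missing:
            raise ValueError(f"unknown rules: {missing}")
        self.rulesets[name] = Ruleset(list(rule_names))

    def parse(self) -> None:
        # Assumption: every rule parses successfully.
        for rule in list(self.parsed):
            self.parsed[rule] = True

    def compile_all(self, mark_live: bool = False) -> None:
        # Fails if any rule in a ruleset has not been parsed.
        for ruleset in self.rulesets.values():
            if not all(self.parsed[r] for r in ruleset.rules):
                raise RuntimeError("cannot compile: unparsed rule")
            ruleset.compiled = True
            if mark_live:  # models the effect of the -l option
                ruleset.live = True

    def set_live(self, name: str, live: bool = True) -> None:
        # Fails if the ruleset is missing or has not been compiled.
        ruleset = self.rulesets.get(name)
        if ruleset is None or not ruleset.compiled:
            raise RuntimeError(f"ruleset {name!r} is missing or not compiled")
        ruleset.live = live


doc = RulesDocument(parsed={"Outcome": False})
doc.add_ruleset("OutcomeRules", ["Outcome"])
doc.parse()
doc.compile_all()
doc.set_live("OutcomeRules")
```

Calling compile_all() before parse() in this model raises an error, mirroring the documented failure when a rule is not parsed.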

Optional arguments

Option | Effect

-l | As each ruleset is compiled, mark it as live.

-output <destination QMML file> | Instead of modifying the source QMML file, create the specified destination QMML file, and apply the requested changes to it.

-s | Treat the file strictly, disallowing empty rulesets. When used with the -parse option, fail if the source QMML file is not of the qmml:rules form, or if any unparsed rule in the source file cannot be parsed. When used with the -compile option, fail if the source QMML file is not of the qmml:rules form.

Besides these command-line options, qsqmmledit accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • For information on QMML rules and rulesets, see Quadstone Metadata Markup Language.

• A rule can only be parsed if it is supported by the RealTime variant of FDL.

Examples

Add rules from the QMML FDL interchange file model.xfdl to the QMML rules document model.qmml:

qsqmmledit -input model.qmml -addrule model.xfdl

Assign a rule Outcome in the QMML rules document model.qmml to a new ruleset OutcomeRules:

qsqmmledit -input model.qmml -addruleset OutcomeRules Outcome

Deploy the QMML rules document model.qmml, by parsing all embedded FDL, compiling all rulesets, and marking the ruleset OutcomeRules as live:

qsqmmledit -input model.qmml -parse
qsqmmledit -input model.qmml -compile
qsqmmledit -input model.qmml -setlive OutcomeRules

See also

qslt on page 132

qsqmmlview on page 129

qslt

Synopsis qslt {-input <source file> | -input -}

[-output <destination file> | -output -]

[[-source <source type>] [-target <destination type>] | -transform <named transformation>]

[-spec <specification file>]

Description: transform the specified source file or standard input (with -input -) to a different format, writing the result to standard output by default.

By default, qslt deduces the source and destination file types from the filename extensions (if present), and from the file contents in the case of an XML source file. It then attempts to find and apply a transformation that matches these types. If none exists, or more than one possible transformation exists, the command does nothing except issue an error message and fail.

The standard transformations matching source and destination file types are:

• From fdl (an FDL expression, or a set of TML create statements) to qmml:expressions (fully parsed FDL in QMML)

• From fdl (an FDL expression, or a set of TML create statements) to qmml:x-fdl (QMML FDL interchange format) and vice versa

• From qmml:expressions (fully parsed FDL in QMML) to qmml:x-fdl (QMML FDL interchange format)

• From qmml:rules (QMML rules format) to fdd (flat-data description format)

The remaining standard transformation is the named transformation prettyprint, which transforms arbitrary XML into neatly-formatted XML (with line breaks and indentation). You can only choose it by using the -transform option.
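The selection rule described above (apply a transformation only when exactly one matches the source and destination types) can be sketched in Python. This is a hypothetical model, not qslt's implementation: the table holds only the standard type pairs listed above, the function name is invented for illustration, and type deduction from filenames or file contents is left out.

```python
from typing import Sequence, Tuple

Transformation = Tuple[str, str]  # (source type, destination type)

# The standard transformations listed above, as (source, destination) pairs.
STANDARD_TRANSFORMATIONS: Tuple[Transformation, ...] = (
    ("fdl", "qmml:expressions"),
    ("fdl", "qmml:x-fdl"),
    ("qmml:x-fdl", "fdl"),
    ("qmml:expressions", "qmml:x-fdl"),
    ("qmml:rules", "fdd"),
)


def select_transformation(
    source: str,
    target: str,
    table: Sequence[Transformation] = STANDARD_TRANSFORMATIONS,
) -> Transformation:
    """Return the single matching transformation, or fail if none or several match."""
    matches = [entry for entry in table if entry == (source, target)]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one transformation from {source} to {target}, "
            f"found {len(matches)}"
        )
    return matches[0]


chosen = select_transformation("fdl", "qmml:x-fdl")
```

Under this model a request from fdd to qmml:rules fails against the standard table, because no standard pair covers that direction.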

Optional arguments

Option | Effect

-output <destination file> | Write the transformed data to this destination file (overwriting any existing file of the same name).

-output - | Write the transformed data to standard output (the default if you do not specify a -output option).

-source <source type> | Consider only transformations matching this source type (overriding a deduced source type, if any). You must specify this option if qslt is reading from standard input.

-spec <specification file> | Locate the required transformation in this file, ignoring the standard transformations and any system-wide or user-specified customizations.

-target <destination type> | Consider only transformations matching this destination type (overriding a deduced destination type, if any). You must specify this option if qslt is writing to standard output.

-transform <named transformation> | Use the named transformation instead of selecting a transformation on the basis of source and destination types.

Besides these command-line options, qslt accepts the options common to all data-build commands [see Standard command-line options on page 18].

Note: • When converting between the fdl type and QMML, TML create statements correspond to named field derivations in QMML, while single FDL expressions correspond to unnamed field derivations in QMML.

• The conversion from qmml:rules to the fdd type omits derived fields.

• As well as the standard transformation types, you can define custom transformations, on a system-wide or per-user basis. Custom transformations can accept additional command-line options.

• For details of the QMML forms qmml:expressions, qmml:rules, and qmml:x-fdl, see Quadstone Metadata Markup Language.

Examples

Transform the field metadata represented by the FDD file model.fdd into an "empty" QMML rules document model.qmml (describing a set of fields, but containing no rules), using a user-specified transformation described in the file ext/qmml/qslt.xml:

qslt -source fdd -target qmml:rules -input model.fdd -output model.qmml -spec ext/qmml/qslt.xml

Transform an FDL expression in the file model.fdl to QMML FDL interchange format, writing the result to the file model.xfdl:

qslt -source fdl -target qmml:x-fdl -input model.fdl -output model.xfdl

See also

XML in Spectrum Miner on page 356

qsqmmledit on page 129

qsqmml2sas

Synopsis qsqmml2sas <QMML file>

Description: convert a QMML file to a SAS file that has a separate SAS statement for each field derivation.

Note: • The command is experimental.

• You can access most of the functionality of the data-build command qsqmml2sas by using the Convert to SAS dialog box in Spectrum Miner.

Example

Convert model.qmml to model.sas:

qsqmml2sas model.qmml

qsqmml2sql

Synopsis qsqmml2sql <QMML file>

{Oracle | Teradata V2Rx | MS SQL Server}

[<field> [, <field> ... ]]

Description: use a supported database type and a source table to convert a QMML file to an SQL file that has a separate SQL SELECT statement for each field derivation.

Optional arguments

Option | Effect

<field> [, <field> ... ] | Converts only derived fields.

Note: • The command is experimental.

• You can access most of the functionality of the data-build command qsqmml2sql by using the Convert to SQL dialog box in Spectrum Miner.

Example

Convert model.qmml to model.sql, create SELECT statements for Microsoft SQL Server, and use prod.customer as the source table for any scoring.

qsqmml2sql model.qmml "MS SQL Server" prod.customer

qsqsfmtosas

Synopsis qsqsfmtosas <metadata file>

Description: convert derivations from a metadata file to a SAS file that has a separate SAS statement for each field derivation.

Note: • The command is experimental.

• You can access most of the functionality of the data-build command qsqsfmtosas by using the Convert to SAS dialog box in Spectrum Miner.

Example

Convert metadata.qsfm to metadata.sas:

qsqsfmtosas metadata.qsfm

11 - Other commands

In this section

qsmapgen 137

qsmapgen

Synopsis qsmapgen

-input <categorical hierarchy> [-input <categorical hierarchy> ...]

-name <map name> -output <directory> [-overwrite]

[-style radial_drill | -style radial_layers]

Description: from one or more categorical hierarchy files (typically with filename extension .hrc), create a representation of the categorical hierarchies suitable for use in the Decision Studio Map Viewer, with a region corresponding to each category. Create the map or maps in the specified directory, using the specified map name as the basis for filenames.

Optional arguments

Option | Effect

-overwrite | Allow new map files specified using the -name and -output options to overwrite existing files. Without the -overwrite option, if any of the map files to be created already exists, the command does nothing (except issue a warning).

-style radial_drill | Create a hierarchy of maps suitable for drilling down in the Decision Studio Map Viewer, with the top-level map (shown initially) based on the coarsest-grained binning in each categorical binning hierarchy (the default if you do not specify a -style option).

-style radial_layers | Create an independent map for each level in the categorical binning hierarchies, so that a Decision Studio user can choose the degree of coarseness at which to view fields in the Map Viewer, though without the option to drill down.

Besides these command-line options, qsmapgen accepts the options common to all data-build commands [see Standard command-line options on page 18].

• If a categorical hierarchy file does not specify a correct hierarchy, qsmapgen fails.

• You should not use a field value in more than one categorical hierarchy file. If you do, the behavior of qsmapgen is undefined.

• If you specify a category name in more than one categorical hierarchy file, or in more than one level of the same file, qsmapgen adds one or more suffixes to the region names to ensure uniqueness.

• For a map produced using -style radial_drill (or no specified -style option), the top-level map consists of .mif and .mid files, named using the specified map name. Drill-down maps for the next level of detail are contained in the next subdirectory. Maps at the subsequent level of detail (if any) are contained in the next subdirectory of that directory, and so on.

• The maps produced using -style radial_layers consist of pairs of .mif and .mid files, each named using the given map name together with a numeric suffix corresponding to the level within the hierarchy.

• Using Spectrum Miner, you can access most of the functionality of qsmapgen through the Create Category Map dialog box.

Examples: given the categorical hierarchy file Referrer.hrc, containing the following:

B, Banner Ad, Marketing

D, Direct, Direct

E, E-mail Link, Marketing

G, Google, Search Engine

M, MSN, Search Engine

Y, Yahoo, Search Engine

L, Lycos, Search Engine

and the categorical hierarchy file PurchasePage.hrc, containing the following:

01, Menswear

02, Womens

03, Kids

04, Home

create a map file clickthru.mif (and supporting files) suitable for drilling down, in a new directory webmaps:

qsmapgen -input Referrer.hrc -input PurchasePage.hrc -output webmaps -name clickthru

The top-level map has the following regions (where "spikes" on the circumference indicate regionsthat support drilldown to the next level):


Drilldown on the "Search Engine" region produces a map with the following regions:

Alternatively, create independent map files clickthru_layer_1.mif and clickthru_layer_2.mif (and supporting files) in the directory webmaps:

qsmapgen -input Referrer.hrc -input PurchasePage.hrc -output webmaps -name clickthru_layer -style radial_layers

The clickthru_layer_1.mif map has the following regions:


and the clickthru_layer_2.mif map has the following regions:


12 - Transaction Measurement Language

In this section

About Transaction Measurement Language 142
TML syntax 142
Reserved words in TML 143

About Transaction Measurement Language

Transaction Measurement Language (TML) is a collection of lightweight syntactic wrappers around FDL expressions, used to specify field derivations and aggregations in conjunction with the Derive Fields and Aggregate Records wizards, and the Spectrum Miner data-build commands qsderive, qsmeasure, qstrack and qsselect.

See also

About Field Derivation Language on page 179

qsderive on page 60

qsmeasure on page 63

qsselect on page 70

qstrack on page 67

TML syntax on page 142

TML syntax

A derivations, trackers, selections, or aggregations file contains one or more TML create statements,while a statistics file contains one or more TML calculate statements.

• TML is a case-sensitive language:

• TML keywords (create, by, string etc.) are written in lowercase letters, with the exception of STATISTIC.

• The names of all aggregation functions [see Aggregation functions for measurements and derivations on page 156] are written in lowercase letters.

• The names Average, average, and AVERAGE all refer to different fields or statistics.

• Names for fields and statistics must begin with a letter ("A" to "Z" or "a" to "z"), contain only letters, digits, and underscore ("_"), and be no longer than 128 characters; they can optionally be quoted in single quotation marks, for example, 'CustomerID'. You must enclose the name of a field or statistic in single quotation marks if it coincides with a TML or FDL reserved word, or differs from one only in case.

• TML syntax is described using railroad diagrams like this:


As long as you follow the arrows and don't backtrack, any path through a railroad diagram leadsto syntactically valid (though not necessarily meaningful) TML:

• Words in oval/circular boxes (like create) are TML keywords; you should type them exactly as they appear.

• You should also type punctuation in oval boxes (like ":" and ":=") exactly as it appears (you can't have a space between ":" and "=" in ":=").

• Don't forget the semicolon that ends each TML statement.

• Terms in rectangular boxes (like expression) refer to other railroad diagrams or to forms described elsewhere in words; you should not type these literally.

• TML is a freely formatted language: you can break lines or insert extra spaces between syntactic elements (in oval/circular boxes in railroad diagrams) without affecting the interpretation of a derivations, trackers, selections, aggregations, or statistics file.

• As well as introducing spaces between syntactic elements, you can include comments, which begin with a double slash ("//") and continue to the end of the same line.

• Wherever TML syntax requires an expression, you can use any FDL expression [see Expressions on page 183], except that an expression involving a TML reserved word, a semicolon, or braces {...} must be enclosed in parentheses (...).

See also

About Transaction Measurement Language on page 142

Evaluating focus statistics: the calculate statement on page 153

Field definition: the create statement on page 146

Reserved words in FDL on page 206

Reserved words in TML on page 143

Reserved words in TML

The following tokens are reserved words in TML:

aggregation, all, and, as, auto, automatic, bin, binning, bool, boolean, by, calculate, constant, create, crosstab, date, default, description, dimension, double, drl, enum, except, fields, file, float, format, from, full, function, group, include, inplace, input, int, integer, join, left, long, merge, meta, modifiable, normal, on, open, outer, outfile, output, parameter, pragma, real, ref, reference, rename, right, separate_functions, short, show, show_empty_segments, show_summaries, sort, state, statistic, strbytes, string, suppress, table, temp, temporary, track, tracker, view, where

To use a reserved word, or a word that differs only in case from a reserved word, as a field or statistic name in TML, or as an identifier [see Expressions on page 183] in FDL, you must enclose it in single quotation marks, for example, 'Date'.

See also

13 - TML statements

In this section

Field definition: the create statement 146
Using aggregation functions and the where and default clauses 148
Splitting aggregations: the by clause 150
Evaluating focus statistics: the calculate statement 153

Field definition: the create statement

The create statement defines a field by derivation or aggregation, depending on the context inwhich it appears.

Derivations: the derivations file for qsderive, or the equivalent Derive Fields window, the trackers file for qstrack, and the selections file for qsselect must contain create statements of the following form:

The aggregations file for qsmeasure, or the equivalent Aggregate Records window can optionally contain create statements of the following form:

In both cases, the create statement defines a field called name in terms of an FDL expression [see Expressions on page 183]. The datatype of the field is the datatype of the expression, unless you override it by specifying a compatible type [see Datatypes on page 181].

• For each successive create statement in the derivations file, qsderive, or the Derive Fields window derives a field in the output focus using the given expression, which can refer both to fields in the source focus and to fields derived earlier in the derivations file.

• For each successive create statement in the trackers file, qstrack derives a field using the given expression, which can refer to fields in the source focus and to fields derived earlier in the trackers file, as well as to focus statistics defined in any statistics file. The expression typically involves state variables [see Variables on page 187].

• Given a selections file, qsselect derives a field corresponding by default to the first create statement and ignores any other statements in the file. The defining expression must be of boolean (or compatible) type.

• For each successive create statement of this form in the aggregations file, qsmeasure, or the Aggregate Records window derives a field in the output focus using the given expression, which can refer to fields derived earlier in the aggregations file, and to focus statistics defined in any statistics file.

• A temporary statement has the same effect as a create statement, except that the field doesn't appear in the output focus. However, you can still refer to the temporary field in subsequent expressions in the aggregations file.

Aggregations: the aggregations file for qsmeasure or the equivalent Aggregate Records window can contain create statements of the following form:


For each create statement of this form in the aggregations file, qsmeasure or the Aggregate Records window effectively derives a virtual field corresponding to the expression following the aggregation function. This expression can refer to fields in the source focus.

After deriving the virtual field, qsmeasure or the Aggregate Records window applies the aggregation function [see Using aggregation functions and the where and default clauses on page 148].

The default datatype of the resulting field(s) depends on the aggregation function and the values, operators, and functions involved in the expression to which it applies, but you can override it by specifying a compatible type.

A temporary statement has the same effect as a create statement, except that the field doesn't appear in the output focus. However, you can still refer to the temporary field in subsequent expressions in the aggregations file.

Note: • In qsderive, or Derive Fields derivations files, qsselect selections files, and qstrack trackers files, the expression can involve field statistics, global variables, and functions defined in terms of these. These FDL features are not available in a qsmeasure or Aggregate Records aggregations file.

• You must put extra parentheses around the expression if it contains a semicolon, braces, or a TML reserved word [see Reserved words in TML on page 143].

• Each create statement must start on a new line.

Examples

Given the derivations file, derivations-customer.tml, containing the following:

// Length of standing of customer in months
create CustMonths := countwholemonths(StartDate, today());
// A key field of extended length
create CustomerID_24 : string(24) := CustomerID;

Apply these derivations to the focus RetailCustApril.ftr to create a new focus RetailCustAprilDeriv.ftr:

qsderive -derivations derivations-customer.tml -input RetailCustApril.ftr -output RetailCustAprilDeriv.ftr

Given the selections file, selections-customer.tml, containing the following:

// Select male customers aged between 18 and 65 inclusive.
create Selection := Gender = 1 and Age >= 18 and Age <= 65;

Apply this selection to the focus RetailCustApril.ftr to create a new focus RetailCustAprilSelection.ftr:

qsselect -selections selections-customer.tml -input RetailCustApril.ftr -output RetailCustAprilSelection.ftr

Given the aggregations file, aggregations-mostRecent.tml, containing the following:

temporary LastPurchase := max(PurchaseDate);
create AvgTransVal := mean(Amount - 0.01 * PointsRedeemed);
create NumTrans := count();
create MonthsSinceLastPurchase :=
    countwholemonths(LastPurchase, today());

Apply these aggregations to the focus RetailTransApril.ftr to create a customer-oriented focus RetailAggRecentApril:

qsmeasure -aggregations aggregations-mostRecent.tml -input RetailTransAprilSorted.ftr -output RetailAggRecentApril -keys CustomerID

See also

Aggregation functions for measurements and derivations on page 156

qsderive on page 60

qsmeasure on page 63

qsselect on page 70

qstrack on page 67

Splitting aggregations: the by clause on page 150

Using aggregation functions and the where and default clauses

You can aggregate records using qsmeasure or the equivalent Aggregate Records window.

When you use a create statement in an aggregations file [see Field definition: the create statement on page 146], qsmeasure or the Aggregate Records window effectively derives a virtual field corresponding to the expression in parentheses following the aggregation function.

Conceptually, qsmeasure or the Aggregate Records window divides the records in the source focus (augmented by the virtual field) into groups of records with the same key-field value. For each group of records (corresponding to a single key-field value), qsmeasure or the Aggregate Records window applies the aggregation function to the values of the virtual field within those records, to produce a single result value.

The result values for all the groups taken together constitute a new field, with the name given in the create statement, unless the create statement contains a by clause [see Splitting aggregations: the by clause on page 150].

Within each group of records with a common key-field value, certain aggregation functions ignore records for which the virtual field contains null [see The null value on page 181].

If the create statement contains a where clause, the aggregation function ignores records for which the associated expression evaluates to 0 ("false"). This expression can refer to fields in the source focus.

Because aggregation functions can ignore some of the records in a group with a common key-field value, it's possible for all the records in the group to be skipped. Many aggregation functions return the null value in this case. By including a default clause, you can specify a constant expression to be evaluated instead of the aggregation in the event that it returns null.

When you use the sum aggregation function [see sum (one argument) on page 175], you should usually specify a default sum of 0. If you don't use a default clause, the sum over an empty record selection is null, which is unlikely to be what you want.
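The interaction of grouping, the where clause, and the default clause can be sketched outside TML. The following Python fragment is only an illustration of the behavior described above, not how qsmeasure is implemented; the function name aggregate_sum, the record layout, and the use of None to stand for null are assumptions made for this sketch.

```python
from collections import defaultdict


def aggregate_sum(records, key, field, where=None, default=None):
    """Sum `field` per `key` value, skipping nulls (None) and records rejected by `where`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]]  # touch the key so that every group appears in the result
        if where is not None and not where(rec):
            continue  # the where clause filters this record out
        if rec[field] is not None:
            groups[rec[key]].append(rec[field])
    # A group whose records were all skipped yields null unless a default is given.
    return {k: (sum(v) if v else default) for k, v in groups.items()}


def in_store_800(rec):
    return rec["Store"] == "0800"


# Hypothetical transaction records, keyed by CustomerID.
transactions = [
    {"CustomerID": "C1", "Store": "0800", "Amount": 10.0},
    {"CustomerID": "C1", "Store": "0100", "Amount": 5.0},
    {"CustomerID": "C2", "Store": "0100", "Amount": 7.5},
]

without_default = aggregate_sum(transactions, "CustomerID", "Amount", where=in_store_800)
with_default = aggregate_sum(transactions, "CustomerID", "Amount", where=in_store_800, default=0)
```

Customer C2 has no transactions in store 0800, so the first call leaves that customer's sum as null (None), while the second call substitutes 0.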

Examples

Given the aggregations file, aggregations-purchaseSummary.tml, containing the following:

create firstPurchaseDate := first(PurchaseDate);
create commonPaymentMethod := mode(PaymentMethod);
create Spend := sum(Amount) default 0;
create SpendInStore800 := sum(Amount)
    where Store = "0800" default 0;
create proportionSpendInStore800 := SpendInStore800 / Spend;
create NumTransCRRedeemed := count()
    where PaymentMethod = "CR" and not isnull(PointsRedeemed);

Apply these aggregations to the focus RetailTransApril.ftr to create a customer-oriented focus RetailAggSummaryApril:

qsmeasure -aggregations aggregations-purchaseSummary.tml -input RetailTransAprilSorted.ftr -output RetailAggSummaryApril -keys CustomerID

See also

Expressions on page 183

Splitting aggregations: the by clause

The aggregations file for qsmeasure, or the equivalent Aggregate Records window, contains a sequence of create statements [see Field definition: the create statement on page 146]. As part of such a create statement, you can optionally include a by clause, to break your aggregation down according to the application of either a split function or the bin function to an expression derived from other fields.

For each possible integer i returned by the split function or bin function, qsmeasure or the Aggregate Records window creates a separate field, containing values aggregated from just those records for which the function application evaluates to i. The name of each of these related fields is formed from the name in the create statement together with a suffix that depends on the function used for splitting the aggregation.

• The following built-in split functions are available (with given field-name suffixes corresponding to successive function values):

Function      Field-name suffixes
day           "_1", "_2", "_3", ..., "_31"
dayofweek     "Sun", "Mon", "Tue", ..., "Sat"
hour          "0000", "0100", "0200", ..., "2300"
isnull        "NonNull", "Null"
match         "MisMatch", "Match"
member        "NotMember", "Member"
minute        "0", "1", "2", ..., "59"
month         "Jan", "Feb", "Mar", ..., "Dec"
second        "0", "1", "2", ..., "59"
sgn           "Negative", "Zero", "Positive"
strmember     "NotMember", "Member"
weekofyear    "_1", "_2", "_3", ..., "_53"

• You can use the bin function to apply an FDL binning to an expression. Used as a split function, the bin function gives rise to field-name suffixes "_1", "_2", ..., "_n", "_null", and "_other" (where n depends on the number of bins in the binning).

• You can also define your own split function [see User-defined functions on page 190]. The element_names attribute defines the field-name suffixes corresponding to the function return values.
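The way a by clause fans one create statement out into several suffixed fields can be illustrated with a short Python sketch. It mimics only the naming scheme of the dayofweek split for a single group of records; the helper name count_by_dayofweek is hypothetical, and the middle suffixes "Wed", "Thu", and "Fri" are assumed, since the table above elides them.

```python
from collections import Counter
from datetime import date

# Suffixes in the order given for dayofweek; Wed/Thu/Fri are assumed values.
SUFFIXES = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def count_by_dayofweek(base_name, dates):
    """Return one count field per weekday, named base_name + suffix."""
    counts = Counter()
    for d in dates:
        # date.weekday() is 0 for Monday, so shift to make index 0 mean Sunday.
        counts[SUFFIXES[(d.weekday() + 1) % 7]] += 1
    # Every suffix produces a field, even when no record falls on that day.
    return {base_name + suffix: counts[suffix] for suffix in SUFFIXES}


purchase_dates = [date(2024, 4, 1), date(2024, 4, 7), date(2024, 4, 8)]
fields = count_by_dayofweek("numberPurchasesOn_", purchase_dates)
```

Ending the base name with an underscore, as the examples in this section do, keeps the generated field names readable (numberPurchasesOn_Mon rather than numberPurchasesOnMon).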

Examples

Given the aggregations file, aggregations-builtinSplit.tml, containing the following:

create numberPurchases := count();
create numberPurchasesOn_ := count() by dayofweek(PurchaseDate);
create proportionPurchases_Mon :=
    numberPurchasesOn_Mon / numberPurchases;
create purchaseAmountIn_ :=
    sum(Amount) by month(PurchaseDate) default 0;

Apply these aggregations to the focus RetailTransApril.ftr to create a customer-oriented focus RetailDateSplitsApril.ftr containing fields CustomerID, numberPurchases, numberPurchasesOn_Sun, numberPurchasesOn_Mon, ..., numberPurchasesOn_Sat, proportionPurchases_Mon and purchaseAmountIn_Jan, purchaseAmountIn_Feb, ..., purchaseAmountIn_Dec:

qsmeasure -aggregations aggregations-builtinSplit.tml -keys CustomerID -input RetailTransApril.ftr -output RetailDateSplitsApril.ftr

Given the aggregations file, aggregations-binSplit.tml, containing the following:

create numberPurchasesAmountBand :=
    count() by bin(EqualRange(20,100,4), Amount);
create numberPurchasesDateBand :=
    count() by bin(PrePost(todate(19990401), todate(19990501), todate(19990601)), PurchaseDate);
create ratioPrePost :=
    numberPurchasesDateBand_3 / numberPurchasesDateBand_2;

Apply these aggregations to the RetailTransApril.ftr focus to create a customer-oriented focus, RetailRangeSplitsApril.ftr, containing fields CustomerID, numberPurchasesAmountBand_1, numberPurchasesAmountBand_2, ..., numberPurchasesAmountBand_6, numberPurchasesAmountBand_null, numberPurchasesAmountBand_other (corresponding to two end bins, four internal bins, and the null and unclassified bins), numberPurchasesDateBand_1, numberPurchasesDateBand_2, ..., numberPurchasesDateBand_4, numberPurchasesDateBand_null, numberPurchasesDateBand_other and ratioPrePost:

qsmeasure -aggregations aggregations-binSplit.tml -keys CustomerID -input RetailTransApril.ftr -output RetailRangeSplitsApril.ftr

Given the FDL functions file, fdl-functions-storeGroups.fdl, containing the following:

function StoreGroupFunction( Store )
[
    element_names = "North, South, East, West"
]
{
    case
        Store = "0000" or Store = "0800" : 1;
        Store = "0300" or Store = "0600" or Store = "0700" : 2;
        Store = "0100" or Store = "0400" : 3;
        default : 4;
}

Given the aggregations file, aggregations-userDefinedSplit.tml, containing the following:

create totalPurchases := sum(Amount);
create totalPurchasesByStore_ :=
    sum(Amount) by StoreGroupFunction( Store );
create proportionPurchases_North :=
    totalPurchasesByStore_North / totalPurchases;

Apply these aggregations to the RetailTransApril.ftr focus to create a customer-oriented focus, RetailGroupSplitsApril.ftr, containing fields CustomerID, totalPurchases, totalPurchasesByStore_North, totalPurchasesByStore_South, totalPurchasesByStore_East, totalPurchasesByStore_West and proportionPurchases_North:

qsmeasure -aggregations aggregations-userDefinedSplit.tml -library fdl-functions-storeGroups.fdl -keys CustomerID -input RetailTransApril.ftr -output RetailGroupSplitsApril.ftr

See also

Using aggregation functions and the where and default clauses on page 148

Evaluating focus statistics: the calculate statement

The statistics file for qsmeasure, the equivalent Aggregate Records window, or qstrack contains calculate statements of the following form:

The calculate statement defines a focus statistic called name in terms of an aggregation function applied to an FDL expression [see Expressions on page 183], which can refer to fields in the source focus.

As in the case of a measurement [see Using aggregation functions and the where and default clauses on page 148], qsmeasure, qstrack, or the Aggregate Records window effectively derives a virtual field corresponding to the expression in parentheses following the aggregation function.

The calculate statement gives rise to a single result value, obtained by applying the aggregation function to all the values in the virtual field. (In contrast, a create statement in an aggregations file gives rise to a field, in which each value is obtained by applying the aggregation function to the values in the virtual field for a group of records.)

The default datatype of a focus statistic depends on the aggregation function and the values, operators, and functions involved in the expression to which it applies, but you can override it by specifying a compatible type [see Datatypes on page 181].

If the calculate statement contains a where clause, the aggregation function ignores records for which the associated expression evaluates to 0 ("false"). This expression can refer to fields in the source focus.

Once you have defined a focus statistic in the statistics file, you can refer to it in expressions in the aggregations or trackers file using the special form STATISTIC.name. You can use this form anywhere that you can refer to a field in the source focus or trackers file.
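The difference between a focus statistic (one value for the whole focus) and an aggregated field (one value per key) can be sketched in Python. This is only an illustration of the concept, with made-up data; the variable names echo the TML example in this section but the code itself is not part of Spectrum Miner.

```python
from statistics import mean

# Hypothetical (CustomerID, Amount) pairs standing in for the source focus.
amounts = [("C1", 10.0), ("C1", 30.0), ("C2", 5.0), ("C2", 15.0), ("C3", 40.0)]

# Like "calculate averageAmount := mean(Amount);": a single value over all records.
average_amount = mean(a for _, a in amounts)

# Like "create averageSpend := mean(Amount);": one value per key-field value.
by_customer = {}
for customer, amount in amounts:
    by_customer.setdefault(customer, []).append(amount)
average_spend = {c: mean(v) for c, v in by_customer.items()}

# Like "create bigSpender := averageSpend > STATISTIC.averageAmount;":
# the per-group value is compared with the focus-wide statistic (1 = true, 0 = false).
big_spender = {c: int(s > average_amount) for c, s in average_spend.items()}
```

Here only customer C3 spends more on average than the focus-wide mean of 20.0, so only C3 is flagged.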

Examples

Given the statistics file, statistics-amount.tml, containing the following:

calculate averageAmount := mean(Amount);

Given the aggregations file, aggregations-statistic.tml, containing the following:

create averageSpend := mean(Amount);
create bigSpender := averageSpend > STATISTIC.averageAmount;

Apply these aggregations to the focus RetailTransApril.ftr to create a customer-oriented focus RetailAprilBigSpenders.ftr containing fields CustomerID, averageSpend, and bigSpender:

qsmeasure -statistics statistics-amount.tml -aggregations aggregations-statistic.tml -keys CustomerID -input RetailTransApril.ftr -output RetailAprilBigSpenders.ftr

See also

TML statements

14 - Aggregation functions

In this section

Aggregation functions for measurements and derivations 156
any 159
confintlower 160
confintupper 160
count 161
countnonnull/countnonnulls 162
countnull/countnulls 163
countunique 164
countuniquenonnull 164
first 165
last 166
max (one argument) 167
mean (one argument) 168
median 168
min (one argument) 169
mode 170
moderatio 171
percentage 172
percentagerate 172
segindex 173
significance 174
stdev 175
sum (one argument) 175
variance 176

Aggregation functions for measurements and derivations

There are a number of aggregation functions available for use in measurements and derivations.

To use a function in a measurement context means to use it in an aggregations file [see Using aggregation functions and the where and default clauses on page 148] or statistics file [see Evaluating focus statistics: the calculate statement on page 153] for qsmeasure or the equivalent Aggregate Records window, or in a statistics file for qstrack.

To use a function in a derivation context means to use it in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.

Except where otherwise noted, an aggregation function takes a single numeric, string, or date field as an argument.

Aggregation functions for both measurements and derivations

You can use the following aggregation functions in both measurement and derivation contexts:

Function                                      Result
count                                         The number of records in the group/segment. (This function does not take an argument.)
max [see max (one argument) on page 167]      The maximum non-null value in the field for the group/segment. (In a derivation context, this function does not accept a string-valued expression as an argument.)
mean [see mean (one argument) on page 168]    The mean (common average) of the non-null values in the field for the group/segment. (This function requires a numeric expression as an argument.)
min [see min (one argument) on page 169]      The minimum non-null value in the field for the group/segment. (In a derivation context, this function does not accept a string-valued expression as an argument.)
stdev                                         The standard deviation of the non-null values in the field for the group/segment
sum [see sum (one argument) on page 175]      The sum of the non-null values in the field for the group/segment
variance                                      The variance of the non-null values in the field for the group/segment

Aggregation functions for measurements only

You can use the following aggregation functions in a measurement context only:

Function                                                     Result
any                                                          A non-null value in the field, drawn from an arbitrary non-null record in the group/segment
confintlower                                                 The lower bound of a 95% confidence interval for the mean of the field in the group
confintupper                                                 The upper bound of a 95% confidence interval for the mean of the field in the group
countnonnull [see countnonnull/countnonnulls on page 162]    The number of records in the group for which the field has non-null values [see The null value on page 181]. In derivations, use countnonnulls instead.
countnull [see countnull/countnulls on page 163]             The number of records in the group for which the field contains null [see The null value on page 181]. In derivations, use countnulls instead.
countunique                                                  The number of unique values that occur in the field, potentially including the null value
countuniquenonnull                                           The number of unique values, excluding the null value, that occur in the field
first                                                        The first non-null value in the field for the group
last                                                         The last non-null value in the field for the group
median                                                       The median (middle value) of the non-null values in the field for the group. (This function requires a numeric or string-valued expression as an argument.)
mode                                                         The mode (most-common value) of the non-null values in the field for the group
moderatio                                                    The proportion of non-null records in the group that contain the most-common non-null value in the field

Aggregation functions for derivations only

You can use the following aggregation functions in a derivation context only:

Function                                                      Result
countnonnulls [see countnonnull/countnonnulls on page 162]    The number of records in the segment for which the field has non-null values [see The null value on page 181]. In measurements, use countnonnull instead.
countnulls [see countnull/countnulls on page 163]             The number of records in the segment for which the field contains null [see The null value on page 181]. In measurements, use countnull instead.
percentage                                                    The number of records in the segment, as a percentage of the total number of records
percentagerate                                                The percentage of non-null records in the segment that contain the value 1 in the field. The field must contain only the values 0 and 1.
segindex                                                      A positive number uniquely identifying the segment within the crosstab formed by the breakdown fields
significance                                                  The statistical significance of any discrepancy between the mean value of the field in the segment and the mean value of the field for the whole population, with a sign indicating the direction of the deviation

any

Purpose: pick an arbitrary non-null value from a field (possibly per group or segment).

Syntax any(x)

Arguments

Type                            Name  Description
integer, real, date, or string  x     The field from which to pick a value

Result

Type      Description
as input  An arbitrary non-null value from the field x (possibly per group or segment)

You cannot use this function in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.

See also

first on page 165

last on page 166

confintlower

Purpose: calculate the lower bound of a 95% confidence interval for the mean of a field.

Syntax confintlower(x)

Arguments

Type                            Name  Description
numeric                         x     The field to be considered

Result

Type      Description
          The lower bound of an interval within which the segment's population mean for the field x is expected to fall with 95% confidence

See also

confintupper on page 160

significance on page 174

confintupper

Purpose: calculate the upper bound of a 95% confidence interval for the mean of a field.

Syntax confintupper(x)

Arguments

Type                            Name  Description

Result

Type      Description
          The upper bound of an interval within which the segment's population mean for the field x is expected to fall with 95% confidence

See also

confintlower on page 160

significance on page 174

count

Purpose: count the number of records (possibly per group or segment).

Syntax count()

Arguments None

Result

Type      Description
integer   The number of records (possibly per group or segment)

See also

countnonnull/countnonnulls on page 162

countnull/countnulls on page 163

countunique on page 164

countuniquenonnull on page 164

countnonnull/countnonnulls

Purpose: count the number of records for which a field has non-null values (possibly per group or segment).

Syntax countnonnull(x)

Arguments

Type                            Name  Description
integer, real, date, or string  x     The field to be considered

Result

Type      Description
integer   The number of records for which the field x has non-null values (possibly per group or segment)

In Decision Studio crosstabs and in a measurement context [see Aggregation functions for measurements and derivations on page 156] this function is countnonnull, while in a derivation context it is countnonnulls.

See also

count on page 161

countnull/countnulls

Purpose: count the number of records for which a field contains the null value (possibly per group or segment).

Syntax countnull(x)

Arguments

Type                            Name  Description

Result

Type      Description
integer   The number of records for which the field x is null (possibly per group or segment)

In Decision Studio crosstabs and in a measurement context [see Aggregation functions for measurements and derivations on page 156] this function is countnull, while in a derivation context it is countnulls.

See also

count on page 161

countunique

Purpose: count the number of unique values occurring in a field, potentially including the null value (possibly per group or segment).

Syntax countunique(x)

Arguments

Type                            Name  Description

Result

Type      Description
integer   The number of unique values, including null, that occur in x (possibly per group or segment)

See also

count on page 161

countuniquenonnull

Purpose: count the number of non-null unique values occurring in a field (possibly per group or segment).

Syntax countuniquenonnull(x)

Arguments

Type                            Name  Description

Result

Type      Description
integer   The number of unique values, excluding null, that occur in x (possibly per group or segment)
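The difference between countunique and countuniquenonnull comes down to whether null counts as one of the distinct values. A minimal Python sketch of that distinction follows, using None to stand for null; it is an illustration of the described results only, and the sample data is invented.

```python
def countunique(values):
    """Number of distinct values, with null (None) counted as one value if present."""
    return len(set(values))


def countuniquenonnull(values):
    """Number of distinct values, ignoring null (None)."""
    return len({v for v in values if v is not None})


# Hypothetical payment-method values for one group of records.
payment_methods = ["CR", "CA", None, "CR", None]
unique_total = countunique(payment_methods)
unique_non_null = countuniquenonnull(payment_methods)
```

For this group the two results differ by exactly one, because the field contains at least one null.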

See also

count on page 161

first

Purpose: pick the first non-null value from a field (possibly per group or segment).

Syntax first(x)

Arguments

Type                            Name  Description

Result

Type      Description
as input  The first non-null value encountered in the field x (possibly per group or segment)

See also

any on page 159

last on page 166

last

Purpose: pick the last non-null value from a field (possibly per group or segment).

Syntax last(x)

Arguments

Type                            Name  Description

Result

Type      Description
as input  The last non-null value encountered in the field x (possibly per group or segment)

See also

any on page 159

first on page 165

max (one argument)

Purpose: calculate the maximum non-null value in a field (possibly per group or segment).

Syntax max(x)

Arguments

Type                            Name  Description
integer, real, date, or string  x     The field whose values are to be compared

Result

Type      Description
as input  The maximum non-null value in the field x (possibly per group or segment)

Note: • Be careful not to confuse this aggregation function with the multi-argument function [see max (two or more arguments), maxnonnull on page 291] of the same name.

• In a derivation context, this function does not accept a string- or date-valued field as an argument.

See also

min (one argument) on page 169

mean (one argument)

Purpose: calculate the mean (common average) of the non-null values in a field (possibly per group or segment).

Syntax mean(x)

Arguments

Type                            Name  Description
numeric                         x     The field whose mean is to be calculated

Result

Type      Description
          The mean of the non-null values in the field x (possibly per group or segment)

Be careful not to confuse this aggregation function with the multi-argument function [see mean (two or more arguments), meannonnull on page 292] of the same name.

See also

median on page 168

mode on page 170

median

Purpose: calculate the median (middle value) of the non-null values in a field (possibly per group or segment).

Syntax median(x)

Arguments

Type                            Name  Description
integer, real, or string        x     The field whose median is to be calculated

Result

Type      Description
as input  The median value of the non-null values in the field x (possibly per group or segment)

See also

mean (one argument) on page 168

mode on page 170

min (one argument)

Purpose: calculate the minimum non-null value in a field (possibly per group or segment).

Syntax min(x)

Arguments

Type                            Name  Description
                                      The field whose values are to be compared

Result

Type      Description
as input  The minimum non-null value in the field x (possibly per group or segment)

Note: • Be careful not to confuse this aggregation function with the multi-argument function [see min (two or more arguments), minnonnull on page 293] of the same name.

• In a derivation context, this function does not accept a string- or date-valued field as an argument.

See also

max (one argument) on page 167

mode

Purpose: calculate the mode (most common value) of the non-null values in a field (possibly per group or segment).

Syntax mode(x)

Arguments

Type                            Name  Description
                                      The field whose mode is to be calculated

Result

Type      Description
as input  The mode of the non-null values in the field x (possibly per group or segment)

Note: • In the event of a tie for the most common value, the function returns the least tying number, earliest tying date, or alphabetically earliest tying string.

• You cannot use this function in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.
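The tie-breaking rule in the note above can be sketched in Python as "among the most frequent values, take the smallest". This is an illustration only: the function name is hypothetical, None stands for null, returning None for a group with no non-null values is an assumption, and Python's min compares strings by code point, which may not match the product's notion of alphabetical order for mixed-case or accented text.

```python
from collections import Counter


def mode_with_tiebreak(values):
    """Most common non-null value; ties go to the smallest (least/earliest) value."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None  # assumption: no non-null values yields null
    counts = Counter(non_null)
    highest = max(counts.values())
    # Keep only the values that share the highest frequency, then take the least.
    return min(v for v, n in counts.items() if n == highest)


numeric_mode = mode_with_tiebreak([3, 1, 3, 1, None])
string_mode = mode_with_tiebreak(["CR", "CA", "CR", "CA", "DD"])
```

In both calls two values tie for most common, and the smaller one (1 and "CA" respectively) is returned.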

See also

mean (one argument) on page 168

median on page 168

moderatio on page 171

moderatio

Purpose: calculate the proportion of non-null records containing the most common non-null value in a field (possibly per group or segment).

Syntax moderatio(x)

Arguments

Type                            Name  Description
                                      The field whose mode ratio is to be calculated

Result

Type      Description
real      The mode ratio of the field x (possibly per group or segment)
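As a rough illustration of the mode ratio, the Python sketch below divides the frequency of the most common non-null value by the number of non-null records. The function is a stand-in written for this example; None represents null, and returning None when there are no non-null records is an assumption rather than documented behavior.

```python
from collections import Counter


def moderatio(values):
    """Share of non-null records that hold the most common non-null value."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None  # assumption: undefined when every record is null
    return max(Counter(non_null).values()) / len(non_null)


# "CR" occurs in 2 of the 3 non-null records; the null record is not counted.
ratio = moderatio(["CR", "CR", "CA", None])
```

A ratio close to 1 indicates that a single value dominates the field within the group.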

See also

mode on page 170

percentage

Purpose: calculate the number of records in a crosstab segment, as a percentage of the total.

Syntax percentage()

Arguments None

Result

Type      Description
real      The percentage of records (possibly per group or segment)

You can only use this function in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.

See also

percentagerate

Purpose: for a binary (1/0) field, calculate the proportion of non-null records containing the value 1, as a percentage.

Syntax percentagerate(x)

Arguments

Type                            Name  Description
numeric                         x     The binary field to be considered

Result

Type      Description
          The percentage of records in a binary field having the value 1 (possibly per group or segment)

You cannot use this function in an aggregations file [see Using aggregation functions and the where and default clauses on page 148] or statistics file [see Evaluating focus statistics: the calculate statement on page 153] for qsmeasure or the equivalent Aggregate Records window, or in a statistics file for qstrack.
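The percentagerate calculation can be approximated in Python as follows. This is a sketch of the description above, not the product's code: None stands for null, and both the error raised for values other than 0 and 1 and the None returned for an all-null segment are assumptions made for the example.

```python
def percentagerate(values):
    """Percentage of non-null records in a 0/1 field that contain 1."""
    non_null = [v for v in values if v is not None]
    if any(v not in (0, 1) for v in non_null):
        raise ValueError("the field must contain only the values 0 and 1")
    if not non_null:
        return None  # assumption: no non-null records yields null
    return 100.0 * sum(non_null) / len(non_null)


# Three of the four non-null records contain 1; the null record is ignored.
rate = percentagerate([1, 0, 1, None, 1])
```

Note that null records are excluded from the denominator, so the result differs from a percentage taken over all records in the segment.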

See also

segindex

Purpose: generate an index number representing a segment.

Syntax segindex()

Arguments None

Result

Type      Description
integer   A positive number uniquely identifying a segment within a crosstab

Note: • If null, unclassified, or empty segments are hidden in a crosstab view, they still receive a segment index, so there may appear to be gaps in the sequence of visible indices.

• You cannot use this function in an aggregations file [see Using aggregation functions and the where and default clauses on page 148] or statistics file [see Evaluating focus statistics: the calculate statement on page 153] for qsmeasure or the equivalent Aggregate Records window, or in a statistics file for qstrack.

Example R–2: Marking approximate deciles

You can bin a field "myfield" into ten equal-population bins and then derive a field "decile":

segindex() by myfield

See also

bin on page 338

significance

Purpose: calculate the statistical significance of any discrepancy between the mean value of a field for a crosstab segment and the mean value of the same field for the whole population.

Syntax significance(x)

Arguments

Type                            Name  Description

Result

Type      Description
          A value between -1 and 1, whose sign indicates the direction in which the segment mean of x deviates from the population mean of x and whose absolute value indicates the statistical significance of that deviation, taking segment size into account

You cannot use this function in an aggregations file [see Using aggregation functions and the where and default clauses on page 148] or statistics file [see Evaluating focus statistics: the calculate statement on page 153] for qsmeasure or the equivalent Aggregate Records window, or in a statistics file for qstrack.

See also

confintlower on page 160

confintupper on page 160

stdev

Purpose: calculate the standard deviation of the non-null values in a field (possibly per group or segment).

Syntax stdev(x)

Arguments

Type                            Name  Description
numeric                         x     The field whose standard deviation is to be computed

Result

Type      Description
          The standard deviation of the non-null values in the field x (possibly per group or segment)

See also

sum (one argument)

Purpose: calculate the sum of the non-null values in a field (possibly per group or segment).

Syntax sum(x)

Arguments

Type                            Name  Description
numeric                         x     The field whose values are to be summed

Result

Type      Description
as input  The sum of the non-null values in the field x (possibly per group or segment)

Note: • Be careful not to confuse this aggregation function with the multi-argument function [see sum (two or more arguments), sumnonnull on page 299] of the same name.

• When the sum aggregation function is applied to an empty set of records (for example, as a result of record filtering), the result is null. You should normally override this behavior by setting the default value to 0.

See also

variance

Purpose: calculate the statistical variance of the non-null values in a field (possibly per group or segment).

Syntax variance(x)

Arguments

Type                            Name  Description
numeric                         x     The field whose variance is to be computed

Result

Type      Description
          The variance of the non-null values in the field x (possibly per group or segment)

See also

15 - Field Derivation Language

In this section

About Field Derivation Language 179

About Field Derivation Language

Field Derivation Language (FDL) is a simple language for specifying the creation of new fields in Spectrum Miner foci. You can use FDL in the Table Viewer in Decision Studio, the Derive Fields and Aggregate Records wizards, and the Spectrum Miner data-build commands: qsderive, qsmeasure, qstrack and qsselect.

You can employ FDL to:

• Transform, combine, and perform a variety of other computations on existing fields in a focus
• Partition a focus into a set of random samples
• Define segmentations and predictive models
• Select records according to a criterion of your choice

You typically use it in simple, "one-line" expressions such as those described in Basic expressions on page 184, but you can also use it to construct scripts with complex, data-dependent execution paths.

FDL syntax includes:

• Operators for arithmetic [see Arithmetic operators on page 193].
• Operators for comparison [see Relational operators on page 194] and for combining logical expressions [see Logical operators on page 196].
• Two alternative constructs for conditional evaluation of statements: if...then...else and case [see Conditional expressions on page 185].
• A comprehensive set of built-in functions [see Built-in functions on page 198].
• The ability to encapsulate common transformations as user-defined functions [see User-defined functions on page 190], for re-use and sharing.
• Variables, to hold values for re-use within an FDL script, including (in some contexts) global "accumulator" variables [see Global variables in Decision Studio on page 189].
• A facility for incorporating crosstabs into derivations.

See also

Field Derivation Language

16 - FDL syntax

In this section

Datatypes 181
Expressions 183
Conditional expressions 185
Variables 187
User-defined functions 190
Arithmetic operators 193
Relational operators 194
Logical operators 196
Operator precedence 197
Built-in functions 198
Reserved words in FDL 206

Datatypes

The following basic datatypes are supported in FDL [see About Field Derivation Language on page 179]:

Type          Description                                                                                                                            Examples (as FDL literals)
integer       A number without a fractional part (between -2 147 483 647 and 2 147 483 646)                                                          78004, -100
real          A number with a fractional part, represented internally as a "floating-point" number with 15-digit precision (between … and …, or 0)   42.3, -0.56
date          A date/time between AD 1507 and AD 3015, to one-second precision                                                                       #2009/01/01, #2009/01/01:16:01:39
string        A character string, representing textual data                                                                                          "hello"
smallinteger  A small number without a fractional part (between qsminsmallint and qsmaxsmallint)                                                     42, -1

Note: • If you use a literal integer value (such as 12345678910) that is too large for an integer datatype, you will get an overflow. You can often work around this by using a real value instead (in this case, 12345678910.0).

• The smallinteger type has limited applicability within Spectrum Miner, and should only be used where absolutely necessary. In general, you should use the integer type instead.

The null value

In addition to the ordinary values for each of the basic datatypes, a field or expression of integer, real, date, or string type can take the special value null. The null value always represents an unknown value (due to missing data, because of an overflow in a calculation such as division by zero, or for some other reason). It is distinct from both zero and the empty string "".

FDL functions and operators preserve the uncertainty introduced by a null argument or operand, often giving the null value as a result. There are, however, functions (notably isnull and some aggregation functions) that deal specially with the null value.

Examples

1 + null equals null.

1 != null equals null.

1 < null equals null.

null = null equals null.

null != null equals null.

false or null equals null.

true and null equals null.

true or null equals true.

false and null equals false.
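The examples above follow a three-valued logic. The following Python sketch models that behavior so the rules can be checked mechanically; it is not FDL, `None` stands in for null, and the helper names (`fdl_and`, `fdl_or`, `fdl_add`, `fdl_eq`) are hypothetical.

```python
# Python model of FDL null propagation; None stands in for FDL's null.
# All helper names here are hypothetical and exist only for illustration.

def fdl_and(a, b):
    """A definite false decides the result; otherwise null propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def fdl_or(a, b):
    """A definite true decides the result; otherwise null propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def fdl_add(a, b):
    """Arithmetic with a null operand gives null."""
    return None if a is None or b is None else a + b


def fdl_eq(a, b):
    """Comparison with a null operand gives null, even for null = null."""
    return None if a is None or b is None else a == b


print(fdl_add(1, None))      # None
print(fdl_eq(None, None))    # None
print(fdl_or(False, None))   # None
print(fdl_and(True, None))   # None
print(fdl_or(True, None))    # True
print(fdl_and(False, None))  # False
```

The last two lines show why "true or null" and "false and null" are not null: one operand already decides the result, whatever the unknown value might be.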

See also

ifnull, nvl on page 212

Boolean data

There is no formal boolean (true/false) datatype in FDL. However, you can treat an integer field (or variable) containing only the values 0 and 1, or a logical expression such as Income < 10000, as though it were of boolean type.

Logical expressions evaluate to 1 when they are true and 0 when they are false. Conditional expressions use these values to determine which of two or more other expressions to evaluate. Logical operators also work with these values. A value of 0 in an integer field or variable is interpreted as "false" and any other (non-null) value is interpreted as "true."

Additionally, you can use the keywords true and false as boolean literals (which stand for 1 and 0 respectively).

Type-compatibility

The integer and real datatypes are said to be type-compatible, because you can use an integer value wherever a real value is required (but not vice versa), for example, as an argument to a function. Date and string datatypes are only type-compatible with themselves.

The operands of a binary arithmetic operator [see Arithmetic operators on page 193] or relational operator [see Relational operators on page 194] have to be type-compatible. Indeed, in the case of a binary arithmetic operator, both operands must be numeric; the result is real unless both operands are integers.

Likewise, the result expressions in an if or case expression [see Conditional expressions on page 185] have to be type-compatible, so that the type of the result can be determined before evaluation. If integer- and real-valued expressions are mixed, the result is real-valued.

Type conversion

As well as using implicit datatype conversion (which occurs when the datatype of an expression forces it), you can explicitly convert between datatypes with the tointeger, toreal, todate and tostring FDL functions.

Note: When converting from a string to a numeric value, any trailing non-numeric characters are ignored.
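As a rough illustration of the note above, the following Python sketch parses the leading integer part of a string and ignores whatever follows. It is a model only, not the implementation of tointeger: the function name `to_integer_model` is hypothetical, and both the handling of leading whitespace and the `None` result for a string with no numeric prefix are assumptions, since the text does not describe those cases.

```python
import re

# Hypothetical model: optional leading whitespace, optional sign, then digits.
_INT_PREFIX = re.compile(r"\s*([+-]?\d+)")


def to_integer_model(text):
    """Return the leading integer in text, ignoring trailing characters.

    Returning None when there is no numeric prefix is an assumption.
    """
    found = _INT_PREFIX.match(text)
    return int(found.group(1)) if found else None


print(to_integer_model("12abc"))     # 12
print(to_integer_model("-7 units"))  # -7
print(to_integer_model("abc"))       # None
```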

Expressions

An FDL expression is one of the following:

• A basic expression [see Basic expressions on page 184]

• A conditional expression [see Conditional expressions on page 185]

• A variable initialization or assignment [see Variables on page 187]

• A semicolon-separated list of expressions (an expression list), enclosed in parentheses

Every expression has an associated datatype [see Datatypes on page 181] and can be evaluated.

To evaluate an expression list in a given context, evaluate each constituent expression in turn, from first to last; the value of the expression list as a whole is the last computed value.

Note: • The datatype is inferred automatically. You cannot specify it other than by using a local variable of the required type, or by using an expression that forces the type. For example, the following expression forces the type to be real:

if false then 0.0 else ...

• You typically use expression lists in combination with variable assignment.

Basic expressions

A basic expression is one of the following:

• An identifier, optionally quoted in single quotation marks, for example, 'CustomerID'. You must enclose an identifier in single quotation marks if it does not begin with a letter ("A" to "Z" or "a" to "z"), contains characters other than letters, digits, and underscores, or coincides with a TML or FDL reserved word (or differs from one only in case). An identifier must be no longer than 128 characters.

• A numeric literal, for example, 30, -3.28, or 3.1e8 (the last of these representing 3.1 × 10^8)

• A string literal, enclosed in double quotation marks, for example, "A string"

• A date literal, prefixed by "#", for example, #2004/03/16 or #2004/03/16:09:12:13

• The keyword "null" (see The null value on page 181)

• A function application, consisting of a function name (an identifier) followed by a comma-separated list of arguments (expressions) in parentheses; for example, mean(IncomeA + IncomeB, TotalEstIncome)

• An arithmetic expression, consisting of numeric-valued expressions connected by arithmetic operators [see Arithmetic operators on page 193], for example, 365 / 52

• A relation, consisting of numeric- or string-valued expressions connected by relational operators [see Relational operators on page 194], for example, Age > 30 or 'State' = "New York"

• A logical expression, consisting of relations or other boolean-valued expressions connected by logical operators [see Logical operators on page 196], for example, (Income > 250000) and (MaritalStatus = "Single")

• A statistic identifier, consisting of the TML keyword STATISTIC followed by a period followed by an identifier, for example, STATISTIC.AverageIncome (only in a TML aggregations file)

• A field statistic (only in a Decision Studio derivation, a qsderive derivations file, or a qsselect selections file)

Note: • FDL identifiers are case-significant. For example, the expressions CustomerID and CUSTOMERID refer to different fields, while the expressions isnull(Age) and IsNull(Age) involve different functions.

• A numeric literal is evaluated as an integer if it contains no decimal point and doesn't use scientific notation. Otherwise it is evaluated as a real number.

• A string literal can contain any characters other than double quotation marks (") and newline characters.

• A date literal is initially evaluated in YMD [see Date formats on page 52] format; if this fails, it is evaluated according to the setting of your read preference.

• As well as introducing spaces between syntactic elements, you can include comments, which begin with a double slash ("//") and continue to the end of the line. For example,

// This is a comment.

2 + 2 // This is a comment.

Conditional expressions

In FDL, there are two main ways of conditionally evaluating an expression: the if expression and the case expression.

The if expression

An if expression has the form if condition then expression else expression. If the condition (an FDL expression returning a boolean [see Boolean data on page 182] value) is true, the expression following then is evaluated; otherwise, the expression following else is evaluated. The value of the if expression is the value of the evaluated expression.

Note: • The expressions following then and else must be type-compatible [see Datatypes on page 181].

• If you omit the "else" part of an if expression and the condition is false, the value of the if expression is null [see The null value on page 181].

• A null [see The null value on page 181] condition is treated by the if expression as though it were false. This is an exception to the general rule that the uncertainty introduced by the null value is preserved by FDL expressions. This can lead to unexpected results: for example, when Age is null, the following (at first sight equivalent) expressions generate the null value, 0 and 1 respectively:

Age > 40

if Age > 40 then 1 else 0

if Age <= 40 then 0 else 1

• You can test for multiple conditions using nested if expressions, that is, if expressions in which the expression following then or else is itself an if expression. If the expression following then is an if expression, you should enclose it in parentheses to avoid ambiguity. Alternatively, you can use the case expression [see The case expression on page 186].

• If your condition can be expressed as x = 0 (or x != 0) for some number x, you can also use the built-in FDL function cond instead of an if expression.
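The null-condition rule in the notes above can be modeled in a few lines of Python. This is a sketch of the described semantics rather than FDL itself: `None` stands in for null, and the helper names (`fdl_gt`, `fdl_le`, `fdl_if`) are hypothetical.

```python
# Python model of how an FDL if expression treats a null condition.
# None stands in for null; helper names are hypothetical.

def fdl_gt(a, b):
    """a > b, giving null (None) if either operand is null."""
    return None if a is None or b is None else int(a > b)


def fdl_le(a, b):
    """a <= b, giving null (None) if either operand is null."""
    return None if a is None or b is None else int(a <= b)


def fdl_if(condition, then_value, else_value=None):
    """A null or zero condition selects the else branch (null if omitted)."""
    if condition is not None and condition != 0:
        return then_value
    return else_value


age = None  # Age is null for this record

print(fdl_gt(age, 40))                 # None
print(fdl_if(fdl_gt(age, 40), 1, 0))   # 0
print(fdl_if(fdl_le(age, 40), 0, 1))   # 1
```

Unlike FDL, this model evaluates both branch values before choosing one; that difference does not matter for constant branches like these.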

Examples

if Age < 40 then "Young"

if Age < 40 then "Young" else "Senior"

if Age < 40 then "Young" else if Age < 60 then "MiddleAged" else "Senior"

if (Age >= 13 and Age < 20) then "Teenager"

if isnull(Age) then mean(Age) else Age

if Responder then RespondDate else StartDate

if match(Postcode, "^EH") then "Y" else "N"

if StartDate < #1998/09/08 then "Loyal" else "Recent"

The case expression

If the first condition (an FDL expression returning a boolean [see Boolean data on page 182] value) is true, the corresponding expression (to the right of the colon) is evaluated and returned as the value of the case expression. Otherwise, the next condition is tested; if it is true, the corresponding expression is evaluated and returned as the value of the case expression. Otherwise, the next condition is tested, and so on.

The special default keyword is interpreted as an always-true expression: you would normally use it as the last condition in a case expression (as any subsequent conditions would be ignored).

Note: • All the expressions occurring to the right of colons must be type-compatible [see Datatypes on page 181].

• If none of the conditions is true (and there is no default keyword), the value of the case expression is null [see The null value on page 181].

• A null [see The null value on page 181] condition is treated by the case expression as though it were false. This is an exception to the general rule that the uncertainty introduced by the null value is preserved by FDL expressions. For example, when Age is null, the following expression generates the value 3:

Age > 40 : 1;

Age <= 40 : 2;

default : 3;

• If you use a case expression in an expression list or nest a case expression as a condition (inside an if expression or another case expression), you should enclose it in parentheses to avoid ambiguity.

Examples

Age < 40: "Young";

default: "NotYoung";

Response = "Exc": 7;

Response = "V. Sat": 6;

Response = "Sat": 5;

Response = "F. Sat": 4;

Response = "Poor": 3;

Response = "Bad": 2;

default: 1;
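The evaluation order of a case expression (first true condition wins, null conditions count as false, null result if nothing matches) can be sketched in Python as follows. This is a model of the described behavior, not FDL; `None` stands in for null, and `fdl_case` and `score` are hypothetical names.

```python
# Python model of FDL case-expression evaluation; None stands in for null.
# fdl_case and score are hypothetical names used only for illustration.

def fdl_case(branches, default=None):
    """Return the result paired with the first true condition.

    branches is a list of (condition, result) pairs tested in order.
    A null (None) or zero condition is skipped. If nothing matches,
    the default is returned, which is null when no default is given.
    """
    for condition, result in branches:
        if condition is not None and condition != 0:
            return result
    return default


def score(response):
    """Model of the satisfaction-score example above."""
    return fdl_case(
        [
            (response == "Exc", 7),
            (response == "V. Sat", 6),
            (response == "Sat", 5),
            (response == "F. Sat", 4),
            (response == "Poor", 3),
            (response == "Bad", 2),
        ],
        default=1,
    )


print(score("Sat"))      # 5
print(score("Unknown"))  # 1
# Both conditions null (Age is null): only the default applies.
print(fdl_case([(None, 1), (None, 2)], default=3))  # 3
```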

Variables

A variable in FDL is a temporary storage location for a real, integer, string, or date value. You can use a local variable to evaluate a complicated subexpression (particularly useful if the same subexpression occurs more than once) or to evaluate an expression involving a random-number function, to avoid its value changing from one evaluation to the next. You can also use global variables in Decision Studio [see Global variables in Decision Studio on page 189] and state variables in TML trackers files [see State variables in Transaction Measurement Language on page 189] to carry information from one record to the next.

To assign a value to a variable, use the following syntax (an optional keyword, then an identifier, then ":=", then an expression):

[global | state] identifier := expression

The first such assignment is an initialization and determines the class of the variable: local (no keyword), global, or state. The keyword global or state is only used in conjunction with the first assignment to a global or state variable (the initialization).

The identifier [see Basic expressions on page 184] is the name of the variable (which must not coincide with a field name in the focus).

For any class of variable, its datatype [see Datatypes on page 181] is that of the first value that is assigned to it.

The value of an assignment expression is simply the value of the expression to the right of the ":=" sign.

For example, the following expression list randomly assigns the value 1 (to 20% of the records), 2 (to another 20% of the records), or 0 (to the remainder of the records):

x := rndUniform();

x < 0.2 : 1;

x < 0.4 : 2;

default : 0;

It could be written equivalently in this more compact form:

( x := rndUniform() ) < 0.2 : 1;

x < 0.4 : 2;

default : 0;

However, the following expression list generates a different (and probably unwanted) result:

rndUniform() < 0.2 : 1;

rndUniform() < 0.4 : 2;

default : 0;
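To see why the last version differs, the outcome probabilities can be worked out exactly. The Python sketch below does the arithmetic with fractions; it assumes that rndUniform returns a value uniformly distributed between 0 and 1 and that separate calls are independent. The dictionary names are hypothetical.

```python
from fractions import Fraction

p_first = Fraction(1, 5)   # probability that a uniform draw is below 0.2
p_second = Fraction(2, 5)  # probability that a uniform draw is below 0.4

# One shared draw x: the second condition is only reached when x >= 0.2,
# so the value 2 corresponds to 0.2 <= x < 0.4.
shared_draw = {
    1: p_first,
    2: p_second - p_first,
    0: 1 - p_second,
}

# A fresh draw per condition: the second draw is independent of the first.
separate_draws = {
    1: p_first,
    2: (1 - p_first) * p_second,
    0: (1 - p_first) * (1 - p_second),
}

print(shared_draw[1], shared_draw[2], shared_draw[0])           # 1/5 1/5 3/5
print(separate_draws[1], separate_draws[2], separate_draws[0])  # 1/5 8/25 12/25
```

With a shared draw, the values 1, 2, and 0 go to 20%, 20%, and 60% of records. With a fresh draw per condition, the value 2 goes to 32% of records and 0 to 48%, which is why storing the random number in a variable matters.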

Global variables in Decision Studio

Variables are local by default. A local variable is destroyed as soon as the last expression in the (outermost enclosing) expression list [see Expressions on page 183] is evaluated. The next time that the same variable is used, it is re-initialized. A local variable cannot carry information from one record to the next. For example, the following field-derivation expression list produces a field with 1 in every record:

i := 0;

i := i + 1

A global variable differs from a local variable in that it is not destroyed until Decision Studio or the data-build command has calculated the field values for all records. The initialization only happens once, for the first record in the focus. For example, the following field-derivation expression list produces an index field, with 1 in the first record, 2 in the second, and so on:

global i := 0;

i := i + 1

You can use a global variable in a field-derivation expression list in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in a trackers file for qstrack, or in a selections file for qsselect.
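The difference between the two expression lists above comes down to where the initialization happens relative to the per-record loop. The following Python sketch models that; the function names are hypothetical and the loop over `records` stands in for the engine evaluating the derivation once per record.

```python
# Python model of local versus global variable lifetime across records.
# derive_with_local and derive_with_global are hypothetical names.

def derive_with_local(records):
    """Model of: i := 0; i := i + 1 (local variable)."""
    derived = []
    for _ in records:
        i = 0        # re-initialized for every record
        i = i + 1
        derived.append(i)
    return derived


def derive_with_global(records):
    """Model of: global i := 0; i := i + 1 (global variable)."""
    derived = []
    i = 0            # initialized once, for the first record only
    for _ in records:
        i = i + 1
        derived.append(i)
    return derived


records = ["r1", "r2", "r3"]
print(derive_with_local(records))   # [1, 1, 1]
print(derive_with_global(records))  # [1, 2, 3]
```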

State variables in Transaction Measurement Language

You can use a state variable in a TML field definition in the qstrack trackers file.

A state variable differs from a local variable or a global variable in that it is not destroyed until qstrack has calculated all the field values for the current group (as defined by the key field). The initialization happens once for each group.

For example, the following field definition defines a running-balance field, assuming that the field "Deposit" contains an amount deposited in a transaction (or a negative value for a withdrawal) and the field "InitialBalance" contains the initial balance for the customer (perhaps from a previously joined customer table):

create Balance := (

state bal := InitialBalance;

bal := bal + Deposit

)
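The per-group lifetime of a state variable can be modeled in Python as a running total that resets whenever the key changes. This is a sketch of the described behavior, not of qstrack itself: the function name `running_balance` and the tuple layout are hypothetical, and the records are assumed to arrive already grouped by key.

```python
# Python model of a TML state variable: initialized once per group.
# running_balance and the (key, initial_balance, deposit) layout are
# hypothetical; records are assumed to be grouped by key already.

def running_balance(transactions):
    balances = []
    current_key = object()  # sentinel that matches no real key
    bal = None
    for key, initial_balance, deposit in transactions:
        if key != current_key:
            # New group: model of "state bal := InitialBalance"
            bal = initial_balance
            current_key = key
        # Model of "bal := bal + Deposit"
        bal = bal + deposit
        balances.append(bal)
    return balances


transactions = [
    ("A", 100, 50),
    ("A", 100, -30),
    ("B", 10, 5),
]
print(running_balance(transactions))  # [150, 120, 15]
```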

User-defined functions

To define an FDL function that you can use in the same way as a built-in function [see Built-in functions on page 198], use the following syntax:

The identifier [see Basic expressions on page 184] following the "function" keyword is the name of the function. The list of (formal) arguments comes next:

Each formal argument (other than one preceded by a "field" keyword) acts like a local variable [see Variables on page 187] that is initialized with the value of the corresponding actual argument when the function is applied. The type [see Datatypes on page 181] of a formal argument is the type specified in the function definition, if present; otherwise it is that of the actual argument used when the function is applied. Note that the enclosing parentheses are required even if there are no formal arguments.

Following the list of arguments comes an optional list of function attributes [see Function attributes on page 192], and then the defining body of the function. The expression list [see Expressions on page 183] is effectively prefixed with local variable assignments corresponding to the formal arguments (other than those preceded by a "field" keyword), before being evaluated to give the return value for the function. The type of the return value is that of the last expression.

Examples

Define a function to replace zeros in the result of one expression (of any type) with the result of another expression:

function replaceZeros(expr1, expr2)

{

if expr1 = 0 then expr2 else expr1

}

A function which takes a string argument and returns it with the first letter capitalized:

function titleCase(string text)

{

concat(toupper(left(text, 1)), right(text, strlen(text) - 1));

}

• When specifying the type of a function argument, use "real", "integer", "string" or "date."

• You can include field statistics in a function definition, but each function argument that is used in a field statistic must be preceded by a "field" keyword. For example, to compare a value with a summary over the whole focus:

function MyBand(real field x)

{

averageX := mean(x);

stdevX := stdev(x);

x < averageX - stdevX : "Lower";

x > averageX + stdevX : "Upper";

default: "Mid";

}

• You can include function definitions at the beginning of the expression list that defines a derived field. Functions so defined are local to the field derivation.

You can also define globally accessible functions by storing them in a file (or files) in a folder listed in the derivation libraries preference. In Decision Studio, in the Function Family pane of the Derivation Functions dialog box in the Table Viewer, functions in a file named class.fdl appear in the class named class. For example, functions in a file named Conditional.fdl appear in the Conditional class. You can reuse an existing class name, to augment a class with your own user-defined functions, or create your own class by using a new name.

When using the Spectrum Miner data-build command qsmeasure, the equivalent Aggregate Records window, or the data-build command qstrack, you can specify a file containing additional global function definitions.

• You cannot define aggregation functions in a measurement context, that is, in an aggregations file [see Using aggregation functions and the where and default clauses on page 148] or statistics file [see Evaluating focus statistics: the calculate statement on page 153] for qsmeasure or the equivalent Aggregate Records window, or in a statistics file for qstrack.

• If the expression list ends with an alphabetic character, it must be separated from the closing brace by a semicolon or whitespace.

Function attributes

Specify function attributes as follows:

Useful attributes are:

synopsis and description. These attributes provide a standard way of making the function "self-documenting." If a function is installed in a folder listed in the derivation libraries preference, the synopsis is shown when you highlight the function in the Functions pane of the Derivation Functions dialog box in the Table Viewer. When you click on the Details button, the description is shown. The synopsis text should be short, but the description text can be longer.

For example:

function nvl(testNull, replaceValue)

synopsis = "Replace null values"

description = "The nvl() function replaces null values...strings."

(Note that the description attribute must all be on the same line.)

element_names. In an aggregations file, you can use a function with the element_names attribute to split an aggregation to create multiple fields [see Splitting aggregations: the by clause on page 150].

The value of this attribute is a string containing a comma-separated list of names, which are associated with function return values 1, 2, ..., in the order that they are listed.

For example, the element_names attribute of the following function associates the names "Low", "Medium" and "High" with return values 1, 2, and 3 respectively:

function band(x)

[ element_names = "Low,Medium,High" ]

{

clamp(floor(x / 1000), 0, 2) + 1

}
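To make the mapping from return values to names concrete, here is a Python model of the band function. It is a sketch, not FDL: the argument order assumed for `clamp` (value, lower bound, upper bound) is inferred from the example rather than stated in the text, and `band_name` is a hypothetical helper.

```python
import math

# The element_names string from the example, split into a list.
ELEMENT_NAMES = "Low,Medium,High".split(",")


def clamp(value, lower, upper):
    """Constrain value to the interval [lower, upper] (assumed argument order)."""
    return max(lower, min(upper, value))


def band(x):
    """Model of the band function: returns 1, 2, or 3."""
    return clamp(math.floor(x / 1000), 0, 2) + 1


def band_name(x):
    """Return value n is associated with the n-th listed name."""
    return ELEMENT_NAMES[band(x) - 1]


print(band(500), band_name(500))      # 1 Low
print(band(1500), band_name(1500))    # 2 Medium
print(band(99999), band_name(99999))  # 3 High
print(band(-5), band_name(-5))        # 1 Low
```

Because of the clamp, any value below 1000 (including negative values) falls in the first band and any value of 2000 or more falls in the third.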

The element_names attribute of the following function associates the names "North", "South", "East" and "West" with return values 1, 2, 3 and 4 respectively:

function StoreGroupFunction( Store )

[ element_names = "North, South, East, West" ]

{

Store = "0000" or Store = "0800" : 1;

Store = "0300" or Store = "0600" or Store = "0700" : 2;

Store = "0100" or Store = "0400" : 3;

default : 4;

}

Arithmetic operators

Except for unary minus, which prefixes its single operand, all FDL arithmetic operators are infix binary operators, that is, they are written between their two operands.

| Operator | Operands | Result |
|---|---|---|
| + (addition) | numeric | The sum of the operands |
| - (subtraction) | numeric | The result of subtracting the second operand from the first |
| - (unary minus) | numeric | The negative of the operand |
| * (multiplication) | numeric | The product of the operands |
| / (division) | numeric | The (real) result of dividing the first operand by the second |
| div (integer division) | integer | The (integer) result of dividing the first operand by the second, ignoring any remainder |
| mod (modulo) | integer | The remainder on integer-dividing the first operand by the second (the modulus), with the same sign as the first operand; for example, 9 % 7 returns 2 but -9 % 7 returns -2 |

Examples

2 + 5 equals 7.

2 - 5 equals -3.

- (2 + 5) equals -7.

2 * 5 equals 10.

2 / 5 equals 0.4.

2 div 5 equals 0.

2 mod 5 equals 2.

5 % 2 equals 1.

Note: • The addition, subtraction, unary minus, and multiplication operators preserve the type of their operands (which are type-compatible [see Type-compatibility on page 183]). If one operand is an integer and the other is a real number, the result is real.

• If the second operand of the division, integer-division, or modulo operator is zero, the result is null [see The null value on page 181], as division by zero is undefined.

• mod requires integer operands.

• You can also use % instead of mod.
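The sign rule for mod differs from some other languages, so a small model may help. The Python sketch below reproduces the documented results; it is not FDL. `None` stands in for null, the names `fdl_div` and `fdl_mod` are hypothetical, and truncation toward zero for div with negative operands is an assumption chosen to be consistent with the documented sign of mod.

```python
# Python model of FDL div and mod; None stands in for null.
# Truncation toward zero in fdl_div is an assumption (see lead-in).

def fdl_div(a, b):
    """Integer division that discards the remainder; null for a zero divisor."""
    if b == 0:
        return None
    quotient = abs(a) // abs(b)
    return quotient if (a >= 0) == (b >= 0) else -quotient


def fdl_mod(a, b):
    """Remainder with the same sign as the first operand; null for a zero divisor."""
    if b == 0:
        return None
    return a - b * fdl_div(a, b)


print(fdl_div(2, 5))   # 0
print(fdl_mod(2, 5))   # 2
print(fdl_mod(5, 2))   # 1
print(fdl_mod(9, 7))   # 2
print(fdl_mod(-9, 7))  # -2
print(fdl_mod(9, 0))   # None
```

Note that Python's own `%` operator gives `-9 % 7 == 5`, because its result takes the sign of the second operand; that is why the model computes the remainder from a truncating division instead of using `%` directly.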

See also

Logical operators on page 196

Operator precedence on page 197

Relational operators on page 194

Relational operators

All FDL relational operators are infix binary operators, that is, they are written between their two operands.

| Operator | Operands | Result |
|---|---|---|
| < (less than) | integer, real, date, or string | True if the first operand is less than the second; false if the first operand is greater than or equal to the second |
| <= (less than or equal to) | integer, real, date, or string | True if the first operand is less than or equal to the second; false if the first operand is greater than the second |
| > (greater than) | integer, real, date, or string | True if the first operand is greater than the second; false if the first operand is less than or equal to the second |
| >= (greater than or equal to) | integer, real, date, or string | True if the first operand is greater than or equal to the second; false if the first operand is less than the second |
| = (equality) | integer, real, date, or string | True if both operands have the same value; false if the operands have distinct values. For string types, the comparison ignores trailing space characters in both operands. |
| != (inequality) | integer, real, date, or string | True if the operands have distinct values; false if both operands have the same value. For string types, the comparison ignores trailing space characters in both operands. |

Examples

2 < 5 equals 1 (TRUE).

5 < 5 equals 0 (FALSE).

"A" < "B" equals 1 (TRUE).

5 <= 5 equals 1 (TRUE).

5 > 5 equals 0 (FALSE).

7 > 5 equals 1 (TRUE).

#2003/12/31 > #2003/11/30 equals 1 (TRUE).

5 >= 5 equals 1 (TRUE).

"A" = "A" equals 1 (TRUE).

"A" = "a" equals 0 (FALSE).

"A" != "A" equals 0 (FALSE).

"A" != "a" equals 1 (TRUE).

Note: • The operands of a binary relational operator must be type-compatible [see Type-compatibility on page 183].

• When comparing dates, "less than" means earlier. When comparing strings, "less than" means earlier in the alphabet (or more specifically, earlier in the underlying character representation). Likewise for "greater than" etc.

• You can also use == or eq instead of = and <> instead of !=.
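The string-equality rule in the table above (trailing spaces are ignored, case and leading spaces are not) can be modeled in Python as follows. This is an illustrative sketch; `fdl_str_eq` is a hypothetical name, and only the space character is stripped because the text mentions trailing space characters specifically.

```python
def fdl_str_eq(a, b):
    """Model of FDL string equality: ignore trailing spaces in both operands."""
    return int(a.rstrip(" ") == b.rstrip(" "))


print(fdl_str_eq("A", "A   "))  # 1
print(fdl_str_eq("A", "a"))     # 0
print(fdl_str_eq(" A", "A"))    # 0
```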

See also

Arithmetic operators on page 193

Boolean data on page 182

Logical operators

The logical negation operator prefixes its single operand, while the logical conjunction and disjunction operators are infix binary operators, that is, they are written between their two operands.

| Operator | Operands | Result |
|---|---|---|
| and (logical conjunction) | boolean | True if both operands are true; false if either operand is false |
| or (logical disjunction) | boolean | True if either operand is true; false if both operands are false |
| not (logical negation) | boolean | True if the operand is false; false if the operand is true |

Examples

("A" = "A") and (5 = 5) equals 1 (TRUE).

("A" = "B") and (5 = 5) equals 0 (FALSE).

("A" = "B") and (5 = 2) equals 0 (FALSE).

("A" = "A") or (5 = 5) equals 1 (TRUE).

("A" = "B") || (5 = 5) equals 1 (TRUE).

("A" = "B") or (5 = 2) equals 0 (FALSE).

not ("A" = "A") equals 0 (FALSE).

!(("A" = "B") or (5 = 2)) equals 1 (TRUE).

Note: • There is currently no boolean [see Boolean data on page 182] datatype in FDL, so the operands for logical operators are actually integers. You should not, however, apply these operators to arbitrary integers.

• You can also use && instead of and, || instead of or, and ! instead of not.

See also

Operator precedence on page 197

Operator precedence

In the absence of parentheses, FDL operations in an expression are carried out in order of operator precedence, from highest to lowest:

| Class | Operators |
|---|---|
| Unary minus | - |
| Multiplication-style operators | * / div mod |
| Addition-style operators | + - |
| Relational operators | < <= > >= != = |
| Logical negation | not |
| Binary logical operators | and or |

If there is a tie for precedence, operations are carried out from left to right.

Subexpressions in parentheses are always evaluated first.

Examples

-5 + 12 / 4 * 2 + 20 * 3 equals 61, that is, the expression is evaluated as: -5 + ((12 / 4) * 2) + (20 * 3)

Moving the parentheses:

-(5 + 12 / (4 * 2) + 20) * 3 equals -79.5.
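Both results can be checked step by step. The Python sketch below evaluates the same expressions; for these particular operators Python's precedence and its real-valued `/` coincide with the FDL rules described above, which is the only assumption being relied on. The variable names are hypothetical.

```python
# First example: multiplication-style operators bind before addition.
implicit = -5 + 12 / 4 * 2 + 20 * 3
explicit = -5 + ((12 / 4) * 2) + (20 * 3)
# Step by step: 12 / 4 = 3.0, 3.0 * 2 = 6.0, 20 * 3 = 60,
# so the sum is -5 + 6.0 + 60 = 61.0.

# Second example: parentheses are evaluated first.
regrouped = -(5 + 12 / (4 * 2) + 20) * 3
# Step by step: 4 * 2 = 8, 12 / 8 = 1.5, 5 + 1.5 + 20 = 26.5,
# then -26.5 * 3 = -79.5.

print(implicit, explicit)  # 61.0 61.0
print(regrouped)           # -79.5
```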

Built-in functions

FDL includes a comprehensive set of built-in functions.

Conditional functions

| Function | Description |
|---|---|
| clamp | Constrain a number to lie within a given interval. |
| cond | Produce one of two results depending on whether an expression is zero or non-zero. |
| ifnull, nvl | Replace the null value. |
| isnull | Test whether a value is null. |
| isselected | Flag the records that are currently selected. |
| replace | Replace a number if it does not lie within a given interval. |

Datatype-conversion functions

| Function | Purpose |
|---|---|
| todate | Convert a value to a date. |
| tointeger | Convert a value to an integer. |
| toreal | Convert a value to a real number. |
| tostring | Convert a value to a string. |

Functions for working with dates

| Function | Purpose |
|---|---|
| addcenturies, addcenturiescountbackwards | Calculate a date from another date using an offset in centuries. |
| adddays | Calculate a date from another date using an offset in days. |
| addhours | Calculate a date from another date using an offset in hours. |
| addminutes | Calculate a date from another date using an offset in minutes. |
| addmonths, addmonthscountbackwards | Calculate a date from another date using an offset in months. |
| addseconds | Calculate a date from another date using an offset in seconds. |
| addweeks | Calculate a date from another date using an offset in weeks. |
| addyears, addyearscountbackwards | Calculate a date from another date using an offset in years. |
| countcenturies | Calculate the period between two dates in centuries. |
| countdays | Calculate the period between two dates in days. |
| counthours | Calculate the period between two dates in hours. |
| countminutes | Calculate the period between two dates in minutes. |
| countseconds | Count the number of seconds between two dates. |
| countweeks | Calculate the period between two dates in weeks. |
| countwholecenturies, countwholecenturiesbackwards | Count the number of complete centuries between two dates. |
| countwholedays | Count the number of complete days between two dates. |
| countwholehours | Count the number of complete hours between two dates. |
| countwholeminutes | Count the number of complete minutes between two dates. |
| countwholemonths, countwholemonthsbackwards | Count the number of complete months between two dates. |
| countwholeseconds | Count the number of seconds between two dates. |
| countwholeweeks | Count the number of complete weeks between two dates. |
| countwholeyears, countwholeyearsbackwards | Count the number of complete years between two dates. |
| countyears | Calculate the period between two dates in years. |
| day | Obtain the day-of-month part of a date. |
| dayofweek | Obtain a number representing the day-of-week of a date. |
| gmt2edt | Convert a Greenwich Mean Time (GMT) date to Eastern Daylight Time (EDT). |
| hour | Obtain the hours part of a date. |
| minute | Obtain the minutes part of a date. |
| month | Obtain the month part of a date. |
| now | Obtain the current date and time. |
| second | Obtain the seconds part of a date. |
| today | Obtain the current date. |
| weekofyear | Calculate the week-of-year of a date, relative to a specified start date. |
| year | Obtain the year part of a date. |

Functions for working with strings

| Function | Purpose |
|---|---|
| concat | Concatenate two or more strings. |
| endswith | Test whether one string ends with another. |
| find | Test whether one string is contained within another. |
| left | Return a substring of specified length from the beginning of a string. |
| … | Return a substring of specified length from the middle of a string. |
| right | Return a substring of specified length from the end of a string. |
| soundex | Reduce each word to a four-character string for indexing purposes. |
| startswith | Test whether one string starts with another. |
| strlen | Obtain the length of a string. |
| strmember | Determine set membership. |
| substitute | Replace one string with another. |
| substr | Obtain a substring from a string. |
| tolower | Convert a string to lowercase text. |
| toupper | Convert a string to uppercase text. |
| … | Remove all spaces from a string except for single spaces between words. |

Regular expressions and associated functions

| Function | Purpose |
|---|---|
| match | Test a string for a regular expression match [see Regular expressions on page 276]. |
| replaceall | Replace all substrings that match a regular expression [see Regular expressions on page 276]. |
| replacefirst | Replace the first substring that matches a regular expression [see Regular expressions on page 276]. |

Mathematical and statistical functions

| Function | Description |
|---|---|
| abs | Calculate the absolute value of a number. |
| ceil | Round a number up to the nearest integer. |
| cos | Calculate the cosine of an angle. |
| exp | Calculate the exponential of a number. |
| floor | Round a number down to the nearest integer. |
| log | Calculate the natural (base-e) logarithm of a number. |
| log10 | Calculate the base-10 logarithm of a number. |
| logbase | Calculate the logarithm of a number, to a specified base. |
| max (two or more arguments), maxnonnull | Calculate the maximum of two or more numbers, or the latest of two or more dates, or the alphabetically latest of two or more strings. |
| mean (two or more arguments), meannonnull | Calculate the mean (common average) of two or more numbers. |
| min (two or more arguments), minnonnull | Calculate the minimum of two or more numbers, or the earliest of two or more dates, or the alphabetically earliest of two or more strings. |
| normalize | Normalize field values to lie in the interval [0,1]. |
| … | Calculate the result of raising one number to the power of another number. |
| product, productnonnull | Calculate the product of two or more numbers. |
| round | Round a number to the nearest integer. |
| sgn | Calculate the signum (sign) of a number. |
| sin | Calculate the sine of an angle. |
| sqrt | Calculate the square root of a number. |
| sum (two or more arguments), sumnonnull | Calculate the sum of two or more numbers. |
| tan | Calculate the tangent of an angle. |

Data-sampling functions

| Function | Purpose |
|---|---|
| numericTestTrainSplit | Create a test/training segmentation for use in model validation. |
| numericTestTrainValidateSplit | Create a test/training/validation segmentation for use in model validation. |
| sampleEqualSize | Create, at random, a segmentation index for groups of equal size. |
| sampleExactNumber | Flag, at random, an exact number of customers from a selected population. |
| sampleExactPercentage | Flag, at random, an exact percentage of customers from a selected population. |
| sampleStratified | Flag, at random, a specified number of customers from a segment and the rest of the population. |
| testTrainSplit | Create a test/training segmentation for use in model validation. |
| testTrainValidateSplit | Create a test/training/validation segmentation for use in model validation. |

Random-number functions

| Function | Purpose |
|---|---|
| rndBinomial | Generate a random integer based on a binomial distribution. |
| rndBool | Generate either 0 or 1 randomly (with equal probability). |
| rndExp | Generate a random positive number based on an exponential distribution. |
| rndGamma | Generate a random positive number based on a gamma distribution. |
| rndNormal | Generate a random number based on a normal distribution. |
| rndPoisson | Generate a random non-negative integer based on a discrete Poisson distribution. |
| rndUniform | Generate a random number between 0 and 1 based on a uniform distribution. |

Return-on-investment functions

Function                        Purpose
ActionROI                       Estimate the (money) per-customer return on investment contribution from taking action designed to generate a definite response.
ActionROIAnnualized             Estimate the annualized per-customer return on investment multiple from taking action designed to generate a definite response from a customer.
OfferROI                        Estimate the per-customer return on investment contribution from making an offer designed to generate a definite response.
OfferROIAnnualized              Estimate the annualized per-customer return on investment multiple from making an offer designed to generate a definite response from a customer.
RetentionActionROI              Estimate the per-customer (money) return on investment contribution from taking action designed to prevent attrition.
RetentionActionROIAnnualized    Estimate the annualized per-customer return on investment contribution from taking action designed to prevent attrition.
RetentionOfferROI               Estimate the per-customer return on investment from making an offer designed to prevent attrition.
RetentionOfferROIAnnualized     Estimate the annualized per-customer return on investment from making an offer designed to prevent attrition.

Miscellaneous functions

Function                             Purpose
dblookup                             Look up values in a reference table (stored as a focus).
member                               Determine set membership.
rankOrder, rankOrderApprox           Identify the rank of a value within a list of values.
rankOrderMean, rankOrderApproxMean   Identify the rank of a value within a list of values.
rownum                               Generate the number of each row in the table.

Binnings

Function    Purpose
bin         Obtain a bin index corresponding to a value.

See also

User-defined functions on page 190

Reserved words in FDL

The following tokens are reserved words in FDL: accum, agg, aggregate, aggregation, and, by, case, date, default, div, double, else, eq, false, field, float, function, global, if, int, integer, long, mod, not, null, number, numeric, or, real, short, state, string, then, true, where, and while.

To use a reserved word, or a word that differs only in case from a reserved word, as an identifier [see Expressions on page 183] in FDL, or as a field or statistic name in TML, you must enclose it in single quotation marks (for example, 'State').

Although identifiers in FDL are case-significant (fred and Fred are distinct), the reserved words are not case-significant and are therefore reserved in any case combination: for example, if, If, iF, and IF are all representations of the same token; all such combinations are reserved and may not be used as identifiers.
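The quoting rule can be modelled outside FDL. The following Python sketch is illustrative only: the helper names `needs_quotes` and `quote_if_reserved` are hypothetical and are not part of FDL or TML. It simply applies the case-insensitive test described above to the list of reserved words.

```python
# Reserved words as listed in this section, stored in lower case.
FDL_RESERVED = {
    "accum", "agg", "aggregate", "aggregation", "and", "by", "case", "date",
    "default", "div", "double", "else", "eq", "false", "field", "float",
    "function", "global", "if", "int", "integer", "long", "mod", "not",
    "null", "number", "numeric", "or", "real", "short", "state", "string",
    "then", "true", "where", "while",
}


def needs_quotes(identifier: str) -> bool:
    """Return True if the identifier matches a reserved word in any case."""
    return identifier.lower() in FDL_RESERVED


def quote_if_reserved(identifier: str) -> str:
    """Wrap the identifier in single quotation marks when it is reserved."""
    return f"'{identifier}'" if needs_quotes(identifier) else identifier
```

Under these assumptions, `quote_if_reserved("State")` yields `'State'`, while an ordinary identifier such as `fred` is returned unchanged.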

See also

FDL syntax

17 - Conditional functions

In this section

clamp 209
cond 210
iff 211
ifnull, nvl 212
isnull 213
isselected 214
replace 214

clamp

Purpose: constrain a number to lie within a given interval.

Syntax clamp(x, min, max)

Arguments

Type       Name    Description
numeric    x       The number to be clamped
as x       min     The lower bound of the interval
as x       max     The upper bound of the interval

Result

Type        Description
as input    x if min ≤ x ≤ max; min if x < min; max if x > max

Examples

To clamp values within a range from 18 to 65 inclusive, replacing values outside the range with the appropriate range boundaries:

clamp(CustomerAge,18,65)
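As a rough model of the result rule (x if it lies in the interval, otherwise the nearer bound), the Python sketch below reproduces the behavior for non-null numbers. It is not the FDL implementation, and it makes no assumption about how clamp treats null input, which this section does not specify.

```python
def clamp(x, lo, hi):
    """Return x if lo <= x <= hi, lo if x < lo, and hi if x > hi."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


# Hypothetical CustomerAge values, clamped to the range 18 to 65 inclusive.
ages = [12, 18, 40, 65, 71]
clamped = [clamp(age, 18, 65) for age in ages]
```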

See also

replace on page 214

cond

Purpose: produce one of two results depending on whether an expression is zero or non-zero.

Syntax cond(cond, trueExpr, falseExpr)

Arguments

Type                              Name         Description
integer                           cond         The expression to test
integer, real, date, or string    trueExpr     "True" (non-zero) branch
as trueExpr                       falseExpr    "False" (zero) branch

Result

Type                      Description
as trueExpr, falseExpr    trueExpr if cond ≠ 0; falseExpr if cond = 0

Unlike an if expression [see The if expression on page 185] or case expression [see The case expression on page 186], both branches are evaluated by the cond function.

More significantly, cond preserves the uncertainty of a null value [see The null value on page 181] in the condition.

If you don't need this behavior, use an if or case expression for speed.
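The null-preserving behavior can be pictured with a small Python model in which `None` stands in for the FDL null value. This is a sketch of the documented result rule, not the FDL implementation. Because Python evaluates function arguments before the call, both branches are evaluated here too, which mirrors the remark above.

```python
def cond(test, true_value, false_value):
    """Model of cond: a null test gives a null result, otherwise pick a branch."""
    if test is None:
        return None  # the uncertainty of the null condition is preserved
    return true_value if test != 0 else false_value


# Hypothetical SpendQtr2 values: non-zero, zero, and null.
results = [cond(spend, "Don't Mail", "Mail") for spend in (100.39, 0, None)]
```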

Examples

To target customers with a zero spend in Quarter 2 for a mailing campaign:

cond(SpendQtr2, "Don't Mail","Mail")

SpendQtr2

Don't Mail100.39

cond(SpendQtr2, "Don't Mail","Mail")

SpendQtr2

Mail-5.72

nullnull

See also

clamp on page 209

iff on page 211

ifnull, nvl on page 212

replace on page 214

iff

Purpose: produce one of two results depending on whether an expression is zero or non-zero.

Syntax iff(logical_text, value_if_true, value_if_false)

Arguments

Type                              Name              Description
real                              logical_text      The expression to test
integer, real, date, or string    value_if_true     "True" (non-zero) branch
as value_if_true                  value_if_false    "False" (zero or null) branch

Result

Type                                Description
as value_if_true, value_if_false    value_if_true if logical_text ≠ 0; value_if_false if logical_text = 0 or logical_text is null

Examples iff(Score > 650, "Mail", "No Mail")
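The result rule for iff differs from cond in how a null test is handled: null selects the "false" branch instead of producing null. A minimal Python model, with `None` standing in for null and the score values chosen purely for illustration:

```python
def iff(test, value_if_true, value_if_false):
    """Model of iff: zero or null selects the false branch."""
    if test is None or test == 0:
        return value_if_false
    return value_if_true


def mail_flag(score):
    """Apply the documented example to one score; a null score gives a null test."""
    test = None if score is None else int(score > 650)
    return iff(test, "Mail", "No Mail")
```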

See also

cond on page 210

ifnull, nvl

Purpose: replace the null value. You can use nvl as an alias for ifnull.

Syntax ifnull(testNull, replaceValue)

nvl(testNull, replaceValue)

Arguments

Type                              Name            Description
integer, real, date, or string    testNull        The value to be tested
as testNull                       replaceValue    The replacement for the null value

Result

Type        Description
as input    A copy of the value testNull with replaceValue replacing the null value

Examples:

To replace missing values with a value derived from another field:

SpendQtr1    AnnualSpend    ifnull(SpendQtr1,AnnualSpend/4)
45.21        178.97         45.21
null         150.56         37.64
null         784.27         196.07
59.05        274.18         59.05
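The same substitution can be sketched in Python, with `None` standing in for null. The rows reuse the SpendQtr1 and AnnualSpend values from the example; the function is a model of the documented result, not the FDL implementation.

```python
def ifnull(test_null, replace_value):
    """Return replace_value when test_null is null, otherwise test_null."""
    return replace_value if test_null is None else test_null


# (SpendQtr1, AnnualSpend) pairs from the example table.
rows = [(45.21, 178.97), (None, 150.56), (None, 784.27), (59.05, 274.18)]
filled = [ifnull(q1, annual / 4) for q1, annual in rows]
```

Non-null quarterly values pass through unchanged; the two null rows receive a quarter of the annual spend, which the table shows rounded to two decimal places.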

See also

cond on page 210

isnull on page 213

isnull

Purpose: test whether a value is null.

Syntax isnull(x)

Arguments

Type                              Name    Description
integer, real, date, or string    x       The value to be tested

Result

Type       Description
integer    1 if x is null; 0 otherwise

Examples

To replace null value entries in a field with that field's mean value:

Age     if isnull(Age) then mean(Age) else Age
null    44

See also

isselected

Purpose: flag the records that are currently selected.

Syntax isselected()

Arguments None

Result

Type       Description
integer    1 if the record is currently selected, otherwise 0

Note: The Table Viewer has special support for immediate expansion of this function.

Examples if isselected() then "LowRisk" else "HighRisk"

replace

Purpose: replace a number if it does not lie within a given interval.

Syntax replace(x, min, max, y)

Arguments

Type       Name    Description
numeric    x       The number to be tested
as x       min     The lower bound of the interval
as x       max     The upper bound of the interval
as x       y       The replacement value

Result

Type        Description
as input    x if min ≤ x ≤ max; y otherwise

Examples

To restrict values to a range from 18 to 65 inclusive, replacing values outside the range with the mean value:

replace(CustomerAge, 18, 65, mean(CustomerAge))
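A Python model of the result rule, applied to a hypothetical list of ages, is sketched below. The mean is taken over the whole field before any replacement, which is an assumption about how mean(CustomerAge) is evaluated; null handling is not modelled.

```python
from statistics import mean


def replace(x, lo, hi, y):
    """Return x if lo <= x <= hi, otherwise the replacement value y."""
    return x if lo <= x <= hi else y


# Hypothetical CustomerAge values; their mean is 40.
ages = [16, 30, 44, 70]
field_mean = mean(ages)
replaced = [replace(age, 18, 65, field_mean) for age in ages]
```

Unlike clamp, out-of-range values are not moved to the nearest boundary: 16 and 70 both become the field mean.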

See also

clamp on page 209

18 - Datatype-conversion functions

In this section

todate 217
tointeger 218
toreal 219
tostring 220

todate

Purpose: convert a value to a date.

Syntax todate(x [, format] )

Arguments

Type                              Name      Description
integer, real, date, or string    x         A value to convert
string                            format    A date format [see Date formats on page 52] to use for the conversion (optional and only allowed if x is of string type)

Result

Type    Description
date    x, as a date

Note:
• If the argument is of integer type, it is assumed to represent the date as YYYYMMDD.
• If the argument is of real type, it is assumed to represent the date as YYYYMMDD.HHMMSS.
• If the argument is of string type, and no date format argument is provided, the string is converted according to the setting of your read preference.

Examples (Date values shown with European date format.)

todate(20040109) equals 09/01/2004:00:00:00.

todate(20040109.122853) equals 09/01/2004:12:28:53.

todate("09/01/2004") equals 09/01/2004:00:00:00.

todate("09-Jan-2004","%d-%b-%Y") equals 09/01/2004:00:00:00.

todate(#09/01/2004:12:28:53) equals 09/01/2004:12:28:53.
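The numeric encodings described in the note (YYYYMMDD for integers, YYYYMMDD.HHMMSS for reals) can be decoded as in the Python sketch below. The function name `todate_from_number` is hypothetical, and the rounding of the fractional part is an assumption made to absorb floating-point error; the FDL implementation may differ.

```python
from datetime import datetime


def todate_from_number(x):
    """Decode YYYYMMDD (integer) or YYYYMMDD.HHMMSS (real) into a datetime."""
    day_part = int(x)
    # Six decimal digits hold HHMMSS; round to absorb floating-point error.
    time_part = round((x - day_part) * 1_000_000)
    year, rest = divmod(day_part, 10_000)
    month, day = divmod(rest, 100)
    hour, rest = divmod(time_part, 10_000)
    minute, second = divmod(rest, 100)
    return datetime(year, month, day, hour, minute, second)
```

With this decoding, 20040109 maps to midnight on 9 January 2004 and 20040109.122853 maps to 12:28:53 on the same day.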

See also

tointeger on page 218

toreal on page 219

tostring on page 220

Type conversion on page 183

tointeger

Purpose: convert a value to an integer.

Syntax tointeger(x)

Arguments

DescriptionNameType

Result

DescriptionType

x, as an integerinteger

Note:
• If the argument is of real type, digits after the decimal point are ignored.
• If the argument is of date type, the result is of the form YYYYMMDD (the time component is ignored).
• If the argument is of string type, any trailing non-numeric characters are ignored.

Examples

tointeger(100) equals 100.

tointeger(100.65) equals 100.

tointeger("100.65") equals 100.

tointeger(#09/01/2004:12:28:53) equals 20040109.

See also

todate on page 217

toreal on page 219

toreal

Purpose: convert a value to a real number.

Syntax toreal(x)

Arguments

DescriptionNameType

Result

DescriptionType

x, as a real numberreal

Note:
• If the argument is of date type, the result is of the form YYYYMMDD.HHMMSS.
• If the argument is of string type, any trailing non-numeric characters are ignored.

Examples

toreal(100) equals 100.0.

toreal(100.65) equals 100.65.

toreal("100.65") equals 100.65.

toreal(#09/01/2004:12:28:53) equals 20040109.122853.

See also

todate on page 217

tostring

Purpose: convert a value to a string.

Syntax tostring(x)

Arguments

DescriptionNameType

Result

DescriptionType

x, as a stringstring

Note:
• If the argument is of real type, the result is in ordinary decimal (not scientific) format.
• If the argument is of date type, the result is formatted according to the setting of your date write preference.

Examples

tostring(100) equals "100".

tostring(1.0065e2) equals "100.650000".

tostring("100.65") equals "100.65".

tostring(#09/01/2004:12:28:53) equals "09/01/2004:12:28:53".

See also

todate on page 217

toreal on page 219

19 - Functions for working with dates

In this section

addcenturies, addcenturiescountbackwards 223
adddays 224
addhours 225
addminutes 226
addmonths, addmonthscountbackwards 227
addseconds 229
addweeks 230
addyears, addyearscountbackwards 231
countcenturies 232
countdays 233
counthours 234
countminutes 235
countseconds 236
countweeks 237
countwholecenturies, countwholecenturiesbackwards 238
countwholedays 240
countwholehours 241
countwholeminutes 242
countwholemonths, countwholemonthsbackwards 243
countwholeseconds 245
countwholeweeks 246
countwholeyears, countwholeyearsbackwards 247
countyears 248
day 249
dayofweek 250
gmt2edt 250
hour 251
minute 252
month 253
now 253
second 254
today 255
weekofyear 255
year 256

addcenturies, addcenturiescountbackwards

Purpose: calculate a date from another date using an offset in centuries.

Syntax addcenturies(date, n)

addcenturiescountbackwards(date, n)

Arguments

Type       Name    Description
date       date    The starting date
integer    n       The number of centuries to add (which can be negative)

Result

The date n calendar centuries after the date date (interpreted as |n| centuries before date if n < 0).

The time part of the result is the same as the time part of date.

See addmonths, addmonthscountbackwards on page 227 for an explanation of the two variants.

The result of adding n centuries using the addcenturies (respectively addcenturiescountbackwards) function is defined to be the result of adding 1200n months with the addmonths (respectively addmonthscountbackwards) function.

Examples

To calculate a date one century later than d, showing the difference between the addcenturies and addcenturiescountbackwards variants (YMD date format):

d             addcenturies(d,1)    addcenturiescountbackwards(d,1)
1900/02/14    2000/02/14           2000/02/15

See also

countcenturies on page 232

countwholecenturies, countwholecenturiesbackwards on page 238

Date formats on page 52

adddays

Purpose: calculate a date from another date using an offset in days.

Syntax adddays(date, n)

Arguments

DescriptionNameType

The number of days to add (which canbe negative)

ninteger

Result

DescriptionType

The date n days after the date date (interpreted as |n| days before date if n < 0).

Examples

European date format: adddays(#09/01/2004, 25) equals 03/02/2004.

European date format: adddays(#09/01/2004:11:43:46, 25) equals 03/02/2004:11:43:46.

American date format: adddays(#01/09/2004, 25) equals 02/03/2004.

American date format: adddays(#01/09/2004:11:43:46, 25) equals 02/03/2004:11:43:46.

YMD date format: adddays(#2004/01/09, 25) equals 2004/02/03.

YMD date format: adddays(#2004/01/09:11:43:46, 25) equals 2004/02/03:11:43:46.
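The day offset is plain calendar arithmetic, so the examples can be cross-checked with Python's standard `datetime` module. This is a model for verification only, not the FDL implementation.

```python
from datetime import datetime, timedelta


def adddays(date, n):
    """Return the date n days after date (|n| days before it when n < 0)."""
    return date + timedelta(days=n)


start = datetime(2004, 1, 9, 11, 43, 46)
later = adddays(start, 25)     # 3 February 2004, same time of day
earlier = adddays(start, -25)  # 15 December 2003, same time of day
```

The time part is carried through unchanged, as in the examples above.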

See also

countdays on page 233

countwholedays on page 240

addhours

Purpose: calculate a date from another date using an offset in hours.

Syntax addhours(date, n)

Arguments

DescriptionNameType

The number of hours to add (which canbe negative)

ninteger

Result

DescriptionType

The date n hours after the date date (interpreted as |n|hours before date if n < 0)

Note: All dates in FDL include a time component.

Examples

European date format: addhours(#09/01/2004:12:28:53, 25) equals 10/01/2004:13:28:53.

American date format: addhours(#01/09/2004:12:28:53, 25) equals 01/10/2004:13:28:53.

YMD date format: addhours(#2004/01/09:12:28:53, 25) equals 2004/01/10:13:28:53.

See also

counthours on page 234

countwholehours on page 241

addminutes

Purpose: calculate a date from another date using an offset in minutes.

Syntax addminutes(date, n)

Arguments

DescriptionNameType

The number of minutes to add (whichcan be negative)

ninteger

Result

DescriptionType

The date n minutes after the date date (interpreted as |n| minutes before date if n < 0)

Examples

European date format: addminutes(#09/01/2004:12:28:53, 25) equals 09/01/2004:12:53:53.

American date format: addminutes(#01/09/2004:12:28:53, 25) equals 01/09/2004:12:53:53.

YMD date format: addminutes(#2004/01/09:12:28:53, 25) equals 2004/01/09:12:53:53.

See also

countminutes on page 235

countwholeminutes on page 242

addmonths, addmonthscountbackwards

Purpose: calculate a date from another date using an offset in months.

Syntax addmonths(date, n)

addmonthscountbackwards(date, n)

Arguments

DescriptionNameType

The number of months to add (whichcan be negative)

ninteger

Result

The date n calendar months after the date date (interpreted as |n| months before date if n < 0).

The month and year parts of the result are determined by the offset n.

Then the day of month of the result is set to be the same as it is in the date date (for addmonths) or the same number of days from the end of the month as it is in the date date (for addmonthscountbackwards).

If the day of month would end up being before the beginning or after the end of the month (because the result month is a shorter month than the month containing date), it is clamped to the first or last day of the month as appropriate.

Note:
• If you add n months using addmonths (or addmonthscountbackwards), you won't necessarily get the same result as you would get by adding one month n times, as the day of month might be constrained by an intermediate short month in the latter case.
• The results of adding to a date with addmonths and addmonthscountbackwards may differ if the month part of the date and the month part of the result contain different numbers of days.

Examples

To calculate a date one month later than d, showing the propagation of time parts (YMD date format):

d                      addmonths(d,1)
2007/08/14:00:00:00    2007/09/14:00:00:00
2007/08/14:12:34:56    2007/09/14:12:34:56

To calculate a date one month later than d, showing the difference between the addmonths and addmonthscountbackwards variants (YMD date format):

d             addmonths(d,1)    addmonthscountbackwards(d,1)
2007/08/14    2007/09/14        2007/09/13

To calculate a date one month later than d, showing the clamping behavior of addmonths at the end of a month (YMD date format):

d             addmonths(d,1)
2007/08/31    2007/09/30

To calculate a date one month later than d, showing the clamping behavior of addmonthscountbackwards at the start of the month (YMD date format):

d             addmonthscountbackwards(d,1)
2007/08/02    2007/09/01
2007/08/01    2007/09/01

To calculate the six-month anniversary of each customer's acquisition, using the customer acquisitiondate field StartDate:

addmonths(StartDate, 6)
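The two variants and their clamping rules can be modelled in Python as follows. This is a sketch written from the description in this section, not the FDL implementation; the helper `_target_month` is hypothetical.

```python
import calendar
from datetime import datetime


def _target_month(date, n):
    """Return (year, month) reached by moving n calendar months from date."""
    year, month0 = divmod(date.year * 12 + (date.month - 1) + n, 12)
    return year, month0 + 1


def addmonths(date, n):
    """Keep the day of month, clamped to the last day of the result month."""
    year, month = _target_month(date, n)
    last_day = calendar.monthrange(year, month)[1]
    return date.replace(year=year, month=month, day=min(date.day, last_day))


def addmonthscountbackwards(date, n):
    """Keep the distance from the end of the month, clamped to the first day."""
    year, month = _target_month(date, n)
    days_from_end = calendar.monthrange(date.year, date.month)[1] - date.day
    last_day = calendar.monthrange(year, month)[1]
    return date.replace(year=year, month=month,
                        day=max(1, last_day - days_from_end))
```

The model also reproduces the note about repeated addition: adding two months to 31 January 2007 gives 31 March, whereas adding one month twice passes through 28 February and ends on 28 March.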

See also

countwholemonths, countwholemonthsbackwards on page 243

addseconds

Purpose: calculate a date from another date using an offset in seconds.

Syntax addseconds(date, n)

Arguments

DescriptionNameType

The number of seconds to add (whichcan be negative)

ninteger

Result

DescriptionType

The date n seconds after the date date (interpreted as |n| seconds before date if n < 0)

Examples

European date format: addseconds(#09/01/2004:12:28:53, 25) equals 09/01/2004:12:29:18.

American date format: addseconds(#01/09/2004:12:28:53, 25) equals 01/09/2004:12:29:18.

YMD date format: addseconds(#2004/01/09:12:28:53, 25) equals 2004/01/09:12:29:18.

See also

countseconds on page 236

countwholeseconds on page 245

addweeks

Purpose: calculate a date from another date using an offset in weeks.

Syntax addweeks(date, n)

Arguments

DescriptionNameType

The number of weeks to add (which canbe negative)

ninteger

Result

DescriptionType

The date n weeks after the date date (interpreted as |n| weeks before date if n < 0).

Examples

European date format: addweeks(#09/01/2004, 8) equals 05/03/2004.

American date format: addweeks(#01/09/2004, 8) equals 03/05/2004.

YMD date format: addweeks(#2004/01/09, 8) equals 2004/03/05.

See also

countweeks on page 237

countwholeweeks on page 246

addyears, addyearscountbackwards

Purpose: calculate a date from another date using an offset in years.

Syntax addyears(date, n)

addyearscountbackwards(date, n)

Arguments

DescriptionNameType

The number of years to add (which canbe negative)

ninteger

Result

DescriptionType

The date n calendar years after the date date (interpreted as |n| years before date if n < 0).

See addmonths, addmonthscountbackwards on page 227 for an explanation of the two variants.

The result of adding n years using the addyears (respectively addyearscountbackwards) function is defined to be the result of adding 12n months with the addmonths (respectively addmonthscountbackwards) function.

Examples

To calculate a date one year later than d, showing the difference between the addyears and addyearscountbackwards variants (YMD date format):

d             addyears(d,1)    addyearscountbackwards(d,1)
2007/02/14    2008/02/14       2008/02/15

To calculate the one-year anniversary of each customer's acquisition, using the customer acquisitiondate field StartDate:

addyears(StartDate, 1)

See also

countwholeyears, countwholeyearsbackwards on page 247

countyears on page 248

countcenturies

Purpose: calculate the period between two dates in centuries.

Syntax countcenturies(date1, date2)

Arguments

Type    Name     Description
date    date1    The first date
date    date2    The second date

Result

The period between date1 and date2 in centuries (a negative number if date1 is later than date2)

Note:
• If the number is not an exact multiple of 100 years (taking time parts of dates into account), the result includes a fractional part.
• The result of the countcenturies function is defined to be the result of the countyears function, divided by 100.

Examples

European date format: countcenturies(#09/01/2004, #09/01/2054) equals 0.5.

American date format: countcenturies(#01/09/2004, #01/09/2054) equals 0.5.

YMD date format: countcenturies(#2004/01/09, #2054/01/09) equals 0.5.

See also

countwholecenturies, countwholecenturiesbackwards on page 238

countdays

Purpose: calculate the period between two dates in days.

Syntax countdays(date1, date2)

Arguments

DescriptionNameType

Result

DescriptionType

The period between date1 and date2 in days (a negative number if date1 is later than date2)

Note: If the time parts of the two dates differ, the result includes a fractional part.

Examples

European date format:

countdays(#09/01/2004, #03/02/2004) equals 25.

countdays(#09/01/2004:10:00:00, #03/02/2004:11:00:00) equals 25.04.

American date format:

YMD date format:
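The fractional result can be checked with a short Python model that divides the elapsed time by one day. This is an illustration of the documented behavior, not the FDL implementation.

```python
from datetime import datetime, timedelta


def countdays(date1, date2):
    """Return the period from date1 to date2 in days, including any fraction."""
    return (date2 - date1) / timedelta(days=1)


whole = countdays(datetime(2004, 1, 9), datetime(2004, 2, 3))
fractional = countdays(datetime(2004, 1, 9, 10, 0, 0),
                       datetime(2004, 2, 3, 11, 0, 0))
```

The second call spans 25 days and one hour, that is 25 + 1/24 days, which the example shows rounded to 25.04. Swapping the arguments gives the negated value.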

See also

adddays on page 224

countwholedays on page 240

counthours

Purpose: calculate the period between two dates in hours.

Syntax counthours(time1, time2)

Arguments

DescriptionNameType

the first date/timetime1date

the second date/timetime2date

Result

DescriptionType

The period between time1 and time2 in hours (a negative number if time1 is later than time2)

Note: If the minutes or seconds parts of the two dates differ, the result includes a fractional part.

Examples

European date format: counthours(#09/01/2004:12:28:53, #10/01/2004:13:28:53)equals 25.

American date format: counthours(#01/09/2004:12:28:53, #01/10/2004:13:28:53)equals 25.

YMD date format: counthours(#2004/01/09:12:28:53, #2004/01/10:13:28:53) equals25.

See also

countwholehours on page 241

countminutes

Purpose: calculate the period between two dates in minutes.

Syntax countminutes(time1, time2)

Arguments

DescriptionNameType

Result

DescriptionType

The period between time1 and time2 in minutes (a negative number if time1 is later than time2)

Note: If the seconds parts of the two dates differ, the result includes a fractional part.

Examples

European date format: countminutes(#09/01/2004:12:28:53, #09/01/2004:12:53:53)equals 25.

American date format: countminutes(#01/09/2004:12:28:53, #01/09/2004:12:53:53)equals 25.

YMD date format: countminutes(#2004/01/09:12:28:53, #2004/01/09:12:53:53)equals 25.

See also

countwholeminutes on page 242

countseconds

Purpose: count the number of seconds between two dates.

Syntax countseconds(time1, time2)

Arguments

DescriptionNameType

Result

DescriptionType

The number of seconds between time1 and time2 (anegative number if time1 is later than time2)

The only difference between this function and countwholeseconds is that the result of thelatter is of type integer.

Examples

European date format: countseconds(#09/01/2004:12:28:53, #09/01/2004:12:29:18)equals 25.

American date format: countseconds(#01/09/2004:12:28:53, #01/09/2004:12:29:18)equals 25.

YMD date format: countseconds(#2004/01/09:12:28:53, #2004/01/09:12:29:18)equals 25.

See also

countweeks

Purpose: calculate the period between two dates in weeks.

Syntax countweeks(date1, date2)

Arguments

DescriptionNameType

Result

DescriptionType

The period between date1 and date2 in weeks (a negative number if date1 is later than date2)

Note: If the period is not an exact multiple of seven days (taking time parts of dates into account), the result includes a fractional part.

Examples

European date format: countweeks(#09/01/2004, #05/03/2004) equals 8.

American date format: countweeks(#01/09/2004, #03/05/2004) equals 8.

YMD date format: countweeks(#2004/01/09, #2004/03/05) equals 8.

See also

countwholeweeks on page 246

countwholecenturies, countwholecenturiesbackwards

Purpose: count the number of complete centuries between two dates.

Syntax countwholecenturies(date1, date2)

countwholecenturiesbackwards(date1, date2)

Arguments

DescriptionNameType

Result

Type       Description
integer    The number of complete centuries elapsed between date1 and date2 (a negative number if date1 is later than date2). See countwholemonths, countwholemonthsbackwards on page 243 for an explanation of the two variants.

Note:
• The result of the countwholecenturies (respectively countwholecenturiesbackwards) function is defined as the result of the countwholemonths (respectively countwholemonthsbackwards) function divided by 1200, ignoring any remainder.
• The results of comparing two dates with countwholecenturies and countwholecenturiesbackwards may differ, but only when both dates lie in February, one date lies in a year divisible by 400, and the other lies in a year divisible by 100 but not divisible by 400.

Examples

To count the number of complete centuries between d1 and d2, showing the difference between the countwholecenturies and countwholecenturiesbackwards variants (YMD date format):

d1            d2            countwholecenturies(d1,d2)    countwholecenturiesbackwards(d1,d2)
1900/02/14    2000/02/13    0                             0
1900/02/14    2000/02/14    1                             0
1900/02/14    2000/02/15    1                             1

See also

addcenturies, addcenturiescountbackwards on page 223

countcenturies on page 232

countwholedays

Purpose: count the number of complete days between two dates.

Syntax countwholedays(date1, date2)

Arguments

DescriptionNameType

Result

Type       Description
integer    The number of complete 24-hour periods elapsed between date1 and date2 (a negative number if date1 is later than date2)

Examples

European date format:

countwholedays(#09/01/2004:10:00:00, #03/02/2004:11:00:00) equals 25.

American date format:

YMD date format:

See also

adddays on page 224

countdays on page 233

countwholehours

Purpose: count the number of complete hours between two dates.

Syntax countwholehours(time1, time2)

Arguments

DescriptionNameType

Result

DescriptionType

The number of complete hours elapsed between time1and time2 (a negative number if time1 is later thantime2)

integer

Examples

European date format: countwholehours(#09/01/2004:12:28:53,#10/01/2004:13:28:53) equals 24.

American date format: countwholehours(#01/09/2004:12:28:53,#01/10/2004:13:28:53) equals 24.

YMD date format: countwholehours(#2004/01/09:12:28:53, #2004/01/10:13:28:53)equals 24.

See also

counthours on page 234

countwholeminutes

Purpose: count the number of complete minutes between two dates.

Syntax countwholeminutes(time1, time2)

Arguments

DescriptionNameType

Result

DescriptionType

The number of complete minutes elapsed betweentime1and time2 (a negative number if time1 is later thantime2)

integer

Examples

European date format: countwholeminutes(#09/01/2004:12:28:53,#09/01/2004:12:53:53) equals 24.

American date format: countwholeminutes(#01/09/2004:12:28:53,#01/09/2004:12:53:53) equals 24.

YMD date format:countwholeminutes(#2004/01/09:12:28:53, #2004/01/09:12:53:53)equals 24.

See also

countminutes on page 235

countwholemonths, countwholemonthsbackwards

Purpose: count the number of complete months between two dates.

Syntax countwholemonths(date1, date2)

countwholemonthsbackwards(date1, date2)

Arguments

DescriptionNameType

Result

Type       Description
integer    The number of complete months elapsed between date1 and date2 (a negative number if date1 is later than date2)

Note:
• The time parts of date1 and date2 are ignored.
• The number of complete months between two dates is the greatest number of months that can be added to the first date (truncating time parts, and using addmonths in the case of countwholemonths, and addmonthscountbackwards in the case of countwholemonthsbackwards) such that if the second date is later than the first date, the result is no later than the second date, and if the second date is earlier than the first date, the result is no earlier than the second date.
• The results of comparing two dates with countwholemonths and countwholemonthsbackwards may differ when the dates lie in months of different lengths.
• Because of the clamping behavior of addmonths (and addmonthscountbackwards), the absolute value of the results of comparing two dates with countwholemonths (or countwholemonthsbackwards) may depend on the order of the arguments when the dates lie in months of different lengths.

Examples

To count the number of complete months between d1 and d2, showing the truncation of time parts (YMD date format):

d1                     d2                     countwholemonths(d1,d2)
2007/08/14:00:00:00    2007/09/14:00:00:00    1
2007/08/14:12:34:56    2007/09/14:00:00:00    1

To count the number of complete months between d1 and d2, showing the difference between the countwholemonths and countwholemonthsbackwards variants (YMD date format):

d1            d2            countwholemonths(d1,d2)    countwholemonthsbackwards(d1,d2)
2007/08/14    2007/09/12    0                          0
2007/08/14    2007/09/13    0                          1
2007/08/14    2007/09/14    1                          1

To count the number of complete months between d1 and d2, showing the clamping behavior of countwholemonths at the end of the month (YMD date format):

d1            d2            countwholemonths(d1,d2)    countwholemonths(d2,d1)
2007/08/31    2007/09/30    1                          0

To count the number of complete months between d1 and d2, showing the clamping behavior of countwholemonthsbackwards at the start of the month (YMD date format):

d1            d2            countwholemonthsbackwards(d1,d2)    countwholemonthsbackwards(d2,d1)
2007/07/01    2007/06/01    -1                                  0

To count the number of monthly e-mail communications each customer has received, wherecommunications are sent on the fifth day of each month (the final communication having been senton March 5th 2006), and using the customer acquisition date field StartDate:

-countwholemonths(#2006/03/05, StartDate)

To count the number of monthly statements each customer has received, where statements are sent seven days before the end of each month (the final statement having been sent on March 24th 2006), and using the customer acquisition date field StartDate:

-countwholemonthsbackwards(#2006/03/24, StartDate)
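The definition in the note (the greatest number of months that can be added to the first date without passing the second) can be modelled in Python as below. This is a sketch written from the description in this section, covering only the countwholemonths variant; it works on plain dates, which corresponds to truncating time parts. It is not the FDL implementation.

```python
import calendar
from datetime import date


def addmonths(d, n):
    """Move n calendar months, clamping the day to the end of a short month."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + n, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return d.replace(year=year, month=month0 + 1, day=min(d.day, last_day))


def countwholemonths(d1, d2):
    """Greatest n (by magnitude) such that addmonths(d1, n) does not pass d2."""
    # The difference in calendar months is never smaller in magnitude
    # than the answer, so start there and step back towards zero.
    n = (d2.year - d1.year) * 12 + (d2.month - d1.month)
    if d2 >= d1:
        while n > 0 and addmonths(d1, n) > d2:
            n -= 1
    else:
        while n < 0 and addmonths(d1, n) < d2:
            n += 1
    return n
```

For 31 August 2007 and 30 September 2007 the model returns 1 in one direction and 0 in the other, which is the order dependence described in the final note.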

See also

addmonths, addmonthscountbackwards on page 227

countwholeseconds

Purpose: count the number of seconds between two dates.

Syntax countwholeseconds(time1, time2)

Arguments

DescriptionNameType

Result

DescriptionType

The number of seconds between time1 and time2 (anegative number if time1 is later than time2)

integer

The only difference between this function and countseconds is that the result of the latteris of type real.

Examples

European date format: countwholeseconds(#09/01/2004:12:28:53,#09/01/2004:12:29:18) equals 25.

American date format: countwholeseconds(#01/09/2004:12:28:53,#01/09/2004:12:29:18) equals 25.

YMD date format:countwholeseconds(#2004/01/09:12:28:53, #2004/01/09:12:29:18)equals 25.

See also

countwholeweeks

Purpose: count the number of complete weeks between two dates.

Syntax countwholeweeks(date1, date2)

Arguments

DescriptionNameType

Result

DescriptionType

The number of complete weeks elapsed between date1and date2 (a negative number if date1 is later thandate2)

integer

Examples

European date format: countwholeweeks(#09/01/2004, #09/03/2004) equals 8.

American date format: countwholeweeks(#01/09/2004, #03/09/2004) equals 8.

YMD date format: countwholeweeks(#2004/01/09, #2004/03/09) equals 8.

See also

countweeks on page 237

countwholeyears, countwholeyearsbackwards

Purpose: count the number of complete years between two dates.

Syntax countwholeyears(date1, date2)

countwholeyearsbackwards(date1, date2)

Arguments

DescriptionNameType

Result

Type       Description
integer    The number of complete years elapsed between date1 and date2 (a negative number if date1 is later than date2). See countwholemonths, countwholemonthsbackwards on page 243 for an explanation of the two variants.

Note:
• The result of the countwholeyears (respectively countwholeyearsbackwards) function is defined as the result of the countwholemonths (respectively countwholemonthsbackwards) function divided by 12, ignoring any remainder.
• The results of comparing two dates with countwholeyears and countwholeyearsbackwards may differ, but only when both dates lie in February, one date lies in a leap year, and the other does not.

Examples

To count the number of complete years between d1 and d2, showing the difference between the countwholeyears and countwholeyearsbackwards variants (YMD date format):

d1            d2            countwholeyears(d1,d2)    countwholeyearsbackwards(d1,d2)
2007/02/14    2008/02/13    0                         0
2007/02/14    2008/02/14    1                         0
2007/02/14    2008/02/15    1                         1

To compute the age of each customer at March 1st 2007 from a date-of-birth field DOB (YMD dateformat):

countwholeyears(DOB, #2007/03/01)

See also

addyears, addyearscountbackwards on page 231

countyears on page 248

countyears

Purpose: calculate the period between two dates in years.

Syntax countyears(date1, date2)

Arguments

DescriptionNameType

Result

DescriptionType

The period between date1 and date2 in years (a negative number if date1 is later than date2)

Note: If the month, day, or time parts of the two dates differ, the result includes a fractional part, counting the incomplete year as a fraction of the number of days (365 or 366) in the year of the later of the two dates.

Examples

European date format: countyears(#09/01/2004, #09/01/1994) equals -10.

American date format: countyears(#01/09/2004, #01/09/1994) equals -10.

YMD date format: countyears(#2004/01/09, #1994/01/09) equals -10.

See also

countwholeyears, countwholeyearsbackwards on page 247

day

Purpose: obtain the day-of-month part of a date.

Syntax day(date)

Arguments

Type  Name  Description
date  date  A date

Result

Type     Description
integer  The day-of-month part of date

Examples

European date format: day(#09/01/2004) equals 9.

American date format: day(#01/09/2004) equals 9.

YMD date format: day(#2004/01/09) equals 9.

See also

dayofweek

Purpose: obtain a number representing the day-of-week of a date.

Syntax dayofweek(date)

Arguments

Type  Name  Description
date  date  A date

Result

Type     Description
integer  The day-of-week of date (a number between 0 and 6, with 0 representing Sunday)

Examples

European date format: dayofweek(#09/01/2004) equals 5 (that is, Friday).

American date format: dayofweek(#01/09/2004) equals 5 (that is, Friday).

YMD date format: dayofweek(#2004/01/09) equals 5 (that is, Friday).
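
For readers checking results outside TML, the Sunday-is-0 numbering can be reproduced in Python. This is only an illustrative sketch; the helper name `day_of_week` is hypothetical and is not part of TML.

```python
from datetime import date


def day_of_week(d: date) -> int:
    """Day of week with Sunday as 0 and Saturday as 6."""
    # date.weekday() numbers Monday as 0 and Sunday as 6,
    # so shift by one and wrap around.
    return (d.weekday() + 1) % 7


if __name__ == "__main__":
    print(day_of_week(date(2004, 1, 9)))
```

The shift is needed because Python's own numbering starts the week on Monday.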

See also

gmt2edt

Purpose: convert a Greenwich Mean Time (GMT) date to Eastern Daylight Time (EDT).

Syntax gmt2edt(datetimeGMT)

Arguments

Type  Name         Description
date  datetimeGMT  The GMT date to be converted

Result

Type  Description
date  The EDT date, accounting for daylight savings

Examples

Winter: GMT2EDT(#09/01/2004:12:28:53) equals 09/01/2004:07:28:53.

Summer: GMT2EDT(#09/07/2004:12:28:53) equals 09/07/2004:08:28:53.

hour

Purpose: obtain the hours part of a date.

Syntax hour(time)

Arguments

Type  Name  Description
date  time  A date/time

Result

Type     Description
integer  The hours part of time

Examples

European date format: hour(#09/01/2004:12:28:53) equals 12.

American date format: hour(#01/09/2004:12:28:53) equals 12.

YMD date format: hour(#2004/01/09:12:28:53) equals 12.

See also

minute

Purpose: obtain the minutes part of a date.

Syntax minute(time)

Arguments

Type  Name  Description
date  time  A date/time

Result

Type     Description
integer  The minutes part of time

Examples

European date format: minute(#09/01/2004:12:28:53) equals 28.

American date format: minute(#01/09/2004:12:28:53) equals 28.

YMD date format: minute(#2004/01/09:12:28:53) equals 28.

See also

month

Purpose: obtain the month part of a date.

Syntax month(date)

Arguments

Type  Name  Description
date  date  A date

Result

Type     Description
integer  The month part of date

Examples

European date format: month(#09/01/2004) equals 1.

American date format: month(#01/09/2004) equals 1.

YMD date format: month(#2004/01/09) equals 1.

See also

now

Purpose: obtain the current date and time.

Syntax now()

Arguments None

Result

Type  Description

The current date (including time), obtained from the system clock

Examples

European date format: now() equals 09/01/2004:12:28:53.

American date format: now() equals 01/09/2004:12:28:53.

YMD date format: now() equals 2004/01/09:12:28:53.

See also

today on page 255

second

Purpose: obtain the seconds part of a date.

Syntax second(time)

Arguments

Type  Name  Description
date  time  A date/time

Result

Type     Description
integer  The seconds part of time

Examples

European date format: second(#09/01/2004:12:28:53) equals 53.

American date format: second(#01/09/2004:12:28:53) equals 53.

YMD date format: second(#2004/01/09:12:28:53) equals 53.

See also

today

Purpose: obtain the current date.

Syntax today()

Arguments None

Result

Type  Description

The current date obtained from the system clock, with the time part set to 00:00:00

Examples

European date format: today() equals 09/01/2004:00:00:00.

American date format: today() equals 01/09/2004:00:00:00.

YMD date format: today() equals 2004/01/09:00:00:00.

See also

now on page 253

weekofyear

Purpose: calculate the week-of-year of a date, relative to a specified start date.

Syntax weekofyear(date, yearStart)

Arguments

Type  Name       Description
date  date       The date whose week-of-year is to be obtained
date  yearStart  The start date (for example, of a financial year)

Result

Type     Description
integer  The week-of-year of date, relative to yearStart

Note: Only the month and day parts of the start date are used.

Examples

European date format: weekofyear(#09/07/2004, #01/04/2004) equals 14.

American date format: weekofyear(#07/09/2004, #04/01/2004) equals 14.

YMD date format: weekofyear(#2004/07/09, #2004/04/01) equals 14.

See also

year

Purpose: obtain the year part of a date.

Syntax year(date)

Arguments

Type  Name  Description
date  date  A date

Result

Type     Description
integer  The year part of date

Examples

European date format: year(#09/01/2004) equals 2004.

American date format: year(#01/09/2004) equals 2004.

YMD date format: year(#2004/01/09) equals 2004.

See also

20 - Functions for working with strings

In this section

concat
endswith
find
left
mid
right
soundex
startswith
strlen
strmember
substitute
substr
tolower
toupper
trim

concat

Purpose: concatenate two or more strings.

Syntax concat( , , ...)

Arguments

Type    Name     Description
string  , , ...  Strings to be concatenated

Result

Type    Description
string  The string obtained by writing the characters of , followed by the characters of , etc.

Examples

Derive a customer's complete name from two fields FirstName and Surname:

FirstName  Surname  concat(FirstName," ",Surname)
John       Brown    John Brown
David      Smith    David Smith

Derive a customer's complete name, but using only their initial:

FirstName  Surname  concat(substr(FirstName,0,0),". ",Surname)
John       Brown    J. Brown
David      Smith    D. Smith

endswith

Purpose: test whether one string ends with another.

Syntax endswith(find_text, within_text)

Arguments

Type    Name         Description
string  find_text    The string to be found at the end
string  within_text  The string to be tested

Result

Type     Description
integer  1 (true) if find_text is found at the end of within_text; 0 (false) otherwise

The endswith function is based on the match function. It appends a $ character to find_text before treating the result as a regular expression [see Regular expressions on page 276].

Examples

To derive a field that flags e-mail addresses from .net domains:

EmailAddress                  endswith(".net", EmailAddress)
J.Brown@glob.net              1
D.Smith@spectrumsoftware.com  0
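
Because find_text is treated as a regular expression after the $ is appended, an unescaped period matches any character. The Python sketch below mimics that described behaviour with Python's `re` module, which is an assumption for illustration only (the product uses ICU regular expressions); the helper name `ends_with` is hypothetical.

```python
import re


def ends_with(find_text: str, within_text: str) -> int:
    """Append "$" to find_text and use the result as a regular expression."""
    return 1 if re.search(find_text + "$", within_text) else 0


if __name__ == "__main__":
    print(ends_with(".net", "J.Brown@glob.net"))
    print(ends_with(".net", "D.Smith@spectrumsoftware.com"))
    # "." matches any character, so "rnet" at the end also qualifies.
    print(ends_with(".net", "user@internet"))
    # Escaping the period restricts the test to a literal ".net".
    print(ends_with(r"\.net", "user@internet"))
```

If the same holds in TML, escaping the period in find_text is the safer way to test for a literal ".net" suffix.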

See also

find on page 261

startswith on page 266

find

Purpose: test whether one string is contained within another.

Syntax find(find_text, within_text)

Arguments

Type    Name       Description
string  find_text  The string to be found

Result

Type     Description
integer  1 (true) if find_text is found in within_text; 0 (false) otherwise

The find function is based on the match function, and it treats find_text as a regular expression [see Regular expressions on page 276].

Examples

To flag e-mail addresses containing the text "pitneybowes" in upper, lower, or mixed case:

EmailAddress             find("pitneybowes",tolower(EmailAddress))
J.BROWN@PITNEYBOWES.COM  1
J.Brown@glob.net         0
D.Smith@pitneybowes.com  1

See also

endswith on page 260

startswith on page 266

left

Purpose: return a substring of specified length from the start of a string.

Syntax left(text, num_chars)

Arguments

Type     Name       Description
string   text       The string to extract from
integer  num_chars  The number of characters

Result

Type    Description
string  The substring formed by num_chars characters at the start of text

Note: The left function is based on the substr function.

Examples

To use the first two characters of a StateCode field to identify west-coast customers:

StateCode  member (left (StateCode, 2), "CA","OR", "WA")
OR009      1
CA043      1
UT005      0
WA027      1

See also

mid on page 263

right on page 264

trim on page 273

mid

Purpose: return a substring of specified length from the middle of a string.

Syntax mid(text, start_num, num_chars)

Arguments

Type    Name       Description
string  start_num  The first index
string  num_chars  The number of characters

Result

Type    Description
string  The substring formed by the num_chars characters from character position start_num (inclusive) of text, where the character positions are numbered starting from 1

Note: The mid function is based on the substr function.

Examples

To derive a location from the fifth through seventh characters of a store code:

StoreCode   mid(StoreCode, 5, 3)
0114EDI256  EDI
1863LON836  LON
9326EDI039  EDI
0387BOS041  BOS

See also

left on page 262

right on page 264

trim on page 273

right

Purpose: return a substring of specified length from the end of a string.

Syntax right(text, num_chars)

Arguments

Type     Name       Description
integer  num_chars  The number of characters

Result

Type    Description
string  The substring formed by num_chars characters at the end of text

Note: The right function is based on the substr function.

Examples

To return the house/street component of a UK postcode field, that is, the last three characters:

Postcode  right(Postcode, 3)
NE25 0AY  0AY
EH3 7RA   7RA
SL4 1QX   1QX
EH9 0NJ   0NJ

See also

left on page 262

mid on page 263

trim on page 273

soundex

Purpose: reduce each word to a four-character string for indexing purposes.

Syntax soundex(text)

Arguments

Type    Name  Description
string  text  The text to reduce

Result

Type    Description
string  A four-character reduced index coding

Examples

To derive the soundex code for a family name:

Surname  soundex(Surname)
Brown    B650
Smythe   S530
Smith    S530
Bruno    B650

Note: The soundex function is not defined for non-ASCII characters.
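
The codes in the table are consistent with the widely used American Soundex scheme. The Python sketch below implements that scheme for a single word as an illustration; it is an assumption that TML follows exactly these rules, and the function shown here is not the product's code.

```python
# Digit assigned to each consonant group in American Soundex.
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_CODES = {ch: digit for digit, letters in _GROUPS.items() for ch in letters}


def soundex(word: str) -> str:
    """Four-character Soundex code for one word (ASCII letters only)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":
            # H and W do not separate letters that share a code; vowels do.
            prev = code
    # Pad with zeros or truncate to exactly four characters.
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("Brown", "Smythe", "Smith", "Bruno"):
        print(name, soundex(name))
```

Characters outside A to Z are simply dropped in this sketch, which sidesteps rather than defines the non-ASCII case mentioned in the note.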

startswith

Purpose: test whether one string starts with another.

Syntax startswith(find_text, within_text)

Arguments

Type    Name       Description
string  find_text  The string to look for at the start

Result

Type     Description
integer  1 if find_text is found at the start of within_text; 0 otherwise

The startswith function is based on the match function. It prepends a ^ character to find_text before treating the result as a regular expression [see Regular expressions on page 276].

Examples

To derive the type of credit card used in a transaction, such as "Discover", "AmEx", "Visa" or "MasterCard", based on the card prefix numbers:

startswith ("6011", CC_Number): "Discover";

startswith ("4", CC_Number): "Visa";

startswith ("5", CC_Number): "MasterCard";

startswith ("34",CC_Number) or startswith ("37", CC_Number): "AmEx";

default: null;

CC_Number            CC_Type
4111-1111-1111-1111  Visa
5431-1111-1111-1111  MasterCard
341-1111-1111-1111   AmEx
6011-6011-6011-6611  Discover

See also

endswith on page 260

find on page 261

strlen

Purpose: obtain the length of a string.

Syntax strlen(string)

Arguments

Type    Name    Description
string  string  The string whose length is to be determined

Result

Type     Description
integer  The number of characters in string

Examples

To derive a string field of exactly four characters, by padding a shorter string field with leading zeros:

strlen(AccountID) = 1: concat("000", AccountID);

strlen(AccountID) = 2: concat("00", AccountID);

strlen(AccountID) = 3: concat("0", AccountID);

default: AccountID;

AccountID  Account
22         0022
333        0333
4444       4444

strmember

Purpose: determine set membership.

Syntax strmember(x, , , ...)

Arguments

Type    Name     Description
string  x        The value to be tested
as x    , , ...  The elements of the set

Result

Type     Description
integer  1 (true) if x is in the set; 0 (false) otherwise

Note:

• You can only use string literal values for the list of set elements.

• Trailing whitespace in the values is not ignored.

• This function is deprecated in favor of the more general member function.

See also

substitute

Purpose: replace one string with another.

Syntax substitute(text, old_text, new_text)

Arguments

Type    Name      Description
string  text      The string to be searched
string  old_text  The string to be replaced
string  new_text  The replacement string

Result

Type    Description
string  A copy of the string text with new_text replacing each non-overlapping instance of old_text

The substitute function is based on the replaceall function, and it treats old_text as a regular expression [see Regular expressions on page 276].

Examples

To remove whitespace from a field named PostalCode:

PostalCode  substitute(PostalCode,"[[:blank:]]", "")
ND 069      ND069
IN 099      IN099
UT 0 0 5    UT005
O R 009     OR009

See also

replacefirst on page 283

substr

Purpose: obtain a substring from a string.

Syntax substr(string, start, end)

Arguments

Type     Name    Description
string   string  The initial string
integer  start   The first index
integer  end     The second index

Result

Type    Description
string  The substring formed from characters start through end (inclusive) of string, where the characters are numbered starting from 0. Positive index values count from the start of the string, and negative index values count from the end of the string.
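
The indexing rule (0-based, inclusive at both ends, negative values counted from the end) differs from the half-open slices of many languages. The Python sketch below mirrors the rule as described; it is an illustration only, and the behaviour for out-of-range indexes is not stated in this guide and is not modelled.

```python
def substr(string: str, start: int, end: int) -> str:
    """Characters start through end (inclusive), numbered from 0.

    Negative indexes count back from the end of the string.
    Out-of-range indexes are not handled in this sketch.
    """
    n = len(string)
    if start < 0:
        start += n
    if end < 0:
        end += n
    # Python slices exclude the upper bound, so add one to include `end`.
    return string[start:end + 1]


if __name__ == "__main__":
    print(substr("MA015", 0, 1))
    print(substr("MA015", -3, -1))
    print(substr("John", 0, 0))
```

Note that substr(x, 0, 0) returns the first character rather than an empty string, which is why the concat example uses it to obtain an initial.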

Examples

To select the first two and last three characters of a PostalCode field to give the state code:

PostalCode  substr(PostalCode, 0, 1)  substr(PostalCode, -3,-1)
MA015       MA                        015
WA055       WA                        055
UT005       UT                        005
MN119       MN                        119

To derive the type of credit card used in a transaction, such as "Discover", "AmEx", "Visa" or "MasterCard", based on the card prefix numbers:

substr(CC_Number, 0, 3) = "6011": "Discover";

substr(CC_Number, 0, 0) = "4": "Visa";

substr(CC_Number, 0, 0) = "5": "MasterCard";

substr(CC_Number, 0, 1) = "34" or substr(CC_Number, 0, 1) = "37": "AmEx";

default: null;

CC_Number            CC_Type
4111-1111-1111-1111  Visa
5431-1111-1111-1111  MasterCard
341-1111-1111-1111   AmEx
6011-6011-6011-6611  Discover

tolower

Purpose: convert a string to lowercase text.

Syntax tolower(string)

Arguments

Type    Name    Description
string  string  A string to be converted

Result

Type    Description
string  The string obtained from string by replacing every uppercase letter in string with the corresponding lowercase letter

Examples tolower("ZYXWV") equals "zyxwv"

See also

toupper on page 273

toupper

Purpose: convert a string to uppercase text.

Syntax toupper(string)

Arguments

Type    Name    Description
string  string  A string to be converted

Result

Type    Description
string  The string obtained from string by replacing every lowercase letter in string with the corresponding uppercase letter

Examples toupper("abcde") equals "ABCDE"

See also

tolower on page 272

trim

Purpose: remove all spaces from a string except for single spaces between words.

Syntax trim(text)

Arguments

Type    Name  Description
string  text  The string to be trimmed

Result

Type    Description
string  The string formed from text by removing all spaces except for single spaces between words

Note: The trim function is based on the replaceall function.

Examples

To derive a field that replaces multiple whitespace characters with a single space:

trim(CommentField)CommentField

Returned faulty goodsReturned faulty goods

Missed appointmentMissed appointment

Engineer to visit: MondayEngineer to visit: Monday

ResolvedResolved
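
The effect described for trim can be sketched with a regular-expression replacement, in the spirit of the note that trim is based on replaceall. This Python version is an illustration under the assumption that only the space character is affected; how the product treats tabs or other whitespace is not stated here.

```python
import re


def trim(text: str) -> str:
    """Collapse runs of spaces to one space and drop leading/trailing spaces."""
    collapsed = re.sub(" +", " ", text)
    return collapsed.strip(" ")


if __name__ == "__main__":
    print(repr(trim("  Returned   faulty  goods ")))
    print(repr(trim("Resolved")))
```

A string that already has single spaces between words is returned unchanged.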

21 - Regular expressions and associated functions

In this section

Regular expressions
Basic components of a regular expression
Regular-expression operators
match
replaceall
replacefirst

Regular expressions

Regular expressions are for matching patterns in strings [see Datatypes on page 181]. A regular expression is a specially formatted string that represents a pattern of characters. For example, the regular expression "ar{1,2}y?$" represents "the letter 'a', followed by one or two 'r's, possibly followed by 'y', followed by the end of a string". (That pattern occurs in strings like "marry" and "jar", but not in "may" or "jars".)

A regular expression consists of basic components [see Basic components of a regular expression on page 276] combined using operators [see Regular-expression operators on page 279].

If a regular expression (other than a subexpression) could match more than one portion of a string, it always matches the leftmost, longest candidate portion. For example, the regular expression "a[^ad]+" (representing "the letter 'a', followed by one or more characters other than 'a' or 'd'") matches the portion "apt" of "adaptationally". The matched portion is not "ation", as there are candidate portions further to the left. Neither is it "ap", because, although there are no candidates further to the left, there is a longer candidate portion starting at the same character.
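
The two patterns above can be tried in any regular-expression engine. The Python sketch below uses the standard `re` module purely as a stand-in (an assumption: it is not the ICU engine the product uses, and Python guarantees a leftmost match but not always the longest one when alternation is involved). The helper name `first_match` is hypothetical.

```python
import re
from typing import Optional


def first_match(pattern: str, text: str) -> Optional[str]:
    """Return the first portion of text matched by pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for word in ("marry", "jar", "may", "jars"):
        print(word, first_match("ar{1,2}y?$", word))
    print(first_match("a[^ad]+", "adaptationally"))
```

For these two patterns the greedy quantifiers make Python's result coincide with the leftmost, longest portion.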

• To test a string for a regular expression match, use the match function.

• To replace matched portions of a string with another string, use the replacefirst or replaceall function.

Note:

• Regular expressions are typically (but not necessarily) literals, in which case they must be enclosed in double quotation marks, for example, "ar{1,2}y?$".

• Miner/FDL uses the regular-expression implementation provided by the ICU library (see http://userguide.icu-project.org/strings/regexp).

• ICU regular expressions conform to Unicode Technical Standard #18, Unicode Regular Expressions, level 1, and in addition include Default Word boundaries and Name Properties from level 2.

Basic components of a regular expression

The basic components of a regular expression [see Regular expressions on page 276] are atoms, anchors, and back-references.

Atoms

An atom matches exactly one of a specified set of characters:

Atom                     Character matched
. (match-any)            Any character
[...] (match-list)       Any one of the specified characters
[^...] (non-match-list)  Any character other than one of the specified characters
A character other than ., ^, $, \, [, ], or *, occurring outside a match-list or non-match-list    The character itself
., ^, $, \, [, ], or *, occurring outside a match-list or non-match-list    The character itself

A match-list or non-match-list can include simple characters, character ranges, and character classes. You cannot use character ranges or character classes except in a match-list or non-match-list.

A character range is a pair of characters separated by a hyphen and is equivalent to a list of all the characters in that range (within the character set). For example, "a-e" is equivalent to "abcde" in a match-list.

A character class is one of the following pre-defined tokens (where the enclosing brackets are additional to those enclosing the match-list or non-match-list):

Token       Matches
[:alnum:]   An alphanumeric character, that is, an alphabetic character or decimal digit
[:alpha:]   An alphabetic character, that is, "A" to "Z" and "a" to "z"
[:blank:]   A tab or space character
[:cntrl:]   A control character
[:digit:]   A decimal digit, that is, "0" to "9"
[:graph:]   A printable character other than space
[:lower:]   A lowercase letter
[:print:]   A printable character or space
[:punct:]   A punctuation character
[:space:]   A whitespace character
[:upper:]   An uppercase character
[:xdigit:]  A hexadecimal digit, that is, "0" to "9", "A" to "F", and "a" to "f"

Note:

• You can include a mixture of simple characters, character ranges, and character classes in a single match-list or non-match-list. For example, the match-list "[a-ex[:digit:]]" matches the lowercase letters "a" to "e" and "x", as well as decimal digits ("0" to "9").

• A hyphen in a match-list or non-match-list stands for itself when it is (a) at the end of a range, (b) at the start of the match-list or non-match-list (in which case it can also serve as the start of a range), or (c) at the end of the match-list or non-match-list.

Anchors

The characters ^ and $ are anchors; they match the beginning and end of a string respectively.

For example, to match "The" occurring at the start of a string and nowhere else, you could use the regular expression "^The".

Back-references

A back-reference is a backslash followed by a single digit n (other than 0). It matches exactly the same characters as the nth subexpression enclosed in \(...\).

For example, to match "yoyo", "dodo", etc., but not "dojo", you could use the regular expression "\(.o\)\1".
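
The doubled-syllable example can be checked with a small Python sketch. Python's `re` module is used only for illustration and writes subexpressions with plain parentheses rather than the backslash-parenthesis form; the name `is_doubled` is hypothetical.

```python
import re

# One character followed by "o", then exactly the same two characters again.
_DOUBLED = re.compile(r"(.o)\1")


def is_doubled(word: str) -> bool:
    """True if the whole word is a two-letter "?o" syllable repeated twice."""
    return _DOUBLED.fullmatch(word) is not None


if __name__ == "__main__":
    for word in ("yoyo", "dodo", "dojo"):
        print(word, is_doubled(word))
```

The back-reference requires the second syllable to repeat the text actually matched by the first, which is why "dojo" is rejected even though both halves fit the pattern ".o".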

Note: • The nth back-referenced subexpression is always the subexpression beginning with the nth opening backslash/parenthesis (even if subexpressions are nested).

• The empty subexpression "" can be useful in regular expressions involving alternation[see The alternation operator on page 280]. For example,

$fish\($\|profit\)-and-$chips$2\|loss$3$

matches both "fish-and-chips" and "profit-and-loss", but not "fish-and-loss" or "profit-and-chips".

Regular-expression operators

An operator acts on one or two regular expressions [see Regular expressions on page 276] to create a new regular expression. The operators, from highest to lowest precedence, are the repetition operators [see Repetition operators on page 279], the concatenation operator [see The concatenation operator on page 280], and the alternation operator [see The alternation operator on page 280].

If there is a tie for precedence, operations are carried out from left to right.

Subexpressions enclosed in (...) are always evaluated first.

Repetition operators

The repetition operators are unary and are written following their single operand.

Operator           Description
* (zero-or-more)   If a regular expression a matches one or more adjacent portions of a string, the regular expression a* matches the concatenation of those portions. It also matches any empty substring.
\+ (one-or-more)   If a regular expression a matches one or more adjacent portions of a string, the regular expression a\+ matches the concatenation of those portions.
{m,n} (0 ≤ m ≤ n)  If a regular expression a matches at least m and no more than n adjacent portions of a string, the regular expression a{m,n} matches the concatenation of those portions. If m=0, it also matches any empty substring.
\? (optional)      Equivalent to a{0,1}, the regular expression a\? matches any empty substring, or any portion of a string that the regular expression a matches.

The "adjacent portions" mentioned in the definitions of *, \+, and {...} are not necessarily identical strings. If you need to match exact repetitions, use back-references [see Back-references on page 278].

The concatenation operator

The concatenation operator is not actually written, but is implied whenever you write two regular expressions next to each other.

If regular expressions a and b match adjacent portions x and y of a string z (with x preceding y), the concatenation of a and b (written as a followed by b) matches the portion of z containing x and y.

Any string of non-special characters, that is, characters with no special meaning in a regular expression, serves as a regular expression that matches that string, because of the implicit concatenation operator.

In practice, most regular expressions consist of simple character strings, interspersed with a few special characters. For example, the regular expression "[SsDd]imple" matches the strings "Simple", "simple", "Dimple", and "dimple".

The alternation operator

The alternation operator "|" is a binary operator, written between its operands.

If either of the regular expressions a and b matches a portion of a string, the alternation a|b also matches that portion.

match

Purpose: test a string for a regular expression match [see Regular expressions on page 276].

Syntax match(regexp, string)

Arguments

Type    Name    Description
string  regexp  The regular expression
string  string  The string to be tested

Result

Type     Description
integer  1 (true) if regexp matches string; 0 (false) otherwise

Examples

To flag customers whose home postcode and bank branch postcode match in the first two characters (using the substr function):

match(substr(HomePostcode,0, 1),substr(BranchPostcode,0,1))

HomePostcode  BranchPostCode  Result
NE25 0AY      NE30 1QX        1
SW3           NG18 1HT        0
CF36          CF31 1HY        1
EH9 0NJ       E10 8AJ         0

To look in a company name field for the first three characters from a first-name field, followed by any text and then the contents of a family name field (ignoring case):

rx := concat(substr(toupper(FirstName),0,2),"[[:print:]]*", toupper(FamilyName));

match(rx,toupper(CompanyName))

FirstName  FamilyName  CompanyName           Result
Patrick    Smith       1234 Pat Smith Ltd    1
Patrick    Smith       1234 Pat J Smith Ltd  1

To do the same, but with only some letters or digits and a single space between the first name and family name:

rx := concat(substr(toupper(FirstName),0,2),"[[:alnum:]]* ",toupper(FamilyName));

match(rx,toupper(CompanyName))

FirstName  FamilyName  CompanyName           Result
Patrick    Smith       1234 Pat Smith Ltd    1
Patrick    Smith       1234 Pat J Smith Ltd  0

See also

replaceall on page 282

replaceall

Purpose: replace all substrings that match a regular expression [see Regular expressions on page 276].

Syntax replaceall(regexp, replacement, string)

Arguments

Type    Name         Description
string  replacement  The replacement string
string  string       The string to be searched

Result

Type    Description
string  A copy of the string string with replacement replacing each non-overlapping portion that regexp matches

The first matching portion of the string to be searched is the leftmost, longest portion that the regular expression matches. The next search applies to the remainder of the string to be searched, and so on. For example,

replaceall("i..", "X", "initiation")

yields the string "XtXX" (rather than, say, "inXatX"). And

replaceall("i.*i", "X", "initiation")

gives the result "Xon" (rather than, say, "XtXon").
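
The left-to-right, non-overlapping scan can be reproduced with Python's `re.sub`, used here only as a stand-in for the ICU engine (an assumption for illustration). The helper name `replace_all` is hypothetical, and the replacement is inserted literally rather than interpreted for group references.

```python
import re


def replace_all(regexp: str, replacement: str, string: str) -> str:
    """Replace every non-overlapping match of regexp, scanning left to right."""
    # A function is passed so the replacement text is inserted literally.
    return re.sub(regexp, lambda m: replacement, string)


if __name__ == "__main__":
    print(replace_all("i..", "X", "initiation"))
    print(replace_all("i.*i", "X", "initiation"))
```

In the first call the matches are "ini", "iat", and "ion", with the "t" between them left untouched; in the second, the greedy ".*" extends the single match to "initiati".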

Examples

To detect and remove whitespace from a PostalCode field using a regular expression match:

PostalCode  replaceall("[[:blank:]]","",PostalCode)
ND 069      ND069
IN 099      IN099
UT 0 0 5    UT005
O R 009     OR009

See also

match on page 280

replacefirst

Purpose: replace the first substring that matches a regular expression [see Regular expressions on page 276].

Syntax replacefirst(regexp, replacement, string)

Arguments

Type    Name         Description
string  replacement  The replacement string
string  string       The string to be searched

Result

Type    Description
string  A copy of the string string with replacement replacing the first portion that regexp matches (if any)

Examples

To extract a customer's name from an e-mail address field in the format "Firstname.Surname@company.com", by removing everything from the "@" onwards and replacing the "." with a space:

replacefirst("@.*", "",(replacefirst ("\.", " ", Email)))

Email                    Result
John.Brown@myco.com      John Brown
Angela.Smith@yourco.com  Angela Smith
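
The nested call works from the inside out: the first period becomes a space, then everything from "@" onwards is removed. The Python sketch below follows the same two steps with `re.sub` limited to one replacement; it is an illustration only, and the helper names `replace_first` and `name_from_email` are hypothetical.

```python
import re


def replace_first(regexp: str, replacement: str, string: str) -> str:
    """Replace only the first match of regexp (if any)."""
    return re.sub(regexp, lambda m: replacement, string, count=1)


def name_from_email(email: str) -> str:
    """Turn "Firstname.Surname@company.com" into "Firstname Surname"."""
    with_space = replace_first(r"\.", " ", email)
    return replace_first("@.*", "", with_space)


if __name__ == "__main__":
    print(name_from_email("John.Brown@myco.com"))
    print(name_from_email("Angela.Smith@yourco.com"))
```

Replacing only the first period matters here: the periods in the domain part are still present after the first step and are removed together with the rest of the domain in the second.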

See also

match on page 280

replaceall on page 282

22 - Mathematical and statistical functions

In this section

abs
ceil
cos
exp
floor
log
log10
logbase
max (two or more arguments), maxnonnull
mean (two or more arguments), meannonnull
min (two or more arguments), minnonnull
normalize
pow
product, productnonnull
round
sgn
sin
sqrt
sum (two or more arguments), sumnonnull
tan

abs

Purpose: calculate the absolute value of a number.

Syntax abs(x)

Arguments

Type     Name  Description
numeric  x     A signed number

Result

Type      Description
as input  $|x|$, the absolute value, or magnitude, of x

Examples

abs(10000) equals 10000.

abs(-25000) equals 25000.

To calculate the absolute error between a predicted value and a historical value: abs(Value - PredictedValue)

See also

sgn on page 297

ceil

Purpose: round a number up to the nearest integer.

Syntax ceil(x)

Arguments

Type     Name  Description
numeric  x     The number to be rounded

Result

Type     Description
integer  $\lceil x \rceil$, the integer $n$ such that $n - 1 < x \le n$

Examples

ceil(1.25) equals 2.

ceil(-1.25) equals -1.

To generate a random integer between 1 and 10 inclusive: ceil(rndUniform()*10)

See also

floor on page 288

round on page 296

cos

Purpose: calculate the cosine of an angle.

Syntax cos(x)

Arguments

Type     Name  Description
numeric  x     An angle in radians

Result

Type  Description
real  cos x, the cosine of x

Examples cos(0) equals 1.

exp

Purpose: calculate the exponential of a number.

Syntax exp(x)

Arguments

Type     Name  Description
numeric  x     The number whose exponential is to be computed

Result

Type  Description
real  $e^{x}$, the exponential of x

Examples exp(0) equals 1.

floor

Purpose: round a number down to the nearest integer.

Syntax floor(x)

Arguments

Type  Name  Description

Result

Type     Description
integer  $\lfloor x \rfloor$, the integer $n$ such that $n \le x < n + 1$

Examples

floor(1.25) equals 1.

floor(-1.25) equals -2.

To generate a random integer between 0 and 9 inclusive: floor(rndUniform()*10)

See also

ceil on page 286

round on page 296

log

Purpose: calculate the natural (base-e) logarithm of a number.

Syntax log(x)

Arguments

Type     Name  Description
numeric  x     A positive number

Result

Type  Description
real  $\ln x$, the natural logarithm of x

Examples log(1) equals 0.

log10

Purpose: calculate the base-10 logarithm of a number.

Syntax log10(x)

Arguments

Type     Name  Description
numeric  x     A positive number

Result

Type  Description
real  $\log_{10} x$, the base-10 logarithm of x

Examples log10(100) equals 2.

logbase

Purpose: calculate the logarithm of a number, to a specified base.

Syntax logbase(x, base)

Arguments

Type     Name  Description
numeric  x     The positive number whose logarithm is to be computed
numeric  base  The base to be used

Result

Type  Description
real  $\log_{base} x$, the base-base logarithm of x

Examples logbase(100,10) equals 2.
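
A logarithm to an arbitrary base can be obtained from natural logarithms by the change-of-base identity $\log_{b} x = \ln x / \ln b$. The Python sketch below shows that identity; it is an illustration rather than a statement of how the product computes logbase, and floating-point results are compared with a tolerance.

```python
import math


def logbase(x: float, base: float) -> float:
    """Logarithm of x to the given base, via natural logarithms."""
    return math.log(x) / math.log(base)


if __name__ == "__main__":
    print(math.isclose(logbase(100, 10), 2.0))
    print(math.isclose(logbase(8, 2), 3.0))
```

Exact equality is avoided in the checks because the division of two floating-point logarithms may differ from the mathematical result in the last digits.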

max (two or more arguments), maxnonnull

Purpose: calculate the maximum of two or more numbers, or the latest of two or more dates, or the alphabetically latest of two or more strings. The max variant returns the null value if any of its arguments is null; the maxnonnull variant ignores null arguments (but still returns the null value if all of its arguments are null).

Syntax max( , , ...)

maxnonnull( , , ...)

Arguments

Type                                                Name     Description
integer, real, date, or string (all the same type)  , , ...  The values to be compared

Result

Type      Description
as input  The maximum of , , ...

Note:

• Be careful not to confuse this multi-argument max function with the aggregation function [see max (one argument) on page 167] of the same name.

• For strings, "alphabetically latest" means "latest in the underlying character representation."

Examples

max(5.5, -3, 7, -8.5) equals 7.

max(5.5, -3, null, -8.5) equals null.

maxnonnull(5.5, -3, 7, -8.5) equals 7.

maxnonnull(5.5, -3, null, -8.5) equals 5.5.

To determine the next best action for a customer based on different ROI measures:

maxvalue := max (ROI1, ROI2, ROI3);

maxvalue = ROI1: "Product1";

mean (two or more arguments), meannonnull

Purpose: calculate the mean (common average) of two or more numbers. The mean variant returns the null value if any of its arguments is null; the meannonnull variant ignores null arguments (but still returns the null value if all of its arguments are null).

Syntax mean( , , ...)

meannonnull( , , ...)

Arguments

Type     Name     Description
numeric  , , ...  The numbers to average

Result

Type  Description
real  The mean of , , ...

Note: Be careful not to confuse this multi-argument mean function with the aggregation function [see mean (one argument) on page 168] of the same name.

Examples

mean(5.5, -3, 7, -8.5) equals 0.25.

mean(5.5, -3, null, -8.5) equals null.

meannonnull(5.5, -3, 7, -8.5) equals 0.25.

meannonnull(5.5, -3, null, -8.5) equals -2.

To calculate the average quarterly spend from four separate quarterly spend fields:

mean(SpendQtr1, SpendQtr2, SpendQtr3, SpendQtr4)
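
The difference between the null-propagating and null-ignoring variants can be sketched in Python, with `None` standing in for the TML null value. The function names `mean` and `mean_nonnull` below are illustrative stand-ins, not the product's implementation.

```python
from typing import Optional


def mean(*values: Optional[float]) -> Optional[float]:
    """Null-propagating mean: any null argument makes the result null."""
    if any(v is None for v in values):
        return None
    return sum(values) / len(values)


def mean_nonnull(*values: Optional[float]) -> Optional[float]:
    """Null-ignoring mean: null only if every argument is null."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


if __name__ == "__main__":
    print(mean(5.5, -3, 7, -8.5))
    print(mean(5.5, -3, None, -8.5))
    print(mean_nonnull(5.5, -3, None, -8.5))
```

Note that the null-ignoring variant divides by the number of non-null arguments, so a missing quarter changes the divisor as well as the sum.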

min (two or more arguments), minnonnull

Purpose: calculate the minimum of two or more numbers, or the earliest of two or more dates, or the alphabetically earliest of two or more strings. The min variant returns the null value if any of its arguments is null; the minnonnull variant ignores null arguments (but still returns the null value if all of its arguments are null).

Syntax min( , , ...)

minnonnull( , , ...)

Arguments

Type                                                Name     Description
integer, real, date, or string (all the same type)  , , ...  The values to be compared

Result

Type      Description
as input  The minimum of , , ...

Note:

• Be careful not to confuse this multi-argument min function with the aggregation function [see min (one argument) on page 169] of the same name.

• For strings, "alphabetically earliest" means "earliest in the underlying character representation."

Examples

min(5.5, -3, 7, -8.5) equals -8.5.

min(5.5, -3, null, -8.5) equals null.

minnonnull(5.5, -3, 7, -8.5) equals -8.5.

minnonnull(5.5, -3, null, -8.5) equals -8.5.

normalize

Purpose: normalize field values to lie in the interval [0,1].

Syntax normalize(x)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | x | The number to be normalized |

Result

DescriptionType

Value between 0 and 1 given byreal

pow

Purpose: calculate the result of raising one number to the power of another number.

Syntax pow(x, exponent)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | x | The number to be raised to a power |
| numeric | exponent | The exponent |

Result

| Type | Description |
| --- | --- |
| real | x raised to the power exponent |

Examples pow(10, 2) equals 100.

product, productnonnull

Purpose: calculate the product of two or more numbers. The product variant returns the null value if any of its arguments is null; the productnonnull variant ignores null arguments (but still returns the null value if all of its arguments are null).

Syntax product(x1, x2, ...)

productnonnull(x1, x2, ...)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | x1, x2, ... | The numbers to be multiplied |

Result

| Type | Description |
| --- | --- |
| as input | The product of x1, x2, ... |

Examples

product(5.5, -3, 7, -8.5) equals 981.75.

product(5.5, -3, null, -8.5) equals null.

productnonnull(5.5, -3, 7, -8.5) equals 981.75.

productnonnull(5.5, -3, null, -8.5) equals 140.25.

round

Purpose: round a number to the nearest integer.

Syntax round(x)

Arguments

| Type | Name | Description |
| --- | --- | --- |

Result

| Type | Description |
| --- | --- |
| integer | The nearest integer to x or, in the event of a tie, the greater of the two nearest integers (in absolute value) |

Examples

round(1.25) equals 1.

round(-1.25) equals -1.

round(1.5) equals 2.

round(-1.5) equals -2.
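The tie-breaking rule described above ("the greater of the two nearest integers in absolute value") is often called rounding half away from zero. It differs from the built-in rounding of some languages; Python's own round, for instance, rounds ties to the nearest even integer. The sketch below reproduces the documented behavior in Python; the name fdl_round is hypothetical and chosen only for this illustration.

```python
import math


def fdl_round(x: float) -> int:
    """Round to the nearest integer; ties go to the integer of larger magnitude."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


for value in (1.25, -1.25, 1.5, -1.5, 2.5):
    print(value, fdl_round(value))
# 2.5 maps to 3 here, whereas Python's built-in round(2.5) gives 2.
```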

See also

ceil on page 286

floor on page 288

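sgn

Purpose: calculate the signum (sign) of a number.

Syntax sgn(x)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | x | A signed number |

Result

| Type | Description |
| --- | --- |
| as input | A number representing the sign of x: 1 if x is positive, 0 if x is zero, -1 if x is negative |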

Examples

sgn(25) equals 1.

sgn(-25) equals -1.

See also

abs on page 286

sin

Purpose: calculate the sine of an angle.

Syntax sin(x)

Arguments

| Type | Name | Description |
| --- | --- | --- |

Result

| Type | Description |
| --- | --- |
| real | sin x, the sine of x |

Examples sin(3.14159/2) is approximately 1.

sqrt

Purpose: calculate the square root of a number.

Syntax sqrt(x)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | x | A non-negative number |

Result

| Type | Description |
| --- | --- |
| real | The square root of x |

Examples

sqrt(25) equals 5.

sqrt(-25) equals null.

sum (two or more arguments), sumnonnull

Purpose: calculate the sum of two or more numbers. The sum variant returns the null value if any of its arguments is null; the sumnonnull variant ignores null arguments (but still returns the null value if all of its arguments are null).

Syntax sum(x1, x2, ...)

sumnonnull(x1, x2, ...)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | x1, x2, ... | The numbers to be added |

Result

| Type | Description |
| --- | --- |
| as input | The sum of x1, x2, ... |

Note: Be careful not to confuse this multi-argument sum function with the aggregation function [see sum (one argument) on page 175] of the same name.

Examples

sum(5.5, -3, 7, -8.5) equals 1.

sum(5.5, -3, null, -8.5) equals null.

sumnonnull(5.5, -3, 7, -8.5) equals 1.

sumnonnull(5.5, -3, null, -8.5) equals -6.

To derive the total spend from six weekly spend fields:

sum(SpendWk1, SpendWk2, SpendWk3, SpendWk4, SpendWk5, SpendWk6)

tan

Purpose: calculate the tangent of an angle.

Syntax tan(x)

Arguments

| Type | Name | Description |
| --- | --- | --- |

Result

| Type | Description |
| --- | --- |
| real | tan x, the tangent of x |

Note: If x is of the form $(n + \tfrac{1}{2})\pi$ for some integer n, tan x is undefined. In that case, the FDL function returns null.

Examples tan(3.14159/4) is approximately 1.

23 - Data-sampling functions

In this section

numericTestTrainSplit 302
numericTestTrainValidateSplit 302
sampleEqualSize 303
sampleExactNumber 304
sampleExactPercentage 305
sampleStratified 306
testTrainSplit 307
testTrainValidateSplit 308

numericTestTrainSplit

Purpose: create a test/training segmentation for use in model validation.

Syntax numericTestTrainSplit(testFraction)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| real | testFraction | The fraction of the data to be used as the test set |

Result

| Type | Description |
| --- | --- |
| integer | 0 for the training set or 1 for the test set |

Examples

To derive a field that allocates the value 0 to approximately 70% of the records (chosen randomly) and the value 1 to the remainder: numericTestTrainSplit(0.3)

See also

numericTestTrainValidateSplit on page 302

sampleEqualSize on page 303

testTrainSplit on page 307

testTrainValidateSplit on page 308

numericTestTrainValidateSplit

Purpose: create a test/training/validation segmentation for use in model validation.

Syntax numericTestTrainValidateSplit(testFraction, validateFraction)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| real | testFraction | The fraction of the data to be used as the test set |
| real | validateFraction | The fraction of the data to be used as the hold-out set |

Result

| Type | Description |
| --- | --- |
| integer | 0 for the training set, 1 for the test set, or 2 for the validation set |

Examples

To derive a field that allocates the value 0 to approximately 50% of the records (chosen randomly), the value 1 to approximately 30% of the records (again chosen randomly), and the value 2 to the remainder: numericTestTrainValidateSplit(0.3, 0.2)
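One way to picture how such a split yields approximate rather than exact proportions is a single uniform random draw per record, compared against cumulative thresholds. The Python sketch below is an assumption about the mechanism, not the documented implementation: it uses Python's random module rather than Spectrum Miner's generator, and the helper name is hypothetical.

```python
import random


def numeric_test_train_validate_split(test_fraction, validate_fraction, rng):
    """Return 1 (test), 2 (validation) or 0 (training) for one record."""
    u = rng.random()  # uniform draw in [0, 1)
    if u < test_fraction:
        return 1
    if u < test_fraction + validate_fraction:
        return 2
    return 0


rng = random.Random(42)  # a fixed seed makes the allocation repeatable
labels = [numeric_test_train_validate_split(0.3, 0.2, rng) for _ in range(10_000)]

# Proportions are close to, but not exactly, 50% / 30% / 20%.
for label in (0, 1, 2):
    print(label, labels.count(label) / len(labels))
```

Because each record is allocated independently, the segment sizes vary slightly from run to run unless the seed is held fixed.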

See also

numericTestTrainSplit on page 302

sampleEqualSize

Purpose: create a segmentation index that randomly assigns records to segments of approximately equal size.

Syntax sampleEqualSize(NumberOfSegments)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer | NumberOfSegments | The number of segments to create |

Result

| Type | Description |
| --- | --- |
| integer | A number between 1 and NumberOfSegments (inclusive) |

Examples

To derive a field enumerating five (approximately) equal-sized populations: sampleEqualSize(5)

See also

sampleExactNumber

Purpose: create a random sample of an exact size, specified as a number of records.

Syntax sampleExactNumber(NumberToSelect)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer | NumberToSelect | The size of sample to create, as a number of records |

Result

| Type | Description |
| --- | --- |
| integer | 1 if a record is chosen for the sample; 0 otherwise |

Note:

• Records that are not currently selected in the focus are assigned the null value.

• You can only use this function in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.

Examples

To choose 1,250 records for a sample: sampleExactNumber(1250)

See also

sampleExactPercentage on page 305

sampleStratified on page 306

sampleExactPercentage

Purpose: create a random sample of an exact size, specified as a percentage of the size of the currently-selected population.

Syntax sampleExactPercentage(PercentageToSelect)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer | PercentageToSelect | The size of sample to create, as a percentage of the size of the current population |

Result

| Type | Description |
| --- | --- |

Note:

• Records that are not currently selected in the focus are assigned the null value.

• If PercentageToSelect percent of the population size is not a whole number, the number chosen will be the nearest whole number above or below that value, and will on average give the specified percentage.

• You can only use this function in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.

Examples To choose 25% of the population for a sample:

sampleExactPercentage(25)

See also

sampleExactNumber on page 304

sampleStratified on page 306

sampleStratified

Purpose: create a random sample that contains an exact number of records from a specified segment, together with an exact number of records from the remainder of the currently-selected population.

Syntax sampleStratified(NumberFromSegmentToSelect, SegmentTotalSize, NumberOutsideSegmentToSelect, OutsideSegmentTotalSize, InSegmentExpression)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer | NumberFromSegmentToSelect | The number of records in the sample to be chosen from the specified segment |
| integer | SegmentTotalSize | The total number of records in the specified segment |
| integer | NumberOutsideSegmentToSelect | The number of records in the sample to be chosen from the remainder of the population |
| integer | OutsideSegmentTotalSize | The total number of records in the remainder of the population |
| integer | InSegmentExpression | The (boolean) expression that determines whether or not a record is in the specified segment |

Result

| Type | Description |
| --- | --- |

Note:

• Records that are not currently selected in the focus are assigned the null value.

• You can only use this function in a Decision Studio derivation, in the derivations file for qsderive or the equivalent Derive Fields window, in the trackers file for qstrack, or in a selections file for qsselect.

Examples To create a sample comprising 2,500 records from a total of 25,000 females, and 5,000 records from a total of 75,000 non-females:

sampleStratified(2500, 25000, 5000, 75000, Gender = "F")
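The idea of drawing an exact number of records from inside a segment and an exact number from outside it can be sketched in Python as follows. This is an illustration under stated assumptions, not the Spectrum Miner algorithm: the helper name is hypothetical, segment sizes are counted from the data instead of being passed in, and the 1/0 result convention is borrowed from sampleExactNumber because the result description for sampleStratified is not given here.

```python
import random


def sample_stratified(in_segment, n_from_segment, n_outside, seed=0):
    """Flag exactly n_from_segment records inside the segment and n_outside outside it."""
    rng = random.Random(seed)
    inside = [i for i, flag in enumerate(in_segment) if flag]
    outside = [i for i, flag in enumerate(in_segment) if not flag]
    chosen = set(rng.sample(inside, n_from_segment))
    chosen |= set(rng.sample(outside, n_outside))
    return [1 if i in chosen else 0 for i in range(len(in_segment))]


# Scaled-down version of the example above: 25 females, 75 non-females.
is_female = [True] * 25 + [False] * 75
flags = sample_stratified(is_female, n_from_segment=5, n_outside=10, seed=1)
print(sum(flags))  # 15 records selected in total
```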

See also

sampleExactNumber on page 304

sampleExactPercentage on page 305

testTrainSplit

Purpose: create a test/training segmentation for use in model validation.

Syntax testTrainSplit(testFraction)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| real | testFraction | The fraction of the data to be used as the test set |

Result

| Type | Description |
| --- | --- |
| string | String values TestSet or TrainingSet |

Examples

To derive a field that allocates the value "TrainingSet" to approximately 70% of the records (chosen randomly) and the value "TestSet" to the remainder:

testTrainSplit(0.3)

See also

testTrainValidateSplit

Purpose: Create a test/training/validation segmentation for use in model validation.

Syntax testTrainValidateSplit(testFraction, validateFraction)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| real | testFraction | The fraction of the data to be used as the test set |
| real | validateFraction | The fraction of the data to be used as the hold-out set |

Result

| Type | Description |
| --- | --- |
| string | String values TestSet, TrainingSet, or ValidationSet |

Examples

To derive a field that allocates the value "TrainingSet" to approximately 50% of the records (chosen randomly), the value "TestSet" to approximately 30% of the records (again chosen randomly), and the value "ValidationSet" to the remainder:

testTrainValidateSplit(0.3, 0.2)

See also

24 - Random-number functions

In this section

About random-number functions in FDL 311
rndBinomial 311
rndBool 312
rndExp 312
rndGamma 313
rndNormal 314
rndPoisson 315
rndUniform 315

About random-number functions in FDL

When evaluating an FDL expression involving a random-number function (numericTestTrainSplit, numericTestTrainValidateSplit, rndBinomial, rndBool, rndExp, rndGamma, rndNormal, rndPoisson, rndUniform, sampleEqualSize, sampleExactNumber, sampleExactPercentage, sampleStratified, testTrainSplit, or testTrainValidateSplit), Spectrum Miner needs an integer "seed" to initialize its sequence of random numbers. With the same seed, the same expression will always produce the same sequence of results.

Note:

• The seed is stored as part of the field information.

• The (pseudo)random number generation is based on that of Marsaglia, Zaman, and Tsang (1990).
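The role of the seed can be demonstrated with any seeded pseudorandom generator. The Python sketch below uses the standard library's random module, which is a different generator from the Marsaglia, Zaman, and Tsang one mentioned above, so the actual numbers will not match Spectrum Miner's; only the repeatability property is being illustrated. The helper name draw is hypothetical.

```python
import random


def draw(seed, count):
    """Return `count` uniform random numbers from a generator started at `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


first = draw(seed=7, count=5)
second = draw(seed=7, count=5)
other = draw(seed=8, count=5)

print(first == second)  # True: same seed, same sequence
print(first == other)   # False: a different seed gives a different sequence
```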

rndBinomial

Purpose: generate a random integer based on a binomial distribution.

Syntax rndBinomial(N, p)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer | N | The (positive) number of trials |
| numeric | p | The probability of success for each trial (between 0 and 1) |

Result

| Type | Description |
| --- | --- |
| integer | A random non-negative integer k sampled from the binomial distribution, with probability given by $P(k) = \binom{N}{k} p^k (1-p)^{N-k}$ for k = 0, 1, ..., N |

Note: The expectation and variance of the binomial distribution are $Np$ and $Np(1-p)$ respectively.

Examples rndBinomial(50, 0.05)
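For the parameters in this example (50 trials, success probability 0.05), the standard binomial formulas give an expectation of 2.5 and a variance of 2.375. The Python sketch below checks those figures by summing over the probability mass function; it is a numerical illustration of the distribution, not of Spectrum Miner's sampling code, and the helper name is hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 50, 0.05
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(probs)
expectation = sum(k * q for k, q in enumerate(probs))
variance = sum((k - expectation) ** 2 * q for k, q in enumerate(probs))

print(round(total, 6), round(expectation, 6), round(variance, 6))
```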

See also

About random-number functions in FDL on page 311

rndBool

Purpose: generate either 0 or 1 randomly (with equal probability).

Syntax rndBool()

Arguments None

Result

| Type | Description |
| --- | --- |
| integer | Either 0 or 1, with probabilities $P(0) = P(1) = \tfrac{1}{2}$ |

Examples

To derive a field that allocates the value 0 to approximately 50% of the records (chosen randomly) and the value 1 to the remainder: rndBool()

See also

rndExp

Purpose: generate a random positive number based on an exponential distribution.

Syntax rndExp(b)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | b | The (positive) scale parameter |

Result

| Type | Description |
| --- | --- |
| real | A positive random number sampled from the exponential distribution, with probability density function given by $f(x) = \frac{1}{b} e^{-x/b}$ (x > 0) |

Note: The expectation and variance of the exponential distribution are $b$ and $b^2$ respectively.

Examples rndExp(10)

See also

rndGamma

Purpose: generate a random positive number based on a gamma distribution.

Syntax rndGamma(a, b)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer | a | The (positive) number of independent events |
| numeric | b | The (positive) event rate |

Result

| Type | Description |
| --- | --- |
| real | A positive random number sampled from the gamma distribution, with probability density function given by $f(x) = \frac{b^a x^{a-1} e^{-bx}}{(a-1)!}$ (x > 0) |

Note: The expectation and variance of the gamma distribution are $a/b$ and $a/b^2$ respectively.

Examples rndGamma(50, 0.05)

See also

rndNormal

Purpose: generate a random number based on a normal distribution.

Syntax rndNormal(μ, σ)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | μ | The mean of the distribution |
| numeric | σ | The standard deviation of the distribution |

Result

| Type | Description |
| --- | --- |
| real | A random number sampled from the normal distribution, with probability density function given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$ |

Note: The expectation and variance of the normal distribution are $\mu$ and $\sigma^2$ respectively.

Examples

To derive a Gaussian distribution with zero mean and unit standard deviation: rndNormal(0, 1)

See also

rndPoisson

Purpose: generate a random non-negative integer based on a discrete Poisson distribution.

Syntax rndPoisson(mean)

Arguments

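| Type | Name | Description |
| --- | --- | --- |
| numeric | mean | The (positive) mean of the distribution |

Result

| Type | Description |
| --- | --- |
| integer | A random non-negative integer k sampled from the discrete Poisson distribution, with probability given by $P(k) = \frac{\mathit{mean}^k e^{-\mathit{mean}}}{k!}$ (k = 0, 1, 2, ...) |

Note: The expectation and variance of the discrete Poisson distribution are both equal to the mean.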

Examples rndPoisson(10)

See also

rndUniform

Purpose: generate a random number between 0 and 1 based on a uniform distribution.

Syntax rndUniform()

Arguments None

Result

| Type | Description |
| --- | --- |
| real | A random number sampled from a uniform distribution over [0, 1) |

Examples

To generate a random real number in the range [0, 10): rndUniform() * 10

See also

25 - Return-on-investment functions

In this section

ActionROI 318
ActionROIAnnualized 319
OfferROI 320
OfferROIAnnualized 322
RetentionActionROI 323
RetentionActionROIAnnualized 324
RetentionOfferROI 326
RetentionOfferROIAnnualized 327

ActionROI

Purpose: estimate the per-customer (money) return on investment contribution from taking action designed to generate a definite response. The action happens whether or not the customer responds, so there is only a cost of action, not of fulfillment.

Syntax ActionROI(ResponseProbability, ValueOfResponse, CostOfAction)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | ResponseProbability | The estimated probability that the customer will respond to the action by exhibiting some response behavior |
| numeric | ValueOfResponse | The estimated value of the customer's response to the marketing action (if it occurs) |
| numeric | CostOfAction | The estimated cost of the action (whether or not the customer responds) |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated per-customer (money) return on investment contribution from taking action |

Note:

• The response probability should be an "uplift," that is, the probability that the customer will exhibit that behavior when they would not otherwise have done so. If there is any possibility of the customer's exhibiting the desired behavior without the action being taken, you should model this by using an uplift model or, failing that, by subtracting the likelihood of response without the action from the likelihood of response given the action.

• The value of response should be a net revenue that ignores the cost of the marketing action.
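The fallback calculation of uplift described in the note is a simple difference of two probabilities. The Python sketch below shows it with made-up figures; the function name and the probabilities are hypothetical, and the sketch covers only the preparation of the ResponseProbability input, not the ROI calculation itself.

```python
def uplift(p_response_with_action, p_response_without_action):
    """Incremental response probability attributable to the action."""
    return p_response_with_action - p_response_without_action


# Hypothetical figures: 12% respond when contacted, 5% would respond anyway.
response_probability = uplift(0.12, 0.05)
print(round(response_probability, 4))  # 0.07
```

Passing the raw 12% figure instead of the 7% uplift would credit the action with responses that would have occurred without it.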

See also

ActionROIAnnualized on page 319

OfferROI on page 320

OfferROIAnnualized on page 322

RetentionActionROI on page 323

RetentionActionROIAnnualized on page 324

RetentionOfferROI on page 326

RetentionOfferROIAnnualized on page 327

ActionROIAnnualized

Purpose: estimate the annualized per-customer return on investment multiple from taking action designed to generate a definite response from a customer. The action happens whether or not the customer responds, so there is only a cost of action, not of fulfillment.

Syntax ActionROIAnnualized(ResponseProbability, ValueOfResponse, CostOfAction, DaysToROI)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | CostOfAction | |
| numeric | DaysToROI | The number of days expected to be required to achieve the given return on investment |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated annualized per-customer return on investment multiple from taking action |

Note:

• The value of response should be a net revenue that ignores the cost of the marketing action.

See also

ActionROI on page 318

OfferROI

Purpose: estimate the per-customer return on investment contribution from making an offer designed to generate a definite response.

Syntax OfferROI(ResponseProbability, ValueOfResponse, CostOfOffer, CostOfFulfilment)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | CostOfOffer | |
| numeric | CostOfFulfilment | The estimated cost of fulfilling a positive response (zero if there is no additional cost of fulfillment) |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated per-customer return on investment from making an offer |

Note:

• The value of response should be a net revenue that ignores the cost of the marketing action and the cost of fulfilling the customer's response.

See also

OfferROIAnnualized

Purpose: estimate the annualized per-customer return on investment multiple from making an offer designed to generate a definite response from a customer.

Syntax OfferROIAnnualized(ResponseProbability, ValueOfResponse, CostOfOffer, CostOfFulfilment, DaysToROI)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | CostOfOffer | |
| numeric | CostOfFulfilment | The estimated cost of fulfilling a positive response (zero if there is no additional cost of fulfillment) |
| numeric | DaysToROI | |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated per-customer annualized return on investment multiple from making an offer |

Note:

• The response probability should be an "uplift," that is, the probability that the customer will exhibit that behavior when they would not otherwise have done so. If there is any possibility of the customer's exhibiting the desired behavior without the action being taken, you should model this by using an uplift model or, failing that, by subtracting the likelihood of response without the action from the likelihood of response given the action.

• The value of response should be a net revenue that ignores the cost of the marketing action and the cost of fulfilling the customer's response.

See also

RetentionActionROI

Purpose: estimate the per-customer (money) return on investment contribution from taking action designed to prevent attrition. The action is assumed to have a fixed cost, regardless of whether the customer is saved or not, and whether or not a customer "accepts" the offer.

Syntax RetentionActionROI(SaveProbability, NetRevenueAtRisk, CostOfAction)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | SaveProbability | The estimated probability that the customer will be "saved" (for the period under consideration) as a result of the action taken |
| numeric | NetRevenueAtRisk | The estimated net revenue that will be lost if the customer undergoes attrition over the period under consideration |
| numeric | CostOfAction | |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated per-customer (money) return on investment contribution from action designed to prevent attrition |

Note:

• To say a customer is "saved" means "caused not to leave when the customer otherwise would have done so." This value is the negative of the uplift in attrition rate produced by an uplift model. Alternatively, it can be calculated as P(attrition|no action) - P(attrition|action).

• Net revenue at risk should be a net revenue that ignores the cost of the action designed to prevent attrition.
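The relationship between the save probability and the attrition uplift stated in the note can be checked with a short Python sketch. The function names and the probabilities are hypothetical and serve only to show that the two descriptions agree.

```python
def save_probability(p_attrition_no_action, p_attrition_with_action):
    """P(attrition | no action) - P(attrition | action)."""
    return p_attrition_no_action - p_attrition_with_action


def attrition_uplift(p_attrition_with_action, p_attrition_no_action):
    """Change in attrition rate caused by the action (negative when the action helps)."""
    return p_attrition_with_action - p_attrition_no_action


# Hypothetical figures: 20% would leave untreated, 14% leave after the action.
saved = save_probability(0.20, 0.14)
print(round(saved, 4))                           # 0.06
print(saved == -attrition_uplift(0.14, 0.20))    # True
```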

See also

RetentionActionROIAnnualized

Purpose: estimate the annualized per-customer return on investment contribution from taking action designed to prevent attrition. The action is assumed to have a fixed cost, regardless of whether the customer is saved or not, and whether or not a customer "accepts" the offer.

Syntax RetentionActionROIAnnualized(SaveProbability, NetRevenueAtRisk, CostOfAction, DaysToROI)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | CostOfAction | |
| numeric | DaysToROI | |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated annualized per-customer return on investment contribution from action designed to prevent attrition |

Note:

• To say a customer is "saved" means "caused not to leave when the customer otherwise would have done so." This value is the negative of the uplift in attrition rate produced by an uplift model. Alternatively, it can be calculated as P(attrition|no action) - P(attrition|action).

• Net revenue at risk should be a net revenue that ignores the cost of the action designed to prevent attrition.

See also

RetentionOfferROI

Purpose: estimate the per-customer return on investment from making an offer designed to prevent attrition. An offer, as distinct from an action, has a different cost according to whether or not it is accepted by the customer.

Syntax RetentionOfferROI(SaveProbability, ResponseProbability, NetRevenueAtRisk, CostOfOffer, CostOfFulfilment)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | ResponseProbability | The estimated probability that the customer will respond to the offer |
| numeric | CostOfOffer | The estimated cost of the offer (whether or not the customer responds) |
| numeric | CostOfFulfilment | The estimated cost of fulfilling a positive response to the offer |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated per-customer return on investment from an offer designed to prevent attrition |

Note:

• To say a customer is "saved" means "caused not to leave when the customer otherwise would have done so." This value is the negative of the uplift in attrition rate produced by an uplift model. Alternatively, it can be calculated as P(attrition|no action) - P(attrition|action).

• Net revenue at risk should be a net revenue that ignores the cost of the action designed to prevent attrition.

See also

RetentionOfferROIAnnualized

Purpose: estimate the annualized per-customer return on investment from making an offer designed to prevent attrition. An offer, as distinct from an action, has a different cost according to whether or not it is accepted by the customer.

Syntax RetentionOfferROIAnnualized(SaveProbability, ResponseProbability, NetRevenueAtRisk, CostOfOffer, CostOfFulfilment, DaysToROI)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| numeric | ResponseProbability | The estimated probability that the customer will respond to the offer |
| numeric | CostOfOffer | The estimated cost of the offer (whether or not the customer responds) |
| numeric | CostOfFulfilment | The estimated cost of fulfilling a positive response to the offer |
| numeric | DaysToROI | |

Result

| Type | Description |
| --- | --- |
| numeric | The estimated annualized per-customer return on investment from an offer designed to prevent attrition |

Note:

• To say a customer is "saved" means "caused not to leave when the customer otherwise would have done so." This value is the negative of the uplift in attrition rate produced by an uplift model. Alternatively, it can be calculated as P(attrition|no action) - P(attrition|action).

• Net revenue at risk should be a net revenue that ignores the cost of the action designed to prevent attrition.

See also

26 - Miscellaneous functions

In this section

dblookup 330
member 332
rankOrder, rankOrderApprox 333
rankOrderMean, rankOrderApproxMean 335
rownum 336

dblookup

Purpose: look up values in a reference table (stored as a focus).

Syntax dblookup(d, k, f, x)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| string | d | The pathname of the lookup table |
| string | k | The name of the key field in the lookup table |
| string | f | The name of another field in the lookup table |
| as field k | x | The lookup value |

Result

| Type | Description |
| --- | --- |
| as field f | The value y in the field f of table d, for the record indexed by the value x occurring in the key field k |

Note:

• You should only use this function for small lookup tables. If the lookup table is comparable in size to a customer or transaction table, you should consider performing a join instead.

• If the lookup value doesn't occur in the key field, the null value is returned.

• If the lookup value occurs more than once in the key field, the first value is used.

• Trailing whitespace is stripped from the lookup value, but not from the values of the key field in the lookup table. This may produce unexpected behavior if the key field in the lookup table contains values with trailing whitespace.

• For backwards compatibility, you can prefix the pathname of the lookup table with the string focus:.

Example

In a focus, the store at which a customer most frequently shops is recorded in the field PopularShop. Another focus, stores.ftr, comprises a lookup table that describes facilities at each store: in this focus, StoreCode is the key field, and the field HasFuel indicates whether each store has a gas station. Store 105 has a gas station, stores 299 and 204 do not, and so on; store 189 is not in the lookup table. To derive a field that flags whether each customer's favorite store has a gas station:

dblookup("focus:stores.ftr", "StoreCode", "HasFuel", PopularShop)

| PopularShop | dblookup("focus:stores.ftr", "StoreCode", "HasFuel", PopularShop) |
| --- | --- |
| 189 | null |
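The lookup rules in the notes (null for a missing key, first match wins, trailing whitespace stripped from the lookup value only) can be mimicked in a few lines of Python. This is a sketch of the documented behavior, not of how Spectrum Miner reads a focus: the table is a hypothetical in-memory list of rows, None stands in for null, and the 1/0 encoding of HasFuel is an assumption.

```python
def dblookup(table, key_field, value_field, x):
    """Return value_field from the first row whose key_field equals x, else None."""
    if isinstance(x, str):
        x = x.rstrip()  # only the lookup value is stripped, not the stored keys
    for row in table:
        if row[key_field] == x:
            return row[value_field]
    return None


# Hypothetical contents for the stores lookup table.
stores = [
    {"StoreCode": 105, "HasFuel": 1},
    {"StoreCode": 299, "HasFuel": 0},
    {"StoreCode": 204, "HasFuel": 0},
]

for popular_shop in (105, 299, 189):
    print(popular_shop, dblookup(stores, "StoreCode", "HasFuel", popular_shop))
```

A linear scan like this is one reason the function suits only small lookup tables; for large tables a join is the better tool, as the note advises.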

member

Purpose: determine set membership.

Syntax member(x, x1, x2, ...)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer, real, date, or string | x | The value to be tested |
| as x | x1, x2, ... | The elements of the set |

Result

| Type | Description |
| --- | --- |
| integer | 1 (true) if x is in the set {x1, x2, ...}; 0 (false) otherwise |

Note:

• You can only use literal values (numbers, strings, or dates) for the list of set elements.

• It is in general more efficient to use the member function than a disjunction of equality tests. For example, the expression member(F, "a", "b", "c") is quicker to evaluate than F = "a" or F = "b" or F = "c".

• For backwards compatibility, you can use strmember instead of member when the arguments are strings.

• Trailing whitespace in string values is not ignored.

Examples

To use the first two characters of a StateCode field to identify West-coast customers:

member(substr(StateCode, 0, 1), "CA", "OR", "WA")

| StateCode | member(substr(StateCode, 0, 1), "CA", "OR", "WA") |
| --- | --- |
| OR009 | 1 |
| CA043 | 1 |
| UT005 | 0 |
| WA027 | 1 |

To flag customers whose 10-digit numeric phone numbers have Edinburgh (131) or Glasgow (141) area codes (by determining the first three digits using the div operator [see Arithmetic operators on page 193]):

if member(TelNo div 10000000, 131, 141) then "Edin/Glas" else null

| TelNo | Result |
| --- | --- |
| 1312204491 | Edin/Glas |
| 1753833999 | null |
| 1611234567 | null |
| 1417654321 | Edin/Glas |
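The phone-number example relies on integer division to isolate the leading three digits of a 10-digit number. The Python sketch below reproduces the arithmetic, with Python's // operator playing the role of div and None standing in for null; the helper names are hypothetical.

```python
def member(x, *elements):
    """Return 1 if x equals any of the listed elements, otherwise 0."""
    return 1 if x in elements else 0


def area_flag(tel_no):
    """Flag 10-digit numbers whose first three digits are 131 or 141."""
    area_code = tel_no // 10_000_000  # drop the last seven digits
    return "Edin/Glas" if member(area_code, 131, 141) else None


for tel_no in (1312204491, 1753833999, 1611234567, 1417654321):
    print(tel_no, area_flag(tel_no))
```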

See also

rankOrder, rankOrderApprox

Purpose: identify the rank of a value within a list of values.

Syntax

rankOrder(x, x1, x2, ...)

rankOrderApprox(x, x1, x2, ...)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer, real, date, or string | x | The value to be ranked |
| as x | x1, x2, ... | The list of values for comparison |

Result

| Type | Description |
| --- | --- |
| integer | The position of the first occurrence of x within the list of values obtained by sorting x1, x2, ... in descending order (latest to earliest for dates; reverse alphabetical order for strings). If x doesn't occur in the list, rankOrder returns null, while rankOrderApprox returns the position of x within the list of values obtained by sorting x, x1, x2, ... in descending order. |

Note:

• When the value to be ranked occurs exactly once in the list of values, rankOrder, rankOrderApprox, rankOrderMean, and rankOrderApproxMean all give the same result.

• When the value to be ranked occurs more than once in the list of values, rankOrderApprox gives the same result as rankOrder.

• When the value to be ranked doesn't occur in the list of values, rankOrderApprox gives the same result as rankOrderApproxMean.

• For strings, "alphabetical order" means the natural order of the underlying character representation.

Examples

rankOrder(6.74, 6.74, 1.99, 0.00, 9.27) equals 2.

rankOrderApprox(6.74, 1.99, 0.00, 9.27) equals 2.

To check where a customer's number of phone calls in Quarter 4 ranks in a year:

rankOrder(CallsQ4, CallsQ1, CallsQ2, CallsQ3, CallsQ4)

To check where a customer's number of phone calls in month 5 ranks with the previous four:

rankOrderApprox(CallsM5, CallsM1, CallsM2, CallsM3, CallsM4)

See also

rankOrderMean, rankOrderApproxMean on page 335

rankOrderMean, rankOrderApproxMean

Purpose: identify the rank of a value within a list of values, or its average rank if it occurs more than once.

Syntax rankOrderMean(x, x1, x2, ...)

rankOrderApproxMean(x, x1, x2, ...)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| integer, real, date, or string | x | The value to be ranked |
| as x | x1, x2, ... | The list of values for comparison |

Result

| Type | Description |
| --- | --- |
| real | The mean rank of x within the list of values obtained by sorting x1, x2, ... in descending order (latest to earliest for dates; reverse alphabetical order for strings). If x doesn't occur in the list, rankOrderMean returns null, while rankOrderApproxMean returns the position of x within the list of values obtained by sorting x, x1, x2, ... in descending order. |

Note:

• When the value to be ranked occurs exactly once in the list of values, rankOrder, rankOrderApprox, rankOrderMean, and rankOrderApproxMean all give the same result.

• When the value to be ranked occurs more than once in the list of values, rankOrderApproxMean gives the same result as rankOrderMean.

• When the value to be ranked doesn't occur in the list of values, rankOrderApproxMean gives the same result as rankOrderApprox.

• For strings, "alphabetical order" means the natural order of the underlying character representation.

Examples

rankOrderMean(6.74, 6.74, 1.99, 6.74, 9.27) equals 2.5.

rankOrderApproxMean(6.74, 1.99, 6.74, 9.27, 6.74) equals 2.5.
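The four ranking functions differ only in how they treat ties and missing values, which a compact Python sketch can make concrete. The code below follows the descriptions in this chapter, with None standing in for null; the snake_case names are hypothetical, and the handling of null arguments is not modeled because it is not described here.

```python
def _positions(x, values):
    """1-based positions of x after sorting values in descending order."""
    ordered = sorted(values, reverse=True)
    return [i + 1 for i, v in enumerate(ordered) if v == x]


def rank_order(x, *values):
    pos = _positions(x, values)
    return pos[0] if pos else None


def rank_order_approx(x, *values):
    pos = _positions(x, values) or _positions(x, (x, *values))
    return pos[0]


def rank_order_mean(x, *values):
    pos = _positions(x, values)
    return sum(pos) / len(pos) if pos else None


def rank_order_approx_mean(x, *values):
    pos = _positions(x, values) or _positions(x, (x, *values))
    return sum(pos) / len(pos)


print(rank_order(6.74, 6.74, 1.99, 0.00, 9.27))              # 2
print(rank_order_approx(6.74, 1.99, 0.00, 9.27))             # 2
print(rank_order_mean(6.74, 6.74, 1.99, 6.74, 9.27))         # 2.5
print(rank_order_approx_mean(6.74, 1.99, 6.74, 9.27, 6.74))  # 2.5
```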

See also

rankOrder, rankOrderApprox on page 333

rownum

Purpose: generate the number of each row in the table.

Syntax rownum()

Arguments None

Result

| Type | Description |
| --- | --- |
| integer | An index number from 1 to the number of records in the table |

Examples To add an index column to an N-record dataset, with values 1, 2, ..., N:

| ID | rownum() |
| --- | --- |
| AZ195 | 1 |
| DI539 | 2 |
| JD974 | 3 |
| BY643 | 4 |

27 - Binnings

In this section

bin 338
Boolean 339
DayFrom, WeekFrom, MonthFrom, YearFrom 340
DayMultipleFrom, WeekMultipleFrom, MonthMultipleFrom, YearMultipleFrom 341
DayMultipleNumBins, WeekMultipleNumBins, MonthMultipleNumBins, YearMultipleNumBins 342
DayMultiplePrePost, WeekMultiplePrePost, MonthMultiplePrePost, YearMultiplePrePost 343
DayMultipleTo, WeekMultipleTo, MonthMultipleTo, YearMultipleTo 344
DayMultipleWidth, WeekMultipleWidth, MonthMultipleWidth, YearMultipleWidth 345
DayPrePost, WeekPrePost, MonthPrePost, YearPrePost 346
DayRange, WeekRange, MonthRange, YearRange 347
DayTo, WeekTo, MonthTo, YearTo 348
EqualRange 349
EqualRangeWidth 350
NegativeNonNegative 351
PreDuringPost 352
PrePost 353
Sign 354

bin

Purpose: obtain a bin index corresponding to a value.

Syntax bin(binning, x)

Arguments

| Type | Name | Description |
| --- | --- | --- |
| binning | binning | A named binning, with parameter values (which must be literal values) in parentheses if required |
| integer, real, date, or string | x | A value compatible with the binning |

Result

| Type | Description |
| --- | --- |
| integer | The bin index of the bin in which x lies |

Note:

• Bin indices are positive integers. For a numeric binning with n nominal bins (numbered 2 to n + 1), the lower and upper end bins are numbered 1 and n + 2 respectively, while the null and unclassified bins are numbered n + 3 and n + 4.

• This function is not available in Decision Studio. However, if x is a field and b is the primary binning associated with x, you can achieve the same result using segindex:

segindex() by x

Examples

Split aggregations using the equal-range binning EqualRange on the Amount field, resulting in an eight-way segmented aggregation (four internal bins between amount values 20 and 100 plus two end bins, a null bin and an unclassified bin):

create totalPointsByAmountBands := sum(PointsRedeemed) by bin(EqualRange(20, 100, 4), Amount);
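The bin numbering used in this example (end bins 1 and n + 2, internal bins 2 to n + 1, null bin n + 3) can be sketched in Python. Several details below are assumptions made for illustration rather than documented behavior: the parameter order (lower, upper, number of bins) is inferred from the example, internal bins are treated as half-open intervals that include their lower edge, and None stands in for null. The unclassified bin (n + 4) is not produced by this sketch.

```python
def equal_range_bin(x, lower, upper, n):
    """Return the bin index of x for n equal-width bins between lower and upper."""
    if x is None:
        return n + 3          # null bin
    if x < lower:
        return 1              # lower end bin
    if x >= upper:
        return n + 2          # upper end bin
    width = (upper - lower) / n
    # Clamp to n + 1 to guard against floating-point edge effects near `upper`.
    return min(2 + int((x - lower) // width), n + 1)


# Mirrors EqualRange(20, 100, 4): internal bins of width 20 starting at 20.
for amount in (5, 20, 59.99, 99, 100, None):
    print(amount, equal_range_bin(amount, 20, 100, 4))
```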

Split aggregations into eight segments based on the campaign binning PrePost, starting on 1 May 1999, and with aggregates from the month before and after the campaign:

create totalSpendPrePostMayCampaign_ := sum(Amount) by bin(PrePost(todate(19990401), todate(19990501), todate(19990601)), PurchaseDate);

See also

Boolean on page 339

DayFrom, WeekFrom, MonthFrom, YearFrom on page 340

DayMultipleFrom, WeekMultipleFrom, MonthMultipleFrom, YearMultipleFrom on page 341

DayMultipleNumBins, WeekMultipleNumBins, MonthMultipleNumBins, YearMultipleNumBins on page 342

DayMultiplePrePost, WeekMultiplePrePost, MonthMultiplePrePost, YearMultiplePrePost on page 343

DayMultipleTo, WeekMultipleTo, MonthMultipleTo, YearMultipleTo on page 344

DayMultipleWidth, WeekMultipleWidth, MonthMultipleWidth, YearMultipleWidth on page 345

DayPrePost, WeekPrePost, MonthPrePost, YearPrePost on page 346

DayRange, WeekRange, MonthRange, YearRange on page 347

DayTo, WeekTo, MonthTo, YearTo on page 348

EqualRange on page 349

EqualRangeWidth on page 350

NegativeNonNegative on page 351

PreDuringPost on page 352

PrePost on page 353

Sign on page 354

Boolean

Purpose: bin boolean [see Boolean data on page 182] values.

Parameters: None

Result

| Values | Bin index |
|---|---|
| false | 1 |
| true | 2 |
| null [see The null value on page 181] | 3 |
| unclassified | 4 |

DayFrom, WeekFrom, MonthFrom, YearFrom

Purpose: bin date values into a specified number of periods (days, weeks, months, or years)following a given date.

Parameters

| Name | Type | Description |
|---|---|---|
| a ("start_date") | date | The start of the periodic portion of the binning |
| n ("num_bins") | integer | The number of periods (days, weeks, months, or years) in the periodic portion of the binning |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| the n consecutive periods following a | 2, ..., n + 1 |
| dates n periods after a and later; the upper end bin | n + 2 |
| null [see The null value on page 181] | n + 3 |
| unclassified | n + 4 |
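As a rough illustration of the DayFrom numbering, the following Python sketch maps a date to a bin index when the period is one day. It is not Spectrum Miner code: the function name day_from_bin is hypothetical, None stands in for null, and the unclassified bin is not modelled.

```python
from datetime import date


def day_from_bin(start, n, value):
    """Bin index under the DayFrom numbering, with a period of one day."""
    if value is None:
        return n + 3                  # null bin
    offset = (value - start).days     # whole days since the start date
    if offset < 0:
        return 1                      # lower end bin: earlier than start
    if offset >= n:
        return n + 2                  # upper end bin: n periods after start, or later
    return offset + 2                 # periodic bins 2 .. n + 1


start = date(1999, 5, 1)
print(day_from_bin(start, 7, date(1999, 5, 3)))
```

The week, month, and year variants differ only in how the offset in whole periods is computed.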

See also

DayMultipleFrom, WeekMultipleFrom, MonthMultipleFrom, YearMultipleFrom

Purpose: bin date values into a specified number of periods following a given date, where a period is a specified number of days, weeks, months, or years.

Parameters

| Name | Type | Description |
|---|---|---|
| a | date | The start of the periodic portion of the binning |
| m ("width") | integer | The number of units (days, weeks, months, or years) in a period |
| n | integer | The number of bins in the periodic portion of the binning |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| the n consecutive periods (each of m units) following a | 2, ..., n + 1 |
| dates n × m units after a and later; the upper end bin | n + 2 |
| null [see The null value on page 181] | n + 3 |
| unclassified | n + 4 |

See also

DayMultipleNumBins, WeekMultipleNumBins, MonthMultipleNumBins, YearMultipleNumBins

Purpose: bin date values into approximately the specified number of periods between two given dates, where the period is a fixed multiple of the unit (day, week, month, or year).

Parameters

| Name | Type | Description |
|---|---|---|
| a | date | The start of the periodic portion of the binning |
| b ("end_date") | date | The end of the periodic portion of the binning |
| n | integer | The target number of periods (multiples of a day, week, month, or year) in the periodic portion of the binning |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| periods between a and b | 2, ..., N - 3 |
| dates b and later; the upper end bin | N - 2 |
| null [see The null value on page 181] | N - 1 |
| unclassified | N |

Note:

• These binnings are like the DayRange, WeekRange, MonthRange, and YearRange [see DayRange, WeekRange, MonthRange, YearRange on page 347] binnings, except that the period may be two or more units (days, weeks, months, or years), to yield approximately the target number of bins.

• When the period is two or more units, bin N - 3 can be shorter than bins 2 to N - 4.

See also

DayMultipleWidth, WeekMultipleWidth, MonthMultipleWidth, YearMultipleWidth on page 345

DayMultiplePrePost, WeekMultiplePrePost, MonthMultiplePrePost, YearMultiplePrePost

Purpose: bin date values into specified numbers of bins before and after a given reference date, where each bin is a given multiple of the unit (day, week, month, or year).

Parameters

| Name | Type | Description |
|---|---|---|
| a ("ref_date") | date | The reference date for the binning |
| n | integer | The number of bins before and after the reference date |
| m ("width") | integer | The number of units (days, weeks, months, or years) in each bin |

Result

| Values | Bin index |
|---|---|
| dates earlier than n × m units before a; the lower end bin | 1 |
| the n consecutive periods (each of m units) preceding a | 2, ..., n + 1 |
| the n consecutive periods (each of m units) following a | n + 2, ..., 2n + 1 |
| dates n × m units after a and later; the upper end bin | 2n + 2 |
| null [see The null value on page 181] | 2n + 3 |
| unclassified | 2n + 4 |
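The index arithmetic for DayMultiplePrePost can be sketched as follows. This is an illustration under stated assumptions rather than the product's implementation: the function name is hypothetical, None stands in for null, the unclassified bin is not modelled, and the bins before the reference date are assumed to be numbered in chronological order.

```python
from datetime import date


def day_multiple_pre_post_bin(ref, n, m, value):
    """Bin index for n bins of m days on each side of the reference date."""
    if value is None:
        return 2 * n + 3              # null bin
    offset = (value - ref).days
    period = offset // m              # floor division: negative before ref
    if period < -n:
        return 1                      # lower end bin
    if period >= n:
        return 2 * n + 2              # upper end bin
    return period + n + 2             # bins 2 .. 2n + 1


ref = date(1999, 5, 1)
print(day_multiple_pre_post_bin(ref, 2, 7, date(1999, 4, 30)))
```

With n = 2 and m = 7, the day before the reference date falls in bin 3 (the last pre-reference bin) and the reference date itself falls in bin 4.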

See also

PrePost on page 353

DayMultipleTo, WeekMultipleTo, MonthMultipleTo, YearMultipleTo

Purpose: bin date values into a specified number of periods preceding a given date, where a period is a specified number of days, weeks, months, or years.

Parameters

| Name | Type | Description |
|---|---|---|
| b ("end_date") | date | The end of the periodic portion of the binning |
| m ("width") | integer | The number of units (days, weeks, months, or years) in a period |
| n | integer | The number of bins in the periodic portion of the binning |

Result

| Values | Bin index |
|---|---|
| dates earlier than n × m units before b; the lower end bin | 1 |
| the n consecutive periods (each of m units) preceding b | 2, ..., n + 1 |
| dates b and later; the upper end bin | n + 2 |
| null [see The null value on page 181] | n + 3 |
| unclassified | n + 4 |

See also

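DayMultipleWidth, WeekMultipleWidth, MonthMultipleWidth, YearMultipleWidth

Purpose: bin date values into periods between two given dates, where the period is a specified multiple of the unit (day, week, month, or year).

Parameters

| Name | Type | Description |
|---|---|---|
| a | date | The start of the periodic portion of the binning |
| b ("end_date") | date | The end of the periodic portion of the binning |
| m ("width") | integer | The number of units (days, weeks, months, or years) in a period |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| periods between a and b | 2, ..., N - 3 |
| dates b and later; the upper end bin | N - 2 |
| null [see The null value on page 181] | N - 1 |
| unclassified | N |

Note:

• These binnings are like the DayRange, WeekRange, MonthRange, and YearRange [see DayRange, WeekRange, MonthRange, YearRange on page 347] binnings, except that the period may be two or more units (days, weeks, months, or years).

• When the period is two or more units, bin N - 3 can be shorter than bins 2 to N - 4.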
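The effect described in the second bullet can be illustrated for the day-based variant. The sketch below is not Spectrum Miner code; it assumes that the periods are laid out from a towards b and that any remainder goes into the last periodic bin, which is consistent with the note but not stated outright. The function name is hypothetical.

```python
import math
from datetime import date


def day_multiple_width_layout(start, end, m):
    """Return (N, lengths): the total bin count and each periodic bin's length in days."""
    total_days = (end - start).days
    if total_days <= 0 or m < 1:
        raise ValueError("end must be after start and m must be positive")
    periods = math.ceil(total_days / m)
    # Every periodic bin has m days except possibly the last (bin N - 3).
    lengths = [m] * (periods - 1) + [total_days - m * (periods - 1)]
    total_bins = periods + 4          # plus lower end, upper end, null, unclassified
    return total_bins, lengths


print(day_multiple_width_layout(date(1999, 5, 1), date(1999, 5, 11), 3))
```

For ten days split into three-day periods this gives N = 8, with periodic bins 2 to 5 of lengths 3, 3, 3, and 1 day.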

See also

DayMultipleNumBins, WeekMultipleNumBins, MonthMultipleNumBins, YearMultipleNumBins on page 342

DayPrePost, WeekPrePost, MonthPrePost, YearPrePost

Purpose: bin date values into specified numbers of periods (days, weeks, months, or years) preceding and following a given reference date.

Parameters

| Name | Type | Description |
|---|---|---|
| a ("ref_date") | date | The reference date for the binning |
| m ("before") | integer | The number of periods (days, weeks, months, or years) to bin before the reference date |
| n ("after") | integer | The number of periods to bin after the reference date |

Result

| Values | Bin index |
|---|---|
| dates earlier than m periods before a; the lower end bin | 1 |
| the m consecutive periods preceding a | 2, ..., m + 1 |
| the n consecutive periods following a | m + 2, ..., m + n + 1 |
| dates n periods after a and later; the upper end bin | m + n + 2 |
| null [see The null value on page 181] | m + n + 3 |
| unclassified | m + n + 4 |

See also

PrePost on page 353

DayRange, WeekRange, MonthRange, YearRange

Purpose: bin date values into periods (days, weeks, months, or years) between two given dates.

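Parameters

| Name | Type | Description |
|---|---|---|
| a | date | The start of the periodic portion of the binning |
| b ("end_date") | date | The end of the periodic portion of the binning |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| periods (days, weeks, months, or years) between a and b | 2, ..., N - 3 |
| dates b and later; the upper end bin | N - 2 |
| null [see The null value on page 181] | N - 1 |
| unclassified | N |

Note:

• If there is less than a complete period between a and b, bin 2 contains dates between a (inclusive) and b (exclusive).

• If there is not an exact number of periods between a and b (but there is at least one complete period), the bin immediately preceding b is lengthened to accommodate the excess.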
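The two rules in the note can be sketched for the week-based variant. This is an illustration only, with a hypothetical function name; it returns the length in days of each periodic bin between a and b.

```python
from datetime import date


def week_range_lengths(start, end):
    """Lengths in days of the periodic bins between start and end, in weeks."""
    total_days = (end - start).days
    if total_days <= 0:
        raise ValueError("end must be after start")
    weeks = total_days // 7
    if weeks == 0:
        # Less than one complete period: a single bin (bin 2) covers [start, end).
        return [total_days]
    lengths = [7] * weeks
    lengths[-1] += total_days - 7 * weeks   # last bin absorbs the excess
    return lengths


print(week_range_lengths(date(1999, 5, 1), date(1999, 5, 18)))
```

Seventeen days between a and b give two periodic bins of 7 and 10 days, rather than two weeks plus a short third bin.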

See also

DayTo, WeekTo, MonthTo, YearTo

Purpose: bin date values into a specified number of periods (days, weeks, months, or years) preceding a given date.

Parameters

| Name | Type | Description |
|---|---|---|
| b ("end_date") | date | The end of the periodic portion of the binning |
| n | integer | The number of periods (days, weeks, months, or years) in the periodic portion of the binning |

Result

| Values | Bin index |
|---|---|
| dates earlier than n periods before b; the lower end bin | 1 |
| the n consecutive periods preceding b | 2, ..., n + 1 |
| dates b and later; the upper end bin | n + 2 |
| null [see The null value on page 181] | n + 3 |
| unclassified | n + 4 |

See also

EqualRange

Purpose: bin numeric values into a specified number of equal-width intervals.

Parameters

| Name | Type | Description |
|---|---|---|
| a ("from") | real | The lower end of the equal-range portion of the binning |
| b ("to") | real | The upper end of the equal-range portion of the binning |
| n | integer | The number of bins in the equal-range portion of the binning |

Result

| Values | Bin index |
|---|---|
| x < a; the lower end bin | 1 |
| [a, a + (b - a)/n), ..., [b - (b - a)/n, b) | 2, ..., n + 1 |
| x ≥ b; the upper end bin | n + 2 |
| null [see The null value on page 181] | n + 3 |
| unclassified | n + 4 |
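The half-open intervals of EqualRange translate into simple arithmetic. The following Python sketch is illustrative only: the function name equal_range_bin is hypothetical, None stands in for null, and the unclassified bin is not modelled.

```python
def equal_range_bin(a, b, n, x):
    """Bin index for x under n equal-width intervals on [a, b)."""
    if x is None:
        return n + 3                      # null bin
    if x < a:
        return 1                          # lower end bin
    if x >= b:
        return n + 2                      # upper end bin
    width = (b - a) / n
    # Clamp guards against floating-point rounding just below b.
    return min(int((x - a) // width), n - 1) + 2


for amount in (10, 20, 45, 99.9, 100):
    print(amount, equal_range_bin(20, 100, 4, amount))
```

For EqualRange(20, 100, 4) each internal bin is 20 units wide, so an amount of 45 lands in bin 3, the interval from 40 up to but not including 60.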

See also

EqualRangeWidth on page 350

EqualRangeWidth

Purpose: bin numeric values into intervals of a specified width.

Parameters

| Name | Type | Description |
|---|---|---|
| a ("from") | real | The lower end of the equal-range portion of the binning |
| b ("to") | real | The upper end of the equal-range portion of the binning |
| r ("width") | real | The width of bins in the equal-range portion of the binning |

Result

| Values | Bin index |
|---|---|
| x < a; the lower end bin | 1 |
| equal-width bins of approximate width r | 2, ..., N - 3 |
| x ≥ b; the upper end bin | N - 2 |
| null [see The null value on page 181] | N - 1 |
| unclassified | N |

If b - a isn't an exact multiple of r, an approximation to r is used, so that the bins between a and b are of equal width.

See also

EqualRange on page 349

NegativeNonNegative

Purpose: bin numeric values according to sign, grouping 0 with positive values.

Parameters: None

Result

| Values | Bin index |
|---|---|
| x < 0; that is, negative values | 1 |
| x ≥ 0; that is, non-negative values | 2 |
| null [see The null value on page 181] | 3 |
| unclassified | 4 |

See also

Sign on page 354

PreDuringPost

Purpose: bin date values into pre-campaign, during-campaign, and post-campaign periods.

Parameters

| Name | Type | Description |
|---|---|---|
| a ("pre_date") | date | The start of the pre-campaign period |
| b ("campaign_start_date") | date | The start of the campaign |
| c ("campaign_end_date") | date | The end of the campaign |
| d ("post_date") | date | The end of the post-campaign period |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| dates between a (inclusive) and b (exclusive) | 2 |
| dates between b (inclusive) and c (exclusive) | 3 |
| dates between c (inclusive) and d (exclusive) | 4 |
| dates d and later; the upper end bin | 5 |
| null [see The null value on page 181] | 6 |
| unclassified | 7 |
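A minimal Python sketch of the PreDuringPost numbering follows. It is not product code: the function name is hypothetical, None stands in for null, and the indices used for the lower end bin (1) and the null bin (6) are assumed from the general numbering convention for binnings. The unclassified bin (7) is not modelled.

```python
from datetime import date


def pre_during_post_bin(a, b, c, d, value):
    """Bin index for a date relative to the pre, campaign, and post periods."""
    if value is None:
        return 6          # null bin (assumed index)
    if value < a:
        return 1          # lower end bin (assumed index)
    if value < b:
        return 2          # pre-campaign: [a, b)
    if value < c:
        return 3          # during campaign: [b, c)
    if value < d:
        return 4          # post-campaign: [c, d)
    return 5              # upper end bin: d and later


args = (date(1999, 4, 1), date(1999, 5, 1), date(1999, 6, 1), date(1999, 7, 1))
print(pre_during_post_bin(*args, date(1999, 5, 15)))
```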

See also

PrePost on page 353

PrePost

Purpose: bin date values into pre-campaign and post-campaign periods.

Parameters

| Name | Type | Description |
|---|---|---|
| a ("pre_date") | date | The start of the pre-campaign period |
| b ("campaign_date") | date | The campaign date |
| c ("post_date") | date | The end of the post-campaign period |

Result

| Values | Bin index |
|---|---|
| dates earlier than a; the lower end bin | 1 |
| dates between a (inclusive) and b (exclusive) | 2 |
| dates between b (inclusive) and c (exclusive) | 3 |
| dates c and later; the upper end bin | 4 |
| null [see The null value on page 181] | 5 |
| unclassified | 6 |

See also

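Sign

Purpose: bin numeric values according to sign.

Parameters: None

Result

| Values | Bin index |
|---|---|
| x < 0; that is, negative values | 1 |
| x = 0; that is, zero | 2 |
| x > 0; that is, positive values | 3 |
| null [see The null value on page 181] | 4 |
| unclassified | 5 |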

See also

NegativeNonNegative on page 351


28 - XML in Spectrum Miner

In this section

XML in Spectrum Miner 356
Metadata specification for qsimportmetadata 357
Aggregation specification for qsmeasure 366
Derivation specification for qsderive 368
Selection specification for qsselect 369
Crosstab specification for qsxt 371
Field name mapping specification for qsrenamefields 374
Decision-tree build specification for qsdecisiontree 376
Scorecard build specification for qsscorecard 379
Binning specifications 381
Attribute values 390

XML in Spectrum Miner

To allow easy interchange of results with other applications, XML is used throughout Spectrum Miner.

All XML inputs to the Spectrum Miner data-build commands (and any Spectrum Miner function that is implemented via a Spectrum Miner data-build command) must be in the Spectrum Miner namespace. The easiest way to ensure this is to include the attribute xmlns="http://www.quadstone.com/xml" in the root element of the XML document.

XML files created by Spectrum Miner automatically include an optional schemaLocation attribute. If you use a schema-aware editor to create XML files, you can access these schemas directly from your Spectrum Miner installation at:

<smhome>/server/qs8.0/integration/schemas

(where <smhome> is the Spectrum Miner installation directory). You can also find examples of how to process these XML files at:

<smhome>/server/qs8.0/etc/xslt
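Before passing a hand-written XML file to a data-build command, it can be useful to confirm that its root element is in the Spectrum Miner namespace. The following sketch uses Python's standard xml.etree.ElementTree module; the function name root_in_sm_namespace is hypothetical and the check is not part of Spectrum Miner.

```python
import xml.etree.ElementTree as ET

SM_NAMESPACE = "http://www.quadstone.com/xml"


def root_in_sm_namespace(xml_text):
    """Return True if the document's root element is in the Spectrum Miner namespace."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # ElementTree expands a namespaced tag to "{namespace-uri}localname".
    return root.tag.startswith("{" + SM_NAMESPACE + "}")


good = '<metadata xmlns="http://www.quadstone.com/xml"><focus/></metadata>'
bad = "<metadata><focus/></metadata>"
print(root_in_sm_namespace(good), root_in_sm_namespace(bad))
```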

See also

Aggregation specification for qsmeasure on page 366

Binning specifications on page 381

Crosstab specification for qsxt on page 371

Decision-tree build specification for qsdecisiontree on page 376

Derivation specification for qsderive on page 368

Field name mapping specification for qsrenamefields on page 374

Metadata specification for qsimportmetadata on page 357

Scorecard build specification for qsscorecard on page 379

Selection specification for qsselect on page 369

Metadata specification for qsimportmetadata

metadata

| Required | Value | Description |
|---|---|---|
| focus | element | The root focus |

Examples

Describe the metadata for importing.

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://www.quadstone.com/xml">
  <focus>
    <comment>This focus was created for the Retail Analysis project.</comment>
  </focus>
</metadata>

focus

| Optional | Value | Description |
|---|---|---|
| launch | focus name attribute | The name of the default subfocus |
| name | focus name attribute | The name for a subfocus |
| objective | field name attribute | The name of the objective field |
| partition | field name attribute | The name of the partition field |
| comment | element | The focus comment |
| history | element | The focus history |
| field | element | The fields to import metadata into |
| focus | element | Subfoci definitions |

The launch attribute can only appear in the root focus. All other subfoci are required to have a name attribute.

Example

Describe the root focus.

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://www.quadstone.com/xml">
  <focus>
    <comment>This focus was created for the Retail Analysis project.</comment>
  </focus>
</metadata>

field in metadata

| Name | Value | Description |
|---|---|---|
| name | field name attribute | The name of the field being defined |
| ref | field name attribute | The name of the field being referenced |
| type | see field type | The data type representing this field |
| length | integer attribute | The length of the string representation |
| analysis | binary attribute | Whether this field is an analysis candidate |
| export | binary attribute | Whether this field is an export field |
| comment | element | The field comment |
| binning | element | The field binning |
| fdl | element | The field derivation FDL expression |
| recordselection | element | The field record selection |

Note:

• When a field is being defined, for example as a derivation, it is required to have a name attribute. When it is being referenced, for example for inclusion in a subfocus, it is required to have a ref attribute.

• length and type attributes can only be used when defining a field (when the name attribute has been given).

Examples Describe a string field:

Describe an analysis candidate:

See also

Binning specifications on page 381

fdl

| Name | Value | Description |
|---|---|---|
| content | text | The FDL expression to describe the field derivation |
| seed | integer attribute | The numeric seed for the random number generator |

Examples

<fdl>mode(Store)</fdl>
<fdl>if PaymentMethod = "CA" then Amount else 0</fdl>
<fdl seed="123">rndUniform()</fdl>

comment

| Name | Value | Description |
|---|---|---|
| content | text | The text description for the comment |
| xhtml | binary attribute | A comment is formatted as XHTML |
| div | string element | The formatted text description for the comment, which can contain any XHTML markup that is permitted to appear in a <div> element |

Note: When the xhtml attribute is used, it always takes the value true.

Examples

Add a comment to a focus:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://www.quadstone.com/xml">
  <focus>
    <comment>This focus was created for the Retail Analysis project.</comment>
  </focus>
</metadata>

Add a formatted comment to a focus:

      <div>
        This focus was created for the Retail Analysis project.<br/>
        An audit is available on the <a href="http://intranet.company.com/audits">intranet</a>.
      </div>
    </comment>
  </focus>
</metadata>

history

Focus history is a single history element containing a non-empty sequence of operation elements. Each operation element is plain text representing a single Spectrum Miner data-build command applied to the focus.

| Name | Value | Description |
|---|---|---|
| operation | string elements | One particular action in the history |

Example

Describe the history of all actions on the focus:

<history>
  <operation>qsimportflat -input DB.fdd -output DB.ftr</operation>
  <operation>qsupdate -from template.ftr -to DB.ftr</operation>
</history>
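Because each operation element is plain text, the history can be read back with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module to list the recorded commands; the function name history_operations is hypothetical and the sample document is assembled from the elements described in this section.

```python
import xml.etree.ElementTree as ET

SM_NAMESPACE = "http://www.quadstone.com/xml"

SAMPLE = """<metadata xmlns="http://www.quadstone.com/xml">
  <focus>
    <history>
      <operation>qsimportflat -input DB.fdd -output DB.ftr</operation>
      <operation>qsupdate -from template.ftr -to DB.ftr</operation>
    </history>
  </focus>
</metadata>"""


def history_operations(xml_text):
    """Return the text of every operation element, in document order."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    tag = "{%s}operation" % SM_NAMESPACE
    return [(op.text or "").strip() for op in root.iter(tag)]


for command in history_operations(SAMPLE):
    print(command)
```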

recordselection

| Name | Value | Description |
|---|---|---|
| numeric | element | The numeric record selection |
| date | element | The date record selection |
| categorical | element | The categorical record selection |

The recordselection element takes one of the numeric, date, or categorical subelements.

Examples

Integer range selection between -2677 (inclusive) and 16808 (exclusive), and excluding Nulls:

</recordselection></field>

numeric

| Name | Value | Description |
|---|---|---|
| nullhandling | see nullhandling | Specify how the record selection handles null values |
| range | element | Specify the numeric range selection |

Note: The numeric element accepts multiple range subelements.

Example

Remove an existing field selection, and any Null selection:

date

| Name | Value | Description |
|---|---|---|
| range | element | Specify the date range selection |

Note: The date element accepts multiple range subelements.

Example

Date range selection, only available with inclusive end points, and dates specified in a fixed format:

</date></recordselection>

</field>

categorical

| Name | Value | Description |
|---|---|---|
| otherstatus | see otherstatus | Specify whether the other category is selected or deselected |
| select | element | The category values in the selection |

Note: The categorical element accepts multiple select subelements.

Example

A categorical selection that uses otherstatus to deselect any category not explicitly mentioned, for example, select only categories D and E:

</categorical></recordselection>

</field>

Removing the selection from a categorical field:

</categorical></recordselection>

</field>

range

| Name | Value | Description |
|---|---|---|
| min | numeric or date attribute | The minimum of the range for the record selection |
| max | numeric or date attribute | The maximum of the range for the record selection |
| minisincluded | binary attribute | Include the min value in the range |
| maxisincluded | binary attribute | Include the max value in the range |

Note: The range element requires at least one min and one max attribute.
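The way the four range attributes combine can be sketched as a membership test. This is an illustration of the inclusive and exclusive end points only, written in Python with a hypothetical function name; it does not parse the XML and makes no assumption about default attribute values.

```python
def in_range(value, minimum, maximum, *, min_is_included, max_is_included):
    """Return True if value lies in the range defined by the four range attributes."""
    above_min = value >= minimum if min_is_included else value > minimum
    below_max = value <= maximum if max_is_included else value < maximum
    return above_min and below_max


# Integer range between -2677 (inclusive) and 16808 (exclusive).
for candidate in (-2677, 0, 16808):
    print(candidate, in_range(candidate, -2677, 16808,
                              min_is_included=True, max_is_included=False))
```

The same comparison applies to date ranges when the values are date objects.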

select

| Name | Value | Description |
|---|---|---|
| value | string attribute | The name of the category |
| status | see status | Include in or exclude from selection |

Aggregation specification for qsmeasure

aggregations

| Name | Value | Description |
|---|---|---|
| field | elements | The field definitions |

Example

Describe the creation of two new aggregates:

<?xml version="1.0" encoding="UTF-8"?>
<aggregations xmlns="http://www.quadstone.com/xml">
  <field name="mostCommonStore" context="aggregation">
    <fdl>mode(Store)</fdl>
  </field>
  <field name="averageSpendInStore_" context="aggregation">
    <fdl>mean(Amount)</fdl>
    <by>StoreSplitFunction(Store)</by>
  </field>
</aggregations>
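An aggregation specification like the one above can also be generated programmatically. The following Python sketch builds the same structure with the standard xml.etree.ElementTree module. The function name build_aggregations and the input dictionaries are hypothetical; only the element and attribute names come from the example.

```python
import xml.etree.ElementTree as ET

SM_NAMESPACE = "http://www.quadstone.com/xml"


def build_aggregations(specs):
    """Build an aggregations document from a list of dicts with name, fdl, and optional by."""
    ET.register_namespace("", SM_NAMESPACE)   # emit xmlns="..." on the root element

    def qualified(tag):
        return "{%s}%s" % (SM_NAMESPACE, tag)

    root = ET.Element(qualified("aggregations"))
    for spec in specs:
        field = ET.SubElement(root, qualified("field"),
                              name=spec["name"], context="aggregation")
        ET.SubElement(field, qualified("fdl")).text = spec["fdl"]
        if spec.get("by"):
            ET.SubElement(field, qualified("by")).text = spec["by"]
    return ET.tostring(root, encoding="unicode")


xml_text = build_aggregations([
    {"name": "mostCommonStore", "fdl": "mode(Store)"},
    {"name": "averageSpendInStore_", "fdl": "mean(Amount)",
     "by": "StoreSplitFunction(Store)"},
])
print(xml_text)
```

Building the document through an XML library rather than string concatenation ensures that characters such as < in FDL expressions are escaped correctly.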

field in aggregations

| Name | Value | Description |
|---|---|---|
| name | field name attribute | The name of the field to be created |
| context | see measure context | Whether this field is to be used in an aggregation, statistic, or tracker file |
| fdl | element | The FDL expression to describe the aggregation |
| type | see field type | The data type of the field to be created |
| length | integer attribute | The length of a string field |
| temporary | binary attribute | Whether this field is created in the output focus |
| by | string element | The split-by function |
| where | string element | The FDL expression record filter |
| default | string element | The default value of the aggregation |

Examples

Describe a field with the mode aggregation on field Store:

<field name="mostCommonStore" context="aggregation">
  <fdl>mode(Store)</fdl>
</field>

Describe a field with the mean aggregation on field Amount, split by function StoreSplitFunction:

<field name="averageSpendInStore_" context="aggregation">
  <fdl>mean(Amount)</fdl>
  <by>StoreSplitFunction(Store)</by>
</field>

Derivation specification for qsderive

derivations

| Name | Value | Description |
|---|---|---|
| field | elements | The field definitions |

Example

Describe a set of three derivations:

<?xml version="1.0" encoding="UTF-8"?>
<derivations xmlns="http://www.quadstone.com/xml">
  <field name="TransMonth">
    <fdl>month(PurchaseDate)</fdl>
  </field>
  <field name="CashRecd">
    <fdl>if PaymentMethod = "CA" then Amount else 0</fdl>
  </field>
  <field name="Random">
    <fdl seed="123">rndUniform()</fdl>
  </field>
</derivations>

field in derivations

| Name | Value | Description |
|---|---|---|
| name | field name attribute | The name of the field to be created |
| fdl | element | The FDL expression to describe the field to be created |
| type | see field type | The data type of the field to be created |
| length | integer attribute | The length of a string field |

Examples

Describe a field using a built-in function:

<field name="TransMonth">
  <fdl>month(PurchaseDate)</fdl>
</field>

Describe a field using a simple control statement:

<field name="CashRecd">
  <fdl>if PaymentMethod = "CA" then Amount else 0</fdl>
</field>

Describe a random field with a specific seed:

<field name="Random">
  <fdl seed="123">rndUniform()</fdl>
</field>

Selection specification for qsselect

selections

| Name | Value | Description |
|---|---|---|
| field | elements | The fields to create this selection |

Example

Describe a random selection:

  <field name="random" context="selection">
    <fdl seed="12345678">rndUniform() &lt; 0.1</fdl>
  </field>
</selections>

field in selections

| Name | Value | Description |
|---|---|---|
| name | field name attribute | The name of the field to be created |
| fdl | element | The FDL expression to describe the record selection. Must produce an integer result of zero or one |
| context | selection | This field is for use with qsselect |

Example

Describe a random selection:

<field name="random" context="selection">
  <fdl seed="12345678">rndUniform() &lt; 0.1</fdl>
</field>

Crosstab specification for qsxt

crosstabset

| Name | Value | Description |
|---|---|---|
| crosstab | elements | The crosstabs to be evaluated |
| profile | binary attribute | Whether the crosstabset represents a profile, rather than a crosstab |
| description | string element | The description of the crosstabset |

Example

Describe a crosstabset containing a single crosstab:

<?xml version="1.0" encoding="UTF-8"?>
<crosstabset xmlns="http://www.quadstone.com/xml">
  <description>No Description</description>
  <crosstab indexscheme="compact1">
    <description>No Description</description>
    <specification>
      <function name="mode">
        <parameter>MaritalStatus</parameter>
      </function>
    </specification>
  </crosstab>
</crosstabset>

crosstab

| Name | Value | Description |
|---|---|---|
| indexscheme | compact1 | The indexing scheme used in this crosstab |
| specification | element | The specification of the crosstab |
| description | string elements | The description of the crosstab |

Example

Describe a crosstabset containing a single crosstab:

<?xml version="1.0" encoding="UTF-8"?>
<crosstabset xmlns="http://www.quadstone.com/xml">
  <description>No Description</description>
  <crosstab indexscheme="compact1">
    <description>No Description</description>
    <specification>
    </specification>
  </crosstab>
</crosstabset>

specification in crosstab

| Name | Value | Description |
|---|---|---|
| function | elements | The functions to be used as crosstab measurements |
| field | elements | The fields to be used as segmentation breakdowns |

Example

Describe a crosstab of fields Age and Gender with measures count() and mode(MaritalStatus):

</specification>

function

| Name | Value | Description |
|---|---|---|
| name | field name attribute | Identify the name of the measure |
| parameter | field name attribute | Identify the field parameter to the function |

Examples

Describe a function with no parameters:

Describe a function with a single parameter MaritalStatus:

<function name="mode">
  <parameter>MaritalStatus</parameter>
</function>

field in crosstab

| Name | Value | Description |
|---|---|---|
| name | field name attribute | The name of the field to be used in the crosstab |

Example

Specify field Age for use in the crosstab:

Field name mapping specification for qsrenamefields

mappingset

| Name | Value | Description |
|---|---|---|
| map | elements | The mapping of field names |

Example

Describe the renaming of three fields:

<?xml version="1.0" encoding="UTF-8"?>
<mappingset xmlns="http://www.quadstone.com/xml">
  </map>
  <map>
    <name>Age</name>
    <alias>CurrentAge</alias>
  </map>
  <map>
    <name>Gender</name>
    <alias>Sex</alias>
  </map>
</mappingset>
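A mapping set is a flat list of name and alias pairs, so it converts naturally to a dictionary. The sketch below reads one with Python's standard xml.etree.ElementTree module; the function name read_mappings is hypothetical, and the sample uses two of the mappings from the example.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.quadstone.com/xml"}

SAMPLE = """<mappingset xmlns="http://www.quadstone.com/xml">
  <map><name>Age</name><alias>CurrentAge</alias></map>
  <map><name>Gender</name><alias>Sex</alias></map>
</mappingset>"""


def read_mappings(xml_text):
    """Return a dict of {old field name: new field name} from a mappingset document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    mappings = {}
    for entry in root.findall("sm:map", NS):
        name = entry.findtext("sm:name", namespaces=NS)
        alias = entry.findtext("sm:alias", namespaces=NS)
        if name is not None and alias is not None:
            mappings[name] = alias
    return mappings


print(read_mappings(SAMPLE))
```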

map

| Name | Value | Description |
|---|---|---|
| name | field name element | The name of the field to rename |
| alias | field name element | The new field name |

Example

Rename field StartDate to Initialized:

</map>

Decision-tree build specification for qsdecisiontree

decisiontree

| Name | Value | Description |
|---|---|---|
| specification | element | The decision-tree build specification |

Example

Describe a decision-tree build specification with objective field Age and three analysis candidates,creating new result field PredictedAge:

<?xml version="1.0" encoding="UTF-8"?><decisiontree xmlns="http://www.quadstone.com/xml">

</specification></decisiontree>

specification in decisiontree

| Name | Value | Description |
|---|---|---|
| nlevels | integer attribute | The depth of the tree to be built |
| objectivefield | field name element | The objective field |
| analysiscandidates | field names element | The analysis candidates |
| resultfield | field name element | The name for the created results field |
| splitconstraints | element | The constraints on the decision-tree build |
| kwaysplitfield | field name element | The k-way split field |
| partitionfield | field name element | The partition field |
| testtrain | element | The test/training methodology to be used |

Example

Describe a decision-tree build specification with objective field Age and three analysis candidates, creating new result field PredictedAge:

Note:

• If the <partitionfield> element is empty, qsdecisiontree will ignore any existing partition prior to building the tree.

• If the <kwaysplitfield> element is empty, qsdecisiontree will ignore any existing k-way split prior to building the tree.

splitconstraints

| Name | Value | Description |
|---|---|---|
| minpopsize | integer attribute | The minimum size of a segment population |
| minpopsizetreated | integer attribute | The minimum size of the treated segment population |
| minpopsizecontrol | integer attribute | The minimum size of the control segment population |
| combiningfunction | see combining function | The combining function used to select how the final split is chosen |

Note:

• When a partitionfield element is given, minpopsizetreated and minpopsizecontrol are valid splitconstraints attributes. Otherwise, only the minpopsize attribute should be set.

• The combiningfunction attribute should only be used when the kwaysplitfield element is defined.

testtrain

| Name | Value | Description |
|---|---|---|
| threshold | real attribute | The proportion of records, as a number between zero and one, to be used to train the decision-tree build |
| namefield | field name element | The random field to define the test/train split |

Example

</testtrain>

Scorecard build specification for qsscorecard

scorecard

| Name | Value | Description |
|---|---|---|
| specification | element | The scorecard build specification |

Example

Describe a scorecard build specification with objective field Age and three analysis candidates,creating new result field PredictedAge:

<?xml version="1.0" encoding="UTF-8"?><scorecard xmlns="http://www.quadstone.com/xml">

</specification></scorecard>

specification in scorecard

| Name | Value | Description |
|---|---|---|
| iscontinuous | binary attribute | Build a continuous-objective scorecard, regardless of the setting of modeltype |
| modeltype | see model types | What type of (binary) model to build (ignored if iscontinuous is true); if this is unset, build a continuous-objective scorecard |
| nfields | integer attribute | The number of fields in the created scorecard |
| objectivefield | field name element | The objective field |
| analysiscandidates | field names element | The analysis candidates |
| resultfield | field name element | The name for the created result field |
| correlationthreshold | real attribute | The threshold above which analysis candidates are excluded for being too correlated with the objective |
| testtrain | element | The test/training methodology to be used |

Example

</specification>

Binning specifications

binning

| Name | Value | Description |
|---|---|---|
| name | string attribute | The name of the binning |
| numeric | element | The numeric binning |
| date | element | The date binning |
| geographic | element | The geographical binning |
| categorical | element | The categorical binning |

The binning element takes only one of the numeric, date, geographic, or categorical subelements.

Examples

Add a numeric binning to field LoanAmount:

</field>

Add a categorical binning to field Gender:

<category name="male" levelname="unnamed node">
  <category name="1" levelname="base categories" value="1"/>
</category>

</binning></field>

Add a geographic field binning to field StateCode:

</geographic></binning>

</field>

numeric

| Name | Value | Description |
|---|---|---|
| targetnbins | integer attribute | The required number of bins |
| adjustnbins | binary attribute | Automatically adjust the number of bins to create nice bin boundaries. Defaults to true |
| forcewholenumbered | binary attribute | Whether a binning on a real-valued field should have whole-numbered boundaries |
| style | see bin style | The type of binning applied |
| binning | see binning display | The binning display |
| minendparam | element | The lower end bin definition. By default, equal-width binning uses auto end bins, while equal-population binning uses none |
| maxendparam | element | The upper end bin definition. By default, equal-width binning uses auto end bins, while equal-population binning uses none |
| boundaries | element | The bin boundaries for the user-defined binning style |

Example

Add an equal population numeric binning with ten ranges to a field:

</binning>

date

| Optional | Value | Effect |
|---|---|---|
| targetnbins | integer attribute | The required number of bins |
| binby | see date period | The binning time period |
| reference | see date reference | Which date to bin by date period from |
| binwidth | integer attribute | The number of time periods for a bin |
| style | see bin style | The type of binning applied |
| binning | see binning display | How the binning should be displayed |
| minendparam | element | The lower end bin definition. By default, equal-width binning uses auto end bins, while equal-population binning uses none |
| maxendparam | element | The upper end bin definition. By default, equal-width binning uses auto end bins, while equal-population binning uses none |
| boundaries | element | The bin boundaries for the user-defined binning style |

Example

Create a date binning, by quarters, starting from 1 March 1992.

</boundaries></date>

categorical

| Name | Value | Description |
|---|---|---|
| categories | element | The description of the category hierarchy |
| hierarchylocation | file element | The location of the file containing the categorical hierarchy description |
| selectn | element | The number of categories to create |

Note: Only one of these elements is required to define a categorical binning.

Example

Describe a categorical binning:

</category><category name="female">

</binning>
