Automating Continuous Tracking: The Ideal System
© Copyright Red Centre Software Pty Ltd, 2008.
Version: 2 October 2008
ASC conference, Imperial College, 3 October 2008
Dec 25, 2015
Why is it so hard?
Because markets, opinions, behaviour change
Forcing questionnaires to evolve,
Forcing changes in data collection,
Which require adjustments to files,
Which impacts analysis and reporting.
Dynamism
Managing Change
In practice, a failure to manage change is often a combination of
• poor work practices
• inadequate software
Check List: A New Brand
• Edit master questionnaire
• Consult existing brand lists, allocate a code
• Brief the interviewers, script writers
• Confirm field system will capture and validate the new code
• Identify verbatims, brief the coding department
• Add to house system brand lists
• Check all brand constructions
• Consider tables/charts which use brand lists for axes/filters
• Connect the new brand to the reporting regime
• Modify diagnostic reports to catch issues
Check List: New Variables
• Provide a set of consistent variable names
• Decide on the descriptions for final reports
• Make sure that all parent brands have exactly the same code across all variables
• Make sure that all child brands have exactly the same code across all variables
• Stitch up multi-response and hierarchics to discrete variables
• Net variables as required (first or other = total)
• Create new tables and charts
• Connect to the reporting regime
• Create a set of diagnostic reports to catch issues in the above
That’s 19 Ways to Stuff it Up
• Don’t understate and underestimate
• Sometimes, a hard thing is just hard
• Any solution has to address the real problems
• The better automated, the fewer catastrophes
• Make the machines do the work
Solution: Heteromorphicity
As the market changes and the instrument evolves, the structure of the data and reports must, as far as possible, auto-shape-shift accordingly.
Q6a. Which brands of razors did you last buy?
1. Schick Xtreme System 3
2. Schick Intuition
3. Schick Extra II
……………………....
1825. Gillette Series
1826. Gillette MACH3
1827. Gillette Fusion
The Principles
Eight principles guide the underlying rationale for work practices and for the selection of software features.
Maximum Generality
Always work at the highest levels of data abstraction: Hierarchic and multi-response variables, not atomised single response
Immutable Definitions
Once a code is defined, or a variable named and described, it must never change. A change of 1=Coke to 1=Pepsi could destroy a job.
Absolute Consistency
Always use the same code for the same brand. Never do the same thing in different ways.
Total Retention
Keep all historical case data and all meta-data well organised and accessible. All cases from job inception, and all variables, should be immediately accessible.
Don’t Bloat
Never add variables or files to the job without explicit point and purpose.
Super-Actions Always
Never do the same thing many times. Look for the tail which wags the dog, the domino which tips the rest.
The Benito Principle
Organisational Fascism. Military precision and discipline. No exceptions. Mindlessly follow the procedures. Never break the rules. Never take a short cut. Document everything. Keep scrupulous records. Enforce accountability.
Hyper-defensivity
‘What can go wrong, has already gone wrong’. You just don’t know about it yet.
Work Practice: Variable Map
Unaided Brand Awareness
Name   Method    Details                                                     Q’nr  Description                        Report stem
TMBA   source    -                                                           Q1a   Top of Mind Brand Awareness        TopMindBrandAwa
TMBAN  Code net  TMBA(AnyBrandX)=c1, TMBA(AnyBrandY)=c2, TMBA(AnyBrandZ)=c3  -     Net Top of Mind Brand Awareness    TopMindBrandAwaNet
UOBA   source    -                                                           Q1b   Unaided Other Brand Awareness      UnaidOthBrandAwa
UOBAN  Code net  From UOBA, as per TMBAN, UOBA&!TMBAN                        -     Net Unaided Other Brand Awareness  UnaidOthBrandAwaNet
UBA    Var net   TMBA | UOBA                                                 -     Unaided Brand Awareness            UnaidBrandAwa
UBAN   Var net   TMBAN | UOBAN                                               -     Net Unaided Brand Awareness        UnaidBrandAwaNet
Work Practice: Coding System
For example, allocate in blocks of 999, single digit parent codes, four digit variant codes:
Parent Brands
Kraft = 1
Unilever = 2
Heinz = 3
Variant Brands
Kraft = 1001/1999
Unilever = 2001/2999
Heinz = 3001/3999
1001=(Kraft) Cheddar Cheese
1002=(Kraft) Vegemite
...
2001=(Unilever) I Can’t Believe it’s not Butter
2002=(Unilever) Ben and Jerry’s
...
3001=(Heinz) Baked Beans (tomato sauce)
3002=(Heinz) Bean Samosa
- codes 1/1000, 2000 and 3000 are sacrificed
- brand codes always four digits aligned for readability
- parent and variant brand codes always have the same leading digit (‘1’=Kraft, ‘2’=Unilever...)
- boundary problems are avoided (the ‘first’ is always a trailing ‘1’, never a ‘0’, eg 1001, and not 1000).
- Absolute Consistency
- Immutable Definitions
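The block allocation scheme can be checked mechanically. The following is a minimal Python sketch, assuming only the three parent codes listed; the helper names `parent_of` and `next_variant_code` are hypothetical and do not come from any product.

```python
# Hypothetical helpers for the block-of-999 coding scheme.
PARENTS = {1: "Kraft", 2: "Unilever", 3: "Heinz"}


def parent_of(variant_code: int) -> int:
    """Return the parent code for a four-digit variant code.

    Variant codes run from p001 to p999 for parent p, so codes ending
    in 000 (1000, 2000, 3000) are rejected as sacrificed.
    """
    parent, offset = divmod(variant_code, 1000)
    if parent not in PARENTS or offset == 0:
        raise ValueError(f"{variant_code} is not a valid variant code")
    return parent


def next_variant_code(parent: int, used: set[int]) -> int:
    """Allocate the lowest unused variant code in the parent's block."""
    for code in range(parent * 1000 + 1, parent * 1000 + 1000):
        if code not in used:
            return code
    raise ValueError(f"block for parent {parent} is full")


used_codes = {1001, 1002, 2001, 2002, 3001, 3002}
new_code = next_variant_code(1, used_codes)  # a new Kraft variant: 1003
```

Because the leading digit carries the parent, netting a variant to its parent never needs a lookup table that could drift out of date.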
Key Software Requirements (a short list)
• Fully interactive
• Reference Lists
• Rename/alias common variables
• Specification Generators
• Disassembly of Constructions (Variable Ancestry)
• Rollback
• Scripting
• Support External Data
• True Calendar
Priority #1. Interactivity is Essential
Because ERRORS will always happen
• Diagnostics cannot tell if 49% or 51% is right
• 50% might be 1 out of 2
• A healthy rise might be a smoothing artefact
• The unrolled data could be wildly aberrant
• A trend could be due to an outlier
• etc.
Any series could be garbage
A Moving Average can hide a multitude of data sins
Analysts must be able to unroll/reroll at will
- Hyper-defensivity
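To illustrate how a moving average can hide a data sin, here is a minimal Python sketch. The weekly figures are invented for the example, and `moving_average` is a hypothetical helper, not a feature of any particular package.

```python
from statistics import mean


def moving_average(series, window):
    """Trailing moving average; the first window-1 points are skipped."""
    return [mean(series[i - window + 1 : i + 1])
            for i in range(window - 1, len(series))]


# Invented weekly awareness percentages; the fifth week is a data error.
weekly = [20, 21, 19, 20, 60, 20, 21, 19]

rolled = moving_average(weekly, 4)    # what a smoothed chart would show
unrolled = moving_average(weekly, 1)  # a window of 1 gives back the raw series
```

Here `rolled` climbs from 20 to about 30 and stays there for four periods, which reads as a healthy rise, while `unrolled` exposes the single aberrant value of 60.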
Use Reference Master Brand Lists
Master: 1=Brand1, 2=Brand2, 3=Brand3, ..., N=BrandN
Q1a: 1=Brand1, 2=Brand2, 3=Brand3, ..., N=BrandN
Q1b: 1=Brand1, 2=Brand2, 3=Brand3, ..., N=BrandN
Q18a: 1=Brand1, 2=Brand2, 3=Brand3, ..., N=BrandN
If a new brand N+1 is added to Master, all instances will reference it.
- Super-Action
A reference is not a copy!
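The difference between a reference and a copy can be shown in a few lines of Python. This is only an illustration of the idea; the dictionary layout and the names `master`, `code_frames` and `stale_copy` are assumptions made for the example.

```python
# Hypothetical master brand list shared by several question variables.
master = {1: "Brand1", 2: "Brand2", 3: "Brand3"}

# Each variable holds a reference to the master code frame, not a copy.
code_frames = {"Q1a": master, "Q1b": master, "Q18a": master}

# A copy, shown only for contrast, goes stale once the master changes.
stale_copy = dict(master)

master[4] = "Brand4"  # add the new brand once, in one place
```

After the single edit, every entry in `code_frames` sees code 4, while `stale_copy` does not.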
Create Named Code Lists
AnySingleBladeDisp = 1/10,51/78,102,108,234/300,345,378,401,423
AnyDoubleBladeDisp = 11/22,45,49/50,82/100,111/145
AnyGillette = 1/5,56/77,123/156,167,203/301,400
AnyDisposable = AnySingleBladeDisp OR AnyDoubleBladeDisp
Now it is easy to specify these lists across all measures, for use in nets, filters etc.
BrandBoughtEver(AnyGillette)
AidedAwa(AnyGillette) AND AidedAwa(AnyDisposable)
Again, the mechanism is a reference, not a copy.
- Absolute Consistency
- Super-Actions
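A named code list mechanism of this kind might be sketched in Python as follows. The `a/b` range notation is taken from the slide; the functions `parse_codes`, `resolve` and `mentions` are hypothetical names. Lists defined in terms of other lists are resolved at lookup time, so they behave as references rather than copies.

```python
def parse_codes(spec: str) -> frozenset[int]:
    """Expand a spec such as '1/10,51/78,102' into a set of codes.

    'a/b' is read as the inclusive range a..b.
    """
    codes = set()
    for part in spec.split(","):
        if "/" in part:
            low, high = (int(x) for x in part.split("/"))
            codes.update(range(low, high + 1))
        else:
            codes.add(int(part))
    return frozenset(codes)


# A string is a code spec; a tuple refers to other named lists.
named_lists = {
    "AnySingleBladeDisp": "1/10,51/78,102,108,234/300,345,378,401,423",
    "AnyDoubleBladeDisp": "11/22,45,49/50,82/100,111/145",
    "AnyGillette": "1/5,56/77,123/156,167,203/301,400",
    "AnyDisposable": ("AnySingleBladeDisp", "AnyDoubleBladeDisp"),
}


def resolve(name: str) -> frozenset[int]:
    """Resolve a named list at lookup time, following references."""
    definition = named_lists[name]
    if isinstance(definition, str):
        return parse_codes(definition)
    return frozenset().union(*(resolve(n) for n in definition))


def mentions(case_codes: set[int], list_name: str) -> bool:
    """True if the respondent mentioned any code in the named list."""
    return bool(case_codes & resolve(list_name))
```

Editing `AnySingleBladeDisp` once is then enough for `AnyDisposable`, and every filter built on it, to pick up the change.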
Common Variables
All variables which are common to at least some of your jobs should be uniformly named, with uniform data structures.
Let the variable for Gender always be named GEN, single response, F=1, M=2. Let the variable for Aided Brand Awareness always be named ABA, multi-response. Let the variable for Brand Image always be named BIM, multi response level B hierarchic, image within brand, etc.
- Absolute Consistency (gender is always GEN)
- Maximum Generality (three specific names become one general name)
- Super-Actions (set up standard parts of new jobs by copy/paste)
Specification Generators
Consider the variable net of TMBA with UOBA to get a total UBA.
If the rules have been followed, and TMBA and UOBA have exactly the same code frame, then you need to be able to say something like
UBA(n) = TMBA(n) OR UOBA(n) for all code n
This way, it never matters what happens at the source end – the processing system will seamlessly adapt.
- Maximum Generality
- Super-Actions
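The rule "UBA(n) = TMBA(n) OR UOBA(n) for all code n" can be sketched in Python without ever listing the codes. This is an illustration under assumptions: each variable is represented as a mapping from case ID to the set of codes mentioned, and `var_net` is a hypothetical name.

```python
def var_net(*sources):
    """Generate a net variable as the case-by-case union of its sources.

    Each source maps case ID -> set of brand codes mentioned. No code
    frame is listed here, so new codes flow through without edits.
    """
    case_ids = set().union(*(s.keys() for s in sources))
    return {cid: set().union(*(s.get(cid, set()) for s in sources))
            for cid in case_ids}


# Invented data for three respondents.
TMBA = {101: {1001}, 102: {2001}, 103: set()}
UOBA = {101: {2001, 3001}, 102: set(), 103: {1002}}

UBA = var_net(TMBA, UOBA)
```

If a new code appears in either source at the next update, rerunning `var_net` carries it into UBA with no change to the specification.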
Disassembly of Constructions
The structure and logic must be totally transparent
TMBA net to parents >> TMBAN
UOBA net to parents >> UOBAN
UBAN = TMBAN or UOBAN
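One way to make such ancestry transparent is to record, for every constructed variable, its method and its parents, and then walk the record on request. The sketch below assumes a simple registry; the `constructions` layout and the `ancestry` function are hypothetical.

```python
# Hypothetical construction registry: variable -> (method, parent variables).
constructions = {
    "TMBA": ("source", []),
    "UOBA": ("source", []),
    "TMBAN": ("code net", ["TMBA"]),
    "UOBAN": ("code net", ["UOBA"]),
    "UBAN": ("var net", ["TMBAN", "UOBAN"]),
}


def ancestry(name, depth=0):
    """Return indented lines tracing a variable back to its sources."""
    method, parents = constructions[name]
    lines = ["  " * depth + f"{name} ({method})"]
    for parent in parents:
        lines.extend(ancestry(parent, depth + 1))
    return lines


report = ancestry("UBAN")
```

Printing `report` line by line shows UBAN at the top, its two nets beneath it, and the source variables at the leaves.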
Rollback
• You have just spent an hour updating a job
• A serious mis-code is noticed
• All the constructions are wrong
• Lots of fiddly one-offs which are hard to automate are already committed
• Therefore roll back is data only – job edits remain – not a strip back.
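A data-only rollback can be pictured as keeping the job specification and the case data in separate stores. The following Python sketch is one possible reading of the slide, not a description of any product; the `Job` class and its attribute names are assumptions.

```python
class Job:
    """Toy job: the spec (job edits) is kept apart from the case data."""

    def __init__(self):
        self.spec = {}     # job edits: constructions, code lists, ...
        self.batches = []  # one list of cases per data update

    def update(self, cases):
        """Append one interviewing cycle's cases as a separate batch."""
        self.batches.append(list(cases))

    def rollback_data(self):
        """Discard the most recent data update; leave the spec untouched."""
        return self.batches.pop() if self.batches else []

    @property
    def cases(self):
        """All cases currently in the job, in update order."""
        return [c for batch in self.batches for c in batch]


job = Job()
job.update([1, 2])                      # an earlier, clean update
job.update([3, 4])                      # the update containing the mis-code
job.spec["UBAN"] = "TMBAN or UOBAN"     # a fiddly edit made during the update
dropped = job.rollback_data()           # data goes back, the edit stays
```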
Scripting
If you have done it twice already, then script it
Use VB and Java.
Avoid products with proprietary languages
- Super-Actions
Support External Data
In the real world, things happen. Quantitative data needs to be ‘re-aggregatable’. Events need to be stored and accessed.
Week to Quarter
Media weight GRPs Re-aggregate on Y2
Events track to nearest date
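Re-aggregating weekly external data to quarters, and attaching an event to the nearest reporting date, might look like the Python sketch below. It assumes that GRPs are additive and that a week belongs to the quarter in which it starts; both are assumptions made for the example, as are the function names and the figures.

```python
from collections import defaultdict
from datetime import date


def quarter_of(d: date) -> tuple[int, int]:
    """Return (year, quarter) for a date."""
    return d.year, (d.month - 1) // 3 + 1


def reaggregate(weekly: dict[date, float]) -> dict[tuple[int, int], float]:
    """Sum weekly values into quarters, keyed by each week's start date."""
    totals = defaultdict(float)
    for week_start, value in weekly.items():
        totals[quarter_of(week_start)] += value
    return dict(totals)


def nearest_date(event: date, period_starts: list[date]) -> date:
    """Return the period start closest in calendar days to the event."""
    return min(period_starts, key=lambda d: abs((d - event).days))


# Invented weekly media weights (GRPs), keyed by week start.
weekly_grps = {
    date(2008, 3, 24): 100.0,
    date(2008, 3, 31): 50.0,
    date(2008, 4, 7): 25.0,
}
quarterly = reaggregate(weekly_grps)
anchor = nearest_date(date(2008, 4, 3), list(weekly_grps))
```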
True Calendar
Brand1 is advertising on the weekends, but the survey is fielded Monday to Friday. True date variables.
How to roll across the days with no respondents?
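One possible answer, offered only as a sketch and not as the author's method, is to define the rolling window in calendar days rather than in numbers of observations, so that days without respondents are spanned instead of skipped. The function name and the data below are invented.

```python
from datetime import date, timedelta


def rolling_mean_by_date(cases, end, days):
    """Mean of values for cases dated in the `days` calendar days ending at `end`.

    `cases` is a list of (interview date, value) pairs. Returns None when
    the window holds no cases at all.
    """
    start = end - timedelta(days=days - 1)
    values = [v for d, v in cases if start <= d <= end]
    return sum(values) / len(values) if values else None


# Invented cases: one on Friday 3 October 2008, one on Monday 6 October 2008.
cases = [(date(2008, 10, 3), 1.0), (date(2008, 10, 6), 0.0)]

sunday = date(2008, 10, 5)
three_day = rolling_mean_by_date(cases, sunday, 3)  # window reaches back to Friday
two_day = rolling_mean_by_date(cases, sunday, 2)    # weekend only: no respondents
```

With a true date on every case, the weekend still has a position on the axis: the three-day window ending on Sunday reports Friday's value, and the two-day window reports that there is nothing to show.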
Putting it all Together: Heteromorphism in Practice
What happens to A when a new brand code appears at the leaf nodes?
Reference Master Brand List
Master Brand List: add once here
Specification Generators
For all brand codes n...
UAA(n) = TMAA(n) or UOAA(n)
AA(n) = ABA(n) or AAA(n)
AAA(n) = AAATV(n) or AAAOM(n)
UA(n) = UAA(n) or UBA(n)
UBA(n) = TMBA(n) or UOBA(n)
Named Code Lists
UAN(1) = UA(AnyKraft)
UAN(2) = UA(AnyUnilever)
UAN(3) = UA(AnyHeinz)
Edit Code Lists Once
This collapses the Unaided codes to the Aided subset
Finally, by Spec Generator again
For each parent brand n, A(n) = UAN(n) or AA(n)
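The last two steps, netting unaided variant codes to parents and then combining with aided parent codes, can be sketched for a single respondent in Python. The block ranges follow the 1001/1999 style of coding system described earlier; `NAMED_LISTS`, `net_to_parents` and `total_awareness` are hypothetical names.

```python
# Hypothetical named code lists: parent code -> block of variant codes.
NAMED_LISTS = {
    1: range(1001, 2000),  # AnyKraft
    2: range(2001, 3000),  # AnyUnilever
    3: range(3001, 4000),  # AnyHeinz
}


def net_to_parents(variant_codes):
    """UAN: the parents for which any variant code was mentioned."""
    return {parent for parent, block in NAMED_LISTS.items()
            if any(code in block for code in variant_codes)}


def total_awareness(ua, aa):
    """A(n) = UAN(n) or AA(n), for one respondent."""
    return net_to_parents(ua) | aa


# Invented respondent: unaided mentions include a brand new variant, 1003.
ua_codes = {1003, 3001}
aa_codes = {2}
a_codes = total_awareness(ua_codes, aa_codes)
```

Because the Kraft list is a block rather than an enumeration, the new variant 1003 reaches A with no edit anywhere downstream of the master list.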
Diagnostics
1. Audit Report at each update
2. Check-sum relationships
3. Check for
- unexpected empties
- wildly out of character values
- sudden differences
4. Create charts which declare problems
Good process + good software does not mean no problems
Audit Report
new code TMBA(239), Brand Alpha
new code UOBA(239), Brand Alpha
new code TMAA(293), Brand Alpha
new code UOAA(239), Brand Alpha
changed code def TMBA(238), old: BrandGamma
new: Brand Gamma
changed code def IMAGE1(99), old: Don’t Know
new: Planet Zeta
TMAA has Brand Alpha as 293 instead of as elsewhere 239
the meaning of IMAGE1 code 99 has changed from Don’t Know to Planet Zeta
- Absolute Consistency
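An audit of this kind amounts to comparing the code frames before and after an update. The Python sketch below reproduces the sort of lines shown in the report; the data layout (variable name mapped to a code-to-label dictionary) and the function names are assumptions made for illustration.

```python
def audit(old_frames, new_frames):
    """Compare code frames before and after an update; return report lines."""
    lines = []
    for var, new in new_frames.items():
        old = old_frames.get(var, {})
        for code, label in new.items():
            if code not in old:
                lines.append(f"new code {var}({code}), {label}")
            elif old[code] != label:
                lines.append(f"changed code def {var}({code}), "
                             f"old: {old[code]}, new: {label}")
    return lines


def inconsistent_labels(frames):
    """Find labels that carry different codes in different variables."""
    codes_by_label = {}
    for frame in frames.values():
        for code, label in frame.items():
            codes_by_label.setdefault(label, set()).add(code)
    return {label: codes for label, codes in codes_by_label.items()
            if len(codes) > 1}


old_frames = {"TMBA": {238: "BrandGamma"}, "TMAA": {},
              "IMAGE1": {99: "Don't Know"}}
new_frames = {"TMBA": {238: "Brand Gamma", 239: "Brand Alpha"},
              "TMAA": {293: "Brand Alpha"},
              "IMAGE1": {99: "Planet Zeta"}}

report = audit(old_frames, new_frames)
clashes = inconsistent_labels(new_frames)
```

The `clashes` result flags Brand Alpha as carrying both 239 and 293, which is the TMAA slip described above.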
Check-sum Example
Verbatim Coding Check on Unaided Brand Awareness: UBA-TMBA-UOBA = 0
4 = 1 + 3, OK
5 <> 1 + 5, bad
Tabulating this expression against Case, and then sorting in ascending order, identifies the relevant case IDs.
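The check can be sketched per case in Python by counting the codes in each variable. The data are invented to match the two examples on the slide (4 = 1 + 3 and 5 <> 1 + 5), and `checksum` is a hypothetical name.

```python
def checksum(uba, tmba, uoba):
    """Per case: count(UBA) - count(TMBA) - count(UOBA); zero means OK."""
    return {cid: len(uba[cid]) - len(tmba[cid]) - len(uoba[cid])
            for cid in uba}


# Invented coded data: case ID -> set of brand codes.
TMBA = {101: {1}, 102: {1}}
UOBA = {101: {2, 3, 4}, 102: {2, 3, 4, 5, 6}}
UBA = {101: {1, 2, 3, 4}, 102: {1, 2, 3, 4, 5}}  # case 102 has lost code 6

# Sort the non-zero results so the offending case IDs are listed together.
failures = sorted((diff, cid)
                  for cid, diff in checksum(UBA, TMBA, UOBA).items()
                  if diff != 0)
```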
Zero Sum and Wildly Off
This sort of test can be refined by adding a column to test the last value against the average of the prior N, including codes, etc.
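A test of the last value against the average of the prior N might be sketched as follows in Python. The tolerance is an arbitrary threshold chosen for the example, and `wildly_off` is a hypothetical name.

```python
def wildly_off(series, n, tolerance):
    """True if the last value differs from the mean of the prior n by more
    than `tolerance` (in the same units as the series)."""
    if len(series) < n + 1:
        raise ValueError("need at least n + 1 points")
    prior = series[-n - 1:-1]
    baseline = sum(prior) / n
    return abs(series[-1] - baseline) > tolerance


# Invented series: four stable periods, then a suspicious jump.
suspect = wildly_off([20, 21, 19, 20, 60], n=4, tolerance=10)
steady = wildly_off([20, 21, 19, 20, 22], n=4, tolerance=10)
```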
Smooth Chronology
Chronological Disaster Zone
Never Trust a Weighting Algorithm
Reporting
• Free or Locked axes will now either accept or reject the new brand codes for the affected measures, according to prior decisions
• If ultimately to PowerPoint, then check destination slides
• Consider if new reports need to be created
Reporting is the black hole in MR – find a way to automate it
For the Future
• Improvements in PC capacity for deep data mining looking for anomalies, errors, especially internet collection
• Natural language processing for coding verbatims
• With public Office 2007 XML file formats, much tighter and far faster integration by writing directly to Office07 files
• Increasing adoption of standards (eg SSS)
In Conclusion
• Successfully automating CT requires the right work practices in concert with software adequate to the task
• This PowerPoint file and the short and long versions of this paper are at www.redcentresoftware.com/public
• While I address any questions, please peruse the Ten Commandments of CT
The 10 Commandments of CT
Thou shalt not:
1. Change the definition of a code, for a code, once defined, is immutable beneath the heavens above and upon the earth below, for all eternity.
2. Reuse an existing variable for a different question, or remove a variable from a job, lest your analysts be driven mad with chaos and confusion.
3. Neglect to collect the date against each case, for an undated case is an abomination, useless to man or beast.
4. Fail to correctly label a variable, for an incorrect label is a despicable lie which will blight your days and sow the seeds of anguish throughout the land.
5. Change the questionnaire without due consultation with the spec writers, for such cavalier disregard for the faithful servants of DP will result in doom and despair when presentation deadlines pass unfulfilled, and the righteous anger of the Client is poured down upon thy head.
6. Change the field supplier without due consultation with the spec writers, for there are among us many suppliers of low repute, who delight in delivering shoddy data for a high price, and whose ways are full of evil, and who care not for the sacred principles of data integrity and meaningful analysis.
7. Endure a job to persist in a messy state, for a messy job is a defilement which will besmirch all who come near with filth, filling even the stoutest heart with fear.
8. Neglect to run a full update at the completion of each interviewing cycle, for such sloth will result in errors, and errors upon errors, and errors upon errors upon errors, proliferating undiscovered until the job becomes unto a plague of misinformation and deceit.
9. Use an atomising file format, such as the hateful *.SAV, lest your job spawn variables numbering the grains of sand or the stars above.
10. Disdain to run diagnostics at each update, for such hubris will sicken the job unto death when it is found that the Client has been presented meaningless garbage for all the years preceding.