Implementing Legacy Statistical Algorithms in a Spreadsheet Environment
Stephen W. Liddle, Information Systems Faculty, Rollins eBusiness Center
John S. Lawson, Department of Statistics
Brigham Young University, Provo, UT 84602
Slide 2
Overview
- Introduction
- Fundamentals of VBA in Excel
- Retargeting traditional algorithms to a spreadsheet environment
- Converting FORTRAN to VBA
- Conclusions
Slide 3
Why Convert FORTRAN Programs to Run in a Spreadsheet Environment?
- Useful code is available that is not implemented in standard statistical packages
- FORTRAN compilers are not usually available on a normal Windows workstation
- Many textbooks refer to published FORTRAN algorithms
Slide 4
Sources for Published FORTRAN Algorithms
- STATLIB (http://lib.stat.cmu.edu/)
  - General Archive
  - Applied Statistics Archive
  - Journal of Quality Technology Archive
  - JASA Software Archive
  - JCGS Archive
Slide 5
Advantages of Running Legacy FORTRAN Code in Excel
- Comfortable environment for practitioners
- More user-friendly input from the spreadsheet
- Output to the spreadsheet allows further graphical and computational analysis of results with Excel functions
Slide 14
Proposed Methodology
- Understand the original FORTRAN program
- Choose suitable I/O methods
- Convert the original FORTRAN code to VBA
- Test and use the resulting Excel code
Slide 15
Visual Basic for Applications
- Built on ANSI BASIC
- Language engine of Microsoft Office
- Modern structured programming language
- Has a vast array of types, functions, and programming aids
- Powerful support environment (Office platform)
- Popular in business contexts
Slide 16
Excel Object Model
- Objects in Excel are addressable in VBA
- Each object has:
  - Properties
  - Methods
[Diagram of object hierarchy: Application, Workbooks (Workbook), Worksheets (Worksheet), Range, Chart]
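As a minimal sketch of how properties and methods are reached through this hierarchy, the macro below walks from Application down to a Range. The sheet index and the cell addresses are illustrative assumptions and do not come from the slides.

```vba
Option Explicit

Sub ObjectModelDemo()
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim rng As Range

    ' Walk down the hierarchy: Application -> Workbook -> Worksheet -> Range
    Set wb = Application.ActiveWorkbook
    Set ws = wb.Worksheets(1)          ' assumed: first sheet is a worksheet
    Set rng = ws.Range("A1:A3")        ' assumed: cells that are safe to overwrite

    rng.ClearContents                  ' calling a method
    rng.Value = 1.5                    ' setting a property (fills all three cells)
    rng.Font.Bold = True               ' setting a property on a nested object
    Debug.Print ws.Name, rng.Address   ' reading properties
End Sub
```

Running the macro from the VBA editor prints the sheet name and the range address to the Immediate window.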
Slide 17
Clicking these buttons runs the ORPS1 and ORPS2 algorithms.
[Screenshot labels: Input Region, Output Region]
Input/Output Methods
- Non-interactive
  - Files, databases
  - Worksheet cells
- Interactive
  - Message boxes
  - Input boxes
  - Custom GUI forms
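The sketch below combines several of these methods: an input box for an interactive parameter, worksheet cells for non-interactive input and output, and a message box for the result. The cell layout (values in column A, result in C1) is an assumption made for illustration and is not the layout used by the ORPS workbook.

```vba
Option Explicit

Sub WorksheetIoDemo()
    Dim ws As Worksheet
    Dim answer As String
    Dim n As Long, i As Long
    Dim total As Double

    Set ws = ActiveWorkbook.Worksheets(1)   ' assumed input/output sheet

    ' Interactive input: InputBox returns "" if the user cancels
    answer = InputBox("How many values in column A should be summed?", "Input", "5")
    If Not IsNumeric(answer) Then Exit Sub
    n = CLng(answer)

    ' Non-interactive input: read numeric cells A1..An (empty cells count as 0)
    For i = 1 To n
        total = total + ws.Cells(i, 1).Value
    Next i

    ' Output to a worksheet cell and to a message box
    ws.Range("C1").Value = total
    MsgBox "Sum of " & n & " values: " & total
End Sub
```

Writing results back to cells rather than only to a message box keeps them available for charts and further Excel formulas.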
More Operators
FORTRAN   VBA
.EQ.      =
.NE.      <>
.LT.      <
.LE.      <=
.GT.      >
.GE.      >=
.AND.     And
.OR.      Or
.NOT.     Not
//        &
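A short sketch of how these mappings look in practice follows. The FORTRAN lines in the comments and the variable names are invented for illustration.

```vba
Option Explicit

Sub OperatorDemo()
    Dim x As Double
    Dim inUnitInterval As Boolean
    Dim algName As String

    x = 0.25

    ' FORTRAN: INUNIT = X.GE.0.0 .AND. .NOT.(X.GT.1.0)
    inUnitInterval = (x >= 0#) And Not (x > 1#)

    ' FORTRAN: ALGNAM = 'ORPS' // '1'
    algName = "ORPS" & "1"

    Debug.Print inUnitInterval, algName
End Sub
```

One point worth keeping in mind when translating: VBA's And and Or always evaluate both operands, so a condition that guards against an invalid second operand needs nested If statements instead.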
Slide 20
Data Types
FORTRAN             VBA
INTEGER             Byte, Integer, Long
REAL                Single
DOUBLE PRECISION    Double
COMPLEX             Non-primitive in VBA
LOGICAL             Boolean
CHARACTER           String
CHARACTER*length    String*length
Other notable VBA types: Currency, Decimal, Date, Variant
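The declarations below sketch one way to apply this table. The user-defined type for COMPLEX and all variable names are assumptions for illustration; the slides only state that COMPLEX is non-primitive in VBA.

```vba
Option Explicit

' COMPLEX has no primitive counterpart; a user-defined type is one option.
Private Type ComplexNumber
    RealPart As Double
    ImagPart As Double
End Type

Sub DataTypeDemo()
    Dim n As Long              ' INTEGER (Long is 32-bit; VBA Integer is only 16-bit)
    Dim x As Single            ' REAL
    Dim d As Double            ' DOUBLE PRECISION
    Dim ok As Boolean          ' LOGICAL
    Dim code As String * 8     ' CHARACTER*8: fixed length, padded with spaces
    Dim z As ComplexNumber     ' COMPLEX

    n = 100000
    x = 1.5
    d = 1# / 3#
    ok = (n > 0)
    code = "ORPS1"
    z.RealPart = 1#
    z.ImagPart = -2#

    Debug.Print n, x, d, ok, "[" & code & "]", z.RealPart, z.ImagPart
End Sub
```

Because VBA's Integer holds only 16 bits, Long is usually the safer target for a FORTRAN INTEGER.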
Slide 21
Worksheet Functions
- ChiDist(x, deg_freedom): Returns the one-tailed probability of the χ² distribution.
- Correl(array1, array2): Returns the correlation coefficient of two cell ranges.
- Fisher(x): Returns the Fisher transformation at a given x.
- Pearson(array1, array2): Returns the Pearson product-moment correlation coefficient for two sets.
- Quartile(array, quart): Returns the requested quartile of a data set.
- StDev(array): Returns the standard deviation of a data set.
- ZTest(array, x, sigma): Returns the one-tailed P-value of a z-test.
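From VBA these functions are reached through Application.WorksheetFunction. The following is a small sketch; the sample data are made up purely for illustration.

```vba
Option Explicit

Sub WorksheetFunctionDemo()
    Dim wf As WorksheetFunction
    Dim xs As Variant, ys As Variant

    Set wf = Application.WorksheetFunction

    ' Made-up sample data, for illustration only
    xs = Array(1#, 2#, 3#, 4#, 5#)
    ys = Array(2.1, 3.9, 6.2, 8.1, 9.8)

    Debug.Print "Upper-tail chi-square probability:", wf.ChiDist(3.84, 1)
    Debug.Print "Correlation:", wf.Correl(xs, ys)
    Debug.Print "Fisher transform of 0.5:", wf.Fisher(0.5)
    Debug.Print "Second quartile (median):", wf.Quartile(xs, 2)
    Debug.Print "Sample standard deviation:", wf.StDev(xs)
End Sub
```

The same calls accept Range objects in place of the arrays, which is convenient when the data already sit in an input region of the worksheet.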
Slide 22
Flow-Control Statements
Logical if
    FORTRAN:  IF (expr) stmt
    VBA:      If expr Then stmt
Block if
    FORTRAN:
        IF (expr1) THEN
            stmt1
        ELSE IF (expr2) THEN
            stmt2
        ELSE
            stmtn
        END IF
    VBA:
        If expr1 Then
            stmt1
        ElseIf expr2 Then
            stmt2
        Else
            stmtn
        End If
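As a concrete instance of the block-if mapping, here is a small function with its FORTRAN counterpart in the comments. Both the FORTRAN fragment and the function name are invented for illustration.

```vba
Option Explicit

Function SignOf(ByVal x As Double) As Long
    ' FORTRAN counterpart (illustrative):
    '       IF (X .GT. 0.0) THEN
    '          ISGN = 1
    '       ELSE IF (X .LT. 0.0) THEN
    '          ISGN = -1
    '       ELSE
    '          ISGN = 0
    '       END IF
    If x > 0# Then
        SignOf = 1
    ElseIf x < 0# Then
        SignOf = -1
    Else
        SignOf = 0
    End If
End Function
```

Typing `? SignOf(-3.2)` in the Immediate window prints -1.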
Slide 23
Subtle Differences (Gotchas)
- Implicit conversion of real to integer values
  - FORTRAN: truncate
  - VBA: round
  - Solution: use VBA's Fix(), which truncates
- Both languages allow implicit typing
  - This introduces ambiguity
  - Solution: supply explicit types everywhere
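Both points can be seen in a few lines. This is a minimal sketch with illustrative variable names; Option Explicit is VBA's way of rejecting undeclared variables, and each Dim carries an explicit type.

```vba
Option Explicit   ' undeclared variables become compile errors

Sub TruncationDemo()
    Dim x As Double
    Dim rounded As Long
    Dim truncated As Long

    x = 2.7
    rounded = x            ' implicit conversion rounds: 3
    truncated = Fix(x)     ' Fix truncates toward zero: 2
    Debug.Print rounded, truncated

    x = -2.7
    rounded = x            ' implicit conversion rounds: -3
    truncated = Fix(x)     ' Fix truncates toward zero: -2 (Int(x) would give -3)
    Debug.Print rounded, truncated
End Sub
```

Fix, not Int, is the right match for FORTRAN's truncation because Int rounds negative values down rather than toward zero.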
Slide 24
Eliminating Goto Statements
- In computer science, goto is generally considered harmful
- We advocate rewriting algorithms to use structured programming techniques where feasible
- The sine qua non is to make it work
- Moving to structured form is a good idea for maintainability and understandability
Slide 25
Eliminating Goto Statements
      DO 8 J=1,3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 IF(J.EQ.3) GO TO 8
      XK=BESTK-STEP
    8 CONTINUE
Slide 26
Eliminating Goto Statements
      For j=1 To 3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 IF(J.EQ.3) GO TO 8
      XK=BESTK-STEP
    8 Next j
Slide 27
Eliminating Goto Statements
      For j=1 To 3
      ...
    6 ...
      IF(OBJFN.GT.BESTFN) GO TO 7
      ...
      GO TO 6
    7 If j <> 3 Then
          xk = bestk - step
      End If
      Next j
Slide 28
Eliminating Goto Statements
      For j = 1 To 3
          ...
          Do Until objfn > bestfn
              ...
          Loop
          If j <> 3 Then
              xk = bestk - step
          End If
      Next j
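To show the structured form running end to end, here is a sketch that fills the elided parts with stand-ins. The objective function, the starting values, and the step refinement between passes are all hypothetical and are not taken from the original program; only the loop skeleton follows the slides. Because the original test sits in the middle of the loop body, this sketch uses Do ... Loop with Exit Do rather than a test at the top.

```vba
Option Explicit

' Hypothetical objective function, present only to make the sketch runnable.
Private Function Objective(ByVal k As Double) As Double
    Objective = (k - 2#) ^ 2
End Function

Sub StructuredSearchDemo()
    Dim j As Long
    Dim xk As Double, bestk As Double, stepSize As Double
    Dim objfn As Double, bestfn As Double

    xk = 0#          ' hypothetical starting point
    stepSize = 1#    ' hypothetical initial step

    For j = 1 To 3
        bestk = xk
        bestfn = Objective(bestk)

        ' Replaces label 6 and "GO TO 6": step until the objective gets worse
        Do
            xk = xk + stepSize
            objfn = Objective(xk)
            If objfn > bestfn Then Exit Do   ' was: IF(OBJFN.GT.BESTFN) GO TO 7
            bestk = xk
            bestfn = objfn
        Loop

        ' Replaces label 7: back up one step, except on the last pass
        If j <> 3 Then
            xk = bestk - stepSize
        End If

        stepSize = stepSize / 10#   ' hypothetical refinement between passes
    Next j

    Debug.Print bestk, bestfn
End Sub
```

Testing the translated routine against the original FORTRAN output on the same inputs is the simplest check that removing the gotos did not change the behavior.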
Slide 29
Our Reasoning
- Digital assets are fragile
- FORTRAN is not universally available
- Excel is a ubiquitous, powerful platform
- VBA is a full-featured language capable of handling sophisticated statistical computations
Slide 30
Conclusions
- We recommend creating a Web-based repository of Excel/VBA implementations of classic statistical algorithms
- We can preserve our legacy algorithms in this modern spreadsheet environment
- E-mail us if you want a copy of our manuscript (liddle or [email protected])