Implementing Legacy Statistical Algorithms in a Spreadsheet Environment
Stephen W. Liddle, Information Systems Faculty, Rollins eBusiness Center
John S. Lawson, Department of Statistics
Brigham Young University, Provo, UT 84602

Dec 25, 2015

Transcript
  • Slide 1
  • Implementing Legacy Statistical Algorithms in a Spreadsheet Environment Stephen W. Liddle Information Systems Faculty Rollins eBusiness Center John S. Lawson Department of Statistics Brigham Young University Provo, UT 84602
  • Slide 2
  • Overview
      • Introduction
      • Fundamentals of VBA in Excel
      • Retargeting traditional algorithms to a spreadsheet environment
      • Converting FORTRAN to VBA
      • Conclusions
  • Slide 3
  • Why Convert FORTRAN Programs to Run in a Spreadsheet Environment?
      • Useful code is available that is not implemented in standard statistical packages
      • FORTRAN compilers are not usually available on a normal Windows workstation
      • Many textbooks refer to published FORTRAN algorithms
  • Slide 4
  • Sources for Published FORTRAN Algorithms
      • STATLIB (http://lib.stat.cmu.edu/)
          • General Archive
          • Applied Statistics Archive
          • Journal of Quality Technology Archive
          • JASA Software Archive
          • JCGS Archive
  • Slide 5
  • Advantages of Running Legacy FORTRAN Code in Excel
      • Comfortable environment for practitioners
      • More user-friendly input from the spreadsheet
      • Output to the spreadsheet allows further graphical and computational analysis of results with Excel functions
  • Slide 6
  • Slide 7
  • Slide 8
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Slide 14
  • Proposed Methodology
      • Understand the original FORTRAN program
      • Choose suitable I/O methods
      • Convert the original FORTRAN code to VBA
      • Test and use the resulting Excel code
  • Slide 15
  • Visual Basic For Applications
      • Built on ANSI BASIC
      • Language engine of Microsoft Office
      • Modern structured programming language
      • Has a vast array of types, functions, and programming aids
      • Powerful support environment (Office platform)
      • Popular in business contexts
  • Slide 16
  • Excel Object Model
      • Objects in Excel are addressable in VBA
      • Each object has:
          • Properties
          • Methods
      • Objects shown in the diagram: Application, Workbooks (Workbook), Worksheets (Worksheet), Range, Chart
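A minimal sketch of walking the object model from VBA, assuming the code lives in a standard module of an open workbook. The cell address A1 and the procedure name are illustrative choices, not taken from the slides.

```vba
Option Explicit

' Walks from the Application object down to a single cell, then uses
' one property and one method on that cell.
Sub DemoObjectModel()
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim cell As Range

    Set wb = Application.ThisWorkbook   ' workbook that contains this code
    Set ws = wb.Worksheets(1)           ' first worksheet in that workbook
    Set cell = ws.Range("A1")           ' hypothetical cell, for illustration only

    cell.Value = 3.14                   ' Value is a property
    Debug.Print ws.Name, cell.Address   ' Name and Address are properties
    cell.ClearContents                  ' ClearContents is a method
End Sub
```

Running the procedure overwrites and then clears cell A1 on the first worksheet, so it is best tried in a scratch workbook.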
  • Slide 17
  • Input/Output Methods
      • Non-interactive
          • Files, databases
          • Worksheet cells
      • Interactive
          • Message boxes
          • Input boxes
          • Custom GUI forms
      • Screenshot callouts: "Clicking these buttons runs the ORPS1 and ORPS2 algorithms."; Input Region; Output Region
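The two I/O styles listed above might look roughly like this in VBA. This is a sketch only: the cell addresses B2 and D2, the procedure names, and the squaring "computation" are hypothetical stand-ins, not the ORPS1/ORPS2 code.

```vba
Option Explicit

' Non-interactive style: read an input cell, write an output cell.
Sub RunWithWorksheetCells()
    Dim ws As Worksheet
    Dim x As Double

    Set ws = ThisWorkbook.Worksheets(1)
    If Not IsNumeric(ws.Range("B2").Value) Then Exit Sub  ' hypothetical input cell
    x = ws.Range("B2").Value
    ws.Range("D2").Value = x * x                          ' hypothetical output cell
End Sub

' Interactive style: prompt with an input box, report with a message box.
Sub RunInteractively()
    Dim answer As String

    answer = InputBox("Enter a number:", "Input")
    If Len(answer) = 0 Then Exit Sub          ' user cancelled or left it blank
    If Not IsNumeric(answer) Then
        MsgBox "That is not a number."
        Exit Sub
    End If
    MsgBox "Square = " & CDbl(answer) ^ 2
End Sub
```

Either procedure can be attached to a worksheet button, which is one way to get the click-to-run behaviour mentioned in the screenshot callout.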
  • Slide 18
  • FORTRAN vs. VBA
      • VBA: (-b+Sqr(b^2-4*a*c))/(2*a)
      • FORTRAN: (-b+SQRT(b**2-4*a*c))/(2*a)
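For context, the VBA expression above can be wrapped in a small function. The function name and the error handling are assumptions added for illustration; only the formula itself comes from the slide.

```vba
Option Explicit

' Returns the "+" root of a*x^2 + b*x + c = 0 using the slide's expression.
' Raises run-time error 5 when a is zero or the roots are not real.
Function QuadraticPlusRoot(ByVal a As Double, ByVal b As Double, _
                           ByVal c As Double) As Double
    Dim disc As Double

    disc = b ^ 2 - 4 * a * c              ' ^ is VBA's counterpart of FORTRAN **
    If a = 0 Or disc < 0 Then
        Err.Raise 5, "QuadraticPlusRoot", "No real root for these coefficients"
    End If
    QuadraticPlusRoot = (-b + Sqr(disc)) / (2 * a)   ' Sqr replaces SQRT
End Function
```

For example, QuadraticPlusRoot(1, -3, 2) evaluates (3 + 1) / 2 and returns 2.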
  • Slide 19
  • More Operators (FORTRAN : VBA)
      • .EQ. : =
      • .NE. : <>
      • .LT. : <
      • .LE. : <=
      • .GT. : >
      • .GE. : >=
      • .AND. : And
      • .OR. : Or
      • .NOT. : Not
      • // : &
  • Slide 20
  • Data Types (FORTRAN : VBA)
      • INTEGER : Byte, Integer, Long
      • REAL : Single
      • DOUBLE PRECISION : Double
      • COMPLEX : Non-primitive in VBA
      • LOGICAL : Boolean
      • CHARACTER : String
      • CHARACTER*length : String*length
      • Other notable VBA types: Currency, Decimal, Date, Variant
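A sketch of how these mappings could appear in declarations. Because COMPLEX has no primitive counterpart, one option is a user-defined type; the type name Complex, its field names, and the helper function are all assumptions made for this example.

```vba
Option Explicit

' One possible stand-in for FORTRAN's COMPLEX (hypothetical type).
Public Type Complex
    Re As Double
    Im As Double
End Type

' Multiplies two complex numbers: (a+bi)(c+di) = (ac-bd) + (ad+bc)i.
Function ComplexMultiply(x As Complex, y As Complex) As Complex
    Dim result As Complex

    result.Re = x.Re * y.Re - x.Im * y.Im
    result.Im = x.Re * y.Im + x.Im * y.Re
    ComplexMultiply = result
End Function

Sub DeclareTranslatedTypes()
    Dim n As Long            ' INTEGER (VBA Integer is only 16 bits; Long is 32)
    Dim r As Single          ' REAL
    Dim d As Double          ' DOUBLE PRECISION
    Dim flag As Boolean      ' LOGICAL
    Dim label As String      ' CHARACTER
    Dim code As String * 8   ' CHARACTER*8, a fixed-length string
    Dim z As Complex         ' COMPLEX via the user-defined type above

    z.Re = 1: z.Im = 2
    z = ComplexMultiply(z, z)      ' (1+2i)^2 = -3 + 4i
    Debug.Print z.Re, z.Im
End Sub
```

Choosing Long rather than Integer for FORTRAN INTEGER variables is a judgment call that avoids overflow once values exceed 32,767.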
  • Slide 21
  • Worksheet Functions
      • ChiDist(x,deg_freedom): Returns the one-tailed probability of the chi-squared distribution.
      • Correl(array1,array2): Returns the correlation coefficient of two cell ranges.
      • Fisher(x): Returns the Fisher transformation at a given x.
      • Pearson(array1,array2): Returns the Pearson product moment correlation coefficient for two sets.
      • Quartile(array,quart): Returns the requested quartile of a data set.
      • StDev(array): Returns the standard deviation of a data set.
      • ZTest(array,x,sigma): Returns the one-tailed P-value of a z-test.
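These functions are reachable from VBA through Application.WorksheetFunction. The sketch below assumes a standard module and uses a hypothetical data range, A1:A10 on the first worksheet.

```vba
Option Explicit

' Calls two of the worksheet functions listed above from VBA code.
Sub DemoWorksheetFunctions()
    Dim p As Double
    Dim data As Range

    ' Upper-tail chi-squared probability for x = 3.84 with 1 degree of freedom
    ' (roughly 0.05).
    p = Application.WorksheetFunction.ChiDist(3.84, 1)
    Debug.Print "ChiDist(3.84, 1) = "; p

    ' Hypothetical data range; StDev raises a run-time error if the range
    ' holds fewer than two numeric values.
    Set data = ThisWorkbook.Worksheets(1).Range("A1:A10")
    Debug.Print "StDev = "; Application.WorksheetFunction.StDev(data)
End Sub
```

Reusing built-in functions this way can save translating the corresponding FORTRAN support routines by hand.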
  • Slide 22
  • Flow-Control Statements (FORTRAN vs. VBA)
      • Logical if
          • FORTRAN: IF (expr) stmt
          • VBA: If expr Then stmt
      • Block if
          • FORTRAN:
                IF (expr1) THEN
                    stmt1
                ELSE IF (expr2) THEN
                    stmt2
                ELSE
                    stmtn
                END IF
          • VBA:
                If expr1 Then
                    stmt1
                ElseIf expr2 Then
                    stmt2
                Else
                    stmtn
                End If
  • Slide 23
  • Subtle Differences (Gotchas)
      • Implicit conversion of real to integer values
          • FORTRAN: truncate
          • VBA: round
          • Solution: use VBA's Fix(), which truncates
      • Both languages allow implicit typing
          • This introduces ambiguity
          • Solution: supply explicit types everywhere
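The rounding gotcha is easy to reproduce. The sketch below uses a hypothetical helper name; Option Explicit at the top is the usual way to enforce the "explicit types everywhere" advice, since it makes undeclared variables a compile error.

```vba
Option Explicit

' Mimics FORTRAN's truncating real-to-integer conversion.
Function TruncateToLong(ByVal x As Double) As Long
    TruncateToLong = Fix(x)   ' Fix drops the fractional part (toward zero)
End Function

Sub DemoRealToInteger()
    Dim n As Long

    n = 2.7: Debug.Print n                    ' implicit conversion rounds: 3
    n = 2.5: Debug.Print n                    ' ties go to the even neighbour: 2
    n = TruncateToLong(2.7): Debug.Print n    ' truncates like FORTRAN: 2
    n = TruncateToLong(-2.7): Debug.Print n   ' toward zero: -2
    Debug.Print Int(-2.7)                     ' Int rounds down instead: -3
End Sub
```

Note that Int() is not a substitute for Fix() when negative values can occur, because it rounds toward negative infinity.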
  • Slide 24
  • Eliminating Goto Statements
      • Computer science accepts the axiom that goto is generally considered harmful
      • We advocate rewriting algorithms to use structured programming techniques where feasible
      • The sine qua non is "make it work"
      • It's a good idea for maintainability and understandability to move to structured form
  • Slide 25
  • Eliminating Goto Statements
            DO 8 J=1,3
            ...
          6 ...
            IF(OBJFN.GT.BESTFN) GO TO 7
            ...
            GO TO 6
          7 IF(J.EQ.3) GO TO 8
            XK=BESTK-STEP
          8 CONTINUE
  • Slide 26
  • Eliminating Goto Statements
            For j=1 To 3
            ...
          6 ...
            IF(OBJFN.GT.BESTFN) GO TO 7
            ...
            GO TO 6
          7 IF(J.EQ.3) GO TO 8
            XK=BESTK-STEP
          8 Next j
  • Slide 27
  • Eliminating Goto Statements
            For j=1 To 3
            ...
          6 ...
            IF(OBJFN.GT.BESTFN) GO TO 7
            ...
            GO TO 6
          7 If j <> 3 Then
                xk = bestk - step
            End If
            Next j
  • Slide 28
  • Eliminating Goto Statements
            For j=1 To 3
                ...
                Do Until objfn > bestfn
                    ...
                Loop
                If j <> 3 Then
                    xk = bestk - step
                End If
            Next j
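When statements sit both before and after the exit test, as in the FORTRAN pattern "6 ... IF (cond) GO TO 7 ... GO TO 6", one alternative to Do Until is a Do loop with Exit Do, which keeps the test in the middle of the body. The sketch below is a toy illustration with a made-up computation; it is not the slides' algorithm.

```vba
Option Explicit

' Adds 1, 2, 3, ... and returns how many terms were needed before the
' running total first exceeded limit. The exit test sits mid-loop.
Function StepsToExceed(ByVal limit As Double) As Long
    Dim total As Double
    Dim n As Long

    Do
        n = n + 1                        ' statements before the test
        total = total + n
        If total > limit Then Exit Do    ' plays the role of GO TO 7
        ' statements after the test would go here; they are skipped on exit
    Loop                                 ' plays the role of GO TO 6
    StepsToExceed = n
End Function
```

For example, StepsToExceed(10) returns 5, because the running totals are 1, 3, 6, 10, 15 and 15 is the first to exceed 10.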
  • Slide 29
  • Our Reasoning
      • Digital assets are fragile
      • FORTRAN is not universally available
      • Excel is a ubiquitous, powerful platform
      • VBA is a full-featured language capable of handling sophisticated statistical computations
  • Slide 30
  • Conclusions
      • We recommend creating a Web-based repository of Excel/VBA implementations of classic statistical algorithms
      • We can preserve our legacy algorithms in this modern spreadsheet environment
      • E-mail us if you want a copy of our manuscript (liddle or [email protected])