Introduction · After Math · Practices · Last Words
Good Practices for Mathematical Software
Sou-Cheng (Terrya) Choi
NORC at UChicago, IIT
www.iit.edu/~schoi32
Meshfree Methods Seminar
Dept. of Applied Mathematics, IIT
Thanks: GAIL Team, MATH573 (Fall 2013)
May 21, 2014
Sou-Cheng Terrya Choi [email protected] Good Practices for Mathematical Software May 21, 2014 1 / 33
About GAIL: words from the CEO, May 6, 2013
- Before GAIL: Automatic numerical integration algorithms have inherent flaws in their error estimation based on balls of integrands.
- GAIL overcomes the flaws by considering cones of integrands. This allows us to construct upper bounds on costs of our integration routines with rigorous guarantees of accuracy and develop algorithms that provide the value of the integral with an error of no more than the user-defined tolerance.
- Mission (possible): To create a well-documented and well-tested library of univariate & multivariate integration routines that have rigorous guarantees
- GAIL version 1: By the end of the summer we hope to have our automatic routines for function recovery, univariate integration, and Monte Carlo estimation of mean on the GAIL site in good form, meaning that these routines need to be
  - well-documented
  - well-tested
  - optimized for speed
  - accompanied by examples
  - in a repository where they can be modified and re-tested as needed
- Later we will improve and add to these routines.
- First-year milestones
  - Sep 3, 2013: Release of GAIL version 1
  - Sep 4, 2013: Development of GAIL version 1.3 commenced
  - Fall 2013: MATH 573 Seminar/Elective “Reliable Mathematical Software” (Instructors: Fred Hickernell & C. Students: 7 registered, 2 regular sit-in)
  - ♥ Feb 14, 2014: Release of GAIL version 1.3
  - Feb 15: Development of GAIL version 2.0 commenced
- Targets for the next few years
  - July 9–10: SIAM Annual Meeting, Minisymposium on “Reliable Mathematical Software”
  - Labor Day, Sep 1: Release of GAIL version 2
  - Feb 2015: Release of GAIL version 2.5
  - Sep 2015: Release of GAIL version 3
  - TBD: Apply for a research grant for GAIL
Recap · Reproducible Research · Reliable Reproducible Research and Staunch Scientific Software
A “disintegrating” integral
[Figure: spiky integrand f on [0, 1] plotted together with the sampled data points; legend: f, data.]
Spiky f with $I = \int_0^1 f(x)\,dx \approx 0.3694$.
quad(f,[0,1],1e-14) = 0 giving error = I.
Strategy: ↑ number of points to ↓ error
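The failure mode on this slide can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: it uses Python rather than MATLAB, a fixed-node composite Simpson rule rather than MATLAB's adaptive quad, and a hypothetical narrow Gaussian spike rather than the f plotted on the slide. The names composite_simpson, spike, CENTER, and WIDTH are made up for this example.

```python
import math

def composite_simpson(f, a, b, n):
    """Composite Simpson's rule on n equal panels (n must be even)."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hypothetical spiky integrand (NOT the f from the slide): a narrow
# Gaussian bump centred halfway between two nodes of the coarse grid.
CENTER, WIDTH = 0.45, 0.005

def spike(x):
    return math.exp(-((x - CENTER) / WIDTH) ** 2)

# Closed form of the integral of spike over [0, 1].
exact = WIDTH * math.sqrt(math.pi) / 2 * (
    math.erf((1 - CENTER) / WIDTH) - math.erf((0 - CENTER) / WIDTH)
)

coarse = composite_simpson(spike, 0.0, 1.0, 10)      # 11 nodes, all miss the spike
fine = composite_simpson(spike, 0.0, 1.0, 10_000)    # spike is well resolved

print(f"exact  = {exact:.8f}")
print(f"coarse = {coarse:.3e}")
print(f"fine   = {fine:.8f}")
```

With the coarse grid every node falls where the spike is numerically zero, so the rule reports essentially 0 and the error is the whole integral; adding points fixes it, but nothing in the rule itself says how many points are enough.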
Q: How many points (or function evaluations), n, are required in a quadrature rule to guarantee that a given error tolerance, ε, is met with a confidence level 1 − α?
A: Clancy et al. 2013 [CDH+14b], Hickernell et al. 2014 [HJLO14], GAIL 1.0 [CDH+13], GAIL 1.3 [CDH+14a]
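For intuition about what such a guarantee looks like in the simplest Monte Carlo setting, here is a textbook bound, not the algorithm of the cited works. Assume the integral is estimated by the mean of n IID samples and that an upper bound σ² on the variance is known in advance (an assumption that real problems rarely grant). Chebyshev's inequality gives P(|mean − I| ≥ ε) ≤ σ²/(n ε²), so n ≥ σ²/(α ε²) is sufficient. The function name chebyshev_sample_size is hypothetical, and Python stands in for MATLAB.

```python
import math

def chebyshev_sample_size(variance_bound, eps, alpha):
    """Return an n with variance_bound / (n * eps**2) <= alpha.

    Sufficient (often very conservative) sample size from Chebyshev's
    inequality; assumes IID samples and a KNOWN variance bound.
    """
    if variance_bound < 0 or eps <= 0 or not 0 < alpha < 1:
        raise ValueError("need variance_bound >= 0, eps > 0, 0 < alpha < 1")
    return max(1, math.ceil(variance_bound / (alpha * eps ** 2)))

# Any integrand with values in [0, 1] has variance at most 1/4.
n = chebyshev_sample_size(variance_bound=0.25, eps=0.01, alpha=0.05)
print(n)  # on the order of 50,000 samples
```

Halving ε quadruples n, while shrinking α raises n only in proportion to 1/α; the bound is loose, but it holds without any normality assumption.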
What is reproducible research (RR)?
- Claerbout, “The markings [ER], [CR], and [NR] are promises by the author(s) about the reproducibility of each figure result.” (url: j.mp/VM7Xq4)
  - ER: Easily reproducible: “programs, parameters, and makefiles . . . data”
  - CR: Conditionally reproducible: “processing requires 20 minutes or more, or commercial packages”
  - NR: Not reproducible: drawings
  - M: Movie “in a figure”; ER, CR, or NR
- Donoho paraphrasing Claerbout, “an article about computational result is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result.” (Buckheit & Donoho 1995)
Bernd Flemisch’s survey results (Aug 2013)
Online Survey: Reproducibility in Computational Science and Engineering (CSE)
13 questions on opinions and experiences concerning the reproducibility of computational results. Results collected on August 1.
- Direct Emails: Call went out on July 5 to ∼ 500 addresses. Resulted in ∼ 80 answers.
- InterPore Newsletter: Newsletter was sent out on July 6 to ∼ 1000 addresses. Resulted in 2 answers.
- SIAM Activity Group on CSE Mailing List: Call went out on July 10 to ∼ 2000 addresses. Resulted in ∼ 300 answers.
Survey Results I (n = 385)
[Six bar charts; the bar values are not recoverable from the transcript. Panel titles and answer options:]
- I understand what the reproducibility of computational results means: Yes / No
- I consider the reproducibility of computational results to be ...: scale from 5 (very important) to 1 (not important)
- The importance ... is sufficiently reflected by today’s journal policies: Yes / No / I don’t know
- The effort it would take me/others to reproduce my computational results: scale from 5 (very high) to 1 (very little), or I don’t know
- I already had problems with reproducing computational results of my own/others: Yes / No
- Compared to the average, I think that I invest ... effort in ensuring reproducibility: More / Equal / Less
Survey Results II (n = 385)
[Five bar charts; the bar values are not recoverable from the transcript. Panel titles and answer options:]
- My strategy to ensure the reproducibility of my comp. results: I don’t have/need a strategy / Detailed description in my papers / Use of version control / I make the problem data available / I make the source code available / Use of tools like Madagascar / Other
- Reasons for not devoting more effort to foster the reproducibility ...: No personal interest / No scientific interest / No requirement / No reward / No necessity / Other
- The ratio of working hours I devote to coding (including thinking and talking about coding): 81 – 100% / 61 – 80% / 41 – 60% / 21 – 40% / 0 – 20%
- My current education/position: Professor / Postdoc / PhD student / BSc/MSc student / Other
- My age in years: > 60 / 51 – 60 / 41 – 50 / 31 – 40 / 21 – 30 / ≤ 20
Survey Results: A Slightly Deeper Look
- The estimation of the effort to reproduce does not influence the estimation of the importance of reproducibility.
- The effort estimated for oneself influences the effort estimated for others, and the effort for the others is considered to be higher.
- The estimated effort to reproduce does not influence the number of employed strategy items.
- The amount of work related to coding influences the estimation of the importance, but not the number of employed strategy items.
- Age does not have an influence on the quantitative results, apart from the time devoted to coding.
[Bar chart: “The effort it would take me to reproduce my computational results from three years ago”, on a scale from 5 (very high) to 1 (very little), or I don’t know. Two groups are compared: problems reproducing own results (n = 200) and no problems reproducing own results (n = 185). Bar values are not recoverable from the transcript.]
Measuring and improving performance
MATLAB software development/analysis tools
- Matlab Profiler
- Matlab Mex
- Matlab 2013 Unit Testing Framework
- Matlab Central (url: j.mp/anwdaP)
- Matlab mex to interface with C/Fortran functions
- Parallelization with SWIFT
- Matlab reports
Sou-Cheng T. Choi, Yuhan Ding, Fred J. Hickernell, Lan Jiang, and Yizhi Zhang, GAIL: Guaranteed Automatic Integration Library (version 1), MATLAB software, 2013.
N. Clancy, Y. Ding, C. Hamilton, F. J. Hickernell, and Y. Zhang, The complexity of guaranteed automatic algorithms: Cones, not balls, Journal of Complexity 30 (2014), 21–45.
Sou-Cheng T. Choi and Fred J. Hickernell, IIT MATH-573 Reliable Mathematical Software, 2013, seminal course slides.
Sou-Cheng T. Choi, MINRES-QLP Pack and reliable reproducible research via staunch scientific software, First Workshop on Sustainable Software for Science: Practice and Experiences (Denver, Colorado, USA), 2013.
F. J. Hickernell, L. Jiang, Y. Liu, and A. B. Owen, Guaranteed conservative fixed width confidence intervals via Monte Carlo sampling, Monte Carlo and Quasi-Monte Carlo Methods 2012 (J. Dick, F. Y. Kuo, G. W. Peters, and I. H. Sloan, eds.), Springer-Verlag, Berlin, 2014, to appear, arXiv:1208.4318 [math.ST].
Daniel S. Katz, Sou-Cheng T. Choi, Hilmar Lapp, Ketan Maheshwari, Frank Loffler, Matthew Turk, Marcus Hanwell, Nancy Wilkins-Diehr, James Hetherington, James Howison, Shel Swenson, Gabrielle D. Allen, Anne C. Elster, Bruce Berriman, and Colin Venters, Summary of the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1), Technical report, 2014, submitted to the Journal of Open Research Software.