Programming Tips GS540 January 10, 2011
Page 1: Programming Tips

Programming Tips GS540

January 10, 2011

Page 2: Programming Tips

Jarrett Egertson

❖ 4th year Genome Sciences Graduate Student

❖ MacCoss Lab for Biological Mass Spectrometry

❖ Email: [email protected]

❖ Discussion Section: Thursdays 2-3 Foege S-040

❖ Office Hours: Tuesdays: 2-3 Vista Café and by Appointment

❖ Programming: C++/C#/Python

❖ Dev environment: GCC (Linux), Visual Studio (Windows)

Page 3: Programming Tips

Outline

❖ General tips / Coding style advice

❖ Performance

❖ Debugging advice

❖ Numerical issues

❖ C Programming

• Pointers

• Sorting

❖ Homework 1

Page 4: Programming Tips

General tips

❖ Validate using toy cases with a small amount of data

❖ Figure out minimal cases by hand to verify program

❖ Print intermediate output

❖ Test important functions
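One way to apply the tips above is to pair each important function with a few cases small enough to check by hand. The sketch below is only an illustration: count_base() and test_count_base() are hypothetical names, not part of any assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count occurrences of `base` in a NUL-terminated
   sequence string. */
size_t count_base(const char *seq, char base)
{
    size_t n = 0;
    for (; *seq != '\0'; ++seq) {
        if (*seq == base) {
            ++n;
        }
    }
    return n;
}

/* Toy cases small enough to verify by hand. */
void test_count_base(void)
{
    assert(count_base("", 'A') == 0);       /* empty input */
    assert(count_base("ACGTA", 'A') == 2);  /* first and last position */
    assert(count_base("CCCC", 'G') == 0);   /* base absent */
}
```

Calling test_count_base() at the start of main() during development makes a broken helper fail immediately instead of silently corrupting later results.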

Page 5: Programming Tips

Coding style advice

❖ Your audience is people as well as the computer

❖ Break large functions into small, simple functions

❖ Break large files into smaller files containing groups of related functions

❖ Use descriptive names for function arguments and variables with larger scopes (e.g. out_file, exons)

❖ Use short names for iterators, vars of limited scope, and vars that are used many times (e.g. i, j)

Page 6: Programming Tips

Performance

❖ Avoid unnecessary optimization!

- Better to write simple code first and improve speed if necessary

- Big performance gains often result from changing a few small sections of code (e.g. within loops)

❖ Move code that does not need to run on every iteration out of loops

❖ Avoid frequent memory allocation

• Allocate memory in large blocks (e.g. array of structs, not one at a time)

• Re-use the same piece of memory

❖ Avoid slow comparison routines when sorting

❖ Use a profiler for tough cases (gprof for C; dprofpp for Perl)
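Two of the points above, hoisting work out of a loop and allocating in one block, might look like the following. This is a minimal sketch; the Interval type and both function names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type used only for illustration. */
typedef struct {
    int start;
    int end;
} Interval;

/* Count 'N' characters. strlen() is called once before the loop rather
   than in the loop condition, where it could be re-evaluated on every
   iteration. */
size_t count_n(const char *seq)
{
    size_t len = strlen(seq);   /* hoisted out of the loop */
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        if (seq[i] == 'N') {
            ++n;
        }
    }
    return n;
}

/* One allocation for all records instead of one malloc() per record.
   Returns NULL on failure; the caller releases the block with free(). */
Interval *make_intervals(size_t count)
{
    Interval *block = malloc(count * sizeof *block);
    if (block == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; ++i) {
        block[i].start = (int)i * 10;       /* placeholder values */
        block[i].end = (int)i * 10 + 9;
    }
    return block;
}
```

Whether either change matters in a given program is a question for the profiler; the simple version is usually the right first draft.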

Page 7: Programming Tips

Debugging advice

❖ Use assertions

• E.g. check probabilities:

- should always be >= 0 and <= 1

- often should sum to 1.0

❖ Write slow but sure code to check optimized code

❖ In difficult cases use a debugger, but avoid overuse

❖ valgrind can help find segfaults, memleaks (compile with -g first)
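The probability checks mentioned above could be written with assert() roughly as follows. The function name and the tolerance are assumptions; pick a tolerance that suits the size of your distributions.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Assumed tolerance for the sum check; not taken from the slides. */
#define SUM_TOLERANCE 1e-9

/* Aborts with a file/line message if p is not a valid probability
   distribution. The checks disappear when compiled with -DNDEBUG. */
void check_distribution(const double *p, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        assert(p[i] >= 0.0 && p[i] <= 1.0);   /* each value in [0, 1] */
        sum += p[i];
    }
    assert(fabs(sum - 1.0) < SUM_TOLERANCE);  /* values sum to 1.0 */
}
```

Calling this right after each step that builds or updates a distribution tends to catch a bug close to where it was introduced.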

Page 8: Programming Tips

Numerical issues

❖ Consider using log space to avoid overflow and underflow

❖ Don’t test floats for exact equality (use >= or <=, or a tolerance, NOT ==)

❖ Beware subtracting large, close numbers

❖ Beware integer division and casts

• 1/2 is 0, but 1.0/2 is 0.5

❖ To generate random numbers, use random() rather than rand()
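Two of these pitfalls are easy to show in a few lines. The sketch below uses hypothetical helper names, and the relative tolerance is an assumption chosen by the caller rather than a universal constant.

```c
#include <math.h>

/* Returns 1 if a and b differ by at most rel_tol times the larger of
   their magnitudes, 0 otherwise. rel_tol is caller-chosen (e.g. 1e-9). */
int nearly_equal(double a, double b, double rel_tol)
{
    double diff = fabs(a - b);
    double mag_a = fabs(a);
    double mag_b = fabs(b);
    double scale = (mag_a > mag_b) ? mag_a : mag_b;
    return diff <= rel_tol * scale;
}

/* Both operands are int, so the division truncates toward zero. */
int int_quotient(int numerator, int denominator)
{
    return numerator / denominator;
}

/* Converting one operand to double makes the division floating point. */
double real_quotient(int numerator, int denominator)
{
    return (double)numerator / denominator;
}
```

For example, with IEEE 754 doubles the sum 0.1 + 0.2 is not exactly 0.3, so comparing the two with == is unreliable, while nearly_equal(0.1 + 0.2, 0.3, 1e-9) accepts them.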

Page 9: Programming Tips

Pointers in C

❖ Pointers are memory addresses (they point to other variables)

❖ The address-of operator (&) obtains the memory address of a variable

❖ The dereference operator (*) accesses the value stored at the pointed-to mem location
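A short sketch of both operators in use; the function names are illustrative only.

```c
/* Swap two ints through pointers to them. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;   /* read the value a points to */
    *a = *b;        /* write through a */
    *b = tmp;       /* write through b */
}

/* Shows & and * together: returns the value of x after writing
   through a pointer to it. */
int set_via_pointer(void)
{
    int x = 1;
    int *p = &x;    /* p holds the address of x */
    *p = 42;        /* modifies x itself, not a copy */
    return x;
}
```

A caller passes addresses with &, as in swap_ints(&first, &second), which is how a C function can modify its caller's variables.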

Page 10: Programming Tips

Pointers in C (cont’d)

❖ Arrays are blocks of memory; in most expressions an array name converts to a pointer to its first element

❖ Array indices are just pointer arithmetic and dereferencing combined:

• a[12] is the same as *(a+12)

• &a[3] is the same as a+3

From The C Programming Language by B. Kernighan & D. Ritchie
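The equivalence between indexing and pointer arithmetic can be seen by writing the same loop both ways. This is an illustrative sketch with made-up function names, not code from the slides or the book.

```c
#include <stddef.h>

/* Sum an array using subscripts. */
int sum_indexed(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += a[i];          /* a[i] means *(a + i) */
    }
    return total;
}

/* The same sum using pointer arithmetic and dereferencing. */
int sum_pointer(const int *a, size_t n)
{
    int total = 0;
    for (const int *p = a; p != a + n; ++p) {
        total += *p;            /* p walks from &a[0] up to a + n */
    }
    return total;
}
```

Both functions return the same result for any array; the subscript form is usually easier to read.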

Page 11: Programming Tips

Homework 1

❖ Declare large arrays in static storage or on the heap, not on the stack

• Static storage: outside main() or as static

❖ Output XML markup directly

• Avoid copy-pasting results into an XML template
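Writing the markup from the program itself might look like the sketch below. The <results> and <result> tags and the type attribute are hypothetical; substitute whatever the assignment template specifies. The sketch also assumes the values contain no characters that need XML escaping.

```c
#include <stdio.h>

/* Format one result as an XML element into buf.
   Tag and attribute names are placeholders, not the homework's format.
   Returns the value returned by snprintf(). */
int format_result(char *buf, size_t size, const char *type, long count)
{
    return snprintf(buf, size, "<result type=\"%s\">%ld</result>\n",
                    type, count);
}

/* Print a small report straight from the program's own variables. */
void print_report(FILE *out, long matches, long mismatches)
{
    char line[128];

    fputs("<results>\n", out);
    format_result(line, sizeof line, "matches", matches);
    fputs(line, out);
    format_result(line, sizeof line, "mismatches", mismatches);
    fputs(line, out);
    fputs("</results>\n", out);
}
```

A call such as print_report(stdout, matches, mismatches) at the end of main() means the submitted file always reflects the numbers the program actually computed.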

Page 12: Programming Tips

Pointers in C (cont’d)

❖ Large arrays should be dynamically allocated (on the heap)

From The C Programming Language by B. Kernighan & D. Ritchie

[Figure: code examples in C and C++]
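In C, heap allocation of a large array could be sketched as follows. The function name is an assumption, and this is not necessarily the code shown on the slide.

```c
#include <stdlib.h>

/* Allocate n zero-initialized doubles on the heap.
   Returns NULL if the allocation fails; release the block with free().

   By contrast, a local declaration such as
       double table[50000000];
   inside a function asks for roughly 400 MB of stack and is likely to
   crash on typical systems. */
double *make_table(size_t n)
{
    return calloc(n, sizeof(double));
}
```

The caller should check the result for NULL before use and call free() exactly once when the table is no longer needed.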

Page 13: Programming Tips

Pointers in C (cont’d)

❖ Members of pointed-to structures can be accessed with “arrow notation”:

• a->elem is equivalent to (*a).elem
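A small sketch of arrow notation with a struct passed by pointer; the Exon type and exon_length() are hypothetical names chosen for illustration.

```c
/* Hypothetical record type used only for illustration. */
typedef struct {
    int start;
    int end;
} Exon;

/* Length of an exon given a pointer to it.
   e->end is shorthand for (*e).end. */
int exon_length(const Exon *e)
{
    return e->end - e->start;
}
```

Passing a pointer avoids copying the whole struct on each call, which matters more as the struct grows.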

Page 14: Programming Tips

Sorting (in C)
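The slide's own example is not preserved in this transcript. A minimal sketch using the standard library's qsort(), which may differ from what was shown, could look like this; compare_ints() and sort_ints() are illustrative names.

```c
#include <stdlib.h>

/* Comparison function for qsort(): negative, zero, or positive when
   a < b, a == b, or a > b. Written as two comparisons rather than
   `a - b`, which can overflow for large values. */
static int compare_ints(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}

/* Sort an int array in ascending order, in place. */
void sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_ints);
}
```

Keeping the comparison function this cheap matters because qsort() calls it many times for each element.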

Page 15: Programming Tips

Words of wisdom

❖ "Everything should be made as simple as possible, but no simpler." -- Albert Einstein

❖ KISS principle: “Keep It Simple, Stupid”

❖ From The Zen of Python by Tim Peters:

• Beautiful is better than ugly

• Explicit is better than implicit

• Simple is better than complex

• Complex is better than complicated

• Flat is better than nested

• Sparse is better than dense

• Readability counts