Engineering Large Projects in a Functional Language: Lessons from a Decade of Haskell at Galois. Don Stewart | 2010-07-10 | DevNation PDX

Engineering Large Projects in a Functional Language




Don Stewart

Galois has been building software systems in Haskell for the past decade. This talk describes some of what we’ve learned about in-the-large, commercial Haskell programming in that time. I'll look at when and where we use Haskell; at correctness, productivity, scalability, and maintainability; and at what language features we like: types, purity, types, abstractions, types, concurrency, types!

We'll also look at the Haskell toolchain: FFI, HPC, Cabal, compiler, libraries, build systems, etc., and at being a commercial entity in a largely open source community.
Transcript
Page 1: Engineering Large Projects in a Functional Language

Engineering Large Projects in a Functional Language

Lessons from a Decade of Haskell at Galois

Don Stewart | 2010-07-10 | DevNation PDX

Page 2: Engineering Large Projects in a Functional Language

© 2010 Galois, Inc. All rights reserved.

This talk made possible by... Aaron Tomb

Adam Wick

Andy Adams-Moran

Andy Gill

David Burke

Dylan McNamee

Eric Mertens

Iavor Diatchki

Isaac Potoczny-Jones

Jef Bell

Peter White

Trevor Elliott

Phil Weaver

Jason Dagit

Jeff Lewis

Joe Hurd

Joel Stanley

John Launchbury

John Matthews

Jonathan Daugherty

Josh Hoyt

Laura McKinney

Ledah Casburn

Lee Pike

Levent Erkok

Louis Testa

Magnus Carlsson

Matt Sottile

Paul Heinlein

Rogan Creswick

Sally Browning

Sigbjorn Finne

Thomas Nordin

Brett Letner

… and many others

Page 3: Engineering Large Projects in a Functional Language


What does Galois do?

Information assurance for critical systems

Building systems that are trustworthy and secure

Mixture of government and industry clients

R&D with our favorite tools:

• Formal methods

• Typed functional languages

• Languages, compilers, DSLs

Kernels, file systems, networks, servers, compilers, security, desktop apps, ...

Haskell for pretty much everything

Page 4: Engineering Large Projects in a Functional Language


Haskell is ...

A purely functional language

Strongly statically typed

20 years old

Open source

Compiled and interpreted

Used in research, open source and industry

http://haskell.org

http://haskell.org/platform

http://hackage.haskell.org

Page 5: Engineering Large Projects in a Functional Language


Yes. Haskell can do that.

Many 20 – 200k LOC Haskell projects

Oldest commercial projects over 10 years of development now (e.g. Cryptol)

Teams of 1 – 6 developers at a time

Much pair programming, whiteboards, code reviews

20 – 30 devs over longer project lifetime

Have built many tools and libraries to support Haskell development on this scale

Haskell essential to keeping clients happy with:

• Deadlines, performance(!), maintainability

Page 6: Engineering Large Projects in a Functional Language


Themes

Page 7: Engineering Large Projects in a Functional Language


Languages matter!

Writing correct software is difficult!

Programming languages vary wildly in how well they support robust, secure, safe coding practices

Languages and tools can aid or hinder our efforts:

• Type systems

• Purity

• Modularity / compositionality

• Abstraction support

• Tools: analyses, provers, model checking

• Buggy implementations

Page 8: Engineering Large Projects in a Functional Language


Detect errors early!

Detecting problems before executing the program is critical

• Debugging is hard

• Debugging low level systems is harder

• Debugging low level critical systems is ...

Culture of error prevention

• “How could we rule out this class of errors?”

• “How could we be more precise?”

Page 9: Engineering Large Projects in a Functional Language


The toolchain matters!

Can't build anything without a good tool chain

• Native code, optimizing compiler

• Libraries, libraries, libraries

• Debugging, tracing

• Profiling, inspection, runtime analysis

• Testing, analysis

• Need open, modifiable tools

– Particularly when pushing the boundaries

(Haskell on bare metal..)

Page 10: Engineering Large Projects in a Functional Language


Community matters!

Soup of ideas in a large, open research community:

• Rapid adoption of new ideas

Support, maintenance and help

• Can't build everything we need in-house!

Give back via:

• Workshops: CUFP, ICFP, Haskell Symposium

• Hackathons

• Industrial Haskell Group

• Open source code and infrastructure

• Teaching: papers, blogs, talks

Page 11: Engineering Large Projects in a Functional Language


How Galois Uses Haskell

Page 12: Engineering Large Projects in a Functional Language


1. The Type System

Page 13: Engineering Large Projects in a Functional Language


Page 14: Engineering Large Projects in a Functional Language


Types make our lives easier

Cheap way to verify properties

• Cheaper than theorem proving

• More assurance than testing

• Saves debugging in hostile environments

Typical conversation:

• Engineer A: “Spec says this must never happen”

• Engineer B: “Can we enforce that in the type system?”

Page 15: Engineering Large Projects in a Functional Language


Kinds of things types enforce

Simple things:

• Correct arguments to a function

• Function f does not touch the disk

• No null pointers

• Mixing up similar concepts:

– Virtual / physical addresses

Serious things:

• Information flow policies

• Correct component wiring and integration
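As an illustration of the "virtual / physical addresses" bullet, here is a minimal sketch using newtypes. The names VirtAddr, PhysAddr, translate and busOffset are hypothetical, and the fixed-offset translation is only a stand-in for a real page-table walk.

```haskell
import Data.Word (Word64)

-- Hypothetical address types: same machine representation, distinct types.
newtype VirtAddr = VirtAddr Word64 deriving (Eq, Show)
newtype PhysAddr = PhysAddr Word64 deriving (Eq, Show)

-- | Toy translation: a fixed offset stands in for a real page-table walk.
translate :: VirtAddr -> PhysAddr
translate (VirtAddr v) = PhysAddr (v + 0x1000)

-- | Accepts only physical addresses and returns the offset within a 4 KiB page.
busOffset :: PhysAddr -> Word64
busOffset (PhysAddr p) = p `mod` 4096

-- The following line would be rejected by the type checker:
-- busOffset (VirtAddr 42)
```

Neither function mentions IO in its type, which is also how a signature can tell a reader that a function does not touch the disk (setting aside unsafe escape hatches).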

Page 16: Engineering Large Projects in a Functional Language


Recent experience

First demo of a new system

Six engineers

50k lines of code, in 5 components, developed over a number of months

Integrated, tested, demo'd in only a week, two months ahead of schedule, significantly above performance spec.

1 space leak, spotted and fixed on first day of testing via the heap profiler

2 bugs found (typos from spec)

Page 17: Engineering Large Projects in a Functional Language


Purity is fundamental

Difficult to show safety without purity

Code should be pure by default

Makes large systems easier to glue:

• Pure code is “safe” by default to call

Effects are “code smells”, and have to be treated carefully

The world has too many impure languages: don't add to that

Page 18: Engineering Large Projects in a Functional Language


Types aren't enough though

Still not expressive enough for a lot of the properties we want to enforce

We care a lot about sizes in types

• “Input must only be 128, 192 or 256 bits”

• “Type T should be represented with 7 bits”
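For a small closed set of sizes, one partial workaround in plain Haskell is to enumerate the legal values and validate once at the boundary. The sketch below is an assumption about how that might look (KeySize, mkKeySize and keyBytes are hypothetical names). It does not address constraints such as "represented with 7 bits", which need something closer to type-level numbers.

```haskell
-- Hypothetical key-size type: only the three permitted sizes are representable.
data KeySize = Bits128 | Bits192 | Bits256
  deriving (Eq, Show, Enum, Bounded)

-- | Validate an untrusted bit count once, at the boundary of the system.
mkKeySize :: Int -> Maybe KeySize
mkKeySize 128 = Just Bits128
mkKeySize 192 = Just Bits192
mkKeySize 256 = Just Bits256
mkKeySize _   = Nothing

-- | Total function: no "invalid size" branch is needed downstream.
keyBytes :: KeySize -> Int
keyBytes Bits128 = 16
keyBytes Bits192 = 24
keyBytes Bits256 = 32
```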

Page 19: Engineering Large Projects in a Functional Language


Other tools in the bag

Extended static analysis tools

Model checking

• SAT, SMT, …

Theorem proving

• Isabelle, Agda, Coq

How much assurance do you need?

Page 20: Engineering Large Projects in a Functional Language


2. Abstractions

Page 21: Engineering Large Projects in a Functional Language


Monads

Constantly rolling new monads

• Captures critical facts about the execution environment in the type

Directly encodes semantics we care about

• “Computed keys are not visible outside the M component”

• “Function f has read-only access to memory”
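A property like "function f has read-only access to memory" can be sketched with a small hand-rolled monad that exposes a read primitive and no write primitive. Everything here (Memory, ReadOnly, peekByte, checksum) is a hypothetical illustration, essentially a reader monad over a toy memory model, and not a description of any Galois codebase.

```haskell
import Data.Word (Word8)

-- Hypothetical memory model: an address-to-byte lookup function.
type Memory = Int -> Word8

-- | Computations that may read memory but have no way to write it.
newtype ReadOnly a = ReadOnly { runReadOnly :: Memory -> a }

instance Functor ReadOnly where
  fmap f (ReadOnly g) = ReadOnly (f . g)

instance Applicative ReadOnly where
  pure x = ReadOnly (const x)
  ReadOnly f <*> ReadOnly g = ReadOnly (\m -> f m (g m))

instance Monad ReadOnly where
  ReadOnly g >>= k = ReadOnly (\m -> runReadOnly (k (g m)) m)

-- | The only primitive exposed to clients: read one byte.
peekByte :: Int -> ReadOnly Word8
peekByte addr = ReadOnly (\m -> m addr)

-- | Example client: sum of n bytes starting at address base.
checksum :: Int -> Int -> ReadOnly Int
checksum base n = do
  bytes <- mapM peekByte [base .. base + n - 1]
  return (sum (map fromIntegral bytes))
```

In a real module the ReadOnly constructor would be kept out of the export list, so that peekByte is the only way for client code to observe memory.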

Page 22: Engineering Large Projects in a Functional Language


Algebraic Data Types

Every system is either an interpreter or a compiler

• Abstract syntax trees are ubiquitous

• Represent processes symbolically, via ADTs, then evaluate them in a safe (monadic) context

• Precise, concise control over possible values

• But need precise representation control
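A minimal sketch of "represent symbolically, then evaluate in a safe monadic context": a tiny, hypothetical expression type evaluated in the Either monad, so that failure is an ordinary value rather than a crash.

```haskell
-- Hypothetical expression language, represented as an algebraic data type.
data Expr
  = Lit Int
  | Add Expr Expr
  | Div Expr Expr
  deriving (Eq, Show)

-- | Evaluate in the Either monad: errors are returned, never thrown.
eval :: Expr -> Either String Int
eval (Lit n)   = Right n
eval (Add a b) = (+) <$> eval a <*> eval b
eval (Div a b) = do
  x <- eval a
  y <- eval b
  if y == 0 then Left "division by zero" else Right (x `div` y)
```

Because Expr has exactly three constructors, the compiler can warn (with -Wall) if eval forgets a case, which is the "precise control over possible values" point.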

Page 23: Engineering Large Projects in a Functional Language


Laziness

Captures some concepts perfectly

• “A stream of 4k packets from the wire”

Critical for control abstractions in DSLs

Useful for prototyping:

• error “M.F.foo: not implemented”
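Both points can be sketched in a few lines. The Packet type, the wire stream and the decrypt stub below are hypothetical; the stream is an ordinary infinite list standing in for real network input.

```haskell
-- Hypothetical packet type: a sequence number plus a payload size in bytes.
data Packet = Packet { seqNo :: Int, size :: Int } deriving (Eq, Show)

-- | An unbounded stream of 4k packets; only the demanded prefix is ever built.
wire :: [Packet]
wire = [Packet n 4096 | n <- [0 ..]]

-- | Consume packets from the front of a stream until a byte budget is reached.
takeBytes :: Int -> [Packet] -> [Packet]
takeBytes budget = go 0
  where
    go _ [] = []
    go used (p : ps)
      | used + size p > budget = []
      | otherwise              = p : go (used + size p) ps

-- | Prototyping stub: it type-checks now and fails only if it is ever forced.
decrypt :: Packet -> Packet
decrypt = error "Wire.decrypt: not implemented"
```

Code that only counts packets, such as length (map decrypt (takeBytes 10000 wire)), never forces the stub, so the rest of a design can be exercised before every piece exists.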

Page 24: Engineering Large Projects in a Functional Language


Laziness

Makes time and space reasoning harder!

• Mostly harmless in practice

• Stress testing tends to reveal retainers

• Graphical profiling knocks it dead

Must be able to precisely enable/disable

Be careful with exceptions and mutation

whnf/rnf/! are your friends
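A small sketch of controlling evaluation by hand, using only base. The function names are hypothetical. foldl' forces its accumulator to weak head normal form at each step, and bang patterns do the same for hand-written loops; rnf (from the deepseq package) forces a value fully and is not shown here.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- | Lazy left fold: without optimization this can build a chain of (+) thunks.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- | Strict left fold: the accumulator is forced to WHNF at every step.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- | Hand-written loop with bang patterns on both accumulators.
-- Returns 0 for an empty list.
mean :: [Double] -> Double
mean = go 0 0
  where
    go :: Double -> Int -> [Double] -> Double
    go !s !n []       = if n == 0 then 0 else s / fromIntegral n
    go !s !n (x : xs) = go (s + x) (n + 1) xs
```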

Page 25: Engineering Large Projects in a Functional Language


Type classes

We use type classes

• Well defined interfaces between large components (sets of modules)

• Natural code reuse

• Capture general concepts in a natural way

• Capture interface in a clear way

• Kick butt EDSLs (see Lennart's blog)
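As a sketch of a type class acting as the interface between components, here is a hypothetical Store class with one implementation and one client written purely against the interface. The class and its method names are invented for illustration.

```haskell
-- Hypothetical interface between a storage component and its clients.
class Store s where
  emptyStore :: s
  insert     :: String -> Int -> s -> s
  lookupKey  :: String -> s -> Maybe Int

-- | One implementation: a plain association list with unique keys.
newtype AssocStore = AssocStore [(String, Int)]

instance Store AssocStore where
  emptyStore = AssocStore []
  insert k v (AssocStore kvs) = AssocStore ((k, v) : filter ((/= k) . fst) kvs)
  lookupKey k (AssocStore kvs) = lookup k kvs

-- | Client code that depends only on the interface, not on the implementation.
counter :: Store s => String -> s -> s
counter k s = insert k (maybe 1 (+ 1) (lookupKey k s)) s
```

Swapping AssocStore for a faster structure later would not require touching counter.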

Page 26: Engineering Large Projects in a Functional Language


Concurrency and Parallelism

forkIO rocks

• Cheap, very fast, precise threads

MVars rock

STM rocks (safely composable locks!)

Result: not shy introducing concurrency when appropriate
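A minimal sketch of forkIO and MVars working together, using only base. The runWorkers function is hypothetical: it forks n lightweight threads that share a counter, and uses a second MVar as a completion signal. STM (from the stm package) would be the tool to reach for when several such updates must compose atomically; it is not shown here.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- | Fork n lightweight threads that each bump a shared counter k times,
-- then wait for all of them and return the final count.
runWorkers :: Int -> Int -> IO Int
runWorkers n k = do
  counter <- newMVar (0 :: Int)
  done    <- newEmptyMVar
  forM_ [1 .. n] $ \_ -> forkIO $ do
    -- modifyMVar_ takes the MVar, applies the update, and puts it back.
    replicateM_ k (modifyMVar_ counter (return . (+ 1)))
    putMVar done ()
  replicateM_ n (takeMVar done)   -- one token per finished worker
  readMVar counter
```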

Page 27: Engineering Large Projects in a Functional Language


3. Foreign Function Interface

Page 28: Engineering Large Projects in a Functional Language


Foreign Function Interface

The world is a messy place

A good FFI means we can always call someone else's code if necessary

Have to talk to weird bits of hardware and weird proof systems

ForeignPtr is great abstraction tool

Must have clear API into the runtime system (hot topic at the moment)
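To show what "calling someone else's code" looks like, here is a minimal sketch that binds strlen from the C standard library. The wrapper name cLength is hypothetical, and the example assumes a platform where GHC links against libc in the usual way.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- Bind the C standard library's strlen. It is marked unsafe because the call
-- is short and never calls back into Haskell.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- | Marshal a Haskell String to a temporary C string, call strlen on it,
-- and convert the result back to an Int.
cLength :: String -> IO Int
cLength s = withCString s $ \p -> fromIntegral <$> c_strlen p
```

withCString frees the temporary buffer when the callback returns. For foreign memory that must outlive a single call, Foreign.ForeignPtr lets a finalizer be attached so the garbage collector releases it.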

Page 29: Engineering Large Projects in a Functional Language


4. Meta programming

Page 30: Engineering Large Projects in a Functional Language


There's always boilerplate

Abstractions get rid of a lot of repetitive code, but there's always something that's not automated

We use a little Template Haskell

Other generics:

• Hinze-style generics

• SYB generics

Particularly useful for generating instance code for marshalling
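As a rough sketch of the SYB style, the Data class in base (Data.Data) lets a single function inspect any type that derives it. The Header record and the describe and arity helpers are hypothetical; a real marshalling layer would build on the same primitives to walk the fields.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, constrFields, gmapQ, showConstr, toConstr)

-- Hypothetical wire-format header.
data Header = Header { version :: Int, flags :: Int }
  deriving (Show, Data)

-- | Constructor name and record field labels, obtained generically.
describe :: Data a => a -> (String, [String])
describe x = (showConstr c, constrFields c)
  where c = toConstr x

-- | Number of immediate fields of a value, for any type with a Data instance.
arity :: Data a => a -> Int
arity = length . gmapQ (const ())
```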

Page 31: Engineering Large Projects in a Functional Language


5. Performance

Page 32: Engineering Large Projects in a Functional Language


Fast enough for majority of things

Vast majority of code is fast enough

• GHC -O2 -funbox-strict-fields

• Happy with 1 – 2x C for low level code

Last few drops get squeezed out:

• Profiling

• Low level Haskell

• Cycle-level measurement

• EDSLs to generate better code

• Calling into C
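A small sketch of what "low level Haskell" can mean in practice: strict, unpacked constructor fields and a strict fold. The V2 type is hypothetical. The explicit UNPACK pragmas below are what -funbox-strict-fields applies automatically to every strict field where it can.

```haskell
import Data.List (foldl')

-- Hypothetical 2D vector. The strict, unpacked fields are stored inline in
-- the constructor rather than as pointers to separately boxed Doubles.
data V2 = V2 {-# UNPACK #-} !Double {-# UNPACK #-} !Double
  deriving (Eq, Show)

-- | Component-wise addition.
add :: V2 -> V2 -> V2
add (V2 a b) (V2 c d) = V2 (a + c) (b + d)

-- | Strict left fold, so no chain of suspended additions is built up.
sumV2 :: [V2] -> V2
sumV2 = foldl' add (V2 0 0)
```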

Page 33: Engineering Large Projects in a Functional Language


Performance

Really precise performance requires expertise

Libraries are helping reify “oral traditions” about optimization

Still a lack of clarity about performance techniques in the broader Haskell community though

Page 34: Engineering Large Projects in a Functional Language


6. Debugging

Page 35: Engineering Large Projects in a Functional Language


There are still bugs!

Testing

• QuickCheck!!!

Heap profiling

• “By type” profiling of the heap

GHC -fhpc

• Great for finding exceptions

• Understanding what is executing

+RTS -sstderr

• Explains what the GC, threads, and memory are up to
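To give a flavor of property-based testing without depending on the QuickCheck package, the sketch below states a round-trip property as a plain predicate and checks it exhaustively over small inputs. The run-length encoder and all names are hypothetical. With QuickCheck installed, the same predicate could instead be passed to quickCheck from Test.QuickCheck, which generates random inputs.

```haskell
-- | Function under test: run-length encode a list.
encode :: Eq a => [a] -> [(a, Int)]
encode [] = []
encode (x : xs) = (x, 1 + length same) : encode rest
  where (same, rest) = span (== x) xs

-- | Inverse of encode.
decode :: [(a, Int)] -> [a]
decode = concatMap (\(x, n) -> replicate n x)

-- | The property: decoding an encoding gives back the input.
prop_roundTrip :: String -> Bool
prop_roundTrip s = decode (encode s) == s

-- | Stand-in for random generation: every string over "ab" of length 0 to n.
smallInputs :: Int -> [String]
smallInputs n = concatMap (\k -> sequence (replicate k "ab")) [0 .. n]
```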

Page 36: Engineering Large Projects in a Functional Language


7. Documentation

Page 37: Engineering Large Projects in a Functional Language


Generating supporting artifacts

Haddock is great for reference material

• Helps capture design in the source

• Code + types becomes self documenting

Design documents can be partially extracted via:

• The major data and type signatures

• graphmod

• cabalgraph

• HPC analysis

Page 38: Engineering Large Projects in a Functional Language


8. Libraries

Page 39: Engineering Large Projects in a Functional Language


Hackage Changed Everything

2200+ libraries created in 3 years. There's a library for everything, and often more than one...

Can sit back and let mtl / monadlib / haxml / hxt fight it out :)

Static linking → need BSD licensed code if we want to ship

Haskell Platform to answer QA questions

Page 40: Engineering Large Projects in a Functional Language


9. Shipping code

Page 41: Engineering Large Projects in a Functional Language


Cabal

I don't know how Haskell was possible before Cabal :)

Quickly adopted Cabal/cabal-install across projects

cabal-install:

• Simple, clean integration of internal and external components into packageable objects

Page 42: Engineering Large Projects in a Functional Language


10. Conventions

Page 43: Engineering Large Projects in a Functional Language


We try to ...

-Wall police

Consistent layout

No tabs

Import qualified Control.Exception

{-# LANGUAGE … #-}

Map exceptions into Either / Maybe
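Two of these conventions, the qualified Control.Exception import and mapping exceptions into Either / Maybe, can be sketched together. The safeDiv helpers are hypothetical; try and evaluate are the standard Control.Exception functions.

```haskell
import qualified Control.Exception as E

-- | Force a division and turn any arithmetic exception into a Left value.
safeDiv :: Int -> Int -> IO (Either E.ArithException Int)
safeDiv x y = E.try (E.evaluate (x `div` y))

-- | The same check collapsed to Maybe, for callers that do not need the cause.
safeDiv' :: Int -> Int -> IO (Maybe Int)
safeDiv' x y = either (const Nothing) Just <$> safeDiv x y
```

evaluate matters here: without it the division would remain an unevaluated thunk, and the exception could escape later, outside the try.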

Page 44: Engineering Large Projects in a Functional Language


We try to ...

deriving Show

Line/column for errors if you must throw

No global mutable state

Put type sigs in “when you're done” with the design

Use GHCi for rapid experimentation

Cabal by default.

Libraries by default

Page 45: Engineering Large Projects in a Functional Language


11. Training

Page 46: Engineering Large Projects in a Functional Language


Easy to find Haskell programmers

With a big open source community, it's much easier to find Haskell programmers now

Many more applicants than jobs, often with significant experience from open source

We train on-site, and new resources like LYAH and RWH make this easier.

Page 47: Engineering Large Projects in a Functional Language


12. Things that we still need

Page 48: Engineering Large Projects in a Functional Language


More support for large scale programming

Enforcing conventions across the code

Data representation precision (emerging)

A serious refactoring tool (HaRe on Hackage!)

Vetted and audited libraries by experts (Haskell Platform)

Idioms for mapping design onto types/functions/classes/monads

Better capture your 100 module design!

Page 49: Engineering Large Projects in a Functional Language
