Strictness/Unboxed Explained Paul Meng @MnO2XMnO2

Strictness-Unboxed explained

Aug 04, 2015


Paul Meng
Page 1: Strictness-Unboxed explained

Strictness/Unboxed Explained

Paul Meng @MnO2XMnO2

Page 2: Strictness-Unboxed explained

Lazy Evaluation in Haskell

In the late ’70s and early ’80s, .... A series of seminal publications ignited an explosion of interest in the idea of lazy (or non-strict, or call-by-need) functional languages as a vehicle for writing serious programs.

A History of Haskell: Being Lazy With Class

● Lazy evaluation was once a hot research topic in the academic world, and that research laid the foundation for the design of Haskell.

● There are Data.ByteString and Data.ByteString.Lazy, why?

But what exactly does this mean?

Page 3: Strictness-Unboxed explained

Let’s start from a metaphor

● You are helping your starving colleagues at the office to buy their lunch. You go to a McDonald’s, head to the counter, and order a bunch of things and make it to-go.

● Then the clerk gives you this big paper bag. You don’t bother to check; you just take it and run.

● Then you are doing non-strict evaluation!

Page 4: Strictness-Unboxed explained

You are in the office now

● Evaluation begins.

● Wait! What’s your definition of evaluation? In this metaphor, one step of evaluation is to open a bag or open a box.

● Let’s see what would happen!

Page 5: Strictness-Unboxed explained

The First Step

Take the paper bag from the outermost paper bag. Seems no problem.

Page 6: Strictness-Unboxed explained

The Second Step

Take the meal boxes out of the paper bag. Seems no problem either.

Page 7: Strictness-Unboxed explained

The Third Step

No problem at all.

Page 8: Strictness-Unboxed explained

The Fourth Step

WTF??!

Page 9: Strictness-Unboxed explained

Non-strict semantics

You read on the HaskellWiki that strict semantics is

f ⊥ = ⊥

And non-strict semantics is

f ⊥ ≠ ⊥ (for at least some f)

The symbol of upside-down T is called “bottom”. It is something undefined, or non-terminating program. In this case, just giving a finger. (See it looks like a finger, right?)
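The bag metaphor can be sketched in code. This is only a rough illustration: takeBag, checkBag and isBottom are made-up names, and undefined stands in for bottom (a truly non-terminating program could not be detected this way).

```haskell
import Control.Exception (SomeException, catch, evaluate)

-- Hypothetical non-strict function: it never looks inside the bag.
takeBag :: a -> String
takeBag _ = "happy face"

-- Hypothetical strict function: it opens (forces) the bag before answering.
checkBag :: a -> String
checkBag bag = bag `seq` "happy face"

-- True if forcing the value to weak head normal form throws an exception.
-- This only detects bottoms that throw (like undefined), not infinite loops.
isBottom :: a -> IO Bool
isBottom x = (evaluate x >> pure False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True
```

Loaded in GHCi, takeBag undefined gives back a value, while isBottom (checkBag undefined) reports that the strict version hit bottom.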

Page 10: Strictness-Unboxed explained

Non-strict semantics

You read on the HaskellWiki that strict semantics is

f ⊥ = ⊥

And non-strict semantics is

f ⊥ ≠ ⊥ (for at least some f)

Now you can tell the difference between non-strict and strict.

Evaluate the bags at the counter, and catch the error. Finger sent!

No evaluation at the counter. Happy face.

Page 11: Strictness-Unboxed explained

Back to Haskell

● In the example, evaluation means either opening bags or opening boxes. What is “evaluation” in Haskell?

● What’s the difference between non-strict and lazy?

Page 12: Strictness-Unboxed explained

Evaluation

1+1+1+1+1
= 1+1+1+2
= 1+1+3
= 1+4
= 5

Each step is an evaluation step, also known by the fancier name “reduction”.

Page 13: Strictness-Unboxed explained

Evaluation

Page 14: Strictness-Unboxed explained

Weak Head Normal Form

What does this alien word mean?

To better explain it, let’s rewrite the last example a bit.

In Haskell, ‘+’ is a function, so 1+1+1+1+1 is actually

+(1, +(1, +(1, +(1, 1))))

Page 15: Strictness-Unboxed explained

Normal Form

The form that can’t be further evaluated (or “reduced”)

5

Page 16: Strictness-Unboxed explained

Head Normal Form

The form that can’t be further evaluated if we only do evaluation at the “HEAD” position

+(1, +(1, +(1, +(1, 1))))

Head

Outermost Bag

Here the head normal form = normal form

Page 17: Strictness-Unboxed explained

Trouble

(\x -> 1) (fix (+1))

If we don’t evaluate at the head position first, then we are in trouble.

If we don’t open the outermost bag, maybe there would be infinite hamburgers inside!
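A small sketch of the expression above, assuming fix from Data.Function. The names endlessBurgers and ignoreBag are made up for illustration.

```haskell
import Data.Function (fix)

-- fix (+1) unfolds to ((... + 1) + 1) + 1 forever; it never reaches a value.
-- Do not try to print this one.
endlessBurgers :: Int
endlessBurgers = fix (+1)

-- The outer lambda ignores its argument, so reducing at the head position
-- first finishes without ever touching endlessBurgers.
ignoreBag :: Int
ignoreBag = (\_ -> 1) endlessBurgers
```

Evaluating ignoreBag in GHCi returns immediately, because the argument is never opened.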

Page 18: Strictness-Unboxed explained

Trouble

fix (+1)

But even if we evaluate at the head position, we are still not guaranteed to be fully evaluated in Haskell.

Head normal form doesn’t apply to Haskell in general (not for arbitrary terms)

Page 19: Strictness-Unboxed explained

Weak Head Normal Form

Weak = “We are not guaranteed”

Weak Head Normal Form = “We only evaluate at the head position, and only evaluate one step. To evaluate further, we are not guaranteed what would happen.”

Schrödinger’s Filet of Fish: Filet of Fish boxes could contain a Big Mac! We are not guaranteed until we open it.
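One way to see "only the outermost box is opened" is to force a value whose inside is undefined. A minimal sketch, with hypothetical names justBox and consCell:

```haskell
-- Just is the outermost constructor, so seq stops there and never opens
-- the box inside it.
justBox :: String
justBox = Just (undefined :: Int) `seq` "outer box opened"

-- A list cell (:) is also already in weak head normal form, even though
-- both its head and its tail are undefined here.
consCell :: String
consCell = ((undefined :: Int) : undefined) `seq` "outer box opened"
```

Both values evaluate fine, because seq only goes as far as the first constructor.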

Page 20: Strictness-Unboxed explained

Thunk

A thunk is an expression that could still be reduced. (There are still bags!)

1+1+1+1+1

We are used to thinking that the above would be computed to the value 5, but not in Haskell. It is what it is: (1+1+1+1+1)

Page 21: Strictness-Unboxed explained

Non-Strict vs Lazy

● Non-strict is a semantics: by definition, it is anything that is not strict.

● There can be many evaluation strategies that implement it, and lazy is just one of them.

Call-by-Need: Not evaluated until it is needed. It is the so-called “lazy evaluation”.

Call-by-Name: a thunk is copied to every place inside the function body.

f x = x + x

call-by-name:
f (1+1+1)
=> (1+1+1) + (1+1+1)

call-by-need:
f (1+1+1)

Page 22: Strictness-Unboxed explained

Non-Strict vs Lazy

Call-by-Need: Not evaluated until it is needed. It is the so-called “lazy evaluation”.

Call-by-Name: a thunk is copied to every place inside the function body.

f x = x + x

call-by-name:
f (1+1+1)
=> (1+1+1) + (1+1+1)
=> 3 + 3

call-by-value:
f (1+1+1)
=> f (3)
=> 3+3

call-by-need:
f (1+1+1)
=> let x = (1+1+1)
=> x = 3
=> therefore 3+3
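The difference between copying the thunk and sharing it can be counted. This is only a simulation, not how GHC works internally: the argument is modeled as an IO action so that each evaluation can bump a counter. All names here (fByName, fByNeed, countEvaluations) are hypothetical.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- f x = x + x under call-by-name: the argument is re-evaluated at each use.
fByName :: IO Int -> IO Int
fByName arg = (+) <$> arg <*> arg

-- f x = x + x under call-by-need: evaluate once, then share the result.
fByNeed :: IO Int -> IO Int
fByNeed arg = do
  x <- arg
  pure (x + x)

-- Run f on the argument (1+1+1) and report (result, number of evaluations).
countEvaluations :: (IO Int -> IO Int) -> IO (Int, Int)
countEvaluations f = do
  counter <- newIORef 0
  let arg = modifyIORef counter (+ 1) >> pure (1 + 1 + 1)
  result <- f arg
  evaluations <- readIORef counter
  pure (result, evaluations)
```

Both strategies give 6, but the call-by-name version evaluates the argument twice and the call-by-need version only once.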

Page 23: Strictness-Unboxed explained

Back to Haskell: sum

sum [] = 0
sum (x:xs) = x + sum xs

Not tail recursion! It would create a stack frame for each recursive call.

Page 24: Strictness-Unboxed explained

sum’

sum' acc [] = acc
sum' acc (x:xs) = sum' (acc+x) xs

(acc+x) would not be reduced by default

It is tail recursion now, but still has a problem

Page 25: Strictness-Unboxed explained

sum'

sum' 0 [1,2,3,4]
= sum' (0+1) [2,3,4]
= sum' ((0+1)+2) [3,4]
= sum' (((0+1)+2)+3) [4]
= sum' ((((0+1)+2)+3)+4) []
= ((((0+1)+2)+3)+4)
= (((1+2)+3)+4)
= ((3+3)+4)
= (6+4)
= 10

When the list is large enough, this would still cause stack overflow.

Page 26: Strictness-Unboxed explained

seq

seq :: a -> b -> b

This allows us to control the evaluation order: it evaluates a first (to weak head normal form), then returns b.

let x = 1+2 in seq x (f x)

reduce the thunk before applying f

Page 27: Strictness-Unboxed explained

sum’

sum' acc [] = acc
sum' acc (x:xs) = let z = (acc+x) in seq z (sum' z xs)

seq :: a -> b -> b

it would evaluate a first, then return b

Page 28: Strictness-Unboxed explained

sum'

sum' 0 [1,2,3,4]
= sum' (1) [2,3,4]
= sum' (3) [3,4]
= sum' (6) [4]
= sum' (10) []
= 10

No more stack overflow
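Here is the seq version written out as a loadable sketch. It is renamed sumSeq (a hypothetical name) and fixed to Int so it does not clash with anything in the Prelude.

```haskell
-- Tail-recursive sum that forces the accumulator at every step, so no chain
-- of (+) thunks builds up.
sumSeq :: Int -> [Int] -> Int
sumSeq acc []     = acc
sumSeq acc (x:xs) =
  let z = acc + x
  in z `seq` sumSeq z xs
```

For example, sumSeq 0 [1..1000000] runs through the list while the accumulator stays a plain number.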

Page 29: Strictness-Unboxed explained

Bang Patterns

{-# LANGUAGE BangPatterns #-}

sum' !acc [] = acc
sum' !acc (x:xs) = sum' (acc+x) xs

For convenience, you don’t have to write so many ‘seq’s
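The same idea as a complete file, as a rough sketch. sumBang and sumFold are hypothetical names. The second definition uses foldl' from Data.List, a strict left fold that forces the accumulator in a similar way; it is not in the slides and is shown only for comparison.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- The bang on acc forces the accumulator each time the function is entered.
sumBang :: Int -> [Int] -> Int
sumBang !acc []     = acc
sumBang !acc (x:xs) = sumBang (acc + x) xs

-- The library's strict left fold gives the same result without manual bangs.
sumFold :: [Int] -> Int
sumFold = foldl' (+) 0
```

Both should agree on any finite list, for example sumBang 0 [1..100] and sumFold [1..100].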

Page 30: Strictness-Unboxed explained

deepseq

import Control.DeepSeq

deepseq :: NFData a => a -> b -> b
deepseq a b = rnf a `seq` b

-- A class of types that can be fully evaluated.
class NFData a where
  rnf :: a -> ()
  rnf a = a `seq` ()

NFData = Normal Form Data

rnf = reduce to normal form

Page 31: Strictness-Unboxed explained

deepseq

instance NFData a => NFData [a] where
  rnf [] = ()
  rnf (x:xs) = rnf x `seq` rnf xs

instance (NFData a, NFData b, NFData c) => NFData (a,b,c) where
  rnf (x,y,z) = rnf x `seq` rnf y `seq` rnf z
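To compare seq and deepseq side by side, here is a small sketch. It assumes the deepseq package is available (it normally ships with GHC). The names bags, shallow, deep and throwsWhenForced are made up for illustration.

```haskell
import Control.DeepSeq (deepseq)
import Control.Exception (SomeException, catch, evaluate)

-- A list with a bad burger hidden in the middle.
bags :: [Int]
bags = [1, undefined, 3]

-- seq only opens the first list cell, so the bad element is never touched.
shallow :: String
shallow = bags `seq` "seq: ok"

-- deepseq opens every box, so forcing this value hits the undefined element.
deep :: String
deep = bags `deepseq` "deepseq: ok"

-- True if forcing the value to weak head normal form throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = (evaluate x >> pure False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True
```

In GHCi, shallow prints normally, while throwsWhenForced deep returns True.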

Page 32: Strictness-Unboxed explained

Boxed vs Unboxed

The finite-precision integer type Int covers at least the range [ -2^29, 2^29 - 1]. As Int is an instance of the Bounded class, maxBound and minBound can be used to determine the exact Int range defined by an implementation

From Haskell98 Standard

One might imagine numbers naively represented in Haskell "as pointer to a heap-allocated object" which is either an unevaluated closure or is a "box" containing the number's actual value, which has now overwritten the closure

From HaskellWiki

No Definition in the Standard

Page 33: Strictness-Unboxed explained

Boxed vs Unboxed

It is a GHC implementation detail. It is not defined in the Standard. It could be different in other implementations.

Memory Layout of an Int

I# | Int#

One box is one machine word

An Int is two words in GHC: a word-size header pointing to the I# constructor, followed by a word-size Int# payload.

Page 34: Strictness-Unboxed explained

Boxed vs Unboxed

In GHC, types ending in hashes are unboxed types: Int#, Float#, Double#, ...

Memory Layout of an Int#

Int#

Only one machine word
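The I# box can be opened by hand. A minimal sketch, assuming GHC with the MagicHash extension and the GHC.Exts module; addRaw and addBoxed are hypothetical names.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#))

-- Works directly on raw machine integers: no boxes, no thunks.
addRaw :: Int# -> Int# -> Int#
addRaw a b = a +# b

-- Unwrap both I# boxes, add the raw values, and wrap the result again.
addBoxed :: Int -> Int -> Int
addBoxed (I# a) (I# b) = I# (addRaw a b)
```

addBoxed 2 3 behaves like ordinary addition on Int; the point is only to show where the box sits.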

Page 35: Strictness-Unboxed explained

(Int, Int)

Memory Layout of an (Int, Int)

IP

I# Int#

I# Int#

7 machine words in total.

Page 36: Strictness-Unboxed explained

Unboxed Type

{-# LANGUAGE MagicHash #-}
import GHC.Prim

data IntPair = IP Int# Int#

Memory Layout of an IntPair

IP | Int# | Int#

3 machine words in total.

Page 37: Strictness-Unboxed explained

UNPACK

data IntPair = IP {-# UNPACK #-} !Int
                  {-# UNPACK #-} !Int

Memory Layout of an IntPair

IP | Int# | Int#

3 machine words in total.
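A sketch of what the strict, unpacked fields mean in practice. LazyPair, sumPair, lazyOk, strictFails and throwsWhenForced are hypothetical names. As far as I know, GHC only honors UNPACK when compiling with optimization, so the memory layout may differ in GHCi, but the strictness of the fields is the same either way.

```haskell
import Control.Exception (SomeException, catch, evaluate)

-- Ordinary lazy fields: each one may hold an unevaluated thunk.
data LazyPair = LazyPair Int Int

-- Strict fields, with a request to store the raw Int# values inline.
data IntPair = IP {-# UNPACK #-} !Int {-# UNPACK #-} !Int

sumPair :: IntPair -> Int
sumPair (IP a b) = a + b

-- Building a LazyPair does not look at its fields.
lazyOk :: String
lazyOk = LazyPair undefined 2 `seq` "lazy fields: ok"

-- Building an IntPair forces both fields, so this hits the undefined one.
strictFails :: String
strictFails = IP undefined 2 `seq` "unreachable"

-- True if forcing the value to weak head normal form throws an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = (evaluate x >> pure False) `catch` handler
  where
    handler :: SomeException -> IO Bool
    handler _ = pure True
```

So the bang on a field is like checking the bag at the counter: the bad burger is caught when the pair is built, not later.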

Page 38: Strictness-Unboxed explained

Real World Examples

#ifdef __GLASGOW_HASKELL__
data UArray i e = UArray !i !i !Int ByteArray#
#endif

-- | Boxed vectors, supporting efficient slicing.
data Vector a = Vector {-# UNPACK #-} !Int
                       {-# UNPACK #-} !Int
                       {-# UNPACK #-} !(Array a)
        deriving ( Typeable )

Page 39: Strictness-Unboxed explained

Epilogue

To write high-performance Haskell (or specifically GHC Haskell), you have to understand strictness and unboxed types thoroughly.

Page 40: Strictness-Unboxed explained

Thank you

Should ask McDonald’s for sponsorship? lol