The Formalisation of Haskell Refactorings
Huiqing Li, Simon Thompson
Computing Lab, University of Kent
www.cs.kent.ac.uk/projects/refactor-fp/
04/18/23 TFP 2005 2
Outline
Refactoring
HaRe: The Haskell Refactorer
Formalisation of Haskell Refactorings
Formalisation of Generalise a Definition
Conclusion and Future Work
Refactoring
What? Changing the structure of existing code without changing its meaning.
Where and why? Development, maintenance, …
To make the code easier to understand and modify.
To improve code reuse, quality and productivity.
An essential part of the programming process.
HaRe – The Haskell Refactorer
A tool for refactoring Haskell 98 programs.
Full Haskell 98 coverage.
Driving concerns: usability and extensibility.
Implemented in Haskell, using Programatica’s frontends and Strafunski’s generic traversals.
Integrated with two program editors: (X)Emacs and Vim.
Preserves both the comments and the layout style of the source.
Refactorings Implemented in HaRe
Structural Refactorings
Module Refactorings
Data-Oriented Refactorings
Refactorings Implemented in HaRe
Structural Refactorings
Generalise a definition
Before:
module Main (main) where
f y = y : f (y + 1)
main = print $ f 10
After:
module Main (main) where
f z y = y : f z (y + z)
main = print $ f 1 10
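The behaviour-preservation claim for this example can be spot-checked by running both versions side by side. The sketch below is only an illustration: the names fOrig and fGen are hypothetical (they are introduced so that both definitions can live in one module), and comparing finite prefixes is a test, not a proof.

```haskell
module Main (main) where

-- Original definition from the slide, renamed fOrig (hypothetical name).
fOrig :: Int -> [Int]
fOrig y = y : fOrig (y + 1)

-- Generalised definition, renamed fGen (hypothetical name):
-- the literal 1 has become the new first parameter z.
fGen :: Int -> Int -> [Int]
fGen z y = y : fGen z (y + z)

main :: IO ()
main = do
  -- Both lists are infinite, so only finite prefixes are compared.
  print (take 5 (fOrig 10))
  print (take 5 (fGen 1 10))
  print (take 100 (fOrig 10) == take 100 (fGen 1 10))
```

Every call site of the generalised function has to supply the extracted expression, which is why main changes from f 10 to f 1 10.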
Refactorings Implemented in HaRe
Structural Refactorings (cont.)
Rename an identifier
Promote/demote a definition to widen/narrow its scope
Delete an unused function
Duplicate a definition
Unfold a definition
Introduce a definition to name an identified expression
Add an argument to a function
Remove an unused argument from a function
Refactorings Implemented in HaRe
Module Refactorings
Move a definition from one module to another module
Before:
module Test (f) where
f y = y : f (y + 1)

module Main where
import Test
main = print $ f 10
After:
module Test ( ) where

module Main where
import Test
f y = y : f (y + 1)
main = print $ f 10
Refactorings Implemented in HaRe
Module Refactorings (cont.)
Clean the imports
Make the used entities explicitly imported
Add an item to the export list
Remove an item from the export list
Refactorings Implemented in HaRe
Data-oriented Refactorings
From concrete to abstract data-type (ADT), which is a composite refactoring built from a sequence of primitive refactorings:
Add field labels
Add discriminators
Add constructors
Remove (nested) patterns
Create ADT interface
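To make the primitive steps listed above concrete, here is a hand-written sketch of what a small data type might look like after they have been applied. The Shape type and all function names are hypothetical and do not come from the talk, and the code illustrates the idea rather than reproducing HaRe's actual output.

```haskell
module Main (main) where

-- Hypothetical concrete type before the refactoring:
--   data Shape = Circle Double | Rect Double Double

-- "Add field labels": every constructor component gets a selector.
data Shape
  = Circle { radius :: Double }
  | Rect   { width :: Double, height :: Double }

-- "Add discriminators": one predicate per constructor.
isCircle, isRect :: Shape -> Bool
isCircle (Circle _) = True
isCircle _          = False
isRect (Rect _ _) = True
isRect _          = False

-- "Add constructors": constructor functions that clients call
-- instead of the data constructors.
mkCircle :: Double -> Shape
mkCircle = Circle

mkRect :: Double -> Double -> Shape
mkRect = Rect

-- "Remove (nested) patterns": client code uses discriminators and
-- selectors rather than pattern matching on the constructors.
area :: Shape -> Double
area s
  | isCircle s = pi * radius s * radius s
  | otherwise  = width s * height s

-- "Create ADT interface" would then move Shape into its own module
-- that exports the type abstractly, together with the functions above.
main :: IO ()
main = mapM_ (print . area) [mkCircle 1.0, mkRect 2.0 3.0]
```

Once client code such as area no longer mentions Circle or Rect directly, the representation of Shape can change without touching the clients.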
Formalisation of Refactorings
Advantages:
Clarify the definition of refactorings in terms of side-conditions and transformations.
Improve our confidence in the behaviour-preservation of refactorings.
Guide the implementation of refactorings.
Reduce the need for testing.
Challenges:
Haskell is a non-trivial language.
Haskell does not have an officially defined semantics.
Formalisation of Refactorings
Our Strategy:
Start from a simple language (λ_letrec).
Extend the language gradually to formalise more complex refactorings.
Formalisation of Refactorings
The specification of a refactoring contains four parts:
The representation of the program before the refactoring, say P1.
The side-conditions for the refactoring.
The representation of the program after the refactoring, say P2.
A proof showing that P1 and P2 have the same functionality under the side-conditions.
Formalisation of Refactorings
The λ-calculus with letrec (λ_letrec)
Syntax of λ_letrec terms:
E ::= x | λx.E | E1 E2 | letrec D in E
D ::= ε | xi=Ei | D, D
Semantics: the call-by-name semantics developed by Zena M. Ariola and Stefan Blom in the paper "Lambda Calculi plus letrec".
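The grammar above maps directly onto an algebraic data type. The following is a minimal sketch of one possible encoding. The names Expr, Var, Lam, App and Letrec are assumptions made for illustration, and the constants cons, plus, one and ten are modelled as free variables because the calculus has no literals.

```haskell
module Main (main) where

type Name = String

-- E ::= x | \x.E | E1 E2 | letrec D in E
data Expr
  = Var Name
  | Lam Name Expr
  | App Expr Expr
  | Letrec [(Name, Expr)] Expr  -- D as a list of bindings; [] is the empty declaration
  deriving (Eq, Show)

-- letrec f = \y. cons y (f (plus y one)) in f ten
-- (an encoding of the running example f y = y : f (y + 1))
example :: Expr
example =
  Letrec
    [ ( "f"
      , Lam "y"
          (App (App (Var "cons") (Var "y"))
               (App (Var "f") (App (App (Var "plus") (Var "y")) (Var "one"))))
      )
    ]
    (App (Var "f") (Var "ten"))

main :: IO ()
main = print example
```

Representing D as a list flattens the D, D production, which is convenient in code but loses nothing as long as binding order is irrelevant.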
Formalisation of Generalisation
Recall the example
Before:
module Main (main) where
f y = y : f (y + 1)
main = print $ f 10
After:
module Main (main) where
f z y = y : f z (y + z)
main = print $ f 1 10
Formalisation of Generalisation
Formal definition of Generalisation using λ_letrec
Given the expression:
letrec x1=E1, ..., xi=Ei, ..., xn=En in E0
Assume E is a sub-expression of Ei, and Ei = C[E].
Formalisation of Generalisation
Formal definition of Generalisation using λ_letrec
The condition for generalising the definition xi=Ei on E is:
xi ∉ FV(E) ∧ ∀x, e: (x ∈ FV(E) ∧ e ∈ sub(Ei, C) ⇒ x ∈ FV(e))
module Main (main) where
f y = y : f (y + 1)
main = print $ f 10
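The side-condition can be turned into an executable check. The sketch below rests on one loudly stated assumption: sub(Ei, C) is read as the set of sub-expressions of Ei that lie on the path from the root of Ei down to the hole of C, and that path is given explicitly as a list of steps. The types Expr and Step and the function names are hypothetical.

```haskell
module Main (main) where

import Data.List (nub)

type Name = String

data Expr
  = Var Name
  | Lam Name Expr
  | App Expr Expr
  | Letrec [(Name, Expr)] Expr
  deriving (Eq, Show)

-- Free variables of a term.
fv :: Expr -> [Name]
fv (Var x)       = [x]
fv (Lam x e)     = filter (/= x) (fv e)
fv (App a b)     = nub (fv a ++ fv b)
fv (Letrec ds e) =
  filter (`notElem` map fst ds) (nub (concatMap (fv . snd) ds ++ fv e))

-- One step from a term into one of its immediate sub-terms.
data Step = InLam | InFun | InArg | InBind Int | InBody

step :: Step -> Expr -> Maybe Expr
step InLam (Lam _ b) = Just b
step InFun (App f _) = Just f
step InArg (App _ a) = Just a
step (InBind i) (Letrec ds _)
  | i >= 0 && i < length ds = Just (snd (ds !! i))
step InBody (Letrec _ b) = Just b
step _ _ = Nothing

-- Sub-expressions met along the path: root first, the hole's content E last.
along :: [Step] -> Expr -> Maybe [Expr]
along [] e = Just [e]
along (s : ss) e = do
  child <- step s e
  rest  <- along ss child
  Just (e : rest)

-- Assumed reading of the condition: xi is not free in E, and every free
-- variable of E is still free in each sub-expression on the path to the hole.
canGeneralise :: Name -> Expr -> [Step] -> Bool
canGeneralise xi ei path =
  case along path ei of
    Nothing -> False
    Just es ->
      let e = last es
      in xi `notElem` fv e && and [x `elem` fv s | x <- fv e, s <- es]

-- Right-hand side of f: \y. cons y (f (plus y one))
rhsF :: Expr
rhsF =
  Lam "y" (App (App (Var "cons") (Var "y"))
               (App (Var "f") (App (App (Var "plus") (Var "y")) (Var "one"))))

main :: IO ()
main = do
  print (canGeneralise "f" rhsF [InLam, InArg, InArg, InArg])         -- hole at one
  print (canGeneralise "f" rhsF [InLam, InArg, InArg, InFun, InArg])  -- hole at y
```

Selecting one passes the check, while selecting y fails it because y is bound by the enclosing λy and would escape its scope if moved to the call sites.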
Formalisation of Generalisation
Formal definition of Generalisation using λ_letrec
After generalisation, the original expression becomes:
letrec x1 = E1[xi := xi E],
       ..., xi = λz.C[z][xi := xi z],
       ..., xn = En[xi := xi E]
in E0[xi := xi E],
where z is a fresh variable.
Before:
module Main (main) where
f y = y : f (y + 1)
main = print $ f 10
After:
module Main (main) where
f z y = y : f z (y + z)
main = print $ f 1 10
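The transformation above can be sketched as a function over a small term type. Everything named here (Expr, subst, generalise, ctxF) is hypothetical. The context C is represented as a Haskell function on the hole, the substitution does not rename binders, and the side-condition is assumed to have been checked already, so this is an illustration under simplifying assumptions and not HaRe's implementation.

```haskell
module Main (main) where

import Data.List (intercalate)

type Name = String

data Expr
  = Var Name
  | Lam Name Expr
  | App Expr Expr
  | Letrec [(Name, Expr)] Expr
  deriving (Eq, Show)

-- e[x := r] on free occurrences of x. Simplification: binders are not
-- renamed, so no binder in e may capture a free variable of r.
subst :: Name -> Expr -> Expr -> Expr
subst x r = go
  where
    go (Var y)
      | y == x    = r
      | otherwise = Var y
    go (Lam y b)
      | y == x    = Lam y b
      | otherwise = Lam y (go b)
    go (App a b) = App (go a) (go b)
    go (Letrec ds b)
      | x `elem` map fst ds = Letrec ds b
      | otherwise           = Letrec [(n, go d) | (n, d) <- ds] (go b)

-- Generalise the binding xi = C[E] of a top-level letrec over E.
-- ctx is the context C as a function on its hole; z is assumed fresh.
generalise :: Name -> (Expr -> Expr) -> Expr -> Name -> Expr -> Maybe Expr
generalise xi ctx e z (Letrec ds e0)
  | Just ei <- lookup xi ds, ei == ctx e =
      Just (Letrec (map bind ds) (callSites e0))
  where
    callSites = subst xi (App (Var xi) e)  -- [xi := xi E]
    bind (n, d)
      | n == xi   = (n, Lam z (subst xi (App (Var xi) (Var z)) (ctx (Var z))))
      | otherwise = (n, callSites d)
generalise _ _ _ _ _ = Nothing

pretty :: Expr -> String
pretty (Var x)       = x
pretty (Lam x b)     = "(\\" ++ x ++ ". " ++ pretty b ++ ")"
pretty (App a b)     = "(" ++ pretty a ++ " " ++ pretty b ++ ")"
pretty (Letrec ds b) =
  "letrec " ++ intercalate ", " [n ++ " = " ++ pretty d | (n, d) <- ds]
    ++ " in " ++ pretty b

-- C for the running example: \y. cons y (f (plus y [.]))
ctxF :: Expr -> Expr
ctxF hole =
  Lam "y" (App (App (Var "cons") (Var "y"))
               (App (Var "f") (App (App (Var "plus") (Var "y")) hole)))

-- letrec f = C[one] in f ten
program :: Expr
program = Letrec [("f", ctxF (Var "one"))] (App (Var "f") (Var "ten"))

main :: IO ()
main =
  putStrLn (maybe "not applicable" pretty
                  (generalise "f" ctxF (Var "one") "z" program))
```

On the running example the result corresponds to f z y = y : f z (y + z) with the call site f 1 10, with cons, plus, one and ten standing in for (:), (+), 1 and 10.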
Formalisation of Generalisation
Formal definition of Generalisation using λ_letrec
Proof. Decompose the transformation into a number of sub-steps. If each sub-step is behaviour-preserving, then the whole transformation is behaviour-preserving.
Formalisation of Generalisation
Step 1: Add the definition x = λz.C[z], where x and z are fresh variables, and C[E] = Ei.
module Main (main) where
f y = y : f (y + 1)
x z y = y : f (y + z)
main = print $ f 10
letrec x1=E1, ..., xi=Ei, x = λz.C[z], ..., xn=En in E0
Formalisation of Generalisation
Step 2: Replace Ei with x E. (Note: Ei = x E, since x E = (λz.C[z]) E = C[E].)
module Main (main) where
f y = x 1 y
x z y = y : f (y + z)
main = print $ f 10
letrec x1=E1, ..., xi = x E, x = λz.C[z], ..., xn=En in E0
Formalisation of Generalisation
Step 3: Unfold xi in the right-hand side of x.
module Main (main) where
f y = x 1 y
x z y = y : x 1 (y + z)
main = print $ f 10
letrec x1=E1, ..., xi = x E, x = λz.C[z][xi := x E], ..., xn=En in E0
Formalisation of Generalisation
Step 4: In the definition of x, replace E with z, and prove that this does not change the semantics of x E.
module Main (main) where
f y = x 1 y
x z y = y : x z (y + z)
main = print $ f 10
letrec x1=E1, ..., xi = x E, x = λz.C[z][xi := x z], ..., xn=En in E0
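Step 4 is the delicate one: replacing E with z changes x as a function in general, and only the application x E keeps its meaning. A small experiment makes this visible. The names x3 and x4 are hypothetical labels for the Step 3 and Step 4 versions of x, and the comparison on finite prefixes is a spot check, not the proof the slide refers to.

```haskell
module Main (main) where

-- Step 3 version of x: the recursive call still passes the literal 1.
x3 :: Int -> Int -> [Int]
x3 z y = y : x3 1 (y + z)

-- Step 4 version of x: the literal 1 in the recursive call is replaced by z.
x4 :: Int -> Int -> [Int]
x4 z y = y : x4 z (y + z)

main :: IO ()
main = do
  -- Applied to E = 1, as in f y = x 1 y, the two versions agree on this prefix.
  print (take 50 (x3 1 10) == take 50 (x4 1 10))
  -- Applied to a different first argument they differ,
  -- so the claim concerns x E only, not x in general.
  print (take 4 (x3 2 0), take 4 (x4 2 0))
```

The second line prints ([0,2,3,4],[0,2,4,6]), which is why the step needs its own argument and does not follow from simple unfolding.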
Formalisation of Generalisation
Step 5: Unfold the occurrences of xi.
module Main (main) where
f y = x 1 y
x z y = y : x z (y + z)
main = print $ x 1 10
letrec x1=E1[xi := x E], ..., xi = x E, x = λz.C[z][xi := x z], ..., xn=En[xi := x E] in E0[xi := x E]
Formalisation of Generalisation
Step 6: Remove the definition of xi.
module Main (main) where
x z y = y : x z (y + z)
main = print $ x 1 10
letrec x1=E1[xi := x E], ..., x = λz.C[z][xi := x z], ..., xn=En[xi := x E] in E0[xi := x E]
Formalisation of Generalisation
Step 7: Rename x to xi and simplify the substitution.
module Main (main) where
f z y = y : f z (y + z)
main = print $ f 1 10
letrec x1=E1[xi := x E][x := xi], ..., xi = λz.C[z][xi := x z][x := xi], ..., xn=En[xi := x E][x := xi] in E0[xi := x E][x := xi]
which simplifies to:
letrec x1 = E1[xi := xi E], ..., xi = λz.C[z][xi := xi z], ..., xn = En[xi := xi E] in E0[xi := xi E]
Formalisation of Refactorings
λ_letrec has been extended to model the Haskell module system (λ_M).
The refactoring "move a definition from one module to another" has also been formalised using λ_M.
Conclusion and Future Work
Formalisation helps to clarify the side-conditions and transformation rules.
It improves our confidence in the behaviour-preservation of refactorings.
Future work:
Extend the calculus to formalise more complex refactorings.
Formalise the composition of refactorings.