ACCU 2006 © Schalk W. Cronjé Practical Generative Programming Schalk W. Cronjé
May 10, 2015
Even in this new millennium, many engineers will still build components that have very little reuse potential due to the inflexible way that they were constructed.
This leads to excessive time required to adapt a component for usage in another system.
Welcome to the world of
Generative Programming
Themes
● GP 101
● Building a single system
● C++ footwork
● Working with C
● Working with dynamic languages
● Building multiple systems
● Integration & testing
Definition
It is a software engineering paradigm where the aim is to automatically manufacture highly customised and optimised intermediate or end-products from elementary, reusable components by means of configuration knowledge.
Elements of Generative Programming
Problem space
• Domain-specific concepts
• Features
Configuration Knowledge
• Illegal feature combinations
• Default settings & dependencies
• Construction rules
• Optimisations
Solution space
• Generic Components
• Configured Components
Steps
● Domain scoping
● Feature & concept modelling
● Common architecture design, implementation and technology identification
● Domain-specific notations (DSLs)
● Specify configuration knowledge (metadata)
● Implement generic components
● Apply configuration knowledge using generators
There is no specific order to these steps!
Configuration Knowledge vs Metadata
● Configuration knowledge is the term preferred by Czarnecki & Eisenecker
● Configuration knowledge can be considered the holistic encapsulation of all knowledge related to building all variants
● Metadata is probably a more codified form of configuration knowledge
● Some people find the term metadata easier to grasp and less confusing than configuration knowledge
● The rest of this presentation uses the term metadata
● A DSL is a notation for capturing metadata
Key Code-level Strategy
For effective implementation there is one basic principle that encompasses most GP strategies:
Dijkstra's Separation of Concerns
This principle accepts the fact that we cannot deal with many issues at once and that important issues should be addressed with purpose.
Key Code-level Strategies
● Develop elementary components as generic components
  – Fully testable outside of the intended product configuration
● Configure these components using generated artefacts appropriate to the programming language
● Aim for zero cyclomatic-complexity in the generated artefacts
● Eliminate defects as early as possible
McCall's Quality Factors Addressed
● Correctness
● Reliability
● Usability
● Maintainability
● Portability
● Efficiency
● Testability
● Flexibility
● Integrity
● Reusability
● Interoperability
Techniques for C++
Strategies for C++
● Templates are the C++ way to generic programming
● Develop elementary components as generic components
  – Fully testable outside of the intended product configuration
● Configure these components using generated traits / policy classes
● Aim for zero cyclomatic-complexity in the generated classes
Template Metaprogramming
● MPL is a key technology for building generic components
  – Best example is Boost C++ MPL
● MPL has been suggested as a domain-specific language
  – Metadata is difficult to review for someone not familiar with MPL
● MPL should rather be used as an implementation strategy
Example #1: Configuration system
Name: NetworkPort
Description: Unrestricted port on which a service can be started
Type: uint16
Minimum Value: 1024
Maximum Value: 65535
Example #1: Configuration system
template <typename CfgAttr>
typename CfgAttr::value_type
get_config();

std::cout << "The network port we'll use is "
          << get_config<NetworkPort>();
Example #1: A traits class
struct NetworkPort
{
    typedef uint16_t value_type;
    static const value_type const_min = 1024;
    static const value_type const_max = 65535;
    // ... rest to follow
};
Example #1: Alternative traits
Because non-integral types cannot be initialised inline, it might be more practical to use the following alternative.

struct NetworkPort
{
    typedef uint16_t value_type;
    static value_type min_value() {return 1024;}
    static value_type max_value() {return 65535;}
    // ... rest to follow
};
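As a minimal sketch of why the function form helps, consider a string-valued attribute. `ServiceName`, its value, and the helper `default_of` are hypothetical names invented for this illustration; they do not appear in the talk. The point is only that a `std::string` cannot be given an in-class `static const` initialiser, whereas a static function works for any value type.

```cpp
#include <cassert>
#include <string>

// Hypothetical attribute, invented for illustration only.
struct ServiceName
{
    typedef std::string value_type;
    static const char* name() { return "ServiceName"; }
    // A std::string cannot be initialised in-class as a static const
    // data member, so the value is returned from a function instead.
    static value_type default_value() { return "example-service"; }
};

// Generic code sees the same interface for integral and
// non-integral attributes.
template <typename CfgAttr>
typename CfgAttr::value_type default_of()
{
    return CfgAttr::default_value();
}
```

Generic code such as `default_of<ServiceName>()` then needs no special case for non-integral types.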
Example #1: Basic generic function
std::string get_cfg_string( const char* name );

template <typename CfgAttr>
typename CfgAttr::value_type
get_config()
{
    // Calls a basic configuration interface function
    std::string tmp = get_cfg_string( CfgAttr::name() );

    // Converts to appropriate type, throws exception
    // on conversion failure
    return boost::lexical_cast<typename CfgAttr::value_type>(tmp);
}
Introducing run-time safety
● In order to protect the system against invalid external data we need to add boundary checks.
  – Use min_value(), max_value() from traits
  – Add a default_value() to handle missing data
● Additional features could include:
  – Throwing an exception, instead of defaulting, when data is missing.
Example #1: Extending the function
template <typename CfgAttr>
typename CfgAttr::value_type
get_config()
{
    try
    {
        std::string tmp = get_cfg_string( CfgAttr::name() );
        typedef typename CfgAttr::value_type vtype;
        vtype ret = boost::lexical_cast<vtype>(tmp);
        return CfgAttr::bounded(ret);
    }
    catch(boost::bad_lexical_cast const&)
    {
        return CfgAttr::default_value();
    }
}
Example #1: Updated traits
struct NetworkPort
{
    typedef uint16_t value_type;
    static value_type min_value() {return 1024;}
    static value_type max_value() {return 65535;}
    static value_type default_value() {return 4321;}

    // Clamps v_ into [min_value(), max_value()] in place
    static value_type& bounded(value_type& v_)
    {
        return v_ = std::max(min_value(), std::min(v_, max_value()));
    }
};
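The pieces above can be exercised together without Boost. The following is a sketch under stated assumptions, not the talk's implementation: `config_store()` is a hypothetical in-memory stand-in for the real configuration back end, `std::istringstream` replaces `boost::lexical_cast`, and clamping is done inside `get_config` on a wide integer so that out-of-range input such as 70000 is bounded rather than wrapped. It only makes sense for integral value types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical in-memory stand-in for the configuration back end.
inline std::map<std::string, std::string>& config_store()
{
    static std::map<std::string, std::string> store;
    return store;
}

// Returns the stored text, or an empty string when the key is missing.
inline std::string get_cfg_string(const char* name)
{
    auto it = config_store().find(name);
    return it == config_store().end() ? std::string() : it->second;
}

// Traits class of the shape a generator might emit.
struct NetworkPort
{
    typedef std::uint16_t value_type;
    static const char* name() { return "NetworkPort"; }
    static value_type min_value() { return 1024; }
    static value_type max_value() { return 65535; }
    static value_type default_value() { return 4321; }
};

template <typename CfgAttr>
typename CfgAttr::value_type get_config()
{
    typedef typename CfgAttr::value_type vtype;
    std::istringstream in(get_cfg_string(CfgAttr::name()));
    long long raw = 0;
    // Missing or non-numeric text falls back to the default.
    if (!(in >> raw))
        return CfgAttr::default_value();
    // Clamp in a wide type before narrowing to the attribute's type.
    const long long lo = CfgAttr::min_value();
    const long long hi = CfgAttr::max_value();
    return static_cast<vtype>(std::max(lo, std::min(raw, hi)));
}
```

Because the generic function depends only on the traits interface, it can be unit-tested with any traits class, independent of the product configuration.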
Capturing Metadata
● Various methods have been used for codifying metadata into a DSL
  – Text files
  – Graphical tools
  – CASE tools
● XML is a very convenient form for new projects
  – Semi-human readable
  – Text: unrestricted source-control
  – Easy to transform to other formats
● Includes non-code artefacts
  – Custom editor can be created in Python or Java
● XML can restrict flexibility of DSL.
Example #1: Configuration system
<ConfigSystem>
  <Attr name="NetworkPort" adt="uint16">
    <Description>Unrestricted port on which a service can be started</Description>
    <Min>1024</Min>
    <Max>65535</Max>
    <Default>4321</Default>
  </Attr>
</ConfigSystem>
Prefer ADTs
● Use abstract data types (ADTs)
● Use an XML lookup table to go from ADT to C++ type
● Underlying C++ representation can be changed without changing any of the metadata
Example #1: Simple Generator
<xsl:template match="Attr">
struct <xsl:value-of select="@name"/>
{
    typedef <xsl:apply-templates select="." mode="adt"/> value_type;
    static const char* name() {return "<xsl:value-of select="@name"/>";}
    static value_type min_value() {return <xsl:value-of select="Min/text()"/>;}
    static value_type max_value() {return <xsl:value-of select="Max/text()"/>;}
    static value_type default_value() {return <xsl:value-of select="Default/text()"/>;}
};
</xsl:template>
Example #1: Simple Generator (with xsl:text)
<xsl:template match="Attr">
  <xsl:text>struct </xsl:text>
  <xsl:value-of select="@name"/>
  <xsl:text> { typedef </xsl:text>
  <xsl:apply-templates select="." mode="adt"/>
  <xsl:text> value_type; </xsl:text>
  <xsl:text>static const char* name() {return "</xsl:text>
  <xsl:value-of select="@name"/>
  <xsl:text>";} </xsl:text>
  <xsl:text>static value_type min_value() {return </xsl:text>
  <xsl:value-of select="Min/text()"/>
  <xsl:text>;} </xsl:text>
  <xsl:text>static value_type max_value() {return </xsl:text>
  <xsl:value-of select="Max/text()"/>
  <xsl:text>;} </xsl:text>
  <xsl:text>static value_type default_value() {return </xsl:text>
  <xsl:value-of select="Default/text()"/>
  <xsl:text>;} }; </xsl:text>
</xsl:template>
ADT Lookup Table
<types>
  <type adt="uint16">
    <cpp type="uint16_t" quoted="no"/>
  </type>
  <type adt="string">
    <cpp type="std::string" quoted="yes"/>
  </type>
  <!--
    adt:         ADT name
    cpp/@type:   What type to use on a C++ system
    cpp/@quoted: Whether to quote the type in a traits class
  -->
</types>
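The talk's generator is written in XSLT; purely to make the data flow concrete, here is the same idea sketched in C++. Everything in it is an assumption made for illustration: the `CppType` and `Attr` structs, `adt_table()`, `literal()` and `generate_traits()` are hypothetical names, and the table is hard-coded rather than read from XML. It looks up the ADT, applies the quoting rule, and emits a traits class of the shape shown earlier.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// One row of the ADT lookup table: C++ spelling and quoting rule.
struct CppType
{
    std::string type;
    bool quoted;
};

// Hard-coded mirror of the <types> table (an assumption; a real
// generator would read it from the XML).
inline const std::map<std::string, CppType>& adt_table()
{
    static std::map<std::string, CppType> table;
    if (table.empty()) {
        table["uint16"] = CppType{"uint16_t", false};
        table["string"] = CppType{"std::string", true};
    }
    return table;
}

// Hypothetical in-memory form of one <Attr> element.
struct Attr
{
    std::string name, adt, min, max, def;
};

// Wraps the value in double quotes when the ADT asks for it.
inline std::string literal(const CppType& t, const std::string& v)
{
    return t.quoted ? "\"" + v + "\"" : v;
}

// Emits the text of a traits class for one attribute.
inline std::string generate_traits(const Attr& a)
{
    auto it = adt_table().find(a.adt);
    if (it == adt_table().end())
        throw std::invalid_argument("unknown ADT: " + a.adt);
    const CppType& t = it->second;
    std::ostringstream out;
    out << "struct " << a.name << "\n{\n"
        << "    typedef " << t.type << " value_type;\n"
        << "    static const char* name() {return \"" << a.name << "\";}\n"
        << "    static value_type min_value() {return " << literal(t, a.min) << ";}\n"
        << "    static value_type max_value() {return " << literal(t, a.max) << ";}\n"
        << "    static value_type default_value() {return " << literal(t, a.def) << ";}\n"
        << "};\n";
    return out.str();
}
```

Changing the `uint16` row to another C++ type changes every generated traits class without touching the attribute metadata, which is the reason for preferring ADTs.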
Generating code documentation
/** Unrestricted port on which a service can
 *  be started.
 *
 *  @ingroup Configuration
 */
struct NetworkPort
{
    // ... Generated traits
};
Example #2
● Logging and reporting are aspects of most systems that cut across the architecture.
● There might be many requirements in your system on how logging and reporting will be used.
  – Loggable entities
  – Levels of logging
  – User display issues
  – Localisation
● From a C++ point-of-view one important feature is how logging is generated at logging points
● Using a GP approach it is possible to introduce compile-time validation
Example #2: Legacy Logging
#define MINOR_FAILURE 10
#define MAJOR_PROBLEM 20
#define GENERAL_PANIC 30

void log_it( int id, const char* text );

// and then some cowboy programmer comes along
log_it( MINOR_FAILURE|GENERAL_PANIC, "Voila!! An unsupported error" );
Example #2: Logging Metadata
<Logging>
  <Report id="10" name="MINOR_FAILURE">
    <Text>The projector's bulb needs replacing</Text>
  </Report>
  <Report id="20" name="MAJOR_PROBLEM">
    <Text>We're out of Belgian beer</Text>
  </Report>
  <Report id="30" name="GENERAL_PANIC">
    <Text>David Beckham spotted outside Randolph</Text>
  </Report>
</Logging>
Example #2: Logging Function
template <typename Report>
void log_it( Report const&, const char* text );

log_it( 3, "My code" ); // Compile error

log_it( MAJOR_PROBLEM, "Out of German beer too" );

log_it( MINOR_FAILURE|MAJOR_PROBLEM, "No way" ); // Compile error
Example #2: Logging ID Class
// Define type
class Reportable
{
  public:
    explicit Reportable( unsigned id_ );
    unsigned id() const;
};

// then do either, initialising MINOR_FAILURE in .cpp
extern const Reportable MINOR_FAILURE;

// or
namespace { const Reportable MINOR_FAILURE = Reportable(10); }
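To show where the compile-time rejection comes from, here is a small self-contained sketch. It assumes, unlike the template declaration shown earlier, that `log_it` takes a `Reportable const&` directly; the member name `id_value` and the `unsigned` return value are additions made only so the behaviour can be checked. The `explicit` constructor is what stops a bare integer from converting, and the absence of an `operator|` for `Reportable` is what stops OR-ed identifiers.

```cpp
#include <cassert>
#include <type_traits>

class Reportable
{
  public:
    explicit Reportable(unsigned id_) : id_value(id_) {}
    unsigned id() const { return id_value; }
  private:
    unsigned id_value;  // name chosen for this sketch
};

namespace {
    const Reportable MINOR_FAILURE = Reportable(10);
    const Reportable MAJOR_PROBLEM = Reportable(20);
}

// Assumed signature: accepts only a Reportable. Returning the id is
// an addition for demonstration; the talk's function returns void.
inline unsigned log_it(Reportable const& r, const char* /*text*/)
{
    return r.id();
}

// The explicit constructor removes the implicit int -> Reportable
// conversion, so log_it(3, "...") has no viable overload.
static_assert(!std::is_convertible<int, Reportable>::value,
              "a bare integer must not convert to Reportable");
```

With these definitions `log_it(MINOR_FAILURE, "text")` compiles, while `log_it(3, "text")` and `log_it(MINOR_FAILURE|MAJOR_PROBLEM, "text")` do not.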
Preventing C++ Code-bloat
● Only instantiate what is needed
  – For constant objects this is very easy using the MPL-value idiom
● Due to the way some linkers work, concrete code might be included in a final link even if the code is not used, therefore only generate what is needed
  – Control the config elements available to a specific system from metadata
  – Only generate the appropriate traits classes
● Cleanly separate common concrete code into a mixin class
The MPL-value idiom
template <unsigned V>
class A
{
  public:
    static const A<V> value;

  private:
    A();
};

// Out-of-class definition; 'static' is not repeated here
template <unsigned V>
const A<V> A<V>::value;
Logging Reworked
template <unsigned id_>
class Reportable
{
  public:
    unsigned id() const {return id_;}
    static const Reportable<id_> value;
};

// The definition needs its own template header
template <unsigned id_>
const Reportable<id_> Reportable<id_>::value;

typedef Reportable<10> MINOR_FAILURE;

log_it( MINOR_FAILURE::value, "Only instantiated when used" );
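A compilable sketch of the reworked logging follows. The `log_it` overload here is an assumption: it is written to deduce the id from `Reportable<id_>` so that only such objects are accepted, and it returns the id purely so the result can be checked. The static member is value-initialised explicitly, which avoids relying on default-initialisation of a const object.

```cpp
#include <cassert>

template <unsigned id_>
class Reportable
{
  public:
    unsigned id() const { return id_; }
    static const Reportable<id_> value;
};

// One definition per id, instantiated only when 'value' is used.
template <unsigned id_>
const Reportable<id_> Reportable<id_>::value = Reportable<id_>();

// Typedefs of the kind a generator would emit from the metadata.
typedef Reportable<10> MINOR_FAILURE;
typedef Reportable<20> MAJOR_PROBLEM;

// Assumed signature: the id is deduced from the argument type, so an
// int or an OR-ed expression cannot match. The return value is an
// addition for demonstration.
template <unsigned id_>
unsigned log_it(Reportable<id_> const& r, const char* /*text*/)
{
    return r.id();
}
```

A report that is never passed to `log_it` never has its `value` member instantiated, which is the code-bloat argument made above.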
Adding logging actions
● A user might want to specify that some reports can have certain associated actions.
● For the logging example we might have
  – GO_BUY
  – MAKE_ANNOUNCEMENT
  – CALL_SECURITY
● As this is configuration knowledge we can add this to the metadata and then generate appropriate metacode.
Example #2: Logging Metadata
<Logging>
  <Report id="10" name="MINOR_FAILURE">
    <Action>GO_BUY</Action>
  </Report>
  <Report id="20" name="MAJOR_PROBLEM">
    <Action>GO_BUY</Action>
    <Action>MAKE_ANNOUNCEMENT</Action>
  </Report>
  <Report id="30" name="GENERAL_PANIC">
    <Action>CALL_SECURITY</Action>
    <Action>MAKE_ANNOUNCEMENT</Action>
  </Report>
</Logging>
Using MPL as glue
template <unsigned id_, typename actions_list>
struct Reportable
{
    BOOST_STATIC_CONSTANT(unsigned, id = id_);
    typedef actions_list valid_actions;
};

// Generated code
typedef Reportable<20,
    boost::mpl::vector<GO_BUY, MAKE_ANNOUNCEMENT> > MAJOR_PROBLEM;
Using SFINAE as validator
template <typename Report, typename Action>
void log_it( const char* text,
    typename boost::enable_if<
        boost::mpl::contains<
            typename Report::valid_actions,
            Action
        >
    >::type* _ = 0 );

// Fails to compile – no suitable function
log_it<GENERAL_PANIC,GO_BUY>("Bought Posh a drink");

// OK
log_it<MAJOR_PROBLEM,GO_BUY>("Imported some Hoegaarden");
Using static assertion as validator
template <typename Report, typename Action>
void log_it( const char* text )
{
    BOOST_MPL_ASSERT((
        boost::mpl::contains<
            typename Report::valid_actions,
            Action
        >
    ));

    // implementation follows after MPL assert ...
}
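The same validation can be sketched without Boost, using C++11 variadic templates and `static_assert` in place of `boost::mpl::vector`, `boost::mpl::contains` and `BOOST_MPL_ASSERT`. This is a substitute technique, not the talk's code: `type_list` and `contains` are hypothetical helpers written for this example, the action tags are empty structs, and `log_it` returns the report id only so that the result can be checked.

```cpp
#include <cassert>
#include <type_traits>

// Minimal stand-in for boost::mpl::vector.
template <typename... Ts> struct type_list {};

// Minimal stand-in for boost::mpl::contains.
template <typename T, typename List> struct contains;

template <typename T>
struct contains<T, type_list<> > : std::false_type {};

template <typename T, typename Head, typename... Tail>
struct contains<T, type_list<Head, Tail...> >
    : std::integral_constant<bool,
          std::is_same<T, Head>::value ||
          contains<T, type_list<Tail...> >::value> {};

// Action tags (assumed to be empty types).
struct GO_BUY {};
struct MAKE_ANNOUNCEMENT {};
struct CALL_SECURITY {};

template <unsigned id_, typename actions_list>
struct Reportable
{
    static const unsigned id = id_;
    typedef actions_list valid_actions;
};

// Generated from the logging metadata.
typedef Reportable<20, type_list<GO_BUY, MAKE_ANNOUNCEMENT> > MAJOR_PROBLEM;
typedef Reportable<30, type_list<CALL_SECURITY, MAKE_ANNOUNCEMENT> > GENERAL_PANIC;

template <typename Report, typename Action>
unsigned log_it(const char* /*text*/)
{
    static_assert(contains<Action, typename Report::valid_actions>::value,
                  "this action is not valid for this report");
    return Report::id;  // returned for demonstration only
}
```

Here `log_it<MAJOR_PROBLEM, GO_BUY>("...")` compiles, whereas `log_it<GENERAL_PANIC, GO_BUY>("...")` stops at the `static_assert` with a readable message.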
Techniques for C
Strategies for C
● C has nowhere near the power and flexibility of C++, but basic principles remain the same
  – Configure generic components using generated macros
  – Aim for zero cyclomatic-complexity in the macros
● Use indirect naming in order to attempt a bit of compile-time safety
  – Hide void* and varargs from programmer
● Use types as a strategy for compile-time validation
  – C does not place parameter types in symbols
  – Type manipulation will not create excessive symbol tables
Example #3: C Configuration system
/* Public interface via macros */
#define CONFIG_TYPE( CFGATTR ) .....
#define CONFIG_TOKEN( CFGATTR,EXTRA ) ....
#define CONFIG_GET( CFGATTR,VAR ) ....

CONFIG_TYPE(NetworkPort) v;
printf( "The network port we'll use is "
        CONFIG_TOKEN(NetworkPort,"") "\n",
        * CONFIG_GET(NetworkPort,v)
);
Example #3: Implementation
void* config_get__(
    char const* pName_,
    size_t varsize_,
    void * var_,
    ...
);

Pass in order:
• Function for setting default
• Pointer to default value
• Function for setting bounds
• Pointer to minimum value
• Pointer to maximum value
Example #3: Generic Macros
#define CONFIG_TYPE( CFGATTR ) \
    AUTOGENCFG_TYPE_##CFGATTR

#define CONFIG_TOKEN( CFGATTR,EXTRA ) \
    "%" EXTRA AUTOGENCFG_TOKEN_##CFGATTR

#define CONFIG_GET( CFGATTR,VAR ) \
    ( CONFIG_TYPE(CFGATTR) *) config_get__( \
        AUTOGENCFG_NAME_##CFGATTR, \
        sizeof( CONFIG_TYPE(CFGATTR) ), \
        & VAR, \
        AUTOGENCFG_DEFAULTF_##CFGATTR, \
        AUTOGENCFG_DEFAULT_##CFGATTR, \
        AUTOGENCFG_BOUNDF_##CFGATTR, \
        AUTOGENCFG_MIN_##CFGATTR, \
        AUTOGENCFG_MAX_##CFGATTR )

Hide implementation inside macro
Example #3: Generated Macros
#define AUTOGENCFG_TYPE_NetworkPort uint16_t
#define AUTOGENCFG_TOKEN_NetworkPort "hu"
#define AUTOGENCFG_NAME_NetworkPort \
    "sys.network.port"
#define AUTOGENCFG_DEFAULTF_NetworkPort \
    &config_set_default__
#define AUTOGENCFG_DEFAULT_NetworkPort 1234
#define AUTOGENCFG_BOUNDF_NetworkPort \
    &config_bound_ushort__
#define AUTOGENCFG_MIN_NetworkPort 1024
#define AUTOGENCFG_MAX_NetworkPort 65535
Updated ADT Lookup Table
<types>
  <type adt="uint16">
    <cpp type="uint16_t" quoted="no"/>
    <c type="uint16_t" quoted="no"/>
  </type>
  <type adt="string">
    <cpp type="std::string" quoted="yes"/>
    <c type="const char*" quoted="yes"/>
  </type>
  <!--
    c/@type:   What type to use on a C system
    c/@quoted: Whether to quote the type in a macro definition
  -->
</types>
Example #4: C Logging System
/* Public interface via macro */
#define LOG_IT( LOGNAME,ACTION,TEXT ) ....
// Valid rule
LOG_IT( MAJOR_PROBLEM, GO_BUY, "Bought some Hoegaarden" );

// Invalid combination – fails to compile
LOG_IT( MAJOR_PROBLEM, CALL_SECURITY, "Bought some Hoegaarden" );
Example #4: Generated C
/* Events and actions as macros */
#define AUTOGENLOG_ID_MINOR_FAILURE 10
#define AUTOGENLOG_ID_MAJOR_PROBLEM 20
#define AUTOGENLOG_ID_GENERAL_PANIC 30

#define AUTOGENLOG_ACTION_GO_BUY 1
#define AUTOGENLOG_ACTION_CALL_SECURITY 2
#define AUTOGENLOG_ACTION_MAKE_ANNOUNCEMENT 3

/* Valid combinations as union */
union AUTOGENLOG_ACTIONS_MAJOR_PROBLEM {
    unsigned long GO_BUY;
    unsigned long MAKE_ANNOUNCEMENT;
};
Example #4: C Implementation
/* Implementation function
   Internals left as exercise for the reader
*/
void log_it__( unsigned long id,
               unsigned long action,
               const char* text );

#define LOG_IT( LOGNAME,ACTION,TEXT ) do { \
    auto union AUTOGENLOG_ACTIONS_##LOGNAME x; \
    x. ACTION = AUTOGENLOG_ACTION_##ACTION; \
    log_it__( AUTOGENLOG_ID_##LOGNAME,x. ACTION,TEXT); \
} while(0)

Use union member to perform compile-time validation
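The union trick can be tried end to end with a stubbed implementation function. The sketch below stays within the common subset of C and C++ and is compiled here as C++, which is why the `auto` storage specifier is dropped. The body of `log_it__` and the `last_id` / `last_action` variables are hypothetical, added only so the call can be observed.

```cpp
#include <cassert>

/* Generated: ids, actions and the valid-action union. */
#define AUTOGENLOG_ID_MAJOR_PROBLEM 20
#define AUTOGENLOG_ACTION_GO_BUY 1
#define AUTOGENLOG_ACTION_CALL_SECURITY 2
#define AUTOGENLOG_ACTION_MAKE_ANNOUNCEMENT 3

union AUTOGENLOG_ACTIONS_MAJOR_PROBLEM {
    unsigned long GO_BUY;
    unsigned long MAKE_ANNOUNCEMENT;
};

/* Hypothetical stub: records the last call so it can be inspected. */
static unsigned long last_id = 0;
static unsigned long last_action = 0;

static void log_it__(unsigned long id, unsigned long action,
                     const char* text)
{
    (void)text;
    last_id = id;
    last_action = action;
}

/* x.ACTION names a union member; if the action is not listed for
   this report, the member does not exist and compilation fails. */
#define LOG_IT(LOGNAME, ACTION, TEXT) do { \
    union AUTOGENLOG_ACTIONS_##LOGNAME x; \
    x.ACTION = AUTOGENLOG_ACTION_##ACTION; \
    log_it__(AUTOGENLOG_ID_##LOGNAME, x.ACTION, TEXT); \
} while (0)
```

`LOG_IT(MAJOR_PROBLEM, GO_BUY, "...")` expands to an assignment to `x.GO_BUY`, which exists; `LOG_IT(MAJOR_PROBLEM, CALL_SECURITY, "...")` would reference `x.CALL_SECURITY`, which does not, so the compiler rejects it.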
Preventing C Code-bloat
● Keep generated code in macros and type definitions where possible.
● Keep executable code inside macros simplistic and to a minimum
● Rely on compiler optimisation where duplication cannot be avoided
● Cleanly separate common code into testable functions
Techniques for Scripting / Dynamic Languages
Dispelling Myths about Reflection
● Myth #1: I don't need GP because language X has reflection
● Myth #2: I don't need reflection because I am using GP
● Fact: GP maps the domain knowledge, captured in a non-programming language, into a programming language.
  – If reflection is the most effective way of doing this in language X, then it should be used.
Problems with Build-time Validation
● Early validation at build-time is not always trivial
● Perl can sometimes use the -c switch
● JavaScript is difficult
● Sometimes validation has to be pushed out to unit tests
  – Should never require system testing to provide the validation
Example #5: JavaScript Configuration
// For writing and reading configuration files

function get_config( CFGNAME );
function get_config_text( CFGNAME );
function set_config( CFGNAME, new_value );

document.write(
    get_config_text(NetworkPort),
    ": ",
    get_config(NetworkPort)
);

Will display text description from metadata (suitably localised)
Example #5: Generated JS Traits
// Autogenerated 'traits'
var NetworkPort = new Object;
NetworkPort.min = 1024;
NetworkPort.max = 65535;
NetworkPort.defaultvalue = 1234;
NetworkPort.path = "subsystem.network.port";

// Validate functions are first-class variables
NetworkPort.validate = validate_integral;

// Localised text setting; descr must exist before .en is assigned
NetworkPort.descr = new Object;
NetworkPort.descr.en = "Unrestricted port on which a service can be started";
Example #5: JS Config Read
// Reads a variable from a path in a config file
// Implementation is system-dependent
function get_from_config_file( Path ) { /* ... */ }

function get_config( CFGNAME )
{
    var tmp = get_from_config_file( CFGNAME.path );
    if( tmp == null )
        return CFGNAME.defaultvalue;
    else
        return tmp;
}
Example #5: JS Config Write
// Writes a new value to the path in a config file
// Implementation is system-dependent
function set_in_config_file( Path, Value ) { /* .... */ }

function set_config( CFGNAME, new_value )
{
    if( typeof (new_value) != typeof (CFGNAME.defaultvalue) )
        throw "Invalid Type Applied";
    CFGNAME.validate( new_value, CFGNAME.min, CFGNAME.max );
    set_in_config_file( CFGNAME.path, new_value );
}
Updated ADT Lookup Table
<types>
  <type adt="uint16">
    ...
    <js type="Number" quoted="no"/>
  </type>
  <type adt="string">
    ...
    <js type="String" quoted="yes"/>
  </type>
  <!--
    js/@type:   What type to use in JavaScript
    js/@quoted: Whether to quote the type initialisation
  -->
</types>
Updated ADT Lookup Table (no types)
<types>
  <type adt="uint16">
    ...
    <js quoted="no"/>
  </type>
  <type adt="string">
    ...
    <js quoted="yes"/>
  </type>
  <!--
    js/@quoted: Whether to quote the type initialisation
  -->
</types>
The Bigger Picture
Multiple Systems
● The examples so far have shown the GP steps for a configuration system and a logging system.
● The next step is to apply these to three systems:
  – System 1 uses XML files for configuration and sends logs to syslog.
  – System 2 uses INI files and sends logs to the NT event log.
  – System 3 keeps configuration in a binary format (read-only) and sends logs via SNMP.
Example Product Metadata
<Products>
  <System id="1">
    <Config type="xml"/>
    <Logging type="syslog"/>
    <Functionality> ... </Functionality>
  </System>
  <System id="2">
    <Config type="ini"/>
    <Logging type="ntevlog"/>
    <Functionality> ... </Functionality>
  </System>
</Products>
Building Multiple Systems
● Four generators can be applied to this product metadata.
  – Two of them we have already seen
  – These will generate configurations and logging aspects
● Another generator looks at logging and configurations and adds the appropriate subsystems.
● A fourth generator looks at the functionality and loads up all of the functional classes for the system
  – A creative exercise for the reader …
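One possible shape for the output of the third generator, sketched under assumptions: each subsystem choice becomes a policy class and each product becomes a typedef that plugs the policies together. The names `XmlConfig`, `IniConfig`, `SyslogLogging`, `NtEvlogLogging`, `System` and `describe()` are all hypothetical; the talk does not show this code.

```cpp
#include <cassert>
#include <string>

// Hypothetical policy classes, one per subsystem implementation.
struct XmlConfig      { static const char* type() { return "xml"; } };
struct IniConfig      { static const char* type() { return "ini"; } };
struct SyslogLogging  { static const char* type() { return "syslog"; } };
struct NtEvlogLogging { static const char* type() { return "ntevlog"; } };

// Generic component: knows nothing about any particular product.
template <typename ConfigBackend, typename LoggingBackend>
struct System
{
    typedef ConfigBackend  config_type;
    typedef LoggingBackend logging_type;

    // Reports which subsystems were selected for this product.
    static std::string describe()
    {
        return std::string(ConfigBackend::type()) + "+" +
               LoggingBackend::type();
    }
};

// What a generator might emit from the <Products> metadata:
// one branch-free typedef per <System> element.
typedef System<XmlConfig, SyslogLogging>  System1;
typedef System<IniConfig, NtEvlogLogging> System2;
```

The generated part is only the two typedefs, which keeps the generated artefacts free of logic and leaves `System` and the policy classes testable on their own.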
Integration
● Many modern systems are multi-language / multi-platform
● These techniques extend easily into other programming languages / development environments
● The same configuration knowledge remains the driver
● Localisation data can be generated in various formats
● Parts of technical documents can also be generated.
Bad smells
● The unit tests are generated
  – How can you verify that the generated tests are correct?
  – Generating test inputs & outputs is acceptable
● There is business logic in the generators
  – The DSL is probably incorrect
● The generated code is edited before usage
  – Artefacts are not templates
● Every build takes very long
  – Dependency checking must be improved
Further Reading
● www.program-transformation.org
● www.generative-programming.org
● www.codegeneration.net
● research.microsoft.com
● www.martinfowler.com