Also in this issue:
Joint ERCIM Actions:
ERCIM 25 Years Celebration
Keynote:
The Future of ICT: Blended Life
by Willem Jonker, CEO EIT ICT Labs
Research and Innovation:
Learning from Neuroscience to
Improve Internet Security
ERCIM NEWS
www.ercim.eu
Number 99 October 2014
Special theme
Software Quality
25 years ERCIM:
Challenges for ICST
ERCIM NEWS 99 October 2014
ERCIM News is the magazine of ERCIM. Published quarterly, it reports on joint actions of the ERCIM partners, and aims to reflect the contribution made by ERCIM to the European Community in Information Technology and Applied Mathematics. Through short articles and news items, it provides a forum for the exchange of information between the institutes and also with the wider scientific community. This issue has a circulation of about 6,000 printed copies.
Next issue: January 2015, Special theme: Scientific Data Sharing
Cover image: an Intel processor wafer.
Photo: Intel Corporation.
Editorial Information
JOINT ERCIM ACTIONS
4 ERCIM “Alain Bensoussan” Fellowship Programme
4 ERCIM 25 Years Celebration
25 YEARS ERCIM: FUTURE CHALLENGES FOR ICST
6 Intermediation Platforms, an Economic Revolution
by Stéphane Grumbach
8 Enabling Future Smart Energy Systems
by Stefan Dulman and Eric Pauwels
9 Will the IT Revolution Cost Our Children Their Jobs?
by Harry Rudin
11 The Next Boom of Big Data in Biology:
Multicellular Datasets
by Roeland M.H. Merks
13 Looking Towards a Future where Software is
Controlled by the Public (and not the other way round)
by Magiel Bruntink and Jurgen Vinju
14 Scaling Future Software: The Manycore Challenge
by Frank S. de Boer, Einar Broch Johnsen, Dave Clarke, Sophia Drossopoulou, Nobuko Yoshida and Tobias Wrigstad
KEYNOTE
5 The Future of ICT: Blended Life
by Willem Jonker, CEO EIT ICT Labs
SPECIAL THEME
The special theme section “Software Quality” has been
coordinated by Jurgen Vinju, CWI and Anthony Cleve,
University of Namur.
Introduction to the Special Theme
16 Software Quality
by Jurgen Vinju and Anthony Cleve, guest editors for the special theme section
17 Monitoring Software Quality at Large Scale
by Eric Bouwers, Per John and Joost Visser
18 OSSMETER: A Health Monitoring System for OSS
Projects
by Nicholas Matragkas, James Williams and Dimitris Kolovos
19 Monitoring Services Quality in the Cloud
by Miguel Zuñiga-Prieto, Priscila Cedillo, Javier Gonzalez-Huerta, Emilio Insfran and Silvia Abrahão
Contents
20 Dictō: Keeping Software Architecture Under Control
by Andrea Caracciolo, Mircea Filip Lungu and Oscar Nierstrasz
22 Dedicated Software Analysis Tools
by Nicolas Anquetil, Stéphane Ducasse and Usman Bhatti
23 Mining Open Software Repositories
by Jesús Alonso Abad, Carlos López Nozal and Jesús M. Maudes Raedo
25 A Refactoring Suggestion Tool for Removing Clones
in Java Code
by Francesca Arcelli Fontana, Marco Zanoni and Francesco Zanoni
26 Debugging with the Crowd: A Debug
Recommendation System Based on StackOverflow
by Martin Monperrus, Anthony Maia, Romain Rouvoy and Lionel Seinturier
27 RiVal: A New Benchmarking Toolkit for
Recommender Systems
by Alan Said and Alejandro Bellogín
29 Evaluating the Quality of Software Models using
Light-weight Formal Methods
by Jordi Cabot and Robert Clarisó
31 KandISTI: A Family of Model Checkers for the
Analysis of Software Designs
by Maurice ter Beek, Stefania Gnesi and Franco Mazzanti
33 QVTo Model Transformations: Assessing and
Improving their Quality
by Christine M. Gerpheide, Ramon R.H. Schiffelers and Alexander Serebrenik
34 Redundancy in the Software Design Process is
Essential for Designing Correct Software
by Mark G.J. van den Brand and Jan Friso Groote
35 Estimating the Costs of Poor Quality Software: the
ICEBERG Project
by Luis Fernández, Pasqualina Potena and Daniele Rosso
37 Software Quality in an Increasingly Agile World
by Benoît Vanderose, Hajer Ayed and Naji Habra
38 Improving Small-to-Medium sized Enterprise
Maturity in Software Development through the Use
of ISO 29110
by Jean-Christophe Deprez, Christophe Ponsard and Dimitri Durieux
39 Software Product Quality Evaluation Using
ISO/IEC 25000
by Moisés Rodríguez and Mario Piattini
40 High-Level Protocol Engineering without
Performance Penalty for Multi-Core
by Farhad Arbab, Sung-Shik Jongmans and Frank de Boer
EVENTS, IN BRIEF
Announcements
54 CLOUDFLOW First Open Call for Application
Experiments
54 7th International Conference of the ERCIM WG
on Computational and Methodological Statistics
54 Research Data Alliance and Global Data and
Computing e-Infrastructure challenges
In Brief
55 Altruism in Game Theory
55 World’s First Patient Treated by Full 3-D Image
Guided Proton Therapy
55 Essay Contest Prize for Tommaso Bolognesi
RESEARCH AND INNOVATION
This section features news about research activities and
innovative developments from European research institutes
42 InterpreterGlove - An Assistive Tool that can Speak
for the Deaf and Deaf-Mute
by Péter Mátételki and László Kovács
44 The OFSE-Grid: A Highly Available and Fault
Tolerant Communication Infrastructure based on
Openflow
by Thomas Pfeiffenberger, Jia Lei Du and Pedro Bittencourt Arruda
46 Learning from Neuroscience to Improve Internet
Security
by Claude Castelluccia, Markus Duermuth and Fatma Imamoglu
47 Mathematics Saves Lives: The Proactive Planning of
Ambulance Services
by Rob van der Mei and Thije van Barneveld
48 An IoT-based Information System Framework
towards Organization Agnostic Logistics: The
Library Case
by John Gialelis and Dimitrios Karadimas
49 Lost Container Detection System
by Massimo Cossentino, Marco Bordin, Ignazio Infantino, Carmelo Lodato, Salvatore Lopes, Patrizia Ribino and Riccardo Rizzo
51 Smart Buildings: An Energy Saving and Control
System in the CNR Research Area, Pisa
by Paolo Barsocchi, Antonino Crivello, Erina Ferro, Luigi Fortunati, Fabio Mavilia and Giancarlo Riolo
52 T-TRANS: Benchmarking Open Innovation
Platforms and Networks
by Isidoros A. Passas, Nicos Komninos and Maria Schina
Joint ERCIM Actions
ERCIM “Alain Bensoussan” Fellowship Programme
ERCIM offers fellowships for PhD
holders from all over the world.
Topics cover most disciplines in Computer Science, Information Technology, and Applied Mathematics.
Fellowships are of 12-month duration, spent in one ERCIM member institute. Fellowships are proposed according to the needs of the member institutes and the available funding.
Conditions
Applicants must:
• have obtained a PhD degree during the last 8 years (prior to the application deadline) or be in the last year of the thesis work with an outstanding academic record
• be fluent in English
• be discharged or get deferment from military service
• have completed the PhD before starting the grant.
In order to encourage mobility:
• a member institute will not be eligible to host a candidate of the same nationality.
• a candidate cannot be hosted by a member institute if, by the start of the fellowship, he or she has already been working for this institute (including PhD or postdoc studies) for a total of 6 months or more during the last 3 years.
Application deadlines
30 April and 30 September
More information and application form:
http://fellowship.ercim.eu/
ERCIM 25 Years Celebration
The 25th ERCIM anniversary and the ERCIM fall meetings will be
held at the CNR Campus in Pisa on 23-24 October 2014.
On the occasion of ERCIM’s 25th anniversary, a special session and panel discussion will be held on Thursday 23 October in the afternoon in the auditorium of the CNR Campus. Speakers and representatives from research, industry, the European Commission, and the ERCIM community will present their views on research and future developments in information and communication science and technology:
Programme
14:00 - 16:45
• Welcome address by Domenico Laforenza, President of ERCIM AISBL (CNR)
• Alberto Sangiovanni Vincentelli, Professor, Department of Electrical Engineering and Computer Sciences, University of California at Berkeley: “Let’s get physical: marrying computing with the physical world”
• Carlo Ratti, Director, Senseable Lab, MIT, USA: “The Senseable City”
• Alain Bensoussan, International Center for Decision and Risk Analysis, School of Management, The University of Texas at Dallas, ERCIM co-founder: “Big data and big expectations: Is a successful matching possible?”
• Rigo Wenning, W3C’s legal counsel, technical coordinator of the EU STREWS project on Web security: presentation of the ERCIM White Paper “Security and Privacy Research Trends and Challenges”
• Fosca Giannotti, senior researcher at ISTI-CNR, head of the KDDLab: presentation of the ERCIM White Paper “Big Data Analytics: Towards a European Research Agenda”
• Radu Mateescu, senior researcher at Inria Grenoble - Rhône-Alpes, head of the CONVECS team: “Two Decades of Formal Methods for Industrial Critical Systems”
• Emanuele Salerno, senior researcher at ISTI-CNR: “MUSCLE: from sensing to understanding”
16:45 - 17:15 Coffee break
17:15 - 18:45
Panel: “ICT Research in Europe: How to reinforce the cooperation between the main actors and stakeholders”. Panelists:
• Carlo Ghezzi (Moderator), President of Informatics Europe (Politecnico di Milano, Italy)
• Domenico Laforenza, President of ERCIM AISBL (CNR)
• Fabio Pianesi, Research Director, EIT ICT Labs, European Institute of Innovation and Technology
• Fabrizio Gagliardi, President of ACM Europe
• Jean-Pierre Bourguignon, President of the European Research Council (ERC)
• Paola Inverardi, ICT Italian Delegate, Rector of the University of L’Aquila, Italy
• Thomas Skordas, Head of the FET Flagships Unit, European Commission
Today we live a blended life. This blended life is a direct consequence of the deep penetration of Information and Communication Technology (ICT) into almost every area of our society. ICT brings ubiquitous connectivity and information access that enables disruptive innovative solutions to address societal megatrends such as demographic changes, urbanization, increased mobility and scarcity of natural resources. This leads to a blended life in the sense that the physical and virtual worlds are merging into one, where physical encounters with friends and family are seamlessly integrated with virtual encounters on social networks. A blended life in the sense that work and private life can be combined in a way that offers the flexibility to work at any time from any location. A blended life combining work and life-long education, facilitated by distance learning platforms that offer us a personalized path to achieving our life and career goals. Industries experience a blended life owing to the deep embedding of ICT into their production methods, products and services. Customers experience a blended life where ICT allows industries to include consumers in production, blending them into ‘prosumers’.
Blended life is becoming a reality and as such it brings both opportunities and challenges. On the one hand, it allows us to maintain better contact with people we care about, yet at the same time it is accompanied by a level of transparency that raises privacy concerns. The blending of private life and work has the clear advantage of combining private and professional obligations, but at the same time introduces the challenge of maintaining a work-life balance. The blending of products and services leads to personalization of offerings but, at the same time, the huge range of choice can be confusing for consumers. Blended production leads to shorter supply chains and cost-effective production, yet disrupts existing business models, resulting in considerable social impact.
Key drivers in the development of ICT itself include future network technology (such as 5G, the Internet of Things and sensor networks) at the communication layer and Cloud Computing at the information-processing layer (such as Software as a Service and Big Data processing and analytics). The main challenge here is to deal with the huge amounts of heterogeneous data, from both a communication and an information-processing perspective.
When it comes to the application of ICT in various domains, we see huge disruptions occurring both now and in the future in domains such as social networks, healthcare, energy, production, urban life, and mobility. Here the main challenge is to find a blending that simultaneously drives economic growth and quality of life. There are many domain-specific technical challenges, such as sensor technology for continuous health monitoring, cyber-physical systems for the industrial Internet, 3D-printing, smart grids for energy supply, and tracking and tracing solutions for mobility. Social, economic, and legal challenges are key to successful innovation in this area.
The issue of privacy is a prime example. The domains mentioned above are highly sensitive in regard to privacy. ICT that allows instant proliferation of information and continuous monitoring of behaviour can be perceived as a personal infringement; as a result, several innovations have been slowed down, blocked or even reversed.
Innovations addressing societal challenges should involve social, economic, technical and legal specialists right from the inception, to map out the potential issues in a multidisciplinary way. This ensures a proper embedding into society and prevents these issues from becoming innovation blockers.
This approach is at the heart of EIT ICT Labs (www.eitictlabs.eu), a leading European player in ICT Innovation and Education supported by the European Institute of Innovation and Technology. Its mission is to foster entrepreneurial talent and innovative technology for economic growth and quality of life. EIT ICT Labs is active in many of the core ICT developments as well as the embedding of ICT in the above-mentioned domains. Education is an integral part of the EIT ICT Labs approach, since human capital is considered essential in bringing ICT innovations to life.
Today we live a blended life. At the same time, this blended life is only just beginning. Rapid developments in ICT will further drive the penetration of ICT into almost all areas of society, leading to many disruptions. The key challenge ahead is to make sure that this blended life combines economic growth with high quality of life, which can only be achieved via a multidisciplinary innovation approach.
Willem Jonker, CEO EIT ICT Labs
The first intermediation platforms that were deployed at a very large scale were search engines. Introduced in the late 1990s, the primary purpose of search engines is to connect people with the information they are looking for; meanwhile, their business model relies on their secondary service, which is effectively targeting ads to users. Search engines rely on very complex algorithms to rank the data and provide relevant answers to queries. In the last decade, intermediation platforms have successfully penetrated an increasing number of sectors, mostly in the social arena.
All intermediation platforms essentially rely on the same architecture. To begin with, they collect huge amounts of data, which can come from the outside world (e.g., web pages for search engines) or be hosted by the platform (e.g., social networks). However, these data are never produced by the platform itself but rather by the people, services or things around it. These primary data are then indexed and transformed to extract information that fuels the primary services offered.
The activities of users on platforms generate secondary data. This secondary data essentially consists of traces, to which the platform generally has exclusive rights, and allows the platform to create secondary services. A key example of this is the precise profiling of users, which permits personalised and customised services: personal assistants trace users as they go about their day-to-day activities, not only online but also in the physical world, through the use of geo-localization or quantified-self means.
Beyond personal services, platforms also generate data that can be of tremendous importance for virtually anything, for example, the information provided by search trends on search engines. Interestingly, it is hard to predict what uses this data might have, and surprising applications of big data are emerging in many sectors. For instance, in the US, road traffic data from Inrix could reveal economic fluctuations before government services, much like Google Flu was ahead of the Centers for Disease Control. The externalities of big data, both positive and negative, need to be thoroughly considered, and this is the goal of the European project BYTE [2], launched this year.
Platforms create ecosystems in which both users and economic players take a role. Platforms follow two rules: (i) they perform a gatekeeper role, acting as intermediaries for the other services their users require and removing the need for other middlemen; and (ii) they facilitate the easy development of services on their API for economic players. These two rules are fundamental to ensuring their capacity to collect data: this data then fuels all their services.
Platforms now have important economic power that rivals energy corporations. They offer services which have become essential utilities and are indispensable, not only for the general public but for corporations as well. The latter group has come to rely on the services of platforms to facilitate customer relations and other fundamental aspects of their businesses. Like other essential utilities, such as water, energy and telecommunications, platforms provide service continuity, are non-discriminatory and can adapt to changes. Their business model is two-sided, with users on one side
25 Years ERCIM: Challenges for ICST
Intermediation Platforms,
an Economic Revolution
by Stéphane Grumbach
Intermediation platforms connect people, services and even things in ways that have been unthinkable until now. Search engines provide relevant references for people searching for information. Social networks connect users in their environment. Car pooling systems link drivers and passengers using the same routes. Intermediation platforms use big data to fuel the services they offer, and these services are evolving extremely quickly but almost unnoticed. They are already competing with the oil industry among the world’s top market capitalisations and are on the verge of revolutionising the world in which we live.
Intermediation platforms can connect people, services or things that share common or complementary interests and would benefit from knowing one another. It can be either users that seek out such connections or the platform itself which takes the initiative to suggest connections. The relevance of the intermediation, which relies on sophisticated algorithms and agile business models, ensures the success of the services provided by the platforms [1].
Information and Communication Science and Technology (ICST) and Applied Mathematics are playing an increasing role in helping to find innovative solutions to today’s economic, societal and scientific challenges. ERCIM member institutes are at the forefront of European research in ICST and Applied Mathematics. Over the years, ERCIM News has witnessed the significant advances made by scientists in this field and in the related application areas. On the occasion of ERCIM’s 25th anniversary, we take a glance into the future with the following selection of articles, providing insight into just some of the multitude of challenges that our field is facing:
Economic/societal
• Intermediation Platforms, an Economic Revolution
by Stéphane Grumbach
• Enabling Future Smart Energy Systems
by Stefan Dulman and Eric Pauwels
• Will the IT Revolution Cost Our Children Their Jobs?
by Harry Rudin
Science
• The Next Boom of Big Data in Biology: Multicellular Datasets
by Roeland M.H. Merks
Software
• Looking Towards a Future where Software is Controlled by the Public (and not the other way round)
by Magiel Bruntink and Jurgen Vinju
• Scaling Future Software: The Manycore Challenge
by Frank S. de Boer et al.
Future Challenges for ICST
Users on one side receive free access to services; on the other side, clients (who at this stage mostly consist of advertisers) can access special services offered to them.
In addition to offering brand new services, platforms are also disrupting [3] existing services by changing the rules of the game. By giving users easy access to a wide choice of possible services, platforms empower consumers. At the same time, they weaken traditional service providers. For instance, in the press, platforms allow users to access content from multiple sources. Such integrators might rely completely on algorithms and thus bypass in-house journalists. In addition, by taking into account their reading habits or declared interests, platforms can offer a customised experience for their readers. This revolution is progressively taking place in all aspects of the content industry, including publishing, mass media, etc. This level of influence is not unique: there is hardly a sector that involves people, services or things that isn’t or won’t be affected by platforms. Platforms are abolishing the distinction between service providers and consumers. Currently, the transportation and lodging sectors are experiencing serious impacts. This will soon be the case for media channels. States often react by trying to protect the suffering economic sectors.
Platforms empower people, but weaken establishments. By taking control of an increasing number of services, they also weaken States by disrupting the traditional economy and generating revenue, for both users and the platform itself, that escapes taxation. To respond to this, tax systems must be reinvented. However, platforms also hold fantastic potential for meeting one of society’s most important challenges: the more frugal use of resources.
Platforms are on the verge of redesigning the world as we know it. They not only introduce incredible promise, but also considerable challenges. To date, no such platforms have been developed in Europe and consequently, Europe’s dependency on US platforms increases every day. This means that much of the European data harvested online is flowing to the US, and with it comes an increasing loss of business opportunities and of control over local activities.
Link:
BYTE project: http://byte-project.eu
References:
[1] S. P. Choudary: “Platform Thinking, The New Rules of Business in a Networked World”, http://platformed.info/
[2] S. P. Choudary, G. Parker and M. Van Alstyne: “Outlook 2014: Platforms are Eating the World”, http://www.wired.com/2013/12/outlook-2014-platforms-eating-world/
Figure 1: An estimate of data flows between representative countries (as indicated by the arrows) that have been harvested
online from the top ten websites in each country of origin. These websites represent about a third of the total activity of the
entire top 500 sites. It is interesting to note that the US is clearly harvesting most of the data generated by most countries,
including those in Europe.
25 Years ERCIM: Challenges for ICST
Enabling Future Smart
Energy Systems
by Stefan Dulman and Eric Pauwels
The on-going transition to more sustainable energy production methods means that we are moving away from a monolithic, centrally controlled model to one in which both production and consumption are progressively decentralised and localised. This in turn gives rise to complex interacting networks. ICT and mathematics will be instrumental in making these networks more efficient and resilient. This article highlights two research areas that we expect will play an important role in these developments.
The confluence of various scientific and technological developments in computing, telecommunications and micro-electronics is rapidly ushering in an era in which humans, services, sensors, vehicles, robots and various other devices are all linked and interacting through loosely connected networks. Such networks can be thought of as cyber-physical systems, as they build on an interwoven combination of the physical and digital environments, interacting through exchanges of data and control. Although the precise details of such cyber-physical systems will largely depend on the concrete application domain (e.g., logistics, smart energy grids, high-precision agriculture), they tend to share a set of common important architectural characteristics. These include a large number of fairly autonomous components interacting at different levels, the routine collection and analysis of massive amounts of data, organic growth and fluidity in participation, and an emphasis on decentralised decision making. This latter characteristic results in various levels of self-organisation, in other words, a system of systems. The scale at which some of these systems are intended to operate, as well as their impact on society, calls for a principled approach to their design, analysis and validation. Therefore, these developments prompt challenging new research questions, creating novel directions of investigation and shedding new light on established ones.
Smart energy systems (SES) are an important case in point. The on-going transition to more sustainable energy production methods means that we are moving away from a monolithic, centrally controlled model to one in which both production and consumption are progressively decentralised and localised. Furthermore, the growing reliance on renewable energy sources such as wind and solar energy introduces considerable fluctuations to energy production that require a prompt and adequate response. As a result, energy networks are increasingly being twinned with parallel ICT networks that shuttle data, often in real time, between the various stakeholders (e.g., producers, distributors and consumers), with the ultimate goal of making the networks more efficient and resilient. Examples of this type of development include: the massive deployment of sensors to closely monitor network performance (e.g., the roll-out of smart meters in many countries); the planned introduction of automated, online auctions and markets to use variable pricing as a stabilising mechanism for counteracting fluctuations in consumption; and the ambition to decentralise energy production by parcelling the network into islands that are largely self-sufficient.
Obviously, in this short contribution it is impossible to highlight all the important research questions that are currently being explored to tackle the many challenges just mentioned. Rather, we will briefly outline two examples of such developments that piqued our personal interest.
A mathematical research topic that is currently being re-invigorated and has direct bearings on the design of SES is the use of gossip algorithms in distributed computing. As the name suggests, gossip algorithms attempt to compute aggregate (i.e., global) values for important network parameters by relying exclusively on local exchanges of information. Put differently, network nodes only talk to their neighbours, but nevertheless manage to efficiently compute reliable values for global network parameters, such as network-wide averages or maximum values. Such an approach obviates the need to establish a central control authority: since the resulting estimates diffuse across the network, it suffices to query an arbitrary node. Furthermore, these algorithms can be extended to be self-stabilising in the sense that changes in the network topology resulting from re-arrangements of neighbours are rapidly and automatically reflected in the final result.
From the short description above, it transpires that gossip-based aggregate computation addresses several of the key issues in SES. A distributed approach solves many scaling issues and proves to be robust against changes in network topology (e.g., nodes becoming unavailable due to failure or exit, new nodes joining the system). Furthermore, as the protocols rely mainly on anonymous data exchange, privacy issues are alleviated. Last but not least, significant progress is being made in the development of mathematical methodologies to precisely quantify the performance characteristics of these algorithms, in particular, error margins and speed of convergence (e.g., [1]).

Figure: In a future of increasingly adaptive and opportunistically connected energy networks, control and self-organization based on the decentralized computation of global network parameters will gain importance. Source: Shutterstock.
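To make the idea concrete, here is a minimal sketch (ours, not from the article) of gossip averaging in Python. The node readings, round budget and seed are hypothetical, and for simplicity random node pairs exchange values, whereas a real deployment would gossip only along the actual network topology; nevertheless, every node's estimate converges to the global mean without any central coordinator:

```python
import random

def gossip_average(values, rounds=500, seed=42):
    """Randomised pairwise gossip: in each round two distinct nodes
    exchange their current values and both keep the average.
    The sum (hence the mean) is preserved by every exchange, while
    the spread between nodes shrinks, so all estimates converge
    to the global mean."""
    rng = random.Random(seed)
    vals = list(values)
    n = len(vals)
    for _ in range(rounds):
        i = rng.randrange(n)
        j = rng.randrange(n)
        if i != j:
            avg = (vals[i] + vals[j]) / 2.0
            vals[i] = avg
            vals[j] = avg
    return vals

# Hypothetical smart-meter readings (kWh) at eight network nodes:
readings = [3.0, 7.5, 1.2, 9.9, 4.4, 6.1, 2.8, 5.3]
estimates = gossip_average(readings)
mean = sum(readings) / len(readings)
# After enough rounds, querying any single node yields the mean:
print(max(abs(v - mean) for v in estimates))  # spread shrinks towards 0
```

Because each pairwise exchange preserves the total, querying an arbitrary node after convergence suffices, which is exactly the property the text highlights.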
A second challenge that is expected to drive future research finds its origin in the heterogeneity of the many ICT resources that are being combined in a typical SES setting. Indeed, although huge amounts of data about various aspects (e.g., production, consumption, storage, pricing, etc.) are routinely being collected, high-quality information is still notoriously difficult to come by. Data are often siloed or served up in an unstructured and undocumented format. Data (and resources in general) need to come augmented with widely understood metadata in order to provide sufficient context so that other agents can independently process them. This is of particular interest in the nascent field of cross-modal data mining, which investigates methodologies that can be used to automatically combine information drawn from heterogeneous sources; for example, how can the content of messages on social media be linked to spikes in consumption data emanating from smart meters? To make this type of analysis possible, it is imperative to develop more powerful semantic mediation tools that can automatically identify related concepts in different data sets and establish precise maps between them [2]. We expect that this will spur on further developments in semantic web technologies, in particular ontology alignment and RDF reasoners.
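As a toy illustration of the cross-modal question just posed (ours, with entirely hypothetical data), the sketch below correlates an hourly count of heat-wave mentions on social media with smart-meter consumption for the same hours. Real semantic mediation would of course go far beyond a correlation coefficient, but the example shows the simplest form such a cross-source link can take:

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical hourly counts of heat-wave mentions on social media ...
mentions = [2, 3, 2, 15, 18, 4, 3, 2]
# ... and smart-meter consumption (kWh) over the same eight hours:
consumption = [1.1, 1.2, 1.1, 2.9, 3.1, 1.3, 1.2, 1.1]
# Both series spike in the same hours, so the coefficient is close to 1:
print(pearson(mentions, consumption))
```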
Again, it is important to re-iterate that such a short contribution cannot do justice to all the different scientific disciplines that are expected to hugely impact SES, such as multi-scale and multi-physics simulations of networks, market mechanisms using multi-agent systems, and privacy protection and cyber-security, just to name a few. However, we think that both of the topics discussed here will feature prominently in future research and have an important impact, not just on SES research but more generally on the wider class of cyber-physical systems.
Link:
http://www.cwi.nl/research-themes/energy
References:
[1] D. Shah: “Gossip Algorithms”, Foundations and Trends in Networking, Vol. 3(1), 2008, pp. 1-125.
[2] N. Choi, I.-Y. Song, H. Han: “A survey on ontology mapping”, ACM SIGMOD Record, 35(3), 34-41, 2006.
Over the past two and a half centuries, there have been several technological revolutions: the industrial revolution in around 1770, the introduction of steam engines and railways in the 1830s, the introduction of steel and electricity in the 1870s, the use of mass production from 1910 onwards and finally, the current Information Technology (IT) revolution that began in the 1970s. This modern day revolution is unique: not only has it brought us incredible achievements but it also poses some real threats to our traditional concept of employment.
Our Modern Day Revolution
In the past, technological revolutions have been followed by a stagnation in employment figures as traditional jobs are displaced by the new technologies. Then, as people adapt to those technologies and embrace their benefits, new jobs are created, productivity increases and overall living standards improve. The delay between technological breakthroughs and employment recovery is typically several decades [1]. In the case of the IT revolution, however, recovery seems to be taking much longer.
This prompts us to ask the question “Why?”. Previous technological introductions have also acted as a human labour multiplier. For example, the mechanical loom increased productivity by a factor of approximately 100, but this is tiny in comparison with the progress seen in electronics and computers, as described by Moore’s Law. To illustrate this point, consider the invention of the transistor in around 1960: since the introduction of this device, computation costs have been reduced by a factor of 10^10. If a similar reduction were seen in aviation costs, it would equate to a brand new Airbus 320 costing one cent as opposed to 100 million dollars.
The IT revolution has been incredibly profound and therefore, it is not surprising that it has had profound consequences. IT goods and services have become relatively inexpensive and virtually ubiquitous, and they have been applied to almost every aspect of our day-to-day lives to automate processes from mass production to routine bookkeeping. Now, they are poised to enhance, and in some instances even replace, intellectual processes that have long been thought to be the domain of human intellect alone. Today, IT is being used to guide robots, prepare legal dossiers and even play and win games that call for wit and context, such as the popular US game show Jeopardy!.
IT, a Threat to Jobs?
While automation displaced blue-collar workers from repetitive, menial tasks, IT and robotics have accelerated and extended this trend into white-collar sectors. Further, the ever-expanding capabilities of these technologies mean they are rapidly making inroads into automating intellectual processes. In the future, it is highly likely that IT will drive our cars, fly
Our Challenge
The tidal wave of IT progress is a formidable force. In line with historical patterns, some economists are of the view that in time, society will learn how to embrace the new technologies, leverage IT to find new employment solutions and regain more of an equilibrium. However, the world has never dealt with a revolution that has such a profound scope. Happily, certain jobs will obviously be created. For example, there will be a growing demand for jobs that require a flair for creativity, personal interaction and social skills. In addition, huge numbers of software experts will be needed to adapt computers to handle tasks that are yet to be automated. Programming and analytical skills will also be in high demand. It is certainly true that IT will generate many new jobs, but will these replace the current jobs lost? Education will continue to play a vital role in meeting these future challenges: the financial advantages of having a good education are already evident.
Nonetheless, we must do more. We have created some incredible technology and now the question is, can we channel our inventiveness and ingenuity into creating new classes of work? New jobs that will result in a novel economy that is not simply based on consumable goods? The clock is running and I wish all of us, especially our children, good luck.
Link: http://www.zurich.ibm.com/~hr/IT_refs
References:
[1] C. Perez: “Technological Revolutions and Financial Capital: The Dynamics of Bubbles and Golden Ages”, Edward Elgar Publishing, 2002.
[2] C. B. Frey and M. A. Osborne: “The Future of Employment: How Susceptible Are Jobs to Computerisation?”, Oxford University, 17 September 2013.
[3] C. Murray: “Coming Apart: 1960-2010”, Random House, 2012.
our planes, diagnose our diseases and manage our medical treatment. The term “white-collar” originally described workers whose tasks required no manual dexterity. Now it refers to professionals who engage in unstructured and intellectual work, precisely the tasks that are increasingly being automated. In the last 35 years, corporate profits have grown, thanks largely to the increasing use of IT (Figure 1). Higher corporate profits mean firms have more capital available to invest in expanding the IT-guided automation of their processes. As a result, many jobs have already become obsolete and this trend seems likely to continue. Simply put, the IT revolution has made machines cheaper than manpower and consequently, jobs are being lost and personal incomes are decreasing.
For the United States, Frey and Osborne [2] have predicted how jobs will develop over the next two decades. To do this they analysed more than 700 job types and arrived at some startling results. For instance, they predict (with a 90% probability) that telemarketers, bank tellers, insurance agents, file clerks and cashiers will all be replaced by IT. Bus drivers, teachers and flight attendants will also be heavily impacted, with half of all such jobs predicted to be supplanted by IT. Overall, they predict an estimated 47% of all current jobs in the US will be eliminated. Psychologists, scientists, engineers and managers, however, appear to be some of the safer professions. Similar changes are taking place here in Europe although luckily, the European education system better prepares students to deal with IT. More negatively, however, Europe is already facing overall unemployment rates of nearly 12%.
These employment changes are already well underway and this can be seen in the sharply decreasing incomes of not only blue-collar workers, but also some white-collar workers who perform well-structured tasks. In contrast, professionals with intellectual functions have enjoyed steady or increasing salaries. It is undeniable that this discrepancy is exacerbating the already unhealthy redistribution of wealth throughout the world (see C. Murray [3]).
Figure 1: Corporate profits (+7%) and wages (−15%) as a percentage of gross domestic product in the US, 1980-2014.
These developments will generate enormous new datasets that will be much bigger than those that currently exist in molecular biology. So how can we store such data in a structured way so that researchers can use it meaningfully to further the field of embryogenesis? First, the data must be stored in standard formats that facilitate sharing and make it accessible for ongoing use after publication. For molecular data, standards exist to ensure the gene sequence and functional features are captured. For example, Gene Ontology is a structured vocabulary that allows researchers to assign a readable list of well-defined biological functions to a gene. Two computer-oriented, domain-specific languages, SBML and CellML, can be used to describe the dynamic interaction networks of genes, proteins and metabolites. These languages create files that, like PDF files, can be interpreted by many different software applications.
Ongoing initiatives in the field of information sciences are laying the foundations for similar data standards and domain-specific languages in the multicellular biology community. New versions of SBML will allow users to describe the distribution of molecules in fixed geometries and coupled cells. However, in a recent paper that proposed a Cell Behaviour Ontology (CBO) [2], it was argued that SBML is not the most efficient or insightful way to annotate embryological data. The multicellular organism is a collection of thousands to trillions of individual cells. Individually describing the gene expression levels and biophysical properties of each cell will create huge datasets but not necessarily yield useful insights. Even the most detailed three-dimensional movies or sets of cell trajectories are merely pretty pictures unless we can identify and label their components meaningfully. A useful comparison is the difference between providing a list of pixels in an image versus a list of the things in that image. CBO focuses on describing the behaviour of cells and the dependency of those behaviours on the cell’s internal machinery, including its gene expression pattern and local environment. This declarative approach allows the CBO to categorise each cell in a developing embryo using a manageable set of cell types, which range from tens to hundreds in number. Each cell type is characterised by the same class of behaviours; thus, cells belonging to the same cell type share the same behaviours. Each cell follows a set of logical input and output rules that guide these behaviours and its transition from one cell type to another (i.e., differentiation). Many cell types in multicellular organisms are ‘sub-types’ whose behaviour varies in subtle ways around a general ‘base’ cell type.
For example, the endothelial cells in a developing blood vessel are made up of two sub-types: ‘tip’ cells at the end of a sprouting blood vessel, which are usually spikier and more motile, and ‘stalk’ cells, which occur at the back of the sprout. This approach allows the CBO to develop a hierarchical classification of cell types and cell behaviours.
Besides compressing the data, the classification of cell behaviours will also enable quantitative biologists to understand biological development to a point that, with the aid of applied mathematicians, they can then reconstruct it using agent-based computer simulations. This will then enable them to unravel how subtle changes in cell behaviour, driven by factors such as inherited disease or cancer, can affect the outcome of development and why. Thus, the resulting
The Next Boom
of Big data in Biology:
Multicellular datasets
by Roeland M.H. Merks
Big data research in Life Sciences typically focuses on
big molecular datasets of protein structures, DNA
sequences, gene expression, proteomics and
metabolomics. Now, however, new developments in
three-dimensional imaging and microscopy have started
to deliver big datasets of cell behaviours during
embryonic development including cell trajectories and
shapes and patterns of gene activity from every position
in the embryo. This surge of multicellular and multi-scale
biological data poses exciting new challenges for the
application of ICT and applied mathematics in this field.
In 1995, when I was in the early stages of my biology master’s and starting to think about PhD opportunities, Nature published a short feature article entitled ‘The Boom in Bioinformatics’. The world needed bioinformaticians, the article said, “to take full advantage of the vast wealth of genetic data emanating from […] the Human Genome Project and other […] efforts.” With a combined interest in computer science and biology, you might have thought, therefore, this promised a bright future for me. The problem was bioinformatics did not excite me. I didn’t believe it would solve the problem I had held closest to my heart since I had first seen tadpoles develop from eggs: embryogenesis. After all, embryos are not just bags of genes.
Technological developments in microscopy and image analysis are now producing a flood of new data that excites me much more. With this data, it is now possible to track the movements and behaviours of any cell in an early embryo, organ, or tumour. With this capability we will now be able to identify what makes cells take a wrong turn in children with birth defects, or how tumour cells can change their metabolism and movement to outcompete their well-behaved neighbours and disrupt the structure and function of an organ. Such mechanistic insights will eventually make it possible to interfere with developmental mechanisms with a greater specificity than currently possible.
Conventional light microscopy can already follow the migration of a subset of individual cells (labelled with fluorescent markers) in organs, but techniques are getting better. Two-photon microscopy techniques, used in conjunction with advanced image analysis, allow researchers to routinely generate all-cell datasets of developing embryos or organs. Applying this approach, the BioEmergences platform at CNRS (Gif-sur-Yvette, France) recently produced a gene expression atlas featuring cellular resolution of developing zebrafish [1]. Soon we will be able to follow every cell in developing organisms and tissues and concurrently identify what genes they are expressing and what metabolites they are producing.
datasets become more meaningful descriptions of the observations as well as sets of rules to construct agent-based computer simulations of those observations. In this way, CBO takes a ‘cell-based approach’ [3], which views embryogenesis as the collective behaviour of a ‘colony’ of individual cells.
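To make the rule-based, cell-type-driven flavour of such simulations concrete, here is a deliberately toy sketch: a one-dimensional ‘sprout’ with hypothetical tip/stalk rules. The class and rule names are invented for this illustration and are not part of the CBO or any of the platforms mentioned.

```python
class Cell:
    def __init__(self, position, cell_type):
        self.position = position    # 1-D position along a sprouting vessel
        self.cell_type = cell_type  # 'tip' or 'stalk'

    def step(self):
        # Behaviour rule: tip cells are motile and advance; stalk cells hold.
        if self.cell_type == "tip":
            self.position += 1

def differentiate(cells):
    """Differentiation rule: only the frontmost cell behaves as a tip cell."""
    front = max(cells, key=lambda c: c.position)
    for c in cells:
        c.cell_type = "tip" if c is front else "stalk"

cells = [Cell(p, "stalk") for p in range(5)]
for _ in range(10):             # ten rounds of differentiate-then-act
    differentiate(cells)
    for c in cells:
        c.step()

print(sorted(c.position for c in cells))  # [0, 1, 2, 3, 14]
```

The point of the declarative style is visible even here: the whole ‘embryo’ is specified by two short rules per cell type, not by enumerating the state of every cell.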
The extraction of cell behaviours from data, followed by the re-synthesis of the embryo as a computer simulation, is already under way. At Inria (Rocquencourt, France), a team led by Dirk Drasdo is using structural images to build simulations of liver regeneration following poisoning. At Inria (Montpellier, France), the VirtualPlants team headed by Christophe Godin has used detailed plant tissue images to build cell-level simulations of leaf initiation and vascular development in plants. In our own work here at CWI (Amsterdam), we are simulating the formation of blood vessel sprouts, e.g., during cancer neoangiogenesis, from the chemical and mechanical interactions between endothelial cells. As multicellular imaging datasets are merging with explanatory computer modelling, big data in biology is finally starting to really excite me.
References:
[1] C. Castro-González et al.: “A Digital Framework to Build, Visualize and Analyze a Gene Expression Atlas with Cellular Resolution in Zebrafish Early Embryogenesis”, PLoS Comp. Biol., 10(6), 2014, e1003670.
[2] J. P. Sluka et al.: “The Cell Behavior Ontology: Describing the Intrinsic Biological Behaviors of Real and Model Cells Seen as Active Agents”, Bioinformatics, 30(16), 2014, 2367-2374.
[3] R. M. H. Merks and J. A. Glazier: “A Cell-Centered Approach to Developmental Biology”, Physica A, 352(1), 2005, 113-130.
Figure: Zebrafish (Danio rerio) imaged live throughout early development (gastrulation); snapshots of the tailbud stage. A: Raw data (fluorescent nuclei and membranes), displayed with Avizo software, data cut at a depth of 100 microns. B: Display of detected nuclei and cell trajectories, calculated using the BioEmergences workflow (http://www.bioemergences.eu). A, B: scale bar 100 microns. C: close-up showing selected clones (coloured cubes) and their trajectories over the past 6 hours, with an orthoslice of the membrane channel in white. Pictures by Nadine Peyriéras, BioEmergences.eu, CNRS Gif-sur-Yvette, France.
are numerous examples of incidents in which the security of key systems in many (public) organizations has been breached. Recently, a serious vulnerability dubbed the ‘Heartbleed bug’ was exposed in a software package (OpenSSL) that is supposed to secure vast numbers of Internet servers. It isn’t at all clear what happened to the data stored on those systems that were compromised by these vulnerabilities.
Heartbleed, in particular, provides an interesting illustration of the level of software complexity we are dealing with. The bug itself consists of only two lines of code, whereas the entire OpenSSL software package contains 450,000 lines of code [1]. Industry research into the existence of bugs or defects suggests a wide range in the bug ratio, from 0.1 to 100 bugs per 1,000 lines of code [2,3]; at those rates, a package the size of OpenSSL could be expected to contain anywhere between 45 and 45,000 defects. This ratio is strongly related to how well the software was developed and tested. Clearly, our current understanding of software does not allow us to develop software without bugs, and this is just one of the consequences of software complexity; others include high cost and poor performance.
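The arithmetic behind that defect range is simple enough to state directly (a back-of-the-envelope sketch using only the figures quoted above):

```python
# Applying the reported industry range of 0.1 to 100 defects per
# 1,000 lines of code to OpenSSL's roughly 450,000 lines.
lines_of_code = 450_000
low_rate, high_rate = 0.1, 100   # defects per 1,000 lines of code

low = lines_of_code / 1_000 * low_rate
high = lines_of_code / 1_000 * high_rate
print(f"Expected defects: {low:.0f} to {high:.0f}")  # 45 to 45000
```

Three orders of magnitude of uncertainty in such a basic quality measure is itself a symptom of how poorly software is understood.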
Considering all this, we feel that the future of software should involve a radical change, best summarised as follows:
Software complexity should become a public problem, instead of simply remaining just a problem for the public. In our view, the current situation in which software is too complex to be handled properly should transition to a situation
Looking Towards
a Future where Software
is Controlled by the Public
(and not the other way
round)
by Magiel Bruntink and Jurgen Vinju
Nowadays, software has a ubiquitous presence in
everyday life and this phenomenon gives rise to a range
of challenges that affect both individuals and society as
a whole. In this article we argue that in the future, the
domain of software should no longer belong to
technical experts and system integrators alone. Instead
it should transition to a firmly engaged public domain,
similar to city planning, social welfare and security. The
challenge that lies at the heart of this problem is the
ability to understand, on a technical level, what all the
different software actually is and what it does with our
information.
Software is intrinsically linked to many of the challenges currently facing society. The most obvious of these challenges is data security and user privacy. Much of the software currently in use collects data. This data comes from a wide range of sources including recreational activities, personal health, messaging, street surveillance, financial transactions and international communications. Software is not only deployed on personal (mobile) computing devices but also through far-reaching government programs. In all cases, it is the software that tells each computing device how to participate in the act of data collection and process the activities that bring benefits. Whether these benefits are for the greater good of society, however, is not always clear cut. This prompts questions such as “Who is aware of the exact data collected by their smartphone, or where (on the internet) it will be stored?”, “What servers hold the contents of your software-supported tax return and in which countries are they located?” or “Is there a database somewhere that somehow stores a picture of you linked to a crime scene?”.
Besides the obvious political and social aspects of these questions, there are more fundamental problems that still need to be addressed by software researchers and practitioners. The core problem that exacerbates the issues of software security and privacy is that software is not sufficiently well understood at a technical level, especially at the scales at which it is now being developed and deployed. All too often, software is so complex it can’t even be handled by the most experienced software engineers or explained by the most advanced theories in computer science, and too big to be summarised accurately by the automated tools created for that purpose. How then are policy makers or the general public supposed to be able to make software-related decisions that are based on facts and insight?
Given that software complexity is still an untamed problem, what consequences exist for data security and privacy? There
Source: Shutterstock
where software-related decisions can feasibly be made by non-experts, in particular policy makers and citizens.

In our view, a positive development has been the installation of the (temporary) committee on ICT by the Dutch House of Representatives, which is tasked with investigating several problematic e-government projects. We envision a similar public status for software as is given to law making, city planning, social security, etc. While all these social priorities still require a certain level of technical expertise, public debate determines their direction. There is a long road ahead to reach a point where software can join this list. We feel the following directions are essential to making this journey:
• investment in research that creates more accessible software technologies, for instance, domain-specific (programming) languages that achieve a better fit to societal problems and reduce software complexity;
• investment in empirical research that considers the current state-of-the-art practices in dealing with software complexity, with a view to scientifically establishing good and bad practices, methods and technologies;
• the introduction of software and computing curricula at the primary, secondary and higher levels of education to increase general software literacy and ultimately, foster a better public understanding of software complexity; and
• contributions to the public debate on the nature of software and its impacts on society, for instance, by arguing that society-critical software should transition to open-source models, enabling public control and contribution.

In conclusion, to arrive at a future where software is something we can all understand and control, as opposed to us being controlled by software and its complexities, a strong focus on software will be required in both research and education. Therefore, it is high time to generate public engagement on the complexities of software and the challenges that
task requires a fundamental breakthrough in how parallelism and concurrency are integrated into programming languages, substantiated by a complete inversion of the current canonical language design. By inverting design decisions, which have largely evolved in a sequential setting, new programming models can be developed that are suitable for mainstream concurrent programming and deployment onto parallel hardware. This could be achieved without imposing a heavy syntactic overhead on the programmer.
The authors of this article are the principal investigators of the three-year EU Upscale project (From Inherent Concurrency to Massive Parallelism through Type-Based Optimizations), which started in March 2014. In this project we take as the starting point of the inverted language design existing actor-based languages [1] and libraries (i.e., the Akka Actor API; see Links section). In contrast to an object, an actor executes its own thread of control in which the provided operations are processed as requested; the actors run in parallel. These requests are processed according to a particular scheduling policy, e.g., in order of their arrival. Sending a request to execute a provided operation involves the asynchronous passing of a corresponding message. That is, the actor that sends this message continues the execution of its own thread. Both concurrency and the features which typically make concurrency easier to exploit, such as immutability, locality and asynchronous message passing, will be default behaviour of the actors. This inversion produces a programming language that can be easily analysed, as properties which may potentially inhibit parallelism (e.g., synchronous communication and shared mutable state) must be explicitly declared.
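The mechanics of the actor model described above, a private mailbox, a private thread of control, and asynchronous sends processed in arrival order, can be sketched in a few lines. This is an illustrative toy, not the Encore language or the Akka API; all class and method names are our own.

```python
import queue
import threading

class Actor:
    """Each actor owns a mailbox and a private thread of control."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        # Asynchronous: the sender enqueues the message and continues.
        self.mailbox.put(message)

    def stop(self):
        self.mailbox.put(None)   # sentinel: stop after pending messages
        self._thread.join()

    def _run(self):
        while True:
            message = self.mailbox.get()  # requests handled in arrival order
            if message is None:
                break
            self.receive(message)

class Counter(Actor):
    def __init__(self):
        self.count = 0           # state is private to this actor's thread
        super().__init__()

    def receive(self, message):
        self.count += message

counter = Counter()
for _ in range(100):
    counter.send(1)              # no locks needed on the sender's side
counter.stop()
print(counter.count)             # 100
```

Because only the actor’s own thread ever touches `count`, the interference problems of shared-state multithreading simply do not arise.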
The key feature of the Encore language that is currently under development is that everything will be designed to leverage deployment issues. Deployment is the mapping of computations to processors and the scheduling of such computations. The main rationale of the inverted language design is to support the automated analysis of the code in order that deployment-related information can be obtained. This information can then be used to facilitate optimisations both by the compiler and at run-time. These automated optimisations will ease the design of parallel applications for manycore architectures and thus will make the potential computing power of this hardware available to mainstream developers.
Links:
Upscale project: http://www.upscale-project.eu
Akka Actor API: http://akka.io/docs/
Reference:
[1] G. A. Agha: “ACTORS: A Model of Concurrent Computation in Distributed Systems”, MIT Press Series in Artificial Intelligence, MIT Press, 1990, ISBN 978-0-262-01092-4.
Please contact:
Frank S. de Boer
CWI, The Netherlands
Tel: +31 20 5924139
E-mail: [email protected]
However, the current concurrency model commonly used for object-oriented programs in industry is multithreading. A thread is a sequential flow of control which processes data by invoking the operations of the objects storing the data. Multithreading is provided through small syntactic additions to the programming language which allow several such threads to run in parallel. Nevertheless, the development of efficient and precise concurrent programs for multicore processors is very demanding. Further, an inexperienced user may cause errors because different parallel threads can interfere with each other, simultaneously reading and writing the data of a single object and thus undoing each other’s work. To control such interference, programmers have to use low-level synchronization mechanisms, such as locks or fences, that feature subtle and intricate semantics and whose use is error-prone. These mechanisms can be introduced to avoid interference but generate additional overhead, caused by threads that need to wait for one another and thus cannot run in parallel. Overhead can also occur because the data are distributed across different parts of the architecture (i.e., cache and memory). If the data access pattern used by the various threads does not match their distribution pattern, the program generates a large amount of overhead transferring data across processors, caches and memory.
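The interference problem and its conventional low-level fix can be shown in a dozen lines (an illustrative sketch: two threads perform a read-modify-write on one shared counter, guarded by a lock):

```python
import threading

lock = threading.Lock()
counter = 0

def increment(n):
    global counter
    for _ in range(n):
        with lock:               # remove this guard and updates can be lost
            counter += 1         # read-modify-write on shared state

threads = [threading.Thread(target=increment, args=(100_000,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)                   # 200000 with the lock; often less without
```

The lock makes the result correct but also serialises exactly the work we hoped to parallelise, which is the overhead the paragraph above describes.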
To address these issues, increasingly advanced language extensions, concurrency libraries and program analysis techniques are currently being developed to explicitly control thread concurrency and synchronization. However, despite these advances in programming support, concurrency remains a difficult task. Only the most capable programmers can explicitly control concurrency and efficiently make use of the relatively small number of cores readily available today.
Thus, manycore processors require radically new software abstractions to coordinate interactions among the concurrent processes and between the processing and storage units. This
A chip processor wafer. Chip
manufacturers are moving
from single-processor chips to
new architectures that utilise
the same silicon real estate for
a conglomerate of multiple
independent processors known
as multicores.
Special Theme: Software Quality
Introduction to the Special Theme
Software Quality
by Jurgen Vinju and Anthony Cleve, guest editors for the special theme section
The introduction of fast and cheap computer and networking hardware enables the spread of software. Software, in a nutshell, represents an unprecedented ability to channel creativity and innovation. The joyful act of simply writing computer programs for existing ICT infrastructure can change the world. We are currently witnessing how our lives can change rapidly as a result, at every level of organization and society and in practically every aspect of the human condition: work, play, love and war.
The act of writing software does not imply an understanding of the resulting creation. We are surprised by failing software (due to bugs), the inability of rigid computer systems to “just do what we want”, the loss of privacy and information security, and last but not least, the million-euro software project failures that occur in the public sector. These surprises are generally not due to negligence or unethical behaviour but rather reflect our incomplete understanding of what we are creating. Our creations, at present, are all much too complex and this lack of understanding leads to a lack of control.
Just as it is easy to write a new recipe for a dish the world has never seen before, it is also easy to create a unique computer program which does something the world has never seen before. When reading a recipe, it isn’t easy to predict how nice the dish will taste and, similarly, we cannot easily predict how a program will behave from reading its source code. The emergent properties of software occur on all levels of abstraction. Three examples illustrate this. A “while loop” can be written in a minute but it can take a person a week or even a lifetime to understand whether or not it will eventually terminate on any input. Now imagine planning the budget for a software project in which all loops should terminate quickly. Or take a scenario where you simply need to scale a computer system from a single database with a single front-end application to a shared database with two front-end applications running in parallel. Such an “improvement” can introduce the wildest, unpredictable behaviours, such as random people not getting their goods delivered or, worse, the wrong limb amputated. As a third example, we do not know how the network will react to the load generated during the break of the next international soccer match between France and Germany, e.g., “When will it all crash?”.
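The while-loop example is not hypothetical. The classic illustration is the Collatz iteration (the function name below is ours): a three-line loop that nobody has proved terminates for every starting value.

```python
def collatz_steps(n: int) -> int:
    """Count the iterations of the Collatz rule needed to reach 1."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

print(collatz_steps(27))  # 111: a tiny loop, yet whether it halts for
                          # *every* input is an open mathematical problem
```

If we cannot even budget the analysis of three lines, the difficulty of reasoning about millions follows immediately.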
Higher quality software is simpler software, with more predictable properties. Without limiting the endless possibilities of software, we need to be able to know what we are creating. Teaching state-of-the-art software engineering theory and skills is one way of improving understanding but alone, this is not enough. We are working on developing better theories and better tools to improve our understanding of complex software and to better control its complex emergent behaviours. We will be able to adapt existing software to satisfy new requirements and to understand how costly these adaptations will be and the quality of the results. We will be able to design software in a way that means that consciously made design decisions lead to predictable, high-quality software artifacts. We will be able to plan and budget software projects within reasonable margins of error.
In this special theme of ERCIM News, some of the recent steps taken to understand and manipulate software quality are presented. We aren’t yet at the stage where we fully understand, or can control, software but we are certainly working towards this point. Some researchers are studying the current reality of software, discovering theories and tools that can improve our abilities to analyse, explain and manipulate. Other researchers are re-thinking and re-shaping the future of software by discovering new, simpler languages and tools to construct the next generation of software. These two perspectives should leapfrog us into a future where we understand it all.
As quality and simplicity are highly subjective concepts, our best bet is to strive to increasingly contextualise software engineering theory and technology. General theory, languages and tools have resulted in overly complex systems, so now more specialised tools and techniques for distinct groups of people and industries are being discovered. For example, instead of modelling computation in general, we are now modelling big data processing; instead of inventing new general-purpose programming languages, we are now focusing on domain-specific formalisms; and instead of reverse engineering all knowledge from source code, we are now extracting domain-specific viewpoints.
We hope you will find this selection of articles an inspiring overview of state-of-the-art software quality engineering research and beyond.
Please contact:
Jurgen Vinju
CWI and TU Eindhoven, The Netherlands
E-mail: [email protected]
Over the last 10 years, a range of technological, organizational and infrastructure innovations have allowed SIG to grow to the point that it provides an assessment service currently processing 27 million lines of code each week. In this article, we present a brief discussion of a few of those innovations.
Analysis tools
The software analysts at SIG are also software developers who create and continuously improve their own suite of software analysis tools. Not only are these tools adept at picking apart source code, they can be easily extended to support a range of additional computer languages. To date, this strength has been exploited to develop support for around 100 different languages, 50 of which are used on a continuous basis. These tools are also good at operating autonomously and scale appropriately. After an initial configuration, new batches of source code can be automatically analyzed quickly, allowing the analysts to focus their attention on the quality anomalies that are found [1]. On average, across all system types, serious anomalies occur in approximately 15% of the analysed code.
Evaluation models
Whilst all software systems differ (i.e.,their age, technologies used, function-ality and architecture), common pat-terns do exist between them. Thesebecome apparent through on going,extensive analysis. SIG’s analystsnoticed these patterns in the softwarequality measurements and consolidatedtheir experience to produce standard-ized evaluation models that opera-tionalize various aspects of softwarequality (as defined by the ISO-25010international standard of softwareproduct quality). The first modelfocused on the “maintainability” of asystem. First published in 2007, thismodel has since been refined, validatedand calibrated against SIG’s growing
data warehouse [2]. Since 2009, thismodel has been used by theTechnischer Überwachungs-Verein(TÜV) to certify software products.
Recently, two applications used by theDutch Ministry of Infrastructure(Rijkswaterstaat) to assist with main-taining safe waterways, were awarded4-star quality certificates by this organ-isation (from a possible 5 stars).Similar models for software security,reliability, performance, testing andenergy-efficiency, have recentlybecome available and these are contin-uously being refined.
Lab organization
Scalable tools and models that caneffectively be applied are extremelyvaluable, but to achieve this, properorganization is paramount. SIG organ-izes its software analysis activities in anISO-17025 certified lab. This meansthat analysts undergo proper trainingand follow standardized work proce-dures, consequently producing reliablemeasurement results that can berepeated. When quality anomalies aredetected, they undergo triage in theMonitor Control Centre (Figure 1).Here, the false positive results are sepa-
Monitoring Software Quality at Large Scale
by Eric Bouwers, Per John and Joost Visser
In 2004, the Software Improvement Group (SIG) introduced a new software monitoring service. Then
a recent spin-off of CWI with a couple of clients and a vision, SIG has transformed itself ten years
later into a respected (and sometimes feared) specialist in software quality assessment that helps a
wide variety of organizations to manage their application portfolios, their development projects, and
their software suppliers.
rated out. Then, depending on theseverity and/or type of finding, the ana-lyst works with the client to determinean appropriate resolution. If the
anomaly cannot be resolved, then seniormanagement becomes involved.Currently, SIG monitors over 500 soft-ware systems and takes over 200 codesnapshots each week. From these, theiranalysts are responsible for assessingover 27 million lines of code, in over 50different languages from COBOL andScala to PL/SQL and Python.
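The benchmark-calibrated star ratings mentioned above can be illustrated with a small sketch. Note that the real thresholds are calibrated against SIG's proprietary benchmark and are not public; the function, the metric and the threshold values below are invented purely for illustration:

```python
# A minimal sketch in the spirit of SIG's benchmark-based star ratings
# (ISO 25010 style). The thresholds here are made up for illustration;
# the real model calibrates them against a large benchmark of systems.

def star_rating(value, thresholds):
    """Map a metric value (lower is better) to a 1-5 star rating.

    `thresholds` lists the maximum value still earning 5, 4, 3 and 2 stars;
    anything worse gets 1 star.
    """
    for stars, limit in zip((5, 4, 3, 2), thresholds):
        if value <= limit:
            return stars
    return 1

# Illustrative thresholds for the fraction of duplicated lines in a system.
DUPLICATION_THRESHOLDS = (0.03, 0.05, 0.10, 0.20)
```

For example, a system with 4% duplicated lines would land at 4 stars under these invented thresholds: `star_rating(0.04, DUPLICATION_THRESHOLDS)`.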
Value adding: beyond software quality
On the foundation of tools and models, SIG has built an advisory practice. Working together with the analysts, the role of the advisors is to translate technical software quality findings into risks and recommendations. Thus, SIG is able to provide cost comparisons (e.g., the cost of repairing quality defects versus not repairing them [3]) or provide project recommendations (e.g., suboptimal quality may be a reason to postpone deployment, cancel a project or provide better conditions to the developers). By providing this context to the software findings, SIG offers meaningful value for its clients’ technical staff and decision-makers.

Figure 1: The workflow that is executed whenever a snapshot of a system is received.
Ongoing R&D
The growth of SIG thus far, and its future path, depends on ongoing investment in R&D. Into the future, SIG is looking to keep working with universities and research institutes from around the Netherlands (including Delft, Utrecht, Amsterdam, Tilburg, Nijmegen and Leiden) and beyond (e.g., the Fraunhofer Institute) to explore new techniques to control software quality. A number of questions still remain unanswered. For example: “how can the backlogs of agile projects be analysed to give executive sponsors confidence in what those agile teams are doing?”, “how can security vulnerabilities due to dependencies on third-party libraries be detected?”, “how can development teams be given insight into the energy footprint of their products and ways to reduce them?” or, “what are the quality aspects of the software-defined infrastructure that support continuous integration and deployment?”. By continuing to explore the answers to these questions and others, SIG will continue to grow in the future.
References:
[1] D. Bijlsma, J. P. Correia, J. Visser: “Automatic event detection for software product quality monitoring”, QUATIC 2012
[2] R. Baggen et al.: “Standardized Code Quality Benchmarking for Improving Software Maintainability”, Software Quality Journal, 2011
[3] A. Nugroho, T. Kuipers, J. Visser: “An empirical model of technical debt and interest”, MTD 2011.
by Nicholas Matragkas, James Williams and Dimitris Kolovos
OSSMETER is an FP7 European project that aims to extend the state-of-the-art in the field of automated analysis and measurement of open-source software (OSS). It also aims to develop a platform that will support decision makers in the process of discovering, comparing, assessing and monitoring the health, quality, impact and activity of OSS.
Deciding if an open-source software (OSS) project meets the standards required for adoption, in terms of quality, maturity, the activity of the development and user support, is not a straightforward process. It involves exploring a variety of information sources including OSS source code repositories, communication channels (e.g., newsgroups, forums and mailing lists) and bug-tracking systems. Source code repositories help the potential adoptee to identify how actively the code has been developed, which programming languages were used, how well the code has been commented and how thoroughly the code has been tested. Communication channels can identify whether user questions are being answered in a timely and satisfactory manner and help estimate how many experts and users the software has. Finally, bug-tracking systems can show whether the software has many open bugs and the rate at which those bugs are fixed. Other relevant metadata such as the number of downloads, the license(s) under which the software is made available and its release history may also be available from the forge that hosts the project. If available, this myriad of information can help OSS adoptees make informed decisions; however, the disaggregated nature of the information makes this analysis tedious and time consuming.
This task becomes even more challenging if the user wishes to identify and compare several different OSS projects that offer software with similar functionality (e.g., there are more than 20 open source XML parsers for the Java programming language) and make an evidence-based selection decision. Following the product selection, the software still requires on-going monitoring to ensure it remains healthy, actively developed and adequately supported throughout its lifecycle. This is crucial for identifying and mitigating, in a timely manner, any risks that emerge as a result of a decline in the project’s quality indicators.
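As a toy illustration of how such scattered indicators might be combined into a single comparable number, the sketch below computes a weighted health score. The indicator names and weights are invented for this example and are not taken from the OSSMETER platform:

```python
# Hypothetical health score combining the three kinds of information
# sources discussed above. Indicator names and weights are illustrative
# only; a real platform would calibrate them empirically.

WEIGHTS = {
    "commit_activity": 0.4,  # derived from the source code repository
    "response_rate": 0.3,    # derived from communication channels
    "bug_fix_rate": 0.3,     # derived from the bug-tracking system
}

def health_score(indicators):
    """Weighted average of indicators, each normalized to [0, 1]."""
    return sum(WEIGHTS[k] * indicators[k] for k in WEIGHTS)
```

A score like this makes side-by-side comparison of candidate projects trivial, which is exactly the task described above as tedious when done by hand.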
OSSMETER is a Specific Targeted Research Project (STREP) under the Seventh Framework Programme for research and technological development (FP7). The project began in October 2012 and will end in March 2015. A number of different European organizations are involved in the project: The Open Group, University of York and University of Manchester (United Kingdom), CWI (Netherlands), University of L’Aquila (Italy), Technalia (Spain), Softeam (France), and Uninova and Unparallel Innovation (Portugal). OSSMETER aims to: extend the state-of-the-art in the field of automated analysis and measurement of OSS; and develop a platform that will support decision makers in the process of discovering, comparing, assessing and monitoring the health, quality, impact and activity of OSS. To achieve these goals OSSMETER will develop trustworthy quality indicators by providing a set of analysis and measurement components. More specifically, the OSSMETER platform will provide the following software measurement components [2]:
• programming language-agnostic and language-specific components to assess a project’s source code quality;
• text mining components to analyse the natural language information extracted from communication channels and bug-tracking systems and thus, provide information about communication quality within the project and community activity (around the project); and

Figure 1: The tiered architecture of the OSSMETER system.
After this data is extracted by the platform, it will be made available via a public web application, along with its decision support tools and visualizations.
With almost six months to go, the OSSMETER team is planning to release a beta-version of the platform to the project’s industrial partners to support the analysis of OSS projects hosted on SourceForge, GitHub and Eclipse forges. The OSSMETER platform is scheduled to be publicly available at the beginning of 2015. The OSSMETER team will run a public version of the platform which will analyse a number of OSS projects, and the results will be published on the OSSMETER website. Furthermore, the platform will also be available for download for personal analyses.
Link:
OSSMETER project: http://www.ossmeter.org/
Reference:
[1] N. Matragkas et al.: “OSSMETER D5.1 - Platform Architecture Specification”, Technical Report, pp. 1-32, Department of Computer Science, University of York, York, UK, 2013 (http://www.ossmeter.eu/publications)
Please contact:
Nicholas Matragkas
University of York
Tel: +44 (0)1904 325164
E-mail: [email protected]
Monitoring Services Quality in the Cloud
by Miguel Zuñiga-Prieto, Priscila Cedillo, Javier Gonzalez-Huerta, Emilio Insfran and Silvia Abrahão
Due to the dynamic nature of cloud computing environments, continuous monitoring of the quality of cloud services is needed in order to satisfy business goals and enforce service-level agreements (SLAs). Current approaches for SLA specifications in IT services are not sufficient since SLAs are usually based on templates that are expressed in a natural language, making automated compliance verification and assurance tasks difficult. In such a context, the use of models at runtime becomes particularly relevant: such models can help retrieve data from the running system to verify SLA compliance and, if the desired quality levels are not achieved, drive the dynamic reconfiguration of the cloud services architecture.
Cloud computing represents much more than an infrastructure with which organizations can quickly and efficiently provision and manage their computing capabilities. It also represents a fundamental shift in how cloud applications need to be built, run and monitored. While some vendors are offering different technologies, a mature set of development tools that can facilitate cross-cloud development, deployment and evaluation is yet to be developed. This definitely represents a growth area in the future. The different nature of cloud application development will drive changes in software development process frameworks, which will become more self-maintained and practice-oriented.
Cloud services need to comply with a set of contract clauses and quality requirements, specified by an SLA. To support the fulfillment of this agreement, a monitoring process can be defined which allows service providers to determine the actual quality of cloud services. Traditional monitoring technologies are restricted to static and homogenous environments and, as such, cannot be appropriately applied to cloud environments [3]. Further, during the development of these technologies, many assumptions are made at design time. However, due to the dynamic nature of cloud computing, meeting those assumptions in this context is not possible. It is necessary, therefore, to monitor the continuous satisfaction of the functional and quality requirements at runtime.
During this monitoring process, the violation of an SLA clause may trigger the dynamic reconfiguration of the existing cloud services architecture. Dynamic reconfiguration creates and destroys architectural element instances at runtime: this is particularly important in the context of cloud computing as the services must continue working while the reconfiguration takes place. However, little attention has been paid to supporting this reconfiguration at runtime and only recently has the field of software engineering research started focusing on these issues [1].
Through the Value@Cloud project, funded by the Ministry of Economy and Competitiveness in Spain, we are developing a framework to support model-driven incremental cloud service development. Specifically, the framework supports cloud development teams to: i) capture business goals and Quality-of-Service (QoS) attributes (which will form part of the SLA); ii) create and incrementally deploy architecture-centric cloud services that are capable of dynamically evolving; and iii) monitor the quality of cloud services delivered to the customers.
The monitoring strategy developed through this project is based on two key elements. The first is models at runtime [2] which verify the degree of compliance against the quality requirements specified in the SLA. The second is techniques for dynamically reconfiguring the cloud services architecture if the desired quality levels are not satisfied. The main activities and artifacts involved in this monitoring strategy are shown in Figure 1.
Models at runtime offer flexibility to the monitoring infrastructure through their reflection mechanisms: the modification of quality requirements may dynamically change the monitoring computation, thus avoiding the need to adjust the monitoring infrastructure. In our approach, models at runtime are part of a monitoring & analysis middleware that interacts with cloud services. This middleware retrieves data in the model at runtime, analyzes the information and provides a report outlining the SLA violations. This report is used in the reconfiguration planning to dynamically reconfigure the cloud services architecture in order to satisfy the SLA. The architecture reconfiguration is carried out by generating cloud-specific reconfiguration plans, which include adaptation patterns to be applied to cloud service instances at runtime.
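The monitoring loop described above can be sketched in a few lines. This is a minimal illustration, not the Value@Cloud implementation: the SLA clauses, measurement names and adaptation patterns below are all invented for the example.

```python
# Minimal sketch of checking measurements against an SLA model at runtime
# and mapping violations to adaptation patterns. All names and values are
# illustrative, not taken from the Value@Cloud framework.

SLA_MODEL = {"response_time_ms": 10, "availability": 0.999}

def check_sla(measurements, model=SLA_MODEL):
    """Return the list of violated SLA clauses for one set of measurements."""
    violations = []
    if measurements["response_time_ms"] > model["response_time_ms"]:
        violations.append("response_time_ms")
    if measurements["availability"] < model["availability"]:
        violations.append("availability")
    return violations

def plan_reconfiguration(violations):
    """Map each violated clause to a (hypothetical) adaptation pattern."""
    patterns = {"response_time_ms": "scale-out", "availability": "replicate"}
    return [patterns[v] for v in violations]
```

Because the SLA clauses live in a model (here, a plain dictionary), new or modified quality requirements change only the model, not the checking infrastructure, which is the flexibility argument made above.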
We believe that our approach will facilitate the monitoring of the higher-level quality attributes specified in SLAs. It can also provide the architect with flexibility if new quality requirements need to be added or modified, since the changes will be performed at runtime and the monitoring infrastructure will remain unchanged. Finally, not only does this approach report the SLA violations identified, but it also provides a reconfiguration plan for dynamically changing the cloud service architecture in order to satisfy the SLA quality requirements.
Link:
ISSI Research Group at Universitat Politècnica de València: http://issi.dsic.upv.es/projects
References:
[1] L. Baresi, C. Ghezzi: “The Disappearing Boundary Between Development-time and Run-time”, FSE/SDP Workshop on Future of Software Engineering Research, pp. 17-21, 2010
[2] N. Bencomo et al.: “Requirements reflection: requirements as runtime entities”, 32nd International Conference on Software Engineering, pp. 199-202, 2010
[3] V.C. Emeakaroha et al.: “Low level Metrics to High level SLAs - LoM2HiS framework: Bridging the gap between monitored metrics and SLA parameters in cloud environments”, HPCS 2010, pp. 48-54.
Please contact:
Silvia Abrahão
Universitat Politècnica de València, Spain
E-mail: [email protected]
Special Theme: Software Quality
Figure 1: Cloud services quality monitoring and reconfiguration infrastructure.
Dictō: Keeping Software Architecture under Control
by Andrea Caracciolo, Mircea Filip Lungu and Oscar Nierstrasz
Dictō is a declarative language for specifying architectural rules that uses a single uniform notation. Once defined, the rules can automatically be validated using adapted off-the-shelf tools.
Quality requirements (e.g., performance or modifiability) and other derived constraints (e.g., naming conventions or module dependencies) are often described in software architecture documents in the form of rules. For example:
• “Repository interfaces can only declare methods named ‘find*()’”; or
• “If an exception is wrapped into another one, the wrapped exception must be referenced as the cause”; and
• “Entity bean attributes of type ‘Code’ must be annotated with @Type(type = “com.[..].hibernate.CodeMapping”)”.
Ideally, rules such as these should be checked periodically and automatically. However, after interviewing and surveying dozens of practitioners [1], we discovered that in approximately 60% of cases, these architectural rules are either checked using non-automated techniques (e.g., code review or manual testing) or not checked at all. This situation arises because the automated tools currently available are highly specialized and not always convenient to use. Typically, these tools only handle one kind of rule based on various (often undocumented) theoretical and operational assumptions that hinder their adoption. For a practitioner to be able to validate all their architectural rules, they would need to learn about multiple automated tools, conduct experimental evaluations and set up a proper testing environment. This process requires a significant time and resource investment with no evident payoff.
Our approach in a nutshell
Our goal is to enable practitioners to automatically check that their software systems are evolving without straying from previously established architectural rules. To support this idea, we propose that Dictō [2], a unified DSL (domain specific language), could be used to specify the architectural rules, thus allowing them to be automatically tested using adapted off-the-shelf tools.
This proposed approach supports a fully automated verification process, allowing even non-technical stakeholders to be involved in the specification of rules. Once the rules are expressed, their verification can be integrated into the continuous integration system. This ensures the correct implementation of the planned architecture over time and helps prevent architectural decay.
How it works
Using Dictō, an architectural rule such as “The web service must answer user requests within 10 ms” can be expressed as “WebService must HaveResponseTimeLessThan(“10 ms”)”.
Rules are composed of subject entities and logical predicates. In this example, the subject (“WebService”) is an entity defined by the user, which maps to a concrete element of the system. The predicate (“HaveResponseTimeLessThan”) is formulated to prescribe the expected properties on the specified subjects. To increase expressivity without sacrificing readability, we support four types of rules: must, cannot, only-can and can-only.
Dictō rules are parsed and fed to the most appropriate tool through purpose-built adapters (Figure 1). These are plug-ins designed to accept rules that match a specific syntactic structure. The accepted rules are then analyzed and used to generate a valid input specification for the adapted tool. The results obtained from each of the supported tools can eventually be aggregated and used to build an overall report for the user. The adapters are written by tool experts and, by contributing the necessary code to the Dictō project, can be shared with a larger user-base.
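To make the parse-and-dispatch idea concrete, here is a toy approximation of how such rules might be parsed and routed to tool adapters. The grammar, the adapter registry and the predicate-to-tool mapping are all invented for this sketch and do not reflect Dictō's actual implementation:

```python
# Toy sketch of parsing Dictō-style rules and dispatching them to adapters.
# The grammar and the predicate-to-tool mapping are illustrative only.
import re

RULE_RE = re.compile(
    r'(\w+)\s+(must|cannot|only-can|can-only)\s+(\w+)\s*\(([^)]*)\)'
)

def parse_rule(text):
    """Split a rule into subject, modality, predicate and argument."""
    m = RULE_RE.match(text)
    if not m:
        raise ValueError("not a rule: " + text)
    subject, modality, predicate, args = m.groups()
    return {"subject": subject, "modality": modality,
            "predicate": predicate, "args": args.strip('" ')}

# Hypothetical registry: which adapter handles which predicate.
ADAPTERS = {"HaveResponseTimeLessThan": "JMeter", "DependOn": "Moose"}

def dispatch(rule):
    """Pick the adapter that accepts this rule's predicate, if any."""
    return ADAPTERS.get(rule["predicate"], "unsupported")
```

The point of the registry is the one made above: adapter authors contribute the mapping for their tool, and the dispatcher stays generic.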
Implementation
The current version of Dictō is capable of testing rules defined over various aspects of a system, from observable behaviours (e.g., latency, load or uptime) to implementation (e.g., dependencies or code clones). It features adapters for six different tools (JMeter, JavaPathFinder, PMD, grep, Moose and ping). By collaborating with interested stakeholders, we plan to extend this catalogue of testable rules. This future step will offer invaluable insights into the needs of actual users.
This work has been developed by Andrea Caracciolo (Software Composition Group, University of Bern). It is part of the “Agile Software Assessment” project funded by the Swiss National Science Foundation. The tool is freely available for download on our website (http://scg.unibe.ch/dicto-dsl/) and the source code is open under the MIT license. We are interested in both academic and industry collaborations to further develop and assess our approach.
Link:
http://scg.unibe.ch/dicto-dsl/
References:
[1] A. Caracciolo, M. F. Lungu, O. Nierstrasz: “How Do Software Architects Specify and Validate Quality Requirements?”, in Software Architecture, LNCS 8627, pp. 374-389, Springer, 2014
[2] A. Caracciolo, M. F. Lungu, O. Nierstrasz: “Dicto: A Unified DSL for Testing Architectural Rules”, in ECSAW ‘14, ACM, 2014.
Please contact:
Andrea Caracciolo
University of Bern, Switzerland
E-mail: [email protected]
Figure 1: Our approach, from the specification of the rules to their testing.
Dedicated Software Analysis Tools

by Nicolas Anquetil, Stéphane Ducasse and Usman Bhatti

The data and software analysis platform Moose allows for the quick development of dedicated tools that can be customized at different levels. These tools are crucial for large software systems that are subject to continuous evolution.

The lifetime of large systems (such as those that support the activities of banks, hospitals, insurance companies and the army) can be measured in decades. Such software systems have become a crucial component for running the day-to-day affairs of our society. Since these systems model important aspects of human activity, they must undergo continuous evolution that follows the evolution of our society. For example, new laws, economic constraints or requirements force large software systems to evolve. Previous studies have shown that undertaking this evolution can represent up to 90% of total software effort [1]. Controlling such systems and ensuring they can evolve is a key challenge: it calls for a detailed understanding of the system, as well as its strengths and weaknesses. Deloitte recently identified this issue as an emerging challenge [2].

From an analysis of the current situation, four key facts emerge.
1. Despite the importance of software evolution for our economy, it is not considered to be a relevant problem (and in fact, is considered a topic of the past): for example, currently there are no EU research axes that focus on this crucial point while buzzwords such as “big data” and “the Cloud” attract all the attention.
2. People seem to believe that the issues associated with software analysis and evolution have been solved, but the reality is that little has been accomplished.
3. New development techniques such as Agile Development, Test Driven Development, Service-Oriented Architecture and Software Product Lines cannot solve the problems that have accumulated over years of maintenance on legacy systems, and it is impossible to dream of redeveloping even a small fraction of the enormous quantity of software that exists today.
4. Software evolution is universal: it happens to any successful software, even in projects written with the latest and coolest technologies. The productivity increases that have been achieved with more recent technologies will further complicate the issue as engineers produce more complex code that will also have to be maintained. There are tools that propose some basic analyses in terms of “technical debt” (i.e., that put a monetary value on bad code quality); however, knowing that you have a debt does not help you take action to improve code quality.

Typical software quality solutions that assess the value of some generic metrics at a point in time are not adapted to the needs of the developers. Over the years we have developed Moose, a data and software analysis platform. We have previously presented Moose [3], but in this article, we want to discuss some of the aspects we learnt while selling tools built on top of Moose. In conjunction with the clients of our associated company Synectique, we identified that an adequate analysis infrastructure requires the following elements.

The first is dedicated processes and tools, which are needed to approach the specific problems a company or system might face. Frequently, software systems use proprietary organization schemes to complete tasks, for example, to implement a specific bus communication between components. In such cases, generic solutions are mostly useless as they only give information in terms of the “normal”, low-level concepts available in the programming language used. Large software systems need to be analyzed at a higher abstraction level (e.g., component, feature or sub-system). This supports reverse engineering efforts. In Moose, we offer a meta model-based solution where the imported data is stored independently of the programming language. This approach can be extended to support proprietary concepts or idioms, and new data can be supported by merely adapting the model and defining the proper importer. Once the information is imported, analysts can take advantage of the different tools for crafting software analyses that are tailored to meet their needs.

The second element is tagging. End users and/or reengineers often require a way to annotate and query entities with expert knowledge or the results of an analysis. To respond to this need, end users and reengineers are provided with a tagging mechanism which allows them to identify interesting entities or their groups. An interesting case which highlights the use of this mechanism is the extraction of a functional architecture from structural model information. Once experts or analyses have tagged entities, new tools and analyses (such as a rule-based validation) use it (by querying) to advance knowledge and create more results.

Figure 1: A dependency analyzer for legacy code.
The third element is a dependency nightmare analysis and remediation tool. Large and/or old software systems have often suffered from architectural drift for so long that there is little or no architecture left. All parts of a software system are intrinsically linked and frequently, loading three modules can mean loading the complete system. The challenge is how to deal with this fine-grained information at a large grain level. We propose an advanced cyclic dependency analysis and removal tool, as well as a drill-down on architectural views. Figure 1 shows the tool revealing recursive dependencies for impact analysis.
The fourth element is a trend analysis solution. Instead of a punctual picture of a system’s software state, it is desirable to understand the evolution of the quality of entities. As the source code (and thus the software entities) typically evolves in integrated development environments, independently from the dedicated, off-the-shelf software quality tools, computing quality analysis trends requires that changes (e.g., add, remove, move or rename) of individual software entities be identified. We propose a tool that computes such changes and the metric evolutions. Figure 2 shows the changes computed on two versions (green: entity added, red: entity removed) and the evolution of quality metrics for a change. Queries may be expressed in terms of the changes (i.e., “all added methods”) or in terms of the metric variations (i.e., “increase of CyclomaticComplexity > 5”).
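The kind of change and trend query described above can be sketched briefly. This is an illustrative reimplementation of the idea, not Moose's code; the data layout (entity name mapped to its metrics) is invented for the example:

```python
# Sketch of change detection and metric-trend queries between two versions
# of a system. The data layout is invented; it assumes an entity carries
# the same metric names in both versions.

def diff_versions(old, new):
    """old/new: dicts mapping entity name -> {metric: value}.

    Returns (added entities, removed entities, per-entity metric deltas).
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    deltas = {e: {m: new[e][m] - old[e][m] for m in new[e]}
              for e in set(old) & set(new)}
    return added, removed, deltas

def query_increase(deltas, metric, threshold):
    """E.g. all entities whose CyclomaticComplexity increased by more than 5."""
    return sorted(e for e, d in deltas.items() if d.get(metric, 0) > threshold)
```

A query like “increase of CyclomaticComplexity > 5” then becomes `query_increase(deltas, "CyclomaticComplexity", 5)` over the computed deltas.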
Conclusion
Software evolution and maintenance is, and will continue to be, a challenge for the future. This is not because of a lack of research advances but rather because more and more software is being created and that software is destined to last longer. In addition, any successful software system must evolve to adapt to global changes. Our experience shows that while problems may look similar on the surface, key problems often require dedicated attention (e.g., processing, analyses and tools). There is a need for dedicated tools that can be customized at different levels, such as models, offered analyses and the level of granularity.
References:
[1] C. Jones: “The Economics of Software Maintenance in the Twenty First Century”, 2006, http://www.compaid.com/caiinternet/ezine/capersjones-maintenance.pdf
[2] B. Briggs et al.: “Tech Trends 2014, Inspiring Disruption”, White Paper, Deloitte University Press, 2014, http://www.deloitte.com/assets/Dcom-Luxembourg/Local%20Assets/Documents/Whitepapers/2014/dtt_en_wp_techtrends_10022014.pdf
[3] O. Nierstrasz, S. Ducasse, T. Gîrba: “The story of Moose: an agile reengineering environment”, ESEC/SIGSOFT FSE 2005, pp. 1-10.
Mining Open Software Repositories

by Jesús Alonso Abad, Carlos López Nozal and Jesús M. Maudes Raedo

With the boom in data mining which has occurred in recent years and higher processing powers, software repository mining now represents a promising tool for developing better software. Open software repositories, with their availability and wide spectrum of data attributes, are an exciting testing ground for software repository mining and quality assessment research. In this project, the aim was to achieve improvements in software development processes in relation to change control, release planning, test recording, code review and project planning processes.

In recent years, scientists and engineers have started turning their heads towards the field of software repository mining. The ability to not only examine static snapshots of software but also the way they have evolved over time is opening up new and exciting lines of research towards the goal of enhancing the quality assessment process. Descriptive statistics (e.g., mean, median, mode, quartiles of the data-set, variance and standard deviation) are not enough to generalize specific behaviours such as how prone a file is to change [1]. Data mining analyses (e.g., clustering, regression, etc.) which are based on the newly accessible information from software repositories (e.g., contributors, commits, code frequency, active issues and active pull requests) must be developed with the aim of proactively improving software quality, not only reactively responding to issues.

Open source software repositories like Sourceforge and GitHub provide a rich and varied source of data to mine. Their open nature welcomes contributors with very different skill sets and experience levels, and the absence or low levels of standardized workflow enforcement make them reflect ‘close-to-extreme’ cases (as opposed to the more structured workflow patterns experienced when using, for instance, a branch-per-task branching policy). In addition, they provide easily accessible data sources for scientists to experiment with. The collection of these massive amounts of data has been supported by Qualitas Corpus [2] and GHTorrent [3], which have both made multiple efforts to gather and offer datasets to the scientific community.
The project workflow, undertaken by our research team at the University of Burgos, Spain, included the following steps (Figure 1):
1. Obtain data collected by GHTorrent from the GitHub repository and put it into MongoDB databases.
2. Filter the data according to needs and expand the data where possible (e.g., downloading source code files or calculating measurements such as the number of commits, number of issues opened, etc.). Some pre-processing of the data using JavaScript was completed during the database querying step and a number of Node.js scripts were used for several operations afterwards (e.g., file downloading or calculating static code metrics such as the number of lines of code, McCabe’s complexity, etc.).
3. Define an experiment with the aim of improving the software development process and pack the expanded data into a data table that will be supplied to a data mining tool to be used for a range of different techniques including regression or clustering.
4. Evaluate the data mining results and prepare experiments to validate new hypotheses based on those results.
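The flavour of steps 2 and 3 can be sketched in miniature: derive a per-file change count (a simple change-proneness measure, in the sense of [1]) from a list of commits and pack it into rows for a data mining tool. The commit data layout and field names are invented for this illustration, not the GHTorrent schema:

```python
# Toy sketch of steps 2-3: compute per-file change counts from commit data
# and pack them into rows for a mining tool. The data layout is invented
# for illustration and is not the GHTorrent schema.
from collections import Counter

def change_proneness(commits):
    """commits: iterable of {'files': [paths touched by the commit]}.

    Returns a Counter mapping each file path to its number of changes.
    """
    counts = Counter()
    for commit in commits:
        counts.update(commit["files"])
    return counts

def to_rows(counts):
    """(file, n_changes) rows sorted by change-proneness, most-changed first."""
    return sorted(counts.items(), key=lambda kv: -kv[1])
```

In a real pipeline these rows would be joined with static code metrics (lines of code, McCabe's complexity, etc.) before being handed to a regression or clustering technique.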
Despite the benefits of using such repositories, it is important to remember that, sometimes, a lack of standardization in the integration process can create unformatted or missing commit messages or frequent unstable commits. This, and other constraints (not discussed here), can make data mining these repositories more difficult and/or lead to sub-optimal results.
Until now, software quality assessment has focused on single snapshots taken throughout the life of the software. Thus, the assessments have not been able to take the time variable into account. The use of software repositories allows researchers to address this shortcoming. Consequently, future software repository mining will play a key role in enhancing the software development process, allowing developers to detect weak points, predict future issues and provide optimized processes and development cycles. Open software repositories offer a number of future research opportunities.
References:
[1] I. S. Wiese et al.: "Comparing communication and development networks for predicting file change proneness: An exploratory study considering process and social metrics", Electron. Commun. EASST, proc. of SQM 2014, vol. 65, 2014.
[2] E. Tempero et al.: "The Qualitas Corpus: A Curated Collection of Java Code for Empirical Studies", 2010 Asia Pacific Softw. Eng. Conf., 2010.
[3] G. Gousios, D. Spinellis: "GHTorrent: GitHub's data from a firehose", 9th IEEE MSR, 2012.
Please contact:
Jesús Alonso Abad, University of Burgos, Spain
Tel: +34 600813116
E-mail: [email protected]

Carlos López Nozal, University of Burgos, Spain
Tel: +34 947258989
E-mail: [email protected]

Jesús M. Maudes Raedo, University of Burgos, Spain
Tel: +34 947259358
E-mail: [email protected]
Figure 1: The process of mining data from an open software repository.
Code clones [1] have been widely studied and there is a large amount of literature on this issue. This work has led to a number of different types of clones being identified. Clones are present in all non-trivial software systems; the percentage of duplicated lines involved is usually estimated at between 5% and 20%, but can sometimes even reach 50% [2]. Many of the studies have investigated the factors that cause clone insertion, and their results have enabled several criteria and detection techniques to be developed.
When addressing the issue of duplicated code management, we have to consider the following aspects:
• which instances are worth refactoring and which are not; and
• once an instance has been evaluated as worth refactoring, which technique should be applied to remove the duplicated instance.
Refactoring duplicated code is a task in which code fragments are merged or moved to other locations, for example other functions, methods or classes. Moving code means that the computational logic belonging to a specific entity of the system is moved: it should be approached with caution, as relocation can break the original design coherence, reducing cohesion and/or moving responsibilities to unsuitable entities. There are a number of refactoring techniques available, each having its own pros and cons in both design and lower-level aspects.
In this study, we proposed an approach that aims at automatically evaluating and selecting suitable refactoring techniques based on the classification of the clones, thus reducing the human involvement in the process. We focused our attention on the following aspects:
• an analysis of the location of each clone pair, resulting in a specific set of applicable refactoring techniques;
• the ranking of the applicable refactoring techniques based on a set of weighting criteria; and
• the aggregation of the critical clone information and best refactoring techniques, according to those numerical criteria.
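The ranking step might look like the following sketch, where the criteria names and weight values are purely hypothetical stand-ins for the numerical criteria used by the tool:

```javascript
// Hypothetical weighting criteria; the real tool's criteria and weights differ.
const WEIGHTS = { locReduction: 0.5, designQuality: 0.3, effort: -0.2 };

// Score each applicable technique and return them best-first.
function rankTechniques(candidates) {
  return candidates
    .map((t) => ({
      name: t.name,
      score:
        WEIGHTS.locReduction * t.locReduction +
        WEIGHTS.designQuality * t.designQuality +
        WEIGHTS.effort * t.effort,
    }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankTechniques([
  { name: "extractMethod", locReduction: 10, designQuality: 5, effort: 2 },
  { name: "pullUpMethod", locReduction: 4, designQuality: 8, effort: 1 },
]);
console.log(ranked[0].name); // extractMethod scores highest here
```

A weighted-sum ranking like this keeps the selection transparent: each suggestion can be traced back to the individual criteria that produced its score.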
In line with this vision, we developed a tool which suggests the 'best' refactoring techniques for code clones in Java, named the Duplicated Code Refactoring Advisor (DCRA; Figure 1). The tool consists of four components, each designed with a specific goal. Every component enriches the information obtained on the duplicated code, and the whole elaboration process identifies a suitable list of techniques that could be applied to the most problematic duplications. The four components are:
• the Clone Detector, an external tool for detecting clone pairs (we are currently using a well-known tool called NiCad [3]);
• the Clone Detailer, which analyzes the Clone Detector output and characterises every clone, detailing information such as clone location, size and type;
• the Refactoring Advisor, which visits a decision tree to choose the possible refactoring techniques for each clone pair; this component suggests refactoring techniques based on the clone location and the variables contained in the clone; suggestions are ranked on the basis of the clone's different features, e.g., a Lines of Code (LOC) variation and an evaluation of the quality resulting from its application, in terms of the exploitation of object-oriented programming constructs; and
• the Refactoring Advice Aggregator, which aggregates the available information on clones and refactoring techniques, groups them by class or package, and then sorts them by refactoring significance or clone pair impact, thus providing a summary report which captures the most interesting information about clones, e.g., which are the largest clones and which clones should be easiest (or most convenient) to remove.

A Refactoring Suggestion Tool for Removing Clones in Java Code
by Francesca Arcelli Fontana, Marco Zanoni and Francesco Zanoni

Code duplication is considered a widespread code smell, a symptom of bad code development practices or potential design issues. Code smells are also considered to be indicators of poor software maintainability. The refactoring cost associated with removing code clones can be very high, partly because of the number of different decisions that must be made regarding the kind of refactoring steps to apply. Here, we describe a tool that has been developed to suggest the best refactoring steps that could be taken to remove clones in Java code. Our approach is based on the classification of clones, in terms of their location in a class hierarchy, so that decisions can be made from a restricted set of refactorings that have been evaluated using multiple criteria.

Figure 1: Duplicate code data flow through DCRA components.
In developing this approach, our dual aim was to filter out which clone pairs are worthy of refactoring and to suggest the best refactoring techniques for those clone pairs. We have successfully provided an automated technique for selecting the best refactoring techniques in a given situation, based on a classification of code clones. We experimented with the Clone Detailer module on 50 systems of the Qualitas Corpus from Tempero et al. We validated all the modules of our DCRA tool on four systems of the Qualitas Corpus. The tool suggested a successful refactoring in most cases.
Through its use, we hope that DCRA will offer a concrete reduction in the human involvement currently required in duplicated code refactoring procedures, thus reducing the overall effort required from software developers.
References:
[1] M. Fowler: "Refactoring: Improving the Design of Existing Code", Addison-Wesley, 1999.
[2] M. F. Zibran, C. K. Roy: "The road to software clone management: A survey", The Univ. of Saskatchewan, Dept. of Computer Science, Tech. Rep. 2012-03, Feb. 2012, http://www.cs.usask.ca/documents/techreports/2012/TR-2012-03.pdf
[3] C. Roy, J. Cordy: "NICAD: Accurate detection of near-miss intentional clones using flexible pretty-printing and code normalization", in proc. of ICPC 2008, Amsterdam, pp. 172–181.
Debugging continues to be a costly activity in the process of developing software. However, there are techniques to decrease the overall costs of bugs (i.e., bugs get cheaper to fix) and increase reliability (i.e., on the whole, more bugs are fixed).
Some bugs are deeply rooted in the domain logic, but others are independent of the specificity of the application being debugged. This latter category is called "crowd bugs": unexpected and incorrect behaviours that result from a common and intuitive usage of an application programming interface (API). In this project, our research group at Inria Lille (France) set out to minimize the difficulties associated with fixing crowd bugs. We propose a novel debugging approach for crowd bugs [1]. This debugging technique is based on matching the piece of code being debugged against related pieces of code reported by the crowd on a question and answer (Q&A) website.
To better define what a "crowd bug" is, let us first begin with an example. In JavaScript, there is a function called parseInt, which parses a string given as input and returns the corresponding integer value. Despite this apparently simple description and self-describing signature, this function poses problems for many developers, as witnessed by the dozens of Q&As on this topic (http://goo.gl/m9bSJS). Many of these relate to the same question: "Why does parseInt("08") produce a '0' and not an '8'?" The answer is that if the argument of parseInt begins with 0, it is parsed as an octal number. So why is the question asked again and again? We hypothesize that the semantics of parseInt are counter-intuitive for many people and, consequently, the same issue occurs over and over again in development situations, independently of the domain. The famous Q&A website for programmers, StackOverflow (http://stackoverflow.com), contains thousands of crowd-bug Q&As. In this project, our idea was to harness the information contained in the coding scenarios posed and answered on this website at debugging time.
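The behaviour in question is easy to reproduce. Note that ES5 removed the implicit octal interpretation, so modern engines return 8 either way; passing the radix explicitly is what removes the ambiguity everywhere:

```javascript
// Before ES5, a leading "0" made parseInt assume radix 8, so parseInt("08")
// returned 0 ("8" is not an octal digit). ES5 removed that rule, so modern
// engines return 8, but the explicit radix is what makes the intent clear.
const ambiguous = parseInt("08");    // 0 on pre-ES5 engines, 8 on modern ones
const explicit = parseInt("08", 10); // always 8
console.log(ambiguous, explicit);
```

The fact that the "fix" is a one-character change, yet the question keeps being asked, is exactly what makes this a crowd bug.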
The new debugger we are proposing works in the following manner. When faced with a hard-to-understand bug, the developer sets a breakpoint in the code presumed to be causing the bug and then approaches the crowd by clicking the 'ask the crowd' button. The debugger then extracts a 'snippet', defined as n lines of code around the breakpoint, and cleans it according to various criteria. This snippet acts as a query which is then submitted to a server which, in turn, retrieves a list of Q&As that match that query. The idea is that within those answers lies a range of potential solutions to the bug, which can then be reused. The user interface of our prototype is based on a crowd-based extension of Firebug, a JavaScript debugger for Firefox (Figure 1).
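The snippet-extraction step can be illustrated with a minimal sketch (the real prototype also cleans the snippet before querying, which is not shown here):

```javascript
// Extract n lines of context around a breakpoint (0-based line index).
function extractSnippet(sourceLines, breakpointLine, n = 2) {
  const start = Math.max(0, breakpointLine - n);
  const end = Math.min(sourceLines.length, breakpointLine + n + 1);
  return sourceLines.slice(start, end).join("\n");
}

// One line of context on each side of the breakpoint at line 2.
console.log(extractSnippet(["a();", "b();", "c();", "d();", "e();"], 2, 1));
```

The resulting string is what is sent to the server as the query.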
Debugging with the Crowd: A Debug Recommendation System Based on StackOverflow
by Martin Monperrus, Anthony Maia, Romain Rouvoy and Lionel Seinturier

Leveraging the wisdom of the crowd to improve software quality through the use of a recommendation system based on StackOverflow, a famous Q&A website for programmers.

To determine the viability of this approach, our initial task was to confirm whether Q&A websites such as StackOverflow were able to handle snippet inputs well. Our approach only uses code to query the Q&As, as opposed to text elaborated by developers. To investigate this question, we took a dataset comprising 70,060 StackOverflow Q&As that were determined to possibly relate to JavaScript crowd bugs (dataset available on request). From this dataset, 1,000 Q&As and their respective snippets were randomly extracted. We then performed 1,000 queries to the StackOverflow search engine, using the snippets only as input. This analysis yielded the following results: 377 snippets were considered to be non-valid queries, 374 snippets yielded no results (i.e., the expected Q&A was not found) and, finally, 146 snippets yielded a perfect match (i.e., the expected Q&A was ranked #1).
These results indicate that StackOverflow does not handle code snippets input as queries particularly well.
In response to this issue, we introduced pre-processing functions aimed at improving the matching quality between the snippets being debugged and those in the Q&A repository. We experimented with different pre-processing functions and determined that the best one is based on a careful filtering of the abstract syntax tree of the snippet. Using this pre-processing function, we repeated the same evaluation described above and, on this occasion, 511 snippets yielded a #1-ranked Q&A (full results can be found in our online technical report [1]). Consequently, we integrated this pre-processing function into our prototype crowd-bug debugger to maximize the chances of finding a viable solution.
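As a rough illustration of what such pre-processing does (the actual implementation filters the abstract syntax tree, whereas this sketch only uses regular expressions), consider:

```javascript
// Keep only word-like tokens so the query matches structurally similar code.
// A much-simplified stand-in for the AST-based filtering described above.
function cleanSnippet(snippet) {
  return snippet
    .replace(/(["'`]).*?\1/g, " ")    // drop string literals
    .replace(/\/\/.*$/gm, " ")        // drop line comments
    .replace(/[^A-Za-z0-9_\s]/g, " ") // drop punctuation and operators
    .replace(/\s+/g, " ")             // collapse whitespace
    .trim();
}

console.log(cleanSnippet('var x = parseInt("08"); // why 0?'));
// → "var x parseInt"
```

Stripping literals and punctuation removes the domain-specific noise, leaving the API names that make the snippet match other occurrences of the same crowd bug.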
Beyond this case, we are conducting further research which aims to leverage crowd wisdom to improve the automation of software repair and contribute to the overarching objective of achieving more resilient and self-healing software systems.
Reference:
[1] M. Monperrus, A. Maia: "Debugging with the Crowd: a Debug Recommendation System based on Stackoverflow", Technical report #hal-00987395, INRIA, 2014.
Please contact:
Martin Monperrus
University Lille 1, France
Inria/Lille1 Spirals research team
Tel: +33 3 59 35 87 61
E-mail: [email protected]
Figure 1: A screenshot of our prototype crowd-enhanced JavaScript debugger. The button 'AskCrowd' selects the snippet surrounding the breakpoint (Line 11: red circle). A list of selected answers that closely match the problem is then automatically retrieved from StackOverflow. In this case, the first result is the solution: the prefix has a meaning for parseInt which must be taken into account.
RiVal: A New Benchmarking Toolkit for Recommender Systems
by Alan Said and Alejandro Bellogín
RiVal is a newly released toolkit, developed during two ERCIM fellowships at Centrum Wiskunde & Informatica (CWI), for the transparent and objective benchmarking of recommender system software such as Apache Mahout, LensKit and MyMediaLite. This will ensure that robust and comparable assessments of their recommendation quality can be made.
Research on recommender systems often focuses on making comparisons of their predictive accuracy, i.e., the better the evaluation scores, the better the recommender. However, it is difficult to compare results between different recommender systems and frameworks, or even to assess the quality of one system, due to the myriad of design and implementation options in the evaluation strategies. Additionally, algorithm implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations. We have developed an open source evaluation toolkit for recommender systems (RiVal), which provides a set of standardised evaluation methodologies. This was achieved by retaining complete control of the evaluation dimensions being benchmarked (i.e., data splitting, metrics, evaluation strategies, etc.), independent of the specific recommendation strategies.
Recommender systems are a popular means of assisting users of a range of online services in areas such as music (e.g., Spotify, Last.fm), movies and videos (e.g., Netflix, YouTube) or other items (e.g., Amazon, eBay) [1]. In recent years, research in this field has grown exponentially and today most top-tier research venues feature tracks on recommendation. There has been a parallel development in industry and now many data science positions place a significant emphasis on candidates possessing expertise in recommendation techniques. This gain in popularity has led to an overwhelming growth in the amount
of available literature, as well as a large set of algorithms to be implemented. With this in mind, it is becoming increasingly important to be able to benchmark recommendation models against one another to objectively estimate their performance.
Usually, each implementation of an algorithm is associated with a recommendation framework or software library, which in turn must provide additional layers to access the data, report performance results, etc. An emerging problem associated with having numerous recommendation frameworks is the difficulty in comparing results across software frameworks, i.e., the reported accuracy of an algorithm in one framework will often differ from the same algorithm in a different framework. Minor differences in algorithmic implementation, data management and evaluation are among the causes of this problem. To properly analyse this problem, we have developed RiVal, a software toolkit that is capable of efficiently evaluating recommender systems. RiVal can test the various functionalities of recommender systems while remaining agnostic to the actual algorithm in use. It does not incorporate recommendation algorithms but rather provides bindings or wrappers to the three recommendation frameworks most common at the moment: Apache Mahout, LensKit and MyMediaLite.
RiVal provides a transparent evaluation setting which gives the practitioner complete control of the various evaluation steps. More specifically, it is composed of three main modules: data splitting, candidate item generation and performance measurement. In addition, an item recommendation module is also provided that integrates the three common recommendation frameworks (listed above) into the RiVal pipeline (Figure 1). RiVal is now available on GitHub and further development information can be found on its Wiki and manual pages (see Links section below). The toolkit's features are also outlined in detail in a paper [2] and demo [3]. The toolkit can be used programmatically as Maven dependencies, or by running it as a standalone program for each of the steps.
By using RiVal, we have been able to benchmark the three most common recommendation algorithms implemented in the three aforementioned frameworks using three different datasets. We also generated a large amount of results using our controlled evaluation protocol, which consisted of four data splitting techniques, three strategies for candidate item generation, and five main performance metrics. Our results point to a large discrepancy between the same algorithms implemented in different frameworks. Further analyses of these results [2] indicate that many of these inconsistencies were a result of differences in the implementation of the algorithms. However, the implementation of the evaluation metrics and methods also differed across the frameworks, which makes an objective comparison of recommendation quality across the frameworks impossible when using a framework-internal evaluation.
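A small example of how metric implementations alone can cause such discrepancies: two equally plausible ways of computing RMSE, globally over all predictions or averaged per user, give different numbers on the same data. (This is an illustration of the general point, not RiVal code.)

```javascript
// RMSE over all prediction errors pooled together.
function rmseGlobal(errorsByUser) {
  const all = errorsByUser.flat();
  return Math.sqrt(all.reduce((s, e) => s + e * e, 0) / all.length);
}

// RMSE computed per user, then averaged over users.
function rmsePerUser(errorsByUser) {
  const perUser = errorsByUser.map(
    (errs) => Math.sqrt(errs.reduce((s, e) => s + e * e, 0) / errs.length)
  );
  return perUser.reduce((s, r) => s + r, 0) / perUser.length;
}

// Same prediction errors, two defensible "RMSE" values.
const errors = [[1, 1], [3]]; // user A: two errors of 1; user B: one error of 3
console.log(rmseGlobal(errors), rmsePerUser(errors)); // ~1.915 vs 2
```

Unless the evaluation pipeline pins down such choices, two frameworks can report different scores for identical predictions, which is precisely what RiVal's controlled protocol is designed to prevent.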
The RiVal toolkit enables practitioners to perform completely transparent and objective evaluations of recommendation results, which will improve the selection of which recommendation framework (and algorithm) should be used in each situation. Providing an evaluation system which is highly configurable, unbiased by framework-dependent implementations, and usable across frameworks and datasets allows both researchers and practitioners to assess the quality of a recommender system in a wider context than current standards in the area allow.
This research was developed while both authors were ERCIM fellows at CWI.
References:
[1] F. Ricci, L. Rokach, B. Shapira, P.B. Kantor: "Recommender Systems Handbook", Springer, 2011.
[2] A. Said, A. Bellogín: "Comparative Recommender System Evaluation: Benchmarking Recommendation Frameworks", in ACM RecSys, 2014.
[3] A. Said, A. Bellogín: "RiVal – A Toolkit to Foster Reproducibility in Recommender System Evaluation", in ACM RecSys, 2014.
Please contact:
Alejandro Bellogín
Universidad Autónoma de Madrid, Spain
E-mail: [email protected]
Figure 1: The RiVal evaluation pipeline. The toolkit's modular design means each module can be executed individually (i.e., only the evaluation module or only the data splitting module) or, alternatively, the complete pipeline can be executed within RiVal.
Model-Driven Engineering (MDE) is a software engineering paradigm aimed at improving developer productivity and software quality. To this end, the development process in MDE does not focus on the code but rather on the models of the system being built. Models characterize the relevant features of a system for a specific purpose, for example documentation, analysis, code generation or simulation. By abstracting away irrelevant features, system complexity is reduced and, thanks to support from (semi)automatic tools for some development tasks, developers can minimise human-introduced errors and enhance productivity.
For this to hold true, model quality should be a primary concern: a defect in a model can propagate into the final implementation of the software system. As with software, the quality of models can be regarded from many different perspectives. It is necessary to make sure that the models are realizable (i.e., the structural models should be satisfiable, the states in a behavioral model should be reachable, etc.). In addition, models that offer complementary views of the same system should be consistent.
Formal methods provide valuable techniques to ensure the correctness of these software models. Fortunately, abstraction makes models more amenable to analysis than source code. Even so, most formal verification problems are undecidable or have such high computational complexity that scalability is hampered. Thus, software verification is, and will remain, a grand challenge for software engineering research in the foreseeable future [1].
Beyond issues of scale, other model quality challenges include incomplete models and model evolution. These factors make the application of fully-fledged (complete and possibly undecidable) formal methods unsuitable in this context. Instead, light-weight approaches that are able to provide quick feedback and support large and complex models may be preferred, even if their outcomes can sometimes be inconclusive. We believe this pragmatic approach offers the best trade-off for non-critical software systems. In particular, we have been applying a family of light-weight formal methods, based on bounded verification by means of constraint programming, to evaluate the correctness of Unified Modeling Language (UML)/Object Constraint Language (OCL) software models.
The UML is a de facto standard for describing software models, providing a collection of visual and textual notations to describe different facets of a software system. The most popular UML notation is the class diagram, which depicts classes within an object-oriented hierarchy. However, UML class diagrams are unable to capture complex integrity constraints beyond basic multiplicity constraints. For more advanced constraints, class diagrams can be augmented using a companion textual notation, the OCL.
Using UML/OCL allows complex software systems to be designed without committing to a specific technology or platform. These designs can be verified to detect defects before investing effort in implementation. Typical design flaws include redundant constraints or inconsistencies arising from unexpected interactions among integrity constraints. These errors are far from trivial and may be very hard to detect and diagnose at the code level.
Evaluating the Quality of Software Models using Light-weight Formal Methods
by Jordi Cabot and Robert Clarisó

For non-critical software systems, the use of light-weight formal evaluation methods can guarantee sufficient levels of quality.

Figure 1: The architecture of the EMFtoCSP tool.

In this context, we have developed two open source tools capable of analyzing UML/OCL class diagrams: UMLtoCSP and its evolution, EMFtoCSP, which is able to deal with more general EMF-based models and is integrated within the Eclipse IDE. These tools frame correctness properties as Constraint Satisfaction Problems (CSPs), whose solution is an instance that constitutes an example that proves (or a counterexample that disproves) the property being checked. This mapping is transparent to the user, which means that they do not need a formal methods background to execute the tools or understand their output.
The constraint-logic programming solver ECLiPSe is used as the reasoning engine to find instances. This choice offers advantages and disadvantages with respect to SAT solvers, e.g., better support for complex numerical constraints. As with all bounded verification approaches, when a solution is not found, no conclusion can be drawn about the property, since a solution could exist beyond the search bounds.
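The idea behind bounded verification can be conveyed with a toy sketch: search exhaustively for a model instance that satisfies all invariants within given bounds. Finding one proves satisfiability; finding none is inconclusive. (This is a didactic illustration, not the EMFtoCSP implementation, which delegates the search to ECLiPSe.)

```javascript
// Enumerate candidate instances within the bounds; return the first one
// satisfying every invariant, or null if none exists in the bounded space.
function findInstance(bounds, invariants) {
  for (let age = bounds.min; age <= bounds.max; age++) {
    const instance = { age };
    if (invariants.every((inv) => inv(instance))) return instance;
  }
  return null; // inconclusive: a solution may still exist beyond the bounds
}

const satisfiable = findInstance({ min: 0, max: 120 }, [(p) => p.age >= 18]);
const contradictory = findInstance({ min: 0, max: 120 }, [
  (p) => p.age >= 18,
  (p) => p.age < 10, // conflicts with the first invariant
]);
console.log(satisfiable, contradictory); // an instance, then null
```

The second query returns null because the two OCL-like invariants contradict each other within the bounds, exactly the kind of unexpected constraint interaction the tools are designed to expose.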
We believe that using this approach could enable the adoption of model verification practices in many software companies where currently none are employed due to the lack of a usable solution. Ultimately, this absence risks the quality of the software they produce. Nevertheless, there is still a lot of work to be done. A number of extensions are currently planned for EMFtoCSP, but we would like to highlight two. The first is incremental verification, where, once a model has been checked, further evaluation should only consider the subset of the model that has changed since the last run. The second is an automatic suggestion of search bounds, whereby a quick pre-analysis of the model could suggest promising bounds within which to look for a solution, to maximize performance.
Link:
EMFtoCSP: https://github.com/atlanmod/EMFtoCSP
Reference:
[1] C. Jones, P. O'Hearn, J. Woodcock: "Verified Software: A Grand Challenge", Computer, vol. 39, no. 4, pp. 93-95, April 2006.
Please contact:
Jordi Cabot Sagrera
Inria and École des Mines de Nantes, France
E-mail: [email protected]

Robert Clarisó Viladrosa
Internet Interdisciplinary Institute – Universitat Oberta de Catalunya
E-mail: [email protected]
Figure 2: The EMFtoCSP interface, including the selection of the constraint (a), bounds (b) and properties (c) used to make the verification. The visualization of the results is presented in (d) and (e).
When analyzing the correctness of designs in complex software systems during their early stages of development, it is essential to apply formal methods and tools. The broader system is described using a formal specification language and its relative correctness (with respect to relevant behavioural properties) is checked by formally evaluating temporal logic formulas over the underlying computational model. Over the last two decades, we have developed the KandISTI family of model checkers, each one based on a different specification language, but all sharing a common (on-the-fly) temporal logic and verification engine.
The main objective of the KandISTI framework is to provide formal support to the software design process, especially in the early stages of the incremental design phase (i.e., when designs are still likely to be incomplete and likely to contain mistakes). The main features of KandISTI focus on the possibilities of (i) manually exploring the evolution of a system and generating a summary of its behaviours; (ii) investigating abstract system properties using a temporal logic supported by an on-the-fly model checker; and (iii) obtaining a clear explanation of the model-checking results, in terms of possible evolutions of the specific computational model.
The first tool in the family was the FMC model checker, which described a system by a hierarchical composition of sequential automata. This tool proved to be a very useful aid when teaching the fundamentals of automated verification techniques in the context of software engineering courses. In an attempt to reduce the gap between theoreticians and software engineers, the original model-checking approach was then applied experimentally to a computational model based on UML statecharts. In the context of the FP5 and FP6 EU projects AGILE and SENSORIA, this led to the development of UMC, in which a system is specified as a set of communicating UML-like state machines.
In cooperation with Telecom Italia, UMC was used to model and verify an asynchronous version of the SOAP communication protocol, and to model and analyse an automotive scenario provided by an industrial partner of the SENSORIA project. Currently, UMC is being used successfully in the experimentation of a model-checking-based design methodology in the context of the regional project TRACE-IT (Train Control Enhancement via Information Technology). This project aims to develop an automatic train supervision system that guarantees a deadlock-free status for train dispatches, even when there are arbitrary delays with respect to the original timetable. The largest model we analysed in this context had a state space of 35 million states.
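The kind of exhaustive state-space exploration underlying such an analysis can be sketched as a simple breadth-first search that flags deadlock states, i.e., reachable states with no outgoing transitions (the transition system below is a toy example, not the TRACE-IT model):

```javascript
// Breadth-first exploration of a transition system; collect reachable
// states that have no outgoing transitions (deadlocks).
function findDeadlocks(initial, successors) {
  const seen = new Set([initial]);
  const queue = [initial];
  const deadlocks = [];
  while (queue.length > 0) {
    const state = queue.shift();
    const next = successors(state);
    if (next.length === 0) deadlocks.push(state);
    for (const t of next) {
      if (!seen.has(t)) {
        seen.add(t);
        queue.push(t);
      }
    }
  }
  return deadlocks;
}

// Toy system: state "c" is a reachable deadlock.
const transitions = { a: ["b", "c"], b: ["a"], c: [] };
console.log(findDeadlocks("a", (s) => transitions[s])); // [ 'c' ]
```

Real model checkers such as UMC generate successors on the fly from the specification rather than from an explicit table, which is what makes state spaces of tens of millions of states tractable.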
Again in the context of SENSORIA, we developed the CMC model checker for the service-oriented process algebra COWS. Service-oriented systems require a logic that expresses the correlation between dynamically generated values appearing inside actions at different times. These correlation values make it possible, e.g., to relate the responses of a service to their specific requests, or to handle the concept of a session involving a long sequence of interactions among interacting partners. CMC was used to model and analyse service-oriented scenarios from the automotive and finance domains, as provided by industrial partners in the project.
The most recent member of the KandISTI family is VMC, which was developed specifically for the specification and verification of software product families. VMC performs two kinds of behavioural variability analyses on a given family of products. The first is that a logic property, expressed in a variability-aware version of a known logic, can be verified directly against the high-level specification of the product family behaviour, relying on the fact that, under certain syntactic conditions, the validity of the property over the family model guarantees the validity of the same property for all product models of the family. The second is that the actual set of valid product behaviours can be generated explicitly and the resulting specifications can be verified against the same logic property (this is surely less efficient than direct verification, but it makes it possible to identify precisely why the original property failed over the whole family). Experimentation with VMC is ongoing in the context of the EU FP7 project QUANTICOL. To date, only a small version of the bike-sharing case study from QUANTICOL has been considered, but more effort is needed to evaluate VMC on more realistically sized problems.

KandISTI: A Family of Model Checkers for the Analysis of Software Designs
by Maurice ter Beek, Stefania Gnesi and Franco Mazzanti

Driven by a series of European projects, researchers from the Formal Methods and Tools lab of ISTI-CNR have developed a family of model-checking tools for the computer-aided verification of the correctness of software designs. To date, these tools have been applied to a range of case studies in the railway, automotive and telecommunication fields.

Figure 1: The railway yard layout and missions for trains on the green, red, yellow and blue lines.
Link: http://fmt.isti.cnr.it/kandisti
References:
[1] M.H. ter Beek, A. Fantechi, S. Gnesi, F. Mazzanti: "A state/event-based model-checking approach for the analysis of abstract system properties", Science of Computer Programming 76(2):119-135, 2011, http://dx.doi.org/10.1016/j.scico.2010.07.002
[2] A. Fantechi, S. Gnesi, A. Lapadula, F. Mazzanti, R. Pugliese, F. Tiezzi: "A logical verification methodology for service-oriented computing", ACM Transactions on Software Engineering and Methodology 21(3):16, 2012, http://doi.acm.org/10.1145/2211616.2211619
[3] F. Mazzanti, G. O. Spagnolo, A. Ferrari: "Designing a deadlock-free train scheduler: A model checking approach", NASA Formal Methods, LNCS 8430, Springer, 2014, 264-269, http://dx.doi.org/10.1007/978-3-319-06200-6_22
Model-driven engineering (MDE) can be used to develop highly reliable software and offers a range of benefits, from systems analysis and verification to code generation. In MDE, system models are created by domain experts and then transformed into other models or code using model transformations. One of the languages used for writing these model transformations is QVT Operational Mappings (QVTo), which was specified in the 2007 Object Management Group (OMG) standard for model-to-model transformation languages. QVTo is regularly used by both academics and industry practitioners, including ASML, the leading provider of complex lithography systems for the semiconductor industry. Currently, ASML has more than 20,000 lines of QVTo code, supporting more than a hundred model transformations.
Despite its widespread use, however,QVTo is a relatively new language. Forgeneral-purpose languages, developershave had time to share best practices andestablish a number of standard referencepoints against which the quality of apiece of code can be judged. These areyet to be developed for QVTo.
Moreover, QVTo has a large amount oflanguage-specific constructs which arenot available in general-purpose lan-guages or even in other model-to-modeltransformation languages. In fact,QVTo specifications have beendescribed by some as “rather volumi-nous” and even “fantastically com-plex”. Therefore, it is unclear whetherestablished intuitions about codequality apply to QVTo and a lack ofstandardized and codified best practicesis recognized as a serious challenge inassessing their transformation quality.
In a response to this challenge, ASMLand Eindhoven University ofTechnology have joined in an ongoingcollaboration to investigate the qualityof QVTo transformations. In addition toassessing the quality of a transforma-tion, this project is also seeking to pro-mote the creation of higher-qualitytransformations from their inception,improving quality proactively [1].
To achieve this goal, a bottom-upapproach was used which combinedthree qualitative methodologies. Tobegin, a broad exploratory study whichincluded the analysis of interviews with
QVTo experts, a review of the existingliterature and other materials, and intro-spection were completed. Then, aQVTo quality model was developed toformalize QVTo transformation quality:this model consisted of high-levelquality goals, quality properties, andevaluation procedures. The qualitymodel was validated using the out-comes from a survey of a broad groupof QVTo developers in which they wereasked to rate each model property on itsimportance to QVTo code quality.
Many of the model properties recog-nized as important for QVTo transfor-mation quality are similar to those intraditional languages (e.g. , smallfunction size). However, this analysisalso highlighted a number ofproperties which are specific to QVToor other model transformation lan-guages. For instance, we found thatthe following QVTo-specific proper-ties were considered important forquality: the use of only a few blackboxes, few queries with side effects,little imperative programming (e.g.,for-loops) and small init sections.Deletion using trash-bin patterns wasalso found to be beneficial for per-
QvTo Model Transformations:
Assessing and Improving their Quality
by Christine M. Gerpheide, Ramon R.H. Schiffelers and Alexander Serebrenik
Applying a unique bottom-up approach, we (Eindhoven University of Technology (The Netherlands))
worked collaboratively with ASML, the leading provider of complex lithography systems for the
semiconductor industry, to assess the quality of QVTo model transformations. This approach
combines sound qualitative methods with solid engineering to proactively improve QVTo quality in a
practitioner setting.
ERCIM NEWS 99 October 2014 33
formance as it has high test coveragein relation to functionality [2].
Test coverage also emerged as a key issue in the expert interviews, with every interviewee highlighting the lack of a test coverage tool. Consequently, we prioritized this demand and created a test coverage tool for QVTo (Figure 1). Implemented as an Eclipse plugin, the tool helps QVTo developers to create higher-quality transformations from the start: at the developers' request, the tool reports coverage percentages for different units of QVTo transformations (e.g., mappings, helpers, constructors), as well as visualizing the expressions covered and not covered by the tests. During the seven-week study period, the tool was used 98 times, resulting in the execution of 16,714 unit tests. In addition to assisting with debugging issues, the tool was also used by one developer to prepare a user story, a description of how to complete a specific task of an agile sprint. When preparing the user story, the developer ran the entire test suite and looked at the coverage for the modules he knew would be affected by the task to be completed. He then noted exactly which places in the modules were not covered by the test suite by inspecting the coverage overlay. Then, as part of the list of steps for the feature, he added an additional step stating that before implementation of the feature can begin, additional tests must be written to cover the untested parts. The developers also stated that the tool easily identifies dead code. The coverage tool has also been added to the development team's "Way of working" document, making it an official part of their development process.
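The kind of per-unit reporting described above can be sketched in a few lines. This is an illustration only, not the Eclipse plugin itself; the unit names and expression counts are invented, but the calculation (covered expressions over total expressions, per unit) is the essence of such a report.

```python
# Hypothetical per-unit data: which expression ids the test suite executed,
# and which expression ids exist in each QVTo unit.
executed = {
    "mapping Block::toNode": {1, 2, 3, 5},
    "helper isLeaf": {1, 2},
    "constructor Node::Node": set(),
}
total = {
    "mapping Block::toNode": {1, 2, 3, 4, 5},
    "helper isLeaf": {1, 2},
    "constructor Node::Node": {1, 2, 3},
}

def coverage(unit):
    """Coverage percentage for one unit: executed expressions / all expressions."""
    return 100.0 * len(executed[unit] & total[unit]) / len(total[unit])

for unit in total:
    pct = coverage(unit)
    flag = "" if pct == 100.0 else "  <- contains untested expressions"
    print(f"{unit}: {pct:.0f}%{flag}")
```

A unit stuck at 0% with no callers is exactly the "dead code" signal the developers mention: it is never reached by any test or transformation run.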
As a side product of our research, three patches were submitted, accepted and integrated into the Eclipse QVTo core engine, which was released with Eclipse Luna on June 25, 2014. Together, these patches make it possible for QVTo interpreters to easily access the test coverage tool, as well as future tools such as an integrated profiler [3].
References:
[1] C.M. Gerpheide: "Assessing and Improving Quality in QVTo Model Transformations", M.Sc. thesis, Eindhoven University of Technology, 2014, http://phei.de/docs/academic/Gerpheide_QualityQVToTransformations.pdf
[2] C.M. Gerpheide, R.R.H. Schiffelers, A. Serebrenik: "A Bottom-Up Quality Model for QVTo", QUATIC 2014, Portugal, 2014
[3] C.M. Gerpheide et al.: "Add hooks for contributing 3rd-party visitor decorators", https://bugs.eclipse.org/bugs/show_bug.cgi?id=432969
Please contact:
Alexander Serebrenik
Eindhoven University of Technology, The Netherlands
Tel: +31 40 247 3595
E-mail: [email protected]
Figure 1: The coverage tool highlights the covered and not covered parts of the QVTo transformation, as well as reporting coverage percentages per unit.
Redundancy in the Software Design Process is Essential for Designing Correct Software

by Mark G.J. van den Brand and Jan Friso Groote

Researchers at Eindhoven University of Technology in the Netherlands plead the case for more redundancy in software development as a way of improving the quality of outcomes and reducing overall costs.

If an engineer were asked how reliable a critical artefact must be, a likely answer might be that a system can only fail once in 10^10 times. For example, the probability of a ship's hull collapsing during its lifetime, or of an active high water barrier failing to function when it is requested to do so, is typically in the order of 10^-10. These numbers are so low that no engineer will ever experience the failure of his own artefact.

Considering such a question led us to reflect on how it might be possible to obtain similar numbers when designing software. In our previous work, we addressed this question by constructing a simple failure model, and we found a simple answer [1]. The only way to obtain such figures is by employing redundancy. Such an approach is very common in the more classical engineering disciplines. For example, when considering support columns with a failure probability of 10^-5, an engineer can simply use two columns (where only one is truly necessary), thus reducing the overall failure probability to 10^-10.

To examine how this redundancy approach applies in the software development field, we must realise that software is very different from physical artefacts. In physical artefacts, components fail due to wear and tear, while software fails due to built-in errors, for example a small typing error, an incorrect algorithm or the wrong use of an interface. When such an error is activated, the software fails. As software has many varied states, it can take a long time for some errors to become active, although, as shown in [2], many programming faults lead to a multitude of erroneous states. These latter faults are therefore far easier to catch.

Figure 1: A graphical illustration of the decision model indicating when it is best to stop testing and move into the software delivery phase [2].

The probability of a hardware failure is almost negligible and can thus be ignored. Software errors are always directly caused by the programmers or program designers who left those errors in the code. As humans, they have a large probability of doing something wrong [3]. At best, the failure probability of humans is only 10^-3, and even this figure applies only in situations where the tasks are very simple and the programmer highly trained. For more complex tasks, failure probabilities of 10^-2 or 10^-1 are more realistic. In situations where a human must complete a non-trivial task under stress, they are almost certain to fail.

It should be obvious that the difference between the failure probability of a programmer and the desired failure probability of a critical piece of software is around eight orders of magnitude. Obvious measures, such as comprehensive training for programmers or the use of the most modern programming languages, are excellent, but alone they are unable to bridge this gap. Training can never accomplish an improvement of more than a factor of 10^2, and for a complex task such as programming even this is unlikely. Using modern programming languages, even domain-specific languages, in combination with libraries can lead to substantial reductions in the amount of required code and thus reduce the overall number of errors. However, here too, the possible reductions that can be achieved (at most a factor of 100) are insufficient.

Thus, the only way to achieve the desired failure probability of 10^-10 is to consciously employ redundancy in the software design process. Typically, when constructing software, it must be described in several ways. These differing descriptions should then be meticulously compared and challenged, with the goal of removing as many as possible of the flaws that will be inherent in each description.

Several forms of redundancy are already present in actual programming, such as type checking and testing. However, these forms of redundancy came about as good practices, not as conscious ways to introduce redundancy with a view to attaining a certain level of software quality.

Active redundancy can be brought into the software design process through the introduction of high-level models of the software, for instance in the form of domain-specific languages, property languages such as modal logics to independently state properties, independently (and perhaps multiply) constructed implementations, and a priori described test cases. The comparison of these different views can be done by model checking (software or models against properties), model-based testing (model against implementation), and systematic testing (tests against model or software). Code inspection and acceptance tests are also fruitful, but lack the rigour of comparison that the more mathematical methods have.

By acknowledging that redundancy in design is the only way to obtain reliable software, one can then question certain trends. For instance, there is an ongoing trend to eliminate the annoyance associated with static type checking. A language like Python is a typical
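The arithmetic behind the redundancy argument is simple enough to check directly. The sketch below uses the article's own figures and the standard assumption that the redundant components fail independently.

```python
import math

# Two independent support columns: the structure fails only if both fail,
# so the failure probabilities multiply.
p_single = 1e-5               # failure probability of one column
p_redundant = p_single ** 2   # both must fail (independence assumed)

# The gap the article describes: realistic human error rates for complex
# tasks versus the target reliability of a critical artefact.
p_programmer = 1e-2
p_target = 1e-10
gap_orders = math.log10(p_programmer / p_target)

print(p_redundant, gap_orders)
```

The computed gap of about eight orders of magnitude is precisely why a factor-100 improvement from training or better languages cannot close it on its own.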
Software Quality in an Increasingly Agile World

by Benoît Vanderose, Hajer Ayed and Naji Habra

For decades, software researchers have been chasing the goal of software quality through the development of rigorous and objective measurement frameworks and quality standards. However, in an increasingly agile world, this quest for a perfectly accurate and objective quantitative evaluation of software quality appears overrated and counter-productive in practice. In this context, we are investigating two complementary approaches to evaluating software quality, both of which are designed to take agility into account.

Most software engineering processes and tools claim to assess and improve the quality of software in some way. However, depending on its focus, each one characterizes quality and interprets evaluation metrics differently. These differences have led software researchers to question how quality is perceived across various domains and to conclude that it is an elusive and multifaceted concept. However, two key perspectives stand out: a software product's objective quality and its subjective quality.

The objective, "rationalized" perspective is taught in influential quality models and promoted through standards such as ISO/IEC 25010. It envisions quality as conformance to a pre-defined set of characteristics: quality is an intrinsic property of the product that can be measured and must be compared against a standard in order to determine its relative quality level. From this perspective, therefore, quality assurance is closely associated with quality control.

The subjective perspective, on the other hand, defines software quality as a constantly moving target based on customers' actual experiences with the product. Therefore, quality has to be defined dynamically in collaboration with customers, as opposed to through pre-defined standards. This definition welcomes changes that enhance the quality of the customer's experience, emphasizes the possibility of tolerating deliberately bad quality (in order to do better next time) and allows quality goals to be redefined. As such, quality is constructed and checked iteratively and can evolve over time, leading to constructive or emergent quality. The subjective perspective also promotes the idea of ongoing customer satisfaction and garners everyone's commitment to achieving quality: quality assurance thus becomes an organization-wide effort, or what is called holistic quality.
The suitability of either perspective depends on the software development process. In a production context, quality is defined as conformance to set requirements, while quality in a service context should take into account the fact that each stakeholder will have a different definition of what constitutes a quality experience, and furthermore, that these perceptions will evolve over time. In the field of software engineering, there has been a move from the compliance view towards a constructive, holistic quality assurance view. This is particularly notable in the case of iterative and incremental software engineering methods and agile methods.

Improving the support for this way of envisioning software quality is one of the research topics addressed by the PReCISE research center at the University of Namur, and current efforts focus on two complementary research areas: model-driven quality assessment and iterative context-driven process evolution.
MoCQA and AM-QuICk frameworks
Our attempts to capture the essence of "traditional" quality assessment (i.e., metrics, quality models and standards) in a unified meta-model resulted in a fully-fledged model-driven quality assessment framework named MoCQA [1]. This framework aims to provide the methodology, tools and guidelines to integrate evaluation methods from different sources and associate them with a quality goal, a set of stakeholders and an artefact (e.g., a piece of code, a UML diagram, etc.), allowing these elements to coexist in a coherent way. Being model-driven, the framework provides a unified view of the quality concerns of specific stakeholders and iteratively guides the actual assessment. However, in order to leverage the benefits of the framework, it is essential to perform the quality assessment iteratively and incrementally (the feedback from the assessment helps improve the products) and to ensure that this feedback is taken into account to pilot the next steps of the development process.
Guaranteeing the positive impact of an assessment on the development process calls for iterative process evolution. Our research in the field of agile methods customization [2] revealed that such customization does not account for the fact that the context itself may evolve over time. Another framework, AM-QuICk [3], is designed to allow a truly context-driven evolution of the development process, with a review at each iteration ensuring that it can be adapted to the current context. It relies on the elaboration of a repository of reusable agile artefacts, practices and metrics, and a context-sensitive composition system.
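The idea of a repository of reusable practices plus a context-sensitive composition system can be sketched as follows. This is our own toy illustration, not AM-QuICk: the practice names, context attributes and selection rule are all invented.

```python
# Hypothetical repository: each practice carries the context preconditions
# under which it is applicable (empty dict = always applicable).
repository = {
    "pair programming":           {"team_size": "small"},
    "formal code review":         {"criticality": "high"},
    "on-site customer":           {"customer_available": True},
    "automated acceptance tests": {},
}

def compose(context):
    """Return the practices whose preconditions hold in the current context."""
    return sorted(practice for practice, pre in repository.items()
                  if all(context.get(k) == v for k, v in pre.items()))

# Re-evaluated at every iteration, so the composed method follows the
# evolving context rather than a one-off customization.
iteration_1 = {"team_size": "small", "criticality": "low",
               "customer_available": False}
print(compose(iteration_1))
```

Because `compose` is re-run each iteration with a fresh context, a change such as the customer becoming available automatically pulls the corresponding practice into the process.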
In order to exploit the benefits of an iterative process evolution, decision-making elements are needed to guide the evolution and decide which practices to include at the right time. This can be achieved through model-driven quality assessment, making the two approaches complementary (Figure 1).
Future Work
Looking to the future, our efforts will focus on tightening the integration between the two frameworks. Advancements in these complementary research areas offer great opportunities to provide development teams with comprehensive sets of tools with which to manage an evolving software development process that focuses on the global satisfaction of each stakeholder at each stage. Such tools will ensure that these processes can operate more effectively in an increasingly agile world.
References:
[1] B. Vanderose: "Supporting a model-driven and iterative quality assessment methodology: The MoCQA framework", Ph.D. dissertation, University of Namur, Belgium, 2012
[2] H. Ayed et al.: "A metamodel-based approach for customizing and assessing agile methods", in proc. QUATIC 2012
[3] H. Ayed et al.: "AM-QuICk: a measurement-based framework for agile methods customization", in proc. IWSM/MENSURA 2013
Figure 1: An agile and quality-oriented development process based on the complementarity
between the model-driven quality assessment (MoCQA) and the iterative context-driven process
evolution (AM-QuICk).
Improving Small-to-Medium Sized Enterprise Maturity in Software Development through the Use of ISO 29110

by Jean-Christophe Deprez, Christophe Ponsard and Dimitri Durieux

Technological small to medium sized enterprises (SMEs) that are involved in developing software often lack maturity in the development process. This can result in an increased risk of software quality issues and technological debt. Recently, standards have started to address the need for lightweight process improvements through the development of the ISO 29110 series. In this article, we look at how this series can successfully be applied in the areas of (self-)assessment, diagnosis and specific tooling support.

SMEs and the small IT sections of larger companies often produce valuable software products. Due to limited resources, coping with quality issues in these very small entities (VSEs) is not an easy task. In these contexts, using process reference models such as CMMI or ISO 12207 is clearly excessive. A lightweight standard is therefore needed, and will particularly assist VSEs that need recognition as suppliers of high-quality systems. Such a standard would also provide a practical process improvement tool.

Through the CE-IQS project, CETIC has been promoting the use and development of such lightweight methods for years. The group has shared its experiences with working group 24 (WG24) of sub-committee 7 (SC7) of the Joint Technical Committee 1 (JTC1) of the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC). This work has brought together contributions from numerous parties, including Canada (ETS), Ireland (LERO), Japan, Thailand and Brazil, and has resulted in the new ISO/IEC 29110 series of standards. These new standards address the "Systems and Software Life Cycle Profiles and Guidelines for Very Small Entities" [1] and have been made freely available by the ISO [2]. Of particular relevance is Part 5, which contains dedicated guidelines for VSEs, complemented by deployment packages to assist with their adoption.

The ISO/IEC 29110 series is structured to follow a progressive approach based on a set of profiles that range from an entry profile to the most advanced profile (see Figure 1). The entry profile consists of the simplest set of development practices, covering software implementation and project management activities, and is particularly suited to start-ups. This is followed by the basic, intermediate and advanced profiles, which progressively cover a growing set of activities that can handle more complex situations involving a larger range of risks. The intermediate profile, although progressing well, is yet to be published, and the advanced profile is still being discussed by the WG24.

Figure 1: ISO 29110 generic profile group and related process support

In addition to contributing to this work, CETIC is also actively applying the standard as a practical tool for increasing the maturity of VSEs developing software. This work is not being done in a certification context, as there is little incentive for this measure in Belgium, as opposed to Thailand and Brazil, where VSEs are highly involved in off-shoring (and hence want to obtain such a certification to advertise the quality of their work). Instead, our actions take the following forms:

• Self-assessment questionnaire: through a dedicated web site, companies can answer a set of questions to see how their current development approach compares against the entry profile activities. The questionnaire is currently available in French, English and Czech, and feedback and/or translations into other languages are welcomed.

• Light assessment: a two-hour assessment based on a two-page check-list derived from the entry profile. This assessment can be conducted as part of a code quality assessment in a VSE context (it has become mandatory for joining some start-up incubators in Wallonia).

• Full assessment: one to two days of interviews with key personnel, including the project manager, architect and possibly some developers. The assessment should be conducted in two phases so that weak
points can be both identified and further investigated, allowing a full set of recommendations to be produced. Additional coaching can also be provided.

• Process support service: an online service called "Easoflow" is currently being developed, which will aim to help companies standardize their project management and software development processes by providing a simple and guided streamlining of the ISO/IEC 29110 entry and basic profile activities.
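The self-assessment idea above amounts to comparing a company's answers against the activities of a profile. The sketch below is our own illustration (the activity names are invented; the real questionnaire covers the entry profile's software implementation and project management activities):

```python
# Hypothetical entry-profile checklist for a self-assessment.
entry_profile = [
    "a project plan exists",
    "requirements are recorded",
    "versions are controlled",
    "software is verified before delivery",
]

# Answers a VSE might give through the questionnaire web site.
answers = {
    "a project plan exists": True,
    "requirements are recorded": True,
    "versions are controlled": False,
    "software is verified before delivery": False,
}

covered = [a for a in entry_profile if answers.get(a)]
gaps = [a for a in entry_profile if not answers.get(a)]
print(f"entry-profile coverage: {len(covered)}/{len(entry_profile)}")
print("gaps:", gaps)
```

The list of gaps is what a light or full assessment would then investigate further and turn into concrete recommendations.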
The results of all these activities are reported back to the ISO. Our ongoing R&D work will focus on extending the standard with specific profiles for areas such as systems engineering, the development of Easoflow, and the associated lightweight risk assessment tools.
References:
[1] C. Laporte et al.: "A Software Engineering Lifecycle Standard for Very Small Enterprises", in R.V. O'Connor et al. (Eds.): EuroSPI 2008
[2] ISO/IEC 29110:2011, Software Engineering - Lifecycle Profiles for Very Small Entities (VSEs), http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
Software Product Quality Evaluation using ISO/IEC 25000

by Moisés Rodríguez and Mario Piattini

In recent years, software quality has begun to gain great importance, principally because of the key role software plays in our day-to-day lives. To control software quality, it is necessary to conduct evaluations of the software products themselves. The AQC Lab was established for this purpose and its core responsibility is evaluating the quality of software products against the ISO/IEC 25000 standard.

Although numerous certifications for software quality exist (e.g., ISO/IEC 15504, CMMI, etc.), there is little evidence to suggest that compliance with any of these standards guarantees good software products. Critics have gone so far as to suggest that the only thing these standards guarantee is uniformity of output, and thus that they may actually lead to the production of bad products. Consequently, the idea that software evaluations should be based on direct evidence of product attributes is becoming more widespread, and a growing number of organizations are becoming concerned about the quality of the products that they develop and/or acquire, as well as of the processes.

The ISO/IEC 25000 family of standards, known as SQuaRE (Software Product Quality Requirements and Evaluation), appears to meet this emerging need to assess product-related quality. The objective of ISO/IEC 25000 is to create a common framework within which to evaluate software product quality, and this standard is now beginning to replace the previous ISO/IEC 9126 and ISO/IEC 14598 standards to become the cornerstone of this area of software engineering. ISO/IEC 25000 is divided into several parts: we highlight ISO/IEC 25040 [1], which defines the process of evaluating software product quality, and ISO/IEC 25010 [2], which determines the software product characteristics and sub-characteristics that can be evaluated.

As with other standards, ISO/IEC 25000 describes what to evaluate but does not specify how. In other words, it does not detail the thresholds for the evaluation metrics to be used, nor does it describe how to group these metrics in order to assign a quality value to a software product.

Improving this evaluation process has been the goal of the AQC team (begun by the Alarcos Research Group, University of Castilla-La Mancha) over the last few years. This has led to the creation of AQC Lab [3], the first laboratory accredited under ISO/IEC 17025 by ENAC (National Accreditation Entity) to assess the quality of software products using ISO/IEC 25000. The lab has also been recognised by the ILAC (International Laboratory Accreditation Cooperation). This accreditation confirms the technical competence of the laboratory and ensures the reliability of the evaluation results. The AQC Lab uses three main elements to conduct the quality evaluations. These are:

• The Assessment Process, which directly adopts the activities of ISO/IEC 25040 and completes them with specific laboratory roles and instructions which have been developed internally. This process produces an evaluation report that shows the level of quality achieved by the product and any software aspects that may require improvement.

• The Quality Model, which defines the characteristics and metrics needed to evaluate the software product. This model was developed through the MEDUSAS (Improvement and Evaluation of Usability, Security and Maintainability of Software, 2009-2012) research project, funded by MICINN/FEDER. Although the AQC Lab evaluates multiple features of the model presented in ISO/IEC 25010, the accreditation initially focuses on the characteristic of maintainability, principally because maintenance is one of the most expensive phases of the development lifecycle and maintainability is one of the features most frequently requested by software clients: many clients seek products that they can maintain themselves or have maintained by a third party. This model required considerable effort to develop, eventually resulting in a set of quality properties that are measurable from the source code and related to the sub-characteristics of quality proposed in ISO/IEC 25010 (Table 1).
• The Evaluation Environment, which largely automates the evaluation tasks. This environment uses measurement tools that are applied to the software product to combine the values obtained, assign quality levels to the model's sub-characteristics and characteristics and, eventually, present them in an easily accessible manner.
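The way such an environment might combine measured values into quality levels can be sketched as follows. This is purely illustrative: the property names, normalized values, thresholds and aggregation rule are invented, since ISO/IEC 25000 deliberately leaves them to the evaluator.

```python
def to_level(value, thresholds):
    """Map a normalized metric value in [0, 1] to a quality level 1-5."""
    level = 1
    for t in thresholds:      # thresholds sorted ascending
        if value >= t:
            level += 1
    return level

# Hypothetical measurements for source-code properties feeding one
# maintainability sub-characteristic.
measurements = {"code duplication": 0.82,
                "cyclomatic complexity": 0.65,
                "comment density": 0.90}
thresholds = [0.25, 0.5, 0.75, 0.9]

levels = {m: to_level(v, thresholds) for m, v in measurements.items()}
# One possible aggregation rule: the weakest property dominates.
sub_characteristic_level = min(levels.values())
print(levels, sub_characteristic_level)
```

A "weakest link" aggregation is conservative; a lab could equally use a weighted average, which is exactly the kind of choice the standard leaves open and an accredited evaluation process must pin down.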
In addition, the Spanish Association for Standardization and Certification (AENOR) has created a software product quality certification based on ISO/IEC 25000. To perform a certification, AENOR reviews the assessment report issued by the accredited laboratory and makes a brief audit of the company and the product. If everything is correct, the company is then given a certificate of quality for its software product.
During the past year, several pilot projects have been run using this certification scheme. Participating companies (from Spain and Italy) have been the first to be assessed, and future projects are planned for companies in Colombia, Mexico and Ecuador.
References:
[1] ISO/IEC 25040, Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - Evaluation process, 2011, Geneva, Switzerland
[2] ISO/IEC 25010, Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality models, 2011, Geneva, Switzerland
[3] J. Verdugo, M. Rodríguez, M. Piattini: "Using Agile Methods to Implement a Laboratory for Software Product Quality Evaluation", in proc. 15th International Conference on Agile Software Development (XP 2014), Rome, Italy, 2014
Please contact:
Moisés Rodríguez
Alarcos Quality Center, Spain
E-mail: [email protected]

Mario Piattini
University of Castilla-La Mancha, Spain
E-mail: [email protected]

Table 1: Relationship between the sub-characteristics and quality properties in the quality model (completed in Step 2 of the evaluation process).

by Farhad Arbab, Sung-Shik Jongmans and Frank de Boer

CWI's Reo-to-C compiler can generate code that outperforms hand-crafted code written by a competent programmer. As such, compared to conventional approaches to parallel programming, our approach has the potential to improve not only software quality but also performance.

In 2012, we described in ERCIM News a novel approach to programming interaction protocols among threads on multi-core platforms. This approach is based on the idea of using the graphical coordination language Reo, which has been subject to ongoing development by the Formal Methods group at CWI since the early 2000s, as a domain-specific language for the compositional specification of protocols. Since then, we have developed several code generation techniques and tools for Reo, including the Reo-to-Java and Reo-to-C compilers.

In terms of software engineering and software quality, there are many advantages to using Reo. Reo [1] provides declarative high-level constructs for expressing interactions, so programmers can specify their protocols at a more suitable level of abstraction than can be achieved with conventional languages (e.g., Java or C). These conventional languages provide only error-prone low-level synchronization primitives (e.g., locks, semaphores). Moreover, because Reo has a formal semantics, protocols expressed in Reo can be formally analyzed to improve software quality and ensure correctness (e.g., by using model checking tools). Lastly, by using Reo, protocols become tangible, explicit software artifacts, which promotes their reuse, composition and maintenance.
Two years ago, we thought that this list of software engineering advantages would be the main reason for programmers to adopt our approach, so long as the performance of the code generated by our compilers proved "acceptable". At the time, it seemed ambitious to strive for a level of performance within an order of magnitude of what competent programmers can achieve by hand-crafting code in a conventional language. After all, compiling high-level specifications into efficient lower-level implementations constitutes a significant challenge.
However, while developing our compilers, we came to realize that Reo's declarative constructs actually give us an edge over conventional languages. Reo allows for novel compiler optimization techniques that conventional languages fundamentally cannot apply. The reason for this is that Reo's declarative constructs preserve more of a programmer's intentions when they specify their protocols. When a sufficiently smart compiler knows exactly what a protocol is supposed to achieve, it can subsequently choose the best lower-level implementation. Using conventional languages, in which programmers write such lower-level implementations by hand, information about their intentions is either irretrievably lost or very hard to extract. Therefore, to perform optimizations at the protocol logic level, a compiler needs to reconstruct those intentions, but typically it simply cannot. Thus, when conventional languages are used to implement protocols by hand, the burden of writing efficient protocol implementations rests exclusively on the shoulders of programmers, adding even more complexity to the already difficult task of writing parallel programs. As the number of cores per chip increases, the shortcomings of conventional programming languages in writing efficient protocol implementations will amplify this issue, effectively making such languages unsuitable for programming large-scale multi-core machines.
The following example, a simple producers-consumer program, offers first evidence that our approach can result in better performance. In this program, each of n producers produces and sends an infinite number of data elements to the consumer, while the consumer receives and consumes an infinite number of data elements from the producers. The protocol between the producers and the consumer states that the producers send their data elements asynchronously, reliably and in rounds. In every round, each producer sends one data element, in an arbitrary order. A Reo specification realizes this protocol (for three producers; see Figure 1). We also had a competent programmer hand-craft a semantically equivalent program in C with Pthreads.
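Operationally, the round structure of this protocol is easy to state. The following Python sketch is our own illustration only, not the article's C/Pthreads program nor code generated from the Reo specification in Figure 1; it mimics the rounds with a shared queue and a barrier:

```python
import queue
import threading

def run_rounds(n_producers=3, n_rounds=4):
    """Each round, every producer sends exactly one element (in arbitrary
    order); the consumer takes all of them before the next round starts."""
    buf = queue.Queue()
    barrier = threading.Barrier(n_producers + 1)  # producers + consumer
    consumed = []

    def producer(pid):
        for _ in range(n_rounds):
            buf.put(pid)        # asynchronous, reliable send
            barrier.wait()      # round boundary

    threads = [threading.Thread(target=producer, args=(p,))
               for p in range(n_producers)]
    for t in threads:
        t.start()
    for _ in range(n_rounds):   # the consumer role runs in this thread
        consumed.append(sorted(buf.get() for _ in range(n_producers)))
        barrier.wait()
    for t in threads:
        t.join()
    return consumed
```

Note how the scheduling (queue plus barrier) is hard-coded here; in Reo the same coordination is stated declaratively, which is precisely what leaves the compiler room to optimize.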
We compared the scalability of the code generated by our current Reo-to-C compiler with the hand-crafted implementation (Figure 2). The generated C code runs on top of a novel runtime system for parallel programs. In this example, for this protocol, the Reo-based implementation outperformed the carefully hand-crafted code.
The technology behind our compiler is based on Reo’s automaton semantics. Our most recent publication contains references to the relevant material [3]. All the optimization techniques used by the compiler have a strong mathematical foundation, and we have formally proved their correctness (which guarantees correctness by construction).
We do not claim that code generated by our compiler will outperform every hand-crafted implementation of every protocol; in fact, we know that it does not. However, we do believe that these first results are very encouraging, and we see several ways to further optimize our compiler in the future.
References:
[1] F. Arbab: “Puff, The Magic Protocol”, in “Formal Modeling: Actors, Open Systems, Biological Systems (Talcott Festschrift)”, LNCS 7000, pp. 169-206, 2011
[2] S.-S. Jongmans, S. Halle, F. Arbab: “Automata-based Optimization of Interaction Protocols for Scalable Multicore Platforms”, in “Coordination Models and Languages (Proceedings of COORDINATION 2014)”, LNCS 8459, pp. 65-82, 2014
[3] S.-S. Jongmans, F. Arbab: “Toward Sequentializing Overparallelized Protocol Code”, in proc. of ICE 2014, EPTCS, 2014
Please contact:
Farhad Arbab, Frank de Boer, Sung-Shik Jongmans
CWI, The Netherlands
Tel: +31 20 592 [4056, 4139, 4241]
E-mail: [farhad, frb, jongmans]@cwi.nl
Figure 1: Producers-consumer protocol specification in Reo.
Figure 2: The results of a scalability comparison between the code generated by the Reo-to-C compiler (dashed line) and the hand-crafted implementation (continuous line).
ERCIM NEWS 99 October 2014
European Research and Innovation
InterpreterGlove - An Assistive Tool that can Speak for the Deaf and Deaf-Mute
by Péter Mátételki and László Kovács
InterpreterGlove is an assistive tool for hearing- and speech-impaired people that enables them to easily communicate with the non-disabled community through the use of sign language and hand gestures. Put on the glove, put the mobile phone in your pocket, use sign language and let InterpreterGlove speak for you!
Many hearing- and speech-impaired people use sign, instead of spoken, languages to communicate. Commonly, the only sectors of the community fluent in these languages are the affected individuals, their immediate friend and family groups, and selected professionals.
We have created an assistive tool, InterpreterGlove, that can reduce communication barriers for the hearing- and speech-impaired community. The tool consists of a hardware-software ecosystem that features a wearable motion-capturing glove and a software solution for hand gesture recognition and text- and language-processing. Using these elements, it can operate as a simultaneous interpreter, reading the signed text of the glove wearer aloud (Figure 1).
Prior to using the glove, it needs to be configured and adapted to the user’s hand. A series of hand poses are recorded, including bent, extended, crossed, closed and spread fingers, wrist positions and absolute hand orientation. This allows the glove to generate the correct gesture descriptor for any hand state. This personalization process not only enables the glove to translate every hand position into a digital handprint but also ensures that similar hand gestures will result in similar gesture descriptors across all users. InterpreterGlove is then ready for use. A built-in gesture alphabet, based on the international Dactyl sign language (also referred to as fingerspelling), is provided. This alphabet includes 26 one-handed signs, each representing a letter of the English alphabet. Users can further customize this feature by fine-tuning the pre-defined gestures. Thus, the glove is able to recognize and read aloud any fingerspelled word.
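The article does not disclose the internal format of the descriptors, but the matching of a live hand pose against a personalized alphabet can be pictured as nearest-neighbour search over joint-sensor vectors. The sketch below is purely illustrative; the descriptor layout, reference values and distance threshold are all invented:

```python
import math

# Hypothetical: a descriptor is a tuple of normalized joint readings;
# the calibrated alphabet maps letters to the user's reference poses.
ALPHABET = {
    "a": (0.9, 0.9, 0.9, 0.9, 0.1),   # fingers bent, thumb out
    "b": (0.1, 0.1, 0.1, 0.1, 0.9),   # fingers extended, thumb folded
    "c": (0.5, 0.5, 0.5, 0.5, 0.5),   # all half-bent
}

def classify(descriptor, alphabet=ALPHABET, max_dist=0.6):
    """Return the closest alphabet letter, or None if nothing is near."""
    best, best_d = None, float("inf")
    for letter, ref in alphabet.items():
        d = math.dist(descriptor, ref)
        if d < best_d:
            best, best_d = letter, d
    return best if best_d <= max_dist else None
```

Rejecting poses that are far from every reference (the `max_dist` cut-off) matters in practice, because transitional hand states between two letters should produce no output at all.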
Feedback suggested that fingerspelling long sentences during a conversation could become cumbersome. However, the customization capabilities allow the user to eliminate this inconvenience. Users can define their own gesture alphabet by adding new, customised gestures, and they also have the option of assigning words, expressions and even full sentences to a single hand gesture. For example, “Good morning”, “Thank you” or “I am deaf, please speak slowly” can be delivered with a single gesture.

Figure 1: The key components of InterpreterGlove are motion-capturing gloves and a mobile-supported software solution that recognises hand gestures and completes text- and language-processing tasks.
The main building blocks of the InterpreterGlove ecosystem are the glove, the mobile application and a backend server that supports the value-added services (Figure 2). The glove’s prototype is made of a breathable elastic material, and the electronic parts and wiring are ergonomically designed to ensure seamless daily use. We used data from twelve 9-DoF (Degrees of Freedom) integrated motion-tracking sensors to calculate the absolute position of the hand and to determine the joints’ deflections in 3D. The glove creates a digital copy of the hand, which is denoted by our custom gesture descriptor. The glove connects to the user’s cell phone and transmits these gesture descriptors via a Bluetooth serial interface to the mobile application, to be processed by the high-level signal- and natural-language processing algorithms.
Based on the biomechanical characteristics and kinematics of the human hand [1], we defined the semantics of Hagdil, the gesture descriptor. We use this descriptor to communicate the users’ gestures to the mobile device (Figure 3). Every second, 30 Hagdil descriptors are generated by the glove and transmitted to the mobile application.
Two types of algorithm are applied to the generated Hagdil descriptor stream to transform it into understandable text (Figure 4). First, raw text is generated as a result of segmentation, by finding the best gesture descriptor candidates. To achieve this, a signal processing algorithm is required. Based on our evaluation, the sliding-window and dynamic-kinematics based solutions produced the best results, and consequently these have been used in our prototype. This raw text may contain spelling errors caused by the user’s inaccuracy and natural signing variations. To address this issue, a second algorithm, which performs an auto-correction function, processes this raw text and transforms it into understandable words and sentences. This algorithm is based on a customised natural language processing solution that incorporates 1- and 2-gram database searches and confusion-matrix based weighting. The corrected output can then be read aloud by the speech synthesizer. Both algorithms are integrated into the mobile-based software application.
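The auto-correction idea can be sketched as a noisy-channel model: score each candidate word by its frequency times a per-letter confusion penalty. The miniature below is our own unigram-only illustration (the real system also uses 2-grams), and the vocabulary, frequencies and confusion pairs are invented:

```python
# Invented miniature vocabulary with unigram frequencies.
UNIGRAMS = {"good": 0.4, "morning": 0.3, "mooring": 0.01, "gold": 0.1}
# Invented confusion weights for sign pairs that look alike on the glove.
CONFUSION = {("r", "n"): 0.2, ("u", "v"): 0.2}

def letter_prob(seen, intended):
    """Probability of reading `seen` when the user signed `intended`."""
    if seen == intended:
        return 0.9
    pair = (seen, intended)
    return CONFUSION.get(pair, CONFUSION.get(pair[::-1], 0.01))

def correct(raw, vocab=UNIGRAMS):
    """Pick the vocabulary word maximizing frequency x channel probability."""
    def score(word):
        if len(word) != len(raw):
            return 0.0
        p = vocab[word]
        for s, i in zip(raw, word):
            p *= letter_prob(s, i)
        return p
    return max(vocab, key=score)
```

For instance, a raw segmentation output like "morring" scores highest against "morning", because the r/n pair is a plausible confusion, whereas the rarer word "mooring" requires a less likely substitution.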
Our backend server operates as the central node for value-added services. It offers a community portal that facilitates gesture sharing between users (Figure 5). It also supports higher-level language processing capabilities than are available from the offline text processing built into the mobile application.

Figure 2: The building blocks of the InterpreterGlove ecosystem.

Figure 3: Hand gestures are transmitted to the mobile application using Hagdil gesture descriptors.

Figure 4: Two algorithms are applied to complete the segmentation and natural language processing components necessary for transforming hand gestures into understandable text. These are both integrated into the mobile application.
Throughout the whole project, we worked closely with the deaf and blind community to ensure we accurately captured their needs, and we used their expertise to ensure the prototypes of InterpreterGlove were properly evaluated. Their feedback deeply influenced our work and achievements. We hope that this device will improve and expand communication opportunities for hearing- and speech-impaired people and thus enhance their social integration into the wider community. It may also boost their employment possibilities.
Although the tool was originally targeted at the needs of hearing-impaired people, we have realised that it also has considerable potential for many others, for example, those with a speech impairment or physical disability, or those being rehabilitated following a stroke. We plan to address additional, targeted needs in future projects. Presently, two major areas have been identified for improvement that may have huge impacts on the application-level possibilities of this complex system. Integrating the capability to detect dynamics, i.e., to perceive the direction and speed of finger and hand movements, opens up new interaction possibilities. Expanding the coverage of motion capture, by including additional body parts, opens the door to more complex application scenarios.
The InterpreterGlove (“Jelnyelvi tolmácskesztyű fejlesztése”, KMR_12-1-2012-0024) project was a collaborative effort between MTA SZTAKI and Euronet Magyarország Informatikai Zrt. that ran between 2012 and 2014. The project was supported by the Hungarian Government, managed by the National Development Agency and financed by the Research and Technology Innovation Fund.
Reference:
[1] C. L. Taylor, R. J. Schwarz: “The Anatomy and Mechanics of the Human Hand”, Artificial Limbs, 2(2):22-35, 1955
Please contact:
Péter Mátételki or László Kovács, SZTAKI, Hungary
E-mail: {peter.matetelki, laszlo.kovacs}@sztaki.mta.hu
Figure 5: The InterpreterGlove portal allows users to share their customised gestures.
The OFSE-Grid: A Highly Available and Fault-Tolerant Communication Infrastructure based on OpenFlow
by Thomas Pfeiffenberger, Jia Lei Du and Pedro Bittencourt Arruda
The project OpenFlow Secure Grid (OFSE-Grid) evaluates the use of a software-defined networking (SDN) infrastructure in the domain of energy communication networks.
Worldwide, electrical grids are developing into smart grids. To ensure reliability, robustness and optimized resource usage, these grids will need to rely heavily on modern information and communication technologies. To support the achievement of these goals in communication networks, we evaluated the possibility of using a software-defined networking (SDN) infrastructure based on OpenFlow to provide a dependable communication layer for critical infrastructures.
SDN proposes a physical separation of the control and data planes in a computer network (Figure 1). In this scenario, only the controller is able to configure forwarding rules in the data plane of the switches. This has the advantage of giving the system a comprehensive and complete overview of itself. With this multifaceted knowledge about the status of the network, it is easier to implement new applications in the network by writing an application that configures it properly.
The implementation of a robust and fault-tolerant multicast forwarding scheme based on an OpenFlow network architecture is one of the main goals of the OFSE-Grid project. To solve this issue we use a two-layered approach: one must first know how to forward packets correctly in the topology, and then decide what to do when a fault occurs. Different approaches have been published regarding how best to calculate the multicast tree for a network and make further improvements [1]. Fault-tolerance can be achieved either reactively or proactively. In a reactive fault-tolerant scheme, the SDN controller is responsible for recalculating the configuration rules when a failure happens. In a proactive fault-tolerant scheme, the controller pre-emptively installs all rules necessary for managing a fault.
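The proactive variant can be pictured as the controller pushing, alongside each primary rule, a pre-computed backup that the switch activates locally when it detects a link-down event. The sketch below is our own toy model: the topology, rule format and names are invented, and a real OpenFlow deployment would express the same idea with fast-failover group tables rather than Python dictionaries:

```python
# Invented one-flow topology: primary next hop per switch, plus a
# pre-computed backup next hop for each protected link.
PRIMARY = {"s1": "s2", "s2": "t1", "s3": "t1"}
BACKUP = {("s2", "t1"): "s3"}   # if link s2->t1 fails, detour via s3

def rules_for(switch):
    """All rules the controller installs up front on one switch."""
    rules = [("default", PRIMARY[switch])]
    for (sw, down), alt in BACKUP.items():
        if sw == switch:
            rules.append((f"link_down:{down}", alt))
    return rules

def next_hop(switch, failed_links=()):
    """What the switch decides on its own, without asking the controller."""
    for match, action in reversed(rules_for(switch)):  # backups first
        if match == "default":
            if (switch, PRIMARY[switch]) not in failed_links:
                return action
        elif (switch, match.split(":", 1)[1]) in failed_links:
            return action
    return None
```

Because every rule is already present in the switch, failover happens at data-plane speed; the controller is only needed afterwards, to restore the one-fault-tolerance property for the new topology.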
In terms of robustness and the rational use of switch resources, a hybrid approach to fault-tolerance is best. Therefore, we propose making the network proactively tolerant to one fault (as in our current solution), so that there is very little packet loss on disconnection. However, we also propose that further research be undertaken to develop a network that is capable of reconfiguring itself to the new topology after the failure. Using this technique, the network is not only tolerant to a fault, but is also able to maintain fault-tolerance after a fault. This is similar to the approach in [2], but here we take advantage of local fault recovery, which reduces the failover times and thus packet loss during failovers. Of course, the algorithm controlling the network must run fast enough to avoid a second failure happening before the network is reconfigured. If a situation in which two failures can occur almost simultaneously is expected, it would be advisable to make the network two-fault-tolerant. This can be achieved with minor modifications of our software but comes at a greater cost in terms of hardware resources, both in the controller and the involved network devices.

To verify our approach, we chose a topology that could approximate a critical network infrastructure such as a substation (Figure 2). The topology consists of multiple rings connected to a backbone ring. It is a fault-tolerant multicast scenario, and the configured forwarding rules are shown. The reconfigured multicast scenario after a link failure is shown in Figure 3. This new multicast tree is not simply a workaround to get to t1, but actually a whole new multicast tree.

As part of the OFSE-Grid project, we also confirmed that, in general, it will be possible to use commercial off-the-shelf SDN/OpenFlow hardware to provide a robust communication network for critical infrastructures in the future [3]. Looking forward, one of our next steps will be to consider latency and bandwidth requirements in the routing decisions, as this may be a major precondition for critical infrastructure.

Figure 1: A software defined networking architecture.

Figure 2: A 2-approximation calculation of the optimal Steiner tree for the multi-cast group of the topology.

Figure 3: Network behaviour when the link t2 - t1 fails. When this happens, the switch forwards the packet to a different tree (bold edges), which can be used to forward packets to the destinations without using the faulty link.

This work was part of the OpenFlow Secure Grid (OFSE-Grid) project funded by the Austrian Federal Ministry for Transport, Innovation and Technology (BMVIT) within the “IKT der Zukunft” program.

References:
[1] H. Takahashi, A. Matsuyama: “An approximate solution for the Steiner problem in graphs”, Math. Japonica, vol. 24, no. 6, 1980
[2] D. Kotani, K. Suzuki, H. Shimonishi: “A Design and Implementation of OpenFlow Controller Handling IP Multicast with Fast Tree Switching”, IEEE/IPSJ 12th International Symposium on Applications and the Internet (SAINT), 2012
[3] T. Pfeiffenberger, J. L. Du: “Evaluation of Software-Defined Networking for Power Systems”, IEEE International Conference on Intelligent Energy and Power Systems (IEPS), 2014

Please contact:
Thomas Pfeiffenberger
Salzburg Research Forschungsgesellschaft mbH
E-mail: [email protected]
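The 2-approximation mentioned in Figure 2 refers to the heuristic of reference [1] (Takahashi-Matsuyama): grow the tree by repeatedly attaching the remaining multicast terminal closest to it. The sketch below is our own illustration, assuming unit link costs (hence plain BFS) and a connected topology:

```python
from collections import deque

def path_to_tree(adj, start, tree):
    """BFS from a terminal to the nearest node already in the tree."""
    prev = {start: None}
    q = deque([start])
    while q:
        u = q.popleft()
        if u in tree:
            path = [u]
            while prev[path[-1]] is not None:
                path.append(prev[path[-1]])
            return path                      # tree node first, terminal last
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    raise ValueError("topology not connected")

def steiner_edges(adj, terminals):
    """Takahashi-Matsuyama heuristic (2-approximate Steiner tree)."""
    tree = {terminals[0]}
    edges = set()
    remaining = list(terminals[1:])
    while remaining:
        # attach the remaining terminal whose path to the tree is shortest
        best = min((path_to_tree(adj, t, tree) for t in remaining), key=len)
        edges.update(frozenset(e) for e in zip(best, best[1:]))
        tree.update(best)
        remaining.remove(best[-1])
    return edges
```

The resulting edge set is the multicast tree from which the controller derives the forwarding rules; weighted links would require replacing the BFS with Dijkstra's algorithm.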
Learning from Neuroscience to Improve Internet Security
by Claude Castelluccia, Markus Duermuth and Fatma Imamoglu
This project, which is a collaboration between Inria, Ruhr-University Bochum, and UC Berkeley, operates at the boundaries of neuroscience and Internet security with the goal of improving the security and usability of user authentication on the Internet.
Most existing security systems are not user friendly and impose a strong cognitive burden on users. Such systems usually require users to adapt to machines, whereas we think that machines should be adjusted to users. There is often a trade-off between security and usability: in current applications, security tends to decrease usability. A prime example of this trade-off can be observed in user authentication, which is an essential requirement for many web sites that need to secure access to stored data. Most Internet services use password-based schemes for user authentication.
Password-based authentication schemes are knowledge-based, since they require users to memorize secrets, such as passwords. In such schemes, higher security means using long, random combinations of characters as passwords, which are usually very difficult to remember. In addition, users are asked to provide different passwords for different web sites, each of which has its own specific policy. These trade-offs are not well understood, and password-based authentication is often unpopular among users [1]. Despite substantial research focusing on improving the state of the art, very few alternatives are in use.
This project explores a new type of knowledge-based authentication scheme that eases the high cognitive load of passwords. Password-based schemes, as well as other existing knowledge-based authentication schemes, use explicit memory. We propose a new scheme, MooneyAuth, which is based on implicit memory. In our scheme, users can reproduce an authentication secret by answering a series of questions or performing a task that draws on their subconscious memory. This has the potential to offer usable, deployable, and secure user authentication. Implicit memory is effortlessly utilized in everyday activities like riding a bicycle or driving a car. These tasks do not require explicit recall of previously memorized information.
The authentication scheme we propose is a graphical scheme that requires users to recognize Mooney images: degraded, two-tone images that contain a hidden object [2]. In contrast to existing schemes, this scheme is based on visual implicit memory. The hidden object is usually hard to recognize at first sight but is easy to recognize if the original image has been presented beforehand (see Figure 1). The technique is named after Craig Mooney, who used similar images of face drawings as early as 1957 to study the perception of incomplete pictures in children [3].
Our authentication scheme is composed of two phases. In the priming phase, the user is ‘primed’ with a set of images, their Mooney versions and corresponding labels. During the authentication phase, a larger set of Mooney images, including the primed images from the priming phase, is displayed to the user. The user is then asked to label the Mooney images that she was able to recognize. Finally, the system computes an authentication score from the correct and incorrect labels and decides to grant or deny access accordingly. A prototype of our proposed authentication scheme can be found online (see link below). We tested the viability of the scheme in a user study with 230 participants. Based on the participants from the authentication phase, we measured the performance of our scheme. Results show that our scheme is close to being practical for applications where timing is not overly critical (e.g., fallback authentication).
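One way such an authentication score could work is to compare how often the user correctly labels primed images versus unprimed distractors: implicit memory should produce a clear gap only for the genuine user. The weighting and threshold below are invented for illustration and are not the study's actual scoring rule:

```python
def authenticate(labels, primed, truth, threshold=0.4):
    """labels: {image_id: label entered by the user}
    primed: ids of images shown with solutions in the priming phase
    truth:  ground-truth labels for every displayed Mooney image.
    Grant access when primed images are recognized markedly more often
    than unprimed distractors (the implicit-memory advantage)."""
    def hit_rate(ids):
        return sum(labels.get(i) == truth[i] for i in ids) / len(ids)
    distractors = [i for i in truth if i not in primed]
    advantage = hit_rate(list(primed)) - hit_rate(distractors)
    return advantage >= threshold
```

Subtracting the distractor hit rate is what makes the score robust: an attacker who happens to be good at recognizing Mooney images scores well on both sets, so the advantage stays small.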
We believe that this line of research, at the frontier of cognitive neuroscience and Internet security, is very promising and warrants further investigation. In order to improve the usability of authentication schemes, security researchers must achieve a better understanding of human cognition.
Link:
http://www.mooneyauth.org
References:
[1] A. Adams, M. A. Sasse: “Users are not the enemy”, Commun. ACM, 42(12):40-46, Dec. 1999
[2] F. Imamoglu, T. Kahnt, C. Koch, J.-D. Haynes: “Changes in functional connectivity support conscious object recognition”, NeuroImage, 63(4):1909-1917, Dec. 2012
[3] C. M. Mooney: “Age in the development of closure ability in children”, Canadian Journal of Psychology, 11(4):219-226, Dec. 1957
is very relevant for today’s society. At CWI, we have pursued investigations in these fields to enhance the vital services provided by ambulances through the development of Dynamic Ambulance Management.
In life-threatening emergency situations, the ability of ambulance service providers (ASPs) to arrive at an emergency scene within a short period of time can make the difference between survival and death for the patient(s). In line with this, a service-level target is commonly used which states that for high-emergency calls, the response time, i.e., the time between an emergency call being placed and an ambulance arriving at the scene, should be less than 15 minutes in 95% of cases. To realise such short response times while ensuring that running costs remain affordable, it is critical to plan ambulance services efficiently. This encompasses a variety of planning problems at the tactical, strategic and operational levels. Typical questions that must be answered include “How can we reliably predict emergency call volumes over time and space?”, “How can we properly anticipate and respond to peaks in call volumes?”, “How many ambulances are needed and where should they be stationed?”, and “How should ambulance vehicles and personnel be effectively scheduled?”.
A factor that further complicates this type of planning problem is uncertainty, an ever-present consideration that, in this context, affects the entire ambulance service-provisioning process (e.g., emergency call-arrival patterns, travel times, etc.). The issue is that the planning methods currently available typically assume that “demand” (in the context of ambulance services, call volumes and their geographical spread) and “supply” (the availability of vehicles and ambulance personnel at the right time in the right place) parameters are known a priori. This makes these methods highly vulnerable to randomness or uncertainty, and to the impacts this inevitably has on the broader planning process, namely inefficiencies and higher costs. For ambulance services, the challenge is to develop new planning methods that are both scalable and robust against the inherent randomness associated with the service process, both real and non-real time.
A highly promising development that is gaining momentum in the ambulance sector is the emergence of Dynamic Ambulance Management (DAM). The basic idea of DAM is that ambulance vehicles are proactively relocated to achieve good spatial coverage of services in real time. By using dynamic and proactive relocation strategies, shorter arrival times can be achieved [1].
To illustrate the use of DAM, consider the following example area, which features six cities (A, D, E, L, U and Z; Figure 1), serviced by seven ambulances. When there are no emergencies (a ‘standard situation’), optimal vehicle coverage is obtained by positioning one ambulance in each of the six cities, with one additional vehicle in city A as it has the largest population. Now consider a scenario in which an incident occurs at city L while, simultaneously, two additional incidents are occurring in city A. These incidents can all be serviced by the ambulances currently located in those two cities. Under an ‘optimal’ dynamic relocation policy, this scenario should then trigger a proactive move by the ambulance in city D to city L, in order to maintain service coverage. As soon as that ambulance is within a 15-minute driving range of city L, the ambulance at city U should move to city D (note that city U is smaller than D, meaning that it can be covered by city E’s ambulance). The ambulance at city Z can then proactively move to city A. This example illustrates the complexity of using DAM: for example, what additional steps would be appropriate if an additional accident were to occur in city U whilst the ambulance was transitioning between cities U and D?
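The core trigger in such a policy can be sketched very simply: detect cities that have fallen outside the 15-minute coverage radius and greedily send the closest idle ambulance. The travel times below are invented for the six-city example, and this greedy rule is only a caricature of the MDP/ADP-based heuristics the project actually develops:

```python
# Invented pairwise drive times (minutes) for the six-city example.
TRAVEL = {("A", "D"): 12, ("D", "L"): 20, ("D", "U"): 13,
          ("U", "E"): 11, ("E", "Z"): 16, ("A", "Z"): 18}

def drive(a, b):
    if a == b:
        return 0
    return TRAVEL.get((a, b), TRAVEL.get((b, a), 99))  # 99 = "too far"

def uncovered(cities, idle, limit=15):
    """Cities with no idle ambulance within the response-time limit."""
    return [c for c in cities if all(drive(c, p) > limit for p in idle)]

def relocations(cities, idle, limit=15):
    """Greedy: send the closest idle ambulance toward each uncovered city."""
    moves, free = [], list(idle)
    for city in uncovered(cities, free, limit):
        nearest = min(free, key=lambda p: drive(p, city))
        moves.append((nearest, city))
        free.remove(nearest)
        free.append(city)
    return moves
```

With the ambulances of A and L busy, the sketch reproduces the first step of the example (D's ambulance heads for L); the follow-up moves from U and Z are exactly what the greedy rule misses and the stochastic-optimization heuristics address.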
The key challenge in an approach such as DAM is developing efficient algorithms that support real-time decision making. The fundamental question is “under what circumstances should proactive relocations be performed, and how effective are these relocation actions?”. Using methods from the stochastic optimization techniques Markov Decision Processes and Approximate Dynamic Programming, we have developed new heuristics for DAM [2,3]. Implementation of these solutions in the visualization and simulation package suggests that strong improvements in service quality can be realised, when compared with the outcomes gained from the relocation rules currently being deployed by most ambulance providers. Currently we are in the field-trial phase and are setting up a number of newly developed DAM algorithms, in collaboration with a number of ASPs in the Netherlands. These activities are part of the project ‘From Reactive to Proactive Planning of Ambulance Services’, partly funded by the Dutch agency Stichting Technologie & Wetenschap.

Figure 1: Illustration of proactive relocations of ambulance vehicles.

References:
[1] M. S. Maxwell et al.: “Approximate dynamic programming for ambulance redeployment”, INFORMS Journal on Computing, 22(2):266-281, 2010
[2] C.J. Jagtenberg, S. Bhulai, R.D. van der Mei: “A polynomial-time method for real-time ambulance redeployment”, submitted
[3] T.C. van Barneveld, S. Bhulai, R.D. van der Mei: “A heuristic method for minimizing average response times in dynamic ambulance management”, submitted

SELIDA, a printed materials management system that uses radio frequency identification (RFID), complies with the Web-of-Things concept. It does this by employing object-naming based services that are able to provide targeted information regarding RFID-enabled physical objects that are handled in an organization-agnostic collaborative environment.

Radio Frequency Identification (RFID) technology has already revolutionised areas such as logistics (i.e., supply chains), e-health management and the identification and traceability of materials. The challenging concept of RFID-enabled logistics management and information systems is that they use components of the Electronic Product Code (EPC) global network, such as Object Naming Services (ONS) and the EPC Information Services (EPCIS), in order to support the Internet of Things concept.

SELIDA is a joint research project between the Industrial Systems Institute, the University of Patras (Library and Information Center), the Athens University of Economics and Business, Ergologic S.A. and Orasys ID S.A. The project’s main goal is the ability to map single physical objects to URIs in order to provide, to all organizations involved in the value chain, various information related to these objects (tracking, status, etc.). This is mainly achieved by SELIDA’s architectural framework, which is able to support as many of the EPC global standards as possible (Figure 1), along with the realization of ONS-based web services available in the cloud. This architectural framework is value-chain agnostic, which relates to:
• the common logistics value chain;
• the physical documents interchange value chain; and
• in demanding cases, the objects interchange value chain.

The discovery and tracking service for physical documents that has been implemented exploits both ONS 1.0.1 and EPCIS 1.0.1, in order to allow EPC-tagged documents to be mapped to the addresses of arbitrary object management services (OMS), albeit ones with a standardised interface.

The main constituents of the architectural framework are:
• The RFID middleware, which is responsible for receiving, analysing, processing and propagating the data collected by the RFID readers to the information system which supports the business processes.
• The Integration Layer, which seamlessly integrates the EPC-related functions into the existing services workflow. While the existing legacy systems could be altered, such a layer is preferable because of the reliability offered by shop-floor legacy systems in general.
• The ONS Resolver, which provides secure access to the ONS infrastructure so that its clients can not only query the OMSs related to EPCs (which is the de facto use case for the ONS) but also introduce new OMSs or delete any existing OMSs for the objects.
• The OMS, which provides management, tracking and other value-added services for the EPC-tagged objects. The ONS Resolver maps the OMSs to the objects, according to their owner and type, and they should be implemented according to the EPCIS specification (see link below).

Figure 1: The architectural framework of SELIDA.

The SELIDA architecture has been integrated into KOHA, the existing Integrated Library System used in the University of Patras Library and Information Center. As with all integrated library systems, KOHA supports a variety of workflows and services that accommodate the needs of the Center. The SELIDA scheme focuses on a handful of those services and augments them with additional features. This is generally done by adding, in a transparent way, the additional user-interface elements and background processes that are needed for the scheme to work. In order to provide the added EPC functionality to the existing KOHA operations, the integration layer was designed and implemented to seamlessly handle all the extra work, along with the existing service workflow. The SELIDA scheme provides additional functionality to services such as Check Out, Check In, New Record and Delete Record. There are also a number of tracking services that our scheme aims to enhance; these are History, Location and Search/Identify.

The implemented architecture focuses on addressing the issue of empowering the whole framework with a standard specification for object tracking services by utilising an ONS. Thus, the organisations involved are able to act agnostically of their entities, providing them with the ability to resolve EPC-tagged objects to arbitrary services in a standardised manner.

Links:
KOHA: www.koha.org
ISO RFID Standards: http://rfid.net/basics/186-iso-rfid-standards-a-complete-list
Survey: http://www.rfidjournal.com/articles/view?9168
EPCglobal Object Name Service (ONS) 1.0.1: http://www.gs1.org/gsmp/kc/epcglobal/ons/ons_1_0_1-standard-20080529.pdf
EPCglobal framework standards: http://www.gs1.org/gsmp/kc/epcglobal

Reference:
[1] J. Gialelis et al.: “An ONS-based Architecture for Resolving RFID-enabled Objects in Collaborative Environments”, IEEE World Congress on Multimedia and Computer Science, WCMCS 2013

Please contact:
John Gialelis or Dimitrios Karadimas
Industrial Systems Institute, Patras, Greece
E-mail: {gialelis,karadimas}@isi.gr
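The resolution step at the heart of this design (an ONS-style lookup from an EPC to the object-management services registered for it, plus the register/delete operations the ONS Resolver exposes) can be modelled in a few lines. The EPC and endpoint URLs below are invented, and this toy ignores the DNS-based encoding the real ONS standard uses:

```python
# Invented EPC and endpoints; models only the query/register operations.
ONS_RECORDS = {
    "urn:epc:id:sgtin:0614141.107346.2017": [
        "https://oms.example.org/tracking",
    ],
}

def resolve(epc, records=ONS_RECORDS):
    """Return the OMS endpoints registered for an EPC (empty if unknown)."""
    return list(records.get(epc, []))

def register(epc, oms_url, records=ONS_RECORDS):
    """Introduce a new OMS for an object, as the ONS Resolver permits."""
    records.setdefault(epc, []).append(oms_url)

def deregister(epc, oms_url, records=ONS_RECORDS):
    """Delete an existing OMS mapping for an object."""
    if oms_url in records.get(epc, []):
        records[epc].remove(oms_url)
```

The point of the indirection is that a library, a courier and a publisher can each register their own OMS against the same EPC, so any party holding the tag can discover all the services that know about the object.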
Lost Container Detection System
by Massimo Cossentino, Marco Bordin and Patrizia Ribino
Each year thousands of shipping containers fail to arrive at their destinations and the estimated damage arising from this issue is considerable. In the past, a database of lost containers was established, but the difficult problem of identifying them in huge parking areas was entrusted to so-called container hunters. We propose a system (and related methods) that aims to automatically retrieve lost containers inside a logistic area using a set of sensors that are placed on cranes working within that area.
The Lost Container Detection System (LostCoDeS) [1] is an ICT solution created to avoid the costly loss of containers inside large storage areas, such as logistic districts or shipping areas. In these kinds of storage areas (Figure 1), several thousand containers are moved and stacked in dedicated zones (named cells) daily. Nowadays, the position of each stacked container is stored in a specific database that facilitates the later retrieval of this location information. As the movement and management of containers involves many different workers (e.g., crane operators, dockers, administrative personnel, etc.), communication difficulties or simple human distraction can cause the erroneous positioning of containers and/or the incorrect updating of location databases. In large areas that store thousands of containers, such errors often result in containers becoming lost, with the ensuing difficulties associated with finding them.
At present, to the best of our knowledge, there are no automatic solutions available that are capable of solving this particular problem without the pervasive use of tracking devices. Most of the solutions proposed in the literature address container traceability during transport (either to their destinations or inside logistic districts), for instance by using on-board tracking devices [2] or by continuously monitoring the containers with ubiquitous sensors [3].
Figure 1: A common shipping area (by courtesy of Business Korea).
Research and Innovation
LostCoDeS is a system that is able to detect a misplaced container inside a large storage area without using any kind of positioning or tracking device on the containers. The novelty of LostCoDeS lies in the method we use to find the lost containers, rather than in a wide deployment of hardware devices on containers. In this system, a few sets of sensors are placed on the cranes working inside the logistic area. Using the data from these sensor sets, an algorithm can then verify whether there are any misplaced containers that may indicate a lost item. The architectural design of LostCoDeS is quite simple. It is composed of a set of sensors for capturing geo-data related to the large storage area, a workstation for processing these data and a network for communicating data to a storage device. An informal architectural representation of LostCoDeS is presented in Figure 2.
From the functional perspective, LostCoDeS is based on three main algorithms: the first performs a multimodal fusion of the data coming from the sets of sensors; the second reproduces a three-dimensional representation of the geo-data; and the last compares the real data perceived by the sensors with the expected data.
Hence, our system is able to generate a representation of the container stacks in order to detect anomalies. In more detail, LostCoDeS is able to: (i) detect the incorrect placement of containers; (ii) identify the likely locations of lost containers; (iii) indicate the presence of non-registered containers; (iv) indicate the absence of registered containers; and (v) monitor the positioning operations.
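The comparison between perceived and expected data can be illustrated with a small sketch (our own simplified illustration, not the patented LostCoDeS algorithms; the cell coordinates and container IDs are invented):

```python
def compare_layouts(expected, observed):
    """Compare the recorded layout with the sensor-perceived one.

    Both arguments map a cell coordinate (zone, stack, tier) to a
    container ID; the function returns the kinds of discrepancy listed
    above: misplaced, unregistered and missing containers, plus likely
    locations for lost ones.
    """
    misplaced = {c: (expected[c], observed[c])
                 for c in expected.keys() & observed.keys()
                 if expected[c] != observed[c]}
    unregistered = {c: observed[c] for c in observed.keys() - expected.keys()}
    missing = {c: expected[c] for c in expected.keys() - observed.keys()}
    # A registered-but-absent container whose ID turns up in an
    # unexpected cell yields the likely location of a lost container.
    lost_ids = set(missing.values())
    likely_found = {cid: c for c, cid in unregistered.items() if cid in lost_ids}
    return misplaced, unregistered, missing, likely_found

# Invented example: container C-101 is registered in cell ("A", 1, 1)
# but the crane sensors perceive it in cell ("B", 3, 0).
expected = {("A", 1, 0): "C-100", ("A", 1, 1): "C-101"}
observed = {("A", 1, 0): "C-100", ("B", 3, 0): "C-101"}
misplaced, unregistered, missing, likely_found = compare_layouts(expected, observed)
print(likely_found)  # {'C-101': ('B', 3, 0)}
```

In this toy form the "lost" container is found purely by cross-checking the database against what the crane sensors perceive, which is why no tracking device on the container itself is needed.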
The main advantage of LostCoDeS is that the detection of lost containers can be completed during normal handling operations. Moreover, it overcomes the limitations associated with traditional tracking systems based on radio signals, namely that the reliability and stability of transmissions is not guaranteed when the signal has to pass through obstacles (e.g., metal). Further, the system is discreet (i.e., there is no need to install cameras and/or other equipment over the monitored area) and low cost (i.e., there is no need to install tracking devices on the containers), but still maintains an ability to monitor large areas. It is also worth noting that the continuous monitoring of operations is not strictly required (although useful), as the system is capable of identifying misplaced containers even when it starts from an unknown location.
Acknowledgements
Special thanks to Ignazio Infantino, Carmelo Lodato, Salvatore Lopes and Riccardo Rizzo who, along with the authors, are the inventors of LostCoDeS.
References:
[1] M. Cossentino et al.: "Sistema per verificare il numero di contenitori presenti in una pila di contenitori e metodo di verifica relativo" (System for verifying the number of containers present in a stack of containers and related verification method), Italian pending patent no. RM2013A000628, 14 Nov 2013
[2] C. M. Braun: "Shipping container monitoring and tracking system", U.S. Patent No. 7,339,469, 4 Mar 2008
[3] L. E. Scheppmann: "Systems and methods for monitoring and tracking movement and location of shipping containers and vehicles using a vision based system", U.S. Patent No. 7,508,956, 24 Mar 2009.
by Paolo Barsocchi, Antonino Crivello, Erina Ferro, Luigi
Fortunati, Fabio Mavilia and Giancarlo Riolo
"Renewable Energy and ICT for Energy Sustainability"
(or "Energy Sustainability" for short) is a project led
by the Department of Engineering, ICT and
Technologies for Energy and Transportation (DIITET)
of the Italian National Research Council (CNR). The
project aims to study and test a coordinated set of
innovative solutions to make cities sustainable with
respect to their energy consumption.
To achieve its goal, the project is based on: i) the widespread use of renewable energy sources (and related storage technologies and energy management); ii) the extensive use of ICT technologies for the advanced management of energy flows; iii) the adaptation of energy-efficient city services to demand (thus encouraging the rational usage of energy resources and, thus, savings); and iv) the availability of energy from renewable sources.
This project is aligned with the activities of the European Commission under its energy efficiency theme. By June 2014, the European Member States had to implement the new Directive 2012/27/EU (4 December 2012). This Directive establishes that one of the measures to be adopted is "major energy savings for consumers", where easy and free-of-charge access to data on real-time and past energy consumption, through more accurate individual metering, will empower consumers to better manage their energy consumption [1].
The Energy Sustainability project focuses on six activities. We are involved in the "in-building" energy sustainability component. The main goal of this sub-project is to compute the real energy consumption of a building, and to facilitate energy savings when and where possible, through the experimental use of the CNR research area in Pisa. This research area is more complex than a typical, simple building, as it hosts 13 institutes. However, energy to the area is supplied through a single energy source, which means that the institutes have no way of assessing their individual energy consumption levels.
The main requirements of the In-Building sub-project are that it must:
• monitor the power consumption of each office (lights and electrical sockets),
• regulate the gathering and visualization of data via permits,
• support real-time monitoring and the ability to visualize time series,
• define energy saving policies,
• be cheap and efficient.
We developed an Energy long-term Monitoring System (hereafter referred to as EMS@CNR), which is composed of a distributed ZigBee Wireless Sensor Network (WSN), a middleware communication platform [2] and a set of decision policies distributed on the cloud (Figure 1). At the time of writing, sensor nodes of this WSN are installed in some offices of our Institute (CNR-ISTI). Each sensor node can aggregate multiple transducers, such as humidity, temperature, current, voltage, Passive Infrared (PIR) and pressure sensors, and a noise detector. Each node is connected to a ZigBee sink, which provides Internet-of-Things connectivity through IPv6 addressing. The choice of a ZigBee network was driven by several characteristics of the technology: ultra-low power consumption, the use of unlicensed radio bands, cheap and easy installation, flexible and extendable networks, and integrated intelligence for network set-up and message routing.
In order to measure the energy consumption of a room, we need to assess the values of the current and voltage waveforms at the same instant. This is driven by the need to operate within existing buildings, without the possibility of changing existing electrical appliances. We used a current and a voltage transformer. We also installed a PIR and a noise detector in a single node, and a set of pressure detectors in another node under the floating floor, in order to determine the walking direction of a person (i.e., to detect whether they are entering or leaving the office). This helps to determine whether or not someone is in the office, information that drives decisions regarding potential energy savings for that specific room. For example, in an office where nobody is present, lights and other electric appliances (apart from computers) can be switched off automatically. Currently, the decision policies do not take into account data coming from welfare transducers, such as temperature and relative humidity; these data will be included in the decision policies now in progress.
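Measuring consumption from synchronised waveforms amounts to averaging the instantaneous product of the voltage and current samples over whole mains cycles. A minimal sketch (the 50 Hz mains, sample rate and 100 W resistive load are invented values for illustration, not EMS@CNR parameters):

```python
import math

def real_power(voltage_samples, current_samples):
    """Average real power in watts, from voltage (V) and current (A)
    samples taken at the same instants over whole mains cycles."""
    pairs = list(zip(voltage_samples, current_samples))
    return sum(v * i for v, i in pairs) / len(pairs)

# Invented example: 230 V RMS mains at 50 Hz, purely resistive 100 W
# load, sampled 100 times over one full cycle.
n = 100
v = [230 * math.sqrt(2) * math.sin(2 * math.pi * k / n) for k in range(n)]
i = [(100 / 230) * math.sqrt(2) * math.sin(2 * math.pi * k / n) for k in range(n)]
print(round(real_power(v, i), 3))  # 100.0
```

Sampling both waveforms at the same instants is what makes the average correct for non-resistive loads as well: any phase shift between voltage and current automatically reduces the computed real power.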
Sensor data collected by the deployed WSN are stored in a NoSQL document-oriented local database, such as MongoDB. In order to provide both a real-time view of the sensor data and a configuration interface, we implemented a web interface, named WebOffice, that runs on JBoss 7.1, which implements the Java EE specification and is free and open source. The web interface provides a login system to protect the data display and ensure privacy. After a successful login, the sensor data are shown on the main page according to the user's permissions. There are two main types of graphics: dynamic charts (for real-time visualization and historical data) and gauges (for an immediate display of the last recorded value).

References:
[1] L. Pérez-Lombard, J. Ortiz, C. Pout: "A review on buildings energy consumption information", Energy and Buildings, vol. 40, no. 3, pp. 394-398, 2008
[2] F. Palumbo et al.: "AAL Middleware Infrastructure for Green Bed Activity Monitoring", Journal of Sensors, vol. 2013, pp. 1-15.

Figure 1: The EMS@CNR system and the WebOffice platform.

T-TRANS: Benchmarking Open Innovation Platforms and Networks

by Isidoros A. Passas, Nicos Komninos and Maria Schina

What might sixty web-based platforms and networks have to tell us about open innovation? The FP7 project T-TRANS aims to define innovation mechanisms for Intelligent Transport Systems (ITS) that facilitate the transfer of related innovative products and services to the market.

The T-TRANS project addresses the difficulties associated with transferring new technologies, and seeks to capitalise on the significant opportunities to improve efficiency and reduce costs once those technologies are commercialised. One of the expected outcomes of the project will be the establishment of a pilot innovation network that is focused on ITS. Initially this network will feature three glocal (global-to-local) communities that are suitable for ITS commercialisation, referred to as CIMs, and will be implemented in Central Macedonia (Greece), Galicia (Spain) and Latvia. This will set the scene for a more expansive Europe-wide ITS e-innovation network.

With a view to informing the design of this new e-innovation network, T-TRANS partners undertook an analysis to benchmark open innovation platforms and networks [1]. From this work, the partners were able to gain a better understanding of what it takes for a network to serve the objectives of its members and sustain itself effectively. Benchmarking is widely defined as the act of comparatively assessing an organisation's technologies, production processes and products against leading organisations in similar markets. The T-TRANS benchmarking exercise was based on three main pillars of comparison and assessment: (1) the platform, (2) the collaboration network and (3) the added value of the network and the platform. Each of these pillars has been described by a set of characteristics or attributes. A benchmarking template was developed to capture the data that was included and the indicators that were used to benchmark each characteristic.

Platforms are considered to be those web-based systems that can be programmed and therefore customised by developers and users. They may also accommodate the goals of on-going collaborative and/or joint initiatives that the original developers could not have possibly contemplated or had time to accommodate. A forthcoming publication by Angelidou et al. [2] presents an analysis of the current trends in the innovation strategies set by companies. The trends that appear are: 1) the majority of firms introducing new-to-market innovation do perform in-house R&D; 2) companies turn to open innovation and collaborative networks, especially the formation of global networks, and to external knowledge partnerships for the acquisition of knowledge, fresh ideas and market access; 3) users and consumers also play a growing role, increasing the interaction between demand and supply; 4) multinational firms have a leading role in the globalisation of innovation; 5) local knowledge and capabilities, as well as proximity to research and education institutions, continue to matter for innovation.

A key finding of this work was that the results "indicate that there is no return to the old linear model of innovation, in which R&D translates directly and spontaneously to innovation. On the contrary the systemic and network perspective is consolidated and shapes all drivers of innovation creation, such as universities and tertiary education, patenting and technology transfer, knowledge infrastructure and flows, international cooperation, governance and stakeholders' involvement in shaping policies for innovation. Traditional innovation networks comprising only a few nodes evolve to extremely large networks with hundreds of participants from all over the world. They include local and global partners, but with the spread of ICTs and virtual networks they are becoming glocal, combining local competences with global know-how and access to markets". This perspective provided the necessary definition framework for collaborative innovation networks and has been used in the T-TRANS benchmarking exercise.

Of the sixty cases considered in the benchmarking analysis, 66% were characterised as platforms and 51% as networks. The objectives of both platforms and networks are very clearly stated and identified. Clear objectives are crucial since they state exactly what the platform and/or network is intended to either build or support. Having clearly defined objectives supports the enrolment of new users and members and on-going operations. Some of the key objectives related to the collaborative design and development of products, problem solving, brainstorming and the creation of communities for crowdsourcing ideas. Interestingly, one of the platform/network combinations we examined provided a gamified community where members could complete missions and earn points and badges. Another identified objective was to support ideas that can help improve living conditions, including activities ranging from early stage investment to in-depth research, thus strengthening the social aspect of innovation and the development of innovative new products.
Following an analysis of these objectives, in accordance with the clear statements they made, we found that in the majority of instances the benefits of the platforms and networks we examined were also clearly stated. Some of the benefits identified included effective cross-cultural collaboration, a deep understanding of complex issues, patent application and invention licensing coaching, and collaboration opportunities in R&D and innovation. A commonly identified benefit was free access to shared knowledge. In general, most of the networks stated that a creative process is much more powerful if it is fuelled by large numbers of participants who are all thinking about the same problem at the same time. The promotion of business between inventors and interested parties was another common benefit, and crowdsourcing capabilities appeared to be an emerging benefit trend.
Figure 1: Open innovation platforms and networks benchmarking framework.
Most of the platforms were focused, although not to a significant degree. Among the thematic domains identified were innovation services and patent invention support, grants and innovation management, disruptive and open innovation, innovation management and technology transfer, and networking services.
The main supporting actions performed by the networks are towards knowledge transfer, collaboration and joint development. The platforms were categorised into seven new product development stages as defined by Cooper's Stage-Gate methodology (Figure 2). The platforms mainly supported the processes which occur in the first four stages: idea generation, screening, concept development and testing, and business analysis.
In conclusion, web-based platforms are an essential component of innovation networks, enlarging and extending collaborative opportunities across geographical and time zones and enabling the participation of large numbers of users, inventors and innovators, thus supporting an "innovation-for-all" culture [3].
Links:
Project T-TRANS: http://www.ttransnetwork.eu/
List of examined open innovation platforms and entities: http://wp.me/a2OwBG-PW
Stage-Gate innovation process: http://www.stage-gate.com/aboutus_founders.php
References:
[1] H. Chesbrough, W. Vanhaverbeke and J. West (eds.): "Open Innovation: Researching a New Paradigm", Oxford University Press, 2006
[2] M. Angelidou et al.: "Intelligent transport systems: Glocal communities of interest for technology commercialisation and innovation", 2014
[3] N. Komninos: "The Age of Intelligent Cities: Smart Environments and Innovation-for-all Strategies", London and New York, Routledge, 2014.
Please contact:
Isidoros Passas, INTELSPACE SA, Thessaloniki, Greece
E-mail: [email protected]
Figure 2: The platforms in the analysis were categorised according to which of the seven new product development stages they were involved in.
Events
Research Data Alliance and Global Data and Computing e-Infrastructure Challenges

Rome, Italy, 11-12 December 2014
The Research Data Alliance and Global Data and Computing e-Infrastructure Challenges event is being organised within the framework of the Italian Presidency of the European Union and will take place in Rome, Italy on 11-12 December 2014.
This event will focus on how synergies can be strengthened between e-Infrastructures, the ambitious European Research Infrastructures roadmap (ESFRI) and other major initiatives with high potential impact on research and innovation (e.g. HBP, COPERNICUS and other initiatives across Horizon 2020). This implies strong European coordination between these initiatives. The event also puts particular emphasis on the importance of long-term sustainable support for basic services for the research and education communities, as well as on the consolidation of global cooperation for research data and computing infrastructures in the above contexts.
The event is organised with the support of the Italian Ministry of Education, Universities and Research (MIUR), the Italian Supercomputing Center (CINECA), the Italian National Research Council (CNR), the Italian National Institute for Geophysics and Volcanology (INGV) and RDA Europe. High-level policy-makers; national, European and international scientists; academics; and government representatives will be invited to attend.
Participation in this event is by invitation only.
CloudFlow - Computational Cloud Services and Workflows for Agile Engineering - is a European Integrating Project (IP) in the framework of Factories of the Future (FoF) that aims at making Cloud infrastructures a practical solution for manufacturing industries, preferably small and medium-sized enterprises (SMEs). The objective of CloudFlow is to ease access to computationally demanding virtual product development and simulation tools, such as CAD, CAM and CAE, and to make their use more affordable by providing them as Cloud services.
The project is now open for new (teams of) participants and solicits small consortia consisting of one to four partners (end users, software vendors, HPC/Cloud infrastructure providers and research organizations) to respond to the open call for proposals. With the call, the project seeks to increase the number of partners and application experiments currently being carried out within the CloudFlow project.
Application experiments will be rooted in computational technology for manufacturing and engineering industries, preferably SMEs, in stages covering but not limited to:
• design (CAD),
• simulation (product, process, factory, etc.),
• optimization,
• visualization,
• manufacturing planning,
• quality control and
• data management,
addressing workflows along the value chain in and across companies.
The deadline of the first Call is 30 September 2014. The expected duration of participation is January to December 2015. A second Call is expected to be launched in June 2015.
The 7th International Conference on Computational and Methodological Statistics is organised by the ERCIM Working Group on Computational and Methodological Statistics and the University of Pisa.
The conference will take place jointly with the 8th International Conference on Computational and Financial Econometrics (CFE 2014). The conference has a high reputation for quality presentations. The last editions of the joint CFE-ERCIM conference gathered over 1,200 participants.
Topics
Topics include all subjects within the aims and scope of the ERCIM Working Group CMStatistics: robust methods, statistical algorithms and software, high-dimensional data analysis, statistics for imprecise data, extreme value modeling, quantile regression and semiparametric methods, model validation, functional data analysis, Bayesian methods, optimization heuristics in estimation and modelling, computational econometrics, quantitative finance, statistical signal extraction and filtering, small area estimation, latent variable and structural equation models, mixture models, matrix computations in statistics, time series modeling and computation, optimal design algorithms and computational statistics for clinical research.
The journal Computational Statistics & Data Analysis will publish selected papers in special peer-reviewed or regular issues.
More information: http://www.cmstatistics.org/ERCIM2014/
Proton therapy is considered the most advanced and targeted cancer treatment due to its superior dose distribution and reduced side effects. Protons deposit the majority of their effective energy within a precisely controlled range within a tumor, sparing healthy surrounding tissue. Higher doses can be delivered to the tumor without increasing the risk of side effects and long-term complications, improving patient outcomes and quality of life. The Belgian company IBA is the world leader in the field.
IBA and the iMagX team at Université catholique de Louvain (UCL) have jointly developed a software platform and a 3-D cone beam CT in order to guide the proton beam in real time in the treatment room, in the frame of a public-private R&D partnership between IBA, UCL and the Walloon Region of Belgium. The system allows for dose delivery estimation and efficient 3-D reconstruction, as well as co-registration of the in-vivo image with the treatment planning based on an offline high-resolution 3-D Computed Tomography scan of the patient, both in real time. IBA's AdaPTInsight is the first operational software based on the iMagX platform. The global system, including hardware and software, was granted FDA approval and was used for the first time to treat a patient in Philadelphia, United States, at Penn Medicine's Department of Radiation Oncology on 9 September 2014. More information can be found at http://www.imagx.org.
Altruism in Game Theory
Research shows that consideration for others does not always lead to the best outcome - that is, when it is applied in game theory. Bart de Keijzer (CWI) studied algorithms for game theory, with a focus on cooperative aspects. He defended his PhD thesis 'Externalities and Cooperation in Algorithmic Game Theory' on 16 June at VU University. His research results can have applications in data and traffic networks, peer-to-peer networks and GSP auctions, such as those used by Google AdWords.
In conventional models it is a common assumption that players are only interested in themselves. However, in real life players are also influenced by others. De Keijzer investigated the impact of cooperation, friendship and animosity on different games. One of his conclusions is that when players behave altruistically, the flow in a road or data network can become worse. "It's a remarkable result that, for the mathematical concept of social welfare, one can sometimes better choose at the expense of others than change the strategy to please them," the researcher says.

With these more realistic models, researchers and policy makers can make better qualitative predictions. Other research results from De Keijzer, who is now working at the Sapienza University of Rome, have applications in procurement auctions, treasury auctions, spectrum auctions and the allocation of housing. See also http://bart.pakvla.nl/

In Brief

Essay Contest Prize for Tommaso Bolognesi

In August 2014 Tommaso Bolognesi, senior researcher at ISTI-CNR, Pisa, was awarded for the second time (the first was in 2011) a prize in the essay contest "How Should Humanity Steer the Future?", launched by the U.S. institution FQXi (Foundational Questions Institute). His essay, entitled 'Humanity is much more than the sum of humans', ambitiously attempts to use ideas on the computational universe conjecture by Wolfram and others, on life as evolving software and on mathematical biology by G. Chaitin, and on integrated information and consciousness by G. Tononi, to provide some formal foundations for the cosmological visions of the French Jesuit and paleontologist Pierre Teilhard de Chardin. The essay can be read and commented on at http://fqxi.org/community/forum/topic/2014.
View of the proton therapy treatment room.
Picture courtesy of IBA s.a.
ERCIM is the European Host of the World Wide Web Consortium.
Institut National de Recherche en Informatique
et en Automatique
B.P. 105, F-78153 Le Chesnay, France
http://www.inria.fr/
Technical Research Centre of Finland
PO Box 1000
FIN-02044 VTT, Finland
http://www.vtt.fi/
SBA Research gGmbH
Favoritenstraße 16, 1040 Wien
http://www.sba-research.org
Norwegian University of Science and Technology
Faculty of Information Technology, Mathematics and Electrical Engineering, N 7491 Trondheim, Norway
http://www.ntnu.no/
University of Warsaw
Faculty of Mathematics, Informatics and Mechanics
Banacha 2, 02-097 Warsaw, Poland
http://www.mimuw.edu.pl/
Consiglio Nazionale delle Ricerche, ISTI-CNR
Area della Ricerca CNR di Pisa,
Via G. Moruzzi 1, 56124 Pisa, Italy
http://www.isti.cnr.it/
Centrum Wiskunde & Informatica
Science Park 123,
NL-1098 XG Amsterdam, The Netherlands
http://www.cwi.nl/
Foundation for Research and Technology - Hellas
Institute of Computer Science
P.O. Box 1385, GR-71110 Heraklion, Crete, Greece
http://www.ics.forth.gr/FORTH
Fonds National de la Recherche
6, rue Antoine de Saint-Exupéry, B.P. 1777
L-1017 Luxembourg-Kirchberg
http://www.fnr.lu/
FWO
Egmontstraat 5
B-1000 Brussels, Belgium
http://www.fwo.be/
F.R.S.-FNRS
rue d’Egmont 5
B-1000 Brussels, Belgium
http://www.fnrs.be/
Fraunhofer ICT Group
Anna-Louisa-Karsch-Str. 2
10178 Berlin, Germany
http://www.iuk.fraunhofer.de/
SICS Swedish ICT
Box 1263,
SE-164 29 Kista, Sweden
http://www.sics.se/
University of Geneva
Centre Universitaire d’Informatique
Battelle Bat. A, 7 rte de Drize, CH-1227 Carouge
http://cui.unige.ch
Magyar Tudományos Akadémia
Számítástechnikai és Automatizálási Kutató Intézet
P.O. Box 63, H-1518 Budapest, Hungary
http://www.sztaki.hu/
University of Cyprus
P.O. Box 20537
1678 Nicosia, Cyprus
http://www.cs.ucy.ac.cy/
Spanish Research Consortium for Informatics and Mathematics
D3301, Facultad de Informática, Universidad Politécnica de Madrid
28660 Boadilla del Monte, Madrid, Spain
http://www.sparcim.es/
Science and Technology Facilities Council
Rutherford Appleton Laboratory
Chilton, Didcot, Oxfordshire OX11 0QX, United Kingdom
http://www.scitech.ac.uk/
Czech Research Consortium
for Informatics and Mathematics
FI MU, Botanicka 68a, CZ-602 00 Brno, Czech Republic
http://www.utia.cas.cz/CRCIM/home.html
Subscribe to ERCIM News and order back copies at http://ercim-news.ercim.eu/
ERCIM - the European Research Consortium for Informatics and Mathematics - is an organisation dedicated to the advancement of European research and development in information technology and applied mathematics. Its member institutions aim to foster collaborative work within the European research community and to increase co-operation with European industry.