End-User Programmers and their Communities: An Artifact-based Analysis

Kathryn T. Stolee, Sebastian Elbaum, and Anita Sarma
University of Nebraska–Lincoln
{kstolee, elbaum, asarma}@cse.unl.edu
September 22, 2011

This work is supported by the NSF GRFP under CFDA#47.076, NSF Award #0915526, and AFOSR Award #9550-10-1-0406.
Introduction
End-User Programming
Introduction
End User Programmers
People who engage in programming activities to support their hobbies and work.

                      Professionals    End Users
Number in U.S.        3 million        13 million
Typical Education     C.S. Degree      Other Degree
Role of Programming   It’s their job   It supports their job
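Environment      Yahoo! Pipes   Scratch   MATLAB
# Artifacts      100,000        700,000   13,717
# Participants   90,000         500,000   5,356

. . . yet we know little about how the repositories are utilized.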
Empirical Study
Motivation
Empirical Study Details
  Research Goal
  Study Context
  Research Questions
  Variables and Metrics
  Methods
  Results
Research Goal
To better understand end-user programmer communities
  Learn how communities and artifact repositories evolve
  Uncover needs for support in: development, maintenance, search, program understanding, . . .
Web Mashups
Why Mashup Communities?
Web Mashups
Applications that compose and manipulate existing data sources or services to create new data or services.

Why study mashups?
  Many environments (e.g., Apatar, DERI Pipes, IBM Mashup Center, Kivati, Yahoo! Pipes, . . . )
  Potential impact (many users, growth)
About Yahoo! Pipes
This example mashup fetches and filters news from news.google.com.
The information page shows the pipe output and descriptive information.
Clicking Publish adds the pipe to the public repository.
Clicking Edit Source loads the Pipes Editor.

  Visual mashup creation environment
  Within a browser
  Drag and drop interface
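
As a rough illustration of the fetch-and-filter mashup described on this slide, the sketch below expresses the same idea in Python. It does not show how Yahoo! Pipes works internally. The feed contents, the function names (fetch, filter_items, news_pipe), and the in-memory list standing in for the news feed are all assumptions made for the example.

```python
# Hypothetical feed items standing in for a fetched news feed (no network access).
FEED = [
    {"title": "Mashup tools gain users", "source": "example-news"},
    {"title": "Local weather update", "source": "example-news"},
    {"title": "New mashup editor released", "source": "example-blog"},
]


def fetch(feed):
    """Source module: yield items from a feed (here, an in-memory list)."""
    yield from feed


def filter_items(items, keyword):
    """Operator module: keep items whose title contains the keyword."""
    kw = keyword.lower()
    return (item for item in items if kw in item["title"].lower())


def output(items):
    """Output module: collect the titles that reach the end of the pipe."""
    return [item["title"] for item in items]


def news_pipe(feed, keyword):
    """Wire the modules together: fetch -> filter -> output."""
    return output(filter_items(fetch(feed), keyword))


if __name__ == "__main__":
    print(news_pipe(FEED, "mashup"))
```

Each function plays the role of one module, and news_pipe plays the role of the wiring between them. In the visual editor the same composition is built by drag and drop.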
Study Setup
Research Questions
RQ1: What are the characteristics of the Yahoo! Pipes community?
  1a,b: author attrition and author contributions
  1c: artifact sharing, abstraction, complexity, and degree of overlap among pipes in the repository

RQ2: How do pipe attributes change as authors gain experience?
  2a: experience measured by time
  2b: experience measured by total contributions

RQ3: What are the characteristics of the most prolific authors?
  3a: author activities
  3b: author skills
  3c: awareness of the community
Metrics
Study Details
Concept to Capture                   Variable
artifact sharing/impact              popularity
abstraction                          configurability
complexity                           size
overlap of artifacts in repository   diversity

Diversity levels:
  1 Same structure, fields, content
  2 Same structure, field counts
  3 Same structure
  4 Same bag of modules
  5 Same set of modules
  6 Same type bag
  7 Same size
  8 No match
Significance: Diversity is related to contribution novelty
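
The eight diversity levels can be read as a cascade of progressively weaker comparisons between two pipes. The sketch below is one possible reading, not the study's implementation. The Module and Pipe records, the module names, the interpretation of each level, and the simplification that "same structure" means identical module order and wiring are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str            # module name, e.g. "fetch" (hypothetical)
    kind: str            # coarse module type, e.g. "source" (hypothetical)
    fields: tuple = ()   # (field_name, value) pairs


@dataclass(frozen=True)
class Pipe:
    modules: tuple       # Module records in a fixed order
    wires: tuple = ()    # (from_index, to_index) connections


def structure(pipe):
    """Assumed notion of structure: module names in order plus sorted wires."""
    return tuple(m.name for m in pipe.modules), tuple(sorted(pipe.wires))


def similarity_level(a, b):
    """Return the strongest level (1 = closest match, 8 = no match) that holds."""
    if structure(a) == structure(b):
        pairs = list(zip(a.modules, b.modules))
        if all(x.fields == y.fields for x, y in pairs):
            return 1  # same structure, fields, content
        if all(len(x.fields) == len(y.fields) for x, y in pairs):
            return 2  # same structure, field counts
        return 3      # same structure
    names_a = Counter(m.name for m in a.modules)
    names_b = Counter(m.name for m in b.modules)
    if names_a == names_b:
        return 4      # same bag (multiset) of modules
    if set(names_a) == set(names_b):
        return 5      # same set of modules
    if Counter(m.kind for m in a.modules) == Counter(m.kind for m in b.modules):
        return 6      # same type bag
    if len(a.modules) == len(b.modules):
        return 7      # same size
    return 8          # no match


# Hypothetical pipes: "tweak" differs from "news" only in one field value.
news = Pipe(
    modules=(Module("fetch", "source", (("url", "http://example.com/a"),)),
             Module("filter", "operator", (("rule", "contains x"),))),
    wires=((0, 1),),
)
tweak = Pipe(
    modules=(Module("fetch", "source", (("url", "http://example.com/b"),)),
             Module("filter", "operator", (("rule", "contains x"),))),
    wires=((0, 1),),
)
sorter = Pipe(
    modules=(Module("fetch", "source"), Module("sort", "operator")),
    wires=((0, 1),),
)
single = Pipe(modules=(Module("fetch", "source"),))
```

Under these assumptions, a pipe that only changes a URL lands at level 2, which is the kind of match that appears once field values are relaxed.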
Study Methods
Data Collection
Artifacts: 32,887
Authors: 20,313

Threats: public repository offers limited visibility (internal); sampling bias (external); generalizability to other domains (external)
Results
Research Questions
RQ1: What are the characteristics of the Yahoo! Pipes community?
  1a,b: author attrition and author contributions
  1c: artifact sharing, abstraction, complexity, and degree of overlap among pipes in the repository
RQ1: Characteristics of Yahoo! Pipes Community
Summary
Metric            Average
Size              8.20 modules per pipe
Configurability   0.65 modules per pipe
Popularity        5.67 clones per pipe
Diversity         3.62 cluster level

34% of pipes are configurable.
54% of pipes have been cloned.
5% of pipes are exact duplicates, yet 46% have a match if field values are relaxed.
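
To make the summary metrics concrete, the sketch below computes averages and percentages of this kind over a small, made-up sample. The PipeRecord fields and the sample values are hypothetical, and treating configurability as a count of user-input modules per pipe is an assumption about how that metric is defined.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PipeRecord:
    size: int          # number of modules in the pipe
    user_inputs: int   # user-input modules (assumed proxy for configurability)
    clones: int        # number of times the pipe was cloned


def summarize(records):
    """Compute per-pipe averages and the share of configurable and cloned pipes."""
    n = len(records)
    return {
        "avg_size": mean(r.size for r in records),
        "avg_configurability": mean(r.user_inputs for r in records),
        "avg_popularity": mean(r.clones for r in records),
        "pct_configurable": 100 * sum(r.user_inputs > 0 for r in records) / n,
        "pct_cloned": 100 * sum(r.clones > 0 for r in records) / n,
    }


# Made-up sample, not data from the study.
SAMPLE = [
    PipeRecord(size=8, user_inputs=1, clones=10),
    PipeRecord(size=5, user_inputs=0, clones=0),
    PipeRecord(size=12, user_inputs=2, clones=3),
    PipeRecord(size=4, user_inputs=0, clones=0),
]

if __name__ == "__main__":
    for metric, value in summarize(SAMPLE).items():
        print(metric, value)
```

The averages and the percentages answer different questions: an average of 0.65 user-input modules per pipe is compatible with only 34% of pipes being configurable, because some pipes carry several inputs and most carry none.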
Take Aways:
  There is a lot of reuse of shared pipes
  Participants often submit pipes that are highly similar to other pipes in the repository
Research Questions
RQ2: How do pipe attributes change as authors gain experience?
  2a: measures experience in terms of time
  2b: measures experience in terms of total contributions
RQ2: Analysis of artifacts as authors gain experience
Comparisons based on experience (time)

RQ3: Characteristics of most prolific authors
Author Activities

43% of pipes submitted by prolific authors represent tweaks
For example: change a URL, filter criterion, sort order, . . .
Level 8: No structural similarities
52% of pipes submitted by prolific authors represent new initiatives
[Figure: Rolling Diversity Analysis Over Time. x-axis: time in days (713 total; tick labels 0, 513, 16, 148, 2, 0, 0, 0, 0, 0, 1, 0, 0, 31, 0, 0, 1, 0, 1); y-axis: diversity (0 to 8).]
56% of prolific authors consistently submit new initiatives
[Figure: Rolling Diversity Analysis Over Time. x-axis: time in days (19 total; tick labels 0, 0, 0, 0, 0, 0, 0, 0, 11, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 5); y-axis: diversity (0 to 8).]
27% of prolific authors consistently submit tweaks
Take Away #1: 1/2 of participants submit pipes that are novel to their previous contributions
Take Away #2: 1/4 of participants submit pipes that are tweaks of their other pipes
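
One way to picture the split between authors who consistently submit new initiatives and authors who consistently submit tweaks is to label each author from the diversity levels of their successive pipes. The sketch below is illustrative only. The thresholds (tweak_max, novel_level, majority) and the idea of taking precomputed levels as input are assumptions, not the procedure used in the study.

```python
def classify_author(levels, tweak_max=2, novel_level=8, majority=0.5):
    """Label an author from the diversity levels of their pipes.

    levels[i] is assumed to be the diversity level (1 to 8) of one of the
    author's pipes measured against that author's earlier pipes.
    """
    if not levels:
        return "unknown"
    tweak_share = sum(level <= tweak_max for level in levels) / len(levels)
    novel_share = sum(level == novel_level for level in levels) / len(levels)
    if novel_share > majority:
        return "new initiatives"
    if tweak_share > majority:
        return "tweaks"
    return "mixed"


if __name__ == "__main__":
    # Made-up level sequences for three hypothetical authors.
    print(classify_author([8, 8, 8, 2]))
    print(classify_author([1, 2, 2, 8]))
    print(classify_author([8, 2, 5, 5]))
```

Changing the thresholds changes the labels, so any percentages produced this way depend on those choices.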
Discussion
Implications
The real take away
End-user programmer communities may need . . .
moderators.
  → Repository is cluttered with highly similar artifacts (RQ1)
more sophisticated repository search.
  → Many pipes are very structurally similar to other pipes in the repository (RQ1)
  → Early authors create less diverse pipes than later authors (RQ2)
artifact development support.
  → Tweaks represent missed opportunities for parameterization (RQ3)
  → Many shared pipes are tweaks on previously committed pipes by the same author (RQ3)
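
The point about tweaks being missed opportunities for parameterization can be illustrated outside of Pipes. In the sketch below, two near-identical functions differ only in a hard-coded keyword, and a single parameterized function replaces both. The data and function names are hypothetical; in Yahoo! Pipes the analogous step would be exposing the value as a user input rather than publishing a second pipe.

```python
# Hypothetical items; each has a title and a publication year.
ITEMS = [
    {"title": "Pipes tutorial", "year": 2009},
    {"title": "Scratch gallery", "year": 2010},
    {"title": "Pipes and feeds", "year": 2011},
]


# Two "tweaks": identical structure, only a hard-coded value differs.
def pipes_titles(items):
    return [i["title"] for i in items if "pipes" in i["title"].lower()]


def scratch_titles(items):
    return [i["title"] for i in items if "scratch" in i["title"].lower()]


# One parameterized artifact replaces both: the keyword becomes an input.
def titles_matching(items, keyword):
    kw = keyword.lower()
    return [i["title"] for i in items if kw in i["title"].lower()]


if __name__ == "__main__":
    print(titles_matching(ITEMS, "pipes"))
    print(titles_matching(ITEMS, "scratch"))
```

A repository holding only the parameterized version would contain one artifact instead of two structurally identical ones.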
Threats
Threats to Validity
Internal
  → History (the pipes were sampled at different times)
  → Selection (the repository only provides public pipes)
Construct
  → Interaction of different factors
  → Mono-method bias on diversity (only consider structural diversity, not semantic)
External
  → Generalizability (only studied one community)
  → Sampling bias (could not control search results when sampling)
Conclusion
Conclusion
  Authors utilize the repository in different ways
  As authors gain experience in the environment, they tend to make more valuable contributions to the repository
  There is a need for better support to help end-user programmer communities continue to progress and grow
  To generalize the results, we are interested in extending the metrics to other languages and repositories

To facilitate replication, the data used in this analysis is available: http://cse.unl.edu/~kstolee/esem2011/artifacts.html