The Pennsylvania State University
The Graduate School
AUTOMATING FEEDBACK-GENERATION FOR PROGRAMMING
ASSIGNMENTS VIA CROWDSOURCING
A Thesis in
Computer Science and Engineering
by
Rohan Shah
© 2020 Rohan Shah
Submitted in Partial Fulfillment of the Requirements
for the Degree of
Master of Science
December 2020
The thesis of Rohan Shah was reviewed and approved by the following:
John Hannan Associate Department Head and Associate Professor Thesis Advisor
Danfeng Zhang Assistant Professor
Chitaranjan Das Distinguished Professor of Computer Science and Engineering Department Head of Computer Science and Engineering
ABSTRACT
This project aims to provide formative feedback to novice programmers when they face a bug
in a programming assignment. Boolean feedback, i.e. whether a test passed or failed, does not
help them resolve the issue, and hence formative feedback is important. The availability of
feedback is another important factor: with a high student-to-instructor ratio in entry-level
computer science courses, it is hard for an instructor or TA to help every student within a
limited timeframe. Additionally, currently available solutions for providing feedback or
generating it automatically are difficult to implement. This study therefore investigates the
use of a continuous integration tool to automate testing and to tackle the problem of providing
formative feedback for programming assignments, keeping novice programmers in mind and with
the goal of implementing an unsophisticated solution that an instructor can adapt.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
Figure 3-8: Feedback.java file to compute feedback for students based on their exception
Figure 3-9: UpdateFeedback.java function that updates feedback based on crowdsourced feedback
Figure 4-1: Student's java programming assignment file to be graded.
Figure 4-2: Shell script as a part of build step on Jenkins-git project.
Figure 4-3: Email response generated by Jenkins-git project.
Figure 4-4: Email response generated from Jenkins pipeline for git-diff.
Figure 4-5: Git diff resulted page created from Jenkins pipeline.
Figure 5-1: JSON representation of feedback.txt file.
The script is designed such that it first triggers a build of the Jenkins-git project that was
set up in section 3.2.1.1. After that, based on the git repository specified, it computes the git
difference between the current version and the previous version and writes it to a new HTML
file. Once the HTML file is ready, it sends an email to the recipient specified in the last line,
with a link to the HTML file that was just created, so that the recipient (the student, in our
case) can see the git diff between the current and previous versions.
3.4: Java Setup
It must be noted that any programming language that can be run from a shell script can adopt
this model of automating feedback generation via crowdsourcing, since Jenkins and shell
scripting are the only requirements. To demonstrate a working model, however, the following
two Java files are needed: the first computes feedback for students, and the second improves
the hints based on students' input. In addition, a feedback.txt file must be created, which
contains error names and feedback in the following format:
<errorName>; <feedback>
java.lang.ArrayIndexOutOfBoundsException; Did you try to access an array's ith element whose value is not defined yet?
          body: """<p>See attached diff of build <b>${env.JOB_NAME} #${env.BUILD_NUMBER}</b>.</p>
                   <p>Check build changes on Jenkins <b><a href="${env.BUILD_URL}/last-changes">here</a></b>.</p>""",
          to: "[email protected]")
        } //end script
      }
    }
  }
}
Figure 3-7: Groovy script for automated feedback generation pipeline.
3.4.1: Compute Feedback for Programming Assignment in Java
The Feedback.java file shown below, which declares the FeedBackMap class invoked by the shell
script in chapter 4, is called by the Jenkins-git project if the student's submission throws a
runtime exception. Currently, the function reads each line of feedback.txt and, if the student's
runtime exception matches the error name on that line, stores the corresponding hint in a
temporary array list; the collected hints are then returned. It must be noted that different
solutions are possible for the same error, and hence an error can have multiple hints.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

public class FeedBackMap {

    // Returns every hint in feedback.txt whose error name matches the given error.
    private static ArrayList<String> getFeedback(String error) {
        error = error.trim().toLowerCase();
        ArrayList<String> hints = new ArrayList<String>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader bufferReader = new BufferedReader(new FileReader("feedback.txt"))) {
            String line;
            while ((line = bufferReader.readLine()) != null) {
                // Each line has the form <errorName>; <feedback>
                String[] words = line.split(";", 2);
                // Skip malformed lines that have no ";" separator
                if (words.length == 2 && words[0].trim().toLowerCase().equals(error)) {
                    hints.add(words[1].trim());
                }
            }
        } catch (IOException ex) {
            // FileNotFoundException is a subclass of IOException
            ex.printStackTrace();
        }
        return hints;
    }

    public static void main(String[] args) {
        ArrayList<String> hints = getFeedback(args[0]);
        // Print the hints as a numbered list
        for (int i = 1; i <= hints.size(); i++) {
            System.out.println(i + ". " + hints.get(i - 1));
        }
    }
}
Figure 3-8: Feedback.java file to compute feedback for students based on their exception
3.4.2: Update Feedback based on crowdsourcing
UpdateFeedback.java is called by the Poll email Jenkins project, from the shell script in its
build step, with two arguments: the error and the feedback that the student thinks would have
worked for them. Once called, it appends the error and the hint to the feedback.txt file while
maintaining the format of the file, i.e. <errorName>; <updated_fdb>. Finally, it returns a
success message stating that the feedback.txt file was updated; if writing fails, the error is
handled by the try-catch mechanism.
This component helps us to improve our model over time, and hence the feedback map learns
from crowdsourced feedback.
import java.io.FileWriter;
import java.io.IOException;

public class UpdateFeedback {

    public static void main(String[] args) {
        System.out.println(updateFeedback(args[0], args[1]));
    }

    // Appends "<error>; <hint>" as a new line at the end of feedback.txt.
    public static String updateFeedback(String error, String hint) {
        // The second constructor argument opens the file in append mode
        try (FileWriter fileWriter = new FileWriter("feedback.txt", true)) {
            fileWriter.write(error + "; " + hint + "\n");
        } catch (IOException e) {
            // Report the failure instead of claiming success
            return "the feedback for the error: " + error + " could not be added: " + e.getMessage();
        }
        return "the error: " + error + " and the feedback: " + hint + " has been successfully added.";
    }
}
Figure 3-9: UpdateFeedback.java function that updates feedback based on crowdsourced feedback
Chapter 4
Results
This chapter shows a running example and the results achieved, so that an instructor can
estimate the amount of work involved and the simplicity of the proposed solution. The Java
setup from section 3.4 is a requirement, along with the files and scripts mentioned in this
chapter.
The following is the Student.java file, which has four functions: isEven, reverseString,
sum2, and failFunc. It should be noted that failFunc() is deliberately coded incorrectly and
throws an ArrayIndexOutOfBoundsException at runtime, in order to demonstrate the working of
the proposed model.
public class Student {

    // checks if an int is even
    public static boolean isEven(int num) {
        return (num % 2 == 0);
    }

    // Reverses a String
    public static char[] reverseString(String S) {
        char[] s = S.toCharArray();
        int left = 0, right = s.length - 1;
        while (left < right) {
            char tmp = s[left];
            s[left++] = s[right];
            s[right--] = tmp;
        }
        return s;
    }

    // Sum two int
    private static int sum2(int a, int b) {
        return a + b;
    }

    // Function fails with error ArrayIndexOutOfBounds
    private static void failFunc(int[] a) {
        int i = a[1];
    }

    public static void main(String[] args) throws Exception {
        try {
            int[] arr = {1};
            System.out.println(isEven(Integer.valueOf(args[0])));
            System.out.println(reverseString(args[1]));
Once the changes are pushed to GitHub, and assuming the above is the latest version of
Student.java, GitHub sends a POST request to Jenkins CI and triggers a build of the pipeline
discussed in section 3.2.1.3. Based on the logic in the pipeline, it first builds the Jenkins-git
project, whose build step includes the following shell script:
Figure 4-1: Student's java programming assignment file to be graded.
javac Student.java
javac FeedBackMap.java

testNum=15
rem=$(($testNum%2))
testString='Apple'
SumNum1=2147483647
SumNum2=1

output=$(java Student $testNum $testString $SumNum1 $SumNum2)
arr=()
res=()

while read -r line; do
    IFS='.' read -a array <<< "$line"
    if ([ ${array[0]} == "java" ])
    then
        res4="failFunc is failing. Hints: "
        res4+=$(java FeedBackMap $line)
    else
        arr+=("$line")
    fi
done <<< "$output"

if ([ -n "${arr[0]}" ] && [ ${arr[0]} == "true" ] && [ $rem == 0 ]) || ([ -n "${arr[0]}" ] && [ ${arr[0]} == "false" ] && [ $rem == 1 ])
then
    res1="isEven test has passed for int input=$testNum."
else
    res1="isEven test has failed for int input=$testNum."
fi

if ([ -n "${arr[1]}" ] && [ ${arr[1]} == "elppA" ])
then
    res2="reverseString test has passed for string input=$testString."
else
    res2="reverseString test has failed for string input=$testString."
fi
As can be seen in the above shell script, it first compiles the Student.java file and the
FeedBackMap.java file (from chapter 3). Next, it runs the compiled Student class with the input
parameters set in the shell script and passed along with the command:
output=$(java Student $testNum $testString $SumNum1 $SumNum2)
After the Student.java file runs, its main method calls all four functions that were supposed
to be coded by students and prints their results one after the other. In this model, we leverage
the console output by reading it in the shell and storing it in a variable. Next, this output is
processed line by line, and the following two things are done:
1. Check for a runtime exception. In Student.java, all exceptions are handled using try-catch,
and the exception name is printed in the catch block. In the Jenkins shell, the runtime
exception is detected by the following pseudo code:
Pseudo code:
read each line of the output:
    split the line by “.” and store it in a temporary array called words
    if words[0] is equal to “java”, an error is detected:
        call FeedbackMap with error-name
if ([ -n "${arr[2]}" ] && [ ${arr[2]} -gt -1 ])
then
    res3="sum2 test has passed for inputs: $SumNum1 and $SumNum2."
else
    res3="sum2 test has failed for inputs: $SumNum1 and $SumNum2. Hint: Did you check for maximum integer value Java int can handle?"
fi

echo "start-here\n <br>${res1}</br> <br>${res2}</br> <br>${res3}</br> <br>${res4}</br>\n end-here\n"
Figure 4-2: Shell script as a part of build step on Jenkins-git project.
After FeedBackMap.java is called with the error name, as per the logic discussed in
chapter 3 for FeedBackMap.java, it prints the hints for the corresponding error to the console.
The logic for detecting runtime exceptions can be improved over time, but for simplicity, we
assumed that no function returns output of the form java.<some_string>, i.e. output whose
first word before “.” is “java”.
2. If no runtime exception is detected, the output line is stored in an array, which is later
used for testing.
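The two steps above can be sketched in Java, the language of the working model. This is only an illustration: the thesis performs the check in the Jenkins shell script, and the class name OutputClassifier, its methods, and the sample console output in main are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the thesis code.
public class OutputClassifier {
    // Lines treated as ordinary function results (step 2).
    final List<String> results = new ArrayList<>();
    // Lines treated as runtime exception names (step 1).
    final List<String> errors = new ArrayList<>();

    // Mirrors the pseudo code: split the line by "." and report an error
    // when the first word is exactly "java".
    static boolean isRuntimeError(String line) {
        String[] words = line.trim().split("\\.");
        return words.length > 0 && words[0].equals("java");
    }

    // Sorts every line of the console output into one of the two lists.
    static OutputClassifier classify(String output) {
        OutputClassifier classified = new OutputClassifier();
        for (String line : output.split("\\R")) {
            if (isRuntimeError(line)) {
                classified.errors.add(line.trim());
            } else {
                classified.results.add(line.trim());
            }
        }
        return classified;
    }

    public static void main(String[] args) {
        // Assumed sample output of Student.java, one line per function.
        String output = "false\nelppA\n-2147483648\n"
                + "java.lang.ArrayIndexOutOfBoundsException";
        OutputClassifier classified = classify(output);
        System.out.println("results: " + classified.results);
        System.out.println("errors: " + classified.errors);
    }
}
```

As in the shell script, the check relies on the assumption that no student function prints a line beginning with "java.".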
Once the output is stored in an array, as in the shell script in figure 4-2, it can be used to
design test cases in the shell itself. For example, the shell script contains three test cases
coded as if-else statements, which test the functionality of isEven(), reverseString(), and
sum2() in the Student.java file. After the output from calling FeedBackMap.java has been stored
in the variable $res4, and the if-else statements have been executed and their results stored in
the variables $res{i} (i ranging over [1:3]), the shell script echoes them to the console, wrapped
in HTML tags between “start-here” and “end-here”. The output is printed to the console in HTML
format because in chapter 3, where Jenkins-git was set up, step #5 had a post-build action that
configured the email settings and, as discussed there, the email includes all content after the
<start-here> tag and up to the <end-here> tag. The content type was also set to HTML, which is
why each result in the echo statement is surrounded by the HTML line-break tags <br> and </br>.
At the end of the execution of this script, an email is triggered and delivered to the recipient
list configured in the editable email notification section. The following is the email response
received by the recipient list in our working model:
Based on the email response, the student can infer that the isEven() and reverseString()
functions were coded correctly. Although the sum2 function is easy to design, one of its
requirements, to return the sum of two integers as an integer, clearly failed when one of the
integers passed as an argument was the maximum value that a Java int can hold. Hence, the
feedback in the test results for #3 is along those lines. The same test can be designed in Java
(say, in a new file testHandling.java), and the error can be added to the feedback.txt file with
an appropriate error naming convention. The only change required to support such logical errors
(as opposed to runtime errors) detected in testHandling.java is appropriate error detection in
the shell script. As discussed earlier in this chapter, the shell script detects runtime errors
by checking whether the first word before “.” is “java”; by passing the logical error name from
the testHandling.java file and matching it against the error names in the feedback.txt file,
hints can be computed for logical errors as well. Hence, this approach is capable of handling
runtime errors as well as logical errors in the desired language (as long as it is callable
from a shell script).
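A minimal sketch of what such a test could look like in Java is given below. It is illustrative only: the class name TestHandling, the error name logic.sum2.IntegerOverflow, and the stand-in sum2 method are hypothetical, and feedback.txt would need a matching line for that error name before any hint could be returned.

```java
import java.util.Optional;

// Hypothetical sketch of the testHandling.java idea; all names are illustrative.
public class TestHandling {
    // Assumed error name; feedback.txt would need a line starting with it.
    static final String OVERFLOW_ERROR = "logic.sum2.IntegerOverflow";

    // Stand-in for the student's sum2 from Student.java.
    static int sum2(int a, int b) {
        return a + b;
    }

    // Returns the logical error name when sum2 disagrees with the exact sum
    // computed in 64-bit arithmetic, and an empty Optional otherwise.
    static Optional<String> checkSum2(int a, int b) {
        long expected = (long) a + (long) b;
        if (sum2(a, b) != expected) {
            return Optional.of(OVERFLOW_ERROR);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Same inputs as SumNum1 and SumNum2 in the shell script.
        checkSum2(2147483647, 1).ifPresent(System.out::println);
    }
}
```

Because the printed name does not start with "java", the detection condition in the shell script would have to be extended to recognise it, which is the change the paragraph above refers to.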
Now, recall that building and running the Jenkins-git project was step 1 of the automated
feedback generation pipeline. Based on the pipeline script from figure 3-7, the pipeline now
creates a new HTML file that contains the git diff between the previous and current versions
and emails a link to it to the recipient list, as discussed in section 3.2.1.3 while setting up
the pipeline. The following is an example of the email received for the git diff:
Figure 4-3: Email response generated by Jenkins-git project.
The following is an example of the git diff page that opens on clicking the link in the above
email; it was generated by the second step in the pipeline:
It should also be noted that the average time from triggering the pipeline build (by GitHub,
after a code change is pushed) to receiving both the feedback on test results and the git diff
via email is around 16 s. This time will vary depending on the size of the files and the number
of tests that are run, but it is important to note that, compared to receiving feedback from a
TA or instructor, the time taken to receive feedback from the proposed model is significantly
lower.
Figure 4-4: Email response generated from Jenkins pipeline for git-diff.
Figure 4-5: Git diff resulted page created from Jenkins pipeline.
The last phase of the proposed model, and the most important one, is the ability of the model
to improve over time via crowdsourcing (part of the solution was inspired by Socure Inc.'s [9]
engineering). When a student receives feedback in the form of an email, as shown in figure 4-3,
and they think the feedback was not enough and can come up with better feedback that helped
them solve the problem, they can simply reply to the email with that feedback (which can
potentially help other students in the future) for the given error. Since the instructor's or
TA's email is continuously polled by the Jenkins poll email project that was discussed and set
up in chapter 3, the UpdateFeedback.java file is then called and the updated feedback is
registered in the feedback.txt file. This makes the proposed model self-sustaining, with the
ability to improve over time with minimal intervention from the TA or instructor. As mentioned
in chapter 3, the format of the feedback received from the student matters, and hence they must
reply to the email in the following format:
error=<error_name>
feedback=<updated_feedback>
i.e. as key-value pairs, where error and feedback are the keys, which are used as variables in
the shell script of the poll email project. It should be noted that we faced bugs in the poll
email plugin, and hence it might require reconfiguration. After the updated feedback is
registered, if another student stumbles upon the same problem, they now have a better library
of hints available.
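To make the reply format concrete, the following Java sketch parses a reply body into key-value pairs and builds the corresponding feedback.txt line. It is an illustration under stated assumptions: in the thesis the keys are consumed as variables by the shell script of the poll email project, so the class ReplyParser, its methods, and the sample reply text are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of the thesis code.
public class ReplyParser {

    // Reads "key=value" lines; lines without a key before "=" are ignored.
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new HashMap<>();
        for (String line : body.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                fields.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    // Builds the "<errorName>; <feedback>" line, or an empty Optional
    // when either key is missing or has an empty value.
    static Optional<String> toFeedbackLine(Map<String, String> fields) {
        String error = fields.getOrDefault("error", "");
        String feedback = fields.getOrDefault("feedback", "");
        if (error.isEmpty() || feedback.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(error + "; " + feedback);
    }

    public static void main(String[] args) {
        // Assumed sample reply from a student.
        String reply = "error=java.lang.ArrayIndexOutOfBoundsException\n"
                + "feedback=Check the loop bound against the array length";
        toFeedbackLine(parse(reply)).ifPresent(System.out::println);
    }
}
```

Rejecting replies with a missing key keeps malformed lines out of feedback.txt, which matters because getFeedback depends on the <errorName>; <feedback> layout.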
Chapter 5
Discussion
There are a few areas in which this model can be improved and some future directions that
this study can lead to; this chapter focuses on them. Currently, the model stores hints and
error names as plain lines of text in no particular order; instead, they could be stored in a
JSON file, with each error and its hints as a JSON object, as shown below:
{
  "task1": {
    "java.lang.NullPointerException": {
      "hints": [
        "Did you try to use a variable that was not initialized before?"
      ]
    },
    "java.lang.ArrayIndexOutOfBoundsException": {
      "hints": [
        "The array is indexed at 0 in java",
        "Did you try to access an array's ith element whose value is not defined yet?"
      ]
    }
  }
}
Figure 5-1: JSON representation of feedback.txt file.
This change can improve the lookup time for getting hints: once the JSON file is parsed, the
hints for an error can be fetched directly by its error name, whereas the current brute-force
solution traverses each line of errors and hints and checks whether there is a match. Clearly,
if the list of errors and corresponding hints is large, the current solution is not scalable.
Additionally, JSON support can be added to Java or Scala projects in a simple way by using
Maven dependencies. Also, the getFeedback code, the updateFeedback code, and the unit/functional
tests designed for programming assignments can be hidden behind a client, and students can
simply use the API or client to get feedback for their programming assignments. This approach
will allow easy implementation and setup of the project for an instructor or TA, as well as the
ability to hide unit or functional tests from students.
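The faster lookup discussed above could be sketched as follows. To stay free of external dependencies, this sketch does not parse JSON; it builds an in-memory HashMap keyed by error name from lines in the existing feedback.txt format, which gives the same keyed access that a parsed JSON object would. The class FeedbackIndex and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the thesis code.
public class FeedbackIndex {
    // Lower-cased error name -> hints registered for that error.
    private final Map<String, List<String>> hintsByError = new HashMap<>();

    // Builds the index from lines of the form "<errorName>; <feedback>".
    static FeedbackIndex fromLines(List<String> lines) {
        FeedbackIndex index = new FeedbackIndex();
        for (String line : lines) {
            String[] words = line.split(";", 2);
            if (words.length == 2) {
                index.hintsByError
                        .computeIfAbsent(words[0].trim().toLowerCase(), key -> new ArrayList<>())
                        .add(words[1].trim());
            }
        }
        return index;
    }

    // Fetches the hints with one hash lookup instead of a line-by-line scan.
    List<String> hintsFor(String error) {
        return hintsByError.getOrDefault(error.trim().toLowerCase(), Collections.emptyList());
    }

    public static void main(String[] args) {
        FeedbackIndex index = fromLines(Arrays.asList(
                "java.lang.NullPointerException; Did you try to use a variable that was not initialized before?",
                "java.lang.ArrayIndexOutOfBoundsException; The array is indexed at 0 in java"));
        System.out.println(index.hintsFor("java.lang.NullPointerException"));
    }
}
```

The index is built once in a single pass over the entries, after which each lookup no longer depends on the number of stored errors.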
Next, the feedback.txt file stores errors and hints, but adding a frequency, i.e. the
number of times a particular hint was successful in helping a student resolve their issue,
would make this model more accurate. That is, the hints received by students would become more
accurate over time. The frequency count can be implemented by adding an extra variable, say
hintHelpful, to the feedback email provided to students, asking whether the hint was helpful.
The email response from the student would then look like the following:
error=java.lang.NullPointerException
hintHelpful=Yes
feedback=NA
The feedback variable is left empty or set to NA because it represents the better feedback
that the student thinks would have helped them resolve the situation; since the feedback
received via the email generated from the pipeline was enough, the hintHelpful variable is set
to Yes instead. On the backend, on receiving hintHelpful=Yes, the updateFeedback function would
simply increment the frequency count of the corresponding hint by 1. The next time the
getFeedback function is called on the same error, the hints would be returned in descending
order of frequency, so chances are that the first few hints will be enough for the student to
solve their bug. Hence, adding a frequency count would allow the model to learn over time.
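The frequency-count idea can be sketched in Java as follows. This is a minimal illustration for the hints of a single error, assuming the counts are kept in memory; the class HintRanker and its methods are hypothetical, and how the counts would be persisted in feedback.txt is left open, as in the text above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the thesis code.
public class HintRanker {
    // Hint text -> number of hintHelpful=Yes responses received for it.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Registers a hint with a count of zero unless it is already known.
    void addHint(String hint) {
        counts.putIfAbsent(hint, 0);
    }

    // Called on hintHelpful=Yes: adds 1 to the count of the hint.
    void markHelpful(String hint) {
        counts.merge(hint, 1, Integer::sum);
    }

    // Returns the hints in descending order of frequency. List.sort is
    // stable, so hints with equal counts keep their insertion order.
    List<String> ranked() {
        List<String> hints = new ArrayList<>(counts.keySet());
        hints.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return hints;
    }

    public static void main(String[] args) {
        HintRanker ranker = new HintRanker();
        ranker.addHint("The array is indexed at 0 in java");
        ranker.addHint("Did you try to access an array's ith element whose value is not defined yet?");
        ranker.markHelpful("Did you try to access an array's ith element whose value is not defined yet?");
        for (String hint : ranker.ranked()) {
            System.out.println(hint);
        }
    }
}
```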
Next, as discussed in chapter 3-2-1 while configuring email settings for extended email
plugin in Jenkins, there is a potential security threat for Gmail users, since in order to use the