PersOnalized Smart Environments to increase Inclusion of people with DOwn’s syNdrome

Deliverable D6.1 Evaluation Protocols

Call: FP7-ICT-2013-10
Objective: ICT-2013.5.3 ICT for smart and personalised inclusion
Contractual delivery date: 30.04.2014 (M6)
Actual delivery date: 15.04.2015
Version: v2
Editors: Andreas Braun (FhG), Anna Zirk (BIS)
Contributors: Silvia Rus (FhG), Katrine Prince Moe (Karde)
Reviewers: Juan Carlos Augusto (MU), Terje Grimstad (Karde)
Dissemination level: Public
Number of pages: 18
The deliverable D6.1 - Evaluation Protocols - is the first deliverable of Work Package 6 - Validation. The deliverable is linked to task T6.1.
4.1 Best practice of system evaluation
4.2 POSEIDON requirements
4.3 Evaluating requirements
4.5 Result Template
5 User experience evaluation of POSEIDON
5.1 Guidelines for Pilots
5.1.1 Ethical approval
5.1.2 Recruitment procedures
5.1.3 Eligibility criteria for people with DS
5.1.4 Eligibility criteria for caregiver
5.1.5 Exit strategy
5.1.6 Pre-pilots
Grade  Criticality           Description
A      Not critical          Not fulfilling this specific requirement will not interfere with the overall functionality of the system; mostly suited for optional requirements.
B      Potentially critical  If not fulfilled, this requirement has some impact on the system or user experience, but the impact is considered low.
C      Very critical         If the requirement is not fulfilled, the system is expected to behave unexpectedly or not provide the minimum required functionality.
F      Fatal                 Not fulfilling this requirement breaks the system experience and prevents main functions from working.
The last component is the estimated time required to fix the requirement so that it adheres to the specified level. This is important for estimating the resources that will be required to fix the problem and how they can be mapped onto the remaining project timeline and the development queue until the next iteration is due. The time required to fix is quantized similarly to the previous factors, as shown in the following table. It should be noted that the estimated time should include testing of the fixed requirement:
Table 6 Scores associated with the estimated time to fix an unmet requirement
Grade Estimated time to fix
A < 1 hour
B < 1 day
C < 1 week
F > 1 week
To calculate the risk score, the assigned grades have to be associated with numeric values. We use the simple association in the following table:
Table 7 Association between grades and numeric values
Grade Numeric value
A 0
B 1
C 3
F 5
All components required to calculate the risk score are now in place. For example, for a requirement with a low adherence level (3) that is not critical (0) and will take less than one day to fix and test (1), the resulting risk score is:
Risk Score = 3 * (0 + 1) = 3
A second example, for a requirement with a low adherence level (3) that is very critical (3) and takes more than one week to fix (5), is:
Risk Score = 3 * (3 + 5) = 24
The risk score is considerably higher than before, as the impact of not meeting this requirement on the project development roadmap can potentially be very high, primarily due to the long time required for a fix. The risk score for a fully met requirement is thus always 0, and there is obviously no fix required.
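The calculation described above can be sketched in a few lines of code. This is an illustrative sketch, not part of the POSEIDON deliverables; it assumes that the adherence level is graded on the same A-F scale and mapped via Table 7, and the function name is hypothetical:

```python
# Grade-to-value mapping from Table 7
GRADE_VALUES = {"A": 0, "B": 1, "C": 3, "F": 5}

def risk_score(adherence: str, criticality: str, time_to_fix: str) -> int:
    """Risk score = adherence value * (criticality value + time-to-fix value)."""
    return GRADE_VALUES[adherence] * (
        GRADE_VALUES[criticality] + GRADE_VALUES[time_to_fix]
    )

# First example: low adherence (C), not critical (A), under a day to fix (B)
print(risk_score("C", "A", "B"))  # 3 * (0 + 1) = 3
# Second example: low adherence (C), very critical (C), over a week to fix (F)
print(risk_score("C", "C", "F"))  # 3 * (3 + 5) = 24
# A fully met requirement (adherence A) always scores 0
print(risk_score("A", "F", "F"))  # 0 * (5 + 5) = 0
```

Because the adherence value multiplies the sum, any fully met requirement yields a score of 0 regardless of criticality or fix time, matching the statement above.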
5.2 Methods
The pilots will combine several methods and approaches. Data will be collected both quantitatively and qualitatively with the help of different methods and instruments. Gathering quantitative and qualitative, objective and subjective data, as well as both observing and asking primary and secondary users, ensures a comprehensive view of the development process. This approach also helps to validate subjective data gained from the primary users, who may, for instance, have difficulty describing their experiences in interviews.
5.2.1 Controlled tasks versus free usage
On the one hand, participants will be given the chance to explore POSEIDON and all its functions on their own during the one-month pilot. That means all primary users will receive a smartphone and an interactive table, which enables them to use POSEIDON whenever they want to or whenever help is needed. On the other hand, several controlled tasks will be conducted. This approach enables the researcher to gain standardized information about how people with DS use POSEIDON. These controlled tasks will be used to make sure that the main functions of POSEIDON are used and evaluated by people with DS during the pilots. To this end, participants will be asked to complete different tasks, consisting of several subtasks, that together cover all of the features tested during the trials. These tasks will take place at different stages during the one-month testing period. People with DS will be instructed by a researcher, who will explain the different tasks and ask the person with DS to complete them with the help of POSEIDON.
5.2.2 Observation
Observation will take place while the controlled tasks are conducted. For each subtask the researcher notes down:
Wrong turns: i.e. taps by the participant that do not lead to completing the subtask.
For each subtask and “wrong turn” the following should be recorded:
o Level of hesitation or confusion shown (on a scale of 1-5, with 1 being very little hesitation) and a description of the hesitation or confusion. Did the person look distressed or distracted, etc.?
o Comments and questions voiced by the participant: a record of what was asked and when.
o Hints or help given: what hints or tips were given and when. These could be coded with the help of prompting guidance.
o A subjective measure of the ease of task completion on a scale of 1-5.
Task XY
Subtask    Wrong turns    Hesitation or confusion    Comments or questions asked    Hints or tips given    Ease of task (1-5)
Subtask 1
Subtask 2
Subtask 3
Subtask 4
Subtask 5
Subtask 6
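One row of the observation template above could be captured as a simple record during data entry. The following is a minimal sketch under the assumption that observations are digitized; the class and field names are illustrative, not part of the POSEIDON specification:

```python
from dataclasses import dataclass, field

@dataclass
class SubtaskObservation:
    """One row of the observation template for a controlled task."""
    subtask: str
    wrong_turns: int = 0      # taps that do not lead to completing the subtask
    hesitation: int = 1       # scale 1-5, 1 = very little hesitation
    comments: list[str] = field(default_factory=list)  # comments/questions voiced
    hints: list[str] = field(default_factory=list)     # hints or tips given
    ease: int = 5             # subjective ease of completion, scale 1-5

# Example entry for one observed subtask
obs = SubtaskObservation(subtask="Subtask 1", wrong_turns=2, hesitation=3, ease=4)
print(obs.subtask, obs.wrong_turns)  # Subtask 1 2
```

Keeping one record per subtask makes it straightforward to aggregate wrong turns and ease ratings across participants later in the analysis.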
5.2.3 Video recording
When conducting the controlled tasks, participants will be recorded. This ensures that all important aspects of the interaction are captured. Afterwards there is enough time to analyse the video recordings.
Additionally, the “again-again table” will be used [6]. It lists activities on the left-hand side and has three columns headed “Yes”, “Maybe” and “No”. By using this instrument we try to answer whether there is a desire to do things again. The “again-again table” will be displayed automatically after finishing an interaction, and participants will be asked: “Can you imagine using XX in the future?”
5.2.9 Scores
Scores can be used to assess immediate and long-term learning effects. Participants will receive scores, for instance, when training with the stationary navigation system. If the scores increase over time, we can conclude that learning success is increasing as well.
5.2.10 Limited Evaluation of POSEIDON
Whilst 18 users engaging over a month will provide rich data on all elements of POSEIDON, more evaluation is needed. For that reason, we will evaluate a limited version of POSEIDON with at least 30 users across all countries. The DSAs will help with the recruitment. They can send out information about the project and the app and ask whether anyone is interested in testing it. Conferences and workshops in which the POSEIDON project takes part will be used to introduce the POSEIDON system to large user groups. People interested in the POSEIDON technology will be encouraged to try it out. Researchers will observe and identify problems in usage. Afterwards the users will be interviewed and asked about their experiences.
[1] B. Beizer, Software system testing and quality assurance. Van Nostrand Reinhold Co., 1984.
[2] B. Nuseibeh and S. Easterbrook, “Requirements engineering: a roadmap,” in Proceedings of the Conference on the Future of Software Engineering, 2000, pp. 35–46.
[3] I. Sommerville and P. Sawyer, Requirements engineering: a good practice guide. John Wiley & Sons, Inc., 1997.
[4] K. Dowd, Beyond value at risk: the new science of risk management, vol. 3. Wiley Chichester, 1998.
[5] W. Barendregt and M. M. Bekker, “Developing a coding scheme for detecting usability and fun problems in computer games for young children,” Behavior Research Methods, vol. 38, no. 3, pp. 382–389, 2006, doi: 10.3758/BF03192791.
[6] J. C. Read, S. J. MacFarlane, and C. Casey, “Endurability, engagement and expectations: Measuring children’s fun,” in Interaction Design and Children, 2002, pp. 189–198.