
A-Wristocracy: Deep Learning on Wrist-worn Sensing for Recognition of User Complex Activities


Praneeth Vepakomma‡ Debraj De† Sajal K. Das† Shekhar Bhansali‡

†Department of Computer Science, Missouri University of Science & Technology
‡Department of Electrical and Computer Engineering, Florida International University
Email: praneeth.vepakomma@fiu.edu, [email protected], [email protected], sbhansa@fiu.edu

Abstract—In this work we present A-Wristocracy, a novel framework for recognizing very fine-grained and complex in-home activities of human users (particularly elderly people) with wrist-worn device sensing. Our A-Wristocracy system improves upon state-of-the-art work on in-home activity recognition using wearables, which can mostly detect coarse-grained ADLs (Activities of Daily Living) but not a large number of fine-grained and complex IADLs (Instrumental Activities of Daily Living), and cannot distinguish similar activities performed in different contexts (such as sit on floor vs. sit on bed vs. sit on sofa). Our solution enables accurate detection of in-home ADLs/IADLs and contextual activities, all of which are critically important in remote elderly care for tracking physical and cognitive capabilities. A-Wristocracy makes it feasible to classify a large number of fine-grained and complex activities through Deep Learning based data analytics, exploiting multi-modal sensing on a wrist-worn device. It requires only minimal additional infrastructure (a few Bluetooth beacons) for coarse-level location context, and it preserves direct user privacy by excluding camera/video imaging on the wearable and in the infrastructure. The classification procedure consists of practical feature-set extraction from the multi-modal wearable sensor suite, followed by a Deep Learning based supervised fine-level classification algorithm. We have collected exhaustive home-based ADL and IADL data from multiple users. Our classifier is validated to recognize 22 very fine-grained complex daily activities (a much larger number than the 6-12 activities detected by state-of-the-art works using wearables and no camera/video) with high average test accuracies of 90% or more for two users in two different home environments.

I. INTRODUCTION

The current state of elderly healthcare and care-giving systems in the United States entails significantly high Federal Government spending. Cognitive health conditions and cognitive diseases now largely constitute the concerns in elderly healthcare; examples prevalent in the elderly population are dementia, stroke, and Parkinson's disease. This calls for early detection of gradual cognitive changes in elderly people while they live their daily lives at home. Remote assessment of their daily fine-grained activity profiles may be useful in deciding when it is time to change medication, provide caregiver assistance, or move them to formal care facilities. Automated recognition and classification of fine-grained and complex in-home activity contexts of elderly people is thus a key component in designing applications for elderly healthcare and well-being.

Unfortunately, existing in-home activity context recognition works have serious limitations in terms of the granularity and complexity of detected activities, infrastructure cost, and privacy awareness. But the more important drawback is that most works recognize only basic coarse-grained ADLs (Activities of Daily Living) and very few complex IADLs (Instrumental Activities of Daily Living) [14]. ADLs are typically basic self-care skills that people usually learn during early childhood, such as sitting, standing, walking, watching TV, etc. IADLs are more complex tasks needed for independent living, such as cooking, housekeeping, doing laundry, and so on. ADLs typically require more physical or postural ability, while IADLs require more cognitive and decision-making skills. Also, related works are typically not able to distinguish similar activity states with different context, such as sit on floor vs. sit on bed vs. sit on sofa, which can be important for understanding the health conditions and state of well-being of elderly people at home. In this work we have designed the A-Wristocracy system, which can recognize in-home ADLs/IADLs and contextual activities of users with high accuracy, mainly by utilizing multi-modal sensing on a wrist-worn wearable and intelligent sensor data analytics. To the best of our knowledge, this is the first time that a total of 22 in-home activities are recognized mainly by a wearable device, which we believe is significant compared to the 6 to 12 activities recognized in related existing works.

There are mainly three categories of works in the literature of in-home activity context recognition: (i) only with wearable devices (e.g. [13], [16], [17]); (ii) only with static infrastructure based systems in the physical environment (e.g. [15]); and (iii) combined wearable and additional static infrastructure based systems (e.g. [9]). First, with wearable devices only, activity recognition is achieved by learning from data sensed by one or multiple carried or body-worn devices. Second, in static infrastructure based systems, activity recognition is achieved by learning from data sensed by systems deployed in the physical environment, such as sensor networks, RFID tags and readers, etc. Third, there is a line of work that performs activity recognition with a combination of sensed data from both wearable devices and additional static infrastructure.


TABLE I: Comparison of different categories of works on user in-home activity recognition.

| Work | Wearable sensors in use | Additional infrastructure in use | Activities recognized |
| D. Wilson et al. [15] | None | Motion detectors, break-beam sensors, pressure mats, and contact switches | Room-level tracking and basic activities such as sleeping in bed, user movement status |
| P. Gupta et al. [7] | Belt-clip accelerometer | None | 6 activities: walking, jumping, running, sit-to-stand/stand-to-sit, stand-to-kneel-to-stand, and being stationary |
| K. Zhan et al. [16] | Smartphone accelerometer and video | None | 12 activities: walking, going upstairs, going downstairs, drinking, stand up, sit down, sitting, reading, watching TV/monitor, writing, switch water-tap, hand-washing |
| N. Roy et al. [9] | Smartphone accelerometer and gyroscope | Ceiling-mounted infrared motion sensors | 6 "low-level" postural or motion activities (sitting, standing, walking, running, lying and climbing stairs), and 6 "high-level" semantic activities (cleaning, cooking, medication, sweeping, washing hands, watering plants) |
| Our proposed A-Wristocracy framework | Wrist-worn wearable multi-modal sensors: activity (accelerometer, gyroscope); ambient environment (temperature, atmospheric pressure, humidity); location context (Bluetooth message reception) | A few simple Bluetooth beacon location tags in the physical environment | Total of 22 complex fine-grained activity contexts across four activity classes: (i) Locomotive (walk indoor, run indoor); (ii) Semantic (use refrigerator, clean utensil, cooking, sit and eat, use bathroom sink, standing and talking); (iii) Transitional (indoor to outdoor, outdoor to indoor, walk upstairs, walk downstairs); (iv) Postural/relatively stationary (just stand, stand and lean on wall, lying on bed, sit on bed, sit on desk chair, lying on floor, sit on floor, lying on sofa, sit on sofa, sit on commode) |

As an example, the work in [9] combined networked motion-sensor based room-level location context with smartphone based activity sensing, to classify postural/locomotive activity states of multiple inhabitants in a Smart Environment. Other important works on activity recognition are [5], [2], [8]. Relevant works on multi-sensor fusion and context-awareness in Smart Environments include [10], [6].

The state-of-the-art works utilizing smartphone/smart wearable onboard sensors for activity recognition mostly use only the accelerometer and gyroscope for activity sensing, and GPS for outdoor location sensing. But current-generation and upcoming smartwatches and other wrist-worn devices are (or will be) equipped with more versatile sensing capabilities through multi-modal sensor arrays, which can provide streams of rich contextual information about subtle activities and the ambience of users. Our designed system selectively utilizes these for: (a) body locomotion sensing (accelerometer, gyroscope); (b) ambient environment condition sensing (barometric pressure, temperature, humidity); and (c) contextual location signature sensing through message reception from a few simple Bluetooth beacons deployed in the physical environment. To the best of our knowledge, few or no previous works have exploited wearable ambience sensing features and Bluetooth beacon location tags for complex in-home user activity recognition. All of this rich multi-modal sensor data on the wearable device can provide subtle but valuable contextual signatures, which can be efficiently learned and exploited for more fine-grained complex activity context recognition of human users. The comparative study and the significant contribution of our designed solution are illustrated in detail in Table I.

The main contributions of this proposed work are as follows:

• Wrist-worn wearable multi-modal sensor data based classification of complex, very fine-grained ADLs and IADLs. A Deep Learning Neural Network based classifier is utilized, and its parameters are tuned for enhanced classification performance.

• Leveraging inexpensive and light additional infrastructure, with a few Bluetooth beacons as location context tags, besides the user's wrist-worn wearable device.

• Performance validation with an exhaustive experimental study consisting of a large data collection of 22 complex in-home activities by two users in two different home environments.

The rest of the paper is organized as follows. Section II presents our proposed A-Wristocracy complex in-home ADL/IADL recognition framework in detail. Section III then presents detailed experimental evaluation and validation of A-Wristocracy. Finally, Section IV concludes this work with a discussion of future work.

II. A-WRISTOCRACY: COMPLEX IN-HOME ADL/IADL RECOGNITION

The wrist-worn wearable multi-modal sensing based approach of A-Wristocracy is illustrated in Figure 1. As shown in Figure 2, the proposed A-Wristocracy in-home activity classifier consists of a pipeline with the following phases: (i) sliding-window based data pre-processing followed by feature extraction on each of the sensor data streams; (ii) Deep Learning Neural Network based classification with parameter tuning.

A. Feature Extraction from Raw Sensor Data

Fig. 1: A-Wristocracy framework: fine-grained complex in-home ADL/IADL recognition with multi-modal sensing on a wearable wrist-worn device.

Accelerometer features: A number of works in the literature extract many features from the fast (due to the higher required sampling rate) accelerometer data stream, which can be time and resource consuming, especially for low-power wearables. Realizing this, and after a detailed study of the literature on accelerometer data processing in healthcare applications, A-Wristocracy is designed to use just the following 6 features from the accelerometer data for each sampled sliding window (window size typically chosen as 2 seconds): the mean and variance of the resultant acceleration $\sqrt{a_x^2 + a_y^2 + a_z^2}$ (where $a_x$, $a_y$ and $a_z$ are the accelerations along the X, Y and Z axes), the mean and variance of the first derivative of the resultant acceleration, and the mean and variance of the second derivative of the resultant acceleration. The resultant acceleration is a good measure of the degree of body movement due to activity, and it includes the effect of accelerometer signal variations along all three axes. The 3-axis accelerometer is sampled at 100 Hz, which is sufficient to capture acceleration-based human body movements.

It is important to note that A-Wristocracy doesn't use any axis-specific parameter of the accelerometer, for practical applicability in real-world scenarios. All 6 features are combined properties of all 3 axes, and thus are not affected by rotation or tilting of the wearable device. If the wearable device is rotated or tilted, the directions of all the axes change, changing the distribution of acceleration components across the 3 axes; for example, a wrist-worn wearable in vertical and in tilted positions will see totally different amounts of gravitational acceleration on the Z axis as the device orientation changes over time. A-Wristocracy thus forgoes some axis-specific features in order to support real-world applicability and practical usage.

Gyroscope features: As with the accelerometer, A-Wristocracy is designed to use just the following 6 features (again, none sensitive to a specific axis) from the gyroscope data for each sampled sliding window (window size typically chosen as 2 seconds): the mean and variance of the resultant (over the 3 axes) angular speed, the mean and variance of its first derivative, and the mean and variance of its second derivative. The 3-axis gyroscope is sampled at 100 Hz, which is sufficient to capture angular-momentum based human body movements.
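To make the window-feature computation concrete, the following R sketch (illustrative code, not the deployed implementation; the function and variable names are chosen for exposition, and the 50% window overlap follows the setting reported in Section III-B) computes the six orientation-invariant features from a 100 Hz 3-axis stream. The same routine applies unchanged to the gyroscope's angular rates.

```r
# Sketch: six orientation-invariant window features from a 100 Hz 3-axis
# stream (accelerometer or gyroscope). ax, ay, az are numeric sample vectors.
window_features <- function(ax, ay, az, fs = 100, win_s = 2, overlap = 0.5) {
  r  <- sqrt(ax^2 + ay^2 + az^2)   # resultant magnitude over the 3 axes
  d1 <- diff(r)                    # first derivative (per-sample difference)
  d2 <- diff(d1)                   # second derivative
  win    <- fs * win_s             # samples per window (200 at 100 Hz, 2 s)
  step   <- win * (1 - overlap)    # hop size; 50% overlap between windows
  starts <- seq(1, length(d2) - win + 1, by = step)
  t(sapply(starts, function(s) {
    i <- s:(s + win - 1)           # indices of the current window
    c(mean_r  = mean(r[i]),  var_r  = var(r[i]),
      mean_d1 = mean(d1[i]), var_d1 = var(d1[i]),
      mean_d2 = mean(d2[i]), var_d2 = var(d2[i]))
  }))
}
```

Each row of the returned matrix is one window's 6-dimensional feature vector; the ambient-sensor mean/variance features described next follow the same windowing.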

Temperature, humidity and pressure sensor features: The temperature, humidity and barometric pressure sensors are sampled at 1 Hz, 1 Hz and 5 Hz respectively. These sampling frequencies are sufficient to capture the typically much slower variation of ambient environment factors; the barometric pressure sensor is sampled a bit faster to capture fine changes in atmospheric pressure across different location contexts. A-Wristocracy is designed to use just the mean and variance of the windowed data for each of these sensors (window size typically chosen as 2 seconds).

Location context features: A-Wristocracy is designed to use the GPS sensor (if available) and Bluetooth message reception from Bluetooth beacons deployed in the infrastructure, both sampled at 1 Hz. Note that our A-Wristocracy system primarily targets ADL/IADL recognition in the indoor home environment. Although GPS signal is usually not available indoors, A-Wristocracy collects GPS data to recognize any activity of going outdoors or outdoor-indoor transitions. The more effective location feature used by A-Wristocracy, however, is the RSSI (received signal strength indicator) of Bluetooth beacon messages; this is one of the key novelties of this work. Simple, small, inexpensive Bluetooth beacon devices are gaining popularity in commercial sectors because of their very low deployment and management overhead. They are small transmitters that notify nearby devices of their presence, indicating the proximity of those devices to the beacons. We have exploited this practical feasibility in A-Wristocracy to provide some coarse-grained location context. From the beacon messages (broadcast by the beacons and received by the wrist-worn wearable), using the comparative (maximum) RSSI and the beacons' unique IDs (mapped to location tags), the wearable device can infer estimated location contexts such as bedroom, kitchen, bathroom, living room, etc.
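As a toy illustration of this RSSI-based location tagging (not the deployed code; the beacon IDs and room labels below are hypothetical), each scan simply maps the strongest-RSSI beacon heard to its configured room tag:

```r
# Hypothetical beacon-ID -> room-tag map, configured at deployment time.
beacon_map <- c(b01 = "bedroom", b02 = "kitchen",
                b03 = "bathroom", b04 = "living room")

# Coarse location context: the room tag of the strongest beacon heard.
room_from_scan <- function(ids, rssi) {
  if (length(ids) == 0) return(NA_character_)  # no beacon in range
  unname(beacon_map[ids[which.max(rssi)]])     # higher RSSI = closer beacon
}

room_from_scan(c("b02", "b04"), c(-61, -78))   # -> "kitchen"
```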

B. Deep Learning based Classifier Algorithm

The Deep Learning ([3], [4]) Neural Network that we applied consists of a multi-layer feed-forward artificial neural network with two hidden layers, trained with stochastic gradient descent using back-propagation. Our implementation is based on wrapper routines we wrote in R around 0xData's build of the H2O package [1] for R. H2O is an open-source in-memory platform for machine learning and predictive modeling. Further details of the Deep Learning based classifier setup are discussed in Section III-B.
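A minimal sketch of this kind of H2O training call in R is shown below. It is written against the current h2o package API (argument names may differ in the 0xData build referenced above), and it assumes a data frame features_df of windowed features with a factor column "activity" as the label; the hyperparameter values shown are those Section III-B arrives at.

```r
# Sketch: two-hidden-layer deep learning classifier via the h2o R package.
library(h2o)
h2o.init()

frame  <- as.h2o(features_df)                  # features + "activity" label
splits <- h2o.splitFrame(frame, ratios = 0.75) # 75%/25% train/test split

model <- h2o.deeplearning(
  x = setdiff(colnames(frame), "activity"),
  y = "activity",
  training_frame = splits[[1]],
  hidden = c(100, 800),                 # neurons in the two hidden layers
  activation = "RectifierWithDropout",  # rectifier units with dropout
  input_dropout_ratio = 0.2,            # input-layer dropout fraction
  hidden_dropout_ratios = c(0.5, 0.5),  # hidden-layer dropout fractions
  epochs = 50
)

h2o.performance(model, newdata = splits[[2]])  # held-out test performance
```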

III. EXPERIMENTAL EVALUATION AND ANALYSIS

A. Experiment Setup

We have tested the in-home ADL/IADL classification performance of A-Wristocracy on large datasets collected from two users in two separate home environments. User-1 performed 22 activities, while User-2 performed 19 activities. Each user wore a sensor data collection device on the wrist. We used the onboard sensors of a Samsung Galaxy S4 smartphone along with Gimbal Bluetooth beacons in the experiment setup. (It is important to note that the smartphone is used only as a multi-sensor data collection platform. As the next phase of this project, we are starting to design a prototype in a wrist-worn wearable form factor, as well as custom beacons for location tags, both using the RFDuino hardware platform.) The sensing device generates data from all three categories of sensors: movement activity (accelerometer, gyroscope), ambience (temperature, atmospheric pressure, humidity) and location (GPS, location tags from Bluetooth beacons). We have developed Android applications for: (i) collecting data from the selected onboard sensors, as well as receiving Bluetooth signals from the beacons deployed in different rooms of the home environment; and (ii) collecting ground truth information with proper timestamps. The sensor data collection application is installed on the wrist-worn smartphone, while the ground truth application is installed on an external observer's smartphone for recording the ground truth (the observer taps buttons belonging to every single activity to store the start time and end time of each activity). Both applications use time synchronization from an NTP server for fine timestamp accuracy. Each user's series of selected activities comprised on average 45 minutes of sensor data collection.

Fig. 2: Complex ADL activity classifier system overview in the proposed A-Wristocracy.

Fig. 3: Ground truth activities performed by User-1 (top) and the corresponding activities classified by A-Wristocracy (bottom).

B. ADL/IADL Recognition Accuracy with A-Wristocracy

We have evaluated the Deep Learning Neural Network on data from the wrist-worn sensors of the two users, with 4411 and 5413 records respectively containing the calculated features (with a 2-second sliding window and 50% overlap). This data was appended with the corresponding label denoting the ground truth activity of the person at that time instance. We then made a 75%-25% uniform random split of this data to form the train and test datasets for each user.

The Deep Learning model we used consists of a multi-layer feed-forward artificial neural network with two hidden layers, trained with stochastic gradient descent using back-propagation. To approximately determine the required number of neurons in each of the two hidden layers, as well as the type of activation function that gives the best out-of-sample error, we performed 5-fold cross-validation over a discrete grid of parameters and possible choices. The model with the parameter choices that produced the highest cross-validation accuracy was then used on the test dataset. We report the training, cross-validation and test accuracies for that model in Table II for User-1 and User-2 respectively. An example snapshot of the ground truth and classified activities of User-1 is illustrated in Figure 3.

Note that the no-information rate is the baseline accuracy obtainable without using any features (i.e., the prevalence of the largest class), and the Kappa statistic is a measure of concordance for categorical data that quantifies agreement relative to what would be expected by chance, with higher being better and a maximum possible value of 1 (see the formula below). Kappa is an especially good measure when classes are unbalanced, as in our dataset. To enforce regularization, we performed the dropout procedure of [12]: for the input layer, 0.2 is the fraction of features dropped for each training row, while for the hidden layers, 0.5 is the fraction of incoming weights dropped from training at that layer. Note that dropout is randomized for each training row; more specifically, a random user-given fraction of the incoming weights to each hidden layer is zeroed out during training, for each training row. This effectively trains exponentially many models at once, and can improve the generalization error of the model.
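For reference, the Kappa statistic mentioned above is the standard Cohen's kappa,

$$\kappa = \frac{p_o - p_e}{1 - p_e},$$

where $p_o$ is the observed agreement (the classifier's accuracy) and $p_e$ is the agreement expected by chance given the class marginals; $\kappa = 1$ indicates perfect agreement, while $\kappa = 0$ indicates agreement no better than chance.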

We train with 50 epochs, where an epoch is one pass over the training dataset. For the choice of activation function, we tried the hyperbolic tangent function; the Rectifier function, which outputs max(0, x) where x is the input value; and the Maxout function, which outputs the maximum coordinate of the input vector. Each of these activation functions was checked via cross-validation, along with dropout, for each choice of the number of neurons in each hidden layer from the discrete grid. We obtained optimal cross-validated performance with the Rectifier activation function with dropout, and hence went ahead with this choice.
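A sketch of this cross-validated grid search, again written against the current h2o R API (the grid shown is a condensed, illustrative subset of the one we searched; frame and splits are as in the earlier sketch):

```r
# Sketch: 5-fold cross-validated grid search over hidden-layer sizes and
# dropout-enabled activation functions (Tanh, Rectifier, Maxout).
grid <- h2o.grid(
  algorithm = "deeplearning",
  grid_id   = "adl_grid",
  x = setdiff(colnames(frame), "activity"),
  y = "activity",
  training_frame = splits[[1]],
  nfolds = 5,                            # 5-fold cross-validation
  epochs = 50,
  input_dropout_ratio = 0.2,
  hidden_dropout_ratios = c(0.5, 0.5),
  hyper_params = list(
    hidden = list(c(50, 50), c(100, 500), c(100, 800), c(100, 1200)),
    activation = c("TanhWithDropout", "RectifierWithDropout",
                   "MaxoutWithDropout")
  )
)

# Retrieve the model with the highest cross-validated accuracy.
sorted <- h2o.getGrid("adl_grid", sort_by = "accuracy", decreasing = TRUE)
best   <- h2o.getModel(sorted@model_ids[[1]])
```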

Table III shows the accuracies of the model for different choices of the number of neurons in each of the two hidden layers, on the dataset of User-2. Note that our grid was even more granular; we report this condensed version for the ease of the reader. The choice of 100 and 800 neurons respectively produced the optimal cross-validated accuracy, which we went ahead with in this case. Finally, Table IV and Table V show the test accuracies across each of the activities for User-1 and User-2 respectively, along with other performance measures for classification as defined in [11]. Furthermore, the confusion matrices of test predictions for User-1 and User-2 are illustrated in Figures 4 and 5 respectively.

TABLE II: Overall performance across two users.

| | Training Accuracy | Validation Accuracy | Test Accuracy | No-Information Rate | Test Kappa |
| User 1 | 97.23% | 90.94% | 93.06% | 12.11% | 0.926 |
| User 2 | 95.42% | 89.38% | 90.73% | 14.73% | 0.896 |

TABLE III: Grid search on the number of neurons per hidden layer vs. CV, train and test accuracies (%), for User-2.

| Hidden Layer 1 | Hidden Layer 2 | Training Accuracy | Validation Accuracy | Test Accuracy |
| 100 | 1200 | 94.18 | 88.96 | 87.45 |
| 100 | 1100 | 91.97 | 86.72 | 86.55 |
| 100 | 1000 | 92.38 | 88.66 | 89.82 |
| 100 | 900 | 95.31 | 89.08 | 89.27 |
| 100 | 800 | 95.42 | 89.38 | 90.73 |
| 100 | 700 | 95.65 | 89.02 | 90.36 |
| 100 | 500 | 94.36 | 89.21 | 88.55 |
| 50 | 50 | 86.71 | 81.92 | 84.00 |

Fig. 4: Confusion matrix of test predictions for User-1.
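For completeness, the per-activity measures in Tables IV and V follow the standard definitions (as in [11]): with TP, FP, TN and FN denoting the true/false positives/negatives for a given activity class,

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad \mathrm{PPV} = \frac{TP}{TP + FP}, \qquad \mathrm{NPV} = \frac{TN}{TN + FN},$$

$$\mathrm{Balanced\ Accuracy} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2}.$$

Prevalence, detection rate, and detection prevalence are the fractions of all test windows that truly belong to the class, are correctly detected as the class, and are predicted as the class, respectively.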

Fig. 5: Confusion matrix of test predictions for User-2.

IV. CONCLUSION

In this work we have proposed and designed A-Wristocracy, an innovative combination of multi-modal sensing and Deep Learning Neural Network based data analytics on a wrist-worn wearable device, for fine-grained and complex in-home ADL/IADL recognition. Our framework is validated to provide high classification accuracy for a variety of complex human activities in different categories: locomotive, semantic, transitional, and postural/relatively stationary. A-Wristocracy has been shown to classify 22 fine-grained activities with very high average testing accuracies of 93% and 90% for two users in two different home environments. As the next phase of this work, we are collaborating with doctors at a local hospital (Phelps County Regional Medical Center at Rolla, Missouri) to test the designed system and classification algorithm in elderly care, for tracking signs of mild dementia during independent living at home and in assisted living facilities.

TABLE IV: Performance on User-1 across various activities.

| Activity | Sensitivity | Specificity | PPV | NPV | Prevalence | Detection Rate | Detection Prevalence | Balanced Accuracy |
| (1) Indoor to Outdoor | 0.000 | 0.994 | 0.000 | 0.997 | 0.003 | 0.000 | 0.006 | 0.497 |
| (2) Lying on Bed | 0.973 | 0.998 | 0.973 | 0.998 | 0.055 | 0.053 | 0.055 | 0.986 |
| (3) Lying on Floor | 0.969 | 0.989 | 0.816 | 0.998 | 0.047 | 0.046 | 0.056 | 0.979 |
| (4) Lying on Sofa | 1.000 | 1.000 | 1.000 | 1.000 | 0.028 | 0.028 | 0.028 | 1.000 |
| (5) Outdoor to Indoor | 0.000 | 0.999 | 0.000 | 0.999 | 0.001 | 0.000 | 0.001 | 0.499 |
| (6) Running Indoor | 0.929 | 0.998 | 0.929 | 0.998 | 0.021 | 0.019 | 0.021 | 0.964 |
| (7) Sitting on Bed | 1.000 | 0.998 | 0.967 | 1.000 | 0.043 | 0.043 | 0.044 | 0.999 |
| (8) Sitting on Chair | 0.938 | 1.000 | 1.000 | 0.997 | 0.047 | 0.044 | 0.044 | 0.969 |
| (9) Sitting on Commode | 0.977 | 0.998 | 0.977 | 0.998 | 0.064 | 0.062 | 0.064 | 0.988 |
| (10) Sitting on Floor | 0.865 | 0.998 | 0.978 | 0.989 | 0.077 | 0.066 | 0.068 | 0.932 |
| (11) Sitting on Sofa | 0.967 | 1.000 | 1.000 | 0.998 | 0.044 | 0.043 | 0.043 | 0.983 |
| (12) Sitting while Eating | 1.000 | 1.000 | 1.000 | 1.000 | 0.071 | 0.071 | 0.071 | 1.000 |
| (13) Stand Fridge | 0.750 | 0.997 | 0.857 | 0.994 | 0.024 | 0.018 | 0.021 | 0.873 |
| (14) Standing & Just Stand | 0.963 | 0.992 | 0.941 | 0.995 | 0.121 | 0.117 | 0.124 | 0.978 |
| (15) Standing & Leaned on Wall | 0.967 | 0.988 | 0.784 | 0.998 | 0.044 | 0.043 | 0.055 | 0.977 |
| (16) Standing while Cleaning Utensils | 0.906 | 0.991 | 0.829 | 0.995 | 0.047 | 0.043 | 0.052 | 0.948 |
| (17) Standing while Cooking | 0.857 | 1.000 | 1.000 | 0.997 | 0.021 | 0.018 | 0.018 | 0.929 |
| (18) Standing while Talking | 0.872 | 0.997 | 0.953 | 0.991 | 0.069 | 0.061 | 0.064 | 0.935 |
| (19) Standing while Using Sink | 0.971 | 0.998 | 0.971 | 0.998 | 0.052 | 0.050 | 0.052 | 0.985 |
| (20) Walking Downstairs | 0.824 | 0.998 | 0.933 | 0.995 | 0.025 | 0.021 | 0.022 | 0.911 |
| (21) Walking Just | 0.889 | 0.997 | 0.960 | 0.990 | 0.080 | 0.071 | 0.074 | 0.943 |
| (22) Walking Upstairs | 0.909 | 0.996 | 0.769 | 0.998 | 0.016 | 0.015 | 0.019 | 0.952 |

TABLE V: Performance on User-2 across various activities.

| Activity | Sensitivity | Specificity | PPV | NPV | Prevalence | Detection Rate | Detection Prevalence | Balanced Accuracy |
| (2) Lying on Bed | 0.969 | 0.992 | 0.940 | 0.996 | 0.118 | 0.115 | 0.121 | 0.981 |
| (3) Lying on Floor | 0.842 | 1.000 | 1.000 | 0.994 | 0.035 | 0.029 | 0.029 | 0.921 |
| (4) Lying on Sofa | 0.928 | 0.994 | 0.812 | 0.998 | 0.025 | 0.023 | 0.029 | 0.961 |
| (6) Running Indoor | 1.000 | 0.998 | 0.900 | 1.000 | 0.016 | 0.016 | 0.018 | 0.999 |
| (7) Sitting on Bed | 1.000 | 0.990 | 0.900 | 1.000 | 0.081 | 0.081 | 0.090 | 0.995 |
| (9) Sitting on Commode | 1.000 | 0.998 | 0.969 | 1.000 | 0.056 | 0.056 | 0.058 | 0.999 |
| (10) Sitting on Floor | 0.821 | 0.996 | 0.920 | 0.990 | 0.050 | 0.041 | 0.045 | 0.908 |
| (11) Sitting on Sofa | 0.833 | 0.998 | 0.909 | 0.996 | 0.021 | 0.018 | 0.020 | 0.915 |
| (12) Sitting while Eating | 1.000 | 0.996 | 0.937 | 1.000 | 0.054 | 0.054 | 0.058 | 0.998 |
| (13) Stand Fridge | 0.933 | 1.000 | 1.000 | 0.998 | 0.027 | 0.025 | 0.025 | 0.966 |
| (14) Standing & Just Stand | 0.641 | 0.993 | 0.945 | 0.941 | 0.147 | 0.094 | 0.100 | 0.817 |
| (15) Standing & Leaned on Wall | 0.947 | 0.951 | 0.409 | 0.998 | 0.034 | 0.032 | 0.080 | 0.949 |
| (16) Standing while Cleaning Utensils | 0.944 | 0.992 | 0.810 | 0.998 | 0.032 | 0.030 | 0.038 | 0.968 |
| (17) Standing while Cooking | 0.650 | 0.992 | 0.764 | 0.986 | 0.036 | 0.023 | 0.031 | 0.821 |
| (18) Standing while Talking | 0.945 | 0.998 | 0.972 | 0.996 | 0.067 | 0.063 | 0.065 | 0.972 |
| (19) Standing while Using Sink | 1.000 | 0.998 | 0.958 | 1.000 | 0.041 | 0.041 | 0.043 | 0.999 |
| (20) Walking Downstairs | 0.923 | 0.990 | 0.705 | 0.998 | 0.023 | 0.021 | 0.030 | 0.956 |
| (21) Walking Just | 0.929 | 0.997 | 0.981 | 0.992 | 0.104 | 0.096 | 0.098 | 0.963 |
| (22) Walking Upstairs | 0.643 | 1.000 | 1.000 | 0.991 | 0.025 | 0.016 | 0.016 | 0.821 |

REFERENCES

[1] H2O package in R. http://docs.0xdata.com/Ruser/Rinstall.html.
[2] L. Bao and S. S. Intille. Activity recognition from user-annotated acceleration data, pages 1–17. Springer, 2004.
[3] Y. Bengio. Learning deep architectures for AI. Found. Trends Mach. Learn., 2(1):1–127, Jan. 2009.
[4] Y. Bengio, I. J. Goodfellow, and A. Courville. Deep learning. Book in preparation for MIT Press, 2014.
[5] H.-T. Cheng, F.-T. Sun, M. Griss, P. Davis, J. Li, and D. You. NuActiv: Recognizing unseen new activities using semantic attribute-based learning. In MobiSys'13. ACM, 2013.
[6] D. J. Cook and S. K. Das. How smart are our environments? An updated look at the state of the art. Pervasive Mob. Comput., 3(2):53–73, Mar. 2007.
[7] P. Gupta and T. Dallas. Feature selection and activity recognition system using a single triaxial accelerometer. IEEE Transactions on Biomedical Engineering, 61(6):1780–1786, June 2014.
[8] H. Pirsiavash and D. Ramanan. Detecting activities of daily living in first-person camera views. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
[9] N. Roy, A. Misra, and D. Cook. Infrastructure-assisted smartphone-based ADL recognition in multi-inhabitant smart environments. In Pervasive Computing and Communications (PerCom), 2013 IEEE International Conference on, pages 38–46, March 2013.
[10] N. Roy, A. Misra, C. Julien, S. K. Das, and J. Biswas. An energy-efficient quality adaptive framework for multi-modal sensor context recognition. In Pervasive Computing and Communications (PerCom), 2011 IEEE International Conference on, pages 63–73, March 2011.
[11] M. Sokolova and G. Lapalme. A systematic analysis of performance measures for classification tasks. Inf. Process. Manage., 45(4):427–437, July 2009.
[12] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15(1):1929–1958, Jan. 2014.
[13] L. Wang, T. Gu, X. Tao, H. Chen, and J. Lu. Recognizing multi-user activities using wearable sensors in a smart home. Pervasive Mob. Comput., 7(3):287–298, June 2011.
[14] W. D. Spector, S. Katz, J. B. Murphy, and J. P. Fulton. The hierarchical relationship between activities of daily living and instrumental activities of daily living. Journal of Chronic Diseases, 40(6):481–489, 1987.
[15] D. H. Wilson and C. Atkeson. Simultaneous tracking and activity recognition (STAR) using many anonymous, binary sensors. In Proceedings of the Third International Conference on Pervasive Computing, PERVASIVE'05, pages 62–79, Berlin, Heidelberg, 2005. Springer-Verlag.
[16] K. Zhan, S. Faux, and F. Ramos. Multi-scale conditional random fields for first-person activity recognition. In Pervasive Computing and Communications (PerCom), 2014 IEEE International Conference on, pages 51–59, March 2014.
[17] Z. Zhao, Y. Chen, J. Liu, Z. Shen, and M. Liu. Cross-people mobile-phone based activity recognition. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Three, IJCAI'11, pages 2545–2550. AAAI Press, 2011.