SenSec: Mobile Application Security through Passive Sensing. Jiang Zhu, Pang Wu, Xiao Wang, Joy Ying Zhang, Carnegie Mellon University. January 31st, 2013

ICNC 2013 SenSec Presentation

Dec 12, 2014

Jiang Zhu

We introduce a new mobile system framework, SenSec, which uses passive sensory data to ensure the security of applications and data on mobile devices.
SenSec constantly collects sensory data from accelerometers, gyroscopes and magnetometers and constructs the gesture model of how a user uses the device.
SenSec calculates the sureness that the mobile device is being used by its owner.
Based on the sureness score, mobile devices can dynamically request the user to provide active authentication (such as a strong password), or disable certain features of the device to protect the user's privacy and information security.
In this paper, we model such gesture patterns through a continuous n-gram language model using a set of features constructed from these sensors. We built a mobile application prototype based on this model and used it to perform both user classification and user authentication experiments. User studies show that SenSec can achieve 75% accuracy in identifying users and 71.3% accuracy in detecting non-owners, with only 13.1% false alarms.
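As a rough illustration of the policy described above, the sketch below maps a sureness score to an action. Everything in it is an assumption made for illustration: the `decide` function, the `Action` names, the [0, 1] score range, and the `margin` parameter do not come from the SenSec prototype.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"          # sureness is high enough, no interruption
    CHALLENGE = "challenge"  # ask for active authentication (e.g. a password)
    RESTRICT = "restrict"    # disable sensitive features


def decide(sureness: float, sensitivity: float, margin: float = 0.2) -> Action:
    """Hypothetical rule: compare the sureness score with how sensitive
    the launched application is. Both values are assumed to lie in [0, 1]."""
    if not (0.0 <= sureness <= 1.0 and 0.0 <= sensitivity <= 1.0):
        raise ValueError("sureness and sensitivity must be in [0, 1]")
    if sureness >= sensitivity:
        return Action.ALLOW
    if sureness >= sensitivity - margin:
        return Action.CHALLENGE
    return Action.RESTRICT


# A banking app (high sensitivity) asks for more certainty than a game.
print(decide(sureness=0.6, sensitivity=0.9))
print(decide(sureness=0.6, sensitivity=0.3))
```

A per-application sensitivity value lets the same sureness score lead to different outcomes, which matches the idea that different applications may need different levels of protection.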
Transcript
  • 1. Jiang Zhu, Pang Wu, Xiao Wang, Joy Ying Zhang. Carnegie Mellon University. January 31st, 2013.
  • 2. Monitor and track user behavior on smartphones using various on-device sensors. Convert sensory traces and other context information to Personal Behavior Features. Build a continuous n-gram model with these features and use it to calculate Sureness Scores. Trigger various Authentication Schemes when a certain application is launched.
  • 4. The 329 organizations polled had collectively lost more than 86,000 devices, with an average cost of lost data of $49,246 per device, worth $2.1 billion or $6.4 million per organization. "The Billion Dollar Lost-Laptop Study," conducted by Intel Corporation and the Ponemon Institute, analyzed the scope and circumstances of missing laptop PCs. [Chart: Mobile Device Loss or Theft, vertical axis 0% to 60%.] Strategy One survey conducted among a U.S. sample of 3,017 adults aged 18 years and older, September 21-28, 2010, with an oversample in the top 20 cities (based on population).
  • 5. Application: different applications may have different sensitivities. Password: a major source of security vulnerabilities; easy to guess, reused, forgotten, shared. Usability: authentication is too frequent or sometimes too loose.
  • 7. [Diagram labels: Quantization, Clustering, Risk Analysis Tree, Sensor Fusion and Segmentation, Activity Recognition, Certainty of Risk, Application Sensitivity] > P(President Obama has signed the Bill of | Sports). The LM reflects the n-gram distribution of the training data: domain, genre, topics. With labeled behavior text data, we can train an LM for each activity type (walking-LM, running-LM) and classify the activity as
  • 10. User behavior at time t depends only on the last n-1 behaviors. A sequence of behaviors can be predicted by n consecutive locations in the past. Maximum Likelihood Estimation from training data by counting. MLE assigns zero probability to unseen n-grams. Incorporate a smoothing function (Katz): discount probability for observed grams; reserve probability for unseen grams.
  • 11. Convert feature vector series to label streams (dimension reduction). Use n-grams to model the sequence of the label stream for each sensory dimension (current state and transitions captured). Step window with assigned length. [Figure: example label streams using labels such as A1, A2, A4, G2, G5, W1, W2, P1, P3, P6.]
  • 12. Build n-gram models for M users/classes, m = 1, 2, 3, ..., M. Given a behavior text L, we estimate the probability that L is generated by the model from user m. The user classification problem is formulated as
  • 13. A binary classification problem: classify a user as the owner (1) or not (-1). Given a behavioral n-gram model and an observation r, evaluate the probability that a given user is the owner and check whether it exceeds a threshold. Given a sequence of behavior text L and a sensitivity threshold, validate whether L is generated by user m.
  • 14. [Figure: average log probability (vertical axis, 0 to 0.8) versus sliding window position (horizontal axis), with a low threshold, a high threshold, and marked points A, B, C, D.]
  • 15. [Pipeline diagram labels: Sensing, Preprocessing, Modeling, Inference; Feature Construction, Behavior Text Generation, N-gram Model; User Classifier, Classification, Binary Authentication, Threshold.]
  • 17. Accelerometer: features are used to summarize the acceleration stream and are calculated separately for each dimension [x, y, z, m]. Meta features: Total Time, Window Size. GPS: location string from the Google Maps API and mobility path. WiFi: SSIDs, RSSIs and path. Applications: bitmap of well-known applications. Application Traffic Pattern: TCP and UDP traffic pattern vectors [remote host, port, rate].
  • 18. Offline data collection (for training and testing): pick up the device from a desk; unlock the device using the right slide pattern; invoke the Email app from the "Home Screen"; lock the device by pressing the "Power" button; put the device back on the desk.
  • 20. 71.3% true-positive rate with a 13.1% false-positive rate.
  • 24. [Diagram labels: Quantization, Clustering, Risk Analysis Tree, Sensor Fusion and Segmentation, Activity Recognition, Certainty of Risk, Application Sensitivity < Application Access Control.] Experiments to discover anomalous usage: ~80% accuracy with only days of training data.
  • 25. Alpha test in June 2012; first Google Play Store release in October 2012. False positives: a 13% FPR still annoys users sometimes. Use an adaptive model: add the trace data recorded shortly before a false positive to the training data and update the model. Change passcode validation to a sliding pattern. A false positive will grant a free ride for a configurable duration. Assumption: a just-authenticated user should control the device for a given period of time. The Free Ride period ends immediately if an abrupt context change is detected. A newer version is scheduled to be released in January 2013.
  • 26. Extended data set for feature construction: TCP and UDP traffic, sound, ambient lighting, battery status, etc. Data and modeling: gain more insight into the data, features and factorized relationships among various sensors; try other classification methods and compare results (LR, SVM, Random Forest, etc.). Enhanced security of SenSec components: integration with the Android security framework and other applications. Privacy challenges: data collection, model training, privacy policy, etc. Energy efficiency.
  • 27. Thank you.
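The n-gram modeling, user classification and threshold-based authentication described in slides 10, 12 and 13 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SenSec implementation: it uses add-k smoothing as a simpler stand-in for the Katz smoothing named in the slides, and the class name `BehaviorNGramModel`, the helper functions, the label streams and the threshold value are all made up for the example.

```python
import math
from collections import Counter


def ngrams(labels, n):
    """Return all n-grams (as tuples) of a sequence of behavior labels."""
    return [tuple(labels[i:i + n]) for i in range(len(labels) - n + 1)]


class BehaviorNGramModel:
    """Add-k smoothed n-gram model (a simpler stand-in for Katz smoothing)."""

    def __init__(self, n=3, k=1.0):
        self.n = n
        self.k = k
        self.gram_counts = Counter()     # counts of full n-grams
        self.context_counts = Counter()  # counts of their (n-1)-label contexts
        self.vocab = set()

    def train(self, labels):
        self.vocab.update(labels)
        for gram in ngrams(labels, self.n):
            self.gram_counts[gram] += 1
            self.context_counts[gram[:-1]] += 1

    def log_prob(self, gram):
        # Counting estimate with k added to every count, so unseen n-grams
        # keep a small non-zero probability. The +1 reserves probability
        # mass for labels never seen in training.
        size = len(self.vocab) + 1
        num = self.gram_counts[gram] + self.k
        den = self.context_counts[gram[:-1]] + self.k * size
        return math.log(num / den)

    def avg_log_prob(self, labels):
        grams = ngrams(labels, self.n)
        if not grams:
            raise ValueError("need at least n labels")
        return sum(self.log_prob(g) for g in grams) / len(grams)


def classify(models, labels):
    """User classification: pick the user whose model scores highest."""
    return max(models, key=lambda user: models[user].avg_log_prob(labels))


def is_owner(model, labels, threshold):
    """Authentication: accept if the average log probability of the
    behavior text under the owner's model reaches the threshold."""
    return model.avg_log_prob(labels) >= threshold


# Made-up label streams for two hypothetical users.
models = {"alice": BehaviorNGramModel(), "bob": BehaviorNGramModel()}
models["alice"].train("A1 A2 A1 A4".split() * 3)
models["bob"].train("A4 A4 A2 A2".split() * 3)

observed = "A1 A2 A1 A4 A1 A2".split()
print(classify(models, observed))
print(is_owner(models["alice"], observed, threshold=-0.9))
```

Raising the threshold makes the check stricter (fewer intruders accepted, more false alarms for the owner), which is the trade-off behind the low and high thresholds in slide 14.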