S2A: Secure Smart Household Appliances

Yuxin Chen
Department of Computer Science
Swiss Federal Institute of Technology (ETHZ)
Zurich, Switzerland
[email protected]

Bo Luo
Department of EECS
The University of Kansas
Lawrence, KS, USA
[email protected]

ABSTRACT
Security protection is an integral component of smart homes; however, smart appliance security has received little attention in the research community. Household appliances become very vulnerable if we introduce smart functions without proper security protection. In particular, smart access functions enable users to operate devices remotely. Meanwhile, smart devices are also designed to support residential demand response, i.e., to postpone non-urgent tasks to non-peak hours. However, remote adversaries could exploit such functions to manipulate smart appliances' operations without physically touching them. Such interference, if not properly handled, could damage the smart devices, disturb owners' lives, or even harm the household's physical security.
In this paper, we present S2A, a security protection solution to be embedded in smart appliances. First, a SUP model is developed to quantify penalties from device security, usability, and electricity price. We employ multi-criteria reinforcement learning to integrate the three factors and determine an optimal operation strategy. Next, to mitigate the risk of forged control commands or pricing data, we present a realtime assessment mechanism based on Bayesian inference. Risk indices are further integrated into the SUP model to serve as weighting factors of the corresponding decision criteria. Evaluation shows that S2A ensures appliance security while providing good usability and economic efficiency.

Categories and Subject Descriptors
K.6.5 [Management of Computing and Information Systems]: Security and Protection - Physical Security

General Terms
Security, Design

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CODASPY'12, February 7–9, 2012, San Antonio, Texas, USA. Copyright 2012 ACM 978-1-4503-1091-8/12/02 ...$10.00.

1. INTRODUCTION
As the next-generation standard for power generation and transmission, advanced computation and telecommunication capabilities are introduced into smart devices to constitute a large-scale smart grid network, and to support "smart" functions such as large-scale load balancing, dynamic pricing, and smart consumption. Unfortunately, most of the advanced functions, especially those on the power consumption side (i.e., the smart home side), are not yet implemented in the pilot projects. Security concerns are one of the major obstacles that prevent broad industrial adoption of such smart functions.
Smart appliances are envisioned to receive control commands and electricity prices from the network. Embedded control systems have been installed in household appliances, and manufacturers are starting to build appliances with remote access functions (a.k.a. smart access). For instance, LG products with THINQ technology were demonstrated at CES 2011. General Electric (GE) has been working with Tendril to connect GE household products over ZigBee wireless networks. Such smart access capabilities enable owners to remotely monitor and operate their devices using phones, tablets, or designated websites. On the other hand, smart meters are designed to receive realtime electricity pricing (RTP) and pass it on to household devices [14], which optimize energy consumption based on RTP [34]. However, smart appliances are not yet equipped with smart security protection mechanisms to defend against cyber attacks. For instance, they follow remote control commands without verifying the authenticity of such commands. In this context, if we introduce "smart" functions to electrical appliances without proper security protection, they become more vulnerable than conventional devices. Adversaries could manipulate or interfere with smart appliances' operations remotely, without physically touching them. More severely, when compromised devices are set to work in abnormal conditions for an extended period of time, they could be physically damaged, and could even compromise environmental safety. For instance, overheating in electric motors has been shown to be a root cause of insulation failures, which are very dangerous to users. In this paper, instead of focusing on the traditional security notions of confidentiality, integrity, and availability, we focus on the operational or physical safety of smart devices. Therefore, the security goal of the S2A approach is to ensure the physical safety of smart devices, preventing them from working in abnormal conditions when the smart control environment becomes unreliable.
In this paper, we present S2A (secure smart appliances), a security protection mechanism for smart appliances. S2A is an embedded software solution that employs machine learning technologies to provide smart and flexible protection for smart household appliances. First, for an individual smart appliance, S2A models the heterogeneous notions of device security (S), usability (U), and electricity pricing (P) as homogeneous benefit (or penalty) functions. We then employ multi-criteria reinforcement learning (MCRL) to integrate all three factors and determine the optimal operation strategy, which aims to maximize usability and minimize both security penalties and electricity costs. Next, to mitigate the risk of fake control commands or forged pricing data, we propose a real-time risk assessment and re-weighting mechanism. We invoke Bayesian inference approaches to evaluate the trustworthiness of the input from each channel, and adjust the parameters of the MCRL criteria accordingly. Through security analysis and simulation results, we show that our solution ensures appliance security while maintaining usability and the economic efficiency of power consumption.
Our contributions are: (1) We introduce a comprehensive security protection for smart household devices. Our solution integrates usability, electricity pricing, and device security to maximize the overall benefit (or minimize the overall penalty). (2) By employing machine learning methods, S2A provides effective and reliable security protection. Moreover, compared with the conventional security notion, which is black-or-white, S2A seamlessly integrates risk assessment into the decision algorithms without making a verdict of "safe" or "under attack". (3) We propose a flexible approach, in which the degree of protection and the quality of service depend on the available resources (e.g., historical data) and capabilities.

2. RELATED WORK
Smart grids are envisioned as the next-generation power system [50, 16, 40, 48]. Some vision and introductory papers can be found in [25, 11, 35, 6]. Existing research projects mostly focus on the "power grid" side (i.e., the macro grid), for example, large-scale dynamic load balancing, reliability and recovery, and the power market [45, 24, 38, 32]. On the other end of the spectrum, smart meters are being implemented [22], and smart meter communication systems are being deployed [41, 42, 1, 39]. Meanwhile, smart appliances are proposed to improve user experience and cost efficiency: realtime retail pricing (RTP) introduces dynamically changing electricity prices that reflect the realtime supply-vs-demand trend [4, 5]. RTP is delivered to smart meters and then to household appliances. With built-in intelligence, smart appliances could move non-urgent tasks to off-peak hours to enhance the economic efficiency of power usage [10]. Recently, [34] introduced a reinforcement learning approach to identify a relatively optimal time to start tasks. Tasks from the queue are picked for execution based on RTP and the length of the wait. On the other hand, systems have been designed to enable remote monitoring and control of household appliances through smart meters [47, 20, 36, 46].
Security and privacy protection is an important and challenging component in smart grids [26, 19]. A comprehensive survey is provided in [2]. In particular, [30, 31] studied the security requirements in the overall smart grid framework and presented security technologies to fulfill such requirements. [44] presents a conceptual framework to protect power grid automation systems. [7] points out security requirements and threats related to smart meters. [3] analyzes external intrusions, and introduces a specification-based detection approach as a potential solution. [27, 28] show that adversaries could attack the advanced metering infrastructure to manipulate power usage for energy theft.

Figure 1: Smart appliances receive control commands and realtime electricity pricing from remote sources.

Most of the above-mentioned security protection approaches focus on the power grid, from generators to distributors to smart meters. Meanwhile, security issues related to household appliances have been lightly studied in the context of ubiquitous computing and home-area networks [23, 33]. These studies are mostly concerned with wireless communication security, authentication, and privacy issues. For instance, [21] shows that in-home activities could be inferred from realtime energy consumption data. However, to the best of our knowledge, there has not been any work on protecting appliances' physical security, especially in the presence of untrustworthy external inputs (control commands and prices).

3. PROBLEM AND SOLUTION OVERVIEW

3.1 Smart Household Devices
Smart devices receive users' control commands and realtime pricing (RTP) from the network. As shown in Figure 1, utility distribution companies broadcast realtime electricity prices to households. Various proposals have been suggested in the literature. The more popular approach is to employ wired communication from utility companies to neighborhood collector devices, and wireless communication (e.g., wireless mesh) to further deliver the prices to smart meters. Smart meters then send RTP information to compatible smart appliances via home-area WiFi. Meanwhile, manufacturers such as LG and GE are starting to introduce remote control functions to smart appliances. In their designs, users send control commands via a designated website or a mobile app. The commands go through the Internet to be delivered to the appliances, which connect to the Internet through household WiFi. There are also proposals in which such commands could be delivered via smart meters.

3.2 The Threat Model
In a large-scale open platform with many stakeholders, a smart device cannot assume the absolute security of all its external peers. When adversaries penetrate the control systems or tamper with the communication channel, they could inject forged inputs (control commands and/or RTP data) into household smart devices, which may not be able to verify the authenticity and validity of such inputs. The interference, if not properly handled, could affect the owner's regular lifestyle, or even cause serious physical damage. Let us look at some examples:
Example 1: Electric vehicles (EVs) are designed to optimize the economic efficiency of power consumption, i.e., to charge the battery when the electricity price is low, and (optionally) to provide power to the household or the grid when the price is high (a.k.a. vehicle-to-grid [17, 18]). An intruder may send forged pricing data to trick an EV into operating improperly, in order to cause financial losses to the owner, to affect the load balancing of the power grid, or even to disrupt the grid for financial advantage.
Example 2: The battery life of an electric vehicle relies heavily on proper use and maintenance. An intruder may send forged, fluctuating pricing to trick the EV battery into constantly switching between charge and discharge for a relatively long period (e.g., starting and stopping charging 10 times per hour for 10 hours). This attack will seriously damage the battery, and may even cause hazardous conditions when the battery overheats.
Example 3: An adversary could penetrate the remote control systems or home-area networks to obtain control of household appliances. Such interference could affect the owner's regular lifestyle, or even cause serious physical damage. When an adversary sets all the heat-generating devices in a household to the maximum heat level simultaneously, the room temperature rises significantly. More dangerously, the circuit gets overloaded, and the risk of fire increases.
In this paper, we consider the situation where smart devices (excluding smart meters) receive potentially harmful inputs from ostensibly legitimate sources. We do not consider smart meters, since they are usually located outside of the household and are physically insecure. On the other hand, a malfunctioning smart meter will not directly threaten household safety. In our settings, each smart appliance in the residence functions as an agent that has an embedded control unit to manage its own operations. We assume that smart devices are physically secure, since they are usually located inside the household. We also assume that the embedded control systems are not compromised: the control logic is relatively simple; they only receive limited information (control and price) from designated sources; and software updates usually require physically touching the device (e.g., using a USB drive). Therefore, it is not easy to hack into the kernels of the smart devices remotely.
The goal of the paper is to protect the operational security of household smart devices in the presence of potentially harmful inputs from information (and command) distribution channels. We also aim to maintain usability (QoS) and economic efficiency. In particular, we study two channels that may carry suspicious inputs: user control commands (UCC) and realtime pricing (RTP). Meanwhile, based on the duration and frequency of suspicious inputs, we consider two types of threats: Threat 1, sporadic incidents, and Threat 2, continuous attacks. Continuous active attacks are potentially more damaging, and may not be handled by existing rule-based security protection mechanisms.
Please note that smart meters have essentially different capabilities and functionalities from household appliances. Our threat models and countermeasures are not applicable to smart meters. Related work on smart meter security is summarized in Section 2.

Figure 2: Battery charging system under active attack: (a) forged control commands that rapidly switch between charging and discharging; (b) charger operations with rule-based protection.

3.3 Rule-based Security Protection
At present, most household appliances, including smart devices, are equipped with embedded security protection mechanisms that are usually rule-based. For instance, when an air conditioner is switched off, its internal security protection mechanism will keep it off for n minutes before it can be restarted. Similarly, when a smart car stops charging, it is forced to wait for m minutes to avoid immediate recharging, which protects the battery. Some devices use sensors to obtain status information, and their security rules are based on sensor inputs. For example, an electric motor should stop for n minutes when the motor temperature is higher than x degrees. However, the rules are mostly designed to protect the device against users' misuse. They provide minimum protection, and do not consider future consequences. In particular, they can hardly protect the devices against active attacks, especially continuous attacks.
Example 4: Figure 2 gives an example of a battery charging system under active attack. Aiming to damage the battery, the attacker sends forged control commands that rapidly switch between charging and discharging. The embedded rule-based protection mechanism enforces an interval of t minutes between two charges, to protect the battery against transient power line faults. As shown in Figure 2 (b), for continuous attacks, such a protection mechanism will only increase the charging interval to t. Without a more sophisticated security protection mechanism, the battery is still damaged after an extended period of time.
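The limitation in Example 4 can be illustrated with a small simulation. The sketch below is a hypothetical stand-in for a rule-based lockout, not the mechanism of any particular product: the function name `simulate_lockout`, the command strings, the 3-minute attack period, and the 15-minute interval are all assumptions chosen for illustration.

```python
def simulate_lockout(commands, lockout_minutes):
    """Apply a minimum-interval rule to (minute, command) pairs.

    A "charge" command is accepted only if at least `lockout_minutes`
    have passed since the last accepted charge. Returns the minutes at
    which a charge was accepted.
    """
    accepted = []
    last_start = None
    for minute, command in commands:
        if command != "charge":
            continue  # only charge starts are rate-limited in this sketch
        if last_start is None or minute - last_start >= lockout_minutes:
            accepted.append(minute)
            last_start = minute
    return accepted


# Hypothetical continuous attack: a forged "charge" every 3 minutes,
# followed by a forged "discharge" one minute later, for 10 hours.
attack = []
for minute in range(0, 600, 3):
    attack.append((minute, "charge"))
    attack.append((minute + 1, "discharge"))

accepted = simulate_lockout(attack, lockout_minutes=15)
print(len(attack) // 2, "charge commands sent,", len(accepted), "accepted")
```

Under these assumed numbers the rule only stretches the interval between charge starts to 15 minutes; the battery is still cycled repeatedly over the 10 hours, which is the weakness the example describes.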
3.4 Solution Overview
In this paper, we propose the S2A framework as an embedded software solution to protect the operational security of smart household appliances against misuse and forged inputs. The goals are to: (1) ensure appliances' security, (2) maintain usability, and (3) reduce energy costs. Typically, smart appliances need to make appropriate tradeoffs between ensuring usability (e.g., the user wants to start the dishwasher) and minimizing energy cost (e.g., the smart dishwasher wants to wait for a low electricity price). Such requirements usually lead to a complicated optimization problem, which is difficult, if not impossible, to solve with rule-based methods.
We propose a two-phase solution, which enables smart devices to learn to protect themselves without requesting any support from "supernodes" or smart meters. In the first phase, we aim to achieve an optimized and fine-grained operational strategy. The SUP model considers security penalties, usability penalties, and economic benefits. In the second phase, we assess the trustworthiness of each input channel by comparing the instant input with historical data. Since user commands and RTP demonstrate very different patterns under regular working conditions, different intrusion detection mechanisms are invoked accordingly. We do not provide a verdict of "safe" or "under attack (forged input)". Instead, the security assessments are seamlessly fed back to the SUP model as weight factors of the corresponding penalty (or benefit) functions.

4. THE S2A FRAMEWORK

4.1 Overview of the S2A Framework
Figure 3 demonstrates the S2A framework. As shown, our solution consists of two major components (tiers): the SUP module and the realtime risk assessment module.
Tier 1: The SUP model. Tier 1 considers the basic scenario of the S2A framework, in which an appliance is an independent device without any knowledge of external historical or environmental information. Note that we assume a short operation log is available, which records a queue of user requests to use the device and the recent history of on-off operations. In the basic S2A solution, we define a SUP model to capture security, usability, and electricity price. In SUP, a security function s(t) is defined to model the operational penalty for the physical safety of smart electrical devices; a usability penalty function u(t) is defined to model the frustration of users (similar to [34]) when they wait for delayed operations; and finally, the real-time electricity price is captured by the smart pricing function p(t). When a user requests an S2A-enabled device to operate, SUP balances all three penalties to make a smart operation plan, so that the device always works in a safe working mode, the user will not be very unhappy because of a long wait, and the total cost of electricity to complete the task is relatively low. In our solution, we employ multi-criteria reinforcement learning (MCRL) to make real-time operational decisions based on three criteria: s(t), u(t), and p(t).
Tier 2: Real-time risk assessment for SUP. In the second tier of the S2A framework, we consider the trustworthiness of the user requests and the electricity pricing information. To protect smart appliances in the presence of suspicious control commands or price data, we use Bayesian inference (RRA-RTP and RRA-UCC in Figure 3) to assess the credibility of the inputs, i.e., the likelihood of tampered control commands or forged electricity prices. Note that we only evaluate the validity of remote data, not physical operations on the device (e.g., pushing a button on the washer is always considered to be a valid control command). The Bayesian inference modules compare current inputs with historical data, and generate two risk indices, RP and RU, which measure the trustworthiness of the electricity prices and the control commands, respectively. Unlike conventional intrusion detection approaches, we do not provide a verdict on whether the system is under attack or not. Instead, the risk factors are seamlessly integrated into SUP. For instance, the risk index for user commands (RU) is sent back to the SUP model to serve as a weighting factor for the usability penalty. In this way, when forged inputs are detected at tier 2, the risk factor of the suspicious input channel increases and its corresponding weight factor decreases, fading out the suspicious input.
Override Rules. To improve user experience and to give users better control, especially in unusual circumstances, the following override rules are enabled in S2A.
(1) In S2A, the user may force the task to be conducted without any delay, i.e., force the usability functions to override the smart-pricing functions. As a reference, in [34], the user could press the "start" button twice to start the operation instantly, without waiting for a low electricity price. However, the security penalty is still in place to ensure device security.
(2) For security purposes, we assume that the device can be turned off at any time. That is, there is no security penalty if the user intends to turn the device off.
(3) When users remotely request legitimate but unusual operations, the operations could appear highly suspicious to the realtime risk analysis module. To prevent legitimate requests from being denied or deferred, we introduce an additional task verification process, which is independent of the routine verification. Risk assessment can be overridden by this additional validation, so that critical (and irregular) tasks will not be delayed. Technically, we enforce an extra authentication step to verify the identity of the requestor. For verified tasks, we increase the weight for usability and decrease the weight for smart pricing. Once again, the security penalty is still in place.

4.2 The SUP Model
Overview. The core of the S2A framework is the SUP model. For a smart appliance without long-term memory, we first identify its operational states Ω (i.e., the total reward/penalty gained by leaving the previous states), and model state transitions as a set of actions A. For simplicity of description, we only consider the case in which appliances are either ON or OFF. Hence, we have four types of actions: A = {〈OFF→OFF〉, 〈OFF→ON〉, 〈ON→ON〉, 〈ON→OFF〉}. We model the process as a multi-objective Markov decision process (MMDP) that considers the following three criteria, and define penalty functions for each action with respect to each criterion. Note that we can easily add more modes (e.g., four modes: high, mid, low, off) by adding penalty functions. Our learning algorithm takes a general MMDP, with no restrictions on the number of states or actions.
Security Criterion. A security penalty function s(A, t) ∈ R+ is defined to denote the penalty of performing action A at time t. At a high level, the security penalty s() quantifies the potential for damaging the device (e.g., overheating the battery) and/or harming the environment (e.g., burning down the house). We only enforce penalties for turning on the device (〈OFF→ON〉) or keeping the device on (〈ON→ON〉). s(〈ON→ON〉, t) and s(〈OFF→ON〉, t) cannot coexist, since the current state is either ON or OFF. For simplicity of description, we use s(t) when there is no confusion. A larger s(t) indicates that the current working condition is not desirable, and pushes the decision against turning on the device or keeping it on.
The parameters of the penalty function are defined by the manufacturer of the appliance, based on the operational and security characteristics of the device. In our model, each device is equipped with a built-in function generator Gs(oper), which constructs penalty functions based on pre-defined rules, device states, and recent operations. Gs(oper) is triggered to refresh s(t) whenever an operation is performed (i.e., at a state change: 〈OFF→ON〉 or 〈ON→OFF〉). For "smarter" appliances, the security penalty is generated on-the-fly from sensor inputs (e.g., heat, environmental temperature, etc.). When the security penalty reaches MAX, it cannot be outweighed by the other penalty functions: the device should remain off until the security penalty drops.

Figure 3: The S2A Framework: SUP-MCRL: the security-usability-pricing model with multi-criteria reinforcement learning; RRA-RTP: realtime risk analysis on realtime electricity pricing; RRA-UCC: realtime risk analysis on users' control commands; RP and RU: risk factors.

Figure 4: Examples of security penalty functions.

Example 5: Some devices cannot operate for more than a pre-defined period of time; they need to stop and cool down. When the appliance is first switched on, Gs(oper) generates a security penalty function for keeping the device on (〈ON→ON〉). In this case, s(t) demonstrates an increasing pattern (Figure 4 (a)). When the device is switched off before reaching the maximum penalty, Gs(oper) refreshes s(t) to s(〈OFF→ON〉, t), which requires the device to stay off for a while, and then starts to decrease (Figure 4 (b)). On the other hand, some devices (e.g., batteries) cannot switch between on and off frequently. There could be no security penalty for continuing to charge (Figure 4 (c)), but the penalty function for 〈OFF→ON〉 reaches its maximum value once the device is turned off, hence preventing it from being switched on until a waiting period has passed (Figure 4 (d)).
Usability Criterion. A smart appliance receives a user request c(t) ∈ R+ indicating his/her desire to run the device at time t. Such a control command, however, does not necessarily start the device instantly, but rather specifies a reservation with the S2A system to run the device at an optimal (possibly later) time, to balance user utility with other factors such as economic efficiency and system security. With a user reservation at time t0, a certain quantity c(t0) = e0 of electricity is requested for the operation; otherwise c(t0) = 0, indicating that there is no reserved energy use at t0. User reservations are stored in a FIFO pending energy queue qi = {〈t0, e0〉, 〈t1, e1〉, . . .}.

Figure 5: Examples of usability penalty functions.

In the SUP model, we capture usability with a penalty function u(A, t) ∈ R+, which denotes the expected usability penalty when we take action A at time t. When the requested task is delayed (〈ON→OFF〉 or 〈OFF→OFF〉) due to a high electricity price or active security protection, the usability penalty u(t) starts to increase. Meanwhile, there is no usability penalty for turning on the device or keeping it on. In practice, u(t) cannot be detected on-the-fly; rather, it is calculated from a pre-defined usability penalty model, which is based on characteristics of the appliance's usage and user-centric analysis results. Note that u(t) represents the appliance's perception (guess) of users' dissatisfaction. When the operation pauses at time t0 (〈ON→OFF〉), the appliance immediately knows that task completion will be postponed. Hence, the penalty u(t) starts to increase at t0.
Example 6: Figure 5 shows some simple examples of the usability penalty function u(t). In Figure 5 (a), the task is paused at τ1, where the penalty function for 〈OFF→OFF〉 starts to increase linearly. In Figure 5 (b), the delayed task is restarted at time τs, and the expected completion time stops changing. Hence, the usability penalty (for 〈ON→OFF〉) stays constant until the task is completed at τ2. Meanwhile, if the user is aware of the task progress, the dissatisfaction level could decrease when s/he knows that the task is in progress and is expected to finish soon (Figure 5 (c)). Last, as shown in Figure 5 (d), user frustration may increase again when the task is paused at τ2, before its completion.
In S2A, the usability model is pre-built in the smart device by its manufacturer. u(t) is generated when a new task is picked from the task queue. It is refreshed when the operation of the appliance changes. In general, u(t) increases (usually nonlinearly) for 〈OFF→OFF〉, and stays stable or decreases when the task is progressing (〈ON→OFF〉). The model also takes in the recent working history and environmental parameters, so that u(t) is adjusted to users' everyday life. Different appliances will also have different patterns of usability penalty under different conditions. For instance, a user is less concerned when a smart car is being charged at night, but s/he may want the task to complete soon if the car is plugged in in the morning. In this paper, we model u(t) as an abstract function. Usability and user behavior modeling problems are studied in the human factors research community. Usability in the context of dynamic electricity pricing has been studied in the context of residential demand response (RDS), e.g., [34, 15, 12].

Figure 6: Example of a smart appliance operating under the SUP model.

Smart Pricing Criterion. Smart appliances are designed to receive realtime retail electricity prices (RTP) from the distributor. In the SUP model, the realtime price is provided by a function p(t) ∈ R+. Data from world-wide pilot projects have shown different patterns of electricity pricing. Most of them demonstrate a daily recurring pattern, which peaks in the early evening, decreases later into the night, and increases in the morning. Currently, our model only considers electricity cost. With reasonable modifications, it could be expanded to include more complicated cost models, which consider costs from multiple sources.
The SUP Model. The SUP model integrates all the criteria described above to minimize three factors: the security penalty, the usability penalty, and the total expense for the task. Before discussing the detailed learning algorithm, we show an intuitive example of how the SUP model works.
Example 7: As shown in Figure 6, the user submits a request at time t0 to an S2A-embedded device. Since the electricity price p(t) is low, the appliance starts instantly (note that the dashed line in the P plot represents the average electricity price, not a decision threshold; there is no preset decision threshold for any penalty function). We assume that this device cannot operate continuously for a very long period of time, so the security penalty starts to increase gradually. At t1, there is a sharp rise in the electricity price. Meanwhile, u(t) is very low at t1, since there has not been any delay until t1. At t1, the SUP model decides to pause the job. Starting from t1, the usability penalty starts to grow, since the completion time is expected to be postponed (we use a linear function to model the usability penalty in this example; however, real-world usability models are usually non-linear). The security penalty decreases while the device is off. At t2, the SUP model decides to switch the device on, based on the increasing usability penalty and the decreasing security penalty. At t3, due to a very high security penalty (e.g., the motor is very hot), the device is turned off again. The device cools down until t4, when it is restarted to get the task done at t6.
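The tradeoff in Example 7 can be mimicked with a one-step weighted comparison. This is explicitly not the MCRL algorithm of Section 5, which learns a policy over long-run discounted penalties; it is a greedy stand-in meant only to show how the three penalties and their weights interact. The function name `choose_on`, the unit weights, and the numeric penalty values are assumptions.

```python
def choose_on(s_on, u_off, price, weights=(1.0, 1.0, 1.0), s_max=1.0):
    """Greedy one-step rule: return True to run the device in the next step.

    s_on:    security penalty for turning or keeping the device on
    u_off:   usability penalty for keeping or turning the device off
    price:   electricity cost of running during the next step
    weights: (w_s, w_u, w_p), e.g. derived from the tier-2 risk indices
    """
    if s_on >= s_max:
        return False  # a maximal security penalty cannot be outweighed
    w_s, w_u, w_p = weights
    cost_of_running = w_s * s_on + w_p * price
    cost_of_waiting = w_u * u_off
    return cost_of_running < cost_of_waiting


print(choose_on(s_on=0.1, u_off=0.5, price=0.2))   # cheap power, task pending
print(choose_on(s_on=0.2, u_off=0.1, price=0.9))   # price spike, little delay so far
print(choose_on(s_on=1.0, u_off=10.0, price=0.0))  # device too hot to run
# Price channel distrusted (weight 0): the forged spike is ignored.
print(choose_on(s_on=0.2, u_off=0.5, price=0.9, weights=(1.0, 1.0, 0.0)))
```

The first three calls loosely correspond to the situations at t0, t1, and t3 in the example. The last call shows the intended effect of re-weighting: once the pricing channel's weight is reduced, a suspicious price no longer delays the task.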
5. THE ALGORITHMS
In this section, we describe the core algorithms that support smart protection in the S2A framework. First, we introduce multi-criteria reinforcement learning (MCRL) to determine the optimal operational strategy for the SUP model. Next, we introduce Bayesian-based realtime risk assessment, and seamlessly integrate the risk indices into the SUP model.

5.1 MCRL for SUPThe core problem in the SUP model is to learn an
optimal
operational behavior for a smart appliance in the presenceof
dynamic preferences/penalties introduced by multiple ob-jectives.
In SUP, a device is an independent agent, whichlearns an
approximately optimal strategy through trail anderror interactions
with the environmental variables. Thepending energy is defined as
the amount of energy that isrequired to finish the task. When the
input power to thedevice is stationary, its pending energy is
directly propor-tional to the remaining time to finish the task.
The environment of a smart appliance is described by a deterministic multi-objective Markov decision process (MMDP) $\langle \Omega, A, f, \vec{\rho} \rangle$, where $\Omega$ is the finite set of discrete states, $A$ is the set of actions, $f : \Omega \times A \to \Omega$ is the state transition function, and $\vec{\rho} : \Omega \times A \to \mathbb{R}^n$ is the vector-based penalty function. The state signal $x_k \in \Omega$ describes the environment at each discrete time step $k$. In SUP, $x_k$ encodes the device's current working status (i.e., whether the device is on or off), the current pending energy, the pricing information, the cumulative delay of the task, the duration since the last operation (i.e., how long the device has been on or off), etc. The learning agent can alter the state at each time step by taking an action $a_k \in A$: keeping the device on or off, or turning it off or on. As a result of the action $a_k$, the environment changes its state from $x_k$ to $x_{k+1} \in \Omega$ according to the state transition rule given by $f$: $x_{k+1} = f(x_k, a_k)$. The agent then receives an immediate vector-valued penalty for taking the action $a_k$, evaluated on multiple objectives, which is completely determined by the current state and action: $\vec{\phi}_{k+1} = \vec{\rho}(x_k, a_k)$.

In the SUP setting, each penalty criterion is associated with a weight in accordance with its reliability. Given a weight vector $\vec{w} = (w_1, \ldots, w_n)$ and an MMDP, a new MDP with vector-valued penalty functions is created by multiplying each penalty $\rho_i(x, a)$ of type $i$ by $w_i$. For a constant weight vector $\vec{w}$, the goal of the learning agent is to minimize the expected discounted penalty:
$$\Phi_k = E\Big\{ \sum_{j=0}^{\infty} \gamma^j \, \vec{\phi}_{k+j+1} \cdot \vec{w} \Big\} \qquad (1)$$
where $\gamma \in [0, 1)$ is the discount factor. It can be regarded as encoding increasing uncertainty about the penalties that will be received in the future. Such a discounted penalty compactly represents the penalty accumulated in the long run, and measures a policy's long-term performance.
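As an illustration of Equation 1, the following Python sketch evaluates the weighted discounted penalty over a finite, already observed trajectory, so the expectation and the infinite horizon are dropped. The function name, the three-component penalty layout (security, usability, price) and the numeric values are assumptions made only for this example.

```python
from typing import Sequence


def discounted_weighted_penalty(
    penalty_vectors: Sequence[Sequence[float]],
    weights: Sequence[float],
    gamma: float,
) -> float:
    """Finite-horizon version of Eq. (1): sum_j gamma^j * (phi_{k+j+1} . w)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    total = 0.0
    for j, phi in enumerate(penalty_vectors):
        # Scalarize the penalty vector with the weight vector.
        scalar = sum(p * w for p, w in zip(phi, weights))
        total += (gamma ** j) * scalar
    return total


# Hypothetical (security, usability, price) penalties for three time steps.
trajectory = [(1.0, 0.0, 2.0), (2.0, 1.0, 2.0), (0.0, 3.0, 1.0)]
weights = (1.0, 0.5, 0.25)
phi_k = discounted_weighted_penalty(trajectory, weights, gamma=0.5)
```

Later penalties contribute less to the total, which is the sense in which the discount factor encodes growing uncertainty about the future.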
For deterministic SUP models, the behavior of an agent is described by its policy $\pi : \Omega \to A$, which specifies how the agent chooses its actions given the state. The vector-based action-value function $\vec{Q}^{\pi} : \Omega \times A \to \mathbb{R}^n$ is the expected return of a state-action pair given the policy $\pi$: $\vec{Q}^{\pi}(x, a) = E\{ \sum_{j=0}^{\infty} \gamma^j \, \vec{\phi}_{k+j+1} \mathbin{.*} \vec{w} \mid x_k = x, a_k = a, \pi \}$, where $.*$ denotes the element-wise product, and the optimal Q-function is defined as $\vec{Q}^{*}(x, a) = \min_{\pi} \vec{Q}^{\pi}(x, a)$. It satisfies the Bellman optimality equation

$$\vec{Q}^{*}(x, a) = \vec{\rho}(x, a) + \gamma \min_{a'} \vec{Q}^{*}(f(x, a), a'), \quad \forall a' \in A \qquad (2)$$

where $a' = \arg\min_{a'} [\vec{w} \cdot \vec{Q}^{*}(f(x, a), a')]$ and $f(x, a)$ is the successor state.

The formula is derived from the original Q-learning [43], with a vector-based representation of the immediate and expected discounted penalty functions. The current estimate of $\vec{Q}^{*}$ is updated using estimated samples of the right-hand side of Equation 2. These samples are computed using actual experience with the task, in the form of weighted penalty vectors and pairs of subsequent states $x_k$, $x_{k+1}$:

$$\vec{Q}_{k+1}(x_k, a_k) = \vec{Q}_k(x_k, a_k) + \alpha_k \big[ \vec{\phi}_{k+1} + \gamma \vec{Q}_k(x_{k+1}, a') - \vec{Q}_k(x_k, a_k) \big] \qquad (3)$$

where $a' = \arg\min_{a'} [\vec{w} \cdot \vec{Q}_k(x_{k+1}, a')]$ and $\alpha_k$ is the learning rate.

In variable-penalty settings, we employ an efficient variable-transfer algorithm derived from [29]. Since the immediate penalty at each time step is a linear combination of different penalty factors (e.g., usability penalty, electricity cost), and the Q-value function (long-term penalty) is based on the sums of the immediate penalties, we can infer that the expected discounted penalty of policy $\pi$ starting from state $x$, $\vec{w} \cdot \vec{Q}^{\pi}(x, a)$, is also linear in the penalty weights.
In variable-penalty reinforcement learning, each weight vector corresponds to an individual Markov decision process. All the MDPs share the same transition dynamics (i.e., the same states, the same actions, the same transition function, etc.; for example, delaying a task will always increase user frustration), while their penalties are linear in a set of penalty features. Thus, given a new weight vector $\vec{w}_{new}$ and a starting state $x_k$, one can approximate the optimal policy $\pi_{new}$ for the new weight vector based on the already learned policy set $C$, simply by selecting the one with the minimum expected discounted penalty: $\pi_{new} = \arg\min_{\pi \in C} Q^{\pi}(x_k, a')$, where $a' = \arg\min_{a'} [\vec{w}_{new} \cdot \vec{Q}^{\pi}(x_k, a')]$.

SUP-MCRL is presented in Algorithm 1. In step 11, the agent tests all actions in all states with nonzero probability, which is an exploration-exploitation tradeoff problem. The agent uses the Boltzmann exploration strategy, which in state $x$ selects action $a$ with probability
$$\mathrm{Probability}(x, a) = \frac{e^{1/(\tau \, \vec{w} \cdot \vec{Q}(x, a))}}{\sum_{a'} e^{1/(\tau \, \vec{w} \cdot \vec{Q}(x, a'))}} \qquad (4)$$
where $\tau > 0$ controls the randomness of the exploration. When $\tau \to 0$, this is equivalent to greedy action selection. When $\tau \to \infty$, actions are selected uniformly at random. When $\tau \in (0, \infty)$, actions with lower penalties are more likely to be selected.
Algorithm 1 Multi-Criteria Reinforcement Learning for SUP
 1: $i \leftarrow 1$
 2: $c \leftarrow 0$
 3: $C \leftarrow \emptyset$
 4: $\pi_{init} \leftarrow \emptyset$
 5: repeat
 6:   Obtain the current weight vector $\vec{w}$ and the starting state $x_k$
 7:   if $C \neq \emptyset$ then
 8:     Compute $\pi_{init} \leftarrow \vec{w} \cdot \vec{Q}^{\pi}(x_k, a')$
 9:     Initialize the Q-function vectors of the states
10:   end if
11:   Learn the new policy $\pi'$ through vector Q-Learning
12:   if ($C = \emptyset$) or ($\vec{w} \cdot Q^{\pi_{init}}(x_k, a') - \vec{w} \cdot Q^{\pi'}(x_k, a'') > \gamma$) then
13:     $C \leftarrow C \cup \pi'$
14:     $c \leftarrow 0$
15:     $i \leftarrow i + 1$
16:   else
17:     $c \leftarrow c + 1$
18:   end if
19: until $c \geq \frac{1}{\epsilon} \ln \frac{(i+1)^2}{\delta}$
20: return $C$

5.2 Real-time Risk Assessment (RRA) for SUP
The above model assumes that all the inputs are valid. However, the control commands and price data could be fake, since the input channels from remote sources are vulnerable. Assuming (trusted) historical data is available, we can further evaluate the trustworthiness of current inputs by comparing them with the reference data. Due to the different characteristics of smart pricing signals and remote user control commands, we evaluate different input channels with different models. In particular, real-time pricing signals mostly show a periodic pattern that repeats daily, while user control commands are more likely to be scattered over a certain period of the day and usually follow diverse distributions. The RRA scheme flags anomalies when new patterns are not in accordance with the historic norm, and generates two risk indices, indicating the belief (for RTP) and the confidence (for remote user control commands) that the input sources are trustworthy, respectively.
5.2.1 RRA-RTP
The smart pricing signal $p$ in S2A is represented in terms of time-indexed stochastic variables. Suppose RTP cycles with period $T$. Rather than serializing real-time pricing data continuously over time, we model the current pricing by exploiting the periodic structure of historic pricing information, and treat each RTP $p^d_t$ at time $t$ ($t = 1, \ldots, T$) as a distinct stochastic process, which evolves over the cycle index $d$. For instance, if electricity pricing data evolves daily, the pricing sequence at midnight could be modeled as random variables that are indexed by date, as these pricing variables are more closely correlated and easier to infer.
For real-time risk assessment of smart pricing inputs, we choose a Hidden Markov Model (HMM), where the hidden states correspond to the working conditions $z_{1:T}$ of an appliance (i.e., time-indexed states indicating whether the appliance is under attack), and the observable states correspond to the real-time pricing states $p_{1:T}$ (and any other states that we could measure). Such a Dynamic Bayesian Network (DBN) encodes the joint probability distribution over those stochastic variables that capture the evolution of the dynamic working conditions. In particular, we adopt the following state transition model $P_t$ and observation model $P_o$:
$$z_t \sim P_t(\hat{z}_t \mid z_{t-1})$$
$$p_t \sim P_o(\hat{p}_t \mid p^{hist}_t, z_t)$$
where $\hat{z}_t$ and $\hat{p}_t$ are the predicted states, $p^{hist}_t$ denotes the historic pricing vector at time $t$, $p_t \in \mathbb{R}^+$ is the real-time pricing signal, and $z_t \in \{True, False\}$ denotes the unknown hidden state. The parameters of the conditional probability functions are known matrices that could be obtained or learned from the S2A system. Although we only consider smart pricing signals as observable states for simplicity of discussion, it is worth mentioning that our RRA-RTP algorithm is also applicable to DBNs with multi-dimensional observation states with minor modifications.
The aim of the analysis is to compute the posterior distribution of the hidden states $P(z_{0:t} \mid p_{1:t})$. Since the observation model could be non-Gaussian (e.g., daily electricity pricing may change significantly with the seasons), we employ a particle filtering (PF) algorithm [37] to approximate the probability distribution of the hidden variables. The basic idea is to establish a posterior probability distribution of the hidden variable by utilizing a large number of random samples. The samples are propagated over time in a sequential importance sampling (SIS) step and a subsequent resampling step: (1) the SIS step generates samples from a specific probability distribution and computes their associated weights; (2) the resampling step then multiplies and/or discards these samples to automatically concentrate them in regions of interest of the state space of the hidden variables.
Given $N$ particles $\{z^{(i)}_{0:t-1}\}_{i=1}^{N}$ at time $t-1$, approximately distributed according to $P(z^{(i)}_{0:t-1} \mid p_{1:t-1})$, particle filters enable us to compute $N$ particles $\{z^{(i)}_{0:t}\}_{i=1}^{N}$ approximately distributed according to the posterior $P(z^{(i)}_{0:t} \mid p_{1:t})$ at time $t$. As we cannot sample from the posterior directly, the PF update is achieved through an appropriate importance proposal distribution $Q(z_{0:t})$, from which we can generate samples:

$$Q(\hat{z}_{0:t} \mid p_{1:t}) = Q(\hat{z}_t \mid z_{0:t-1}, p_t) \, P(z_{0:t-1} \mid p_{1:t-1})$$

The samples from $Q(\cdot)$ must be weighted by the importance weights

$$w_t = \frac{P(\hat{z}_{0:t} \mid p_{1:t})}{Q(\hat{z}_{0:t} \mid p_{1:t})} \propto \frac{P_o(p_t \mid p^{hist}_t, \hat{z}_t) \, P_t(\hat{z}_t \mid z_{0:t-1})}{Q_t(\hat{z}_t \mid z_{0:t-1}, p_{1:t})} \qquad (5)$$

where $Q_t(\cdot \mid \cdot)$ denotes the choice of proposal distribution. To simplify the calculation, one can adopt the transition prior as the proposal distribution (i.e., $Q_t(\cdot \mid \cdot) = P_t(\cdot \mid \cdot)$) [13]. In this case, the weights are given by the likelihood function

$$w_t = P_o(p_t \mid p^{hist}_t, \hat{z}_t)$$

The detailed algorithm is shown in Algorithm 2.
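The bootstrap variant described above, with the transition prior as proposal and the likelihood as weight, can be sketched in Python for the binary hidden state $z_t$. Everything model-specific here is an assumption for illustration: the transition prior is a simple "stay with probability `p_stay`" rule, the observation model is a Gaussian around the historic price with a wider spread under attack, and the returned value is the fraction of particles with $z_t = True$ rather than the risk index itself.

```python
import math
import random
from typing import List, Sequence


def gaussian_pdf(x: float, mean: float, sigma: float) -> float:
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter_attack_belief(
    prices: Sequence[float],
    historic_prices: Sequence[float],
    n_particles: int = 1000,
    p_stay: float = 0.9,         # assumed transition prior P_t
    sigma_normal: float = 1.0,   # assumed spread of P_o when z_t is False
    sigma_attack: float = 5.0,   # assumed spread of P_o when z_t is True
    seed: int = 0,
) -> List[float]:
    """Return, for each step, the fraction of particles in the attack state."""
    rng = random.Random(seed)
    particles = [False] * n_particles  # assumption: no attack initially
    beliefs = []
    for p_t, p_hist in zip(prices, historic_prices):
        # Sampling step: propose each particle from the transition prior.
        proposed = [z if rng.random() < p_stay else (not z) for z in particles]
        # Importance weights reduce to the observation likelihood.
        weights = [gaussian_pdf(p_t, p_hist, sigma_attack if z else sigma_normal)
                   for z in proposed]
        if sum(weights) == 0.0:
            weights = [1.0] * n_particles  # degenerate case: fall back to uniform
        # Resampling step: multiply/discard particles according to their weights.
        particles = rng.choices(proposed, weights=weights, k=n_particles)
        beliefs.append(sum(particles) / n_particles)
    return beliefs


# Hypothetical data: prices match history for 15 steps, then deviate.
historic = [10.0] * 30
observed = [10.0] * 15 + [25.0] * 15
beliefs = particle_filter_attack_belief(observed, historic)
```

Because resampling happens at every step, the particle set carries the belief from one step to the next, so a sustained deviation keeps the attack fraction high.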
Algorithm 2 RRA-RTP with Particle Filtering
for $t = 1$ to $T$ do
  For $i = 1, \ldots, N$, sample from the transition priors $\hat{z}^{(i)}_t \sim P_t(z_t \mid z^{(i)}_{t-1})$, and set $\hat{z}^{(i)}_{0:t} \leftarrow (\hat{z}^{(i)}_t, z^{(i)}_{0:t-1})$
  For $i = 1, \ldots, N$, evaluate and normalize the importance weights $w^{(i)}_t \propto P_o(p_t \mid p^{hist}_t, z^{(i)}_t)$
  Multiply/discard particles with respect to high/low importance weights $w^{(i)}_t$ to obtain $N$ particles $\{z^{(i)}_{0:t}\}_{i=1}^{N}$
end for

5.2.2 RRA-UCC
The patterns of user control commands are highly user-dependent, and may be non-periodic. Such characteristics make it infeasible to construct a probabilistic graphical model for anomaly inference as we did in RRA-RTP. Instead, we formulate a frequentist approach to assess the reliability of remote control signals, using observed frequencies and statistical hypothesis testing. With historic data, any given (daily) UCC input can be considered as one of an infinite sequence of possible repetitions of the same experiment, each capable of producing statistically independent results.
In RRA-UCC, we integrate two complementary risk assessment schemes to detect anomalies in task starting time (e.g. remotely starting the bread maker at 1am) and anomalies in task frequency (e.g. switching smart car charging on and off 10 times in a minute), respectively. In our settings, smart appliances have pre-built default distribution patterns of starting times, whose parameters are learned from the usage in the household. Normally, the UCC distribution prototype of an appliance is a sum of $N$ Gaussians in the form
$$f(x) = \sum_{i} a_i \exp\left( -\frac{(x - \mu_i)^2}{2\sigma_i^2} \right).$$
For instance, a smart dishwasher is embedded with a UCC prior in the form of three Gaussian distributions. In a household where residents do not eat breakfast, the first Gaussian will show a weak peak (or none). Given the number of Gaussians in the prior distribution, we can easily obtain the parameters of each component distribution by multi-Gaussian fitting techniques (e.g., [49]). The mean values $\mu_i$ are the clustering centers of the user control commands. The confidence level $\alpha_t$ of an incoming command $c(t)$ appearing at time $t$ is then evaluated according to the $i$-th Gaussian with the nearest mean value: $\alpha_t = f_i\{(t - \mu_i)/\sigma_i^2\}$, where $i = \arg\min_i |t - \mu_i|$.

Next, to detect operation frequency anomalies, we explore the periodic control command interval distributions in a household. The basic idea is that the intervals between adjacent operations within a certain period should conform to the historic norm. Suppose that a significant repeating cycle of an appliance's behavior is $T$ (generally $T$ should be $n \in \mathbb{N}^+$ days). At time $t$, appliance $A$ receives a remote control command $c(t)$. We obtain all the intervals between UCCs in $[t - T, t]$, and compare their distribution with the distributions of intervals in the time slices $[t - 2T, t - T], \ldots, [t - mT, t - (m - 1)T]$ from historic clean data using non-parametric statistical tests (e.g., the Kolmogorov-Smirnov test [9]). Then we select the historic time slice(s) where UCC intervals are most similarly distributed to those in the current time window $[t - T, t]$, and retrieve the statistical test result $\alpha_f$ (i.e., the p-value of the K-S test) as a measurement of the trustworthiness of the current operation frequency.
Figure 7: Real-time risk assessment for UCC of smart car charging system.

The frequentist approach gives a confidence level with a frequency probability interpretation and/or a pre-experiment interpretation. Such probabilities are combined as the risk assessment of the user control command:

$$R_u = f(\alpha_t) + g(\alpha_f) \qquad (6)$$
where $f(\cdot)$ and $g(\cdot)$ are monotone increasing functions.

Example 8: We consider a smart car, which could be remotely controlled to start and stop charging. In Figure 7, (a) and (b) are security penalty functions for the charging system: users can start charging at any time, but need to wait for a while to restart charging after stopping it. An adversary who takes over the control can send many consecutive charging requests. In the basic SUP model without risk detection, the usability penalty increases as the later tasks are held back by the security function. The increasing u(t) penalty will force charging to restart soon after the security function drops below MAX, and the restart interval will decrease with higher usability penalty. On the other hand, with RRA, it is detected that the requests are unusual. As more requests are received, the weight for u(t) will decrease significantly, so that usability will have very little impact on operational decision making. Therefore, the recharging interval will increase to a level that will not hurt the physical security of the battery.
6. SECURITY ANALYSIS
Objective. From a security perspective, the goal of the S2A approach is to ensure that: (1) the smart device shall not work in an extreme state; and (2) the smart device shall not work in an abnormal state for a long period. It is acceptable that a device may need to work in abnormal mode for a short while in special circumstances, or while the risk assessment components are in the process of detecting an intrusion.
Threat model. We assume that smart devices are physically secure since they are usually located inside the house. We also assume that their control systems are not compromised: the control logic is relatively simple, and they only receive limited information (control and price) from designated sources; therefore, it is not easy to hack into the kernels of smart devices. Devices are under two types of threats: (a) improper operational requests from legitimate users; and (b) faked operational requests or electricity prices from attackers. Threat (a) is usually a one-off, while threat (b) could be continuous and riskier.
Baseline security. In response to threat (a), the physical security of each individual appliance is protected by the security penalty function s(t) in S2A. s(t) defines the penalty of turning on or keeping on the appliance at time t. It cannot be overridden by other factors. However, it could be suppressed when the usability penalty is high (e.g. the task has been held for a long time), so that the appliance may work in a non-favorable mode for a short period of time. Both s(t) and u(t) are generated by mechanisms embedded in smart appliances. Manufacturers should set a very high security penalty (e.g. infinite) when the device is approaching an extreme state. Moreover, to ensure security objective (1) described above, s(t) cannot be surpassed when it reaches its maximum: the device must be switched off. Therefore, with a properly designed security function, the device is guaranteed not to work in an extreme state. The baseline security assurance applies to both threats (a) and (b).
Response to continuous attacks. Tier 2 of the S2A framework identifies abnormal inputs, especially continuous abnormal inputs. Forged pricing data (or legitimate but unstable data) is detected by the RRA-RTP component. The RRA-RTP model employs an HMM, so that the current risk assessment affects the next assessment; therefore, the risk index propagates continuously. When pricing information demonstrates unusual patterns for an extended period of time, RRA-RTP will detect increasing risk, and the weight for p(t) will continuously decrease. In this way, the price factor will become too weak to disturb the normal operations of the device. On the other hand, fake user input will be detected by RRA-UCC. A one-time fake command may not be detected if the command history does not demonstrate a strong pattern, or if the fake command falls within the pattern. Meanwhile, when the attacker sends multiple commands in a short period of time (e.g. "start battery charging" - "stop charging" - "start charging" - etc.), the high-frequency abnormalities are always accurately detected. The weight for the usability factor decreases accordingly, and the system sees less need to fulfill such requests. S2A ensures that the smart device will not work in extreme mode under any condition, and also ensures that the smart device will not work in abnormal mode for a long period in the presence of continuous attacks (faked operational requests or electricity prices).
False positives. Traditional intrusion detection systems (IDS) strive to reduce false positives and false negatives. Conceptually, false negatives are undetected anomalies. As we have shown, since we do not label the input data with a binary decision (safe or abnormal), unusual inputs will always be penalized in the second tier of the SUP model. On the other hand, false positives are normal inputs (that appear to be suspicious) that are mistakenly labeled as anomalies. Again, since we do not enforce a decision boundary, such inputs are not classified as anomalies. As they carry patterns that are different from regular ones, they will be penalized to some extent (i.e. their weights will be reduced) in the SUP model. However, the degree of the penalties is lower than that of the "true negatives". More importantly, the existence of the usability criterion effectively balances the (wrong) penalties, so that users will not become extremely dissatisfied.
Comparison with rule-based security protection. In some appliances, security protection is provided by rule-based decisions (e.g. the motor should stop after continuous operation for 5 minutes, or the device has to remain off for 3 minutes before being turned on again). Compared with the rule-based decision method, we provide fine-grained security protection: SUP starts protection before reaching the extremely critical point (i.e., the rule-based decision boundary), but also allows a certain level of compromise under strong demands from other factors. Moreover, in the presence of continuous active attacks, we provide better security by dropping attacker inputs, instead of working at minimum-protection conditions (as shown in Example 4 in Section 3). On the other hand, when we take smart pricing and remote control into consideration, the system becomes too complicated to be handled by rule-based models.

Figure 8: Real-time risk assessment for real-time pricing: RRA-RTP. (Plot over time; legend: RTP, Historic Pricing, RTP Risk Index.)

Figure 9: Real-time risk assessment for user control commands: RRA-UCC. (Two plots over time; legends: Sample UCC, UCC Distribution; UCC, UCC Risk Index.)
7. EXPERIMENTAL RESULTS
To demonstrate the effectiveness of the S2A approach, we first generated synthetic usage and pricing data based on heuristic assumptions, and tested S2A with these data. Note that our S2A framework could take arbitrary forms of UCC and RTP inputs, as well as arbitrary forms of security and usability penalties. In our simulation, the Q-value of a state-action pair converges after approximately 150 learning steps. In real-world applications, however, the Q-table of a smart appliance is usually pre-trained by the manufacturer, so that it would adapt to new conditions faster and more accurately.
For RRA-RTP, we implement Algorithm 2 with 1000 particles. The observation model $P_o$ is set to be the weighted sum of the historic mean RTP and white noise, where the weights are derived from the current states $z_t$ in the HMM. As shown in Figure 8, historical pricing (dashed line) follows a periodic pattern that repeats daily. The solid blue line denotes real-time pricing, and the red dots indicate the risk indices ($R_p$) generated by RRA-RTP. As shown, RTP deviates from the historic distribution starting from time point 68. The anomalies are detected and $R_p$ increases correspondingly. On the other hand, Figure 9 shows the real-time risk assessment of user control commands. The upper plot shows the UCC distribution pattern, which is learned from historical control commands. The lower plot shows real-time user control commands and the corresponding risk indices generated by our algorithm. As we can see, slight offsets of request time do not immediately affect the risk assessment. However, clearly unusual patterns (starting at time 90) are effectively detected. The risk index increases when we have higher confidence that the received control commands demonstrate an abnormal pattern.
We have tested S2A with different variations of user commands, electricity pricing and security penalty patterns. Figure 10 demonstrates part of the experiment, which contains a complete use case. In this experiment, we adopt a scenario in which the device cannot work for a long time (e.g. a motor). As shown in the first plot, a request is made at time point 4829 (middle of the X-axis). It is put on hold due to high RTP (plot 4), and the usability penalty starts to increase (plot 3). At approximately time 4840, the usability penalty surpasses the RTP penalty, the job starts to be processed, and the security penalty increases. S and P together stop the operation at time 4842, and the job waits until time 4848, when RTP drops to a very low level. From 4848 to the end of the task, the security penalty stops the operation twice, to force the motor to cool off. Overall, the task was completed with balanced consideration of S, U, and P.
8. CONCLUSION & FUTURE WORK
In this paper, we present S2A, a two-stage security protection framework for smart household appliances. We first introduce a Security–User–Price (SUP) model to capture three key factors, and present a multi-criteria reinforcement learning (MCRL) approach to integrate all three factors to dynamically determine an optimal operational strategy for the smart device. Furthermore, we present two risk assessment approaches based on statistical inference. They evaluate the trustworthiness of users' remote control commands as well as the pricing information received through smart grid communication systems. The real-time risk indices are seamlessly incorporated into the SUP model to serve as weighting factors of the corresponding penalty functions, thereby ensuring device security under active attacks. Through security analysis and experimental results, we show that S2A protects the device security of smart appliances, while maintaining usability and economic efficiency.
We have presented the S2A model in this paper; however, deploying the model on smart appliances still requires considerable research and engineering effort. First, it is nontrivial to define security functions for different types of smart devices. For appliances with sensors that monitor device status, it is also challenging to quantify (usually non-linear) sensor inputs and assess risks. On the other hand, it requires intensive human and behavior studies to observe usage habits of different devices and construct usability functions from the observed patterns. That is, the model still needs to be equipped with application-specific parameters to demonstrate its best performance. Finally, it is important and effective to enable collaboration between smart devices for situational awareness and better risk assessment.
Figure 10: Sample results for S2A. From top to bottom: pending energy, appliance security penalty, usability penalty, smart pricing, and energy allocation actions. (Panels: Pending Energies, Appliance Security, User Frustration, Energy Prices; x-axis: time sequence, 4790 to 4870.)
9. ACKNOWLEDGEMENTS
Bo Luo was partially supported by NSF OIA-1028098 and University of Kansas General Research Fund 2301420. The authors would like to thank the anonymous reviewers for their constructive suggestions to improve the manuscript.
10. REFERENCES
[1] A. Aggarwal, S. Kunta, and P. Verma. A proposed communications infrastructure for the smart grid. In Innovative Smart Grid Technologies (ISGT), 2010.
[2] T. Baumeister. Literature review on smart grid cyber security. Technical Report CSDL-10-10, Department of Information and Computer Sciences, University of Hawaii, Honolulu, Hawaii 96822, Dec. 2010.
[3] R. Berthier, W. Sanders, and H. Khurana. Intrusion detection for advanced metering infrastructures: Requirements and architectural directions. In IEEE SmartGridComm, pages 350–355, Oct. 2010.
[4] S. Borenstein. The long-run efficiency of real-time electricity pricing. The Energy Journal, 26(3), 2005.
[5] S. Borenstein. The redistributional impact of non-linear electricity pricing. Working paper 602, Regulation2point0, 2010.
[6] A. Bose. Smart transmission grid applications and their supporting infrastructure. Smart Grid, IEEE Transactions on, 1(1):11–19, June 2010.
[7] F. Cleveland. Cyber security issues for advanced metering infrastructure (AMI). In IEEE Power and Energy Society General Meeting, pages 1–5, July 2008.
[8] F. Cohen. The smarter grid. Security Privacy, IEEE, 8(1):60–63, Jan.-Feb. 2010.
[9] W. J. Conover. Practical Nonparametric Statistics. John Wiley & Sons, December 1998.
[10] Q. Dam, S. Mohagheghi, and J. Stoupis. Intelligent demand response scheme for customer side load management. In IEEE ENERGY 2008, 2008.
[11] H. Farhangi. The path of the smart grid. Power and Energy Magazine, IEEE, 8(1):18–28, 2010.
[12] A. Faruqui and S. George. Quantifying customer response to dynamic pricing. The Electricity Journal, 18(4):53–63, 2005.
[13] N. Gordon, D. Salmond, and A. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. Radar and Signal Processing, IEE Proceedings F, 140(2):107–113, Apr. 1993.
[14] A. B. Haney, T. Jamasb, and M. G. Pollitt. Smart metering and electricity demand: Technology, economics and international experience. Technical report, Faculty of Economics, University of Cambridge, 2009.
[15] K. Herter, P. McAuliffe, and A. Rosenfeld. An exploratory analysis of California residential customer response to critical peak pricing of electricity. Energy, 32(1):25–34, 2007.
[16] A. Johnson. The history of the smart grid evolution at Southern California Edison. In Innovative Smart Grid Technologies (ISGT), pages 1–3, Jan. 2010.
[17] W. Kempton and J. Tomic. Vehicle-to-grid power fundamentals: Calculating capacity and net revenue. Journal of Power Sources, 144(1):268–279, 2005.
[18] W. Kempton and J. Tomic. Vehicle-to-grid power implementation: From stabilizing the grid to supporting large-scale renewable energy. Journal of Power Sources, 144(1):280–294, 2005.
[19] H. Khurana, M. Hadley, N. Lu, and D. Frincke. Smart-grid security issues. Security Privacy, IEEE, 8(1):81–85, Jan.-Feb. 2010.
[20] Y. Kim, T. Schmid, Z. M. Charbiwala, and M. B. Srivastava. Viridiscope: design and implementation of a fine grained power monitoring system for homes. In Ubicomp ’09, 2009.
[21] M. A. Lisovich, D. K. Mulligan, and S. B. Wicker. Inferring personal information from demand-response systems. IEEE Security and Privacy, 8:11–20, 2010.
[22] S.-W. Luan, J.-H. Teng, S.-Y. Chan, and L.-C. Hwang. Development of a smart power meter for AMI based on ZigBee communication. In PEDS, 2009.
[23] J.-M. Seigneur, C. D. Jensen, S. Farrell, E. Gray, and Y. Chen. Towards security auto-configuration for smart appliances. In Proceedings of the Smart Objects Conference, pages 03–45, 2003.
[24] M. Masoum, P. Moses, and S. Deilami. Load management in smart grids considering harmonic distortion and transformer derating. In Innovative Smart Grid Technologies (ISGT), 19-21 2010.
[25] S. Massoud Amin and B. Wollenberg. Toward a smart grid: power delivery for the 21st century. Power and Energy Magazine, IEEE, 3(5):34–41, Sept.-Oct. 2005.
[26] P. McDaniel and S. McLaughlin. Security and privacy challenges in the smart grid. Security Privacy, IEEE, 7(3):75–77, May-June 2009.
[27] S. McLaughlin, D. Podkuiko, and P. McDaniel. Energy theft in the advanced metering infrastructure. In Critical Information Infrastructures Security. 2010.
[28] S. McLaughlin, D. Podkuiko, S. Miadzvezhanka, A. Delozier, and P. McDaniel. Multi-vendor penetration testing in the advanced metering infrastructure. In ACSAC, 2010.
[29] N. Mehta, S. Natarajan, P. Tadepalli, and A. Fern. Transfer in variable-reward hierarchical reinforcement learning. Machine Learning, 73(3):289–312, 2008.
[30] A. Metke and R. Ekl. Smart grid security technology. In Innovative Smart Grid Technologies, 2010.
[31] A. R. Metke and R. L. Ekl. Security technology for smart grid networks. Smart Grid, IEEE Transactions on, 1(1), June 2010.
[32] K. Moslehi and R. Kumar. Smart grid - a reliability perspective. In Innovative Smart Grid Technologies (ISGT), pages 1–8, 19-21 2010.
[33] H. Nakakita, K. Yamaguchi, M. Hashimoto, T. Saito, and M. Sakurai. A study on secure wireless networks consisting of home appliances. Consumer Electronics, IEEE Transactions on, 49(2):375–381, May 2003.
[34] D. O’Neill, M. Levorato, A. Goldsmith, and U. Mitra. Residential demand response using reinforcement learning. In IEEE SmartGridComm, 2010.
[35] F. Orecchini and A. Santiangeli. Beyond smart grids - the need of intelligent energy networks for a higher global efficiency through energy vectors integration. International Journal of Hydrogen Energy, 2011.
[36] D. Petersen, J. Steele, and J. Wilkerson. Wattbot: a residential electricity monitoring and feedback system. In CHI, 2009.
[37] B. Ristic, S. Arulampalam, and N. Gordon. Beyond the Kalman Filter: Particle Filters for Tracking Applications. Artech House, 2004.
[38] B. D. Russell and C. L. Benner. Intelligent systems for improved reliability and failure diagnosis in distribution systems. Smart Grid, IEEE Transactions on, 1(1):48–56, June 2010.
[39] T. Sauter and M. Lobashov. End-to-end communication architecture for smart grids. IEEE Transactions on Industrial Electronics, 58(4), 2011.
[40] S.-Y. Son and B.-J. Chung. A Korean smart grid architecture design for a field test based on power IT. In Transmission Distribution Conference Exposition: Asia and Pacific, 2009, pages 1–4, Oct. 2009.
[41] V. Sood, D. Fischer, J. Eklund, and T. Brown. Developing a communication infrastructure for the smart grid. In Electrical Power Energy Conference (EPEC), 2009 IEEE, pages 1–7, Oct. 2009.
[42] G. Srinivasa Prasanna, A. Lakshmi, S. Sumanth, V. Simha, J. Bapat, and G. Koomullil. Data communication over the smart grid. In IEEE International Symposium on Power Line Communications and Its Applications, 2009.
[43] C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 8:279–292, 1992.
[44] D. Wei, Y. Lu, M. Jafari, P. Skare, and K. Rohde. An integrated security system of protecting smart grid against cyber attacks. In Innovative Smart Grid Technologies, 2010, pages 1–7. IEEE, 2010.
[45] X. Wei, Z. Yu-hui, and Z. Jie-lin. Energy-efficient distribution in smart grid. In SUPERGEN, 2009.
[46] M. Weiss and D. Guinard. Increasing energy awareness through web-enabled power outlets. In MUM, 2010.
[47] M. Weiss, F. Mattern, T. Graml, T. Staake, and E. Fleisch. Handy feedback: connecting smart meters with mobile phones. In Ubicomp ’09, 2009.
[48] P. Wolfs and S. Islam. Potential barriers to smart grid technology in Australia. In Australasian Universities Power Engineering Conference, 2009.
[49] D. Xu, L. Yang, and Z. He. Overcomplete time delay estimation using multi-Gaussian fitting method. In IEEE VLSI Design and Video Technology, 2005.
[50] Z. Zhang. Smart grid in America and Europe: Similar desires, different approaches (part 1). Public Utilities Fortnightly, 149(1), 2011.