Emergence of Extreme Networked Devices
David Culler
Computer Science Division, U.C. Berkeley
www.cs.berkeley.edu/~culler
USC, Feb 28, 2001
2/28/2001 Emerging Extremes 2
The Expanding Computing Spectrum
• Servers
• Workstations
• Personal Computers
• Internet Services
• PDAs / HPCs/ smartphones
Convergence at the middle
• Common platform
  – powerful microprocessor (choice of 3), DRAM (3), disk (2)
  – deep I/O hierarchy, OS layering
• Common system abstraction
  – collection of threads sharing a large virtual address space
  – GUI orientation
  – blocking interfaces
• Concurrency as threads
• Services as local call / remote thread
  – RPC, RMI, DCOM, HTTP
• Ample resources easily abstracted
  – open loop
  – transparent allocation and usage
Emerging Extremes
• Servers
• Workstations
• Personal Computers
• Internet Services
• PDAs / HPCs/ smartphones
• Open Internet Services
• Microscopic sensor networks
• Planetary Services
Convergence at the Extremes
• Concurrency intensive
  – data streams and real-time events, not command-response
• Communications-centric
• Limited resources (relative to load)
• Huge variation in load
  – population usage & physical stimuli
  – robustness
• Hands-off (no UI)
• Dynamic configuration, discovery
  – self-organized and reactive control
• Similar execution model
  – event driven
  – components
• Complementary roles
  – tiny semi-autonomous devices empowered by infrastructure
  – infrastructure services connected to the real world
Outline
• Emerging Extremes
• Robust Framework for Open Scalable Internet Services
– the garden path: from threads to non-blocking I/O and RPC
– structured event-driven alternatives
– controllers within a graph of stages
• Tiny OS for Wireless Embedded Sensor Networks
Ninja: Open Infrastructure Services
[Figure: many clients and servers connected through open infrastructure services]
systematic framework for building robust, composable services
focus here on execution model
Variation in Load – slashdot effect
http://pasadena.wr.usgs.gov/stans/slashdot.html
USGS Web Server Traffic
October 16, 1999 Hector Mine Earthquake
Toward Robust Behavior Under Load
• Traditional capacity planning
  – over-provision by a factor over typical load (increasing from ~4x to 10–15x)
  – cluster-based replication is, at least, cost-effective
  – peaks occur when it matters most
• Content distribution
  – potential replication proportional to use
• Still want graceful degradation when an instance is overloaded
[Chart: throughput (op/s) and response time (s) vs. load]
Threads as THE building block
• Two primitives: masking I/O latency and invoking remote services
• Freely compose these two primitives
• But... threads are a limited resource
Service “test problem”
• A: popularity
• L: I/O, network, or service composition depth
[Figure: threaded server. Tasks arrive at rate A tasks/sec (each handled via dispatch() or create()) and complete at rate S tasks/sec after latency L sec. The number of concurrent tasks in the server is T = A x L, and a closed loop implies S = A.]
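The T = A x L relation in the figure is just Little's law; a minimal sketch of the arithmetic (the function name is ours):

```python
# Little's law for the threaded server: a task stream arriving at
# A tasks/sec, each held in the server for L sec, keeps T = A * L
# tasks (and hence threads) in flight. Numbers below are illustrative.

def threads_in_flight(arrival_rate, latency):
    """Steady-state concurrent tasks T = A * L."""
    return arrival_rate * latency

# 1000 tasks/sec with 10 ms of latency each needs only 10 threads in
# flight, but a 200 ms service-composition depth pushes that to 200.
```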
[Chart: max server throughput (S tasks/sec) vs. # threads executing in server (T), for 1-way and 4-way Java; Ultra 170 and E450, Solaris 7.2, JDK 1.2.2]
Threads are a “limited resource”
• Fix L = 10 ms; for each T, measure max A = S
• Cluster parallelism just raises the threshold
Alternative: queues, events, typed msgs
• single-threaded server
• queues absorb load and decouple operations
  – server chooses when to assign resources to a request event
• bounded resources at the request interface
  – impose load-conditioning or admission control
• provide non-blocking interface
• client retains control of its thread
  – chooses when to block
  – permits negotiation protocol
  – key to service composition
[Figure: explicit request queue in front of the server]
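This structure can be sketched in a few lines of Python (a sketch of the slide's idea, not Ninja code; names are ours): a bounded queue, a non-blocking submit, and the server deciding when to dequeue.

```python
from collections import deque

class EventServer:
    """Single-threaded server behind an explicit bounded request queue."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def submit(self, request):
        # Non-blocking interface: the client keeps its thread and is told
        # immediately whether the request was admitted (load conditioning).
        if len(self.queue) >= self.capacity:
            return False  # admission control: shed the request
        self.queue.append(request)
        return True

    def run_once(self, handler):
        # The server chooses when to assign resources to a request event.
        if self.queue:
            handler(self.queue.popleft())
```

A rejected `submit` returns immediately, so the client can back off or negotiate rather than block inside the server.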
[Chart: max server throughput (S tasks/sec) vs. # tasks in client-server pipeline, for 1-way and 4-way Java]
Event-per-task saturates gracefully
• Better and more robust performance
  – use cluster parallelism to match desired throughput
• Can decompose task into multiple events
  – circulate or pipeline
• but ...
Down-side of monolithic event approach
• Lose familiar programming model
  – in the thread model, a thread steps through each stage of the task
  – here, you need a handler per stage
• Difficult software engineering
  – composing and scheduling
• Does not naturally exploit SMP parallelism
  – must pipeline multiple event handler blocks
• Whenever the thread blocks, the whole structure stalls
  – throughput ~ 1/L
State-of-practice: bounded thread pool
• Only allow K threads to “accept” connections
  – some OSes have a fixed hard limit
• Additional requests time out
• choose K below the T at which throughput peaks
• choose K large enough to hide L
[Figure: threaded server with task arrivals at rate A tasks/sec and task completions at rate S tasks/sec]
A “third road”
• Building block
  – bounded internal thread pool
  – queue-based interface
  – subset of task stages
    » request event processing in familiar style
  – can chunk request stream for efficiency
• Compose Service as a graph of “stages”
  – modularity
  – stages can be replicated across nodes
• Stage “control loop” manages threads
[Figure: request processing as a graph of stages: read header → cache check → (cache miss) → exec → write resp]
Well-conditioned Service Architecture
• Abstract system I/F as non-blocking stages
  – careful engineering at the system interface
• Describe stages as modular state machines
• Associate a thread manager with each stage
• Build the service as a composition of stages
  – can be dynamic
Matt Welsh
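One such stage could be sketched as follows (a Python stand-in; the class shape and names are ours, not the SEDA API): an event queue, a handler written in the familiar per-request style, a small bounded thread pool, and a hook to the next stage in the graph.

```python
import queue
import threading

class Stage:
    """A stage: incoming event queue, a per-request handler, a bounded
    internal thread pool, and an optional next stage that composes
    stages into a graph."""

    def __init__(self, handler, pool_size=1, next_stage=None):
        self.events = queue.Queue()
        self.handler = handler
        self.next_stage = next_stage
        for _ in range(pool_size):
            threading.Thread(target=self._worker, daemon=True).start()

    def enqueue(self, event):
        self.events.put(event)

    def _worker(self):
        while True:
            result = self.handler(self.events.get())
            if self.next_stage is not None:
                self.next_stage.enqueue(result)
```

A thread manager (next slides) would then grow or shrink each stage's pool against its queue length.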
example: http throughput
[Chart: requests per second vs. # clients (0–1000), SEDA vs. Apache; SPECweb99 static workload, 4 classes]
Response Time
[Chart: response time (ms) vs. # clients (0–1000): SEDA mean, Apache mean, SEDA std dev, Apache std dev]
Reactive Stage Thread Pool Sizing
Two Packet Types
ping – fast
query – 20 ms delay
[Chart: latency for single-threaded (ST) and thread-governor (TG) stages on ping and query traffic at 100, 200, 400, and 1000 clients; labeled values include 3455 and 6442 (ST-query) vs. 10.934 and 29.36 (TG-query)]
Thread Governor
- observes queue length
- over threshold => add threads
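The governor's rule above can be sketched as one step of a control loop (the constants are illustrative, not the measured configuration):

```python
def govern(pool_size, queue_length, threshold=100, max_threads=10):
    """One observation of the governor: if the stage's queue has grown
    past the threshold, add a thread, up to a cap; otherwise leave the
    pool alone. Returns the new pool size."""
    if queue_length > threshold and pool_size < max_threads:
        return pool_size + 1
    return pool_size
```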
Scalable Persistent Data Structures
[Figure: a DDS brick: operating system; disk and network I/O cores; buffer cache; single-node hash table; distributed hashtable “RPC” skeletons]
Steve Gribble
[Figure: clustered service: service processes linked with the DDS library, plus storage “bricks”, all connected by a System Area Network]
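The partitioning idea behind the DDS library can be sketched as follows (replication, atomicity, and recovery in the real DDS are omitted; the class and method names are ours):

```python
class DDSLib:
    """Client-side sketch: hash each key to a storage brick and forward
    the operation there. Each brick is modeled as a dict standing in
    for one storage node."""

    def __init__(self, bricks):
        self.bricks = bricks

    def _brick(self, key):
        # partition the key space across the bricks
        return self.bricks[hash(key) % len(self.bricks)]

    def put(self, key, value):
        self._brick(key)[key] = value

    def get(self, key, default=None):
        return self._brick(key).get(key, default)
```

Because each key lives on exactly one brick, adding bricks spreads load, which is the source of the linear scaling on the next slide.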
Scalable Throughput
[Chart: max throughput (ops/s) vs. # of DDS bricks, for reads and writes; labeled points (128, 13582) and (128, 61432)]
Robust under load
[Chart: hash table throughput (reads/s) vs. # client processes (100 parallel requests issued per client process), for 2, 8, 16, and 32 bricks]
Outline
• Emerging Extremes
• Robust Framework for Open Scalable Internet Services
– modular generalized state machines
– constrained use of threads
– thread-manager as controller
• Tiny OS for Wireless Embedded Sensor Networks
– characteristics of the other extreme
– current platforms
– events and primitive threads in a graph of components
– exploring open problems
Emerging Microscopic Devices
• CMOS trend is not just Moore’s law
• Micro Electro-Mechanical Systems (MEMS)
  – a rich array of sensors is becoming cheap and tiny
• Imagine: all sorts of chips connected to the physical world and to cyberspace!
[Figure: single-chip radio front end: LNA, mixer, PLL, baseband filters, I and Q channels]
• Low-power wireless communication
Characteristics of Network Sensors
• Small physical size and low power consumption
• Concurrency-intensive operation
  – flow-thru, not wait-command-respond
• Limited physical parallelism and controller hierarchy
  – primitive direct-to-device interface
• Diversity in design and usage
  – application specific, not general purpose
  – huge device variation
    => efficient modularity
    => migration across the HW/SW boundary
• Robust operation
  – numerous, unattended, critical
    => narrow interfaces
[Figure: sensor node with sensors/actuators, network, and storage]
Current Example
• 1” x 1.5” motherboard
  – ATMEL 4 MHz, 8-bit MCU, 512 bytes RAM, 8K program flash
  – 900 MHz radio (RF Monolithics), 10–100 ft. range
  – ATMEL network programming assist
  – radio signal strength control and sensing
  – I2C EEPROM (logging)
  – base-station ready
  – stackable expansion connector
    » all ports, I2C, power, clock…
• Several sensor boards
  – basic protoboard
  – tiny weather station (temp, light, humidity, pressure)
  – vibrations (accel, temp, ...)
  – accelerometers
  – magnetometers
Basic Power Breakdown…
• But what does this mean?
  – A lithium battery runs for 35 hours at peak load and years at minimum load!
    » three orders of magnitude difference!
  – A one-byte transmission uses the same energy as approx. 11,000 cycles of computation.
  – Idleness is not enough; sleep!
Active Idle Sleep
CPU 5 mA 2 mA 5 μA
Radio 7 mA (TX) 4.5 mA (RX) 5 μA
EEPROM 3 mA 0 0
LEDs 4 mA 0 0
Photo Diode 200 μA 0 0
Temperature 200 μA 0 0
Panasonic CR2354, 560 mAh
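The slide's arithmetic, worked through (the component draws come from the table above; the particular peak-load mix of CPU, radio TX, and LEDs is our assumption):

```python
def lifetime_hours(capacity_mah, draw_ma):
    # battery life = capacity / average current draw
    return capacity_mah / draw_ma

peak_ma = 5 + 7 + 4        # CPU active + radio TX + LEDs = 16 mA
sleep_ma = 0.005 + 0.005   # CPU sleep + radio sleep (5 uA each)

# 560 mAh / 16 mA   = 35 hours at peak load
# 560 mAh / 10 uA  ~= 56,000 hours (years) asleep
# peak vs. sleep draw differ by ~1600x: the slide's three orders
# of magnitude.
```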
An Operating System for Tiny Devices?
• Traditional approaches
  – command processing loop (wait for request, act, respond)
  – monolithic event processing
  – bring the full thread/socket POSIX regime to the platform
• Alternative
  – provide a framework for concurrency and modularity
  – never poll, never block
  – interleave flows, events, energy management
  – allow appropriate abstractions to emerge
Tiny OS Concepts
• Scheduler + graph of components
  – constrained two-level scheduling model: threads + events
• Component:
  – commands
  – event handlers
  – frame (storage)
  – tasks (concurrency)
• Constrained storage model
  – frame per component, shared stack, no heap
• Very lean multithreading
• Efficient layering
[Figure: Messaging Component with internal state and an internal thread. Commands in (upper interface): init, power(mode), send_msg(addr, type, data); events signaled up: msg_rec(type, data), msg_send_done(). Commands issued downward: init, Power(mode), TX_packet(buf); events handled from below: TX_packet_done(success), RX_packet_done(buffer).]
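The component abstraction can be caricatured in a few lines of Python (real TinyOS components are statically wired C; the class and method names here are ours, for illustration only):

```python
class Component:
    """A TinyOS-style component: a fixed frame for state, command
    handlers, and tasks posted to a shared scheduler queue. Handlers
    run to completion; there is no heap and no blocking."""

    def __init__(self, scheduler):
        self.frame = {}           # per-component storage (no heap)
        self.scheduler = scheduler

    def command(self, name, *args):
        # commands are requests for action; they return quickly
        return getattr(self, "cmd_" + name)(*args)

    def post(self, task):
        self.scheduler.append(task)   # deferred run-to-completion work


class Counter(Component):
    """Toy component: a clock event posts a task that bumps the frame."""

    def cmd_read(self):
        return self.frame.get("ticks", 0)

    def event_clock_fire(self):
        self.post(self._bump)         # keep the event handler short

    def _bump(self):
        self.frame["ticks"] = self.frame.get("ticks", 0) + 1
```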
Application = Component Graph
[Figure: component graph from hardware to application. HW: RFM radio, clocks, ADC with temp and photo sensors, UART. SW: Radio byte and Radio Packet layers (bit → byte → packet), Serial Packet, Active Messages, and a Route map / router / sensor application at the application level.]
Example: ad hoc, multi-hop routing of photo sensor readings
TOS Execution Model
• commands request action
  – ack/nack at every boundary
  – call commands or post tasks
• events notify occurrence
  – HW interrupt at the lowest level
  – may signal events
  – call commands
  – post tasks
• tasks provide logical concurrency
  – preempted by events
• migration of the HW/SW boundary
[Figure: the radio stack as a chain of pumps: RFM (event-driven bit-pump) → Radio byte (event-driven byte-pump; encode/decode) → Radio Packet (event-driven packet-pump; CRC) → active message (message-event driven) → application component (data processing)]
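The two-level model above can be simulated in miniature (a single-threaded sketch; in the real system, hardware interrupts do the preempting and the task queue is a FIFO run to completion):

```python
from collections import deque

class TinyScheduler:
    """Events run immediately (standing in for interrupts) and may post
    tasks; tasks run later, to completion, from a FIFO. An empty task
    queue means the processor can sleep."""

    def __init__(self):
        self.tasks = deque()
        self.log = []

    def post(self, task):
        self.tasks.append(task)

    def fire_event(self, handler):
        # an event preempts any task context: it runs right now
        handler(self)

    def run(self):
        # drain the task queue in FIFO order
        while self.tasks:
            self.tasks.popleft()()
```

A handler that posts its heavy work as a task keeps the event path short, which is why both events fire before either task runs below.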
Dynamics of Events and Threads
bit event filtered at byte layer
bit event => end of byte => end of packet => end of msg send
thread posted to start send of next message
radio takes clock events to detect recv
Storage Breakdown (C Code)
[Chart: storage breakdown (bytes) by component for the Multihop Router application: AM light, AM Temp, AM, Packet, Radio Byte, RFM, Photo, Temp, UART Packet, UART, i2c, Init, TinyOS Scheduler, C Runtime]
Total: 3450 B code, 226 B data
Empirical Breakdown of Effort
• can take apart time, power, space, …
• 50 cycle thread overhead, 10 cycle event overhead

Packet reception work breakdown by component:

Component            Work     CPU Utilization  Energy (nJ/bit)
AM                   0.05%    0.20%            0.33
Packet               1.12%    0.51%            7.58
Radio handler        26.87%   12.16%           182.38
Radio decode thread  5.48%    2.48%            37.2
RFM                  66.48%   30.08%           451.17
Radio reception      -        -                1350
Idle                 -        54.75%           -
Total                100.00%  100.00%          2028.66
Working Across Levels
• Encoding
  – DC-balanced SECDED
• Proximity detection
  – signal strength or error rates
• Low power listening
• Fair and efficient network access
• Security
• Tiny virtual machines
• Larger challenges
Low-Power Listening
• Costs about as much to listen as to xmit, even when nothing is received
• Only way to save power is to turn radio off when there is nothing to hear.
• Can turn the radio on/off in about 1 bit time
  – can detect a transmission at a cost of ~2 bit times
[Figure: transmit and receive timelines: Xmit sends a preamble then the message; Recv sleeps, briefly samples for the preamble, and wakes for the message. Small sub-message receive sampling gives ~10x savings; an application-level synchronization rendezvous to determine when to sample gives another ~10x.]
Jason Hill
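The savings can be worked through from the power table (the 4.5 mA receive figure is from the table earlier; the duty-cycle fractions follow the slide's rough 10x estimates):

```python
def avg_listen_ma(rx_ma, sample_fraction):
    """Average radio draw when the receiver samples the channel only a
    fraction of the time and sleeps otherwise."""
    return rx_ma * sample_fraction

always_on  = avg_listen_ma(4.5, 1.0)    # always-on receive
sampled    = avg_listen_ma(4.5, 0.10)   # sub-message preamble sampling
rendezvous = avg_listen_ma(4.5, 0.01)   # plus app-level rendezvous
```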
Managing local contention
[Chart: channel utilization and % throughput per mote vs. number of motes (1–10) at a 4 packet/s duty cycle; channel utilization settles near 70% and per-node throughput is fair]
• Highly correlated traffic, no collision detection
  – sensor events and beacons
• Randomize initial listen period, simple backoff
Alec Woo
Managing aggregate contention
• Hidden nodes between each pair of “levels”
  – CSMA is not enough
• RTS/CTS and acks too costly (power & BW)
• P[msg-to-base] drops rapidly with hops
  – investment in a packet increases with distance
• Local rate control to approximate fairness
  – priority to forwarding; adjust own data rate
  – additive increase, multiplicative decrease
• Listen for retransmission as ack
  – ~½ of packets get through from 4 levels out
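One additive-increase, multiplicative-decrease step on a node's own rate might look like this (the constants are illustrative, not the deployed settings):

```python
def adjust_rate(rate, delivered, alpha=0.25, beta=0.5,
                min_rate=0.125, max_rate=8.0):
    """One AIMD step on a node's own data rate (packets/sec), using the
    overheard retransmission by the parent as the implicit ack."""
    if delivered:
        return min(max_rate, rate + alpha)   # additive increase
    return max(min_rate, rate * beta)        # multiplicative decrease
```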
Authentication / Security
Energy costs of a secure channel
[Pie chart: energy costs of a secure channel: data transmission 71%, MAC transmission 20%, freshness transmission 7%, MAC computation 2%, freshness computation ~0%, encryption computation ~0%, encryption transmission ~0%]
• RC5 shared-key crypto in 1.7 KB
• Modified TESLA protocol for confidential & authenticated base broadcast
• Easy to compromise a node, but hard to get most of them
What’s in a program?
• HW + collection of components supports a space of applications
• Application-specific virtual machine
  – code density, not portability
  – small byte-code interpreter component
  – accepts clock & message event capsules
  – hides split-phase operations below the interpreter
• Capsules define specific query / logic
  – filter criteria
  – diffusion primitives
  – ...
Thoughts about robust Algorithms
• Active dynamic route determination
  – when a node hears a new route beacon, it records the sender as “parent”, retransmits from SELF, and ignores additional messages for the epoch
• Radio cell structure is very unpredictable
• Builds and maintains a good breadth-first forest
• Each node maintains O(1) state
• Fundamental operation is pruning retransmission
  – monotonic variables
  – message signature caches
• Takes energy to retain structure
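The per-node beacon rule can be sketched in a few lines (the class and method names are ours):

```python
class RouteState:
    """O(1) per-node routing state for the beacon flood: the first
    beacon heard in a new epoch fixes the parent and is retransmitted
    with this node as the source; later beacons in the same epoch are
    pruned."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.parent = None
        self.epoch = -1   # monotonic variable used for pruning

    def hear_beacon(self, sender, epoch):
        if epoch <= self.epoch:
            return None                    # prune: already handled
        self.epoch = epoch
        self.parent = sender               # record parent
        return (self.node_id, epoch)       # retransmit from SELF
```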
Larger Challenges
• Programming support for systems of generalized state machines
– language, debugging, verification
• Programming the unstructured aggregate
• Resilient Aggregators
• Understanding how an extreme system is behaving and what its envelope is
– adversarial simulation
Tides of Innovation
[Figure: successive waves of integration and innovation over time (log R): mainframe, minicomputer, workstation/server, personal computer, with a new wave emerging as of 2/2001]
Summary
• The extremes of the computing spectrum present tremendous opportunities for innovation
• Systems challenges
  – variation in load, unpredictability, hands-off embedded operation
  – limited resources, concurrency intensive, power constrained
  – self-organizing and adaptive
• More in common with each other than with the “average” devices
• New kinds of software system structures– modular event-driven structures
– intrinsic feedback and control
Historical Perspective
• New eras of computing start when the previous era is so strong it is hard to imagine that things could be different
– mainframe -> mini
– mini -> workstation -> PC
– PC -> ???
• It is often smaller than what came before.
  – most think of the new technology as “just a toy”
• The new dominant use was almost completely absent.
• It is likely to come from the extremes.
Mean Response Time (A)
[Chart: end-to-end task latency (ms) vs. server throughput (S tasks/sec), for L = 10 ms, 50 ms, and 200 ms]
• closed system, but limited bandwidth