TiredZebra: Exploring Gossip Protocols in Sensor Networks

CS 244B Final Project

Paul Crews, Stanford University
[email protected]

Travis Lanham, Stanford University
[email protected]

ABSTRACT

Wireless sensor networks have extraordinary resource constraints that demand low power consumption in order to extend longevity. Designing distributed algorithms to work on low-power, resource-constrained devices is a challenging problem and requires a different approach than a typical distributed environment. In this paper, we examine one such algorithm, Trickle, in depth, experimentally measuring its behavior on real-world hardware and under simulated conditions. We also extend the algorithm to add an optimization that allows nodes to turn off their radios for a set period of time to conserve energy.

KEYWORDS

IoT, Distributed Systems, Sensor Networks, Networking

1 INTRODUCTION

Sensor networks have extreme resource constraints that require a distinct class of distributed algorithms and protocols designed for large numbers of lossy nodes with low-power requirements. The devices that make up these networks typically consist of small, embedded systems that are placed in remote locations and cannot connect to a centralized controller. The lack of a centralized control system necessitates peer-to-peer gossip protocols to propagate information, including code updates.

Power consumption is critical for these systems since their batteries cannot be replaced and thus battery life determines useful system life. Network radios on these systems are the largest consumers of energy and therefore must be the most optimized to reduce resource consumption. To compound the issue of propagation, wireless sensor nodes offer few synchrony guarantees and nodes constantly fail or experience packet loss.

However, several consensus algorithms have been designed to function in this restricted context. In this paper, we examine the Trickle algorithm, an algorithm for eventual consistency over unreliable wireless networks [3]. We present our own implementation of Trickle and measure its performance under different network conditions, both on real-world hardware and in a simulated environment. We also present an extension to Trickle, which makes different assumptions about the underlying network topology but enables some nodes in the network to sleep.

We first describe the background for the Trickle algorithm, including our proposed modification and its implications. We then discuss the methodology used to test Trickle in several network configurations and mediums, and evaluate the validity of claims made surrounding the recommended values for certain Trickle parameters. We also test our modified version of Trickle, Sleepy Trickle, under the same network conditions to see how the modified algorithm changes the overall power cost. We then present the results of our experiments, analyzing the results based on the expected behavior of the algorithm. Finally, we present our conclusions, and suggest that Sleepy Trickle is a viable extension to Trickle under certain network conditions.

2 BACKGROUND

2.1 Trickle Algorithm

The Trickle algorithm was initially developed to distribute code updates throughout a wireless sensor network using a minimal number of packet transmissions while evenly distributing the transmit load over each node and preventing broadcast storms. One of the key insights during the development of Trickle was that transmission was much more expensive than radio reception; as a result, it was cheaper to leave radios on in the listen state if that reduced the overall number of transmissions. The algorithm was initially developed for TinyOS and associated projects, but is now incorporated in a number of other network standards [4].

The foundation of the algorithm is the proposition that nodes should remain quiet unless they see an update on the network, and then should broadcast it out; after receiving an update, a node should advertise it minimally, exponentially backing off while making additional transmissions to preserve resources. The motivation behind this foundation is that nodes in range of each other will quickly propagate an update and then go into a standby mode with occasional transmissions until the next update is introduced. Trickle also sends all messages to a broadcast address, which allows all nodes in range to hear an update without requiring pairwise connections for propagation.

The Trickle gossip algorithm operates on a series of time intervals where a random time is chosen from the current interval to decide whether or not to broadcast the current version. If the node has heard its current version broadcast by several other nodes in the current interval then it will not broadcast, to avoid duplicate effort. If a node hears a version that is behind its current version, it will reset its interval to the minimum size then broadcast out the more recent version to bring the rest of the network up to speed.

The algorithm relies on several configurable variables to determine these time intervals. The time interval spans [0, T], from which the node randomly chooses the time t to broadcast in the current time interval (note that t is not an absolute time, but rather an offset from the beginning of the time interval). Each node also has a counter c for tracking the number of updates it has heard in the interval; if the node hears k updates before its transmission time t then it will not transmit. k is typically a small number (1 to 4) because if the update has already been heard several times then it is likely that it has spread throughout the network and gained saturation. For a sparse network, a lower k is desirable because even if a node hears another has an update, there might be nodes that were out of range and did not hear the transmission. After the interval T finishes, the counter c is reset and a new transmission time t is selected for the new interval.
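The per-interval suppression rule described above can be sketched as follows. This is a minimal illustration rather than our actual implementation; the function names pick_transmit_time and should_transmit are hypothetical, and t is drawn from the full interval [0, T] as in the original algorithm.

```python
import random


def pick_transmit_time(T, rng=random):
    """Pick the transmit offset t uniformly from [0, T].

    t is an offset from the start of the interval, not an absolute time.
    """
    return rng.uniform(0.0, T)


def should_transmit(t, heard_times, k):
    """Return True if fewer than k consistent messages were heard before t.

    heard_times holds the offsets (within the current interval) at which the
    node heard another node broadcast the version it already holds.
    """
    c = sum(1 for h in heard_times if h < t)  # counter c when the timer fires
    return c < k


# With k = 2, two messages heard before t suppress the broadcast;
# a message heard after t does not count.
t = 5.0
suppressed_case = should_transmit(t, [1.0, 2.0], k=2)
transmit_case = should_transmit(t, [1.0, 6.0], k=2)
```

The counter c is recomputed from scratch for each interval, which mirrors the reset of c at the end of every interval.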

However, in practice, the Trickle authors modified the original algorithm to select t from the interval [T/2, T] instead of [0, T]. We also adopted this modification, which aims to address the "short listen" effect described by the authors. Because nodes lack time synchronization, their intervals are offset from one another, so broadcasts that would ordinarily be suppressed are not: one node broadcasts just before its interval ends, then another node's interval begins and it broadcasts shortly after the first. This represents an adversarial case that can significantly increase the number of broadcasts and needlessly transmit messages. Taking the transmission time t from only the second half of the interval enforces a fixed waiting period for all nodes, which provides an upper bound on the broadcast rate.

Another optimization the algorithm introduces is a dynamic configuration of the window timing interval. A large T interval has a low resource overhead while a small T propagates information through the network faster. Trickle responds dynamically by varying T within an interval [Tl, Th]. After an interval of length T_n, the next interval is doubled, T_{n+1} = 2 T_n. An upper bound is established at Th, so doubling stops at that limit. If a node hears an update then it resets T to Tl and begins again. This is analogous to the TCP window size in the exponential increase stage. Each interval that passes without an update results in a doubling of the prior interval length. Thus, as time passes and no new updates are heard, nodes will wake up less frequently.
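The two refinements above, drawing t from [T/2, T] and doubling T between Tl and Th, can be sketched as below. The function names are hypothetical and the values Tl = 1 and Th = 64 are illustrative assumptions, not parameters taken from our experiments.

```python
import random


def pick_transmit_time(T, rng=random):
    """Draw t from the second half of the interval, [T/2, T].

    The first half acts as a listen-only period, which is what bounds the
    broadcast rate under the "short listen" effect.
    """
    return rng.uniform(T / 2.0, T)


def next_interval(T, T_l, T_h, heard_update):
    """Reset to T_l when an update is heard; otherwise double, capped at T_h."""
    if heard_update:
        return T_l
    return min(2 * T, T_h)


def interval_schedule(T_l, T_h, n):
    """Lengths of the first n intervals when no update is ever heard."""
    schedule, T = [], T_l
    for _ in range(n):
        schedule.append(T)
        T = next_interval(T, T_l, T_h, heard_update=False)
    return schedule


print(interval_schedule(1, 64, 8))  # [1, 2, 4, 8, 16, 32, 64, 64]
```

With these illustrative values a quiet node reaches the maximum interval after six doublings and stays there until an update resets it to Tl.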

2.2 Sleepy Trickle Extension

Our extension to the Trickle algorithm makes additional assumptions about the network topology. In particular, we assume that there is some set of router nodes R that form a connected graph and some disjoint set of leaf nodes L. Nodes in R are expected to be high-power, and nodes in L can be low-power. This network model is intended to resemble a common network configuration, where low-power weak nodes are connected to reliable, high-power nodes. Our modification exploits these properties to allow the lower-power nodes in L to sleep, while tuning Trickle parameters to increase the transmit load on the nodes in R.

The first modification we made was to increase the transmit load on the nodes in R, and we achieved this by increasing the Trickle k parameter for each node in R. As described in 2.1, the k parameter determines how many messages are required to suppress a transmission at a node. By increasing this parameter at some subset of nodes, we increase the broadcast load at these nodes and increase the probability that they transmit. Thus, even in a lossy network, this helps to suppress transmissions of nodes not in R, that is, the low-power leaf nodes.

In addition to varying the k parameter between nodes in R and nodes in L, we also allowed nodes in L to enter a sleep state. Our algorithm for entering the sleep state is as follows: when a node in L has reached the maximum interval size Th and has not transmitted during the prior period, it is allowed to sleep for some interval Ts. Once waking from sleep, the node resumes normal operations with interval Th. Note that this sleep state is equivalent to a node simply disconnecting from the network for the sleep interval Ts. The node is then allowed to sleep again after not transmitting for an interval of length Th as before.

There are several interesting emergent properties of this modification. First, the normal Trickle algorithm runs on each of the nodes in R. Since these nodes are assumed to be normally connected, this ensures that updates are eventually propagated to every node in R, assuming disconnected nodes are eventually reconnected. A second interesting property is that the modified algorithm is equivalent to some subset of nodes in L disconnecting for time Ts in the standard Trickle implementation. Thus, many of the properties of standard Trickle are maintained in Sleepy Trickle, including at the sleepy nodes L. Finally, another interesting property is the long-term behavior of a disconnected set of nodes in L. Consider some subset Ld of L such that no node has received an update. Since no external transmissions can be heard, |Ld| − k nodes must remain awake for the next sleep period, as k must have transmitted. Note that this property holds regardless of sleep interval timing; thus, so long as one node in the |Ld| − k awake nodes is eventually reconnected, the update will spread to the disconnected group. Additionally, since the selection of awake nodes is randomized based on when the t timer fires, there is a non-zero probability of a particular node remaining awake, indicating that so long as some node in Ld is eventually updated, the rest of the nodes are also eventually updated.

Although this modification preserves many of the guarantees of the original Trickle algorithm, there are several properties of Sleepy Trickle that reduce its efficiency. First, the update latency is significantly increased. Not only does it take substantially longer to update nodes in L, but in complex multi-hop networks, there can be additional propagation speedup in standard Trickle by sending updates through nodes in L. Since these nodes may be asleep, the total number of hops to update the nodes in R may be larger. In addition to latency, transmit efficiency is impacted. Although Sleepy Trickle attempts to synchronize sleep periods for nodes, an intrinsic property of Sleepy Trickle is that fewer nodes receive the update at around the same time. Thus, groups of nodes may receive the update in batches, which increases the number of overall transmissions (as some transmissions would have been suppressed by other nodes that were simultaneously updated). Finally, the guarantees of Trickle are more difficult to ensure in Sleepy Trickle.
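The sleep rule for leaf nodes can be sketched as a small predicate, together with the radio duty cycle it implies for a leaf that never needs to transmit. The function names are hypothetical, and the duty-cycle formula assumes an idle steady state in which the leaf strictly alternates one awake interval of length Th with one sleep period of length Ts; the value Th = 16 in the example is an illustrative assumption.

```python
def leaf_may_sleep(T, T_h, transmitted_last_interval):
    """A leaf may sleep once it is at the maximum interval and stayed silent.

    T is the node's current interval length, T_h the maximum interval.
    """
    return T >= T_h and not transmitted_last_interval


def idle_awake_fraction(T_h, T_s):
    """Fraction of time the radio is on for a leaf that alternates one
    silent awake interval (T_h) with one sleep period (T_s)."""
    return T_h / (T_h + T_s)


# Illustrative values: T_h = 16 s (assumed) and T_s = 64 s.
can_sleep = leaf_may_sleep(T=16, T_h=16, transmitted_last_interval=False)
fraction = idle_awake_fraction(T_h=16, T_s=64)
```

Under these assumed values the leaf's radio would be on for one fifth of the time while the network is idle; a leaf that transmitted in its last interval stays awake for at least one more interval.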

3 IMPLEMENTATION

3.1 ZebraSim: Software Simulation

Our first contribution is the development of a software simulation environment for Trickle that we call ZebraSim (in reference to ZebraNet, a deployment of sensors to track zebra herds in Africa and an inspiration for the Trickle paper). The original Trickle paper discussed a software simulator used for validating the algorithm and parameter tuning (particularly for sparse network topologies). However, this simulator was not made available by the authors and, from the description in their paper, appears to be a fairly simple program that mathematically models the parameters.

ZebraSim is a flexible simulator that maintains state for each simulated node and provides flexibility for creating the node topology. Nodes can be placed randomly or in groups, with specified spacing constraints (groups or individuals can be close together to create a dense network or far apart for a sparse one that requires multiple hops).

ZebraSim exposes a basic interface for each node, namely a transmit function and a receive function. The receive function delivers a broadcast message to the node and invokes the node's processing for it, which consists of incrementing c and potentially setting up a message to be transmitted at the transmit time for the interval.

Each node can be configured with a k, Tl, Th, location, optional travel distance, and optional packet loss. If travel distance is greater than 0 then the node will move a random amount up to that distance in a random direction each interval (to model sensor networks that are geographically dynamic, for example, GPS tags for endangered animals). The packet loss parameter is the probability that a broadcast packet is not heard by the node (to introduce loss into the network).

Figure 1: Imix boards with Raspberry Pi controller.

We originally prototyped a multi-node simulator where nodes would be simulated with Docker containers and networked together; however, this introduced noise into network measurements due to inter-container networking delay and was less reliable than a formal simulation.
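To make the per-node interface described in this section concrete, the sketch below shows one way a simulated node with these configuration parameters could look. It is an assumption-laden illustration, not ZebraSim's actual code: the class name SimNode, its field names, and the handling of newer and older versions in receive are all hypothetical.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class SimNode:
    k: int                          # suppression threshold
    T_l: float                      # minimum interval
    T_h: float                      # maximum interval
    x: float = 0.0                  # location
    y: float = 0.0
    travel_distance: float = 0.0    # max movement per interval (0 = stationary)
    packet_loss: float = 0.0        # probability a broadcast is not heard
    version: int = 0
    c: int = 0                      # consistent messages heard this interval
    pending_broadcast: bool = False
    rng: random.Random = field(default_factory=random.Random)

    def receive(self, version):
        """Deliver a broadcast to this node; return True if it was heard."""
        if self.rng.random() < self.packet_loss:
            return False                       # packet dropped
        if version == self.version:
            self.c += 1                        # consistent message
        elif version > self.version:
            self.version = version             # adopt the newer version
            self.pending_broadcast = True
        else:
            self.pending_broadcast = True      # sender is behind; re-advertise
        return True

    def transmit(self):
        """Return the version to broadcast and clear the pending flag."""
        self.pending_broadcast = False
        return self.version

    def move(self):
        """Move up to travel_distance in a uniformly random direction."""
        if self.travel_distance <= 0:
            return
        r = self.rng.uniform(0.0, self.travel_distance)
        theta = self.rng.uniform(0.0, 2.0 * math.pi)
        self.x += r * math.cos(theta)
        self.y += r * math.sin(theta)
```

A simulation loop would call move once per interval and call receive on every node within range of a transmitter; range checks and interval timers are omitted here to keep the sketch short.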

3.2 Hardware Testbed

For our hardware testbed, we implemented Trickle and Sleepy Trickle on the Imix hardware platform running the Tock operating system [1]. Tock is an embedded operating system written in Rust and designed for low-power embedded devices, and represents a feature-rich test bed for low-power wireless research [2]. Only one modification was made to Tock itself, which involved adding back functionality for the hardware random number generator required by Trickle. The Imix board contains 64 kB of RAM, a 40 MHz Cortex-M processor, and a 2.4 GHz IEEE 802.15.4 radio. This platform is an accurate representation of the kind of low-power hardware Trickle was designed to run on, and captures the resource-constrained environment that both Trickle and Sleepy Trickle should run in. We used a total of 13 Imix boards to conduct our measurements, and connected GPIO pins to a Raspberry Pi to get accurate propagation measurements from a consistent internal clock, as seen in Figure 1.

4 RESULTS

In this paper, we set out to examine two ideas. First, we were interested in the performance of Trickle under different network configurations, focusing on the claim that for dense networks, large values should be selected for k and Tl. Second, we were interested in the performance of Sleepy Trickle and how it compares to Trickle with regard to update latency and total transmission counts for low-power nodes. We highlight our results in the following subsections. First, in 4.1 we examine the behavior of Trickle with varying k and Tl values in a dense, single-hop network. In 4.2 we then test how the optimal k and Tl values perform in a multi-hop network environment. Finally, section 4.3 compares the performance of Sleepy Trickle to Trickle with regard to increased update latency and total transmission counts. We consider Sleepy Trickle a viable modification if it results in a moderate increase in update latency and a substantial decrease in total transmission counts for the low-power leaf nodes. The results of our tests indicate that Sleepy Trickle meets both of these criteria.

4.1 Dense Single-Hop Network

Our first claim was that Trickle performs best in dense networks with a high k value and a high Tl value. To measure performance, we measured the total time it took for an update to propagate through a network of connected nodes and the total transmission counts for the nodes.

We first tested this hypothesis on the hardware test bed. Using 13 total nodes, we used the configuration described in section 3.2 to measure the propagation delay and packet counts. For the single-hop network, we measured the latency for four different k and Tl value pairs as shown in Figure 2. We can see that the first test, for k = 1, Tl = 1s, had significant latency. We hypothesize that this is due to the hidden terminal problem and possible interference, delaying all nodes from immediately hearing the update. Once k is increased to 2, however, the delay drops dramatically; this makes sense, as each node must hear two transmissions to remain silent.

Figure 2: Propagation time for single-hop hardware nodes.

Figure 3: Propagation time for multi-hop vs single-hop.

In general, the tests seem to confirm the pattern that as k increases, the latency drops, while as Tl increases, the delay likewise increases. Increasing both values seems to minimize propagation delay while allowing for a quick increase in interval size. One potential problem in a dense single-hop network with a high k and a high Tl value is that the k value is approximately the total number of transmissions in an interval. This means that increases in k also increase the energy cost in a network, which can negatively impact the lifetime of battery-powered nodes.

In our software simulation, we found similar patterns, although with a higher k value and higher Tl we had faster convergence, which we attribute to no loss and a dense grouping that took only a few dozen intervals to fully propagate.
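The observation that k approximates the number of transmissions per interval can be checked with a small idealized model. The sketch below assumes a lossless single-hop network in which all nodes hold the same version and start their intervals at the same instant; real deployments are neither synchronized nor lossless, so this only illustrates the best case. The function name is hypothetical.

```python
import random


def transmissions_in_interval(n, k, T, rng):
    """Count broadcasts in one interval of an idealized single-hop network.

    Each of the n nodes picks t uniformly from [T/2, T]. Timers fire in
    increasing order of t, and a node stays silent if k broadcasts have
    already been sent (and therefore heard) when its timer fires.
    """
    times = sorted(rng.uniform(T / 2.0, T) for _ in range(n))
    sent = 0
    for _ in times:
        if sent < k:   # fewer than k heard so far, so this node transmits
            sent += 1
    return sent


rng = random.Random(42)
counts = {k: transmissions_in_interval(13, k, 1.0, rng) for k in (1, 2, 4)}
```

Under these assumptions the count is exactly min(n, k) regardless of the random draws, so raising k from 1 to 4 in a 13-node network quadruples the steady-state transmit cost per interval.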

4.2 Multi-Hop Network

Although a high k and a high Tl value work well in dense single-hop networks, we wanted to examine how they performed in multi-hop networks. Since wireless network topologies can change dramatically over the course of a deployment, ensuring that this configuration also works for sparse multi-hop networks is crucial. To this end, we measured the end-to-end update propagation latency in ZebraSim and on the hardware testbed.

For the hardware testbed, we implemented the multi-hop network on the testbed described in section 3.2. We used 13 nodes, each of which could only communicate with nodes whose MAC addresses differed by +/−1. This ensured worst-case performance, as each node could only receive updates from its immediate neighbor, creating an update chain 13 nodes long. We then measured the end-to-end update time for k = 2, Tl = 1s and k = 4, Tl = 10s as shown in Figure 3. As the graphs show, there is a huge latency penalty for a large Tl in the multi-hop network. Because a node must remain silent for the first half of an interval, a node that receives an update with a large Tl value remains silent for a significant amount of time. In our configuration, this creates substantial latency as each hop introduces a multi-second delay.

Although the high k and high Tl parameters perform well in dense single-hop networks, they perform particularly badly in large multi-hop networks. The multi-hop network configuration tested here is most likely not indicative of real-world wireless networks, but it still demonstrates the behavior of Trickle under such conditions.
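The per-hop delay argument can be turned into rough bounds on end-to-end latency for a chain. The sketch below is an idealized model and not a reproduction of the measurements in Figure 3: it assumes no packet loss, that each node resets its interval to Tl on receiving the update and forwards it exactly once at a time t drawn uniformly from [Tl/2, Tl], and that the originating node also waits for its own timer. The function name is hypothetical.

```python
def chain_latency_bounds(num_nodes, T_l):
    """Return (minimum, expected, maximum) end-to-end latency for a chain.

    A chain of num_nodes nodes needs num_nodes - 1 forwarding transmissions.
    Each one is delayed by t in [T_l/2, T_l], with mean 3*T_l/4.
    """
    hops = num_nodes - 1
    return (hops * T_l / 2.0, hops * 0.75 * T_l, hops * T_l)


for T_l in (1.0, 10.0):
    lo, mean, hi = chain_latency_bounds(13, T_l)
    print(f"Tl = {T_l:>4} s: min {lo} s, expected {mean} s, max {hi} s")
```

Under these assumptions, going from Tl = 1s to Tl = 10s scales every bound by a factor of ten, which matches the direction of the penalty observed on the testbed.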

4.3 Sleepy Trickle

The final claim we tested was that our Sleepy Trickle modification decreases packet transmissions for leaf nodes while marginally increasing overall update latency. We tested this on the hardware testbed, measuring per-node packet transmissions while in steady state and while sending an update. We also measured the total update latency for all nodes, and compared these results to the standard Trickle algorithm in a dense single-hop network. We used a total of 13 nodes, 2 of which were the router nodes (nodes 0 and 1), while the remainder were sleepy nodes (nodes 2-12). We set the k value for the router nodes (denoted kr) to 4, while for the sleepy nodes we set ks = 1. For the reference Trickle version, we set k = 4, and for all nodes we set Tl = 1s. We first let both Trickle and Sleepy Trickle stabilize, then measured 30 minutes of stable transmissions before initiating an update and measuring the total packet counts.

Figure 4: Normal Trickle transmission counts for k = 4, Tl = 1s.

Figure 5: Sleepy Trickle transmission counts for kr = 4, ks = 1, Tl = 1s.

As Figures 4 and 5 show, the transmissions for Sleepy Trickle are dominated by the router nodes, and the net transmit count is much lower (59 transmissions) compared to standard Trickle (108 transmissions), a reduction of about 45%. For latency measurements, Figure 6 shows the difference in completion time for normal Trickle and Sleepy Trickle, with parameters k = 2, Tl = 1s and kr = 4, ks = 1, Tl = 1s respectively. Although Sleepy Trickle clearly increases the propagation time, we note that this is directly related to the amount of time nodes are asleep (in this case, Ts = 64s). With these parameters, the increase in propagation time and the dramatic decrease in transmissions by leaf nodes indicate that this modification is a viable extension of the Trickle protocol, and would enable more efficient deployment within specific network topologies.

Figure 6: Trickle (k = 2, Tl = 1s) vs Sleepy Trickle (kr = 4, ks = 1, Tl = 1s) latency.

5 CONCLUSIONS

With growing interest in Internet of Things embedded devices has come a resurgence of interest in low-cost distributed protocols for information propagation. Trickle achieves the goal of fast code propagation through a lossy wireless sensor network while conserving power resources.

We further validate Trickle's performance, first with a high-fidelity simulator, and second with a port to the Tock embedded operating system platform. We extend the original algorithm with Sleepy Trickle, which takes concepts from traditional distributed networks (routing nodes) and applies them to the sensor network to achieve substantially lower transmission costs, at the expense of greater propagation delay.

REFERENCES

[1] github.com/lanhamt/sleepyzebra
[2] www.tockos.org/
[3] Levis, Philip, et al. "Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks." Proc. of the 1st USENIX/ACM Symp. on Networked Systems Design and Implementation. 2004.
[4] Hui, Jonathan W., and David Culler. "The dynamic behavior of a data dissemination protocol for network programming at scale." Proceedings of the 2nd international conference on Embedded networked sensor systems. ACM, 2004.
