Design and evaluation of a fault-tolerant mobile-agent system

Authors: Lyu, M.R., Xinyu Chen, and Tsz Yeung Wong
From: Intelligent Systems, IEEE [see also IEEE Intelligent Systems and Their Applications], vol. 19, issue 5, Sep-Oct 2004
Date: 2007/5/24
Teacher: Jong-Shin Chen
Student number: 9530618
Name: 施宏達

Dec 20, 2015




Outline

Introduction

System architecture and protocol design

Agent failure detection and recovery

Failures of witness agents and the recovery strategy

Simplifying the witnessing dependency

An example model

Experimental results

Conclusion


Introduction

Mobile agents create a new paradigm for data exchange and resource sharing in rapidly growing and continually changing computer networks.

Because servers, networks, and agents in such networks can fail, survivability and fault tolerance are vital issues for deploying mobile-agent systems.


System architecture and protocol design (1/5)

The agent server should provide three types of stable storage: for logs, checkpoints, and messages.

Each agent carries its own internal data, which is lost along with the agent if a failure strikes.

If the agent restarts its computation from the starting point of its itinerary, it violates the exactly-once property, because servers it has already visited would execute its work a second time.


System architecture and protocol design (2/5)

An actual agent is a common mobile agent that performs specific computations for its owner.

Witness agents monitor the actual agent and detect whether it's lost.


System architecture and protocol design (3/5)

Steps of the actual agent at server S_i:

(1) Log entry log_i^arrive.
(2) Send message msg_i^arrive to server S_{i-1}.
(3) After computation, checkpoint the data.
(4) Log entry log_i^leave.
(5) Send message msg_i^leave to server S_{i-1}.
(6) Spawn a witness agent.
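The six steps of the actual agent at a server can be sketched in Python as follows. This is only an illustrative sketch: the `Server` class, its field names, and the `visit` function are hypothetical and do not come from the paper, and storing the checkpoint at the server where it was taken is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical agent server with the three kinds of stable storage."""
    index: int
    log: list = field(default_factory=list)          # log entries
    checkpoints: dict = field(default_factory=dict)  # checkpointed agent data
    inbox: list = field(default_factory=list)        # messages received
    witnesses: list = field(default_factory=list)    # witness agents hosted here


def visit(servers, i, agent_data, compute):
    """Run steps (1)-(6) for the actual agent arriving at server S_i."""
    here = servers[i]
    prev = servers[i - 1] if i > 0 else None

    here.log.append(("arrive", i))            # (1) log entry log_i^arrive
    if prev is not None:
        prev.inbox.append(("arrive", i))      # (2) msg_i^arrive to S_{i-1}
    agent_data = compute(agent_data)          # the agent's own work at S_i
    here.checkpoints[i] = agent_data          # (3) checkpoint the data (assumed: kept at S_i)
    here.log.append(("leave", i))             # (4) log entry log_i^leave
    if prev is not None:
        prev.inbox.append(("leave", i))       # (5) msg_i^leave to S_{i-1}
    here.witnesses.append(f"w{i}")            # (6) spawn a witness agent
    return agent_data
```

Calling `visit` once per server along the itinerary leaves one arrive entry and one leave entry in each server's log, which is what a later probe would inspect.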


System architecture and protocol design (4/5)

Witness agent w_{i-1} is more passive than the actual agent in this protocol.

Two messages are expected: msg_i^arrive and msg_i^leave.

After receiving these two messages, w_{i-1} waits for the direct heartbeat message, msg_i^alive, which the witness agent at server S_i sends.
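A minimal sketch of the witness agent's waiting logic is given below. The function name, the way time is measured, and the exact timeout semantics are assumptions made for illustration; the slides only state that two messages are expected and that a heartbeat follows.

```python
def witness_status(received, arrive_bound, leave_bound):
    """Decide what witness w_{i-1} should do next.

    `received` maps a message kind ("arrive" or "leave") to the time at
    which it was received, measured from the moment the actual agent
    left S_{i-1}. Both bounds are hypothetical timeout parameters.
    """
    t_arrive = received.get("arrive")
    if t_arrive is None or t_arrive > arrive_bound:
        return "probe: msg_arrive missing"
    t_leave = received.get("leave")
    if t_leave is None or t_leave - t_arrive > leave_bound:
        return "probe: msg_leave missing"
    return "wait for heartbeat msg_alive"
```

A late message is treated the same as a lost one here, since the witness cannot tell the two cases apart when its timeout expires.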


System architecture and protocol design (5/5)


Agent failure detection and recovery (1/3)

w_{i-1} fails to receive msg_i^arrive:

1. The message is lost due to an unreliable network.
2. The message arrives after the timeout period of w_{i-1}.
3. Actual agent α gets lost when it's ready to leave S_{i-1} and is heading for S_i.
4. Actual agent α gets lost when it arrives at S_i without logging.
5. Actual agent α gets lost when it arrives at S_i with logging.


Agent failure detection and recovery (2/3)

(1) The witness agent spawns a probe, which travels to S_i.
(2) The probe carries the checkpointed data.
(3) The probe inspects the log in S_i.
(4) If log_i^arrive is found, the probe retransmits msg_i^arrive to S_{i-1}.
(5) If not, it recovers the agent from the checkpointed data.
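The probe's decision in steps (3) to (5) can be sketched as a single function. The function name, the tuple encoding of log entries, and the returned action labels are hypothetical and chosen only for illustration.

```python
def probe(target_log, i, checkpoint):
    """Return the action a probe takes after inspecting the log at S_i.

    `target_log` is the list of log entries found at S_i, encoded here
    as (kind, index) tuples; `checkpoint` is the data the probe carries.
    """
    if ("arrive", i) in target_log:
        # The agent did arrive; only the message was lost or late.
        return ("retransmit", ("arrive", i))
    # No arrival entry: the agent was lost, so restore it from the checkpoint.
    return ("recover", checkpoint)
```

Checking the log before recovering is what keeps the agent from being duplicated when only the message, not the agent, went missing.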


Agent failure detection and recovery (3/3)

w_{i-1} fails to receive msg_i^leave:

1. The message is lost due to an unreliable network.
2. The message arrives after the timeout period of w_{i-1}.
3. Actual agent α gets lost just after sending message msg_i^arrive.
4. Actual agent α gets lost just after logging entry log_i^leave.
5. Actual agent α gets lost after spawning witness agent w_i.


Failures of witness agents and the recovery strategy (1/2)

w_0 → w_1 → w_2 → … → w_{i-1} → w_i → α

Failures of witness agents:

1. The network is congested or unreliable.
2. The system load of S_i is too high.
3. Witness agent w_i was not created or is lost.


Failures of witness agents and the recovery strategy (2/2)

(1) A failure strikes S_{i-1}, and the witnessing dependency is broken.
(2) A failure strikes S_i, and the actual agent is terminated.
(3) The witness agent at S_{i-2} recovers the witness agent at S_{i-1}.
(4) The witness agent at S_{i-1} recovers the actual agent at S_i.
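The cascading recovery in this scenario can be sketched as a walk along the witnessing dependency, where each live agent recovers its failed successor. The list encoding and the function name are assumptions for illustration only.

```python
def recover_chain(alive):
    """Return (recoverer, recovered) position pairs in recovery order.

    `alive[j]` is True if the agent at position j of the witnessing
    dependency is alive; the last position is the actual agent.
    """
    state = list(alive)
    actions = []
    for j in range(1, len(state)):
        # A live predecessor notices the failure and recovers position j.
        if not state[j] and state[j - 1]:
            actions.append((j - 1, j))
            state[j] = True
    return actions
```

For the scenario above, with positions 0, 1, 2 standing for the agents at S_{i-2}, S_{i-1}, and S_i, `recover_chain([True, False, False])` yields `[(0, 1), (1, 2)]`: the witness at S_{i-2} first recovers the witness at S_{i-1}, which then recovers the actual agent.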


Simplifying the witnessing dependency (1/2)

The actual agent creates witness agents along its itinerary, and the witness agents exchange heartbeat messages.

These procedures consume considerable resources.

If we assume that no more than k servers can fail at the same time, we can simplify our mechanism by shortening the witnessing dependency.


Simplifying the witnessing dependency (2/2)

If i ≤ k: w_0 → w_1 → … → w_{i-1} → α
Else: w_{i-k} → w_{i-k+1} → … → w_{i-1} → α

Finally, when α successfully logs entry log_{i+1}^arrive, the system can terminate w_{i-k} by sending message msg_{i+1}^kill from S_{i+1} to S_{i-k}.
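The shortened witnessing dependency can be expressed as a small index computation. Both function names are hypothetical, and the condition under which a kill message is sent (only once a witness w_{i-k} exists, that is, when i ≥ k) is an inference from the two cases above rather than something the slides state.

```python
def witness_chain(i, k):
    """Indices of the witnesses that monitor α while it is at S_i."""
    first = 0 if i <= k else i - k
    return list(range(first, i))


def witness_to_kill(i, k):
    """Index of the witness terminated once α logs its arrival at S_{i+1}.

    Returns None while the chain is still shorter than k (assumed).
    """
    return i - k if i >= k else None
```

For example, with k = 3 the agent at S_5 is monitored by `witness_chain(5, 3)`, which is `[2, 3, 4]`, and `witness_to_kill(5, 3)` gives `2`, so the chain at S_6 becomes w_3, w_4, w_5 and never grows beyond k witnesses.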


An example model


Experimental results

Network transmission rate: 100 for agents, 200 for messages

Server repair rate, t_s_r: 0.1

All message log rates: 100

Arrival, leave, and heartbeat message bound times: 1, 100, and 20, respectively

Heartbeat interval: 5

Experimental results (agent survivability)

(a) server failure rate is 0.001, job completion rate is 0.01
(b) server failure rate is 0.005, job completion rate is 0.01
(c) server failure rate is 0.005, job completion rate is 0.05

Experimental results (witness agents)

(a) server failure rate is 0.001, job completion rate is 0.01
(b) server failure rate is 0.005, job completion rate is 0.01
(c) server failure rate is 0.005, job completion rate is 0.05

Experimental results (probes)

(a) server failure rate is 0.001, job completion rate is 0.01
(b) server failure rate is 0.005, job completion rate is 0.01
(c) server failure rate is 0.005, job completion rate is 0.05


Conclusion

This fault-tolerant agent recovery approach improves agent survivability in failure-prone mobile-agent systems.

Thus, it can help create a more reliable agent deployment environment.