
International Journal of Computer Science & Information Security

© IJCSIS PUBLICATION 2009

IJCSIS Vol. 3, No. 1, July 2009 ISSN 1947-5500


Editorial Message from Managing Editor

The Editorial Board is pleased to present the third volume of the International Journal of Computer Science and Information Security (IJCSIS Vol. 3, No. 1, July 2009). This volume presents high-quality research work in core computer science and applications, information and communication security, mobile and wireless networking, and other computing technologies. We thank the researchers and professionals who have submitted and published their research papers in this issue. The IJCSIS technical committee for the July 2009 issue has maintained a rigorous standard, with a 35% paper acceptance rate after a meticulous peer-review process. We hope that you will find the research results useful for your future work or as publication references.

Available at http://sites.google.com/site/ijcsis/

IJCSIS Vol. 3, No. 1, 31 July 2009.

ISSN 1947-5500

© IJCSIS, USA


IJCSIS EDITORIAL BOARD

Gregorio Martinez Perez, Associate Professor (Profesor Titular de Universidad), University of Murcia (UMU), Spain
Dr. Sanjay Jasola, Professor and Dean, School of Information and Communication Technology, Gautam Buddha University, Greater NOIDA (Gautam Buddha Nagar)
Dr. Riktesh Srivastava, Assistant Professor, Information Systems, Skyline University College, University City of Sharjah, Sharjah, PO 1797, UAE
Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia
Professor (Dr.) Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada


IJCSIS REVIEWERS’ LIST

1. Dr. Lam Hong Lee, Universiti Tunku Abdul Rahman, Malaysia

2. Assoc. Prof. N. Jaisankar, VIT University, Vellore, Tamil Nadu, India

3. Dr. Amogh Kavimandan, The Mathworks Inc., USA

4. Dr. Ramasamy Mariappan, Vinayaka Missions University, India

5. Dr. Neeraj Kumar, SMVD University, Katra (J&K), India

6. Dr. Junjie Peng, Shanghai University, P. R. China

7. Dr. Ilhem LENGLIZ, HANA Group - CRISTAL Laboratory, Tunisia

8. Prof. Dr. Durgesh Kumar Mishra, Acropolis Institute of Technology and Research, Indore, MP, India

9. Prof. Dr. C. Suresh Gnana Dhas, Anna University, India

10. Prof. Pijush Biswas, RCC Institute of Information Technology, India

11. Dr. A. Arul Lawrence, Royal College of Engineering & Technology, India

12. Mr. Wongyos Keardsri, Chulalongkorn University, Bangkok, Thailand

13. Mr. Somesh Kumar Dewangan, CSVTU Bhilai (C.G.) / Dimat Raipur, India

14. Mr. Hayder N. Jasem, University Putra Malaysia, Malaysia

15. Mr. A. V. Senthil Kumar, C. M. S. College of Science and Commerce, India

16. Mr. R. S. Karthik, C. M. S. College of Science and Commerce, India

17. Mr. P. Vasant, University Technology Petronas, Malaysia

18. Mr. Wong Kok Seng, Soongsil University, Seoul, South Korea

19. Mr. Praveen Ranjan Srivastava, BITS PILANI, India

20. Mr. Kong Sang Kelvin, The Hong Kong Polytechnic University, Hong Kong

21. Mr. Mohd Nazri Ismail, Universiti Kuala Lumpur, Malaysia

22. Dr. Rami J. Matarneh, Al-isra Private University, Amman, Jordan

23. Dr. Ojesanmi Olusegun Ayodeji, Ajayi Crowther University, Oyo, Nigeria

24. Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia

25. Dr. Riktesh Srivastava, Skyline University, UAE

26. Dr. Oras F. Baker, UCSI University - Kuala Lumpur, Malaysia

27. Dr. Ahmed S. Ghiduk, Faculty of Science, Beni-Suef University, Egypt, and Department of Computer Science, Taif University, Saudi Arabia

28. Assist. Prof. Tirthankar Gayen, CIT, West Bengal University of Technology, India

29. Ms. Huei-Ru Tseng, National Chiao Tung University, Taiwan

30. Prof. Ning Xu, Wuhan University of Technology, China

31. Mr. Mohammed Salem Binwahlan, Hadhramout University of Science and Technology, Yemen, and Universiti Teknologi Malaysia, Malaysia

32. Dr. Aruna Ranganath, Bhoj Reddy Engineering College for Women, India

33. Mr. Hafeezullah Amin, Institute of Information Technology, KUST, Kohat, Pakistan


34. Prof. Syed S. Rizvi, University of Bridgeport, USA

35. Mr. Shahbaz Pervez Chattha, University of Engineering and Technology Taxila, Pakistan

36. Dr. Shishir Kumar, Jaypee University of Information Technology, Wakanaghat (HP), India

37. Mr. Shahid Mumtaz, Portugal Telecommunication, Instituto de Telecomunicações (IT), Aveiro

38. Mr. Rajesh K. Shukla, Corporate Institute of Science & Technology, Bhopal, MP, India

39. Dr. Poonam Garg, Institute of Management Technology, India

40. Mr. S. Mehta, Inha University, Korea

41. Mr. Dilip Kumar S.M., University Visvesvaraya College of Engineering (UVCE), Bangalore University

42. Prof. Malik Sikander Hayat Khiyal, Fatima Jinnah Women University, Rawalpindi, Pakistan

43. Dr. Virendra Gomase, Department of Bioinformatics, Padmashree Dr. D.Y. Patil University

44. Dr. Irraivan Elamvazuthi, University Technology PETRONAS, Malaysia

45. Mr. Saqib Saeed, University of Siegen, Germany

46. Mr. Pavan Kumar Gorakavi, IPMA-USA [YC]

47. Dr. Ahmed Nabih Zaki Rashed, Menoufia University, Egypt

48. Prof. Shishir K. Shandilya, Rukmani Devi Institute of Science & Technology, India

49. Mrs. J. Komala Lakshmi, SNR Sons College, Computer Science, India

50. Mr. Muhammad Sohail, KUST, Pakistan

51. Dr. Manjaiah D.H, Mangalore University, India

52. Dr. S Santhosh Baboo, D.G.Vaishnav College, Chennai, India

53. Assist. Prof. Sugam Sharma, NIET, India / Iowa State University, USA

54. Jorge L. Hernández-Ardieta, University Carlos III of Madrid, Spain

55. Prof. Dr. Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada

56. Dr. Deepak Laxmi Narasimha, VIT University, India

57. Prof. Dr. Arunkumar Thangavelu, Vellore Institute Of Technology, India

58. Mr. M. Azath, Anna University, India

59. Mr. Md. Rabiul Islam, Rajshahi University of Engineering & Technology (RUET), Bangladesh

60. Dr. Shimon K. Modi, Director of Research BSPA Labs, Purdue University, USA

61. Mr. Aos Alaa Zaidan Ansaef, Multimedia University, Malaysia

62. Dr. Suresh Jain, Professor (on leave), Institute of Engineering & Technology, Devi Ahilya University, Indore (MP), India

63. Mr. Mohammed M. Kadhum, Universiti Utara Malaysia

64. Mr. Hanumanthappa. J. , University of Mysore, India

65. Mr. Syed Ishtiaque Ahmed, Bangladesh University of Engineering and Technology (BUET)

66. Mr. Akinola Solomon Olalekan, University of Ibadan, Ibadan, Nigeria

67. Mr. Santosh K. Pandey, Department of Information Technology, The Institute of Chartered Accountants of India


TABLE OF CONTENTS

1. New Protocol for QoS Routing in Mobile Ad-Hoc Networks
Shahid Mumtaz and Atílio Gameiro, Institute of Telecommunication, Aveiro, Portugal

2. Applicability of a Novel Integer Programming Model for Wireless Sensor Networks
Alexei Barbosa de Aguiar, Alvaro de M. S. Neto, Placido Rogerio Pinheiro, Andre L. V. Coelho, Graduate Program in Applied Informatics, University of Fortaleza, Brasil

3. Recent Applications of Optical Parametric Amplifiers in Hybrid WDM/TDM Local Area Optical Networks
Abd El Naser A. Mohamed, Mohamed M. E. El-Halawany, Ahmed Nabih Zaki Rashed and Mahmoud M. A. Eid, Electronics and Electrical Communication Engineering Department, Faculty of Electronic Engineering, Menouf 32951, Menoufia University, Egypt

4. Deterministic Formulization of SNR for Wireless Multiuser DS-CDMA Networks
Syed S. Rizvi and Khaled M. Elleithy, Computer Science and Engineering Department, University of Bridgeport, Bridgeport; Aasia Riasat, Department of Computer Science, Institute of Business Management, Karachi, Pakistan 78100

5. A Multidimensional Approach for Context-Aware Recommendation in Mobile Commerce
Maryam Hosseini-Pozveh, Mohamadali Nematbakhsh, Naser Movahhedinia, Department of Computer Engineering, University of Isfahan, Isfahan, Iran

6. Efficient Web Log Mining using Doubly Linked Tree
Dr. R. S. Kasana and Ratnesh Kumar Jain, Department of Computer Science & Applications, Dr. H. S. Gour University, Sagar, MP, India; Dr. Suresh Jain, Department of Computer Engineering, Institute of Engineering & Technology, Devi Ahilya University, Indore, MP, India

7. A New Scheme for Minimizing Malicious Behavior of Mobile Nodes in Mobile Ad Hoc Networks
Syed S. Rizvi and Khaled M. Elleithy, Computer Science and Engineering Department, University of Bridgeport, Bridgeport, CT, USA

8. IPv6 and IPv4 Threat Reviews with Automatic Tunneling and Configuration Tunneling Considerations Transitional Model: A Case Study for University of Mysore Network
Hanumanthappa J., DoS in Computer Science, University of Mysore, Manasagangothri, Mysore, India; Dr. Manjaiah D. H., Dept. of Computer Science, Mangalore University, Mangalagangothri

9. Performance Evaluation of Mesh based Multicast Reactive Routing Protocol under Black Hole Attack
E. A. Mary Anita, Research Scholar, Anna University, Chennai; V. Vasudevan, Senior Professor and Head / IT, A. K. College of Engineering, Virudunagar, India

10. Novel Framework for Hidden Data in the Image Page within Executable File Using Computation between Advanced Encryption Standard and Distortion Techniques
A. W. Naji, Shihab A. Hameed, B. B. Zaidan, Wajdi F. Al-Khateeb, Othman O. Khalifa, A. A. Zaidan and Teddy S. Gunawan, Department of Electrical and Computer Engineering, Faculty of Engineering, International Islamic University Malaysia

11. A Secure Multi-Party Computation Protocol for Malicious Computation Prevention for Preserving Privacy during Data Mining
Dr. Durgesh Kumar Mishra, Neha Koria, Nikhil Kapoor, Ravish Bahety, Acropolis Institute of Technology & Research, Indore, MP, India


12. Efficient Methodology for Implementation of Encrypted File System in User Space
Dr. Shishir Kumar, U. S. Rawat, Sameer Kumar Jasra, Akshay Kumar Jain, Department of CSE, Jaypee Institute of Engg. & Technology, Guna (M.P.), India

13. A New Approach to Services Differentiation between Mobile Terminals of a Wireless LAN
Maher Ben Jemaa, Maryam Kallel Zouari, Bachar Zouari, ReDCAD Research Unit, National School of Engineering of Sfax, BP 1173-3038, Sfax, Tunisia

14. Authentication Without Identification using Anonymous Credential System
Dr. A. Damodaram, UGC-ASC, JNTUH, Hyderabad; H. Jayasri, ATRI, Hyderabad, India

15. Transmission Performance Analysis of Digital Wire and Wireless Optical Links in Local and Wide Areas Optical Networks
Abd El-Naser A. Mohamed, Mohamed M. E. El-Halawany, Ahmed Nabih Zaki Rashed and Amina E. M. El-Nabawy, Electronics and Electrical Communication Engineering Department, Faculty of Electronic Engineering, Menouf 32951, Menoufia University, Egypt

16. Automatic Local Gabor Features Extraction for Face Recognition
Yousra Ben Jemaa, Sana Khanfir, National Engineering School of Sfax, Tunisia

17. Intelligent Advisory System for Supporting University Managers in Law
Atta E. E. Elalfi, Department of Computer, Faculty of Specific Education, Mansoura University, Egypt, and College of Computers and Information Systems, Taif University, Taif, Saudi Arabia; M. E. ElAlami, Dept. of Computer Science, Faculty of Specific Education, Mansoura University, Egypt

18. A Hop-by-Hop Congestion-Aware Routing Protocol for Heterogeneous Mobile Ad-hoc Networks
S. Santhosh Baboo, PG & Research Department of Computer Application, D.G. Vaishnav College, Arumbakkam, Chennai, India; B. Narasimhan, Department of BCA, K.G. College of Arts & Science, Saravanam Patti, Coimbatore-35, India

19. Enhanced Algorithm for Link to System Level Interface Mapping
Shahid Mumtaz, Atílio Gameiro, Rasool Sadeghi, Institute of Telecommunication, Aveiro, Portugal

20. FPGA-based Controller for a Mobile Robot
Shilpa Kale, Department of Electronics & Telecommunication Engineering, Nagpur University, Nagpur, Maharashtra; Mr. S. S. Shriramwar, Dept. of Electronics & Telecommunication Engg., Nagpur 440025, India

21. Topological Design of Minimum Cost Survivable Computer Communication Networks: Bipartite Graph Method
Kamalesh V. N., Department of Computer Science and Engineering, Sathyabama University, Chennai, India; S. K. Srivatsa, St. Joseph College of Engineering, Chennai, India

22. Approach to Solving Cybercrime and Cybersecurity
Azeez Nureni Ayofe, Department of Maths and Computer Science, Fountain University, Osogbo, Nigeria; Osunade Oluwaseyifunmitan, Department of Computer Science, University of Ibadan, Nigeria

23. Knowledge Elicitation for Factors Affecting Taskforce Productivity using a Questionnaire
Muhammad Sohail, Institute of Information Technology, Kohat University of Science & Technology (KUST), Kohat, Pakistan; Abdur Rashid Khan, Institute of Computing & Information Technology, Gomal University, Dera Ismail Khan, Pakistan


24. A Novel Generic Session Based Bit Level Encryption Technique to Enhance Information Security
Manas Paul, Tanmay Bhattacharya, Suvajit Pal, Ranit Saha, Department of Information Technology, JIS College of Engineering, Kalyani, West Bengal

25. An Application of Bayesian Classification to Interval Encoded Temporal Mining with Prioritized Items
C. Balasubramanian, Dr. K. Duraiswamy, Department of Computer Science and Engineering, K.S. Rangasamy College of Technology, Namakkal Dt., Tamilnadu, India

26. A Proposed Algorithm to Improve Security & Efficiency of SSL-TLS Servers using Batch RSA Decryption
R. K. Pateriya, J. L. Rana, S. C. Shrivastava, Jaideep Patel, Faculty of Computer Science & IT Dept., MANIT, Bhopal 462051, India

27. Log Management Support for Recovery in Mobile Computing Environment
J. C. Miraclin Joyce Pamila, CSE & IT Dept., Government College of Technology, Coimbatore, India; K. Thanushkodi, Akshaya College of Engineering and Technology, Coimbatore, India

28. Complete Security Framework for Wireless Sensor Networks
Kalpana Sharma, M. K. Ghose, Kuldeep, Sikkim Manipal Institute of Technology, Majitar 737136, Sikkim, India

29. Dynamic Bandwidth Management in Distributed VoD based on the User Class Using Agents
H. S. Guruprasad, Research Scholar, Dr. MGR University, and Asst. Prof. & HOD, Dept. of ISE, BMSCE, Bangalore; Dr. H. D. Maheshappa, Director, East Point Group of Institutions, Bangalore

30. Modeling Reaction-Diffusion of Molecules on Surface and in Volume Spaces with the E-Cell System
Satya Nanda Vel Arjunan, Masaru Tomita, Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, Japan; Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan; Department of Environment and Information, Keio University, Fujisawa, Japan

31. Modelling of IEEE 802.11e EDCA: Presentation and Application in an Admission Control Algorithm
Mohamad El Masri, Guy Juanole and Slim Abdellatif, CNRS; LAAS; 7 avenue du colonel Roche, F-31077 Toulouse, France; Université de Toulouse; UPS, INSA, INP, ISAE; LAAS; F-31077 Toulouse, France

32. Training Process Reduction Based On Potential Weights Linear Analysis To Accelerate Back Propagation Network
Roya Asadi, Norwati Mustapha, Nasir Sulaiman, Faculty of Computer Science and Information Technology, University Putra Malaysia, 43400 Serdang, Selangor, Malaysia



New Protocol for QoS Routing in Mobile Ad-Hoc Networks

Shahid Mumtaz
Institute of Telecommunication, Aveiro, Portugal
[email protected]

Atílio Gameiro
Institute of Telecommunication, Aveiro, Portugal

Abstract— Provision of quality of service (QoS) in an ad hoc network is a challenging task. The challenge arises from the inherent characteristics of mobile ad hoc networks. In this paper, we describe the provision of QoS using node-disjoint multipath routing. We discuss how one can obtain probabilistic QoS guarantees using a protocol in which nodes are aware of the neighbors that may interfere while relaying packets to other nodes along the paths towards the destination. We analyze the computational and communication overheads incurred in identifying correlation-aware multiple node-disjoint paths, and the probability that packets arrive at the destination before their respective deadlines.

Keywords — Quality of service, node-disjoint multipath routing, path independence.

I. INTRODUCTION

An ad hoc network can be formed instantaneously, enabling the participants to communicate without the intervention of a centralized infrastructure or an access point. Those who want to bypass the infrastructure for some reason and still want to communicate may form an ad hoc network. An ad hoc network may provide a cheap and effective way to share information among many mobile hosts. Some characteristics of an ad hoc network differentiate it from other classes of wired and wireless networks. In an ad hoc network, the transmission range of mobile devices is limited; therefore, routes are often multihop. There are no separate routers, so nodes in the network cooperate to forward one another's packets towards their ultimate destinations. The devices used to form such a network are often battery-powered, and power depletion may cause node failures. Further, nodes may move about randomly, so the topology of the network may vary dynamically. It is desirable to have a provision of quality of service (QoS) in an ad hoc network. However, many peculiar characteristics of an ad hoc network hinder the provision of QoS. For example, the absence of any centralized infrastructure and the dynamically varying topology of a mobile ad hoc network make the provision of QoS a challenging task. Further, the topology information of the network is not available a priori at a central node. A node knows only about its neighbors.

As a result, a solution or scheme should be able to work with localized topology information and in a distributed manner. In other words, devising schemes that provide hard QoS guarantees is difficult due to frequent node and link failures. Therefore, one would like to have methodologies that provide probabilistic QoS. Multiple paths, specifically those that satisfy node-disjointness between a given source and destination, may help in QoS provision when the bandwidth required by a flow of packets is larger than what a single path can provide. Ideally, these paths should be uncorrelated so as to obtain the maximum benefit in terms of throughput and bandwidth utilization. However, identifying multiple node-disjoint paths that are uncorrelated or independent is a challenging task. In [1], it is pointed out that the problem of finding two node-disjoint paths that do not interfere with one another in a connected network is NP-complete. The problem of finding the maximum number of node-disjoint paths between a given source and destination such that there are no cross links between nodes belonging to different paths has already been proved to be NP-complete in [2]. In other words, one may not be able to identify, in a reasonable amount of time, even two uncorrelated node-disjoint paths, let alone the maximum number of them. Finding multiple uncorrelated node-disjoint paths between a given pair of nodes in an ad hoc network is the same as finding a chordless cycle in a graph that contains the source and destination nodes [3]; the latter problem is NP-complete [4]. Some specialized methods that use directional antennas to mitigate path correlation among node-disjoint paths are described in [5]. In [7], the maximum flow problem is formulated as an optimization problem using switched-beam directional antennas for sensor networks with limited interference. How to select multiple paths based on a channel-aware multipath metric in mesh networks is described in [8]. An interference-aware metric for path selection for reliable routing in wireless mesh networks is presented in [9]. An interference-minimized multipath routing scheme for high-rate streaming in wireless sensor networks is presented in [10]. In this paper, we propose a routing protocol that provides probabilistic QoS in mobile ad hoc networks. In our protocol, the ensuing data packets may flow through multiple node-disjoint paths, one packet along one path.


However, at each hop the service received by a packet depends upon the following factors: how much bandwidth is available, how urgent the packet is, and, if the packet is assigned a higher priority, whether it can meet its deadline by the destination. The rest of the paper is organized as follows. Section II contains the problem formulation and major issues. In Section III, we present the proposed protocol. In Section IV, we analyze the probability of meeting the QoS. Section V contains results and discussion. Finally, the last section presents conclusions and future directions.

II. PROBLEM FORMULATION AND MAJOR ISSUES

In a network with a wireless channel, two or more node-disjoint paths between a given pair of nodes can be correlated. By correlation, we mean that nodes along one path have neighbors that lie along other node-disjoint paths.

Fig. 1. Node-disjoint paths from a given source S to a destination D are correlated.

In other words, there can be cross links from one path to other paths. Specifically, two paths that have k links from one path to the other are said to be k-correlated. Consider the network shown in Figure 1; there are three node-disjoint paths from a given source S to a destination D, represented by bold lines. Links between intermediate nodes of more than one node-disjoint path are shown by dashed lines. Note that the two node-disjoint paths <S, a, b, c, e, D> and <S, f, g, h, D> have six links running between their nodes; therefore, these two paths are 6-correlated. Similarly, <S, f, g, h, D> and <S, i, j, k, l, D> are 6-correlated. In a mobile ad hoc network, where generally an omni-directional antenna is used, the transmissions of a node are heard by all nodes in the vicinity. Generally, in an ad hoc network, the Medium Access Control (MAC) layer protocol used is IEEE 802.11, which uses a carrier-sensing mechanism, namely Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA), with a coordination function, e.g., the Distributed Coordination Function (DCF) or Enhanced Distributed Channel Access (EDCA). In carrier-sensing multiple access, a node senses the channel. If it finds the channel busy, it chooses a backoff period and waits. When the backoff period expires, it senses the channel again.

Continuing in this manner, the node either finds the channel free or gives up when the number of retries exceeds a threshold limit. The point here is not to describe 802.11, whose description is available in the literature, but to note that nodes contend for the channel with their neighbors. So, if a node that belongs to one node-disjoint path has neighbors that are part of other node-disjoint paths, then the gain from using node-disjoint paths may not be the same as expected in a wired network or in a wireless network with infrastructure support. The nodes that belong to separate node-disjoint paths may end up contending simultaneously, and that erodes the gains of having multiple node-disjoint paths. If these paths were independent or uncorrelated, there would be no such contention, and packets could travel along multiple node-disjoint paths independently. Therefore, the first issue is how to find node-disjoint paths that are uncorrelated or independent. As mentioned earlier, finding uncorrelated node-disjoint paths between a given pair of nodes is not an easy task; finding even two such paths is pointed out to be NP-complete in [1]. It means that identification of multiple uncorrelated node-disjoint paths cannot, in general, be carried out in a reasonable amount of time. The correlation among node-disjoint paths may adversely affect the effective bandwidth that a group of multiple node-disjoint paths would have provided in the absence of any such correlation. However, if the nodes are aware of the correlation, they may take some remedial steps. The question arises whether one can find paths in which nodes are aware of which of their neighbors are part of other node-disjoint paths, so as to take remedial steps, if possible, to alleviate the effect of correlation. Therefore, the second issue is how to identify paths such that nodes are aware of the correlation among paths.

III. PROPOSED PROTOCOL

We call the proposed protocol Probabilistic QoS Routing (PQR). It is an on-demand protocol and, like most other on-demand protocols, it is based on the request-reply cycle. The protocol consists of two major phases: route discovery and route maintenance. Each of these phases is described below.

A. Route Discovery

In the route discovery phase, the protocol tries to identify multiple node-disjoint paths from a given source to a destination. Whenever a source needs to communicate with a destination node, it looks for a path to the intended destination in its route cache. If it finds a valid path, it starts sending packets along that path to the destination. If the source does not find a path in its route cache, it generates a route request (RREQ) packet and sends it to its neighbors. An RREQ contains the following information: <SourceAddress, DestinationAddress, SourceSeqNo, PathTraversed>. Let us call a node that is neither the source nor the destination of an RREQ an intermediate node. When an intermediate node receives a copy of an RREQ, it processes the RREQ according to the RREQ forwarding policy.


The RREQ forwarding policy that we use at an intermediate node is All Disjoint Copies (ADC) [6]. In ADC, an intermediate node forwards a copy of the RREQ if the PathTraversed of the RREQ is disjoint with the copies of the RREQ already forwarded; otherwise, the copy of the RREQ is discarded. Eventually, the copies of the RREQ reach the destination, which is responsible for computing the disjointness. The destination sends a route reply (RREP), one for each node-disjoint path. An RREP contains the following information: <SourceAddress, DestinationAddress, SourceSeqNo, Path, OtherPaths>. Note that Path contains the path from the source to the destination, albeit in the reverse direction. The field OtherPaths contains information about the other computed node-disjoint paths. When an RREP travels upstream, an intermediate node unicasts it to the next-hop node along the path towards the source. The information contained in OtherPaths is stored by all intermediate nodes that receive and unicast the RREP. For that purpose, all nodes in the network maintain a path-info cache in which information about node-disjoint paths for a pair of communicating nodes is stored. In this way, all intermediate nodes lying on node-disjoint paths from the given source to the destination have information about all other node-disjoint paths between them. Also, a node knows its neighbors either through a beacon mechanism or through MAC layer contentions. An intermediate node sorts the node ids of all its neighbors and of each node-disjoint path, and then computes which of its neighbors are part of other node-disjoint paths.
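To make the forwarding rule concrete, the following is a minimal Python sketch of the ADC disjointness check. The RREQ field names follow the paper, while the helper name, the integer node ids, and the bookkeeping of already-forwarded copies are our own illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RREQ:
    source: int
    destination: int
    source_seq_no: int
    path_traversed: List[int] = field(default_factory=list)

def should_forward(rreq: RREQ, forwarded_paths: List[List[int]]) -> bool:
    """ADC: forward this copy only if its PathTraversed shares no node
    (other than the source itself) with any copy already forwarded."""
    new_nodes = set(rreq.path_traversed) - {rreq.source}
    for path in forwarded_paths:
        if new_nodes & (set(path) - {rreq.source}):
            return False  # overlaps an earlier copy: discard
    return True
```

A node that does forward a copy would then append that copy's PathTraversed to its record of forwarded copies for the given (source, sequence number) pair.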

B. Route Maintenance

In the route maintenance phase, when a node detects a link failure, it marks the failed outgoing link as invalid. It then looks up its route cache to find the paths that use the failed link. For each such path, it sends an RREQ-maintenance (RREQM) packet to its neighbors, which reply with RREP-maintenance (RREPM) packets. Upon receiving the RREPMs, the intermediate node checks whether it can repair the path such that node-disjointness with the other paths that have not yet failed is maintained. If it is able to find such a neighbor, it generates a route repair (RPAIR) packet towards the source. Every node that receives the RPAIR packet modifies the path in its route cache for the particular source and destination. If the intermediate node is unable to find such a neighbor, it marks the path to the destination as invalid. It then generates a route error (RERR) message and sends it towards the source. When an intermediate node receives an RERR, it marks the path invalid and unicasts the RERR upstream. Upon receiving an RERR, the source marks the path invalid and looks for an available path to the destination in its route cache. If it finds a valid path, it starts sending data packets towards the destination; otherwise, it initiates a new route discovery. Note that an intermediate node along a path that detects the link failure is allowed to repair the route through its immediate neighbours. The node is able to do so because it is aware of the complete node-disjoint paths from the source to the destination.

If the path is repaired, there will be a change at only one place in the routing entries of the upstream nodes as the RPAIR packet travels upstream. However, an intermediate node would not like to repair the path beyond one hop, because doing so would put an additional burden on the intermediate node, which may be relaying packets for other source-destination pairs besides carrying its own traffic. Repairing further would involve identifying paths all the way to the destination, and no node would like to discover paths on behalf of others when it has its own traffic; the source node would not like it to be done either. Therefore, we restrict the repair of paths by an intermediate node to one hop. A question remains: how do the nodes along the node-disjoint paths come to know that a particular path has been amended due to link or node failures? To that end, whenever there is a modification among the paths currently in use, the source assumes the responsibility of propagating the modification to the nodes along the currently used node-disjoint paths to the destination. For that, the source unicasts a Route Modification (RMOD) packet to its neighboring nodes that lie along node-disjoint paths to the destination.

Fig. 2. The maximum correlation between path i and path j.

The RMOD packet contains the following information: <SourceAddress, DestinationAddress, RREQSeqNo, NewPath(s)>, where SourceAddress, DestinationAddress, and RREQSeqNo contain the source address, destination address, and sequence number of the RREQ that was used to identify the route. The field NewPath(s) contains the newly computed paths to the destination, as repaired and reported by the node that sensed the link failure. When an intermediate node receives an RMOD packet, it makes the corresponding change in its route cache and unicasts the packet to the next-hop node along the path. After that, the intermediate node recomputes which of its neighbors lie along other node-disjoint paths from the source to the destination. This is done to update the information about the correlation at nodes along the modified set of node-disjoint paths.
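The recomputation of correlated neighbors after an RMOD amounts to a set intersection over sorted node ids; the following sketch is illustrative only, assuming comparable node ids and paths stored as node lists.

```python
def correlated_neighbors(neighbors, other_paths):
    """Return the sorted list of this node's neighbors that appear as
    intermediate nodes of other node-disjoint paths (the source and
    destination endpoints are excluded)."""
    on_other_paths = set()
    for path in other_paths:
        on_other_paths.update(path[1:-1])  # drop source and destination
    return sorted(set(neighbors) & on_other_paths)
```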

IV. ANALYSIS

In this section, we analyze the probabilistic guarantees for the provision of QoS. First, we discuss the adjustment made in the available bandwidth due to correlation; then we discuss the probabilistic guarantees.

A. Adjustment for Path Correlation

Let there be $m_i$ intermediate nodes in path i and $m_j$ intermediate nodes in path j. Then the maximum number of links that may be common to path i and path j is

$$l_{max} = m_i m_j = (h_i - 1)(h_j - 1) \qquad (1)$$

where $h_i$ and $h_j$ are the numbers of hops in paths i and j, respectively. Note that $l$ and $l_{max}$ are measured in numbers of links; the product $(h_i - 1)(h_j - 1)$ should not be confused with a quantity in squared hops, it is simply a number. The proof of the above formula is simple and can be understood as follows. There are $m_i$ intermediate nodes on one side and $m_j$ intermediate nodes on the other. As shown in Figure 2, one node out of the $m_i$ nodes can be selected in $m_i$ ways; similarly, one node out of the $m_j$ nodes can be selected in $m_j$ ways. The total number of ways in which two nodes can be selected, one from the $m_i$ nodes and the other from the $m_j$ nodes, is $m_i m_j$, which gives (1). Let $l$ be the number of links that are common to paths i and j. We define the path correlation factor, $\alpha$, as follows:

$$\alpha = \frac{l}{l_{max}} = \frac{l}{(h_i - 1)(h_j - 1)} \qquad (2)$$
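As a quick numerical check of (1) and (2), consider the first two paths of Fig. 1, whose hop counts follow from the node lists given above:

```python
# Paths <S,a,b,c,e,D> and <S,f,g,h,D>: h_i = 5 hops, h_j = 4 hops,
# and the text counts l = 6 cross links between them.
h_i, h_j, l = 5, 4, 6
l_max = (h_i - 1) * (h_j - 1)  # (1): m_i * m_j = 4 * 3 = 12
alpha = l / l_max              # (2): 6 / 12 = 0.5
print(l_max, alpha)            # -> 12 0.5
```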

Let the raw bandwidth of the outgoing links from a node be W, and let the number of neighbors of a node that are part of other node-disjoint paths between a given source and destination be $\alpha$. Due to path correlation, the bandwidth is reduced, because multiple packets cannot be sent simultaneously along node-disjoint paths that pass through neighboring nodes. We heuristically propose that the effective bandwidth of an outgoing link can be written as

$$W_{eff} = W\left(1 - \frac{1}{\gamma\alpha}\right) \qquad (3)$$

where $\gamma$ denotes an offset for the number of source-destination pairs whose node-disjoint paths pass through the node. For a single source-destination pair, $\gamma = 1$.

B. Probabilistic Guarantees

We now analyze the probabilistic guarantees that may be provided using the awareness of the correlation among multiple node-disjoint paths. Let $\Delta$ be the maximum tolerable end-to-end delay; in other words, a packet sent by the source should reach the destination within time $\Delta$. Let $\delta$ be the time that a packet may spend at an intermediate node, and let $h_i$ be the number of intermediate nodes along a path i. Then the time that a packet is allowed to spend at an intermediate node is

$$\delta = \frac{\Delta}{h_i}$$

In practice, some nodes that are lightly loaded may transmit the packet earlier, while others that are heavily loaded may forward the packet after relatively large delays. In other words, $\delta$ may vary from node to node and from packet to packet. Therefore, while forwarding a packet towards the destination, one should take into account the service received by the packet at upstream nodes, the residual time available, and the remaining number of nodes the packet must traverse to reach the destination. Let $\hat{\Delta}$ be the remaining time and $\hat{h}_i$ be the remaining number of intermediate nodes that need to be traversed by the packet. Then the average time that the packet may spend at an intermediate node downstream is

$$\hat{\delta} = \frac{\hat{\Delta}}{\hat{h}_i}$$

A packet is said to be delayed at a node if the node is not able to relay the packet onto its outgoing link within the stipulated time. The excess time incurred at a node, beyond what the packet was supposed to spend there, may or may not be recovered by the time the packet arrives at the destination. Whether the packet will be able to make up the time by the destination depends on the extent of the delay. A large part of the time that a packet spends at a node is queueing delay, which depends upon the queue length, i.e., the number of occupied buffers. If the queue length is below a threshold value, packets are not delayed; if the queue length grows beyond the threshold, packets start to be delayed. In general, the probability that a packet is delayed depends upon the queue length. However, whenever a node has both real-time packets and best-effort packets, priority is given to relaying the real-time packets. The priorities are non-preemptive. The excess delay, therefore, depends not only upon the queue length but also upon how many higher-priority packets are ahead of the real-time packet that arrives at the node. On the other hand, the probability that a delayed packet will be able to make up the excess delay incurred at a node before it reaches the destination depends upon the hop at which the packet is delayed and the amount of excess delay incurred. In other words, the probability of making up the extra delay is larger for a packet that is delayed while still near the source than for one that is delayed near the destination. We assume that the probability that a delayed packet currently at an intermediate node i will be able to make up the delay by the time it arrives at the destination is heuristically given by


$$P_i = 1 - \Psi\left(1 - \frac{\varepsilon_i}{\delta}\right) \qquad (4)$$

where $\varepsilon_i$ is the value by which the delay incurred by the packet at node i exceeds the delay, $\delta$, that the packet was normally supposed to spend there. The variable $\Psi$ can be written as

$$\Psi = \frac{1}{h_k C} \qquad (5)$$

where $h_k$ is the number of hops in path k, and C is a variable chosen such that $\Psi \approx 0.01$. Let $B_i$ be the probability that an intermediate node has enough bandwidth to relay the packet despite the correlation. The probability $B_i$ can be related to (3) as

$$B_i = \frac{W_{eff}}{W} \qquad (6)$$

Then, the probability that the packet will be able to arrive at the destination before the deadline is

$$P_D = \prod_{i=1}^{h_k} B_i \left\{1 - \Psi\left(1 - \frac{\varepsilon_i}{\delta}\right)\right\} \qquad (7)$$

where $\Psi$ is defined as in (5). As mentioned earlier, $\varepsilon_i$ is the amount of delay by which the delay caused by node i, $\delta'_i$, exceeds the normal delay, $\delta_i$, that a packet is supposed to spend at node i of path k; in other words, $\varepsilon_i = \delta'_i - \delta_i$. Note that the delay $\delta'_i$ comprises all delays at node i, such as queuing delays, MAC scheduling delays, and transmission delays. A node may increase the priority of a data packet if it is delayed by less than a threshold value, provided that the node has not forwarded too many packets of the same priority level within a time window. A packet that is delayed too much is less likely to reach the destination in time; similarly, forwarding too many packets with high priorities may not have a significant effect in terms of meeting deadlines. Further, a node may increase the bandwidth beyond the reserved bandwidth so as to speed the packet to the next hop, provided that enough additional bandwidth is available and the bandwidth utilization is fairly low.

Fig. 3. The probability $P_i$ as a function of the number of hops, for $f = \varepsilon/\delta = 0.0$ and $f = \varepsilon/\delta = 0.1$.

Fig. 4. The probability of success, $P_D$, as a function of the path length, for $B_i = 1.0$ and $B_i = 0.9$.

V. RESULTS

In this section, we present some empirical results. As mentioned earlier, we assume that the probability that a packet delayed at an intermediate node i along a node-disjoint path is able to make up the delay by the time it reaches the destination decreases as the packet travels downstream. This assumption is somewhat realistic, because a packet can more easily make up for a delay incurred near the source than for one incurred when it is about to arrive at the destination. Figure 3 shows the probability $P_i$ at the i-th intermediate node along a path from the given source to the destination. Here, the number of intermediate nodes in the path, $h_k$, is 10. The probability $P_i$ is shown for $f = \varepsilon/\delta = 0.0$ and for $f = \varepsilon/\delta = 0.1$. As the packet travels downstream, the probability that a delayed packet will be able to make up for the delays by the destination decreases. If the packet is delayed more, the probability that it will reach the destination in time decreases faster than in the less delayed case. Figure 4 shows the probability, $P_D$, that the packets sent by the source reach the destination on or before the deadline, as a function of the path length. Here, the probability that sufficient bandwidth is available at an intermediate node is assumed to be $B_i = 1.0$ and $B_i = 0.9$. As the length of the path increases, the probability that a packet sent by the source meets its deadline decreases, and this decrease is sharper for smaller values of the probability that enough bandwidth is available at the nodes along the node-disjoint path.

VI. CONCLUSION

In this paper, we presented a protocol for the provision of QoS in mobile ad hoc networks. In our protocol, intermediate nodes are aware of the correlation between multiple node-disjoint paths. The awareness of these dependencies enables nodes along the paths from the source to the destination to take remedial actions so as to provide QoS to the packets flowing between the given source and destination. We show that, by using multiple node-disjoint paths whose nodes are aware of the correlation, one may obtain probabilistic guarantees for delay-based QoS.

REFERENCES

[1] S. Waharte, R. Boutaba, "On the Probability of Finding Non-Interfering Paths in Wireless Multihop Networks", Proceedings of the International Conference on Networking (NETWORKING), pp. 914-921, May 2008.
[2] J. L. Wang, J. A. Silvester, "Maximum Number of Independent Paths and Radio Connectivity", IEEE Transactions on Communications, vol. 41, no. 10, pp. 1482-1493, October 1993.
[3] A. B. Mohanoor, S. Radhakrishnan, V. Sarangan, "Interference Aware Multi-path Routing in Wireless Networks", Proceedings of the 5th IEEE International Conference on Mobile Ad Hoc and Sensor Systems (MASS), pp. 516-518, October 2008.
[4] D. Bienstock, "On the Complexity of Testing for Odd Holes and Induced Odd Paths", Discrete Mathematics, vol. 90, no. 1, pp. 85-92, June 1991.
[5] A. M. Abbas, M. Alam, B. N. Jain, "Alleviating Path Correlation Using Directional Antenna in Node-Disjoint Multipath Routing for Mobile Ad hoc Networks", Proceedings of the 3rd ACM International Wireless Communication and Mobile Computing Conference (IWCMC), pp. 405-410, August 12-14, 2007.
[6] A. M. Abbas, B. N. Jain, "Path Diminution is Unavoidable in Node-Disjoint Multipath Routing with Single Route Discovery", Proceedings of the 1st IEEE International Conference on COMmunication SoftWAre and MiddlewaRE (COMSWARE), pp. 1-6, January 8-12, 2006.
[7] X. Huang, J. Wang, Y. Fang, "Achieving Maximum Flow in Interference-Aware Wireless Sensor Networks with Smart Antennas", Elsevier Journal on Ad hoc Networks, vol. 5, pp. 885-896, February 2007.
[8] I. Sheriff, E. M. B. Royer, "Multipath Selection in Multi-Radio Mesh Networks", Proceedings of the 3rd International Conference on Broadband Communications (BROADNETS), pp. 1-11, October 2006.
[9] J. W. Tsai, T. Moors, "Interference-Aware Multipath Selection for Reliable Routing in Wireless Mesh Networks", Proceedings of the 3rd International Conference on Mobile Ad hoc and Sensor Systems (MASS), pp. 1-6, October 2007.
[10] J. Y. Teo, Y. Ha, C. K. Tham, "Interference-Minimized Multipath Routing with Congestion Control in Wireless Sensor Network for High-Rate Streaming", IEEE Transactions on Mobile Computing, vol. 7, no. 9, pp. 1124-1137, September 2008.
[11] R. D. Haan, R. J. Boucherie, J. K. V. Ommeren, "The Impact of Interference on Optimal Multi-path Routing in Ad Hoc Networks", Proceedings of the 20th International Teletraffic Congress (ITC), LNCS 4516, pp. 803-815, June 2007.

Shahid Mumtaz received his Masters degree in Electrical Engineering from the Blekinge Institute of Technology, Karlskrona, Sweden, in 2005. He is working as a Research Engineer at the Instituto de Telecomunicações, Pólo de Aveiro, Portugal. His research interests include QoS in 3G/4G networks and radio resource management for wireless systems. His current research activities involve cross-layer based dynamic radio resource allocation for WANs.

Atílio Gameiro received his Licenciatura (a five-year course) and his PhD from the University of Aveiro in 1985 and 1993, respectively. He is currently a Professor in the Department of Electronics and Telecommunications of the University of Aveiro and a researcher at the Instituto de Telecomunicações, Pólo de Aveiro, where he is head of group. His main interests lie in signal processing techniques for digital communications and communication protocols. Within this research line, he has worked on optical and mobile communications, at both the theoretical and experimental levels, and has published over 100 technical papers in international journals and conferences. His current research activities involve space-time-frequency algorithms for the broadband component of 4G systems and the joint design of layers 1 and 2.


Applicability of a Novel Integer Programming Model for Wireless Sensor Networks

Alexei Barbosa de Aguiar, Alvaro de M. S. Neto, Placido Rogerio Pinheiro, Andre L. V. Coelho

Graduate Program in Applied Informatics, University of Fortaleza
Washington Soares Avenue, 1321, Room J-30, Fortaleza, CE, Brasil, 60811-905
[email protected], [email protected], {placido, acoelho}@unifor.br

Abstract - This paper presents an applicability analysis of a novel integer programming model devoted to optimizing power consumption efficiency in heterogeneous wireless sensor networks. The model is based upon a schedule of sensor allocation plans over multiple time intervals, subject to coverage and connectivity constraints. By turning off a specific set of redundant sensors in each time interval, it is possible to reduce the total energy consumption of the network and, at the same time, avoid partitioning the network by losing strategic sensors too prematurely. Since the network is heterogeneous, sensors can sense different phenomena from different demand points, with different sample rates. As the problem instances grow, however, the execution time becomes impracticable.

Keywords: Integer Linear Programming, Wireless Sensor Network, Consumption Optimization

I. INTRODUCTION

Wireless sensor networks (WSNs) have been primarily used in the monitoring of several physical phenomena, such as temperature, barometric pressure, humidity, ambient light, sound volume, solar radiation, and precipitation, and therefore have been deployed in different areas of application/research, like agriculture, climate study, biology, and security.

Although they are highly useful for such applications, especially because of their low cost, WSNs pose many challenges. Their deployments often demand requirements such as reliability, failure tolerance, security, and long network lifetime, and the implementations must be very sophisticated to overcome extreme hardware limitations.

The simple deployment of the approach proposed by Quintao et al. [1], while sensing different phenomena through the same WSN, can lead to inefficiency in terms of energy expenditure. With this perspective in mind, in this work we provide an extension to the model devised by Quintao et al. [1], namely, one that considers different coverage radii and sampling rates for different phenomena. We argue that the incorporation of such aspects into the model can have a significant impact on the network lifetime, especially when the spatio-temporal properties of the phenomena under observation vary widely. The introduction of this new dimension into the model brings about novel issues

to be dealt with. The critical issue relates to the concurrent routing of data related to different phenomena, as these data should be relayed to different sinks.

The rest of the paper is organized as follows. Section II presents WSNs: how they work, the components of a sensor node, the problems that can occur in a WSN, and complementary background for optimizing the network. Section III presents the novel integer linear programming (ILP) model for the minimization of energy expenditure in WSNs, taking into account the heterogeneity of the sensed phenomena mentioned above. Section IV presents initial results achieved by simulation. Finally, Section V concludes the paper and comments on future work.

II. THE WIRELESS SENSOR NETWORK

A wireless sensor network typically consists of a large number of small, low-power, limited-bandwidth computational devices, named sensor nodes. These nodes frequently interact with each other, in a wireless manner, in order to relay the sensed data towards one or more processing machines (a.k.a. sinks) residing outside the network. For this purpose, special devices called gateways are also employed to interface the WSN with a wired transport network. To avoid bottleneck and reliability problems, it is pertinent to make one or more of these gateways available in the same network setting, a strategy that can also reduce the length of the traffic routes across the network and consequently lower the overall energy consumption.

A typical sensor node is composed of four modules, namely the processing module, the battery, the transceiver module, and the sensor module [2]. Besides packet-building processing, a dynamic routing algorithm runs on the sensor nodes in order to discover and configure at runtime the best network topology in terms of the number of retransmissions and energy waste. Due to the limited resources available to the microprocessor, most devices make use of a small operating system that supplies basic functionality to the application program.

To supply the power necessary for the whole unit, there is a battery, whose lifetime depends on several aspects, among them its storage capacity and the levels of electrical current drawn by the device. The transceiver module, in turn, is a device that transmits and receives data using radio-frequency propagation as the medium, and typically involves two circuits, viz. the transmitter and the receiver. Due to the use of public frequency bands, other devices in the neighborhood can cause interference during sensor communication; likewise, the operation of other sensor nodes of the same network can cause this sort of interference. So, the lower the number of active sensors in the network, the more reliable the radio-frequency communication among these sensors tends to be. The last component, the sensor module, is responsible for gauging the phenomena of interest; the ability to concurrently collect data pertaining to different phenomena is a property already available in some models of sensor nodes.

For each application scenario, the network designer has to consider the rate of variation of each sensed phenomenon in order to choose the best sampling rate for each sensor device. This decision must be made with precision, as it has a great impact on the amount of data to be sensed and delivered and, consequently, on how prematurely the sensor nodes consume their energy. This is the temporal aspect to be considered in the network design.

Another aspect to be considered is the spatial one. Megerian et al. [3] define coverage as a measure of the ability to detect objects within a sensor field. The higher the spatial variation of the physical variable being measured across the area, the shorter the radius of coverage has to be for each sensor measuring the phenomenon. This influences the number of active sensors that must be employed to cover all demand points related to the given phenomenon. The fact is: the more sensors are active at a given moment, the greater the overall energy consumed across the network. WSNs are sometimes deployed in hostile environments with many restrictions on access. In such cases, the network would be very unreliable and unstable if only the minimum number of sensor nodes were used to cover the whole area of observation: if some sensor node failed, its area of coverage would be left unmonitored, preventing the correlation of data coming from this area with data coming from other areas. The localization of each sensor node is assumed to be known a priori, by an embedded GPS circuit or another method [4].

A worst-case scenario occurs when some sensor nodes are network bottlenecks, being responsible for routing all data coming from the sensor nodes in their neighborhood. In this case, a failure in such nodes could jeopardize the whole network deployment. To avoid these problems and achieve a robust WSN design, extra sensor nodes are usually employed in order to introduce some redundancy. For this reason, the routing topology needs to be dynamic and adaptive: when a sensor node that is routing data from other nodes fails, the routing algorithm discovers its neighbor nodes and the network reconfigures its topology dynamically. One problem with this approach is that it entails unnecessary energy consumption. This is because the coverage areas of the redundant sensor nodes overlap too much, giving rise to redundant data, and these redundant data bring about extra energy consumption at retransmission nodes. The radio-frequency interference is also stronger, which can cause unnecessary retransmissions of data, increasing the levels of energy expenditure.

The solution proposed by Quintao et al. [1] is to create different schedules, each one associated with a given time interval, that activate only the minimum set of sensor nodes necessary to satisfy the coverage and connectivity constraints. The employment of different schedules prevents the premature starvation of some of the nodes, yielding a more homogeneous level of battery consumption across the whole network. This is because the alternation of active nodes among the schedules is a natural outcome of the model as it optimizes the energy consumption of the whole network, taking into account all time intervals and the coverage and connectivity constraints. It is well known that the sensing of different phenomena does not follow the same spatio-temporal profile. For instance, the temporal and spatial variations of temperature measurements in a given area can be very different from those related to humidity. Working with only one radius of coverage for all sensed phenomena entails that this radius be the smallest one; likewise, choosing only one sampling rate for all sensed phenomena implies that this rate must keep up with the fastest-varying phenomenon.
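To make the scheduling idea concrete, here is a minimal sketch of a coverage-only activation schedule expressed as an ILP with the PuLP modeling library; the toy instance, the variable names, and the battery budget are our illustrative assumptions, and the connectivity/routing constraints of the full model are omitted.

```python
import pulp

# Toy instance: 3 sensors, 2 demand points, 3 scheduling periods.
sensors = ["s1", "s2", "s3"]
demand_points = {"d1": ["s1", "s2"],   # sensors able to cover each point
                 "d2": ["s2", "s3"]}
periods = [0, 1, 2]
battery = {s: 2 for s in sensors}      # each sensor active in at most 2 periods

model = pulp.LpProblem("wsn_schedule", pulp.LpMinimize)
x = pulp.LpVariable.dicts(
    "x", [(s, t) for s in sensors for t in periods], cat="Binary"
)  # x[s, t] = 1 if sensor s is active during period t

# Objective: minimize total activations, a simple proxy for sensing energy.
model += pulp.lpSum(x[s, t] for s in sensors for t in periods)

# Coverage: every demand point is sensed by at least one active sensor
# in every period.
for t in periods:
    for d, covering in demand_points.items():
        model += pulp.lpSum(x[s, t] for s in covering) >= 1

# Battery budget: limits how long each sensor can stay active overall.
for s in sensors:
    model += pulp.lpSum(x[s, t] for t in periods) <= battery[s]

model.solve(pulp.PULP_CBC_CMD(msg=False))
for t in periods:
    print(t, [s for s in sensors if x[s, t].value() == 1])
```

In this toy instance, the alternation of active nodes across periods emerges from the battery budget alone, mirroring the behavior described above.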

Megerian and Potkonjak [5] present many integer linear programming models for energy optimization, but their paper does not consider dynamic time scheduling. Quintao et al. [1] and Nakamura et al. [6] use a model that considers the temporal dimension, as done here, but treats the phenomena as equal, dimensioning for the worst of their characteristics. They use pure integer linear programming, which limits the matrix sizes, while the hybrid methodology presented here overcomes this barrier. Quintao et al. [7] use a genetic algorithm to solve the coverage problem, but do not address the connectivity problem, nor deal with time schedules. The approach in [8] uses Lagrangean relaxation to improve the results of previous pure integer linear programming approaches; however, that work does not consider time scheduling either.

III. THE MODEL FOR OPTIMIZING ENERGY CONSUMPTION

This model was presented in Aguiar et al. [9], [10] as an extension of the work of Quintao et al. [1]. The base model had the limitations explained in Section II, so it was enhanced to address these gaps: new dimensions were inserted in many matrices, and new constraints and auxiliary variables were added as well.

In order to properly model the heterogeneous WSN setting, some preliminary remarks are necessary:


1. A demand point is a geographical point in the region of monitoring where one or more phenomena are sensed. The distribution of such points across the area of monitoring can be regular, like a grid, but can also be random in nature. The density of such points varies according to the spatial variation of the phenomenon under observation. At least one sensor must be active at any given moment to sense each demand point; this constraint is implemented in the model;

2. Usually, the sensors are associated with coverage areas that cannot be estimated with accuracy. To simplify the modeling, we assume flat areas without obstacles. Moreover, we assume a circular coverage area with a radius determined by the spatial variation of the sensed phenomenon. Within this area, it is assumed that all demand points can be sensed. The radio-frequency propagation in real WSNs is also irregular in nature; in the same way, we assume a circular communication area. The radius of this circle is the maximum distance at which two sensor nodes can interact;

3. A route is a path from one sensor node to a sink, possibly passing through one or more other sensor nodes by retransmission. Gateways are regarded as special sensor nodes whose role is only to interface with the sinks. Each phenomenon sensed in a node has its data associated with a route leading to a given sink, which is independent from the routes followed by the data related to other phenomena sensed in the same sensor node;

4. The energy consumption is modeled through the electric current drawn by the circuit in a given time period. In what follows, the elements of the novel ILP model are introduced in a step-by-step manner.

Sets and parameters:

- Set of sensors;
- Set of demand points;
- Set of sinks;
- Set of phenomena (temperature, humidity, barometric pressure, etc.). Each phenomenon has its own spatio-temporal properties: the associated sampling rate has an impact on data traffic, while the associated radius of coverage has an impact on the number of active sensors;
- Number of scheduling periods;
- Set of arcs that link sensors to demand points for each phenomenon;
- Set of arcs that interconnect sensors;
- Set of arcs that link sensors and sinks;
- Set of incident arcs for demand point d ∈ D which belong to A;
- Set of incident arcs for sensor s ∈ S which belong to A;
- Set of output arcs leaving sensor s ∈ S which belong to A;
- Accumulated battery energy for sensor i ∈ S;
- Energy dissipated while activating sensor i ∈ S;
- Energy dissipated while sensor i ∈ S is activated (effectively sensing);
- Energy dissipated when transmitting data from sensor i to sensor j with respect to phenomenon g. Such values can be different for each arc ij if a sensor can have its transmitter power adjusted based on the distance to the destination sensor. Each phenomenon has its own sampling rate, a parameter that impacts the total amount of data transmitted across the WSN and, consequently, the levels of energy expenditure;
- Energy expended in the reception of data for sensor i ∈ S;
- Penalty applied when a demand point j ∈ D for phenomenon g is not covered by any sensor;
- Penalty applied when sensor i ∈ S is activated to unnecessarily sense phenomenon g.

Decision variables indicate:

- whether sensor i covers demand point j in period t for phenomenon g;
- whether arc ij belongs to the route from sensor l to a sink in period t for phenomenon g;
- whether sensor i was activated in period t for at least one phenomenon;
- whether sensor i was activated in period t for phenomenon g;
- whether sensor i is activated in period t;
- whether demand point j for phenomenon g is not covered by any sensor in period t;
- the energy consumed by sensor i considering all time periods.

The objective function (1) minimizes the total energy consumption through all time periods. The second term penalizes the existence of uncovered demand points while keeping the solution feasible. It also penalizes the unnecessary activation of a sensor for a phenomenon.


(1)

These are the constraints adopted:

(2)

Constraint (2) enforces the activation of at least one sensor node i to cover the demand point j associated with phenomenon g in period t. Otherwise, the penalty variable h is set to one. This last condition will occur only in those cases when no sensor node can cover the demand point.

(3)

Constraint (3) turns on variable r (which means that a sensor node is actively sensing phenomenon g in period t) if its associated sensor node is indeed allocated to cover any demand point associated with g.

(4)

Constraint (4) states that sensor node i is fully active (variable y) if it is active for at least one phenomenon of observation.

(5)

Constraint (5) relates to the connectivity issue using the flow conservation principle. This constraint enforces that an outgoing route exists from sensor node j to sensor node k if there is already an incoming route from sensor node i to sensor node j.

(6)

Constraint (6) enforces that a route is created for phenomenon g if a sensor node is already active for that phenomenon.

(7)

In Constraint (7), if there is an outgoing route passing through sensor node i, then this sensor node must necessarily be active.

(8)

In the same way, with Constraint (8), if there is an incoming route passing through sensor i, then this sensor has to be active.

(9)

The total energy consumed by a sensor node is the sum of the terms given in Constraint (9).

(10)

Constraint (10) enforces that each sensor node should consume at most the energy capacity limit of its battery.

(11)

Constraint (11) determines when the sensor node should start to sense (variable w). If a sensor is active in the first period, its corresponding w is set to 1.

(12)

In Constraint (12), the past and current activation states of a sensor node are compared. If the sensor node was inactive in period t − 1 and became active in period t, then w is set to 1.

(13)
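Since the formulation above is given only in prose here, the following is a minimal sketch of how a scheduling model of this kind could be assembled with an off-the-shelf solver. It is not the authors' formulation: it keeps only a coverage constraint in the spirit of (2), a coverage-activation coupling in the spirit of (3)-(4), and the battery limit of (10); routing, per-phenomenon indices, and activation energy are omitted, and all data values, names, and the choice of the PuLP toolkit are illustrative.

# Minimal sketch (not the paper's full model): coverage scheduling ILP with PuLP.
import itertools
import pulp

S, D, T = range(4), range(6), range(3)          # sensors, demand points, periods (toy sizes)
covers = {(i, j): (i + j) % 2 == 0 for i in S for j in D}  # toy 0/1 coverage matrix
E_PER, E_BAT, PEN = 5.0, 10.0, 1e4              # per-period energy, battery, penalty (assumed)

prob = pulp.LpProblem("wsn_schedule", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (S, D, T), cat="Binary")  # sensor i covers point j in period t
y = pulp.LpVariable.dicts("y", (S, T), cat="Binary")     # sensor i active in period t
h = pulp.LpVariable.dicts("h", (D, T), cat="Binary")     # point j left uncovered in period t

# Objective: energy of active sensors over all periods plus penalties for uncovered points.
prob += pulp.lpSum(E_PER * y[i][t] for i in S for t in T) + \
        pulp.lpSum(PEN * h[j][t] for j in D for t in T)

for j, t in itertools.product(D, T):            # in the spirit of (2): cover j or pay penalty h
    prob += pulp.lpSum(x[i][j][t] for i in S if covers[i, j]) + h[j][t] >= 1
for i, j, t in itertools.product(S, D, T):      # in the spirit of (3)-(4): coverage implies activation
    prob += x[i][j][t] <= (y[i][t] if covers[i, j] else 0)
for i in S:                                     # in the spirit of (10): battery capacity limit
    prob += pulp.lpSum(E_PER * y[i][t] for t in T) <= E_BAT

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for t in T:
    print("period", t, "active:", [i for i in S if y[i][t].value() > 0.5])

With the toy battery limit above, no sensor can stay active in all three periods, so the solver is forced to alternate the active set between schedules, which is exactly the behavior the model is designed to produce.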

IV. COMPUTATIONAL RESULTS

In order to assess the potential of the novel optimization model, we have devised the simulation scenario described in the sequel. First of all, we have considered two phenomena of interest to be concurrently sensed by the same WSN. One to six time periods were taken into consideration, although the reader should be aware that the real benefits of our extended model appear (that is, the savings in terms of energy expenditure become more significant) when one has to deal with large numbers of time intervals.

There were 100 demand points in a square area of 10 by 10 meters. Each demand point can be assigned to either or both phenomena, but the overall coverage of each phenomenon is totally independent from the other with regard to any single demand point. In the same vein, sixteen sensor nodes were placed in the observation area. All nodes have the same processing/sensing capabilities, with the possibility of sensing the two phenomena concurrently. The coverage radius for the first phenomenon was set to 8.8 meters, while the coverage radius for the second phenomenon was 16 meters.

Two types of position generation were considered: grid and random. In the grid fashion, sensors and demand points are disposed regularly in rows and columns. The other scenario is created by disposing sensors and demand points at coordinates that follow a uniform probability density function inside the observation area. Due to the stochastic nature of this variation, 10 problem instances were used for each number of periods, so the results for these instances are presented as average ± standard deviation.

The sampling rates for the first and second phenomena were set to two samples per minute and one sample per minute, respectively. The radius of communication between two neighboring sensors was 11 meters. Only one sink was placed, at the middle of the regular grid. All elements of this scenario (demand points, sensors, and sink) were generated with their associated geographic coordinates. The coverage matrix was filled with ones in those cases where the distance from the sensor to the demand point was less than or equal to the coverage radius for each phenomenon, and with zeros otherwise. Similarly, the sensor-to-sensor and sensor-to-sink matrices were filled with ones in those positions where the distance between the sensor nodes, or from a sensor node to the sink, was less than or equal to the communication radius, and with zeros otherwise. The energy constants were calculated based on the values published in a spreadsheet from a sensor node manufacturer [11]. The energy values for transmission and reception were calculated based on the amount of sensed data and the bit rate adopted in the devices. The penalty constant was assigned a high value to enforce that the model covers all demand points of interest.
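As an illustration of the matrix-filling step just described, the sketch below builds the 0/1 coverage and connectivity matrices from randomly drawn coordinates. The area size, node counts, and radii follow the text, while the positions, random seed, and all names are assumed for illustration.

# Sketch of building the coverage and communication matrices from coordinates.
import numpy as np

rng = np.random.default_rng(0)
demand = rng.uniform(0, 10, size=(100, 2))      # 100 demand points in a 10 x 10 m area
sensors = rng.uniform(0, 10, size=(16, 2))      # 16 sensor nodes
sink = np.array([[5.0, 5.0]])                   # single sink at the middle of the area
radius = {"phen1": 8.8, "phen2": 16.0}          # coverage radii from the text, meters
comm_radius = 11.0                              # communication radius, meters

def dist(a, b):
    """Pairwise Euclidean distances between two sets of 2-D points."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

# One 0/1 coverage matrix per phenomenon: entry (i, j) = 1 if sensor i covers point j.
coverage = {g: (dist(sensors, demand) <= r).astype(int) for g, r in radius.items()}
# 0/1 connectivity matrices: sensor-to-sensor and sensor-to-sink.
sensor_links = (dist(sensors, sensors) <= comm_radius).astype(int)
sink_links = (dist(sensors, sink) <= comm_radius).astype(int)
print(coverage["phen1"].sum(), sensor_links.sum(), sink_links.sum())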

In order to establish a comparison, in terms of problem difficulty (numbers of variables and constraints) and energy savings (objective function values), between the heterogeneous WSN setting and its homogeneous counterpart, we have also conducted some simulations with our model considering two phenomena with the same characteristics, namely a coverage radius of 8.8 meters and a sampling rate of two samples per minute.

Table I shows the simulation results obtained with the CPLEX platform [2], using OPL Development Studio 4.2 and CPLEX 10.0. The tests were executed on a Pentium Core 2 Quad 2.4 GHz with 8 GB of RAM running Windows XP. In this table, in the calculation of the real objective value we ignore the penalties and sum up only the energy variables. Figures 1 to 5 provide snapshots of the scheduled plans generated for the first and second phenomena regarding the 6 time intervals considered.

To get a better feeling of the impact of the data routing process on the energy expenditure of the WSN nodes, we have set up a second scenario with a larger area, where the coverage and communication radii become proportionally smaller. By this means, there are fewer communication options for each sensor, and routes must be established in order to convey data to the sinks. In this new scenario, there are four sinks in the corners of the square area, and our aim is to assess how many sensor nodes the model recruits to operate as routers of the traffic towards the sinks. Figure 6 shows the routes generated for this scenario by our model.

Figure 1: Phenomenon 1 - Interval 1

Figure 2: Phenomenon 1 - Interval 2 and 3

V. CONCLUSION AND FUTURE WORKS

These experiments explored other possibilities of application for this novel model. One of them concerns the sensor and demand point placements. Grid instances could be solved with a 0% uncovered demand point rate due to their regularity and richness of alternatives. On the other hand, random instances presented some uncovered demand points, even though they were penalized in the objective function.


Table I: Results for WSN problem instances

Periods | Type   | Objective            | Real objective       | Uncovered demand points rate | Time (s)
1       | Grid   | 3,378.62             | 3,378.62             | 0%                           | 1.40
2       | Grid   | 6,326.80             | 6,326.80             | 0%                           | 2.60
3       | Grid   | 9,705.62             | 9,705.62             | 0%                           | 44.86
4       | Grid   | 12,653.80            | 12,653.80            | 0%                           | 56.48
5       | Grid   | 19,643.00            | 19,643.00            | 0%                           | 26,858.81
6       | Grid   | 24,017.10            | 24,017.10            | 0%                           | 38,885.53
1       | Random | 3,332.61 ± 340.28    | 4,021.61 ± 327.24    | 1.40 ± 0.48 %                | 1.63 ± 0.23
2       | Random | 5,922.16 ± 644.40    | 7,159.43 ± 457.09    | 1.24 ± 0.99 %                | 3.25 ± 0.50
3       | Random | 10,598.40 ± 552.32   | 8,853.31 ± 848.73    | 1.16 ± 0.16 %                | 10.45 ± 33.02
4       | Random | 16,092.15 ± 1,297.76 | 12,029.01 ± 590.71   | 2.03 ± 0.84 %                | 50.77 ± 108.27
5       | Random | 19,864.90 ± 3,535.01 | 14,669.99 ± 1,517.12 | 2.08 ± 1.61 %                | 153.50 ± 88.32
6       | Random | 24,901.40 ± 4,542.08 | 17,385.13 ± 1,159.51 | 2.51 ± 1.90 %                | 578.74 ± 323.67

Figure 3: Phenomenon 1 - Interval 4

Figure 4: Phenomenon 1 - Interval 5 and 6

Figure 5: Phenomenon 2

Figure 6: Routing

The reason is that some demand points are placed in regions of the observation area that lie within the sensing coverage radius of too few sensor nodes, and the battery autonomy of each sensor node cannot supply energy for more than three time periods. An even more restrictive situation occurs when there is no sensor node that can reach a certain demand point. The model is prepared to handle these situations, which can be found in real WSNs. This uncovered-demand-point penalty mechanism gives enough flexibility to deal with a wider range of applications without incurring infeasibilities.

As the combinatorial explosion quickly consumes time and memory resources, limiting WSN sizes and practical applications, the need for more sophisticated and robust methods emerges. One promising optimization area that grows and gains more attention is the hybridization of complementary approaches. In Aguiar et al. [12], a hybridization of a Genetic Algorithm (GA) and ILP is used to extend the results of the homogeneous model version for optimization on WSNs. In this methodology, each individual of the GA population generates a reduced instance of the original problem. ILP is used to solve the reduced instances, and the objective value is fed back as the fitness value in that individual's evaluation. This cycle evolves the solutions towards the best compromises between effectiveness and efficiency.
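The hybridization cycle just described can be pictured with the following schematic sketch. It is not the implementation of [12]: the GA is reduced to a mutation-plus-elitism loop, and solve_reduced_ilp is a hypothetical placeholder standing in for the ILP solver run on each reduced instance.

# Schematic of the GA + ILP hybridization loop (illustrative, mutation-only GA).
import random

N_SENSORS, POP, GENS = 16, 8, 20

def solve_reduced_ilp(mask):
    """Placeholder for the ILP solver run on the reduced instance defined by `mask`
    (the subset of sensors an individual makes available). Returns an objective
    value; here a dummy cost stands in for the real solver call."""
    if sum(mask) < 4:                      # too few sensors: treat as infeasible
        return 1e6
    return 100.0 * sum(mask) + random.random()

def mutate(mask, p=0.1):
    # Flip each bit with probability p.
    return [bit ^ (random.random() < p) for bit in mask]

population = [[random.randint(0, 1) for _ in range(N_SENSORS)] for _ in range(POP)]
for _ in range(GENS):
    # Fitness of each individual is the objective value fed back from the ILP.
    scored = sorted(population, key=solve_reduced_ilp)
    elite = scored[: POP // 2]             # keep the best reduced instances
    population = elite + [mutate(ind) for ind in elite]
best = min(population, key=solve_reduced_ilp)
print(best, solve_reduced_ilp(best))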

REFERENCES

[1] F. Quintao, F. G. Nakamura, and G. R. Mateus, "Evolutionary algorithm for the dynamic coverage problem applied to wireless sensor networks design," in IEEE Congress on Evolutionary Computation, Edinburgh, UK, 2005.
[2] ILOG CPLEX 10.0 User's Manual, ILOG, January 2006.
[3] S. Megerian, F. Koushanfar, G. Qu, G. Veltri, and M. Potkonjak, "Exposure in wireless sensor networks: Theory and practical solutions," Wireless Networks, vol. 8, no. 5, pp. 443-454, 2002.
[4] X.-Y. Li, P.-J. Wan, and O. Frieder, "Coverage in wireless ad hoc sensor networks," IEEE Transactions on Computers, vol. 52, no. 6, Jun 2003.
[5] S. Megerian and M. Potkonjak, "Lower power 0/1 coverage and scheduling techniques in sensor networks," University of California, Los Angeles, Technical Reports 030001, January 2003.
[6] F. G. Nakamura, F. P. Quintao, G. C. de Menezes, and G. R. Mateus, "Planejamento dinâmico para controle de cobertura e conectividade em redes de sensores sem fio," Workshop de Comunicação sem Fio e Computação Móvel, vol. 1, pp. 182-191, 2004.
[7] F. P. Quintao, F. G. Nakamura, and G. R. Mateus, "Uma abordagem evolutiva para o problema de cobertura em redes de sensores sem fio," Revista Eletrônica de Iniciação Científica (REIC) da Sociedade Brasileira de Computação (SBC), vol. 3, September 2004.
[8] G. C. de Menezes, F. G. Nakamura, and G. R. Mateus, "Uma abordagem lagrangeana para os problemas de densidade, cobertura e conectividade em uma rede de sensores sem fio," Workshop de Comunicação sem Fio e Computação Móvel, vol. 1, pp. 192-201, 2004.
[9] A. B. de Aguiar, P. R. Pinheiro, and A. L. V. Coelho, "Optimizing energy consumption in heterogeneous wireless sensor networks: A novel integer programming model," in Proceedings of the IV International Conference on Operational Research for Development - ICORD2007, IFORS, August 2007, pp. 496-505.
[10] A. B. de Aguiar, P. R. Pinheiro, A. L. V. Coelho, N. V. Nepomuceno, Álvaro de Menezes Sobreira Neto, and R. P. P. Cunha, "Scalability analysis of a novel integer programming model to deal with energy consumption in heterogeneous wireless sensor networks," Communications in Computer and Information Science, vol. 14, pp. 11-20, 2008.

[11] "Mote battery life calculator," Internet, May 2007. [Online]. Available: http://www.xbow.com/Support/Sypport_pdf_files/Power_Management.xls
[12] A. B. de Aguiar, P. R. Pinheiro, A. L. V. Coelho, Álvaro de Menezes Sobreira Neto, and R. P. P. Cunha, "A hybrid methodology for coverage and connectivity in wireless sensor network dynamic planning," in XLI Brazilian Symposium on Operations Research, 2009, to appear.


Recent Applications of Optical Parametric Amplifiers in Hybrid WDM/TDM Local Area Optical Networks

Abd El-Naser A. Mohamed(1), Mohamed M. E. El-Halawany(2), Ahmed Nabih Zaki Rashed(3)* and Mahmoud M. A. Eid(4)
(1,2,3,4) Electronics and Electrical Communication Engineering Department, Faculty of Electronic Engineering, Menouf 32951, Menoufia University, EGYPT
(1) E-mail: [email protected], (3)* E-mail: [email protected]
Tel.: +2 048-3660-617, Fax: +2 048-3660-617

Abstract—In the present paper, the recent applications of optical parametric amplifiers (OPAs) in hybrid wavelength division multiplexing (WDM)/time division multiplexing (TDM) local area passive optical networks have been modeled and parametrically investigated over a wide range of the affecting parameters. Moreover, we have analyzed the ability of hybrid WDM/TDM passive optical networks to handle a triple play solution, offering voice, video, and data services to multiple users. Finally, we have investigated the maximum time division multiplexing (MTDM) bit rates of optical network units (ONUs) for the maximum number of supported users with the optical parametric amplifier technique across single mode fiber (SMF) or highly nonlinear fiber (HNLF) cables, to achieve both maximum network reach and quality of service (QOS).

Keywords—Passive optical network; time division multiplexing; wavelength division multiplexing; highly nonlinear fiber; optical parametric amplifier; fiber optics.

I. INTRODUCTION

Optical access networks present the future-proof alternative to the currently deployed copper access infrastructure [1]. With the standardization of time-division-multiplexing passive optical networks (TDM-PONs), a cost-effective access technology based on optics has been developed [2]. However, further development needs to be carried out in order to fully exploit the benefits of optical fiber technology. WDM-PONs are an option where the capacity per user can be very high, but their cost does not make them attractive for practical implementation nowadays. Several recent proposals have demonstrated the feasibility of combining WDM and TDM to optimize network performance and resource utilization. PON WDM technology (such as WDM-PON and hybrid WDM/TDM-PON) can provide a promising solution for broadband access [3]. High utilization of wavelengths is desirable to support more subscribers in access networks, since the total number of wavelengths (offered by commercially available light sources) is limited. Furthermore, access networks are very cost-sensitive. Because network operators need to guarantee the level of connection availability specified in the service level agreement, it is important in PON deployment to minimize the cost of protection while maintaining the connection availability at an acceptable level [4]. Typically, a PON consists of an optical line terminal (OLT), a remote node (RN), several ONUs, and fiber links, including the feeder fiber between the OLT and the RN, which is shared by all the ONUs, and the distribution fibers (DFs) between the RN and each ONU [5]. Obviously, multiplying these network resources (and investment cost) to provide protection is not acceptable in access networks. Therefore, much effort has been made to develop cost-effective protection schemes for PONs [6]. Fiber interconnection between two neighboring ONUs is used to provide protection for the DFs, which allows a lot of investment cost to be saved. The protection architecture presented in [7] requires only half of the wavelengths in comparison, but needs much more interconnecting fiber between the ONUs. With the explosive growth of end-user demand for higher bandwidth, various types of PONs have been proposed. PONs can be roughly divided into two categories, the TDM and WDM methods. Compared with TDM-PONs, WDM-PON systems allocate a separate wavelength to each subscriber, enabling the delivery of dedicated bandwidth per ONU. Moreover, this virtual point-to-point connection enables a large guaranteed bandwidth, protocol transparency, high quality of service, excellent security, bit-rate independence, and easy upgradeability. Especially, recent good progress on athermal arrayed waveguide gratings (AWGs) and cost-effective colorless ONUs has empowered WDM-PON as an optimum solution for the access network. However, a fiber link failure from the OLT to the ONU leads to an enormous loss of data [8]. Parametric amplification is a well-known phenomenon in materials providing χ(2) nonlinearity. However, parametric amplification can also be obtained in optical fibers by exploiting the χ(3) nonlinearity.


New high-power light sources and optical fibers with a nonlinear parameter 5-10 times higher than that of conventional fibers, as well as the need for amplification outside the conventional Erbium band, have increased the interest in such OPAs. The fiber-based OPA is a well-known technique offering discrete or lumped gain using only a few hundred meters of fiber [9]. It offers a wide gain bandwidth and may, in similarity with the Raman amplifier, be tailored to operate at any wavelength. An OPA is pumped with one or several intense pump waves, providing gain over two wavelength bands surrounding the single pump wave or, in the latter case, the wavelength bands surrounding each of the pumps [10]. As the parametric gain process does not rely on transitions between energy states, it enables a wideband and flat gain profile, contrary to the Raman and the Erbium-doped fiber amplifier (EDFA). OPAs have been intensively studied in recent years due to their potential use for amplification and wavelength conversion in multi-terabit/sec WDM transmission systems [11]. OPAs have the advantage of being able to operate in any of the telecom bands (S-C-L) depending upon the pump wavelength and the fiber zero dispersion wavelength, which, in principle, can be appropriately tailored. In order to be a practical amplifier in a WDM system, the OPA should exhibit high gain and large bandwidth, and should be spectrally flat, among other requirements [12]. OPAs have been used in several applications, e.g., as wavelength converters, amplifiers, and pulse sources. Based on four wave mixing (FWM) with exponential gain [13], the OPA can (when used as a pulse source) generate short return-to-zero (RZ) pulses at the input signal wavelength and the converted idler wavelength [14]. This is due to the fact that the peak of the pump pulse, which has higher gain than its wings, results in a signal pulse compressed with respect to the total pump pulse [15]. In the present study, the OPAs are deeply studied and parametrically investigated over a wide range of the affecting parameters in hybrid WDM/TDM local area passive optical networks across SMF or HNLF cables, to achieve the best QOS to handle a triple play solution (video, voice, and data) for the supported users.

II. SIMPLIFIED HYBRID WDM/TDM LOCAL AREA PASSIVE OPTICAL NETWORK ARCHITECTURE MODEL

Figure 1. Hybrid WDM/TDM Local Area PON Architecture Model

WDM/TDM local area PON is considered a compromise between WDM-PON and TDM-PON which combines the advantages of both technologies [6]. The architecture model of the WDM/TDM PON is shown in Fig. 1. The WDM/TDM PON consists of many laser diodes as sources of optical signals, an arrayed waveguide grating multiplexer (AWG Mux) in the OLT, an optical fiber cable, two optical parametric amplifiers to strengthen the optical signal, an arrayed waveguide grating demultiplexer (AWG Demux) and ONUs in the RN, an optical time division multiplexer (OTDM) which lies in the ONU, and the network terminal (NT) which connects to the user.

In the downstream direction, traffic including video, voice, and data is transmitted from the backbone network to the OLT and, according to the different users and locations, data is transmitted on the corresponding wavelength and multiplexed by the AWG Mux. When traffic arrives at the RN, the wavelengths are demultiplexed by the AWG Demux and sent to different fibers. Each fiber (wavelength) serves several NTs. The signal in each wavelength is demultiplexed by the OTDM and different time slots are sent to the corresponding users [7]. TDM PON has emerged as a promising technology to replace the conventional access network.


To put it simply, a PON consists of an OLT, which lies in the central office, a passive optical splitter, and ONUs, which lie at the users. Unlike digital subscriber line (DSL), this is a point-to-multipoint topology without any active components from the central office to the users. The TDM scheme reduces cost and provides a very efficient method, since several users share the same wavelength. But it also brings some problems: security issues, because of its broadcast nature and the fact that many users share the bandwidth; a limited transmission distance between user and OLT, due to the fact that many users share an optical splitter; and a protocol for implementing TDM and dynamically allocating bandwidth that is very complicated and not easy to realize. Because of these disadvantages, WDM-PON has been proposed. In the WDM-PON, there is a dedicated wavelength from the OLT to each ONU. Obviously, this is a point-to-point topology, which differs from point-to-multipoint. The optical splitter is replaced by the RN, which actually is a multiplexer/demultiplexer, and each user is equipped with a transmitter and receiver. In the downstream direction, optical signals designated for different users are transmitted on their own dedicated wavelengths from the OLT and multiplexed into a single fiber cable. At the receiver side, which is the RN, the demultiplexer demultiplexes the wavelengths and the signals are received respectively by each user. But the big remaining issue of WDM-PON is its low efficiency and high cost [8]. The Gbit/sec rate of WDM-PON is too large for a single user, so most of the time a big portion of the bandwidth of one wavelength is wasted. Also, due to the large number of wavelengths needed in WDM-PON, more fibers and more transceivers will be employed; the cost of building such an architecture, as well as the maintenance fees, all add to the cost each user has to afford. Thus, unless component prices and installation fees drop considerably and bandwidth demand increases greatly, it is actually impractical to implement WDM-PON for at least a few years. So the direction towards hybrid WDM/TDM PON is urgent, to achieve the advantages of both technologies.

III. BASIC MODEL AND ANALYSIS

Considering the minimum bandwidth per user that would be offered in a saturated case, where optical network units (ONUs) transmit at their maximum capacity [12]:

BW_{user} = \frac{d\,T}{T_{window}} = \frac{K\,d\,T}{N\,M\,(T + T_{Laser})},   (1)

T_{window} = \frac{N\,M}{K}\,(T + T_{Laser}),   (2)

where K is the number of lasers at the OLT (typically K = M), d is the data rate, N is the number of input ports of the distribution AWG Mux, M is the number of output ports of the distribution AWG Demux, T is the time slot assigned to each ONU in time units, and T_{window} is the maximum delay. In order to reduce the effect of T_{Laser}, it is very clear that the solution is to increase T. However, in that case the interval of service to serve the same ONU may be too wide for certain applications. Therefore, a compromise must be met combining BW_{user} and the delay parameters. Equation (2) has been developed supposing a deterministic situation where the users are transmitting at full rate under TDM conditions. As we need to switch among all active ONUs on the network segment, there is a delay between the packet generation and the moment when the system is prepared to process it, defined by the following expression [12]:

T_{window} = 2\,\rho\,W\,T_{tx} + T_{Laser},   (3)

where ρ is the network utilization, W is the number of users, and T_{tx} is the average time slot per user. This parameter is also known as the network delay, and T_{Laser} is the time duration served to each subscriber or user.
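As a quick numerical illustration of Eqs. (1)-(3) as reconstructed above, the following sketch evaluates the saturated per-user bandwidth and the network delay. The port counts and number of users match the values assumed later in Section IV; the data rate, time slot, laser switching time, and utilization values are illustrative.

# Numerical sketch of Eqs. (1)-(3): per-user bandwidth and network delay.
K, M, N = 16, 16, 16            # lasers, AWG Demux output ports, AWG Mux input ports
d = 1.25e9                      # data rate per laser, bit/sec (illustrative)
T = 125e-6                      # time slot per ONU, sec (illustrative)
T_laser = 10e-6                 # laser switching time, sec (illustrative)
W, T_tx, rho = 256, 125e-6, 0.7 # users, avg slot per user, network utilization (assumed)

T_window = (N * M / K) * (T + T_laser)     # Eq. (2), as reconstructed
bw_user = d * T / T_window                 # Eq. (1): saturated per-user bandwidth
delay = 2 * rho * W * T_tx + T_laser       # Eq. (3): network delay, as reconstructed
print(f"BW/user = {bw_user / 1e6:.1f} Mbit/s, delay = {delay * 1e3:.2f} ms")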

By combining a strong CW pump signal at angular frequency ω_p with a signal at another frequency ω_s in an SMF or a HNLF cable, parametric gain can be achieved. At the same time, a converted signal, called the idler (ω_i), will be generated at the frequency ω_i = 2ω_p − ω_s. The process is described using the following three coupled equations for the amplitudes A_p, A_s, A_i of the pump, signal, and idler [13]:

\frac{dA_p}{dz} = i\gamma\left[\left(|A_p|^2 + 2|A_s|^2 + 2|A_i|^2\right)A_p + 2A_sA_iA_p^{*}\exp(i\,\Delta\beta\,z)\right],   (4)

\frac{dA_s}{dz} = i\gamma\left[\left(|A_s|^2 + 2|A_i|^2 + 2|A_p|^2\right)A_s + A_p^2A_i^{*}\exp(-i\,\Delta\beta\,z)\right],   (5)

\frac{dA_i}{dz} = i\gamma\left[\left(|A_i|^2 + 2|A_s|^2 + 2|A_p|^2\right)A_i + A_p^2A_s^{*}\exp(-i\,\Delta\beta\,z)\right].   (6)

The equation for the pump amplitude can now be integrated to give:

A_p = A_{p0}\exp\left(i\gamma|A_{p0}|^2 z\right) = \sqrt{P_p}\,\exp\left(i\gamma P_p z\right),   (7)

where |A_{p0}|^2 = P_p is the pump power at z = 0, which implies that the pump signal does not lose any power. In the no-depletion approximation, the parametric amplification is described by the signal power gain [13] as:

G_s(L) = \frac{P_s(L)}{P_s(0)} = 1 + \left[\frac{\gamma P_p}{g}\,\sinh(gL)\right]^2,   (8)

\gamma = \frac{2\pi n_2}{\lambda\,A_{eff}},   (9)

where P_p and P_s are the pump and signal powers in the fiber cable, γ is the fiber nonlinear coefficient, L is the fiber cable length, A_{eff} is the effective cross-section area of the fiber, n_2 is the nonlinear refractive-index coefficient ≈ 2.6 × 10^{-20} m²/W, λ is the optical signal wavelength, and g is the parametric gain parameter given by [13]:

g = \sqrt{-\Delta\beta\left(\frac{\Delta\beta}{4} + \gamma P_p\right)} = \sqrt{(\gamma P_p)^2 - \left(\frac{k}{2}\right)^2},   (10)

where k = Δβ + 2γP_p, and the propagation mismatch Δβ is given as follows [13]:


\Delta\beta = -\frac{2\pi c}{\lambda_p^2}\,S_p\,(\lambda_p - \lambda_0)\,(\lambda_p - \lambda_s)^2,   (11)

L_{eff} = \frac{1 - \exp(-\alpha L)}{\alpha},   (12)

Figure 2. Variations of transmission data rate per user (Mbit/sec) with BW per supported user (Kbits), for laser switching times of 1, 10, and 100 μsec, at the assumed set of parameters.

Figure 3. Variations of network delay time (msec) with the network utilization ρ at the assumed set of parameters for the three offered services (video, voice, and data) at 1 μsec laser switching time.

Figure 4. Variations of network delay time (msec) with the network utilization ρ at the assumed set of parameters for the three offered services at 100 μsec laser switching time.

Figure 5. Variations of optical parametric gain (dB) with pump power P_P (watt) across SMF, for λs = 1.53, 1.55, and 1.57 μm, at the assumed set of parameters.

Figure 6. Variations of optical parametric gain (dB) with pump power P_P (watt) across HNLF, for λs = 1.53, 1.55, and 1.57 μm, at the assumed set of parameters.

Figure 7. Variations of MTDM bit rate/channel (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 0.12, 0.15, and 0.18 km (HNLF without OPA for dispersion cancellation), at the assumed set of parameters.

Figure 8. Variations of MTDM bit rate/channel (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 1.8, 2.2, and 2.5 km (HNLF with OPA for dispersion cancellation), at the assumed set of parameters.

Figure 9. Variations of MTDM bit rate/link (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 0.12, 0.15, and 0.18 km (HNLF without OPA for dispersion cancellation), at the assumed set of parameters.

Figure 10. Variations of MTDM bit rate/link (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 1.8, 2.2, and 2.5 km (HNLF with OPA for dispersion cancellation), at the assumed set of parameters.

Figure 11. Variations of MTDM bit rate/fiber cable core (Mbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 0.12, 0.15, and 0.18 km (HNLF without OPA for dispersion cancellation), at the assumed set of parameters.

Figure 12. Variations of MTDM bit rate/fiber cable core (Mbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 1.8, 2.2, and 2.5 km (HNLF with OPA for dispersion cancellation), at the assumed set of parameters.

Figure 13. Variations of MTDM bit rate/channel (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 0.24, 0.3, and 0.36 km (SMF without OPA for dispersion cancellation), at the assumed set of parameters.

Figure 14. Variations of MTDM bit rate/channel (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 3.5, 6.5, and 10 km (SMF with OPA for dispersion cancellation), at the assumed set of parameters.

Figure 15. Variations of MTDM bit rate/link (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 0.24, 0.3, and 0.36 km (SMF without OPA for dispersion cancellation), at the assumed set of parameters.

Figure 16. Variations of MTDM bit rate/link (Gbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 3.5, 6.5, and 10 km (SMF with OPA for dispersion cancellation), at the assumed set of parameters.

Figure 17. Variations of MTDM bit rate/fiber cable core (Mbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 0.24, 0.3, and 0.36 km (SMF without OPA for dispersion cancellation), at the assumed set of parameters.

Figure 18. Variations of MTDM bit rate/fiber cable core (Mbit/sec) with the number of links in the fiber cable core, for fiber cable lengths of 3.5, 6.5, and 10 km (SMF with OPA for dispersion cancellation), at the assumed set of parameters.

where λ_0 is the zero dispersion wavelength, λ_p is the pump wavelength, λ_s is the optical signal wavelength, c is the velocity of light, and S_p is the parametric gain slope. If the fiber is long or the attenuation is high, the interaction length will be limited by the effective length L_{eff}; otherwise L_{eff} ≈ L. The signal gain of the optical fiber parametric amplifier can be expressed as follows [14]:

G_s = 1 + \left(\gamma P_p L\right)^2\left[1 + \frac{(gL)^2}{6} + \frac{(gL)^4}{120} + \cdots\right]^2,   (13)

From Eq. (13), it may be noted that for signal wavelengths close to λ_p, Δβ ≈ 0, and G_s = (γP_pL)^2. In the special case of perfect phase matching (k ≈ 0) and γP_pL ≫ 1, Eq. (13) can be rewritten as follows:

G_s = \sinh^2(gL) = \sinh^2(\gamma P_p L) \approx \frac{1}{4}\exp(2\gamma P_p L),   (14)

A very simple expression for the OPA gain can be obtained if Eq. (14) is rewritten in decibel units as:

G_{dB} = 10\log_{10}\left[\frac{1}{4}\exp(2\gamma P_p L)\right] = S_p P_p L - 6,   (15)

S_p = 10\log_{10}\left[\exp(2\gamma)\right] \approx 8.7\,\gamma,   (16)

where S_p is introduced as the parametric gain slope in [dB/(Watt·km)]. Based on Eqs. (4)-(6), the derived expression for the generated pulse is obtained [13]. A sinusoidally modulated pump P_p(t) is assumed together with a CW signal. The pump is also considered to be undepleted. They have found that the output pulses are approximately chirped Gaussian pulses in the high gain regime, defined as follows [13]:

A(0,t) = A_0\exp\left[-\frac{1 + iC}{2}\left(\frac{t}{T_0}\right)^2\right],   (17)

where the amplitude A0 can be identified as follows [14]:

A_0 = \frac{\exp(2 g_0 L)}{2 g_0},   (18)

where g0=g (t=0) is given by the following expression:

g_0 = \sqrt{-\Delta\beta\left(\frac{\Delta\beta}{4} + \gamma P_0\right)},   (19)

and the pulse width duration T0 is given by the following expression [14]:

T_0 = \sqrt{\frac{2\,g_0}{\Delta\beta\,\gamma\,P_0''\,L}}\ \text{nsec},   (20)

In these expressions, P_0 = P_P(0) is the peak pump power and P_0'' is the second derivative of P_P with respect to time at the peak value. The pulse width T_0^2 is proportional to 1/L. The optical signal wavelength span, 1.5 μm ≤ λ_si ≤ 1.65 μm, is divided into intervals per link as follows:

\Delta\lambda = \frac{\lambda_f - \lambda_i}{N_L} = \frac{0.15}{N_L}\ \mu m/link,   (21)

The transmitted MTDM bit rate per optical network channel is computed as follows [15]:

B_{r,channel} = \frac{1}{4T_0} = \frac{0.25}{T_0}\ Gbit/sec/channel,   (22)

Then the MTDM bit rate per fiber cable link is given by the following expression:

B_{r,link} = N_{ch} \times \frac{0.25}{T_0}\ Gbit/sec/link,   (23)

Therefore, the total MTDM bit rate per fiber cable core is given by the following expression:

B_{r,core} = N_L \times N_{ch} \times \frac{0.25 \times 1000}{T_0}\ Mbit/sec/core,   (24)

where Nch is the number of optical network channels in the fiber cable link, and NL is the number of links in the fiber cable core.
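Under the stated assumptions, Eqs. (21)-(24) reduce to simple arithmetic once the pulse width T_0 is known. The sketch below evaluates them; the values of T_0 and N_ch are illustrative inputs, N_L follows the figures, and the core-rate line follows Eq. (24) as reconstructed above.

# Sketch of Eqs. (21)-(24): channel spacing and MTDM bit rates, with the pulse
# width T0 taken as a given input (its value here is illustrative).
T0 = 0.25          # pulse width duration, nsec -> 1 Gbit/sec per channel
N_ch = 16          # optical network channels per fiber cable link (assumed)
N_L = 24           # links in the fiber cable core, as in Figs. 7-18

d_lambda = (1.65 - 1.5) / N_L             # Eq. (21): wavelength interval, um/link
B_channel = 0.25 / T0                     # Eq. (22): Gbit/sec/channel
B_link = N_ch * 0.25 / T0                 # Eq. (23): Gbit/sec/link
B_core = 0.25 * 1000 * N_ch * N_L / T0    # Eq. (24), as reconstructed: Mbit/sec/core
print(d_lambda, B_channel, B_link, B_core)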

IV. RESULTS AND DISCUSSIONS

We have investigated the basic MTDM transmission technique to transmit multi-optical network channels with higher bit rates based on both WDM and TDM, with the assistance of OPAs, in the interval of 1.5 μm to 1.65 μm. The following numerical data (the set of controlling parameters) of our system model are employed to obtain the performance of the hybrid WDM/TDM local area PON with the assistance of OPAs: 1.5 ≤ λ_si, optical signal wavelength, μm ≤ 1.65; 1.4 ≤ λ_p, pumping wavelength, μm ≤ 1.55; 0.5 ≤ P_P, pump power, Watt/pump ≤ 1.4; N_L: total number of links up to 24 links; number of laser diodes: K = 16 lasers; number of input ports of AWG Mux: M = 16 channels; number of output ports: N = 16 channels; number of users: W = 256 users; and the fiber cable parameters for different fiber cable types as shown in Table 1.


TABLE 1. PHYSICAL PARAMETERS USED IN THE PROPOSED HYBRID NETWORK MODEL FOR DIFFERENT FIBER CABLE TYPES [13].

Fiber Parameters                                  | SMF  | HNLF
Attenuation, α [dB/km]                            | 0.2  | 0.7
Effective area, A_eff [μm²]                       | 85   | 12
Nonlinear coefficient at 1.55 μm, γ [W⁻¹·km⁻¹]    | 1.8  | 15
Parametric gain slope, S_p [dB/(W·km)]            | 16   | 131
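For a rough check of the gain expressions, the following sketch evaluates Eq. (8) together with the decibel approximation of Eq. (15) using the Table 1 parameters, under the simplifying assumption of perfect phase matching (g = γP_p); the fiber lengths and pump power are illustrative.

# Sketch of Eqs. (8) and (15) under perfect phase matching (g = gamma * Pp),
# using the Table 1 fiber parameters; lengths and pump power are illustrative.
import numpy as np

fibers = {  # gamma [1/(W km)], parametric gain slope Sp [dB/(W km)], length L [km]
    "SMF": dict(gamma=1.8, Sp=16.0, L=1.0),
    "HNLF": dict(gamma=15.0, Sp=131.0, L=0.5),
}
Pp = 1.0  # pump power, W

for name, f in fibers.items():
    g = f["gamma"] * Pp                        # gain parameter at perfect phase matching
    G = 1 + np.sinh(g * f["L"]) ** 2           # Eq. (8) with gamma*Pp/g = 1
    G_dB_exact = 10 * np.log10(G)
    G_dB_approx = f["Sp"] * Pp * f["L"] - 6    # Eq. (15): high-gain approximation
    print(f"{name}: exact {G_dB_exact:.1f} dB, approx {G_dB_approx:.1f} dB")

For the HNLF case this gives roughly 59 dB from both expressions, which shows how closely the linear-in-P_pL decibel approximation of Eq. (15) tracks the exact hyperbolic form once the gain is high.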

Based on the above governing equations, the analysis of the proposed hybrid network model, the assumed set of controlling data parameters, and the series of Figs. 2-18, the following features are assured:

1) Figure 2 indicates that as the bandwidth per supported user increases, the transmission data rate per user also increases at the same laser switching time. Conversely, as the laser switching time increases, the transmission data rate per user decreases at the same bandwidth per supported user.

2) Figures 3 and 4 demonstrate that as the network utilization increases, the network delay time also increases at the same laser switching time. Moreover, as the laser switching time increases for the three offered services, the network delay time also increases at the same network utilization.

3) As shown in Figs. 5 and 6, as the pump power increases, the optical parametric gain also increases across both HNLF and SMF cables at the same optical signal wavelength. Moreover, as the optical signal wavelength increases, the optical parametric gain also increases across both HNLF and SMF cables at the same pump power.

4) Figures 7 and 8 show that as the number of links in the fiber cable core increases, the MTDM bit rate per channel also increases at the same fiber cable length. Moreover, as the fiber cable length increases, the MTDM bit rate per channel also increases at the same number of links. We also observed that using an OPA across the HNLF cable gives a longer transmission distance (network reach) and higher bit rates per channel.

5) Figures 9 and 10 demonstrate that as the number of links in the fiber cable core increases, the MTDM bit rate per link also increases at the same fiber cable length, and as the fiber cable length increases, the MTDM bit rate per link also increases at the same number of links. Using an OPA across the HNLF cable gives a longer network reach and higher bit rates per link.

6) Figures 11 and 12 indicate that as the number of links in the fiber cable core increases, the MTDM bit rate per fiber core also increases at the same fiber cable length, and as the fiber cable length increases, the MTDM bit rate per fiber core also increases at the same number of links. Using an OPA across the HNLF cable gives a longer network reach and higher bit rates per fiber core.

7) Figures 13 and 14 show that as the number of links in the fiber cable core increases, the MTDM bit rate per channel also increases at the same fiber cable length, and as the fiber cable length increases, the MTDM bit rate per channel also increases at the same number of links. Using an OPA across the SMF cable gives a longer network reach and higher bit rates per channel.

8) Figures 15 and 16 show that as the number of links in the fiber cable core increases, the MTDM bit rate per link also increases at the same fiber cable length, and as the fiber cable length increases, the MTDM bit rate per link also increases at the same number of links. Using an OPA across the SMF cable gives a longer network reach and higher bit rates per link.

9) As shown in Figs. 17 and 18, as the number of links in the fiber cable core increases, the MTDM bit rate per fiber core also increases at the same fiber cable length, and as the fiber cable length increases, the MTDM bit rate per fiber core also increases at the same number of links. Using an OPA across the SMF cable gives a longer network reach and higher bit rates per fiber core.

V. CONCLUSIONS

In summary, OPAs are employed over a wide range of the affecting parameters in hybrid WDM/TDM local area passive optical networks across SMF or HNLF cables to achieve the best QOS to handle a triple play solution for the supported users. We have demonstrated that the faster the laser switching time, the higher the transmission data rate per supported user and the lower the network delay time to handle the offered services (voice, video, and data) for the supported users. Moreover, we have demonstrated that the higher both the pumping power and the optical signal wavelength, the higher the optical parametric gain across HNLF and SMF cables. It is evident that OPAs play a vital role in extending network reach with higher bit rates, either per link or per channel, across both HNLF and SMF cables. Finally, we have demonstrated that OPAs with SMF cables offer higher bit rates, either per link or per channel, than HNLF cables for the same extended network reach.


REFERENCES

[1] J. Prat, V. Polo, C. Bock, C. Arellano, and J. J. Vegas-Olmos, “Full-Duplex Single Fiber Transmission Using FSK Downstream and IM Remote Upstream Modulations for Fiber-to-the-Home,” IEEE Photon. Technol. Lett., Vol. 17, No. 3, pp. 702–704, Mar. 2005.

[2] J. Prat, C. Arellano, V. Polo, and C. Bock, “Optical Network Unit Based on A bidirectional Reflective Semiconductor Optical Amplifier for Fiber to-the-Home Networks,” IEEE Photon. Technol. Lett., Vol. 17, No. 1, pp. 250–252, Jan. 2005.

[3] C. Bock and J. Prat, “Scalable WDMA/TDMA Protocol for Passive Optical Networks that Avoids Upstream Synchronization and Features Dynamic Bandwidth Allocation,” OSA J. Opt. Netw., Vol. 19, No. 4, pp. 226–236, Apr. 2005.

[4] J. Chen, L. Wosinska, and S. He, “High Utilization of Wavelengths and Simple Interconnection Between Users in a Protection Scheme for Passive Optical Networks,” IEEE Photon. Technol. Lett., Vol. 20, No. 6, pp. 389–391, Mar. 2008.

[5] J. Chen and L.Wosinska, “Analysis of Protection Schemes in Passive Optical Network (PON) Compatible With Smooth Migration from TDM-PON to Hybrid WDM/TDM PON,” J. Opt. Netw., Vol. 6, No.3, pp. 514–526, May 2007.

[6] J. Park, J. Baik, and C. Lee, “Fault-Detection Technique in a WDM PON,” Opt. Express, Vol. 15, No.4, pp. 1461–1466, 2007.

[7] K. Lee, S. B. Lee, J. H. Lee, Y.-G. Han, S.-G. Mun, S.-M. Lee, and C.-H. Lee, “A self-Restorable Architecture for Bidirectional Wavelength Division-Multiplexed Passive Optical Network With Colorless ONUs,” Opt. Express, Vol. 15, No.2, pp. 4863–4868, 2007.

[8] K. Lee, S. Mun, Chang-Hee Lee, and S. B. Lee, “Reliable Wavelength-Division-Multiplexed Passive Optical Network Using Novel Protection Scheme,” IEEE Photon. Technol. Lett., Vol. 20, No. 9, pp. 679–681, May 2008.

[9] J. Prat, C. Arellano, V. Polo, and C. Bock, "Optical Network Unit Based on A bidirectional Reflective Semiconductor Optical Amplifier for Fiber to-the-Home Networks," IEEE Photon. Technol. Lett., Vol. 17, No. 1, pp. 250–252, Jan. 2005.

[10] Hung-Chih Lu and Way-Seen Wang, “Cyclic Arrayed Waveguide Grating Devices With Flat-Top Passband and Uniform Spectral Response,” IEEE Photon. Technol. Lett., Vol. 20, No. 1, pp. 3–5, Jan. 2008.

[11] T. Torounidis, M. Westlund, H. Sunnerud, B.-E. Olsson, and P. A. Andrekson, “Signal Generation and Transmission at 40, 80, and 160 Gb/s Using A fiber-Optical Parametric Pulse Source,” IEEE Photon. Technol. Lett., Vol. 17, No. 2, pp. 312–314, Feb. 2005.

[12] C. Bock, J. Prat, and D. Walker, “Hybrid WDM/TDM PON Using the AWG FSR and Featuring Centralized Light Generation and Dynamic Bandwidth Allocation,” J. Lightw. Technol. Lett., Vol. 23, No. 12, pp. 3981–3988, Dec. 2005.

[13] J. Hansryd, P. A. Andrekson, M. Westlund, J. Li, and P. O. Hedekvist, “Fiber-Based Optical Parametric Amplifiers and Their Applications,” IEEE J. Sel. Topics Quantum Electron., Vol. 8, No. 3, pp. 506–520, May/Jun. 2002.

[14] T. Torounidis, M. Karlsson, and P. A. Andrekson, “Optical fiber Parametric Amplifier Pulse Source: Theory and Experiments,” J. Lightw. Technol. Lett., Vol. 23, No. 12, pp. 4067-4073, Dec. 2005.

[15] J. Qiao, F. Zhao, R. T. Chen, J. W. Horwitz, and W. W. Morey, "Athermalized Low Loss Echelle Grating Based Multimode Dense Wavelength Division Multiplexer," Journal of Applied Optics, Vol. 41, No. 31, pp. 6567-6575, 2002.


Abd-Elnaser A. Mohammed received the Ph.D. degree from the Faculty of Electronic Engineering, Menoufia University, in 1994. He is currently an Associate Professor in the Electronics and Electrical Communication Engineering Department. His fields of research interest are passive optical and communication networks, analog-digital communication systems, optical systems, and advanced optical communication networks.

Ahmed Nabih Zaki Rashed was born in Menouf, Menoufia State, Egypt, in 1976. He received the B.Sc. and M.Sc. degrees from the Electronics and Electrical Communication Engineering Department, Faculty of Electronic Engineering, Menoufia University, in 1999 and 2005, respectively. He is currently working toward the Ph.D. degree in active and passive optical networks (PONs). His theoretical and practical scientific research mainly focuses on the transmission data rates and distance of optical access networks.

Mahmoud M. A. Eid was born in Gharbiya State, Egypt, in 1977. He received the B.Sc. and M.Sc. degrees from the Electronics Communication Engineering Department, Faculty of Electronic Engineering, Menoufia University, in 2002 and 2007. He is currently working toward the Ph.D. degree in ultra-wide wavelength division multiplexing.

Deterministic Formulization of SNR for Wireless Multiuser DS-CDMA Networks

Syed S. Rizvi and Khaled M. Elleithy
Computer Science and Engineering Department, University of Bridgeport, Bridgeport, CT 06601
{srizvi, elleithy}@bridgeport.edu

Aasia Riasat
Department of Computer Science, Institute of Business Management, Karachi, Pakistan 78100
[email protected]

Abstract—Wireless multiuser receivers suffer from relatively high computational complexity, which prevents widespread use of this technique. In addition, one of the main characteristics of multi-channel communications that can severely degrade performance is the inconsistent and low values of SNR, which result in high BER and poor channel capacity. It has been shown that the computational complexity of a multiuser receiver can be reduced by using the transformation matrix (TM) algorithm [4]. In this paper, we provide a quantification of SNR based on the computational complexity of the TM algorithm. We show that the reduction of complexity results in high and consistent values of SNR that can consequently be used to achieve a desirable BER performance. In addition, our simulation results suggest that high and consistent values of SNR can be achieved for a desirable BER performance. The performance measure adopted in this paper is the consistency of the SNR values.

Keywords—Computational complexity, DS-CDMA, wireless multiuser receivers, signal to noise ratio

I. INTRODUCTION

From the design standpoint, for a given modulation and coding scheme there is a one-to-one correspondence between the bit error rate (BER) and the signal-to-noise ratio (SNR). From the user standpoint, SNR is not the favorite criterion for the performance evaluation of digital communication links, because the user measures the quality of a system by the number of errors in the received bits and prefers to avoid the technical details of modulation or coding. However, using received SNR rather than BER allows us to relate our performance criteria to the required transmitted power, which is very important for battery-operated wireless operations. Using SNR rather than BER has two advantages. First, SNR is the criterion used for assessing both digital and analog modulation techniques. Second, SNR is directly related to the transmitted power, which is an important design parameter.

A significant amount of effort has been made in order to achieve high values of SNR [3, 5]. However, none of these methods relate the complexity of multiuser receivers to achieving high SNR values. On the other hand, the TM algorithm is a low-complexity, synchronous transmission technique that is able to reduce the number of computations performed by a multiuser receiver for signal detection [4]. The TM algorithm therefore provides fast multiuser signal detection, which can be further used to achieve high SNR values. The contribution of this research work is the quantification of SNR using the TM algorithm proposed by Rizvi et al. [4]. At high SNR values, the error rate for a multi-channel system can be reduced, and the capacity of the channel can be well approximated.

Multiuser receivers can be categorized in the following two forms: optimal maximum likelihood sequence estimation (MLSE) receivers, and suboptimal linear and nonlinear receivers. A non-linear multiuser receiver involves the estimation and reconstruction of the MAI [6] seen by each user, with the objective of canceling it from the received signal. The two well-known implementations of this mechanism are SIC and PIC. In interference cancellation, MAI is first estimated and then subtracted from the received signal [1, 7]. On the other hand, linear multiuser receivers apply a linear transformation to an observation vector, which serves as a soft decision for the transmitted data. Recently, Ottosson and Agrell [2] proposed a new ML receiver that uses the neighbor descent (ND) algorithm. They implemented a linear iterative approach using the ND algorithm to locate the region where the actual observations belong. The linearity of their iterative approach increases noise components at the receiving end. Due to the enhancement in the noise components, the SNR and BER of the ND algorithm are more affected by the MAI. Table I, reported from [8], highlights the assumed knowledge for the computational complexity of a CDMA-based multiuser receiver. Table I shows that different receivers distinguish themselves with respect to the required knowledge as well as the implementation complexity.

Verdu [1] proposed the optimum multiuser detector for asynchronous systems.


(IJCSIS) International Journal of Computer Science and Information Security Vol. 3, No. 1, 2009

25

Page 34: International Journal of Computer Science July 2009

receiver grows exponentially, in the order of $O(2^K)$, where K is the number of active users. Recently, [2] proposed an ML receiver that uses the neighbor descent (ND) algorithm with an iterative approach to locate the regions. The linearity of the iterative approach increases the noise components at the receiving end. The TM algorithm [4] observes the coordinates of the constellation diagram to determine the location of the transformation points. Since most of the decisions are correct, the TM algorithm can reduce the number of computations by using the transformation matrices only on those coordinates which are most likely to lead to an incorrect decision.

II. PROPOSED TRANSFORMATION MATRIX ALGORITHM

We consider a synchronous DS-CDMA system as a linear time-invariant (LTI) channel. In an LTI channel, the probability of variations in the interference parameters, such as the timing of all users, amplitude variation, phase shift, and frequency shift, is extremely low. This property makes it possible to reduce the overall computational complexity at the receiving end. Our TM technique utilizes the complex properties of the existing inverse matrix algorithms to construct the transformation matrices and to determine the locations of the TPs that may occur in any coordinate of the constellation diagram. The individual TPs can be used to determine the average computational complexity.

The system may consist of K users. User k can transmit a signal at any given time with power $W_k$. With the binary phase shift keying (BPSK) modulation technique, the transmitted bits belong to either +1 or -1 (i.e., $b_k \in \{\pm 1\}$). The cross-correlation can be reduced by neglecting the variable delay spreads, since these delays are relatively small compared to the symbol transmission time. In order to detect signals from any user, the demodulated output of the low-pass filter is multiplied by a unique signature waveform assigned by a pseudo-random number generator. It should be noted that we extract the signal using the matched filter followed by a Viterbi algorithm. The optimum multiuser receiver exists and permits relaxing the constraint of choosing spreading sequences with good correlation properties, at the cost of increased receiver complexity.

A. Description of Transformation Matrix Algorithm

According to Verdu's original algorithm, the outputs of the matched filter $y_1(m)$ and $y_2(m)$ can be considered as a single output $y(m)$. In order to minimize the noise components and to maximize the received demodulated bits, we can transform the output of the matched filter, and this transformation can be expressed as follows: $y(m) = Tb + \eta$, where $T$ represents the TM, $b_k \in \{\pm 1\}$, and $\eta$ represents the noise components. In addition, if the vectors are regarded as points in $K$-dimensional space, then the vectors constitute a constellation diagram that has $K$ total points.

The constellation diagram can be mathematically expressed as $X = \{Tb\}$, where $b \in \{-1, +1\}$ and $X$ represents the collective computational complexity of a multiuser receiver.

The preceding equation is fundamental to the proposed algorithm. According to the detection rule, the constellation diagram can be partitioned into $2^K$ lines (where the total possible lines in the constellation diagram are represented as ſ) that can only intersect each other at the following points: $X = \{Tb \mid b \in \{-1, 1\}^K\} \setminus$ ſ.

Fig. 1 shows the constellation diagram, which consists of three different vectors (lines) with the original vector X that represents the collective complexity of the receiver. Q, R, and S represent vectors, or TPs, within the coverage area of a cellular network (see Fig. 1). In addition, $Q^{\lnot}$, $R^{\lnot}$, and $S^{\lnot}$ represent the computational complexity of each individual TP. In order to compute the collective computational complexity of the optimum receiver, it is essential to determine the complexity of each individual TP.

TABLE I. COMPLEXITY REQUIREMENTS OF DETECTION ALGORITHMS FOR DS-CDMA SYSTEMS

| Receiver | Signature of Desired User | Signature of Interference | Timing of Desired User | Timing of Interferers | Relative Amplitude | Training Sequence |
|---|---|---|---|---|---|---|
| Conventional and Rake | YES | NO | YES | NO | YES | NO |
| Linear ZF | YES | YES | YES | YES | NO | NO |
| Linear MMSE | YES | YES | YES | YES | YES | NO |
| SIC and PIC | YES | YES | YES | YES | YES | NO |
| Trained Adaptive MMSE | NO | NO | YES | NO | NO | YES |
| Blind Adaptive MMSE | YES | NO | YES | NO | NO | NO |


The computational complexity of each individual TP is represented by the $X^{\lnot}$ of that TP, which is equal to the collective complexity of $Q^{\lnot}$, $R^{\lnot}$, and $S^{\lnot}$. In order to derive the value of the original vector X, we need to perform the following derivations. We consider the original vector with respect to each transmitted symbol or bit.

$$X_{Q^{\lnot}} = X \cdot i^{\lnot} = (X_Q + X_R + X_S) \cdot i^{\lnot} = X_Q \cdot i^{\lnot} + X_R \cdot i^{\lnot} + X_S \cdot i^{\lnot}$$

$$X_{R^{\lnot}} = X \cdot j^{\lnot} = (X_Q + X_R + X_S) \cdot j^{\lnot} = X_Q \cdot j^{\lnot} + X_R \cdot j^{\lnot} + X_S \cdot j^{\lnot}$$

$$X_{S^{\lnot}} = X \cdot k^{\lnot} = (X_Q + X_R + X_S) \cdot k^{\lnot} = X_Q \cdot k^{\lnot} + X_R \cdot k^{\lnot} + X_S \cdot k^{\lnot}$$

The following equation can be derived from the above system:

$$\begin{bmatrix} X_{Q^{\lnot}} \\ X_{R^{\lnot}} \\ X_{S^{\lnot}} \end{bmatrix} = \begin{bmatrix} (i^{\lnot} \cdot i) & (i^{\lnot} \cdot j) & (i^{\lnot} \cdot k) \\ (j^{\lnot} \cdot i) & (j^{\lnot} \cdot j) & (j^{\lnot} \cdot k) \\ (k^{\lnot} \cdot i) & (k^{\lnot} \cdot j) & (k^{\lnot} \cdot k) \end{bmatrix} \begin{bmatrix} X_Q \\ X_R \\ X_S \end{bmatrix} \qquad (1)$$

Equation (1) relates $QRS$, with the unit vectors $i$, $j$, and $k$, to $X_{Q^{\lnot}}$, $X_{R^{\lnot}}$, and $X_{S^{\lnot}}$, with the inverses of the unit vectors $i^{\lnot}$, $j^{\lnot}$, and $k^{\lnot}$. The second matrix on the right-hand side of (1) represents $b$, whereas the first matrix on the right-hand side of (1) represents the actual TM. The TM from the global reference points to a particular local reference point can now be derived from (1):

$$\begin{bmatrix} X_{Q^{\lnot}} \\ X_{R^{\lnot}} \\ X_{S^{\lnot}} \end{bmatrix} = T_{L/G} \begin{bmatrix} X_Q \\ X_R \\ X_S \end{bmatrix} \qquad (2)$$

Equation (2) can also be written as:

$$T_{L/G} = \begin{bmatrix} (i^{\lnot} \cdot i) & (i^{\lnot} \cdot j) & (i^{\lnot} \cdot k) \\ (j^{\lnot} \cdot i) & (j^{\lnot} \cdot j) & (j^{\lnot} \cdot k) \\ (k^{\lnot} \cdot i) & (k^{\lnot} \cdot j) & (k^{\lnot} \cdot k) \end{bmatrix} \qquad (3)$$

In (3), the dot products of the unit vectors of the two reference points are in fact the same as the unit vectors of the inverse TM of (2). We need to compute the locations of the actual TPs described in (2) and (3). Let the unit vectors for the local reference point be:

$$i^{\lnot} = [\,T_{11} i,\; T_{12} j,\; T_{13} k\,], \qquad j^{\lnot} = [\,T_{21} i,\; T_{22} j,\; T_{23} k\,], \qquad k^{\lnot} = [\,T_{31} i,\; T_{32} j,\; T_{33} k\,] \qquad (4)$$

Since $i^{\lnot} \cdot (i + j + k) = i^{\lnot}$, where $(i + j + k) = 1$, and the same argument holds for the rest of the unit vectors, (4) can be rewritten as:

$$i^{\lnot} = [\,T_{11},\; T_{12},\; T_{13}\,], \qquad j^{\lnot} = [\,T_{21},\; T_{22},\; T_{23}\,], \qquad k^{\lnot} = [\,T_{31},\; T_{32},\; T_{33}\,] \qquad (5)$$

Figure 1. A constellation diagram consisting of three different vectors.


By substituting the values of $i^{\lnot}$, $j^{\lnot}$, and $k^{\lnot}$ from (5) into (3), we obtain

$$T_{L/G} = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix} \qquad (6)$$

Substituting $T_{L/G}$ from (6) into (2) yields

$$\begin{bmatrix} X_{Q^{\lnot}} \\ X_{R^{\lnot}} \\ X_{S^{\lnot}} \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{bmatrix} \begin{bmatrix} X_Q \\ X_R \\ X_S \end{bmatrix} \qquad (7)$$

Equation (7) corresponds to the standard equation used for computing the computational complexity at the receiving end: $X = Tb$, where $b_k \in \{-1, +1\}$.

If the target of one transformation $U: Q \to R$ is the same as the source of another transformation $T: R \to S$, then we can combine two or more transformations and form the following composition: $TU: Q \to S$, $TU(Q) = T[U(Q)]$. This composition can be used to derive the collective computational complexity at the receiving end using (7), as sketched below.
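The following hypothetical illustration (ours; the matrices and the point are placeholders, not the paper's values) shows the composition property: applying two 3x3 transformations in sequence is equivalent to applying their matrix product once.

```python
# Sketch of the composition TU(Q) = T[U(Q)] with placeholder transformations.
import numpy as np

U = np.array([[1.0, 0.0, 0.0],    # placeholder transformation Q -> R
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
T = np.array([[0.0, -1.0, 0.0],   # placeholder transformation R -> S
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

q = np.array([1.0, -1.0, 1.0])    # a point in the Q frame (BPSK-like coordinates)

step_by_step = T @ (U @ q)        # T[U(Q)]: apply U, then T
composed = (T @ U) @ q            # (TU)(Q): apply the composed matrix once
assert np.allclose(step_by_step, composed)
```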

Since we assumed that the transmitted signals are modulated using BPSK, which can use at most one bit out of two (i.e., $b_k \in \{\pm 1\}$), consider the following set of TPs to approximate the number of demodulated received bits that need to be searched out by the decision algorithm:

$$\begin{bmatrix} y(m) \\ y(m+1) \\ \vdots \\ y(K) \end{bmatrix} = \begin{bmatrix} Tb(0) & Tb(-1) & 0 & \cdots & 0 \\ Tb(1) & Tb(0) & Tb(-1) & \cdots & 0 \\ 0 & Tb(1) & Tb(0) & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & Tb(-1) \\ 0 & \cdots & 0 & Tb(1) & Tb(0) \end{bmatrix} + \begin{bmatrix} \eta(m) \\ \eta(m+1) \\ \vdots \\ \eta(m+k) \end{bmatrix} \qquad (8)$$

Equation (8) is derived using our fundamental TM equation (i.e., $y = Tb + \eta$). Our approach is to assume that the terms $\eta(m)$ and $\eta(m+k)$ in (8) are not equal to zero. This condition is fulfilled by periodically inserting a nonzero-energy bit in the information bit sequence. Therefore, the interference due to the cross-correlation of the actual symbols with the past and future symbols in the asynchronous channels can be accounted for, as the sketch below illustrates.
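The sketch below is our own construction under the stated assumptions, not the authors' code: a tridiagonal correlation matrix couples each symbol with its immediate past and future neighbours, mirroring the banded structure of (8), and AWGN is added to reproduce the fundamental relation y = Tb + eta. The function name and the correlation values are illustrative.

```python
# Build the banded (tridiagonal Toeplitz) structure of (8) and evaluate y = Tb + eta.
import numpy as np

def banded_correlation_matrix(n_symbols: int, r0: float, r1: float) -> np.ndarray:
    """R(0) on the diagonal, R(+/-1) on the first off-diagonals; zeros elsewhere."""
    T = np.zeros((n_symbols, n_symbols))
    for m in range(n_symbols):
        T[m, m] = r0
        if m > 0:
            T[m, m - 1] = r1          # correlation with the past symbol
        if m < n_symbols - 1:
            T[m, m + 1] = r1          # correlation with the future symbol
    return T

rng = np.random.default_rng(0)
b = rng.choice([-1.0, 1.0], size=8)   # BPSK symbols, b_k in {-1, +1}
T = banded_correlation_matrix(len(b), r0=1.0, r1=0.3)
eta = rng.normal(0.0, 0.1, size=len(b))   # AWGN component (nonzero, as assumed)
y = T @ b + eta                           # fundamental TM relation y = Tb + eta
```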

Using (7), a simple matrix addition of the received demodulated bits can be used to approximate the number of most-correlated TPs. The entire procedure for computing the number of demodulated bits that need to be searched out by the decision algorithm can be used to approximate the number of most-correlated signals for any given set of TPs. This is because we need to check whether or not the TPs are closest to either (+1, +1) or (-1, -1). The decision regions, or the coordinates where the TPs lie, for (+1, +1) and (-1, -1) are simply the corresponding transformation matrices that store the patterns of their occurrences. If the TPs do not exist in the region of either (+1, +1) or (-1, -1), then it is just a matter of checking whether the TPs are closest to (+1, -1) or to (-1, +1).

The minimum search performed by the decision algorithm occurs when the TPs lie within the incorrect region. Since the minimum search saves computation by one degree, the decision algorithm has to search at least $4^K$ demodulated bits. This implies that the total number of demodulated bits that need to be searched out by the decision algorithm cannot exceed $5^K - 4^K$. Thus, the total number of most-correlated pairs has an upper bound of $5^K - 4^K$.

Since most of the decisions are correct, we can reduce the number of computations by using the transformation matrices only on those coordinates that are most likely to lead to an incorrect decision. This greatly reduces the unnecessary processing required to make a decision about the correct region. Thus, the number of received demodulated bits that need to be searched out can be approximated as $5^K - 4^K$.

The computational complexity of any multiuser receiver can be quantified by its time complexity per bit [6]. The collective computational complexity of the proposed algorithm is obtained after performing the TM sum. This implies that both quantities T and b from our fundamental equation can be computed together, and the generation of all the values of the demodulated received bits b can be done through the sum of the actual TM T, which takes approximately $O((5/4)^K)$ operations with an asymptotic constant. Using the Newton approximation method given in MATLAB, we can directly arrive at an approximation of $O((5/4)^K)$. The computational complexity of the proposed algorithm is not polynomial in the number of users; instead, the number of operations required to maximize the demodulation of the transmitted bits and to choose an optimal value of b is $O((5/4)^K)$, and therefore the time complexity per bit is $O((5/4)^K)$.
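As a quick sanity check on these orders of growth (our own back-of-the-envelope comparison, not the authors' code), the snippet below contrasts the $O(2^K)$ operation count of the optimum receiver with the $O((5/4)^K)$ count quoted above, ignoring asymptotic constants.

```python
# Compare the quoted operation counts, 2**K versus (5/4)**K, for a few values of K.
for K in (10, 20, 30, 40):
    optimum = 2 ** K          # optimum multiuser detector, O(2^K)
    tm = (5 / 4) ** K         # proposed TM algorithm, O((5/4)^K)
    print(f"K={K:2d}: optimum ~ {optimum:.2e} ops, TM ~ {tm:.2e} ops, "
          f"ratio ~ {optimum / tm:.1e}")
```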


III. PROPOSED QUANTIFICATION OF SIGNAL-TO-NOISE RATIO (SNR)

In this section, we derive an expression that quantifies the SNR of the signals received at the DS-CDMA multiuser receiver. The reduced complexity of the TM algorithm provides a faster detection rate, and the faster detection rate results in high and consistent values of SNR. Once we determine the values of SNR, we can relate them to the BER performance and the channel capacity approximation for a wireless multiuser receiver.

MAI causes SNR degradation, resulting in degraded performance for a particular value of Eb/N0. We show that, due to the reduced complexity, the SNR performance of the TM algorithm remains consistent, in terms of the desired values, even for large values of K. This consistency in SNR performance yields an optimal BER performance.

A. System Model and Key Assumptions

Our fundamental assumption is that the system is linear time-invariant (LTI), which implies that the transmitted signals experience no deep fades. Due to the linearity and time-invariance of the system, we can ignore phase shifts and deep fades. In other words, the overall SNR of the received signals has a slow convergence rate compared to the convergence rate of the BER.

B. Proposed Formulization for SNR

Consider the following assumptions for an AWGN channel:

(a) $\aleph$ represents the computational complexity that belongs to a certain coverage area.

(b) SNR (we represent SNR by $\gamma$) is uniformly distributed among all the active users' signals with respect to computational complexity.

(c) A certain cellular coverage area has K users.

Based on the above assumptions, we can state the following hypothesis:

$$\aleph_i \in \{\aleph_1, \aleph_2, \aleph_3, \ldots, \aleph_K\} \qquad (9)$$

where $\aleph_1, \aleph_2, \aleph_3, \ldots, \aleph_K$ indicates the computational complexity-domain, and

$$h_i \in \{h_1, h_2, h_3, \ldots, h_K\} \qquad (10)$$

where $h_1, h_2, h_3, \ldots, h_K$ indicates the user-domain.

The complexity-domain can be considered as a simple data structure for storing the patterns of occurrences of all active users. The user-domain is the number of active users present in a certain coverage area of a cellular network. The collective computational complexity can be expressed as:

$$\aleph = \sum_{i=1}^{K} \aleph_i, \qquad i = 1, 2, \ldots, K \qquad (11)$$

Since each user has the $h$-th part of the computational complexity, such that $\aleph_1 \in h_1, \aleph_2 \in h_2, \ldots, \aleph_K \in h_K$, each active user in a certain area of a cellular network has an average computational complexity of $\aleph/K$. Since SNR is uniformly distributed among all the users' signals at the receiving end, each user experiences an average SNR of $\gamma/K$. This argument leads us to:

$$\left(\frac{\aleph}{K}\right)^{-1} = C^{-1} - C^{-1}\gamma_{\aleph} = C^{-1}\left(1 - \gamma_{\aleph}\right) \qquad (12)$$

where $C$ in (12) represents the normalization factor, $(\aleph/K)^{-1}$ is the inverse of the average computational complexity, and $\gamma_{\aleph} = \gamma/\aleph$ represents the SNR with respect to the collective computational complexity.

Equation (12) can be interpreted as follows: the inverse of the average computational complexity equals the difference between the inverse normalization factor and the product of the inverse normalization factor and the SNR with respect to the collective computational complexity. The main objective of (12) is to ensure that we obtain maximum positive values of SNR for most values of K.

C. Proof for $\gamma_{\aleph}$

If the previous assumptions are valid for an AWGN channel, the following approximation must hold for both the complexity and the user domains:

$$\frac{\aleph}{K} \approx C + \frac{\gamma}{K} \qquad (13)$$

Our hypothesis is that the difference between the average computational complexity and the average SNR should equal the normalization factor. The main objective of (13) is to obtain maximum positive values of SNR for most values of K. Equation (13) can also be written as:

$$\frac{\aleph}{K} - \frac{\gamma}{K} = C \qquad (14)$$

Based on (14), we can write the following equation:

(IJCSIS) International Journal of Computer Science and Information Security Vol. 3, No. 1, 2009

29

Page 38: International Journal of Computer Science July 2009

$$\frac{\gamma}{\aleph} = 1 - \frac{KC}{\aleph} \qquad (15)$$

Since the right-hand side of (15) represents the inverse of the average computational complexity with the normalization factor, the number of required operations cannot be less than zero. It should be noted that the right-hand side of (15) always gives a positive value of SNR for any value of K greater than 10. Equation (15) can also be rewritten as:

$$\frac{\aleph}{K} = \left[C^{-1}\left(1 - \gamma_{\aleph}\right)\right]^{-1} \qquad (16)$$

Using the complexity and the user domains, we can argue that the inverse of the average SNR should be at least greater than zero. This argument guarantees that the system does not work with a non-positive value of SNR. In other words, the inverse of the average SNR should equal the difference between the normalization factor and the inverse of the average computational complexity. Recalling (12) and solving for the SNR gives:

$$\gamma = \aleph - CK \qquad (17)$$

Equation (17) represents the SNR as the difference between the power of the transmitted signal, from the computational complexity-domain, and the number of users, from the user-domain. Equation (17) can also be used to compute the values of SNR in an ideal situation, but only if MAI does not affect the signals received from the K-1 users.

However, in a practical DS-CDMA system, this assumption does not hold. Therefore, we should consider that the variations in the network load for an AWGN channel introduce a variance (we represent the variance by $\Phi^2$) that represents MAI.

The selection of the variance depends entirely on the network load. The variance is a linear function of the number of active users (K), and it should increase as we increase the value of K. In order to compute the values of SNR, we need to convert the linear quantity into decibels (dB) by applying the base-10 logarithmic function and multiplying by the variance. This leads us to the following expression for SNR:

$$\gamma = 10\,\Phi^{2}\,\log_{10}\left(\aleph - CK\right) \qquad (18)$$

In our simulation, we use values of the variance that represent MAI with respect to K; a small sketch of this computation follows.
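The minimal sketch below shows how (18) can be evaluated numerically under our reading of the reconstructed formula; the complexity term aleph and the per-K variance values are illustrative placeholders, not the paper's simulation data.

```python
# Evaluate (18): gamma(dB) = 10 * Phi^2 * log10(aleph - C*K).
# aleph = (5/4)**K and the variance range are our assumptions for illustration.
import math
import random

C = 1.0  # normalization factor, as used for all algorithms in the simulations

def snr_db(aleph: float, K: int, phi_sq: float) -> float:
    """Evaluate (18); only valid while aleph > C*K, so the log argument is positive."""
    return 10.0 * phi_sq * math.log10(aleph - C * K)

for K in (22, 42, 72):
    aleph = (5.0 / 4.0) ** K             # assumed complexity-domain term for the TM receiver
    phi_sq = random.uniform(0.6, 0.9)    # illustrative variance (lightly-loaded range)
    print(f"K={K}: gamma ~ {snr_db(aleph, K, phi_sq):.1f} dB")
```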

IV. EXPERIMENTAL VERIFICATION AND SIMULATION RESULTS

Fig. 2 shows the logical diagram of a cellular system that uses a synchronous DS-CDMA scheme. We assume that all the users in the cellular area communicate using AWGN channels through one or more base stations. Because of the AWGN multipath channels, the symbol duration of the transmitted signal is much larger than the delay spread, which avoids inter-symbol interference. Therefore, the uplink (from user to BS) model is based on a synchronous DS-CDMA system with multipath channels and the presence of AWGN with zero mean and a varying amount of variance.

Figure 2. Logical diagram of a cellular network. (A large cellular area may have more than one base station serving different user domains; since the system is LTI, each user experiences the same average amount of SNR, and the average collective computational complexity remains the same as the total number of users increases.)

The choice of variance depends on the number of active users present in the coverage area of a cellular network. Furthermore, we assume that the transmission power of each user is tightly controlled (which is usual for wireless applications) by the central entity of a coverage area, such as a central base station (BS) or an access point (AP). This implies that the central entity (BS/AP) of a coverage area receives uniform-power signals, and they remain the same throughout the total communication time.

It has been shown that the SNR degradation depends on the number of users K [4]. An increase in K degrades the performance because it increases the cross-correlation between the signals received from all the users (i.e., K-1 users). Mathematically, we can express this as: K ∝ MAI ∝ high BER ∝ 1/SNR. This shows that a slight increase in K degrades the SNR performance, which consequently increases the BER. However, a large increase in the value of K forces the MAI to reach its peak value, which limits the divergence of SNR for the TM algorithm.

Three different types of detection algorithms are investigated: the original ML algorithm, the reduced-complexity ND algorithm, and the proposed TM algorithm [4]. The following parameters are used for two different scenarios: (i) a lightly-loaded network, where K ranges from 2 to 50, and (ii) a heavily-loaded network, where K ranges from 2 to 100. An LTI synchronous DS-CDMA system over an AWGN channel with small variation in $\Phi^2$ is used.

In order to compare the SNR performance of the proposed algorithm with the other multiuser detection algorithms, we use the same constant value with their asymptotic computational complexities, making no exception for any of the investigated algorithms. In our simulation of both scenarios, we use one (i.e., $C = 1$) as the normalization factor, which remains the same for all the investigated algorithms.

The choice of a small value of $\Phi^2$ is based entirely on the load of the coverage area (K), and it is selected through a random process for a certain range of users. For a lightly-loaded network, we expect that the value of the variance ($\Phi^2$)


may vary from 0.6 to 0.9, and for a heavily-loaded network the value of the variance may vary from 0.1 to 1. Since the proposed algorithm detects transmitted signals by using the complex properties of the inverse matrix algorithm, which observes the coordinates of the constellation diagram to determine the locations of the corresponding transformation points, it is likely that the value of the variance is extremely small for both lightly-loaded and heavily-loaded networks. Furthermore, all signals are transmitted at the same bit rate, and all signals are received with the same power (i.e., perfect power control).


A. Performance Evaluation for Lightly Loaded Networks

Figure 3. Approximate values of SNR (dB) versus number of users (K = 22) with a random amount of variance for a synchronous system in an AWGN channel.

Fig. 3 shows one of the possible cases of a lightly-loaded network, where 22 active users transmit BPSK-modulated signals. For a small value of K, the proposed TM algorithm


achieves approximately 6.5 dB of SNR, whereas the ND and ML algorithms give 5.8 and 5.5 dB, respectively. This implies that a slight increase in the value of K forces the TM algorithm to give an acceptable value of SNR, which can be used to achieve a satisfactory BER performance, at least for a voice communication network. It can be seen in Fig. 4 that the TM algorithm diverges more rapidly with respect to the number of users than the ND and ML algorithms. The divergence in SNR is directly proportional to the convergence in BER performance. In addition, it can be clearly observed in Fig. 4 that the linear increase in SNR for the TM algorithm is more uniform and smoother than for the ND and ML algorithms.

Furthermore, the importance of the variance cannot be ignored: Figures 3 and 4 clearly show that a random amount of variance affects the ND and ML algorithms more than the proposed algorithm. This is because both the ML and ND algorithms have comparatively larger complexity-domains, which take more time to perform the iterations required to detect the received signals and thus give the variance more time to comprehensively affect the received SNR. The degradation in SNR due to the variance can be seen in Figures 5 and 6, for K = 42 and K = 52, respectively. Moreover, for a lightly-loaded network, it can be expected that the selection of the variance within the specified range does not meet the threshold value. In other words, the random amount of variance is more likely to be unstable in a lightly-loaded network than in a heavily-loaded network, and may thus cause a serious degradation in the values of SNR.

B. Performance Evaluation for Heavily Loaded Networks

For the heavily-loaded case, we consider a cellular network that consists of approximately 2 to 100 active users. Figs. 7 and 8 show examples of the heavily-loaded case, where 72 to 102 active users transmit signals through the central entity of the network.

Figure 4. Approximate values of SNR (dB) versus number of users (K = 32) with a random amount of variance for a synchronous system in an AWGN channel.

Figure 5. Approximate values of SNR (dB) versus number of users (K = 42) with a random amount of variance for a synchronous DS-CDMA system in a Gaussian channel.

Figure 6. Approximate values of SNR (dB) versus number of users (K = 52) with a random amount of variance for a synchronous DS-CDMA system in a Gaussian channel.

Fig. 7 shows that the linear increase in SNR is consistent not only for a lightly-loaded network but also for a heavily-


loaded network. However, it can also be noticed from Figures 7 and 8 that, as the number of users in the system increases, the difference between the SNR values for the proposed algorithm and those for the ML and ND algorithms becomes wider. From Fig. 7, the proposed algorithm gives approximately 36 dB for K = 72, which is more than what we expect to achieve for an optimal BER performance. In addition, the random amount of variance affects the SNR values more in the heavily-loaded case than in the lightly-loaded case.

In Fig. 8, it can be seen that the ND algorithm achieves comparatively higher values of SNR than the ML algorithm in a heavily-loaded network (typically when K > 55), compared to a lightly-loaded network. This is because the computational complexity for the heavily-loaded case is much greater than that for the lightly-loaded case, which forces both the ML and ND algorithms to minimize the divergence factor and hence maximize the convergence factor. Since we assume that the selection of the variance is random within the specified range, it remains stable after a certain value of K, which limits the divergence of the SNR.

Another important point that can be observed from Fig. 8 is that the graph for the proposed algorithm converges to approximately 45 dB after 100 users, and only a slight increase in the value of SNR can be expected for very large values of K. This is also essential for achieving acceptable performance, since crossing the threshold value of SNR might degrade the overall system performance. In other words, after a certain value of K, the MAI reaches its peak value, which limits the divergence of the SNR curve for the proposed algorithm.

Figure 7. Approximate values of SNR (dB) versus number of users (K = 72) with a random amount of variance for a synchronous DS-CDMA system in a Gaussian channel.

Figure 8. Approximate values of SNR (dB) versus number of users (K = 102, heavily-loaded network) with a random amount of variance for a synchronous system in an AWGN channel.

V. CONCLUSION

In this paper, we presented a quantification of SNR based on the TM algorithm. We have shown that the reduction in the computational complexity of a multiuser receiver can be used to achieve high and consistent values of SNR. The simulation results suggest that, due to its low-complexity domain, the SNR performance of the TM algorithm is more uniform and smoother than that of the other well-known algorithms. As future work, it will be interesting to apply the proposed approach to asynchronous systems, to achieve the desired BER performance and to approximate the capacity of a multichannel system.

REFERENCES

[1] S. Verdu, Multiuser Detection. Cambridge University Press, 1998.
[2] T. Ottosson and E. Agrell, “ML optimal CDMA multiuser receiver,” Electronics Letters, Vol. 31, Issue 18, pp. 1544-1555, August 1995.
[3] N. Jindal, “High SNR analysis of MIMO broadcast channels,” Proceedings of the International Symposium on Information Theory, Vol. 4, No. 9, pp. 2310-2314, Sept. 2005.
[4] S. S. Rizvi, K. M. Elleithy, and A. Riasat, “Transformation Matrix Algorithm for Reducing the Computational Complexity of Multiuser Receivers for DS-CDMA Wireless Systems,” Wireless Telecommunication Symposium (WTS 2007), Pomona, California, April 26-28, 2007.
[5] A. Lozano, A. M. Tulino, and S. Verdú, “High-SNR Power Offset in Multiantenna Communication,” IEEE Transactions on Information Theory, Vol. 51, No. 12, pp. 4134-4151, Dec. 2005.
[6] Z. Tian, H. Ge, and L. Scharf, “Low-complexity multiuser detection and reduced-rank Wiener filters for ultra-wideband multiple access,” IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 3, pp. 621-624, 2005.

[7] E. Del Re, R. Fantacci, S. Morosi, and S. Marapodi, “A low-complexity multiuser detector for asynchronous CDMA QPSK adaptive antenna array systems,” Wireless Networks, Kluwer Academic Publishers, The Netherlands, Vol. 9, Issue 4, pp. 373-378, July 2003.

[8] P. Castoldi, Multiuser Detection in CDMA Mobile Terminals. Artech House, Inc., 2002.

Syed S. Rizvi is a Ph.D. student of Computer Science and Engineering at the University of Bridgeport. He received a B.S. in Computer Engineering from Sir Syed University of Engineering and Technology and an M.S. in Computer Engineering from Old Dominion University in 2001 and 2005, respectively. In the past, he has done research on bioinformatics projects, where he investigated the use of Linux-based cluster search engines for finding desired proteins in input and output sequences from multiple databases. For the last three years, his research has focused primarily on the modeling and simulation of a wide range of parallel/distributed systems and on web-based training applications. Syed Rizvi is the author of 68 scholarly publications in various areas. His current research focuses on the design, implementation, and comparison of algorithms in the areas of multiuser communications, multipath signal detection, multi-access interference estimation, computational complexity and combinatorial optimization of multiuser receivers, peer-to-peer networking, network security, and reconfigurable coprocessor and FPGA-based architectures.

Aasia Riasat has been an Associate Professor of Computer Science at the College of Business Management (CBM) since May 2006. She received an M.Sc. in Computer Science from the University of Sindh, and an M.S. in Computer Science from Old Dominion University in 2005. For the last year, she has been working as an active member of the Wireless and Mobile Communications (WMC) lab research group of the University of Bridgeport, Bridgeport, CT. In the WMC research group, she is mainly responsible for the simulation design for all the research work. Aasia Riasat is the author or co-author of more than 40 scholarly publications in various areas. Her research interests include modeling and simulation, web-based visualization, virtual reality, data compression, and algorithm optimization.

Khaled Elleithy received the B.Sc. degree in computer science and automatic control from Alexandria University in 1983, the M.S. degree in computer networks from the same university in 1986, and the M.S. and Ph.D. degrees in computer science from the Center for Advanced Computer Studies at the University of Louisiana at Lafayette in 1988 and 1990, respectively. From 1983 to 1986, he was with the Computer Science Department, Alexandria University, Egypt, as a lecturer. From September 1990 to May 1995, he worked as an assistant professor at the Department of Computer Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. From May 1995 to December 2000, he worked as an Associate Professor in the same department. In January 2000, Dr. Elleithy joined the Department of Computer Science and Engineering at the University of Bridgeport.


A multidimensional approach for context-aware recommendation in mobile commerce

Maryam Hosseini-Pozveh Department of Computer Engineering

University Of Isfahan Isfahan, Iran

[email protected]

Mohamadali Nematbakhsh Department of Computer Engineering

University Of Isfahan Isfahan, Iran

[email protected]

Naser Movahhedinia Department of Computer Engineering

University Of Isfahan Isfahan, Iran

[email protected]

Abstract—Context, as the dynamic information describing the situation of items and users and affecting the user's decision process, is essential for recommender systems in mobile commerce to guarantee the quality of recommendation. This paper proposes a novel multidimensional approach for context-aware recommendation in mobile commerce. The approach represents users, items, context information, and the relationships between them in a multidimensional space. It then determines the usage patterns of each user under different contextual situations, creates a new 2-dimensional recommendation space, and performs the final recommendation in that space. This paper also presents an evaluation process that implements the proposed approach in a restaurant food recommendation system, considering day, time, weather, and companion as the contextual information, and compares the approach with the traditional 2-dimensional one. The results of the comparison illustrate that the multidimensional approach increases the recommendation quality.

Keywords: context-awareness; multidimensional recommendation approach; mobile commerce; self-organizing maps; collaborative filtering

I. INTRODUCTION

Recommender systems in mobile commerce have been an important research issue in recent years, driven by advances in mobile and wireless computing. Mobile or pervasive commerce is known as performing e-commerce activities via a wireless environment, especially the wireless internet, and mobile handheld devices, which are growing very fast [1], [2]. Mobile commerce applications have two special properties, mobility and broad reach [1], [3]. The former emphasizes breaking the not-anywhere limitation and the latter emphasizes breaking the not-anytime limitation in the interactions between users and applications [1], [3], [4], [5]. The opportunity for users to simply use mobile phones or personal digital assistants at any place and any time for electronic commerce activities such as e-banking and e-shopping, together with the many other opportunities for users and businesses, makes it clear why mobile commerce applications are important research topics [1], [3]. The implementation of recommender systems in mobile commerce cannot be very useful without considering the effect of the unique parameters of that environment, which are named context information [6].

The goal of recommendation systems is to recommend suitable resources, that is, items such as web pages, books, or movies that users prefer over other items [7], [8], [9]. In recommendation systems, three data sets exist: the users' information (C), the recommendable items' information (S), such as books, movies, music, and so on, and the relationship data between users and items. The relationship between S and C is based on a rating structure that describes the usefulness degree of items to users. This relationship can be defined by a function named the utility function u: [7]

$$u: C \times S \to \text{Ratings} \qquad (1)$$

where Ratings is a totally ordered set including nonnegative integers or real numbers within a certain range.

The main problem of recommendation systems is that the function u is defined only on a subset of the $C \times S$ domain, not on the whole set, and therefore its unspecified parts must be predicted. After the prediction phase, the system can recommend the items with the highest predicted ratings to the users [7].
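The toy example below (entirely ours, with made-up users and items) illustrates this partiality of u: unobserved entries are exactly what the recommender must predict before ranking items for each user.

```python
# u: C x S -> Ratings is defined only on a subset of C x S; the rest must be predicted.
ratings = {
    ("alice", "pizza"): 5,
    ("alice", "sushi"): 2,
    ("bob",   "pizza"): 4,
    # ("bob", "sushi") is unobserved and must be predicted by the recommender
}

def u(user: str, item: str):
    """Partial utility function; returns None where u is not yet defined."""
    return ratings.get((user, item))

print(u("bob", "sushi"))  # None: a gap the prediction phase has to fill
```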

To reach the final goal of recommendation systems, different methods have been developed, which are categorized as follows: [7], [9], [10]

• Content-based: in this group of methods, the system recommends items that are most similar to the items the user has rated highly in the past. That is, u(c, s) is estimated based on the utilities u(c, si), where the si are items similar to s.

• Collaborative filtering: in this group of methods, the system recommends items that are highly rated by peers of the current user. More formally, the utility u(c, s) is estimated based on the utilities u(cj, s), where the users cj are similar to user c.


• Hybrid models: these approaches combine the two previous methods and therefore use the benefits of both for specifying and recommending suitable items.

From another point of view, the recommendation approaches, content-based and collaborative filtering, are divided into model-based and memory-based approaches. In contrast to memory-based models, model-based approaches build a model using rating sets and machine learning methods and use that model for future rating predictions [7], [10], [11].

The rest of this paper is organized as follows. Section 2 gives an overview of related work. Section 3 describes context-aware recommendation concepts in mobile commerce. Section 4 details the novel multidimensional approach. Section 5 presents the implementation experience and reports the evaluation results; finally, the paper concludes in Section 6.

II. RELATED WORK

The emergence of mobile handheld devices and wireless technologies has created many opportunities for electronic commerce applications. Presenting more customized and personalized information is one of the most important goals of mobile commerce applications. Using context, the dynamic information describing the situation of items and users and affecting the user's decision process, in recommender systems for mobile commerce is a solution to guarantee the quality of recommendation. All the methods proposed so far have tried to use contextual information to produce proper outputs.

Location-aware recommender systems are an important subset of context-aware ones. Yang, Cheng, and Dia [12] have presented a location-aware recommender system for mobile environments whose goal is to recommend vendors' websites, considering customer preferences as well as the distance between the customer's location and the locations presented in the websites. Proximo [25] is another location-aware recommender system for indoor environments such as museums and art galleries. It shows the recommendable items on a map in the user's mobile device.

In addition to the location parameter, the use of other contextual parameters for recommendation has also been studied. Li, Wang, Geng, and Dai [13] have proposed a context-aware recommender system for mobile commerce applications. Their framework uses a multidimensional model for representing the recommendation space and a reduction-based approach for reducing that space to a 2-dimensional one. It then uses a traditional recommendation method in that final space.

Some mobile context-aware recommender systems use ontology and the semantic web [26], [27]. Ontology can be used for modeling context or for modeling the relation between context and other data sets. It can also be used in the recommendation process.

III. MOBILE CONTEXT-AWARE RECOMMENDER SYSTEMS

The context-awareness concept has been used in various research efforts belonging to different fields of the mobile computing scope, and has been defined either generally or in a way conforming to the application at hand. As Doulkeridis, Loutas, and Vazirgiannis [14] present, context can be defined as facts that include other things and add meaning to them. In mobile computing, the first reading of those facts is awareness of the user's situation and the environment surrounding him/her. Dey and Abowd [15] define context as any information that helps describe the situation of an entity. In this definition, an entity is a person, place, or any other thing relevant to the interactions between user and application, including the user and the application themselves. Many other articles in the context scope use the latter definition in their research [16], [17], [18].

In most of those researches, context has been defined by parametric examples in addition to the general conceptual definitions. In general, context can be defined as a set comprising the user's personal information, his/her preferences and interests, his/her current activities, and the environmental parameters surrounding him/her, including geographical information such as location, direction, and speed, environmental information such as temperature and weather, time, and social parameters [6], [14], [16], [17], [19]. Dey and Abowd [15] have proposed a two-tiered architecture to define and categorize contextual parameters. At the first level, the primary context types, which are location, time, identity, and activity, are located. The other contextual parameters, the secondary context types, are located at the second level and are considered attributes of the first-level types. For example, weather or temperature can be retrieved from the location and time information.

In the current research, the main goal is to propose an approach for recommendation in mobile commerce considering context information. Context could be used to filter or prioritize services/information for the users [6], [16], [19] and therefore using it in recommendation techniques may be very beneficial. In fact, context describes the users’ dynamic properties and system can inference users’ needs from the context content [16].

A mobile context-aware recommendation system must be able to recommend the items considering items/users’ dynamic properties which describe the situation of users and items and affect the users’ decision process.

Context information in mobile recommendation systems can be divided as follows:

1- Historical (offline) contextual information The knowledge of which are the best users’ favorite

items/services in different context scopes and over time could be registered in the system for using in future estimations. Using those parameters helps to better compute the similarity degree among users and items.

2- Online contextual information


Online contextual information describes the situation of users (and items/services) at the time a request is sent to the system. A context-aware recommendation system uses this kind of information in a filtering mechanism or a prioritizing process for delivering information/services to users.

In the definition phase of the context parameters (historical or online), it is necessary to determine their degree of effectiveness in producing suitable output for the system. This issue is an important sub-problem in context-aware recommendation systems. Context-aware recommendation systems may therefore include both historical and online parameters, or use only online parameters, to recommend items. Suppose a system uses only online context; then, if the user is in contexti, the system can recommend items that are in contexti too. But if the system uses both online and historical context, it can recommend the items that are predicted to be liked by the user in contexti, not all the items existing in that context.

When a system uses contextual information, the context must be modeled, and the relationship between it and the other information sets of the system must be determined. Adomavicius, Sankaranarayanan, Sen, and Tuzhilin [10] have proposed a multidimensional model for context-aware recommender systems that can be used in mobile recommender systems too. The historical contextual information can thus be represented in a multidimensional model, and the recommendation space changes from a 2-dimensional space to a multidimensional one in which each of the historical parameters is one of the dimensions. Presenting each dimension as Di, the utility function becomes: [10]

$$u: D_1 \times D_2 \times \cdots \times D_n \to \text{Ratings} \qquad (2)$$

where each $D_i$ (each of the dimensions) is a subset of the Cartesian product of its attributes, and each attribute is a set of values:

$$D_i \subseteq A_{i1} \times A_{i2} \times \cdots \times A_{ik_i} \qquad (3)$$

where each dimension is supposed to have $k_i$ attributes. These attributes are called the profile of the dimension (for example, the user dimension includes name, sex, age, and so on).

If a mobile recommender system does not use historical contextual information, the multidimensional model changes to a classical 2-dimensional one. Online contextual parameters are delivered to the system with the users' requests. The online parameter set may not exactly equal the historical parameter set; in fact, the set of historical parameters can be a smaller or equal subset of the online parameter set.

IV. THE MULTIDIMENSIONAL APPROACH FOR MOBILE CONTEXT-AWARE RECOMMENDER SYSTEMS

Context information affects the user's decision process. In a recommendation system, this means that the item sets a user likes can be different in various context situations, context1, context2, through contextn. Therefore, the proposed approach uses a multidimensional dataset with the cube of ratings, and it includes the following phases:

(1) Recognizing users' different usage patterns under different context situations: As Ehrig, Haase, Hefke, and Stojanovic state [20], "if in two contexts the same (related) entities are used then these contexts are similar"; accordingly, the context situations in which a user has similar usage patterns are determined. In fact, the different context situations defined for the system are clustered based on the user's usage patterns. The clusters are labeled from 1 to m (m is the number of clusters).

(2) Making a new 2-dimensional recommendation space: In this phase, using the results of the previous phase, for each user ci, m new users ci1 to cim are defined. m may or may not be the same for different users, depending on the method used in phase one. In fact, if ci has similar usage patterns in contexts, contextg, and contextc, and those contexts are labeled with contextout-L (1<L<m), then, as shown in "Fig. 1", a new user ci-L is defined. User ci-L is equivalent to user ci in the different context situations labeled with contextout-L.

Figure 1. Making new users equivalent to user ci in different contextual situations

So relation (2) can be redefined as relation (4):

$$u': C' \times S \to \text{Ratings}' \qquad (4)$$

where $C'$ is the new user set, S is the same as in relation (2), and $\text{Ratings}'$ is the new users' rating set. If ci has rated item si in different context situations labeled with contextout-L (1<L<m), the rating of user ci-L for item si in relation (4) is the result of an aggregation function (such as the average) over the previous ratings.

(3) Doing the recommendation process in the new 2-dimensional space: Any of the model-based or memory-based approaches can then be used in this phase to recommend suitable items to users; a condensed sketch of phases (1)-(2) follows.
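The condensed sketch below (our own simplification, with hypothetical users, contexts, and items) shows how phases (1) and (2) turn the rating cube into the new 2-dimensional space: each (user, context-cluster-label) pair becomes a new user, and ratings given in merged contexts are aggregated by averaging.

```python
# Phases (1)-(2) in miniature: cluster labels merge context situations per user,
# each (user, label) pair becomes a new user c_i-L, and ratings are averaged.
from collections import defaultdict
from statistics import mean

# (user, context, item) -> rating, as gathered in the multidimensional cube
cube = {
    ("c1", "weekday_noon",  "kebab"): 4,
    ("c1", "weekend_noon",  "kebab"): 5,
    ("c1", "weekday_night", "soup"):  2,
}
# Output of phase (1): a cluster label for every (user, context situation) pair
context_label = {("c1", "weekday_noon"): 1,
                 ("c1", "weekend_noon"): 1,    # similar usage pattern -> same label
                 ("c1", "weekday_night"): 2}

merged = defaultdict(list)
for (user, ctx, item), rating in cube.items():
    new_user = f"{user}-{context_label[(user, ctx)]}"   # e.g. "c1-1"
    merged[(new_user, item)].append(rating)

# Phase (2) result: the new 2-D space u': C' x S -> Ratings'
ratings_2d = {key: mean(vals) for key, vals in merged.items()}
print(ratings_2d)   # {('c1-1', 'kebab'): 4.5, ('c1-2', 'soup'): 2}
```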

V. IMPLEMENTATION AND EVALUATION

For evaluating the proposed approach, a mobile context-aware recommender system for restaurant food items is used. First, ratings are gathered, and then the method is evaluated on that data set. The dimensions of the system are user and


item as the main dimensions, and day, time, weather, and companion as the contextual dimensions:

• Day: Weekday, Weekend.

• Time: Morning, Noon, Afternoon, Night.

• Companion: Spouse, Family, Friends, Co-workers, Alone, Others.

• Weather: Cold/Sunny, Cold/Rainy, Moderate/Sunny, Moderate/Rainy, Hot/Sunny, Hot/Rainy, Others.

In this research, location is not used as a historical parameter, because it does not differ between the same food items of various restaurants in the recommendation scenario.

The data set includes ratings of 630 users for 400 foods in different contextual situations, from October to December 2008. The system used for gathering the ratings is implemented with J2ME technology (see http://java.sun.com/ for its description). Evaluation is done offline, with a division of the data set into 80% (training set) and 20% (test set).

For the usage pattern recognition and for the similar-user recognition, self-organizing maps are used. Self-organizing maps (SOMs), or the Kohonen model, are a kind of unsupervised artificial neural network. Because they can be used as a clustering technique, they are used in many recommendation systems, especially collaborative filtering-based ones, and they have been shown to give good performance in terms of recommendation quality and processing costs [11], [21]. As a simplified definition, in a topology-preserving map, units (neurons) located physically next to each other respond to classes of input vectors that are likewise next to each other; that is, these networks find similar input vectors or cluster them [22], [23] (“Fig. 2”).

Figure 2. Self-organizing map

For each input vector, the fired node and its neighborhood nodes are updated as below [22], [23]:

$$W(t+1) = W(t) + \alpha\,\bigl(X - W(t)\bigr) \qquad (5)$$

In the current paper, the cosine similarity measure is used to compute the distance between the input vectors and the weight vectors of the units:

$$\cos(x, w) = \frac{x \cdot w}{\|x\|_2\,\|w\|_2} = \frac{\sum_{s \in S} x_s\, w_s}{\sqrt{\sum_{s \in S} x_s^2}\,\sqrt{\sum_{s \in S} w_s^2}} \qquad (6)$$
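A minimal sketch (ours, not the authors' implementation) of one SOM training step combines the update rule (5) with the cosine measure (6); a real run would decay alpha and shrink the neighborhood over time, which this sketch omits.

```python
# One SOM training step: fire the most cosine-similar unit, then pull it
# (and its immediate neighbours on a 1-D map) toward the input via rule (5).
import numpy as np

def cosine(x: np.ndarray, w: np.ndarray) -> float:
    return float(x @ w / (np.linalg.norm(x) * np.linalg.norm(w) + 1e-12))

def som_step(weights: np.ndarray, x: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    """weights: (n_units, p) matrix of unit weight vectors; x: one input vector."""
    winner = int(np.argmax([cosine(x, w) for w in weights]))
    for j in (winner - 1, winner, winner + 1):           # simple 1-D neighbourhood
        if 0 <= j < len(weights):
            weights[j] = weights[j] + alpha * (x - weights[j])   # update rule (5)
    return weights

rng = np.random.default_rng(1)
W = rng.random((6, 10))      # 6 units, p = 10 items (as in the text's rating vectors)
x = rng.random(10)           # one context-situation rating vector
W = som_step(W, x)
```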

In the usage pattern recognition phase, different contextual situations are clustered for each user. Each input vector of the self-organizing map is therefore one of the context situations (the number of inputs equals the product of the numbers of values of the contextual dimensions). The vectors are defined in a p-dimensional space, where p is the number of items, and the values of the vectors are the user's ratings of the items. After the user's usage pattern recognition phase, the new recommendation space is built. In the next phase, using a SOM, a collaborative filtering-based recommendation method is applied: a self-organizing map network is used for clustering the users, so each input vector is one of the users of the system. These vectors are likewise defined in a p-dimensional space, where p is the number of items, and their values are the users' ratings of the items.

By changing the number of nodes in the Kohonen layer and considering its effect on the F1 measure [24], the optimal numbers of neurons for the usage pattern recognition network and the similar-user recognition network were determined. In the former network, the number was varied from 2 to 15, and in the latter from 5 to 35. The final numbers were 6 neurons (resulting in 6 clusters) in the former network and 21 neurons (resulting in 21 clusters) in the latter. The F1 measure is defined as below [24]:

$$F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (7)$$

where Precision is defined as the ratio of relevant items selected to the number of items selected, and Recall is defined as the ratio of relevant items selected to the total number of relevant items available.
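As a small worked example of (7) (our own, with a hypothetical Top-5 list), the helper below computes Precision, Recall, and F1 directly from a recommended set and a relevant set.

```python
# Direct transcription of the Precision/Recall/F1 definitions used in (7).
def f1_measure(recommended: set, relevant: set) -> float:
    hits = len(recommended & relevant)            # relevant items that were selected
    if not recommended or not relevant or hits == 0:
        return 0.0
    precision = hits / len(recommended)           # relevant selected / items selected
    recall = hits / len(relevant)                 # relevant selected / relevant available
    return 2 * precision * recall / (precision + recall)

# Top-5 list with 2 hits out of 3 relevant items: precision 0.4, recall 2/3, F1 = 0.5
print(f1_measure({"a", "b", "c", "d", "e"}, {"b", "c", "x"}))
```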

The average F1 for 200 users and Top-5, Top-10, Top-15, Top-20, Top-25, and Top-30 recommendations is calculated. The results are presented in “Fig. 3”.

Figure 3. Average F1 measure diagram for the multidimensional recommendation approach


The proposed approach is compared with the traditional recommendation approach, which does not consider the contextual parameters in recommendation. The gathered dataset is used in this phase too. Some items are rated in only one context situation, while some items are rated in more than one context situation by a single user. In the former case, the values of the user and item dimensions and the corresponding rating are selected. In the latter case, the values of the user and item dimensions are selected, and the corresponding rating is calculated with an aggregation function (for example, the average function) over all those ratings.

This method is also implemented with SOMs as a collaborative filtering-based approach in the 2-dimensional space. The optimal number of neurons in the Kohonen layer was determined to be 19 (resulting in 19 clusters). The average F1 measure for 200 users and Top-5, Top-10, Top-15, Top-20, Top-25, and Top-30 recommendations is calculated. As the results presented in “Fig. 4” show, the recommendation quality is increased by the context-aware multidimensional recommendation approach.

Figure 4. Average F1 measure diagram for the traditional recommendation approach

The average F1 measure is also calculated for 10 users of each cluster and for the Top-5 recommendation items. The results for both the multidimensional approach and the traditional approach are presented in “Fig. 5” and “Fig. 6”.

Figure 5. Average F1 measure diagram for each cluster in the contextual recommendation approach

Figure 6. Average F1 measure diagram for each cluster in the traditional recommendation approach

VI. CONCLUSION

In this paper, a multidimensional approach for recommendation in m-commerce has been presented. The approach includes three phases: recognizing users' usage patterns, making a new 2-dimensional space, and doing the final recommendation. It has been tested in a restaurant food recommendation system. The context parameters of the system are day, time, companion, and weather. The evaluation results illustrate the proper quality of the approach and the necessity of context in mobile recommender systems.

REFERENCES

[1] E. Turban and D. King, Introduction to E-Commerce: Mobile Commerce, Prentice Hall, 2003, pp. 332-380.
[2] N. Shi, Mobile Commerce Applications, Idea Group Publishing, 2004.
[3] E. W. T. Ngai and A. Gunasekaran, “A review for mobile commerce research and applications,” Decision Support Systems, Vol. 43, pp. 3-15, 2005.
[4] B. E. Mennecke and T. J. Strader, Mobile Commerce - Technology, Theory, and Application, Idea Group Publishing, 2003.
[5] A. H. Morales-Aranda, O. Mayora-Ibarra, and S. Negrete-Yankelevich, “M-Modeler: A Framework Implementation for Modeling M-Commerce Applications,” Proceedings of the 6th International Conference on Electronic Commerce (ICEC '04), 2004, pp. 596-602.
[6] A. Schmidt, M. Beigl, and H. Gellersen, “There is more to context than location,” Computers & Graphics, Vol. 23, 1999, pp. 893-901.
[7] G. Adomavicius and A. Tuzhilin, “Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions,” IEEE Transactions on Knowledge and Data Engineering, IEEE Computer Society, 2005.
[8] J. Bennett, “A Collaborative Filtering Recommender using SOM clustering on Keywords,” A Proposal for the degree of Masters of Computer Science, Golisano College of Computing and Information Science, Rochester Institute of Technology, 2006.
[9] C. Cornelis, J. Lu, X. Guo, and G. Zhang, “One-and-Only Item Recommendation with Fuzzy Logic Techniques,” Information Sciences, Vol. 177, 2007, pp. 4906-4921.
[10] G. Adomavicius, R. Sankaranarayanan, S. Sen, and A. Tuzhilin, “Incorporating Contextual Information in Recommender Systems Using a Multidimensional Approach,” ACM Transactions on Information Systems, Vol. 23, No. 1, 2005, pp. 103-145.
[11] J. Bennett, “Independent Study Report: A survey of SOM and Recommender techniques,” A Research Report, Golisano College of Computing and Information Science, Rochester Institute of Technology, 2006.

Page 48: International Journal of Computer Science July 2009

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 3, No. 1 2009

39

[12] W. Yang and H. Cheng and J. Dia, “A location-aware recommender system for mobile shopping environments,” Expert Systems with Applications, vol. 34, 2006, pp. 437–445.

[13] Q. Li, Ch. Wang, G. Geng and R. Dai, “A Novel Collaborative Filtering-Based Framework for Personalized Services in M-commerce,” Proceedings of the 16th international conference on World Wide Web WWW '07 , May 2007, PP.1251+.

[14] Ch. Doulkeridis, N. Loutas and M. Vazirgiannis, “A System Architecture for Context-Aware Service Discovery,” Electronic Notes in Theoretical Computer Science, vol. 146, Jun.2006, pp. 101–116.

[15] A. K. Dey and G. D. Abowd, “Towards a Better Understanding of Context and Context-Awareness”, Handheld and Ubiquitous Computing, Vol. 1707/1999, 1999, pp. 304-307.

[16] O. Riva and S. Toivonen, “The DYNAMOS approach to support context-aware service provisioning in mobile environments,” The Journal of Systems and Software, vol. 80, 2007, pp. 1956–1972.

[17] O. Kwon, J. M. Shin and S. W. Kim, “Context-aware multi-agent approach to pervasive negotiation support systems,” Expert Systems with Applications, vol. 31, 2006, pp. 275–285.

[18] R.A. Yaiz, F.Selgert and F. D. Hartog, “On the definition and relevance of Context-awareness in personal networks,” Third Annual International Conference on Mobile and Ubiquitous Systems: Networking&Services, 2006, PP. 1-6.

[19] S. Figge, “Situation-dependent services-a challenge for mobile network operators,” Journal of Business Research, vol. 57, 2004, pp. 1416-1422, 2004.

[20] M. Ehrig, P. Haase, M. Hefke and N. Stojanovic, ” SIMILARITY FOR ONTOLOGIES - A COMPREHENSIVE FRAMEWORK”, [Online]. Available: http://www.aifb.uni-

[21] J. Vesanto and E.Alhoniemi, “Clustering of the Self-Organizing Map”, IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 11, NO. 3, 2000.

[22] N. K. Kasabov, Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering, The MIT Press, 1996.

[23] J. A. Freeman, D. M. Skapura, Neural Networks Algorithms, Applications, and Programming Techniques, Addison-Wesley Publishing Company, 1991.

[24] J. L. Herlocker, J. A. Konstan, L. G. Terveen, J. T. Riedl, “Evaluating Collaborative Filtering Recommender Systems”, ACM Transactions on Information Systems, Vol. 22, 2004, pp. 5-53.

[25] E. Parle and A. Quigley, ”Proximo, Location-Aware collaborative Recommender”, School of Computer Science and Informatics, University College Dublin Ireland 2006.

[26] L. Buriano, M. Marchetti, F. Carmagnola and F. Cena, “The Role of Ontologies in Context-Aware Recommender Systems”, Proceedings of the 7th International Conference on Mobile Data Management, IEEE, 2006.

[27] A. Loizou and S. Dasmahapatra, “Recommender Systems for the Semantic Web”, ECAI 2006 Recommender Systems Workshop, 2006.


Efficient Web Log Mining using Doubly Linked Tree

Ratnesh Kumar Jain (1), Dr. R. S. Kasana (1)
(1) Department of Computer Science & Applications, Dr. H. S. Gour University, Sagar, MP (India)
[email protected], [email protected]

Dr. Suresh Jain (2)
(2) Department of Computer Engineering, Institute of Engineering & Technology, Devi Ahilya University, Indore, MP (India)
[email protected]

Abstract— The World Wide Web is a huge data repository and is growing at the explosive rate of about 1 million pages a day. As the information available on the World Wide Web grows, the usage of web sites grows with it. The web log records each access of a web page, and the number of entries in the web logs is increasing rapidly. These web logs, when mined properly, can provide useful information for decision-making. The designers of web sites, analysts and management executives are interested in extracting this hidden information from web logs for decision-making. The web access pattern, which is a frequently used sequence of accesses, is one of the important pieces of information that can be mined from web logs. This information can be used to gather business intelligence to improve sales and advertisement, to personalize for a user, to analyze system performance and to improve web site organization. Many techniques exist to mine access patterns from web logs. This paper describes a powerful algorithm that mines web logs efficiently. The proposed algorithm first converts the available web access data into a special doubly linked tree, where each access is called an event. This tree keeps the critical mining-related information in very compressed form, based on the frequent event count. A recursive algorithm then uses this tree to efficiently find all access patterns that satisfy a user-specified criterion. To show that our algorithm is more efficient than the GSP (Generalized Sequential Pattern) algorithm, we have carried out experimental studies on sample data.

Keywords: Web mining; Pattern discovery.

I. INTRODUCTION

The World Wide Web provides its users almost unlimited access to documents on the Internet. Web mining is a specialized field of data mining in which data mining techniques are applied to the huge repository of web data. Web mining can be categorized into web content mining, web structure mining and web usage mining. Web usage mining looks at the log of web accesses. The web server records each access of a web page in web logs, and the number of entries in the web logs increases rapidly as access to the web site increases. These web logs, when mined properly, can provide useful information for decision-making. Most web logs contain information about the fields: IP Address, User Name, Time Stamp, Access Request, Result Status, Bytes Transferred, Referrer URL and User Agent. There are many efforts towards mining various patterns from web logs [4, 9, 11].

Web access patterns mined from web logs can be used for purposes such as improving the design of web sites, gathering business intelligence to improve sales and advertisement, analyzing system performance, and building adaptive web sites [7, 6, 10].

Finding access patterns is the problem of finding association rules in the web logs [2]. The problem of finding association rules falls within the purview of database mining [1, 5], also called knowledge discovery in databases. Mining frequent access patterns in a sequence database (called sequential access pattern mining) was first introduced by Agrawal and Srikant [3] and is based on the AprioriAll algorithm. After its introduction, much work was done to mine sequential patterns efficiently. Srikant and Agrawal in 1996 [8] gave a generalized sequential pattern mining algorithm, GSP, which outperforms their AprioriAll algorithm. In this algorithm the sequence database is scanned many times to mine sequential access patterns. In the first scan, it finds all frequent 1-events and forms a set of 1-event frequent sequences. In the following scans, it generates candidate sequences from the set of frequent sequences and checks their supports. The problem with GSP is that it does not perform well if the lengths of the access sequences and transactions are large, which is the basic need of web log mining.

In this paper we address the problem of handling large access patterns efficiently. Our solution consists of two phases: in the first phase we compress the representation of the access sequences using a doubly linked tree structure, and in the second phase we apply the mining algorithm to efficiently mine all the frequent access sequences. We give a performance study in support of our work, which shows that this mining algorithm is faster than the Apriori-based GSP mining algorithm.

II. PROBLEM STATEMENT

As we have discussed, a web log consists of many types of information, including information about the user and the accesses made by the user. In the preprocessing phase of log mining we can remove the unnecessary data and keep only the required data. If each access is regarded as an event, we can say that after preprocessing a web log is a sequence of events from one user or session in timestamp ascending order. Let us define some terms before finally stating the problem.

Let E be a set of events. Then S = e_1 e_2 ... e_k e_{k+1} ... e_n (e_i ∈ E for 1 ≤ i ≤ n) is an access sequence, and n is the length of the access sequence, which is then called an n-sequence. Note that repetition is allowed in an access sequence, i.e., e_i may equal e_j for i ≠ j. An access sequence S' = e'_1 e'_2 e'_3 ... e'_k is called a subsequence of sequence S, and S is called a super-sequence of S', denoted S' ⊆ S, if and only if there exist 1 ≤ i_1 < i_2 < ... < i_k ≤ n such that e'_j = e_{i_j} for 1 ≤ j ≤ k. S' is called a proper subsequence of S, written S' ⊂ S, if and only if S' is a subsequence of S and S' ≠ S. If the subsequence S_s = e_{k+1} e_{k+2} ... e_n of S is a super-sequence of a sequence P = e_{k+1} e_{k+2} ... e_l, where l ≤ n, then S_p = e_1 e_2 ... e_k is called the prefix of S with respect to sequence P, and S_s is called the suffix of S_p.

Let the web access sequence database WAS be represented as a set {S_1, S_2, ..., S_m}, where each S_i (1 ≤ i ≤ m) is an access sequence. The support of an access sequence S in WAS is defined as Sup(S) = |{S_i | S ⊆ S_i}| / m. A sequence S is called a ξ-pattern, or simply a (web) access pattern, of WAS if Sup(S) ≥ ξ. It is important to remember that events can be repeated in an access sequence or pattern, and that any pattern can get support at most once from one access sequence.

The problem of mining access patterns is: given a web access sequence database WAS and a support threshold ξ, mine the complete set of ξ-patterns of WAS.
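For a small illustration of these definitions (our own example, not from the evaluation dataset): let WAS = {abdac, eaebcac, babfaec} and ξ = 0.75. The sequence ab is a subsequence of all three access sequences, so Sup(ab) = 3/3 = 1 ≥ ξ and ab is a ξ-pattern, whereas bd is a subsequence only of abdac, so Sup(bd) = 1/3 < ξ and bd is not.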

III. EFFICIENT WEB LOG MINING USING DOUBLY LINKED TREE

Most of the previously proposed methods are based on the Apriori heuristic: "if a sequence G is not a ξ-pattern of a sequence database, any super-sequence of G cannot be a ξ-pattern of that sequence database."

Though this property may substantially reduce the size of the candidate sets, because of the combinatorial nature of pattern mining a huge set of candidate patterns may still be generated, especially when the sequential patterns are long. This motivates us to introduce a new technique for web access pattern mining. The central theme of our algorithm is as follows:

Scan the WAS twice. In the first scan, determine the set of frequent events. An event is called a frequent event if and only if it appears in at least ξ · |WAS| access sequences, where |WAS| denotes the number of access sequences in WAS and ξ denotes the support threshold. In the second scan, build a doubly linked tree. After creating the doubly linked tree, we recursively mine it using conditional search to find all ξ-patterns.

The following observations are helpful in the construction of the doubly linked tree.

1. The Apriori property, that if a sequence G is not a ξ-pattern of a sequence database then no super-sequence of G can be a ξ-pattern, is used. That means that if an event e is not in the set of frequent 1-sequences, there is no need to include e in the construction of the doubly linked tree.

2. We create a single branch for a shared prefix P in the tree. This saves space and supports the counting of any subsequence of the prefix P.

The above observations suggest that the doubly linked tree should be defined to contain the following information:

• Each node must contain an event (we call it the label) and its count, except the root node, which has an empty label and count = 0. The count specifies the number of occurrences in the WAS of the corresponding prefix ending with that event.

• To manage linkage and backward traversal we need two additional pointers beyond the pointers a tree normally has. First, all the nodes in the tree with the same label are linked in a queue called the event-node queue; a header table is maintained to hold the front of the queue for each frequent event in the tree. Second, for backward traversal from any intermediate node to the root, we add to each node a pointer to its parent.

The tree construction process is as follows. First, filter out the non-frequent events from each access sequence in WAS and then insert the resulting frequent subsequence into the tree, starting from the root. Considering the first event, denoted e, increment the count of the child node with label e by 1 if such a child exists; otherwise create a child labeled e and set its count to 1. Then recursively insert the rest of the frequent subsequence into the subtree rooted at that child labeled e. The complete algorithm for doubly linked tree creation is given below:

Algorithm 1 (Doubly Linked Tree Construction)

Input: A web access sequence database WAS and a set of all possible events E.

Output: A doubly linked tree T.

Method:

Scan 1:
1. For each access sequence S of the WAS
   1.1. For each event in E
      1.1.1. For each event of the access sequence: if the selected event of the access sequence is equal to the selected event of E, then
         a. event count = event count + 1
         b. continue with the next event in E.
2. For each event in E, if the event qualifies the threshold, add that event to the set of frequent events FE.

Scan 2:
1. Create a root node for T.
2. For each access sequence S in the access sequence database WAS do
   (a) Extract the frequent subsequence S' from S by removing all events appearing in S but not in FE. Let S' = s_1 s_2 ... s_n, where each s_i (1 ≤ i ≤ n) is an event in S'. Let current_node be a pointer that initially points to the root of T.
   (b) For i = 1 to n do: if current_node has a child labeled s_i, increase the count of that child by 1 and make current_node point to it; else create a new child node with label = s_i, count = 1 and parent pointer = current_node, make current_node point to the new node, and insert it into the s_i-queue.
3. Return T.
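The following minimal C++ sketch illustrates the node structure and the insertion step of Scan 2 (our own illustration, not the authors' implementation; it assumes events are represented as single characters and that the frequent subsequence has already been extracted in Scan 1, and names such as Node and insertSequence are hypothetical):

    #include <map>
    #include <string>

    // One node of the doubly linked tree: label, count, parent pointer,
    // children, and a link into the event-node queue for its label.
    struct Node {
        char label = '\0';             // event label ('\0' for the root)
        int count = 0;                 // occurrences of the prefix ending here
        Node* parent = nullptr;        // enables backward traversal to the root
        Node* nextSameLabel = nullptr; // event-node queue link
        std::map<char, Node*> children;
    };

    // Header table: entry point of the event-node queue for each frequent event.
    std::map<char, Node*> headerTable;

    // Scan 2, step 2: insert one frequent subsequence (non-frequent events
    // already filtered out in Scan 1) into the tree rooted at root.
    void insertSequence(Node* root, const std::string& freqSubseq) {
        Node* current = root;
        for (char e : freqSubseq) {
            auto it = current->children.find(e);
            if (it != current->children.end()) {
                it->second->count += 1;   // shared prefix: only the count grows
                current = it->second;
            } else {
                Node* child = new Node;
                child->label = e;
                child->count = 1;
                child->parent = current;
                child->nextSameLabel = headerTable[e]; // link into the e-queue
                headerTable[e] = child;
                current->children[e] = child;
                current = child;
            }
        }
    }

In this sketch the header table holds the most recently inserted node of each label, and the nextSameLabel links form the event-node queue described above.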

After the execution of this algorithm we have the doubly linked tree, which contains all the information in very condensed form; the WAS database is no longer needed to mine the access patterns. The depth of the tree is one plus the maximum length of the frequent subsequences in the database. The width of the tree, that is, the number of distinct leaf nodes (and paths), cannot be more than the number of distinct frequent subsequences in the WAS database. Access sequences with the same prefix share the upper part of a path from the root, and due to this sharing the size of the tree is much smaller than the size of the WAS database.

Maintaining some additional links provides some interesting properties which help in mining frequent access sequences.

1. For any frequent event e_i, all the frequent subsequences containing e_i can be visited by following the e_i-queue, starting from the record for e_i in the header table of the doubly linked tree.

2. For any node labeled e_i in the doubly linked tree, all the nodes on the path from the root of the tree to this node (excluded) form a prefix sequence of e_i. The count of this node labeled e_i is called the count of the prefix sequence.

3. A path from the root may have more than one node labeled e_i. If a prefix sequence G of e_i contains another prefix sequence H of e_i, then G is called a super-prefix sequence and H is called a sub-prefix sequence. The problem is that a super-prefix sequence contributes to the count of its sub-prefix sequences. This problem is resolved using the unsubsumed count. For a prefix sequence of e_i without any super-prefix sequences, the unsubsumed count is simply its count. For a prefix sequence of e_i with some super-prefix sequences, the unsubsumed count is the count of that sequence minus the unsubsumed counts of all its super-prefix sequences.

4. It is very difficult to traverse from the root to a node pointed to by the e_i-queue, because several traversals are needed to obtain the required prefix. The parent pointer allows backward traversal from any intermediate node pointed to by the e_i-queue towards the root, so the prefix sequences can be extracted efficiently.

With the above information we can apply conditional search to mine all web access patterns from the doubly linked tree. Conditional search means that, instead of searching for all web access patterns at once, we search for web access patterns with the same suffix. The suffix is used as the condition to narrow the search space; as the suffix becomes longer, the remaining search space potentially becomes smaller. The algorithm to mine all ξ-patterns is as follows:

Algorithm 2 (Mining all ξ-patterns in a doubly linked tree)

Input: A doubly linked tree T and a support threshold ξ.

Output: The complete set of ξ-patterns.

Method:

1. If the doubly linked tree T has only one branch, return all the unique combinations of nodes in that branch.
2. Initialize the web access pattern set WAP = ∅. Every frequent event in T is itself a web access pattern; insert them into WAP.
3. For each event e_i in T,
   a. Construct the conditional sequence base of e_i, i.e., PS(e_i), by following the e_i-queue, counting the conditional frequent events at the same time.
   b. If the set of conditional frequent events is not empty, build a conditional doubly linked tree for e_i over PS(e_i) using Algorithm 1, and recursively mine the conditional doubly linked tree.
   c. For each web access pattern returned from mining the conditional doubly linked tree, concatenate e_i to it and insert the result into WAP.
4. Return WAP.
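To make step 3a concrete (our own sketch of the procedure, not an example from the paper): for an event c, each node in the c-queue is visited in turn, and from each such node the parent pointers are followed back towards the root; the labels collected on the way (excluding c itself) form one prefix sequence of c, and the set of all such prefix sequences, each carried with its unsubsumed count as defined in property 3, forms the conditional sequence base PS(c).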

IV. PERFORMANCE EVALUATION

To compare the performance of the Doubly Linked Tree mining algorithm and the GSP algorithm, both were implemented in C++ running under Turbo C++. All experiments were performed on a 2.16 GHz notebook computer with 1 GB of RAM, running Microsoft Windows XP Professional Version 2002 Service Pack 2. To illustrate the performance comparisons we used web access logs freely available on the Internet. Since these web logs are in a different format, we did some preprocessing work to convert them into the web access sequence database (WASD) format. The original web logs, recorded on a server from 23 Feb. 2004 to 29 Feb. 2004, are 61 KB in size and contain 36,878 entries, which include 7,851 unique host names, 2,793 unique URLs and 32 different file types (extensions). We filtered the web logs according to our needs.

There are many measures for comparing the efficiency of two algorithms; here we use run time. To compare the performance of Doubly Linked Tree mining and GSP, we did several experiments of two types. In the first type we checked the performance with respect to the support threshold for a fixed size of WASD. In the second type we checked the performance with respect to the size of the WASD for a fixed support threshold.

As the results show, Doubly Linked Tree mining outperforms GSP in both cases. As can be seen in the figures, when the support threshold is low the Doubly Linked Tree mining algorithm took approximately 100 sec. while GSP took 450 sec., and this difference reduces as the support threshold increases. As the size of the database increases, the run time of GSP increases more rapidly than that of Doubly Linked Tree mining.

[Plot (a): run time in sec. (0 to 500) versus support threshold (5% to 25%), for doubly linked tree mining and GSP]

(a) Comparison for varying support threshold

[Plot (b): run time in sec. (0 to 600) versus number of access sequences in WAS (200k to 1000k), for doubly linked tree mining and GSP]

(b) Comparison for varying number of access sequences

Figure 1: Experimental results of the comparative study between Doubly Linked Tree Mining and GSP algorithms.

V. CONCLUSION

As shown in Figures 1(a) and 1(b), the run time required by GSP for any support threshold and for any size of web access sequence database is higher than that of the Doubly Linked Tree mining algorithm. For a low support threshold and a large database, the performance of Doubly Linked Tree mining is much better than GSP, while for a higher support threshold and a small data set, since only a few events qualify as frequent events, there is no significant difference between the two algorithms. The comparison shows that the Doubly Linked Tree mining algorithm is more efficient than GSP, especially for a low support threshold and a large web access sequence database. For mining sequential patterns from web logs, the following aspects may be considered for future work. Algorithms should be developed so that the preprocessing work need not be done manually and the mining algorithms can be applied directly to the web log files. Efficient web usage mining could also benefit from relating the usage of a web page to its content. Another area of interest is implementing the Doubly Linked Tree mining algorithm in a distributed environment.

ACKNOWLEDGMENT

The authors are grateful to the technical reviewers for the comments, which improved the clarity and presentation of the paper. The authors wish to thank Dr. Pankaj Chaturvedi and Mr. Deepak Sahu for all the discussions and contributions during the initial stages of this research.

REFERENCES

[1] R. Agrawal, T. Imielinski, and A. Swami, "Database mining: A performance perspective," IEEE Transactions on Knowledge and Data Engineering, 5(6):914-925, December 1993. Special Issue on Learning and Discovery in Knowledge-Based Databases.
[2] R. Agrawal and R. Srikant, "Fast algorithms for mining association rules," in Proc. 1994 Int. Conf. Very Large Data Bases, pages 487-499, Santiago, Chile, September 1994.
[3] R. Agrawal and R. Srikant, "Mining sequential patterns," in Proc. 1995 Int. Conf. Data Engineering, pages 3-14, Taipei, Taiwan, March 1995.
[4] R. Cooley, B. Mobasher, and J. Srivastava, "Data preparation for mining World Wide Web browsing patterns," Journal of Knowledge & Information Systems, Vol. 1, No. 1, 1999.
[5] M. Holsheimer and A. Siebes, "Data mining: The search for knowledge in databases," Technical Report CS-R9406, CWI, Netherlands, 1994.
[6] B. Mobasher, R. Cooley, and J. Srivastava, "Automatic personalization based on Web usage mining," Communications of the ACM, (43) 8, August 2000.
[7] M. Perkowitz and O. Etzioni, "Adaptive Web sites: Automatically learning from user access patterns," in Proc. 6th Int'l World Wide Web Conf., Santa Clara, California, April 1997.
[8] R. Srikant and R. Agrawal, "Mining quantitative association rules in large relational tables," in Proc. 1996 ACM-SIGMOD Int. Conf. Management of Data, pages 1-12, Montreal, Canada, June 1996.
[9] M. Spiliopoulou and L. Faulstich, "WUM: A tool for Web utilization analysis," in Proc. 6th Int'l Conf. on Extending Database Technology (EDBT'98), Valencia, Spain, March 1998.
[10] L. Tauscher and S. Greenberg, "How people revisit Web pages: Empirical findings and implications for the design of history systems," Int'l Journal of Human Computer Studies, Special Issue on World Wide Web Usability, 47:97-138, 1997.
[11] O. Zaiane, M. Xin, and J. Han, "Discovering Web access patterns and trends by applying OLAP and data mining technology on Web logs," in Proc. Advances in Digital Libraries Conf. (ADL'98), Melbourne, Australia, pages 1244-158, April 1998.

AUTHORS PROFILE

Ratnesh Kumar Jain is currently a research scholar at the Department of Computer Science and Applications, Dr. H. S. Gour Central University (formerly Sagar University), Sagar, MP, India. He completed his bachelor's degree in Science (B.Sc.) with Electronics as a special subject in 1998 and his master's degree in computer applications (M.C.A.) in 2001 from the same University. His fields of study are operating systems, data structures, web mining, and information retrieval. He has published more than 4 research papers and has authored a book.

Suresh Jain completed his bachelor's degree in civil engineering from Maulana Azad National Institute of Technology (MANIT) (formerly Maulana Azad College of Technology), Bhopal, M.P., India in 1986. He completed his master's degree in computer engineering from S.G. Institute of Technology and Science, Indore in 1988, and his doctoral studies (Ph.D. in computer science) at Devi Ahilya University, Indore. He is a professor of Computer Engineering at the Institute of Engineering & Technology (IET), Devi Ahilya University, Indore. He has over 21 years of experience in academics and research. His fields of study are grammatical inference, machine learning, web mining, and information retrieval. He has published more than 25 research papers and has authored a book.

R. S. Kasana completed his bachelor's degree in 1969 from Meerut University, Meerut, UP, India. He completed his master's degree in Science (M.Sc.-Physics) and his master's degree in technology (M.Tech.-Applied Optics) at I.I.T. New Delhi, India. He completed his doctoral studies in Physics at Ujjain University in 1976, and post-doctoral studies at P.T.B. Braunschweig and Berlin, Germany, and at R.D. University, Jabalpur, respectively. He is a senior Professor and Head of the Department of Computer Science and Applications of Dr. H. S. Gour University, Sagar, MP, India. During his tenure he has worked as Vice Chancellor, Dean of the Science Faculty, and Chairman of the Board of Studies. He has more than 34 years of experience in academics and research. Twelve Ph.D.s have been awarded under his supervision and more than 110 research articles/papers have been published.


A New Scheme for Minimizing Malicious Behavior of Mobile Nodes in Mobile Ad Hoc Networks

Syed S. Rizvi and Khaled M. Elleithy
Computer Science and Engineering Department, University of Bridgeport, Bridgeport, CT, USA
{srizvi, elleithy}@bridgeport.edu

Abstract- The performance of mobile ad hoc networks (MANET) depends on the cooperation of all active nodes. However, supporting a MANET is a cost-intensive activity for a mobile node. From the perspective of a single mobile node, the detection of routes as well as the forwarding of packets consume local CPU time, memory, network bandwidth, and, last but not least, energy. We believe that this is one of the main factors that strongly motivate a mobile node to deny packet forwarding for others while at the same time using their services to deliver its own data. This behavior of an independent mobile node is commonly known as misbehaving or selfishness. A vast amount of research has already been done on minimizing the malicious behavior of mobile nodes. However, most of it focused on methods/techniques/algorithms to remove such nodes from the MANET. We believe that the frequent elimination of such misbehaving nodes has never allowed a free and faster growth of MANETs. This paper provides a critical analysis of the recent research work and its impact on the overall performance of a MANET. In this paper, we clarify some of the misconceptions in the understanding of selfishness and misbehavior of nodes. Moreover, we propose a mathematical model, based on a time division technique, that minimizes the malicious behavior of mobile nodes by avoiding the unnecessary elimination of bad nodes. Our proposed approach not only improves resource sharing but also creates a consistent trust and cooperation (CTC) environment among the mobile nodes. The simulation results demonstrate the success of the proposed approach, which significantly minimizes the number of malicious nodes and consequently maximizes the overall throughput of the MANET compared with other well-known schemes.

Keywords- channel capacity, mobile nodes, MANET, throughput analysis.

I. INTRODUCTION

Misbehavior in mobile ad hoc networks occurs for several reasons. Selfish nodes misbehave to save power or to improve their access to service relative to others [1]. Malicious intentions result in misbehavior, as exemplified by denial of service attacks. Faulty nodes simply misbehave accidentally. Regardless of the motivation for misbehavior, its impact on the mobile ad hoc network proves to be detrimental, decreasing the performance and the fairness of the network and, in the extreme case, resulting in a non-functional network [2]. This paper addresses the question of how to keep the network functional for normal nodes when other nodes do not route and forward packets correctly. Specifically, in mobile ad hoc networks, nodes do not rely on any routing infrastructure but relay packets for each other. Thus communication in mobile ad hoc networks functions properly only if the participating nodes cooperate in routing and forwarding. However, it may be advantageous for individual nodes not to cooperate; for example, a selfish node may want to preserve its own resources to save power, memory, network bandwidth, and local CPU time, assuming that other nodes will forward its packets. This selfish or malicious behavior of nodes can significantly degrade the performance of mobile ad hoc networks through denial of service.

Many contributions to prevent misbehavior have been made so far, such as payment schemes for network services, secure routing protocols, intrusion detection, economic incentives and distributed reputation systems to detect and isolate misbehaving nodes. These existing approaches alleviate some of the problems, but not all. In this paper, we focus on the design of a new time-division-based scheme that can avoid the unnecessary elimination of malicious nodes while at the same time maximizing the throughput of the system by increasing resource sharing among the mobile nodes. The existing methods/algorithms not only create a performance bottleneck (e.g., by increasing the network congestion, transmission overhead, etc.) but also diminish the self-growing characteristic of a peer-to-peer network. Methods such as CONFIDANT [3] and CORE [4] force the participating nodes to adopt the same behavior as the selfish nodes that have already been removed from the network due to the lack of resources. We believe that we should not propose any algorithm/method that becomes a reason for reducing the network resources and consequently forces the existing participating nodes to behave exactly like the removed nodes. Instead, we strongly believe that we should come up with something that not only improves the resources and resource sharing but also creates a consistent trust and cooperation (CTC) environment among the mobile nodes.

The rest of this paper is organized as follows: Section II describes the research that has already been done in this area. The proposed analytical and mathematical models for CTC are presented in Section III. The simulation results are provided in Section IV. Finally, Section V concludes the paper.

II. RELATED WORK

The terms reputation and trust are used for various concepts in the literature, sometimes synonymously [5, 6]. We define the term reputation here to mean the performance of a principal in participating in the base protocol as seen by others. By the term trust we denote the performance of a principal in the policing protocol that aims at protecting the base protocol.

The key components of a reputation system are the watchdog and the pathrater, which were proposed by Marti, Giuli, Lai and Baker [7]. They observed increased throughput in mobile ad hoc networks by complementing DSR with a watchdog for the detection of denied packet forwarding and a pathrater for trust management and routing policy, rating every path used, which enables nodes to avoid malicious nodes in their routes as a reaction. Their approach does not punish malicious nodes that do not cooperate, but rather relieves them of the burden of forwarding for others, while their own messages are forwarded without complaint. This way, the malicious nodes are rewarded and reinforced in their behavior. They used a watchdog that identifies misbehaving nodes and a pathrater that helps routing protocols avoid these nodes. When used together in a network with moderate mobility, the two techniques increase throughput by 17% in the presence of 40% misbehaving nodes, while increasing the percentage of overhead transmissions from the standard routing protocol's 9% to 17%. During extreme mobility, watchdog and pathrater can increase network throughput by 27%, while increasing the overhead transmissions from the standard routing protocol's 12% to 24%.

CORE, a collaborative reputation mechanism proposed by Michiardi and Molva [4], also has a watchdog component; however, it is complemented by a reputation mechanism that differentiates between subjective reputation (observations), indirect reputation (positive reports by others), and functional reputation (task-specific behavior), which are weighted into a combined reputation value that is used to make decisions about cooperation with, or gradual isolation of, a node. Reputation values are obtained by regarding nodes as requesters and providers, and comparing the expected result to the actually obtained result of a request. Nodes only exchange positive reputation information.

A reputation-based trust management scheme has been introduced by Aberer and Despotovic in the context of peer-to-peer systems [8], using the data provided by a decentralized storage method (P-Grid) as the basis for a data-mining analysis to assess the probability that an agent will cheat in the future, given the information about past transactions.

A context-aware inference mechanism has been proposed by Paul and Westhoff [9], where accusations are related to the context of a unique route discovery process and a stipulated time period. A combination is used that consists of un-keyed hash verification of routing messages and the detection of misbehavior by comparing a cached routing packet to overheard packets.

The EigenTrust mechanism was proposed by Kamvar, Schlosser and Garcia-Molina [10]; it aggregates trust information from peers by having them perform a distributed trust calculation approximating the principal eigenvector of the trust matrix over the peers. The algorithm relies on the presence of pre-trusted peers, that is, some peers have to be trusted prior to having interacted with them. By isolating peers with a bad reputation, the number of inauthentic downloads is decreased; however, if the motivation for misbehavior is selfishness, the misbehaving peers are rewarded. If a download is not successful, the peer is removed from the list of potential downloads. A potential drawback of this approach is that it provides an incentive to change one's identity after having misbehaved.

A formal model for trust in dynamic networks based on a policy language has been proposed by Carbone, Nielsen, and Sassone [11]. They express both trust and the uncertainty of it as a trust ordering and an information ordering, respectively. In their model, only positive information influences trust, such that the information ordering and the trust ordering can differ. In our system, however, both positive and negative information influence the trust and the certainty, since we prefer p positive observations out of n total observations to p out of N when n < N.

III. PROPOSED ANALYTICAL AND MATHEMATICAL MODELS FOR CREATING CONSISTENT TRUST AND COOPERATION

This section first presents an analytical model that gives our hypothesis for mitigating the problem of misbehavior among mobile nodes. Secondly, we use the proposed analytical model to create a corresponding mathematical model; the creation of the mathematical model can be viewed as a formalization of the proposed hypothesis. Based on the proposed mathematical model, we perform the numerical and simulation analysis for a variety of scenarios in two parts. First, we use the mathematical model to run different scenarios in order to determine the performance of ad hoc networks by analyzing critical network parameters such as throughput, transmission overhead and utilization. Secondly, we use the same set of parameters as a performance measure.

A. The Proposed Analytical Model

We model the ad hoc network in much the same way as other researchers do, except that this paper introduces the new concept of time division. The idea of time division can be envisioned by considering a particular node of a network that has the potential to misbehave in the absence of the sufficient resources required to forward the packets of neighboring nodes. This implies that if one can ensure that the network has enough resources that can be shared equally among the network nodes, then one can assume that the possibility of node misbehavior degrades significantly. This reduction in node misbehavior can be achieved through a time division technique that divides the time asymmetrically into two parts: the transmission time required for the node's own packets and the transmission time required for neighbors' packets. The asymmetric division enables a node to effectively adjust the time required to transmit its own packets and/or its neighbors' packets. The reason for using an asymmetric division of the available time is to allow a node to effectively utilize the time by dividing it with respect to its current status (i.e., the available resources) and consequently to utilize the bandwidth in an efficient manner. Efficient utilization of the bandwidth satisfies the requirement of fairness, the lack of which is one of the key factors that forces a node to be unfair to its neighbors. This indirectly means that we reduce the chances of misbehavior, since the node now has total authority over the available resources. It should also be noted that we adopt an asymmetric approach to the time division method in this research, as opposed to the conventional division of time (i.e., the symmetric or equal division employed by many other techniques).

This clearly allows a node to optimize network parameters such as throughput, transmission overhead and utilization by effectively using the total time with respect to the current situation of the network. In other words, the proposed hypothesis can be considered a dynamic mechanism that allows all nodes to perform performance optimization at run time by intelligently using the available time, which is one of the key elements of any system. In either case, the proposed hypothesis moves the control from the resources to the nodes.

B. The Proposed Mathematical Model

Before developing the actual mathematical model based on the above analytical model, it is worth mentioning some of our key assumptions. These assumptions help in understanding the complex relationships among a large number of parameters. For the proposed mathematical model, we assume that the system has K nodes, where each individual node k not only works as a normal mobile station but also works as a packet forwarding device for the other nodes. In addition, we assume that any kind of topology can be implemented among the mobile nodes to construct the ad hoc network. This assumption allows us to implement different scenarios (such as a node having any number of input and output lines) on each node of the network to show the consistency of the proposed analytical and mathematical models. For simplicity, we perform the numerical analysis for a single node k; this can be further extended to the whole network by computing the collective behavior of the ad hoc network. This approach allows a reader to grasp the idea of the proposed method from a very simple equation up to highly complex derivations. The system parameters, along with their definitions, are listed in Table I.

The primary principle of an ad hoc network is that it allows each node of the network to participate fully in the construction of the network. The phrase "full participation" refers to the fact that a node not only transmits its own packets to the other neighboring nodes but also provides its services to other nodes as a forwarding device. For the proposed method, we assume that a node can decide to transmit its own packets with a certain probability, while at the same time it can also deny the transmission of the neighbors' packets with a different probability. In simple words, we can develop a relationship between these two probabilities as follows: a node can transmit its self-generated packet(s) with a probability of p, whereas it can transmit its neighbors' packet(s) with a probability of q.

TABLE I. SYSTEM PARAMETER DEFINITIONS

Parameter   Definition
t_i         total time used by a node
t_pp        time that a node spends on personal packets
t_np        time that a node spends on neighbor packets
T_put       throughput of the node
D_R         data rate on a route
K           number of packets
N_p         node power
N_pp        node power used on personal packets
N_np        node power used on neighbor packets
K_pout      personal packets that go out from a node
K_nout      neighbor packets that go out from a node
K_nin       neighbor packets that come into a node
U_t         total utilization
U_Rn        utilization on route n

Suppose p is the probability with which a node forwards its personal packets, whereas p(1 − p) is the probability with which a node transmits packets received from one or more neighbors. In addition, we assume that k is the total number of packets that can be transmitted by a certain node of the ad hoc network; this total includes both the self-generated packets and the packets received from one or more nodes. Taking this into account, we can say that if the probability of transmission of a single packet is (1 − p)^x, where x represents a single packet, then the probability of transmitting k packets would be (1 − p)^k, where k represents the total number of packets that a node can transmit. This leads us to the following mathematical expression:

(1 − p)^k                                      (1)

Expression (1) can be formalized for the forwarding of k packets as follows:

(1 − p)^k p                                    (2)
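As a quick numerical illustration (our own hypothetical values): with p = 0.6 and k = 2, expression (1) gives (1 − 0.6)^2 = 0.16, and expression (2) gives (1 − 0.6)^2 × 0.6 = 0.096.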

As mentioned earlier, the proposed method depends exclusively on the time division methodology, where a node divides its time asymmetrically between the time it needs to transmit self-generated packets and the time it takes to transmit the packets arriving from one or more nodes. To make our proposed approach more realistic, we assume that if a packet residing at a certain node is not delivered to its intended destination within the specified time, then that packet must be discarded by the node. The loss of the packet at the node level forces us to retransmit the packet. This assumption is essential to make our derivations close to what actually happens in the real world; it also helps us demonstrate the effectiveness of the proposed algorithm in the presence of packet retransmission.

For ease of understanding, we denote the time a node takes to transmit self-generated packets as t_pp, whereas the time it takes to forward the packets received from one or more neighbors is denoted t_np. The total available time per node is simply the sum of these two times. This relationship can be expressed mathematically as:

t_i = t_pp + t_np                              (3)

where i represents the index of the node and ranges from 1 to K (K represents the total number of nodes present in the ad hoc network).

As mentioned in the introduction, there are critical parameters such as throughput, transmission overhead and utilization that one should consider when the intention is to perform a true evaluation of a network. Based on this, we are now in a position to give our hypothesis about one of the key parameters, the system throughput. The maximum throughput is defined as the asymptotic throughput when the load (the amount of incoming data) is very large. In a packet-switched network where the load and the throughput are equal, the maximum throughput may be defined as the load in bits per second. This in turn leads us to the fact that the maximum throughput cannot be defined in the presence of packet drops at the node level. As mentioned earlier, to make our model more realistic we consider the possibility of packet drops and consequently of packet retransmission at the node level. In addition, we believe that the maximum throughput can only be defined when the delivery time (that is, the latency) asymptotically reaches infinity. The second argument does not hold for the proposed algorithm, since we have a finite time available per node, which indicates the presence of finite bandwidth. Both are realistic assumptions made to support the authenticity of the proposed time division technique. These two arguments force us to derive a new formula that behaves according to the proposed time division technique. The throughput of a certain node of the ad hoc network under the proposed algorithm can be computed as follows:

T_put = (total packets forwarded) / (total time)                    (4)

The denominator of (4) is derived from (3), whereas the numerator is determined using (1) and (2). One can see that increasing the quantity in (2) alone causes a decrease in the left-hand side of (4), while increasing the sum of (1) and (2) significantly increases the left-hand side of (4). To put these relationships simply, an increase in the sum of (1) and (2) causes an increase in the throughput, whereas an increase in the total time determined by (3) causes a decrease in the throughput per node. This is because the more we increase the time, the more bandwidth we need to reserve to satisfy the transmission requirements.

A significant increase in the bandwidth utilization (beyond the scope of the available bandwidth per node) represents a degradation in throughput, which indicates an increase in the possibility of node misbehavior. This implies that the proposed algorithm not only improves performance but also provides a chance to choose the optimal values of critical parameters (such as time) to achieve comparatively better performance than other well-known ad hoc network routing algorithms. Equation (4) can be further written in the following form:

T_put = (node's packets + neighbors' packets) / (total time)        (5)

To formalize the above discussion, we can combine the transmission probabilities from (1) and (2) with the total available time per node from (3) in (5). This expresses the node throughput not only by means of the total available time but also by means of the total number of packets a node can transmit. The result can be expressed in the following equation:

T_put = ((1 − p)^k + (1 − p)^k p) / t_i                             (6)

It should be noted that (6) gives the node throughput considering the time t_i spent on a single packet (that is, the time spent on one packet is the sum of the time spent on self-generated packets and on neighbor packets). Solving (5) for k packets in terms of the total time required by a node gives the following equation:

t_i = Σ_{k=1}^{K} t_pp,k + Σ_{k=1}^{K} t_np,k,   1 ≤ k < ∞          (7)

where k in (7) represents the number of packets, which is bounded between 1 and infinity. The first and the second quantities on the right-hand side of (7) indicate the time required to transmit the self-generated packets and the time required to transmit the neighbor packets, respectively. In addition, it is interesting to compute the time that a node spends transmitting its self-generated packets and compare it with the time required to transmit the neighbor packets. To do this, one may first define a generic time that can then be used in computing the specific times. The generic time equation can simply be stated as:

t = (number of packets) / (data rate)                               (8)

Using (8), one can now compute the two major components of the proposed time division algorithm, which is essential in order to understand the concept of asymmetric division. One of the two asymmetric time division quantities can be quantified as follows:

t_np = k p(1 − p) / D_R                                             (9)

where D_R in (9) represents the data rate.
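For instance (hypothetical values of our own), with k = 10 packets, p = 0.6 and D_R = 1000 packets per second, (9) gives t_np = 10 × 0.6 × (1 − 0.6) / 1000 = 0.0024 s.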

Recall one of our fundamental assumptions: a node transmits k packets in the total time t_i. This assumption allows us to set up lower and upper bounds on the number of packets that a node can transmit; therefore the limit for k lies somewhere between zero and infinity. One of the main reasons for recalling this assumption is to obtain a more generalized form of (9), that is, to derive the same expression for the k packets that a node can transmit. In addition to this assumption, let t_np be the time taken by a node to forward packets received from one or more neighbors. Taking these two factors into account, one can generalize (9) as follows:

t_np = Σ_{k=1}^{K} p(1 − p)^k / D_R,   where 1 ≤ K ≤ ∞              (10)

The numerator of (10) is simply a summation over the total packets forwarded by a node with respect to the probabilities set up at static time. Similarly, the denominator is the data rate at which the bits of each packet arrive at the destination (note that the destination in this case is the targeted node). One of the main advantages of this generalization is the analysis of the proper behavior of a node in the presence of a malicious node.

In a similar manner, one can derive the corresponding generalized form of an equation for a node's personal packets. If t_pp is the total time taken by a node to forward its own k packets, then the equation for t_pp can be written as:

t_pp = Σ_{k=1}^{K} (1 − p)^k / D_R,   where 1 ≤ K ≤ ∞               (11)

Equation (11) is the summation of the probabilities for one packet up to k packets per node in the presence of a certain data rate; as in (10), the numerator sums the packets forwarded by the node with respect to the probabilities set up at static time, and the denominator is the data rate at which the bits of each packet arrive at the destination. The same proposed equations will be used to conduct the analysis of the proposed mathematical model.

In order to extend our proposed mathematical model, one needs to derive an expression for the throughput per node. Following the same bottom-up mathematical technique, we proceed from one node to n nodes. We begin the derivation of throughput by recalling one of our fundamental equations for the total time taken by a single node. By substituting the value of the total time t_i from (3) into (6), we get:

T_put = ((1 − p)^k + (1 − p)^k p) / (t_pp + t_np)                   (12)

In order to generalize (12), we substitute the values of t_pp and t_np from (11) and (10), respectively, into (12), and get:

T_put = D_R ((1 − p)^k + (1 − p)^k p) / Σ_{k=1}^{K} ((1 − p)^k + (1 − p)^k p)       (13)

The two quantities in the denominator of (13) represent the summation of the time a node takes to transmit its personal packets and the neighbors' packets, whereas the numerator is the sum of the probabilities set up for both the personal packets and the neighbor packets. It should be noted that (13) is generalized in the sense that it accommodates the k packets that a node can handle at a certain point in time. To make it simple, we can rewrite the equation as follows:

T_put(node) = Σ_{k=1}^{K} (D_R (1 − p)^k + D_R (1 − p)^k p) / ((1 − p)^k + (1 − p)^k p)       (14)

Equation (14) is the total throughput of a node for the k packets that it can transmit (i.e., both the personal packets and the neighbors' packets). For a small numerical check using the mathematical expressions derived above, let k = 1. This leads to the result T_put = D_R. This result can be interpreted under different conditions and/or assumptions; for instance, one may assume that a node becomes selfish and consequently does not forward the packets received from one of its neighboring nodes.
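To make this check concrete, the following minimal C++ sketch (our own illustration, with hypothetical parameter values; it reads the numerator of (12) as summed over k in the same way as the denominator, which is one reading of (13)) evaluates t_pp and t_np from (11) and (10) and the node throughput from (12); for K = 1 it reproduces the result T_put = D_R stated above:

    #include <cmath>
    #include <cstdio>

    int main() {
        const double p = 0.6;             // forwarding probability (hypothetical)
        const int K = 1;                  // number of packets; K = 1 gives T_put = D_R
        const double dataRate = 250000.0; // D_R (hypothetical units)

        double tpp = 0.0, tnp = 0.0, probSum = 0.0;
        for (int k = 1; k <= K; ++k) {
            double self  = std::pow(1.0 - p, k); // (1 - p)^k, expression (1)
            double neigh = self * p;             // (1 - p)^k p, expression (2)
            tpp += self / dataRate;              // term of eq. (11)
            tnp += neigh / dataRate;             // term of eq. (10)
            probSum += self + neigh;             // probability terms of the numerator
        }

        double ti = tpp + tnp;       // eq. (3)
        double tput = probSum / ti;  // eq. (12)

        std::printf("t_pp = %g, t_np = %g, T_put = %g\n", tpp, tnp, tput);
        return 0;
    }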

Based on the above analysis, one can conclude that the throughput of the system depends mainly on the factors or parameters included in the different equations of the proposed mathematical model; increasing or decreasing the values of these parameters results in different performance from node to node. However, it is more interesting to account for those parameters that are not directly related to the internal components of a node. One of the best ways to consider these parameters is to compute the utilization per node and extend the derived mathematical expression to n nodes. It is expected that the utilization of a node remains stable as long as the node utilizes the available route efficiently; however, the utilization per node may degrade due to the improper use of the available channels. This clearly shows that we need to consider the node utilization per channel and extend that expression for generalization. To make this practical, let us assume that N_p is the power of the node and K is the number of packets that a node can transmit. Taking these assumptions into account, one can derive a generic expression for utilization as follows:

U = N_pout / N_pin                                                  (15)

We call (15) a generic mathematical expression for utilization, since both the numerator and the denominator are unknown and need to be determined to find a more specific expression. Before utilizing a bottom-up methodology, it is worth mentioning that the node power is distributed non-uniformly among the packets in almost the same way as we distribute the time. Therefore, this new concept of power division leads us to the following mathematical expression for the node utilization with respect to the node's personal packets:

$$N_{ppout} = \sum_{K=1}^{\infty} \frac{K_{pout}}{t_{pp}} \qquad (16)$$

It should be noted that (16) is a more specific form of (15), since it accounts only for the personal packets. In addition, it can be considered a generalized form, since it includes a large number of packets whose value may vary from one to infinity. To make the model symmetric, one can derive the same kind of expression to compute the per-node utilization related to the packets received by the targeted node from one of its neighboring nodes. Thus the opposite hypothesis leads us to the following mathematical expression for the node utilization with respect to the neighbor's packets:

$$N_{pnout} = \sum_{1 \le K < \infty} \frac{K_{nout}}{t_{np}} \qquad (17)$$

As a counterpart to (17), there is an equivalent quantity for the node inputs, which can easily be computed as follows:

$$N_{pnin} = \sum_{K=1}^{\infty} \frac{K_{nin}}{t_{np}} \qquad (18)$$

It should be noted that (18) can be used to compute the output of a node in terms of its inputs. In other words, $N_{pout}$ is the sum of the work on outgoing personal and neighbor packets, which leads us to the simple mathematical relationship:

$$N_{pout} = N_{ppout} + N_{pnout} \qquad (19)$$

In order to show that (19) is a valid mathematical relationship between the input and output lines of a node, one needs another relationship, as follows:

$$N_{pin} = N_{pnin} \qquad (20)$$

It should now be clear that one of the reasons for deriving the above two relationships is to obtain a more general expression from (16) and (17). Therefore, by substituting (16) and (17) into (19), we get the following equation:


$$N_{pout} = \sum_{1 \le k < \infty} \left( \frac{K_{pout}}{t_{pp}} + \frac{K_{nout}}{t_{np}} \right) \qquad (21)$$

Similarly, using (20), we can derive the counterpart expression to (21) as follows:

$$N_{pin} = \sum_{K=1}^{\infty} \frac{K_{nin}}{t_{np}} \qquad (22)$$

The last two equations (i.e., (21) and (22)) can now be used

to derive the final expression for utilization as follows:

$$U = \sum_{k=1}^{\infty} \frac{\dfrac{K_{pout}}{t_{pp}} + \dfrac{K_{nout}}{t_{np}}}{\dfrac{K_{nin}}{t_{np}}} \qquad (23)$$

All lines that are used for transmitting data or packets are also used for receiving data or packets from neighbor nodes. This implies that the utilization per channel or line can be computed using (23). If we denote this line utilization by $U_R$, as in (24), we can generalize (23):

$$U_{R} = \sum_{1 \le k < \infty} \frac{(K_{pout} + K_{nout})\, t_{np}}{K_{nin}} \qquad (24)$$

If we assume that n routes are attached to the targeted node, then the utilization of the targeted node over all routes can simply be computed by summing the utilization of each node per channel. In other words, (24) needs to be evaluated and summed over the n routes that are connected to a certain node. This leads us to the following equation:

$$U_{t} = \sum_{n=1}^{\infty} U_{R_{n}}, \quad \text{where } 1 \le n < \infty \qquad (25)$$

This can also be interpreted as follows:

$$U_{t} = U_{R_1} + U_{R_2} + U_{R_3} + \cdots + U_{R_n} \qquad (26)$$

Therefore, the total utilization of the system can be derived from (23) and (25) as follows:

$$U_{t} = \sum_{1 \le n < \infty} \sum_{1 \le k < \infty} \frac{K_{pout}\, t_{pp} + K_{nout}\, t_{np}}{K_{nin}\, t_{np}} \qquad (27)$$

Some simplification of (27) results in the following equation:

$$U_{t} = \sum_{1 \le n < \infty} \sum_{1 \le k < \infty} \frac{1}{K_{nin}} \left( K_{pout}\, \frac{t_{pp}}{t_{np}} + K_{nout} \right) \qquad (28)$$

The above equation can be used to compute the total utilization of a certain node for all packets that it can forward and/or receive from one of its neighbors through all possible channels.
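As a sanity check of (28), the following minimal Python sketch evaluates the double sum; the per-packet counts K_pout, K_nout, K_nin, the times t_pp and t_np, and the route structure are all illustrative assumptions, not values taken from the model.

    def total_utilization(routes, t_pp, t_np):
        # routes: list over n of per-route lists over k of
        # (K_pout, K_nout, K_nin) tuples, as in (28).
        U_t = 0.0
        for route in routes:                      # outer sum over routes n
            for K_pout, K_nout, K_nin in route:   # inner sum over packets k
                U_t += (K_pout * (t_pp / t_np) + K_nout) / K_nin
        return U_t

    # Two routes with illustrative packet counts.
    routes = [[(4, 6, 5), (2, 3, 4)], [(1, 2, 2)]]
    print(total_utilization(routes, t_pp=0.4, t_np=0.5))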

IV. EXPERIMENTAL VERIFICATIONS AND PERFORMANCE ANALYSIS OF CTC SCHEME

We have shown that the system throughput can be measured in terms of the packets that a neighboring node generates as well as the self-generated packets. When it comes to performance, it is standard in a wireless ad hoc network to determine the performance of the network in terms of node misbehavior by looking at the ratio between the packet drops per node and the total number of packets per node. In other words, in order to quantify node misbehavior, which may appear to be an abstract concept, one needs to compute how many packets the node drops that it should have forwarded to the intended destination. To keep the proposed methodology up to this standard, we derive the formula for computing the packet drops per node using (5).

As mentioned earlier, we determine the behavior of a malicious node in terms of the number of packets that it should have transmitted to the intended destination. Taking this into account, one can say that the effective throughput of a node depends entirely on how efficiently the node forwards the neighbor packets and thus creates a consistent trust environment among the nodes. This argument, therefore, allows us to make a minor change to (5):

$$PacketDrop = \frac{NodesPacket + NeighborPacket}{TotalTime}$$

For clarity, we make some implicit assumptions that remain the same for all the investigated algorithms presented in our simulation results. These include an initial small probability of fixed packet drop that remains the same for all algorithms. The reason for assuming an initial value of packet drop is that we do not know how the nodes misbehave when they first boot up. Instead of considering this value to be zero, it is more useful to smooth out the effects that result from the malicious behavior of nodes. This significantly clarifies the performance difference between the proposed technique and the other well-known techniques.

A. Case I

Before discussing the simulation results, it is worth mentioning some of the key assumptions that we made for the sake of experimental verification of the proposed CTC algorithm. Some of them are as follows. For Case I we assume that the number of self-generated packets per node is constant (i.e., a node generates a fixed number of packets for a specified amount of time, and this number remains the same for both the CTC and DSR algorithms). We assume that one of the neighboring nodes of the target node sends packets at a certain rate that increases linearly over the total simulation time. This assumption helps in understanding the true performance of the proposed CTC algorithm.

Fig. 1 shows the simulation results for packet drops per node with respect to the number of packets generated by one of the neighboring nodes. It should be noted that as we increase the self-generated packets, the number of packet drops per node increases. In addition, it can be seen in Fig. 1 that for a small value of neighbor packet generation, typically 500, CTC and DSR overlap each other. However, a slight increase in the neighbor packet generation causes a performance difference between the two approaches. In other words, an increase in neighbor packet generation forces DSR to perform poorly compared to the proposed CTC algorithm. Thus the node misbehavior increases for the DSR algorithm, whereas the proposed CTC algorithm gives consistent behavior. Fig. 1 suggests that for large values of neighbor packet generation (typically from 800 to 1600), the proposed CTC algorithm successfully maintains consistent node misbehavior (typically, the node misbehavior for the proposed CTC algorithm stays between 20 and 25 percent), whereas the node misbehavior increases linearly in the case of the DSR algorithm. Based on these results (Fig. 1), one can say that the proposed CTC algorithm outperforms the DSR algorithm for large neighbor packet generation.

B. Case II

Case II of our simulation differs from Case I in that both inputs of the node-forwarding system now become linear functions of the node time. In other words, for Case II we increase not only the neighbor-generated packets but also the self-generated packets. The simulation result for this case satisfies the proposed mathematical model discussed in Section III, in that the overall packet-drop performance of both investigated algorithms decreases. In other words, it can be seen that the packet drop grows more rapidly in Fig. 2 with respect to the neighbor-generated packets.

In harmony with our expectations, as the number of neighbor-generated packets increased, the packet-drop performance of the proposed algorithm degraded. However, the performance degradation of the proposed algorithm was small compared to that of the DSR algorithm. The packet-drop performance of the CTC algorithm below 40 neighbor-generated packets is almost similar to that of the DSR algorithm, as shown in the figure for Case II. However, the packet-drop improvement of the proposed CTC algorithm over the DSR algorithm increases with the number of neighbor-generated packets.

C. Case III

The parameter assumptions for Case III differ from the previous cases in that one input of the node-forwarding system (the neighbor-generated packets) now becomes a linearly increasing function of the node total time, whereas the other input (the self-generated packets) becomes a linearly decreasing function of the node total time. In other words, for Case III we are interested in the packet-drop performance of the investigated algorithms (the proposed CTC algorithm as well as the DSR algorithm) in the presence of both increasing and decreasing functions. The output of this simulation was exactly what we expected based on our proposed mathematical model: the packet-drop values for both CTC and DSR decrease compared with the two cases discussed above. This is due to the fact that we consider the number of self-generated packets a linearly decreasing function of the node total time while, at the same time, using the neighbor-generated packets as an increasing function, as shown in Fig. 3.

Figure 1. Neighbor packet generation vs. packet drop

Figure 2. Neighbor packet generation vs. packet drop

D. Case IV

Case IV is yet another verification of the proposed mathematical model. For this case, we assume that the neighbor-generated packets are a constant function of time (i.e., we use a constant value for this system parameter throughout the simulation of Case IV). On the other hand, we consider the self-generated packets a linearly increasing function of the total node time. It should be noted that the term linear increase or decrease implies a constant, uniform change in the system parameter with respect to time. This case can also be considered a reciprocal of Case I from the point of view of its fundamental assumptions. Thus we should also expect a reciprocal output from this simulation (that is, its packet-drop performance should behave exactly opposite to what we observed in Fig. 1). In harmony with our expectations, the packet drop remains constant for all values of neighbor packets, as shown in Fig. 4.
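Since Cases I through IV differ only in how the two inputs of the node-forwarding system vary over time, they can be summarized in a small Python sketch; the slopes and offsets below are illustrative assumptions, not the parameters used in our simulations.

    def case_inputs(case, t):
        # Return (self_generated, neighbor_generated) packet counts at time t.
        if case == 1:   # Case I: self constant, neighbor increases linearly
            return 100, 10 * t
        if case == 2:   # Case II: both increase linearly
            return 5 * t, 10 * t
        if case == 3:   # Case III: self decreases, neighbor increases
            return max(0, 100 - 5 * t), 10 * t
        if case == 4:   # Case IV: self increases, neighbor constant
            return 5 * t, 100
        raise ValueError("case must be 1..4")

    for t in (0, 10, 20):
        print(t, [case_inputs(c, t) for c in (1, 2, 3, 4)])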

E. Case V

This case describes the effect of the last four cases in terms of node malicious behavior, as shown in Figs. 5 and 6. We used the statistics to derive a relationship between the node malicious behavior and the packet-drop ratio. As one can see in both Fig. 5 and Fig. 6, the number of malicious nodes becomes an increasing function of the packet-drop performance, which also validates the structure of our proposed mathematical model. However, the performance differences between the two investigated algorithms from the malicious-node perspective are quite subtle. That is, fewer nodes misbehave in the case of the proposed CTC algorithm than with the DSR algorithm. For a small number of packet drops, typically 10, the two algorithms overlap; however, as the number of packet drops increases, the proposed algorithm gives much better performance than the other algorithm.

Figure 3. Neighbor packet generation vs. packet drop

Figure 4. Neighbor packet generation vs. packet drop

Figure 5. Malicious Nodes (%) vs. packet drop

Figure 6. Malicious Nodes (%) vs. packet drop

V. CONCLUSION

This paper presented a critical analysis of the recent research work and its impact on the overall performance of a mobile ad hoc network. We provided a discussion of some of the common misconceptions in the understanding of selfishness and misbehavior of nodes. Moreover, this paper proposed both an analytical and a mathematical model that can be used to effectively reduce the number of malicious nodes and packet drops. Our simulation results demonstrated that the proposed mathematical model not only points out the weaknesses of the recent research work but also approximates the optimal values of the critical parameters (such as the throughput, transmission overhead, channel capacity, and utilization) that have great impact on the overall performance of a mobile ad hoc network. The simulation results presented in this paper show how the performance of mobile ad hoc networks degrades significantly when node eliminations are frequent. The simulation results of this paper are based entirely on the proposed mathematical model, for both lightly and heavily loaded networks. These results address many critical system parameters such as packet drop and packet loss versus malicious nodes, neighbor packet generation and drop ratio, and throughput per node and per system. Our simulation study also includes a comparison with the most recent and well-regarded research work, such as CONFIDANT and CORE. This comparative study provides support for the correctness of our proposed methodology.

REFERENCES

[1] Y. Zhang and W. Lee, “Intrusion Detection in wireless Ad-hoc networks,” in Proceedings of MOBICOM 2000, pp. 275–283, 2000.
[2] Y. Huang, W. Fan, W. Lee, and P. Yu, “Cross-Feature analysis for detecting Ad-hoc routing Anomalies,” in Proceedings of the 23rd International Conference on Distributed Computing Systems (ICDCS 2003), Providence, RI, pp. 478–487, May 2003.
[3] S. Marti, T. Giuli, K. Lai, and M. Baker, “Mitigating routing Misbehavior in mobile Ad hoc networks,” in Proceedings of MOBICOM 2000, pp. 255–265, 2000.
[4] P. Michiardi and R. Molva, “CORE: A Collaborative Reputation Mechanism to enforce node Cooperation in mobile Ad hoc networks,” Sixth IFIP Conference on Security Communications, and Multimedia (CMS 2002), Portoroz, Slovenia, 2002.
[5] T. Moreton and A. Twigg, “Enforcing collaboration in P2P routing services,” 2003.
[6] S. Bansal and M. Baker, “Observation-based Cooperation Enforcement in Ad hoc networks,” Technical Report, 2003.
[7] S. Marti, T. Giuli, K. Lai, and M. Baker, “Mitigating routing Misbehavior in mobile Ad hoc networks,” in Proceedings of MOBICOM 2000, pp. 255–265, 2000.
[8] K. Aberer and Z. Despotovic, “Managing trust in a P2P information system,” in Proceedings of the Ninth International Conference on Information and Knowledge Management (CIKM 2001), 2001.
[9] K. Paul and D. Westhoff, “Context aware Inferencing to rate a selfish node in DSR based Ad-hoc networks,” in Proceedings of the IEEE Globecom Conference, Taiwan, 2002.
[10] S. Kamvar, M. Schlosser, and H. Garcia, “The EigenTrust algorithm for Reputation Management in P2P networks,” in Proceedings of the Twelfth International World Wide Web Conference, May 2003.
[11] M. Carbone, M. Nielsen, and V. Sassone, “A formal model for trust in dynamic networks,” BRICS Report RS-03-4, 2003.

Syed S. Rizvi is a Ph.D. student of Computer Science and Engineering at the University of Bridgeport. He received a B.S. in Computer Engineering from Sir Syed University of Engineering and Technology and an M.S. in Computer Engineering from Old Dominion University in 2001 and 2005, respectively. In the past, he has done research on bioinformatics projects, where he investigated the use of Linux-based cluster search engines for finding desired proteins in input and output sequences from multiple databases. For the last three years, his research has focused primarily on the modeling and simulation of a wide range of parallel/distributed systems and on web-based training applications. Syed Rizvi is the author of 68 scholarly publications in various areas. His current research focuses on the design, implementation, and comparison of algorithms in the areas of multiuser communications, multipath signal detection, multi-access interference estimation, computational complexity and combinatorial optimization of multiuser receivers, peer-to-peer networking, network security, and reconfigurable coprocessor and FPGA based architectures.

Khaled Elleithy received the B.Sc. degree in computer science and automatic control from Alexandria University in 1983, the M.S. degree in computer networks from the same university in 1986, and the M.S. and Ph.D. degrees in computer science from The Center for Advanced Computer Studies at the University of Louisiana at Lafayette in 1988 and 1990, respectively. From 1983 to 1986, he was with the Computer Science Department, Alexandria University, Egypt, as a lecturer. From September 1990 to May 1995 he worked as an assistant professor at the Department of Computer Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. From May 1995 to December 2000, he worked as an associate professor in the same department. In January 2000, Dr. Elleithy joined the Department of Computer Science and Engineering at the University of Bridgeport as an associate professor. Dr. Elleithy has published more than seventy research papers in international journals and conferences. His research interests are in the areas of computer networks, network security, mobile communications, and formal approaches for design and verification.



IPv6 and IPv4 Threat Reviews with Automatic Tunneling and Configuration Tunneling Considerations Transitional Model: A Case Study for University of Mysore Network

Hanumanthappa. J.¹ and Dr. Manjaiah. D.H.²

¹ DoS in Computer Science, University of Mysore, Manasagangothri, Mysore, INDIA. [email protected]
² Dept. of Computer Science, Mangalore University, Mangalagangothri, Mangalore, INDIA. [email protected]

Abstract

The actual transition from IPv4 to IPv6 requires network administrators to become aware of the next-generation protocol and the associated risk problems. Due to the scale and complexity of the current Internet architecture, how to protect the existing investment and reduce the negative influence on users and service providers during the transition from IPv4 to IPv6 is a very important topic for the advanced version of the Internet architecture. This paper summarizes and compares the IPv6 transition mechanisms, such as the dual stack; tunneling issues such as IPv6 automatic tunneling and manually configured tunneling considerations; the IPv6 transition scenarios; and IPv6 transition security problems, and it highlights the IPv6 and IPv4 threat review with automatic tunneling and configuration tunneling considerations. In this paper we propose a transitional threat model for automatic tunneling and configuration tunneling that could be followed by the University of Mysore (UoM) to estimate automatic tunneling and manually configured tunneling threat-review issues. Furthermore, there are different tunneling mechanisms, such as IPv6 over IPv4 GRE tunnels, tunnel brokers, automatic IPv4-compatible tunnels, and automatic 6to4 tunnels; the paper also outlines many of the commonly known threats against IPv6 and then compares and contrasts how these threats, or similar ones, might affect an IPv6 network.

Keywords: Automatic Tunneling; Configuration Tunneling; IPv6 Transition; IPv6 Tunneling; IPv6 Security.

I. Introduction

In the last 20 years, the Internet underwent a huge and unexpected explosion of growth [63]. There was an effort to develop a protocol that could solve problems in the current Internet protocol, Internet Protocol version 4 (IPv4). It was soon realized that the current Internet protocol, IPv4, would be inadequate to handle the Internet's continued growth. The Internet Engineering Task Force (IETF) started to develop a new protocol in the early 1990s and launched IPng, which stands for Internet Protocol Next Generation, in 1993. Thus a new generation of the Internet Protocol (IPv6) was developed [7], allowing for millions more IP addresses. The person in charge of the IPng area of the IETF recommended the idea of IPv6 in 1994 at the Toronto IETF meeting [1]. Mainly due to the scarcity of unallocated IPv4 addresses, the IPv4 protocol cannot satisfy all the requirements of the ever-expanding Internet; because its 32-bit address space is being rapidly exhausted [2], alternative solutions are again needed [3]. The long-term solution is a transition to IPv6 [5], which is designed to be an evolutionary step from IPv4, where most transport and application-layer protocols need little or no modification to work. The deployment of NAT [3] can alleviate this problem to some extent, but it breaks the end-to-end characteristic of the Internet and cannot resolve problems like the depletion (exhaustion) of IPv4 addresses. The IPv6 protocol has 128-bit addresses instead of 32-bit IPv4 addresses; however, an instantaneous migration from IPv4 to IPv6 is impossible because of the huge size of the Internet and the great number of IPv4 users [16]. Moreover, many organizations are becoming more and more dependent on the Internet for their daily work, and they therefore cannot tolerate downtime for the replacement of the IP protocol. IPv6 has transition methods or techniques that permit end users to put IPv6 into operation slowly but surely and that provide a high level of interoperation between the IPv4 and IPv6 protocols. The current IPv4-based Internet is so large and complex that the migration from IPv4 to IPv6 is not as simple as the transition from the NCP network to TCP/IP in 1983, and it will take many years to occur smoothly [63].


IPv6 can be installed as a software upgrade in most Internet machines, and it can work smoothly with the current IPv4 data.

This paper is intended to provide a review of the most common threats in IPv6. It can also be extended by investigating the additional threats introduced through the deployment of IPv6, including those associated with the various transition mechanisms available, with specific emphasis on automatic tunneling and configuration tunneling. Since no transitional networks (ISPs) are IPv6-ready yet in INDIA, we need transition techniques to get going with deploying IPv6, as there is no complete worldwide IPv6 network infrastructure. We should look at this stage as a strategic vision, and we should look at the coexistence of IPv4 and IPv6 network infrastructure as a necessary situation before complete migration to IPv6. The scarcity of information on the subject of IPv6 migration costs, merged with the reality that many organizations are not sold on the supposed benefits offered by IPv6, is making the case for upgrading difficult to argue [2]. It is quite obvious that changing from IPv4 to IPv6 is very costly, since many current network applications run on IPv4.

This paper presents a comprehensive explanation of the current status of research on IPv6 transition mechanisms; tunneling types such as automatic tunneling and manually configured tunneling; threat reviews of the tunneling types; IPv6 security aspects; and the threat-review model, and it indicates the prospects for future research. The paper is organized as follows. We briefly describe the theoretical considerations of IPv6 transition issues in Section II. We describe the IPv6-to-IPv4 threat review in Section III. We give a brief overview of IPv6 automatic tunneling and configuration tunneling mechanism considerations in Section IV. We discuss a prototype that explains the automatic tunneling and configuration tunneling review research in Section V. We describe our research-approach recommendations on IPv6 tunneling threat types in Section VI. In Section VII we present the current and future innovative research challenges of IPv6 threat issues for researchers, and finally we conclude the whole paper in Section VIII.

II. Theoretical Considerations

A. Types of Transition Strategies in IPv6

The key elements of these transition technologies are the dual stack and configuration tunneling. Figure I gives a description of the different IPv6 tunneling scenarios and their configurations, which are explained using some of the available commands. The most important IPv6 transition techniques are dual stack, tunneling techniques, and header translation.

Fig. I. The IPv6 Transition Mechanisms.

B. Introduction to Tunnels

A tunnel is a bidirectional point-to-point link between two network end points. Data is carried through the tunnel using a process called encapsulation, in which an IPv6 packet is carried inside an IPv4 packet; this makes IPv4 act as a data-link layer with respect to the IPv6 packet transport. The term “tunneling” refers to a means of encapsulating one version of IP in another so that the packets can be sent over a backbone that does not support the encapsulated IP version. For example, when two isolated IPv6 networks need to communicate over an IPv4 network, dual-stack routers at the network edges can be used to set up a tunnel which encapsulates the IPv6 packets within IPv4, allowing the IPv6 systems to communicate without having to upgrade the IPv4 network infrastructure that exists between the networks. This mechanism can be used when two nodes that use the same protocol want to communicate over a network that uses another network protocol. The tunneling process involves three steps: encapsulation, decapsulation, and tunnel management. It also requires two tunnel end points, which in the general case are dual-stack IPv4/IPv6 nodes, to handle the encapsulation and decapsulation. There will be performance issues associated with tunneling, both for the encapsulation/decapsulation latency and for the additional bandwidth used. Tunneling is one of the key deployment strategies for both service providers and enterprises during the period of IPv4 and IPv6 coexistence. Fig. II shows the deployment of IPv6 over IPv4 tunnels.

Fig. II. Deploying IPv6 over IPv4 Tunnels.

Tunneling is a strategy used when two computers using IPv6 want to communicate with each other and the packet must pass through a region that uses IPv4. To pass through this region, the packet must have an IPv4 address. So the IPv6 packet is encapsulated in an IPv4 packet when it enters the region, and it leaves its capsule when it exits the region. It seems as if the IPv6 packet goes through a tunnel at one end and emerges at the other end [10]. To make it clear that the IPv4 packet is carrying an IPv6 packet as data, the protocol value is set to 41 (see Fig. II). Although IPv6 dual stack, IPv6 tunneling, and IPv6 header translation provide us with transition solutions, the solution is still not complete; there are still other issues we should consider to get a complete solution for the transition. Tunneling techniques are broadly divided into two types: the first is automatic tunneling and the second is configuration tunneling (refer to Figure IV). The tunneling technique can use the compatible addresses shown in Figure III. A compatible address is an address of 96 bits of zero followed by 32 bits of IPv4 address. It is used when a computer using IPv6 wants to send a message to another computer using IPv6. However, suppose the packet passes through a region where the networks are still using IPv4. The sender must use the IPv4-compatible address to facilitate the passage of the packet through the IPv4 region. For example, the IPv4 address 2.13.17.14 becomes 0::020D:110E: the IPv4 address is prepended with 96 zeros to create a 128-bit address (see Figure III) [10].

Fig.III.The IPv6 Compatible Address.
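The address manipulation described above can be illustrated with a minimal Python sketch using the standard ipaddress module; it forms the IPv4-compatible address from the example in the text and then recovers the embedded IPv4 address, as a tunnel end point would.

    import ipaddress

    # Build the IPv4-compatible IPv6 address: 96 zero bits + 32-bit IPv4 value.
    v4 = ipaddress.IPv4Address("2.13.17.14")
    v6 = ipaddress.IPv6Address(int(v4))
    print(v6)  # -> ::20d:110e, i.e. the 0::020D:110E form quoted above

    # Extraction step performed at the tunnel end point: recover the
    # embedded IPv4 address from the low 32 bits of the IPv6 address.
    print(ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF))  # -> 2.13.17.14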

C. Types of Tunneling Mechanisms

Fig. IV. The IPv6 Tunneling Mechanisms.

D. Automatic Tunneling

If the receiving host uses a compatible IPv6 address, tunneling occurs automatically without any reconfiguration. In automatic tunneling, the sender sends the receiver an IPv6 packet using the IPv6-compatible address as the destination address. When the packet reaches the boundary of the IPv4 network, the router encapsulates it in an IPv4 packet, which must have an IPv4 address. To get this address, the router extracts the IPv4 address embedded in the IPv6 address. The packet then travels the rest of its journey as an IPv4 packet. The destination host, which is using a dual stack, now receives an IPv4 packet. Recognizing its IPv4 address, it reads the header and finds that the packet is carrying an IPv6 packet. It then passes the packet to the IPv6 software for processing (see Figure V) [10].

Fig. V. The IPv6 Automatic Tunneling.

E. Configured Tunneling

The tunnel end-point addresses are determined by configuration information that is stored at the encapsulating end point, hence the name configured tunneling; another name for configured tunneling is explicit tunneling. If the receiving host does not support an IPv6-compatible address, the sender receives no compatible IPv6 address from the DNS. In configured tunneling, the sender sends the IPv6 packet with the receiver's noncompatible IPv6 address; however, the packet cannot pass through the IPv4 region without first being encapsulated in an IPv4 packet. The two routers at the boundary of the IPv4 region are configured to pass the packet encapsulated in an IPv4 packet. The router at one end sends the IPv4 packet with its own IPv4 address as the source address and the other router's address as the destination. The receiving router decapsulates the IPv6 packet and sends it to the destination host. The destination host then receives the packet in IPv6 format and processes it [10] (see Figure VI).


Fig. VI. The IPv6 Configuration Tunneling.

Configuration and automatic tunnels can be defined to go router-to-router, host-to-host, host-to-router, and router-to-host, but they are most likely to be used in a router-to-router configuration.

F. IPv6 over IPv4 GRE Tunneling

Fig. VII. The IPv6 over IPv4 GRE Tunneling.

The IPv6 over IPv4 GRE tunnel is a variety of tunneling mechanism in the TCP/IP protocol suite. This is a type of GRE tunneling technique designed to provide the services necessary to implement a standard point-to-point encapsulation scheme. GRE tunnels are links between end points, with a separate tunnel for each link, similar to IPv6 manually configured tunnels. However, each tunnel is not tied to a specific passenger or transport protocol; in this situation the tunnels carry IPv6 as the passenger protocol over GRE as the carrier protocol. Fig. VII shows how to configure an IPv6 over IPv4 GRE tunnel.

G. Configuration Tunneling Scenarios

During the migration, the tunneling technique can be used in the following ways:

1. Router-to-Router: IPv6/IPv4 routers interconnected by an IPv4 infrastructure can tunnel IPv6 packets between themselves.

Fig. VIII. The Router-to-Router Tunneling Configuration.

2. Host-to-Router: IPv6/IPv4 hosts can tunnel IPv6 packets to an intermediary IPv6/IPv4 router that can be reached via an IPv4 infrastructure.

Fig. IX. The Host-to-Router Tunneling Configuration.

3. Host-to-Host: IPv4/IPv6 hosts that are interconnected by an IPv4 infrastructure can tunnel IPv6 packets between themselves.

Fig. X. The Host-to-Host Tunneling Configuration.

4. Router-to-Host: IPv6/IPv4 routers can use tunnels to reach an IPv4/IPv6 host via an IPv4 infrastructure.

Fig. XI. The Router-to-Host Tunneling Configuration.

III. The IPv6-to-IPv4 Threat Review

A. Types of Threats in IPv6 Security


Fig. XII. The various types of IPv6-to-IPv4 threats.


Network threats are broadly divided into two types: passive threats and active threats. The term passive indicates that the attacker does not attempt to perform any modifications to the data [15]; in fact, this is also why passive attacks are harder to detect. Active attacks, by contrast, are based on modification of the original message in some manner or on the creation of a false message.

IV. A Brief Overview of IPv6 Automatic Tunneling and Configuration Tunneling Mechanism Considerations

Several approaches to the transition from IPv4 to IPv6 networks exist. These approaches are broadly divided into the following three types (refer to Figure I): 1. Dual Stack, 2. Tunneling, 3. Translation.

A. IPv6 and IPv4 Threat Issues and Observations

With regard to IPv6 tunneling technologies and firewalls, the main observations are as follows.

1. If the network designer does not consider IPv6 tunneling when defining the security policy, unauthorized traffic could possibly traverse the firewalls in tunnels. This is similar to the issue with instant messaging (IM) and file-sharing applications using TCP port 80 out of organizations with IPv4.

2. We already know that automatic tunneling mechanisms are susceptible to packet forgery and DoS attacks. These risks are the same as in IPv4, but they increase the number of paths of exploitation for adversaries.

3. While deploying automatic tunneling or configuration tunneling, the tunneling overlays are considered non-broadcast multi-access (NBMA) networks to IPv6, and the network designer must take this fact into account in the network security design.

4. Relay translation technologies have introduced automatic tunneling with DoS threats involving third parties. These risks do not change from IPv4, but they do provide new avenues for exploitation [13]; for either external or internal customers, the relay avenues can be limited by restricting the routing advertisements.

5. IPv6-to-IPv4 translation and relay techniques can defeat active-defense traceback efforts by hiding the origin of an attack.

6. The translation techniques outlined for IPv6 have been analyzed and shown to suffer from spoofing and DoS issues similar to those of IPv4-only translation technologies [14].

7. Static IPv6-in-IPv4 tunneling is preferred because explicit allows and disallows are in the policy on edge devices.

Figure XIII shows a study that has been conducted by the University of Mysore to estimate the IPv6-to-IPv4 threat review with the help of automatic tunneling and configuration tunneling issues.

Fig. XIII. Threat differences between IPv4 and IPv6 with tunneling techniques.

V. Prototype

5.1. Threat Analysis due to Transition Mechanisms

Threat modeling (or analysis) is essential in order to help us develop a security model that can focus on protecting against certain threats and manage the related assumptions. One methodology to discover and list all possible security attacks against a system is known as attack trees. To create an attack tree, we represent attacks against a system in a tree structure, with the attack goals as root nodes and the different sub-goals necessary to achieve them as their leaf nodes. Figure XIV represents the general threat categories we have identified against network-convergence architectures, namely attacks on the network processes responsible for the IPv6 transition: dual-stack, automatic tunneling, and configuration tunneling threats. Dual-stack threats are quite different from those of IPv6 tunneling techniques such as automatic tunneling, manually configured tunneling, and static tunneling. As we have discussed, there is a large number of transition mechanisms for deploying IPv6, but they can broadly be categorized into dual stack, tunneling (automatic or manual configuration), and header translation.

Fig.XIV.General Threat categories for IPv6 Tunneling.
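A minimal Python sketch of the attack-tree structure just described follows; the node labels are a small illustrative subset, not the full tree of Figure XIV.

    from dataclasses import dataclass, field

    @dataclass
    class AttackNode:
        goal: str
        children: list = field(default_factory=list)

        def paths(self, prefix=()):
            # Enumerate root-to-leaf attack paths (goal followed by sub-goals).
            here = prefix + (self.goal,)
            if not self.children:
                yield here
            for child in self.children:
                yield from child.paths(here)

    root = AttackNode("Attack IPv6 transition", [
        AttackNode("Dual-stack threats"),
        AttackNode("Tunneling threats", [
            AttackNode("Automatic tunneling: spoofed encapsulated packets"),
            AttackNode("Configuration tunneling: tunnel end-point abuse"),
        ]),
    ])

    for path in root.paths():
        print(" -> ".join(path))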

The problems are identified when IPv6 is tunneled over IPv4 encapsulated in UDP, as UDP is usually allowed to pass through NATs and firewalls [59], consequently allowing an attacker to punch holes in the security infrastructure. The first and second authors of this paper recommend that, if the necessary security measures cannot be taken, tunneled traffic should be used with caution if not completely blocked. To provide ingress and egress filtering of known tunneled IPv6 traffic, perimeter firewalls should block all inbound and outbound IPv4 protocol 41 traffic. For circumstances where protocol 41 is not blocked, it can easily be detected and monitored by the open-source IPv4 IDS Snort. During the development of the IPv6-to-IPv4 threat model, we identified several attacks that lead to other attacks which we had previously included and analyzed; these are represented in the tree as identical nodes in different locations.

5.2. Secure IPv6 Deployment for Automatic Tunneling and Configuration Tunneling

A. Specifics of IPv6 Tunneling Deployment

Considering the issues surrounding IPv6 firewalls, Figure XV demonstrates how all traffic originating from the Internet must be split up into its corresponding protocols. Each protocol must then be inspected and filtered independently, based on a consistent policy, before being forwarded to its respective destination. Under most circumstances the deployment model will be much more complex and will probably consist of some hybrid deployment structure, which may include some element of tunneling. For these situations the principle should remain the same, but the model should be adapted accordingly.

Fig. XV. Basic security model for IPv6 tunneling types.

B. Predicted Security Models for IPv6 Tunneling

Research is pushing towards reducing the restrictions that are preventing widespread deployment of IPSec. Advances in the development of a PKI solution [49] to offer basic and advanced certification services supply a solution that allows client systems or end entities in one administrative domain to communicate securely with client systems or end users in another administrative domain. This can be extended to support multi-domain IPv6 scenarios through the deployment of cross-certification modules, thus reducing the key-management problems. In addition, policy-based management systems are being implemented to solve the challenges presented by large-scale IPSec policy deployment across many network elements through the use of a centralized policy server which controls policy targets [50]. These advances are in their preliminary stages but demonstrate the options available to start solving the limitations of widespread IPSec deployment within IPv6.

C. The more complicated the IPv6 transition/coexistence becomes, the greater the danger that security issues will be introduced, either in the mechanisms themselves, in the interaction between mechanisms, or by introducing unsecured paths through multiple mechanisms.

D. Avoid IPv6 tunneling or be aware of the security consequences: 1. 6to4 does not support source address filtering. 2. Teredo punches holes into the NAT device. 3. Any tunneling mechanism may be prone to spoofing. 4. With any tunneling mechanism, we trust the relay servers [60].

E. Block IPv6 Tunneling Protocols

The networking and security communities have invested time and energy in ensuring that IPv6 is a security-enabled protocol. However, one of the greatest risks inherent in the migration is the use of tunneling protocols to support the transition to IPv6. These protocols allow the encapsulation of IPv6 traffic in an IPv4 data stream for routing through non-compliant devices. Therefore, it is possible that users on your network can begin running IPv6 using these tunneling protocols before you are ready to officially support it in production. If this is a concern, block IPv6 tunneling protocols (including SIT, ISATAP, 6to4, and others) at your perimeter. To block IPv6 tunneling protocols, two steps are important: upgrade the edge firewall, proxy, and IDS to include IPv6 and tunneled-IPv6 functionality, and drop all outbound IPv4-based UDP traffic with source or destination port 3544 together with IPv4 protocol 41 packets [60]. Along with these come the leading threats of IPv4 DoS and DDoS attacks. IPv6 DoS attacks are related to the Neighbor Discovery (ND) protocol; ND covers functions such as address resolution, neighbor unreachability detection, duplicate address detection, and router discovery, and these types of threats are controlled by IPSec [60]. DDoS attacks are based on four representative modes: TCP flood, UDP flood, ICMP flood, and the smurf attack. TCP flooding exploits the three-way handshake mechanism of the TCP protocol: the attacking node sends a series of SYN requests to the victim with spoofed addresses, the victim responds with SYN/ACK and waits some time for an ACK, and because of the spoofed source address no ACK ever returns, causing the connection queue and memory buffer to fill up [62].
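As a practical illustration of the two signatures just mentioned, the following minimal sketch watches for protocol-41 encapsulation and Teredo's UDP port 3544; the choice of the third-party scapy library and the alert format are assumptions here, and any pcap-based monitor would serve equally well.

    from scapy.all import sniff  # third-party: pip install scapy

    def report(pkt):
        # Flag candidate tunneled-IPv6 packets for closer inspection.
        print("possible IPv6-in-IPv4 tunnel traffic:", pkt.summary())

    # BPF filter: 6in4/6to4/ISATAP encapsulation or Teredo's well-known port.
    sniff(filter="ip proto 41 or udp port 3544", prn=report, store=False)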

F. Specific for Teredo: Network

To restrict outgoing traffic (whitelisting), block UDP port 3544; then, for Windows OS, disable the Teredo client with a command such as: netsh interface teredo set state disabled. The registry key for the Teredo client is HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip6\parameters [60].

5.3. IPv6 Threats: IPv6 Tunneling

In tunneling-based methods, when a tunnel end point receives an encapsulated data packet, it decapsulates the packet and hands it to the local forwarding scheme. The security threats in tunneling mechanisms (take an IPv6-over-IPv4 tunnel, for example) are mostly caused by spoofed encapsulated packets sent by attackers in IPv4 networks. As shown in Figure XVI, the target of attacks can be either a normal IPv6 node or the tunnel end point [25].

Fig. XVI. Security issues of various tunneling types.

5.4. The security issues in IPv6 tunneling are as follows.

1. Hard to trace back:

Case 1: An IPv4 networking node attacks an IPv6 node (network). Attackers in IPv4 networks can attack IPv6 nodes through the 6to4 router (tunnel) end point by forwarding spoofed encapsulated messages (packets); in this situation it is very difficult to trace back (refer to Fig. XVI).

Case 2: An IPv6 networking node attacks an IPv6 network (node). Here the attacker in an IPv6 network attacks the IPv6 network through the 6to4 relay end point and 6to4 router by sending spoofed encapsulated packets; in this case it is also very difficult to trace back (refer to Fig. XVI).

2. Potential reflect-DoS attack on the destination host (refer to Fig. XVI): Attackers in IPv4 networks can launch a reflect-DoS attack against a normal IPv6 network (node) through the 6to4 router (tunnel) end point by sending encapsulated packets whose spoofed IPv6 source address is that of the specific IPv6 node.

3. Cheating by an attacker with IPv6 Neighbor Discovery (ND) messages: Whenever the IPv4 network is treated as the link layer in tunneling technology, attackers in the IPv4 networks can cheat and DoS-attack the tunnel end point by sending encapsulated IPv6 neighbor discovery (ND) messages with a spoofed IPv6 link-local address. Automatic tunneling techniques like 6to4 and Teredo obtain the information about the remote tunnel end point from certain IPv6 packets.

4. Distributed reflection DoS: This type of attack can be performed whenever a very large number of nodes is involved in sending spoofed traffic with the same source IPv6 addresses.

5. The destination host generates replies (TCP SYN/ACK, TCP RST, ICMPv6 Echo Reply, ICMPv6 Destination Unreachable, etc.): In this attack the victim host is used as a reflector for attacking another victim connected to the network by using a spoofed source (refer to Fig. XVI).

6. Spoofing in IPv4 with 6to4: In this type of attack, spoofed 6to4 traffic can be injected from IPv4 into IPv6. The spoofed IPv4 address acts as the IPv4 source, the 6to4 relay anycast address (192.88.99.1) acts as the IPv4 destination, the 2002::/16 spoofed source address acts as the IPv6 source address, and the valid destination is the IPv6 destination [28] (refer to Fig. XVII).

Fig. XVII. Spoofing in IPv4 with 6to4.

7. Theft of service: During the IPv6 transition period, many sites will use IPv6 tunnels over IPv4 infrastructure, sometimes using static and sometimes automatic tunnels. The 6to4 relay administrators will often want to use some policy to limit the use of the relay to specific 6to4 sites or specific IPv6 sites. However, some users may be able to use the service regardless of these controls, either by configuring the address of the relay using its IPv4 address instead of 192.88.99.1, or by using the routing header to route the IPv6 packets to reach specific 6to4 relays.

8. Attack with the IPv4 broadcast address: In the 6to4 mechanism, attackers in the IPv6 network send to the target routers packets whose destination addresses are spoofed and mapped to the broadcast addresses of the 6to4 or relay routers. In this case the 6to4 or relay routers are attacked through their broadcast addresses.

The security issues in tunneling mechanisms can generally be limited by checking the validity of the source/destination address at each tunnel end point. Usually, in tunneling techniques it is easier to evade ingress filtering checks. It is sometimes possible to send packets having link-local addresses and hop-limit = 255, which can be used to attack subnet hosts from a remote node, but it is very difficult to deal with attacks using legal IP addresses now [26]. Since the tunnel end points of configuration tunnels are fixed, IPSec can be used to avoid spoofed attacks [29]. Even though IPv6 provides many security features, it is known that automatic tunnels are dangerous because the other end points are unspecified, and it is very difficult to prevent DoS/reflect-DoS attacks on automatic tunneling mechanisms by attackers in the IPv4 network.

VI. Recommendations

The conceptual ease of deploying tunneling mechanisms has resulted in tunneling becoming one of the most popular transition methods. This paper presents an overview of the protocols and technologies needed to secure current IPv6 tunneling deployments, including basic security models, in addition to investigating and predicting future security models. The following section summarizes the key recommendations made throughout this paper for avoiding threats to IPv6 tunneling techniques such as automatic and configuration tunneling. In addition, a general guideline is presented for a network administrator to follow through each stage of deployment.

A. Immediate Actions to Take Before IPv6 Deployment

1. Tunneled IPv6 can be encapsulated in the following ways.

1.1: By using an IPv4 header: Administrators who have not deployed IPv6 must first ensure that it is not being maliciously used without their knowledge. We know that 6to4, ISATAP, and Tunnel Broker traffic is IPv6 traffic tunneled using an IPv4 header that has the IP protocol field set to 41. To protect against such traffic, filtering all traffic with the IP protocol field set to 41 in the IPv4 header will prevent known IPv6 traffic from being tunneled within IPv4, thus preventing any back doors from being created within the network. However, tunnels can also be set up over UDP, HTTP (port 80), and so on, so the authors recommend using an IDS to carefully detect and monitor all tunneled traffic for instances of IPv6 traffic.

1.2: By using an IPv4 header and a UDP header: Teredo traffic (also called IPv4 network address translator traversal, NAT-T, for IPv6; it provides address assignment and automatic tunneling for IPv6 connectivity across the IPv4 Internet, even when the IPv6/IPv4 hosts are located behind one or multiple IPv4 NATs) is tunneled using an IPv4 header and a UDP header with port 3544. To protect against such Teredo traffic, drop (filter) all traffic with the source or destination UDP port set to 3544. Figure XVIII shows Teredo (NAT-T) traffic.

Fig. XVIII. Teredo Traffic (NAT-T Traffic).

1.3: By using 6to4 static tunneling (6to4 is an address-assignment and automatic tunneling technology that is used to provide IPv6 connectivity between IPv6 sites and hosts across the IPv4 Internet) instead of the specified tunneling techniques (refer to Fig. XIX).

Fig. XIX. 6to4 Static Tunneling.

VII. Current and Future Innovative Research Challenges of IPv6 Threat Issues for Researchers

This paper has not considered the overall threat review of IPv6 for all aspects such as dual stack, tunneling mechanisms, and header translation, which are large and complex topics. To provide a complete overview of IPv6 security, this paper should be read in conjunction with the IPv6-to-IPv4 threat review with tunneling considerations. The most important area to move forward with in IPv6 security is the extension of current IPv6 firewalls and of the network tools to test them (IPv6 packet constructors, IDSs, and so on). This will allow more users to adopt IPv6 without being paranoid about their openness to attack.

Before formulating the analysis, we have proposed (formulated) several innovative research challenges. Presently there have been plenty of studies on the basic security issues and threat issues of IPv6; however, many problems are still not resolved, calling for great challenges ahead. The innovative research challenges of IPv6 threat issues are as follows.

1. Notion of system identification within an organization: With the advent of privacy extensions and the size of the IPv6 ranges in use, identifying systems within an organization, and in particular identifying misbehaving ones, becomes difficult.

2. Transition mechanisms from IPv4 to IPv6: The current research on the basic transition mechanisms mostly focuses on the situation of IPv6 over IPv4. Due to the advanced deployment of IPv6, IPv4 networks may also be separated by IPv6 ones. Only a few kinds of methods are used in this situation, such as the IPv4 configuration tunnel and DSTM; more research on IPv4-over-IPv6 transition methods is necessary.

3. Increased dependence on multicast addresses in IPv6 could have some interesting implications for flooding attacks. For example, all routers and NTP servers have site-specific multicast addresses. Can we use site-specific multicast addresses to create amplification attacks similar to the smurf attacks in IPv4?

4. We know that neighbor discovery is a new addition to IPv6 that replaces ARP and RARP of IPv4, and since it is an essential component of a well-run IPv6 network it should be tested from a security standpoint. Can a neighbor-discovery cache fall victim to a resource-starvation attack in any of the currently deployed neighbor discovery implementations? Can the CPU of a device be exhausted by processing IPv6 neighbor discovery information?

5. IPv6 is new, and security information on the protocol is not widespread; it is the opinion of all the authors that a large number of dual-stack hosts may be more exposed to attack with IPv6 than with IPv4.

6. With a new IPv6 header configuration, new extension headers, and new ICMP message types, there may be several novel ways to deal with flooding attacks.

7. Scenario analysis: Typical scenario analysis is still in progress; some of it is in draft mode, such as enterprise-network analysis. Along with this, other possible scenarios should also be analyzed to support upcoming wireless technologies.

8. Support of anycast, multihoming, multicast, and mobility: All the research on basic transition mechanisms and analysis of typical transition scenarios normally focuses on network connection. More effort should be made during the long IPv6 transition process to support multihoming, mobility, anycast, and multicast.

9. Security considerations: All the IPv6 tunneling techniques introduce additional security problems, and these problems cannot be fully settled or solved nowadays. Besides, IPv6 firewall technology is also a good innovative topic for future research.

10. Difficulty in identifying software and setup: The various initializations of protocols for different transition issues (dual stack; tunneling issues such as automatic tunneling and configuration tunneling; and header translation security) make the choice and setup of suitable IPv6 transition mechanisms difficult and complex. A standard way to discover and set up the software for connecting IPv6 networks across IPv4-only networks, and vice versa, is needed for the interoperation of IPv4 and IPv6.

VIII. Conclusion

This paper has shown that IPv6 has both benefits and drawbacks from a security point of view. Many of the IPv4 threats (attacks) are similar to the IPv6 threats (attacks), but they differ in the way they are applied. All the tunneling techniques described here are useful in one way or another, yet they have different usages according to the type of network and the intended use of the tunnel. This paper outlines many of the commonly known threats against IPv4 and IPv6 and then compares and contrasts how these threats, or similar ones, might affect an IPv6 network. Automatic tunneling is also useful for providing IPv6 to hosts whose ISP does not support it. Reading this paper should also stimulate further innovative ideas regarding future research in IPv6 security. Awareness of the presence of IPv6 and its corresponding transition methods is often enough to arm network administrators with sufficient information to thwart common attacks. This paper has also introduced the security issues and candidate best practices surrounding the introduction of IPv6 into a network with or without IPSec. Due to the prevalence of the current Internet, the transition from IPv4 to IPv6 cannot be accomplished in a short time; besides, the scarcity of IPv6 key applications provides insufficient impetus to deploy IPv6 networks. As a result, the transition to IPv6 is a long process. Threat estimation of IPv6 automatic and configuration tunneling can expose powerful security issues faster. As for UoM, Manasagangothri, the threat-issues transition model makes tunneling threats relatively easy to understand.

Acknowledgment

The first author would like to thank Dr. Manjaiah D.H., Reader, Mangalagangothri, Mangalore University, for his valuable guidance and helpful comments throughout the writing of this paper. This research has been supported by the Department of Studies in Computer Science, Manasagangothri, University of Mysore, and by the Department of P.G. Studies and Research in Computer Science, Mangalagangothri, Mangalore University, under the University Grants Commission (UGC), New Delhi.


AUTHORS PROFILE

Mr. Hanumanthappa J. is a Lecturer at the DoS in CS, University of Mysore, Manasagangothri, Mysore-06, and is currently pursuing a Ph.D in Computer Science and Engineering from Mangalore University under the supervision of Dr. Manjaiah D.H on the topic "IPv6 Tunneling Issues for 4G Networks". His teaching and research interests include computer networks, wireless and sensor networks, mobile ad-hoc networks, intrusion detection systems, network security and cryptography, Internet protocols, mobile and client-server computing, traffic management, quality of service, RFID, Bluetooth, Unix and Linux internals, kernel programming, and object-oriented analysis and design. His most recent research focus is in the area of Internet protocols and their applications. He received his Bachelor of Engineering degree in Computer Science and Engineering from University B.D.T College of Engineering, Davanagere, Karnataka, India (Kuvempu University, Shimoga) in 1998 and his Master of Technology in Computer Science and Engineering from NITK Surathkal, Karnataka, India in 2003. He has been a faculty member of the Department of Studies in Computer Science since 2004 and has worked as a lecturer at SIR.M.V.I.T, Y.D.I.T and S.V.I.T in Bangalore. He has guided about 250 project theses for BE, B.Tech, M.Tech, MCA and MSc/MS students and has published about 15 technical articles in international and national peer-reviewed conferences. He is a life member of CSI, ISTE, AMIE, IAENG, the Embedded Networking group of TIFAC-CORE in Network Engineering, ACM, the Computer Science Teachers Association (CSTA), ISOC, IANA, IETF, IAB, IRTG, etc. He is also a BOE member of all the universities of Karnataka, India. He has visited the Republic of China as a visiting faculty member of HUANG HUAI University of Zhumadian, Central China, teaching computer science subjects (OS and system software, software engineering, object-oriented programming with C++, and multimedia computing) to B.Tech students, and in 2008 and 2009 he also visited Thailand and Hong Kong as a tourist.

Dr. Manjaiah D.H is currently Reader and Chairman of the BoS for both UG and PG programmes in Computer Science at the Dept. of Computer Science, Mangalore University, Mangalore. He is also a BoE member of all universities of Karnataka and other reputed universities in India. He received his Ph.D degree from Mangalore University, his M.Tech from NITK Surathkal, and his B.E. from Mysore University. Dr. Manjaiah D.H has extensive academic, industry and research experience. He has worked with many technical bodies such as IAENG, WASET, ISOC, CSI, ISTE and ACS, and has authored more than 25 research papers in international conferences and reputed journals. He has been invited to give several talks in his area of interest on many public occasions. He is an expert committee member of the AICTE and various technical bodies. He has written a Kannada textbook, entitled "COMPUTER PARICHAYA", for the benefit of the teaching and student community of Karnataka. His areas of interest are computer networking and sensor networks, mobile communication, operations research, e-commerce, Internet technology and web programming.


Performance Evaluation of Mesh based Multicast Reactive Routing Protocol under Black Hole Attack

E.A.Mary Anita Research Scholar Anna University

Chennai

V.Vasudevan Senior Professor and Head / IT A. K. College of Engineering

Virudunagar, India

Abstract— A mobile ad-hoc network is an autonomous system of mobile nodes connected by wireless links, in which nodes cooperate by forwarding packets for each other, thereby enabling communication beyond the direct wireless transmission range. The wireless and dynamic nature of ad-hoc networks makes them vulnerable to attacks, especially on routing protocols. Providing security in mobile ad-hoc networks has been a major issue in recent years. One of the prominent mesh-based reactive multicast routing protocols used in ad-hoc networks is the On Demand Multicast Routing Protocol (ODMRP). The security of ODMRP is compromised by a primary routing attack called the black hole attack, in which a malicious node advertises itself as having the shortest path to the node whose packets it wants to intercept. This paper discusses the impact of the black hole attack on ODMRP under various scenarios. The performance is evaluated via simulation using metrics such as packet delivery ratio and end-to-end delay for various numbers of senders and receivers. Simulations are carried out using the network simulator ns-2. The results enable us to propose solutions to counter the effect of the black hole attack.

Keywords-MANET; Black hole; ODMRP

I. INTRODUCTION

Security in wireless ad-hoc networks is a complex issue. This complexity is due to various factors such as insecure wireless communication links, the absence of a fixed infrastructure, node mobility and resource constraints [1]. Nodes are more vulnerable to security attacks in mobile ad-hoc networks than in traditional networks with a fixed infrastructure. The security issues of Mobile Ad-hoc Networks (MANETs) are even more challenging in a multicasting environment with multiple senders and receivers. There are different kinds of attacks by malicious nodes that can harm a network and make it unreliable for communication. These attacks can be classified as active and passive attacks [2]. A passive attack is one in which information is snooped by an intruder without disrupting network activity. An active attack disrupts the normal operation of a network by modifying the packets in the network. Active attacks can be further classified as internal and external attacks. External attacks are carried out by nodes that do not form part of the network; internal attacks come from compromised nodes that were once a legitimate part of the network.

A black hole attack is one in which a malicious node advertises itself as having the shortest path to a destination in a network. This can cause Denial of Service (DoS) by dropping the received packets.

The rest of the paper is organized as follows. The next section gives an overview of ODMRP. Section III discusses the black hole attack. Section IV reviews related work on securing ad-hoc networks. In Section V, the results of simulation experiments showing the impact of the black hole attack on the performance of ODMRP under various scenarios are discussed. Finally, Section VI summarizes the conclusions.

II. OVERVIEW OF ODMRP

ODMRP is a mesh-based multicast routing protocol that uses the concept of a forwarding group. Only a subset of nodes forwards the multicast packets on shortest paths between member pairs to build a forwarding mesh for each multicast group [3].

Figure 1. On-demand route and mesh creation (legend: O = mobile node, S = multicast source, R = multicast receiver; solid lines = JREQ, dashed lines = JREP)

In ODMRP, group membership and multicast routes are established and updated by the source on demand. When a multicast source has packets to send, it initiates a route discovery process: a JOIN REQUEST (JREQ) packet is periodically broadcast to the entire network. Any intermediate node that receives a non-duplicate JREQ packet stores the upstream node ID and rebroadcasts the packet. When the packet reaches a destination, the receiver creates a JOIN REPLY (JREP) and broadcasts it to its neighbors. Every node receiving the JREP checks whether the next-node ID in the JREP matches its own. If there is a match, the node is part of the forwarding group: it sets its FG_FLAG and broadcasts its own JREP built upon the matched entries. The JREP is thus propagated by each forwarding group member until it reaches the source via a shortest path. In this way, the routes from sources to receivers build a mesh of nodes called the forwarding group. The forwarding group is the set of nodes that forward the multicast packets, and it supports shortest paths between any member pairs. All nodes inside the bubble (multicast members and forwarding group nodes) forward multicast data packets. A multicast receiver can also be a forwarding group node if it is on the path between a multicast source and another receiver. The mesh provides richer connectivity among multicast members compared to trees.

After the route establishment and route construction process, a multicast source can transmit packets to receivers via selected routes and forwarding groups. A data packet is forwarded by a node only if it is not a duplicate one and the setting of the FG_Flag for the multicast group has not expired. This procedure minimizes traffic overhead and prevents sending packets through stale routes.

In ODMRP, no explicit control packets need to be sent to join or leave the group. A multicast source can leave the group by simply ceasing to send JREQ packets when it has no more data to send to the group. If a receiver no longer wants to receive data from a particular group, it removes the corresponding entries from its member table and does not transmit the JOIN TABLE for that group. The sketch below illustrates the control-packet handling described in this section.
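As a rough illustration of the mesh-creation logic described above, the following sketch shows how an intermediate node might process JREQ and JREP packets. It is our own illustrative Java fragment: the class and field names (OdmrpNode, upstream, fgFlag, and so on) are hypothetical and are not taken from the ODMRP specification or from the simulation code used in this paper.

// Hypothetical sketch of ODMRP control-packet handling at one node.
class OdmrpNode {
    final int myId;
    final java.util.Set<Integer> seenRequests = new java.util.HashSet<>();
    // source -> previous hop from which the JREQ arrived
    final java.util.Map<Integer, Integer> upstream = new java.util.HashMap<>();
    boolean fgFlag = false;   // set when this node joins the forwarding group

    OdmrpNode(int id) { this.myId = id; }

    void onJoinRequest(int seqNo, int source, int previousHop) {
        if (!seenRequests.add(seqNo)) return;       // drop duplicate JREQs
        upstream.put(source, previousHop);          // remember the upstream node ID
        broadcastJoinRequest(seqNo, source, myId);  // rebroadcast with our own ID
    }

    void onJoinReply(int source, int nextHopInReply) {
        if (nextHopInReply == myId) {               // we lie on a source-receiver path
            fgFlag = true;                          // become a forwarding group member
            broadcastJoinReply(source, upstream.get(source)); // propagate toward source
        }
    }

    void broadcastJoinRequest(int seqNo, int source, int prevHop) { /* radio broadcast */ }
    void broadcastJoinReply(int source, int nextHop) { /* radio broadcast */ }
}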

III. BLACK HOLE ATTACK

A black hole attack is one in which a malicious node uses the routing protocol to advertise itself as having the shortest path to the node whose packets it wants to intercept [4]. This attack aims at modifying the routing protocol so that traffic flows through a specific node controlled by the attacker. The attacker drops the received messages instead of relaying them as the protocol requires, so the quantity of routing information available to other nodes is reduced. The attack can be carried out either selectively or in bulk. Selective dropping means dropping packets for a specified destination, a packet every 't' seconds, a packet every 'n' packets, or a randomly selected portion of packets [5]. A bulk attack drops all packets. Both result in degradation of the performance of the network.

A. Black hole problem in ODMRP

ODMRP is an important on-demand routing protocol that creates routes only when desired by the source node. ODMRP does not include any provisions for security and is hence susceptible to attacks. When a node requires a route to a destination, it initiates a route discovery process within the network. Any malicious node can subvert this route discovery process by claiming to have the shortest route to the destination, thereby attracting traffic towards itself. For example, suppose source A wants to send packets to destination D in Fig. 2, and A initiates the route discovery process. Let M be a malicious node that has no fresh route to destination D. M claims to have a route to the destination and sends a JREP packet back to A. The reply from the malicious node reaches the source earlier than the reply from a legitimate node, as the malicious node does not have to check its routing table the way the legitimate nodes do. The source chooses the path provided by the malicious node, and the data packets are then dropped. The malicious node forms a black hole in the network, and this is called the black hole problem.

Figure 2. Black hole attack (legend: A = source node, D = destination node, M = malicious node; dashed lines = JREQ, solid lines = JREP)

IV. RELATED WORK

Several researchers have addressed the problem of securing unicast routing protocols for ad-hoc networks. Ariadne [7], Security-Aware Ad-hoc Routing (SAR) [8], the Secure Efficient Ad-hoc Distance Vector Routing Protocol (SEAD) [9], Secure AODV [10], Authenticated Routing for Ad-hoc Networks (ARAN) [11], the Secure Routing Protocol (SRP) [12] and the Secure Link-State Protocol (SLSP) [12] are all based on unicast routing protocols, and they do not address the problem of the black hole attack. Marti, Giuli, Lai and Baker [13] have proposed a Watchdog and Pathrater approach against the black hole attack, implemented on top of a source routing protocol such as DSR (Dynamic Source Routing). Ramanujam et al. [2] have presented some general techniques, collectively called TIARA (Techniques for Intrusion-resistant Ad-hoc Routing Algorithms), to protect ad-hoc networks from attacks.

CONFIDANT (Cooperation Of Nodes, Fairness In Dynamic Ad-hoc Networks) [10] is an extended version of Watchdog and Pathrater which uses a mechanism similar to Pretty Good Privacy for expressing various levels of trust, key validation and certification. It, too, is implemented on a unicast routing protocol, DSR. These papers have not addressed the challenges in multicast routing protocols, which are our focus in this paper.

V. PERFORMANCE EVALUATION

The performance of a network depends on many factors, such as the number of senders, receivers and attackers and their positions. The performance of ODMRP has been observed under several such scenarios.

A. Simulation Environment and Metrics

The simulation is done using the ns-2 simulator. The metrics used in evaluating the performance are:

Packet Delivery Ratio: the ratio of the number of data packets delivered to the destinations to the number of data packets generated by the sources.

Average End-to-End Delay: the average delay between the sending of a packet by the source and its receipt by the receiver. This includes all possible delays caused during data acquisition, route discovery, queuing, processing at intermediate nodes, retransmission, propagation, etc. [5]. It is measured in milliseconds. A small sketch of how these metrics are computed follows.

B. Simulation Profile

The simulation settings are as follows. The network consists of 50 nodes placed randomly within an area of 1000 m x 1000 m. Each node moves randomly and has a transmission range of 250 m. The random waypoint model is used as the mobility model: a node selects a random destination and moves towards it at a speed between the pre-defined minimum and maximum speeds. The minimum speed for the simulations is 0 m/s, while the maximum speed is 50 m/s. The simulations were carried out with 2, 5, 7 and 9 attackers for different numbers of receivers, and the malicious nodes were selected randomly. The mobility model is sketched below.
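The random waypoint model described above can be sketched as follows; this is our own illustrative Java fragment, with parameter values taken from the profile above and all identifiers hypothetical.

// Sketch of the random waypoint mobility model: repeatedly pick a random
// destination in the area and move toward it at a uniformly chosen speed.
class RandomWaypoint {
    final java.util.Random rng = new java.util.Random();
    final double areaX = 1000, areaY = 1000;   // simulation area in meters
    final double minSpeed = 0, maxSpeed = 50;  // m/s, as in the profile above
    double x, y;                               // current node position

    void nextLeg() {
        double destX = rng.nextDouble() * areaX;
        double destY = rng.nextDouble() * areaY;
        // Clamp away from zero so the travel time stays finite.
        double speed = Math.max(0.1, minSpeed + rng.nextDouble() * (maxSpeed - minSpeed));
        double travelTimeSec = Math.hypot(destX - x, destY - y) / speed;
        // ... advance the position along the segment for travelTimeSec seconds ...
        x = destX;
        y = destY;
    }
}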

C. Discussion of results

Fig. 3 shows the variation of PDR with mobility for 1 sender and 20 receivers when the number of attackers is varied from 0 to 5. It is seen that the PDR decreases with increased mobility.

Figure 3. PDR for 1 sender and 20 receivers (x-axis: mobility, 0-60 m/s; y-axis: PDR, 0.7-1.1; curves for 0, 1, 3 and 5 attackers)

The drop in PDR in the presence of a single attacker is only around 1%. When the number of attackers is increased to 3, the drop in PDR increases by 10%, and a further drop of 5% is observed when the number of attackers is increased to 5. The higher the number of attackers, the greater the reduction in PDR.

Figure 4. PDR for 1 sender and 30 receivers (x-axis: mobility, 0-60 m/s; y-axis: PDR, 0.8-1.1; curves for 0, 1, 3 and 5 attackers)

A similar situation is seen in Fig. 4, but, as the number of receivers in this case is increased to 30, the impact of the attack is comparatively smaller. This is because a larger number of receivers results in a denser routing mesh, providing alternate paths for the data packets. Given the same number of attackers, the PDR is higher for a larger number of receivers.


Figure 5. PDR for 3 senders and 20 receivers (x-axis: mobility, 0-60 m/s; y-axis: PDR, 0.8-1.1; curves for 0, 1, 3 and 5 attackers)

Fig. 5 shows an increase in the value of PDR compared to Fig. 3. This can be attributed to the increased number of senders, which provides more alternate paths for the data packets. Even if a packet gets dropped on one path due to the presence of black hole nodes, there is a chance for a duplicate copy of the packet to reach the destination through alternate paths free from malicious nodes.

Figure 6. PDR for 3 senders and 30 receivers (x-axis: mobility, 0-60 m/s; y-axis: PDR, 0.8-1.1; curves for 0, 1, 3 and 5 attackers)

The increase in PDR when the number of receivers is increased from 20 to 30 with the same number of senders varies from 1% in the absence of attackers to 3% in the presence of 5 attackers. This is clearly depicted in Fig. 6. From the above graphs we conclude that a large multicast group, with more senders and receivers, is more resilient to the black hole attack than a smaller group. This is due to the presence of more alternative paths available to route duplicate copies of the data packets.

Figure 7. Delay for 1 sender and 20 receivers (x-axis: mobility, 0-60 m/s; y-axis: delay, 5.2-6.1 ms; curves for 0, 1, 3 and 5 attackers)

Fig. 7 shows the variation of end-to-end delay for different numbers of attackers in the presence of 1 sender and 20 receivers. There is an increase in the delay in the presence of attackers.

Figure 8. Delay for 1 sender and 30 receivers (x-axis: mobility, 0-60 m/s; y-axis: delay, 5.3-6.2 ms; curves for 0, 1, 3 and 5 attackers)


Figure 9. Delay for 3 senders and 20 receivers (x-axis: mobility, 0-60 m/s; y-axis: delay, 5.4-6.3 ms; curves for 0, 1, 3 and 5 attackers)

This is because non-shortest paths containing black hole nodes are selected for routing the packets. We also see that the delay increases with increased group size, as shown in Fig. 8.

Figure 10. Delay for 3 senders and 30 receivers (x-axis: mobility, 0-60 m/s; y-axis: delay, 5.6-6.6 ms; curves for 0, 1, 3 and 5 attackers)

End-to-end delay includes all delays caused during route discovery, transmission, processing at intermediate nodes, etc. A larger group naturally accounts for a larger delay, as clearly depicted in Fig. 9 and Fig. 10.

VI. CONCLUSION

Security is one of the major issues in MANETs. In this paper the effect of the black hole attack on MANETs has been analysed. The multicast routing protocol ODMRP has been simulated with black hole nodes under different scenarios. The performance of a multicast routing protocol under black hole attack depends on factors such as the number of multicast senders, the number of multicast receivers and the number of black hole nodes.

From the simulation results it is observed that the packet delivery ratio is reduced with increased mobility of the nodes and with an increased number of black hole nodes, which affects the performance of the network. The packet delivery ratio is also higher for a large number of receivers for the same number of attackers; that is, the effect of the attack is greater in a small group than in a large group. A large group is able to withstand the attack to a reasonable extent compared to a smaller group, which is more easily susceptible to attacks. This can be attributed to the existence of alternate paths for routing the data packets. The results also show that the delay increases with increases in group size and in the number of attackers, because non-shortest paths containing black hole nodes are selected for routing the packets. To implement security over ODMRP, all route request messages are to be authenticated. Several mechanisms can be found in the literature for authentication. A self-organized public key infrastructure can be used to authenticate the nodes participating in the route discovery process, so that compromised nodes can be easily identified and excluded from the network.

REFERENCES
[1] D. Djenouri, L. Khelladi, and N. Badache, "A Survey of Security Issues in Mobile Ad Hoc and Sensor Networks," IEEE Communication Surveys & Tutorials, vol. 7, no. 4, 4th Quarter 2005.

[2] L. Zhou and Z. J. Haas, “Securing Ad Hoc Networks,” IEEE Network Magazine., vol. 13, no.6, Nov./Dec. 1999, pp. 24–30.

[3] S.-J. Lee, M. Gerla and C.-C. Chiang, "On-Demand Multicast Routing Protocol (ODMRP)," Proc. of the IEEE Wireless Communication and Networking Conference (WCNC), September 1999.

[4] P. Papadimitratos and Z. J. Haas, “Secure Routing for Mobile Ad hoc Networks,” Proc. Communication Networks and Distributed Systems, Modeling and Simulation Conf. (CNDS’02), San Antonio, Texas, Jan. 2002, pp. 27–31.

[5] P. Ning and K. Sun, “How to Misuse AODV: A Case Study of Insider Attacks Against Mobile Ad Hoc Routing Protocols,” Info. Assurance Workshop, IEEE Sys., Man and Cybernetics Soc., June 2003, pp. 60–67.

[6] H. Deng, W. Li, and Dharma P. Agrawal, “Routing security in Ad Hoc Networks,” IEEE Communications Magazine, Special Topics on Security in Telecommunication Networks, Vol. 40, No. 10, October 2002, pp. 70-75.

[7] Y.-C. Hu, A. Perrig, and D. Johnson. Ariadne: “A secure on-demand routing protocol for ad hoc networks,” Proc. of 8th ACM Mobile Computing and Networking (MobiCom’02), pp. 12–23, 2002.

[8] S. Yi, P. Naldurg, and R. Kravets, "Security-aware ad hoc routing for wireless networks," Proc. of 2nd ACM Mobile Ad Hoc Networking and Computing (MobiHoc'01), pp. 299-302, 2001.

[9] Y.-C. Hu, D. Johnson, and A. Perrig. SEAD: “Secure efficient distance vector routing in mobile wireless ad hoc networks.” Proc. of 4th IEEE Workshop on Mobile Computing Systems and Applications (WMCSA’02), pp. 3–13, 2002

[10] Yang, H., Luo, H., Ye, F., Lu, S., & Zhang, L. (2004). “Security in mobile ad hoc networks: Challenges and solutions,” IEEE Wireless Communications, 11(1), 38-47.

[11] Sanzgiri, K., Dahill, B., Levine, B. N., Shields, C., & Belding-Royer, E. M. (2005). “Authenticated routing for ad hoc networks,” IEEE Journals on Selected Areas in Communications, 23(3), 598- 610

[12] Papadimitratos, P., & Haas, Z. J. (2003a). “Secure link state routing for mobile ad hoc networks.” Proceedings of the Symposium on Applications and the Internet Workshops (SAINT) (pp. 27-31).

[13] Marti, S., Giuli, T. J., Lai, K., & Baker, M. (2000). "Mitigating routing misbehavior in mobile ad-hoc networks," Proceedings of the 6th International Conference on Mobile Computing and Networking (MobiCom), ISBN 1-58113-197-6 (pp. 255-265).

AUTHORS PROFILE

E. A. Mary Anita received the Bachelor of Engineering in Electrical and Electronics Engineering and the Master of Engineering in Computer Science and Engineering from Government College of Engineering, Tirunelveli, TamilNadu, affiliated to Madurai Kamaraj University and Manonmaniam Sundaranar University respectively. Her research interests include wireless communication, multicast and network security.

V. Vasudevan received the PhD degree from Madurai Kamaraj University, India. He is a Senior Professor in the Information Technology Department of A.K. College of Engineering, Virudhunagar, TamilNadu. His research interests include multicasting, image processing and grid computing.


Novel Framework for Hidden Data in the Image Page within Executable File Using Computation between Advanced Encryption Standard and Distortion Techniques

A.W. Naji*, Shihab A. Hameed, B.B. Zaidan**, Wajdi F. Al-Khateeb, Othman O. Khalifa, A.A. Zaidan and Teddy S. Gunawan

Department of Electrical and Computer Engineering, Faculty of Engineering, International Islamic University Malaysia, P.O. Box 10, 50728 Kuala Lumpur, Malaysia

* [email protected], ** [email protected]

ABSTRACT----- The rapid development of multimedia and the Internet allows for the wide distribution of digital media data. It has become much easier to edit, modify and duplicate digital information. In addition, digital documents are easy to copy and distribute and may therefore face many threats. It has become necessary to find appropriate protection, owing to the significance, accuracy and sensitivity of the information. Furthermore, there is no formal method to be followed to discover hidden data. In this paper, a new information hiding framework is presented. The aim of the proposed framework is the implementation of a computation between the Advanced Encryption Standard (AES) and a distortion technique (DT) that embeds information in the image page within an executable file (EXE file), providing a secure cover file whose size does not change. The framework includes two main functions. The first is the hiding of the information in the image page of the EXE file, through the execution of four processes (specify the cover file, specify the information file, encrypt the information, and hide the information). The second is the extraction of the hidden information through three processes (specify the stego file, extract the information, and decrypt the information).

Keywords--(Image Pages within Portable Executable File, Cryptography, Advanced Encryption Standard, Steganography, Distortion Technique).

I. INTRODUCTION

Nowadays, a protection framework can be classified more specifically as hiding information (Steganography), encrypting information (Cryptography), or a combination of the two. Cryptography is the practice of 'scrambling' messages so that even if detected, they are very difficult to decipher. The purpose of Steganography is to conceal the message such that the very existence of the hidden message is 'camouflaged'. However, the two techniques are not mutually exclusive; Steganography and Cryptography are in fact complementary techniques [1],[2]. No matter how strong the algorithm, if an encrypted message is discovered, it will be subject to cryptanalysis. Likewise, no matter how well concealed a message is, it is always possible that it will be discovered [1]. By combining Steganography with Cryptography we can conceal the existence of an encrypted message, making it far less likely that the encrypted message will be found [2],[3]. Moreover, if a message concealed through Steganography is discovered, the discoverer is still faced with the formidable task of deciphering it. The strength of the combination of the hiding and encryption sciences also stems from the non-existence of standard algorithms for hiding and encrypting secret messages, and from the randomness of hiding methods, such as combining several media (covers) with different methods to pass a secret message. Furthermore, there is no formal method to be followed to discover hidden data [1],[2],[3].

II. IMAGE PAGE WITHIN PORTABLE EXECUTABLE FILE

The proposed framework uses a portable executable file as a cover, with an executable program as an example cover for the proposed framework. This section is divided into three parts [3],[4],[5]. The first part covers concepts related to the PE format. The addition of the Microsoft® Windows NT™ operating system to the family of Windows™ operating systems brought many changes to the development environment and more than a few changes to applications themselves. One of the more significant changes was the introduction of the Portable Executable (PE) file format. The name "Portable Executable" refers to the fact that the format is not architecture-specific [6]; in other words, the term was chosen because the intent was to have a common file format for all versions of Windows, on all supported CPUs [5]. The PE file format draws primarily from the Common Object File Format (COFF) specification that is common to UNIX® operating systems. Yet, to remain compatible with previous versions of MS-DOS® and Windows, the PE file format also retains the old familiar MZ header from MS-DOS [6]. The PE file format for Windows NT introduced a completely new structure to developers familiar with the Windows and MS-DOS environments, while developers familiar with the UNIX environment will find that the PE file format is similar to, if not based on, the COFF specification [6]. The entire format consists of an MS-DOS MZ header, followed by a real-mode stub program, the PE file signature, the PE file header, the PE optional header, all of the section headers, and finally all of the section bodies [4]. The second part covers techniques related to the PE format.


Before looking inside the PE file, we should know some special techniques [6]. First, a general view of PE file sections: a PE file section represents code or data of some sort. While code is just code, there are multiple types of data. Besides read/write program data (such as global variables), other types of data in sections include application program interface (API) import and export tables, resources, and relocations. Each section has its own set of in-memory attributes, including whether the section contains code, whether it is read-only or read/write, and whether the data in the section is shared between all processes using the executable file. Sections have two alignment values, one within the disk file and the other in memory [5]; the PE file header specifies both of these values, which can differ. Each section starts at an offset that is some multiple of the alignment value. For instance, in a PE file a typical file alignment is 0x200, so every section begins at a file offset that is a multiple of 0x200. Once mapped into memory, sections always start on at least a page boundary; that is, when a PE section is mapped into memory, the first byte of the section corresponds to a memory page. On x86 CPUs pages are 4 KB aligned, while on the Intel Architecture IA-64 they are 8 KB aligned. Second, Relative Virtual Addresses (RVAs): in an executable file, there are many places where an in-memory address needs to be specified, for instance the address of a global variable when referencing it. PE files can load just about anywhere in the process address space [7]. While they do have a preferred load address, you cannot rely on the executable file actually loading there. For this reason, it is important to have some way of specifying addresses that are independent of where the executable file loads; to avoid hard-coded memory addresses in PE files, RVAs are used. An RVA is simply an offset in memory, relative to where the PE file was loaded. For instance, consider an .EXE file loaded at address 0x400000, with its code section at address 0x401000. The RVA of the code section would be:

(target address) 0x401000 - (load address) 0x400000 = (RVA) 0x1000 (1)

To convert an RVA to an actual address, simply reverse the process: add the RVA to the actual load address to find the actual memory address. Incidentally, the actual memory address is called a Virtual Address (VA) in PE parlance [7]; another way to think of a VA is as an RVA with the preferred load address added in. This arithmetic is sketched below.
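The RVA/VA arithmetic above reduces to two one-line helpers; this is our own minimal Java sketch, with hypothetical method names.

// Sketch of the RVA/VA arithmetic described above.
class PeAddress {
    // RVA = virtual address minus the actual load address (image base).
    static long rvaOf(long virtualAddress, long imageBase) {
        return virtualAddress - imageBase;   // e.g. 0x401000 - 0x400000 = 0x1000
    }

    // VA = actual load address plus the RVA (the reverse operation).
    static long vaOf(long rva, long imageBase) {
        return imageBase + rva;
    }
}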

Importing functions: when we use code or data from another DLL, we are importing it. When any PE file loads, one of the jobs of the Windows loader is to locate all the imported functions and data and make those addresses available to the file being loaded. The third part is the PE file layout, shown in Figure 1. There is an image page in the PE file layout [7], and this image page is suggested as the place to hide a watermark; the size of the image page differs from one file to another [7]. The most important motivation behind this framework is that programmers often need to create a back door for the applications they develop, as a solution to problems such as a forgotten password. This leads customers to feel that programmers have the ability to hack their systems at any time.

As a result, customers tend to employ only trusted programmers to build their applications, while programmers want their applications to be safe anywhere, without having to build such relations of trust with their customers. In this framework a solution is suggested for this problem [6],[8]: hide the password in the executable file of the framework itself, so that it can later be retrieved by the customer himself. Steganography requires knowledge of file formats in order to find a way to hide information in them. This is difficult because there is a very large number of file formats, and some of them offer no way to hide information [8].

Figure 1. Typical 32-bit Portable .EXE File Layout

III. CRYPTOGRAPHY

A. Block Cipher

In cryptography, a block cipher is a symmetric-key cipher which operates on fixed-length groups of bits, termed blocks, with an unvarying transformation. When encrypting, a block cipher might take, for example, a 128-bit block of plaintext as input and output a corresponding 128-bit block of ciphertext. The exact transformation is controlled by a second input, the secret key. Decryption is similar: the decryption algorithm takes, in this example, a 128-bit block of ciphertext together with the secret key and yields the original 128-bit block of plaintext. To encrypt messages longer than the block size (128 bits in the above example), a mode of operation is used. Block ciphers can be contrasted with stream ciphers.


A stream cipher operates on individual digits one at a time, and the transformation varies during the encryption. The distinction between the two types is not always clear-cut: a block cipher, when used in certain modes of operation, acts effectively as a stream cipher, as shown in Figure 2 [8].

Figure 2. Encryption and decryption

An early and highly influential block cipher design is the Data Encryption Standard (DES). DES is a cipher (a method for encrypting information) selected as an official Federal Information Processing Standard (FIPS) for the United States in 1976, which subsequently enjoyed widespread use internationally. The algorithm was initially controversial, with classified design elements, a relatively short key length, and suspicions about a National Security Agency (NSA) backdoor. DES consequently came under intense academic scrutiny and motivated the modern understanding of block ciphers and their cryptanalysis. DES is now considered insecure for many applications, chiefly because the 56-bit key size is too small; DES keys have been broken in less than 24 hours. There are also some analytical results which demonstrate theoretical weaknesses in the cipher, although they are infeasible to mount in practice. The algorithm is believed to be practically secure in the form of Triple DES, although there are theoretical attacks [1][8]. In recent years, the cipher has been superseded by the Advanced Encryption Standard (AES).

B. Advanced Encryption Standard

The Advanced Encryption Standard (AES) and Triple DES (TDES or 3DES) are commonly used block ciphers; whether you choose AES or 3DES depends on your needs. In this section we highlight their differences in terms of security and performance [3]. Since 3DES is based on the DES algorithm, we discuss DES first. DES was developed in 1977 and was carefully designed to work better in hardware than in software: it performs many bit manipulations in substitution and permutation boxes in each of its 16 rounds, and, for example, switching bit 30 with bit 16 is much simpler in hardware than in software. DES encrypts data in 64-bit blocks and effectively uses a 56-bit key. A 56-bit key space amounts to approximately 72 quadrillion possibilities; even though this seems large, given today's computing power it is not sufficient, and DES is vulnerable to brute-force attack. Therefore, DES could not keep up with advances in technology and is no longer appropriate for security.

Because DES was widely used at the time, the quick solution was to introduce 3DES, which is secure enough for most purposes today. 3DES is a construction applying DES three times in sequence. 3DES with three different keys (K1, K2 and K3) has an effective key length of 168 bits (the use of three distinct keys is recommended for 3DES). Another variation, called two-key 3DES (where K1 and K3 are the same), reduces the effective key size to 112 bits, which is less secure; two-key 3DES is widely used in the electronic payments industry. 3DES takes three times as much CPU power as its predecessor, a significant performance hit. AES outperforms 3DES both in software and in hardware [8]. The Rijndael algorithm was selected as the Advanced Encryption Standard to replace 3DES; AES is a modified version of the Rijndael algorithm. The AES evaluation criteria included, among others (Seleborg, 2004):
• security;
• software and hardware performance;
• suitability in restricted-space environments;
• resistance to power analysis and other implementation attacks.
Rijndael was submitted by Joan Daemen and Vincent Rijmen. Taken together, Rijndael's combination of security, performance, efficiency, implementability and flexibility made it an appropriate selection for the AES. By design AES is fast in software and works efficiently in hardware; it works fast even on small devices such as smart phones and smart cards. AES provides more security due to its larger block size and longer keys: it uses a fixed 128-bit block size and works with 128-, 192- and 256-bit keys. The Rijndael algorithm in general is flexible enough to work with key and block sizes of any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits. AES is the replacement for 3DES; according to NIST, both ciphers will coexist until the year 2030, allowing for a gradual transition to AES. Even though AES has a theoretical advantage over 3DES in speed and efficiency, in some hardware implementations 3DES may be faster where support for 3DES is mature [1][2][5].
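Since the framework (Section V) relies on the encryption facilities built into Java, AES encryption and decryption of the secret information might look roughly like the sketch below. This is a generic javax.crypto (JCE) example of ours, not the authors' exact code; key handling and IV transport are deliberately simplified.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.security.SecureRandom;

// Generic sketch of AES in Java; not the exact code of the proposed framework.
class AesSketch {
    static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);                       // 128-bit key, as discussed above
        return kg.generateKey();
    }

    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return c.doFinal(plaintext);        // ciphertext block(s)
    }

    static byte[] decrypt(SecretKey key, byte[] iv, byte[] ciphertext) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        return c.doFinal(ciphertext);       // original plaintext
    }

    static byte[] freshIv() {
        byte[] iv = new byte[16];           // AES block size is 16 bytes
        new SecureRandom().nextBytes(iv);
        return iv;
    }
}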

IV. STEGANOGRAPHY

A. General Steganography Framework

A general Steganography framework is shown in Figure 3. It is assumed that the sender wishes to send, via Steganographic transmission, a message to a receiver. The sender starts with a cover message, an input to the stego-system in which the embedded message will be hidden; the hidden message is called the embedded message. A Steganographic algorithm combines the cover message with the embedded message, which is what is to be hidden in the cover. The algorithm may or may not use a Steganographic key (stego key), which is additional secret data that may be needed in the hiding process. The same key (or a related one) is usually needed to extract the embedded message again.


The output of the Steganographic algorithm is the stego message. The cover message and stego message must be of the same data type, but the embedded message may be of another data type. The receiver reverses the embedding process to extract the embedded message [4].

Figure 3: General Steganography Framework
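The framework of Figure 3 can be summarized as a pair of operations, embedding and extraction; the interface below is a hypothetical Java formalization of ours, not an API defined by the paper.

// Hypothetical interface capturing the general framework of Figure 3.
interface StegoSystem {
    // Combine cover and embedded message (optionally under a stego key)
    // to produce the stego message, which has the cover's data type.
    byte[] embed(byte[] cover, byte[] embeddedMessage, byte[] stegoKey);

    // Reverse the embedding to recover the embedded message.
    byte[] extract(byte[] stegoMessage, byte[] stegoKey);
}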

B. Distortion Techniques

Distortion techniques (DT) require knowledge of the original cover in the decoding process. The sender applies a sequence of modifications to the cover in order to obtain the stego file, and the sequence of modifications is chosen in such a way that it corresponds to the specific secret message to be transmitted [5]. The receiver measures the differences from the original cover in order to reconstruct the sequence of modifications applied by the sender, which corresponds to the secret message. An early approach to hiding information is hiding it in text; most text-based hiding methods are of the distortion type (i.e., the arrangement of words or the layout of a document may reveal information). One technique is to modulate the positions of lines and words. Adding spaces and "invisible" characters to text provides another method of passing hidden information; HTML files are good candidates, as they tolerate extra spaces, tabs, and line breaks. The executable file includes the image page, which makes this cover suitable for this technique [5].
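As a toy illustration of a distortion-style text technique, hidden bits can be modulated onto trailing whitespace, one bit per line (a space for 0, a tab for 1). This sketch is ours and is not part of the proposed framework; as with any distortion technique, decoding requires comparison with the unmodified original.

// Toy distortion-style text hiding: append a space (bit 0) or a tab (bit 1)
// to each line of the cover text. The layout change encodes the message.
class WhitespaceStego {
    static String embedBits(String coverText, boolean[] bits) {
        String[] lines = coverText.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            out.append(lines[i]);
            if (i < bits.length) out.append(bits[i] ? '\t' : ' ');
            if (i < lines.length - 1) out.append('\n');
        }
        return out.toString();
    }
}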

V. METHODOLOGY

A. Framework Concept

The concept of this framework can be summarized as hiding the password, or any other information, beyond the end of an executable file, so that there is no function or routine (open-file, read, write, close-file) in the operating system that will extract it. This operation can be performed in two alternative ways; the first is to build the file-handling procedure independently of the operating system's file-handling routines, in which case we need to cancel the existing file-handling routines and develop new functions, with the same names, that perform what we need.

The advantage of this method is that it does not need any additional functions that could be identified by analysts, and it can be executed remotely, making it suitable for network and Internet applications. Its disadvantage is that it needs to be installed (it cannot be operated remotely). We chose this concept for the implementation in this paper.

B. Framework Features

This framework has the following features:

• The hiding operation within the image page of the EXE file increases the degree of security of the hiding technique used in the proposed framework, because the size of the cover file does not change, so an attacker cannot attack the hidden information.

• It is very difficult to extract the hidden information, and difficult even to detect that information is hidden, for three reasons:
o The information is encrypted before hiding using the AES method, which is very strong: breaking a 128-bit key would, in theory, be within the range of a military budget only within 30-40 years. An illustration of the current status of AES is given by the following example, where we assume an attacker with the capability to build or purchase a machine that tries keys at the rate of one billion keys per second; this is at least 1,000 times faster than the fastest personal computer in 2004. Under this assumption, the attacker would need about 10 000 000 000 000 000 000 000 years to try all possible keys for the weakest version.

o It is impossible for an attacker to guess that information is hidden inside the EXE file, because the real sizes of the EXE file and the hidden information cannot be guessed.

o The hidden information must still be decrypted after it is retrieved.

C. The Proposed Framework Structure

To protect the hidden information from retrieval, the framework encrypts the information with the built-in encryption algorithm provided by Java. The framework algorithm for the hiding operation procedure is shown in Figure 4, and the framework algorithm for the retraction operation procedure is shown in Figure 5.

Figure 4. Framework algorithm for the hiding operation

Figure 5. Framework algorithm for the retraction operation

A minimal sketch of the two operations follows.
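The sketch below illustrates the hide and retract steps, assuming the encrypted payload is written in place into a region of the cover EXE that the loader ignores, so that the file size is unchanged. It is our own hypothetical Java sketch: SLACK_OFFSET is a placeholder, and a real implementation would have to parse the PE headers (Section II) to locate the image page.

import java.io.RandomAccessFile;

// Minimal sketch of the hide/retract steps; SLACK_OFFSET is hypothetical.
class HideRetractSketch {
    static final long SLACK_OFFSET = 0x8000;    // placeholder in-file location

    static void hide(String exePath, byte[] encryptedPayload) throws Exception {
        try (RandomAccessFile f = new RandomAccessFile(exePath, "rw")) {
            f.seek(SLACK_OFFSET);
            f.write(encryptedPayload);          // overwrite in place: size unchanged
        }
    }

    static byte[] retract(String exePath, int length) throws Exception {
        byte[] out = new byte[length];
        try (RandomAccessFile f = new RandomAccessFile(exePath, "r")) {
            f.seek(SLACK_OFFSET);
            f.readFully(out);                   // read back the embedded bytes
        }
        return out;
    }
}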

VI. CONCLUSION

One of the important conclusions from the implementation of the proposed framework is the solution of the problems related to a change in the size of the cover file: the hiding method makes the relation between the cover and the message independent of any change to the cover file, and the encryption of the message increases the degree of security of the hiding technique used in the proposed framework. The structure of PE files is very complex, because they depend on multiple headers and addressing, and inserting data into PE files without a full understanding of their structure may damage them; the choice here is therefore to hide the information beyond the structure of these files. The framework has achieved its main goal: it makes the relation between the size of the cover file and the size of the information independent, without changing the size of the cover file. There is no change in the cover file size when hiding a file in the image page within the portable executable file, by exploiting the structural properties of the EXE file. The proposed framework is implemented using Java.

VII. FUTURE WORK

There are many suggestions for improving the proposed framework; the main ones are:
• Improving the security of the hiding framework by adding a compression function for the message before the hiding operation.
• Improving the security of the proposed framework by replacing the encryption method with other methods, such as MD5, BLOWFISH, digital signatures or key distribution.

REFERENCES
[1] A.A. Zaidan, B.B. Zaidan, Fazidah Othman, "New Technique of Hidden Data in PE-File within Unused Area One", International Journal of Computer and Electrical Engineering (IJCEE), Vol.1, No.5, ISSN: 1793-8198, pp. 669-678.

[2] A.A.Zaidan, B.B.Zaidan, M.M.Abdulrazzaq, R.Z.Raji, and S.M.Mohammed," Implementation Stage for High Securing Cover-File of Hidden Data Using Computation Between Cryptography and Steganography", International Conference on Computer Engineering and Applications (ICCEA09), Telecom Technology and Applications (TTA), Vol.19, Session 6, p.p 482-489, ISBN: 978-1-84626-017-9,June 6 (2009), Manila, Philippines

[3] Alaa Taqa, A.A Zaidan, B.B Zaidan ,“New Framework for High Secure Data Hidden in the MPEG Using AES Encryption Algorithm”, International Journal of Computer and Electrical Engineering (IJCEE),Vol.1 ,No.5, ISSN: 1793-8198, pp.589-595 .

[4] A.W.Naji, A.A.Zaidan, B.B.Zaidan, Shihab A, Othman O. Khalifa, “ Novel Approach of Hidden Data in the (Unused Area 2 within EXE File) Using Computation Between Cryptography and Steganography ”, International Journal of Computer Science and Network Security (IJCSNS) , Vol.9, No.5 , ISSN : 1738-7906, pp. 294-300.

[5] A.W. Naji, A.A. Zaidan, B.B. Zaidan, Ibrahim A.S. Muhamadi, "Novel Approach for Cover File of Hidden Data in the Unused Area Two within EXE File Using Distortion Techniques and Advance Encryption Standard", Academic and Scientific Research Organizations (WASET), International Conference on Computer, Electrical, and Systems Science, and Engineering (CCESSE09), ISSN: 2070-3724, 26-28.

[6] A.W. Naji, A.A. Zaidan, B.B. Zaidan, Ibrahim A.S. Muhamadi, "New Approach of Hidden Data in the portable Executable File without Change the Size of Carrier File Using Distortion Techniques", Academic and Scientific Research Organizations (WASET), International Conference on Computer, Electrical, and Systems Science, and Engineering (CCESSE09), ISSN: 2070-3724.

[7] B.B.Zaidan, A.A.Zaidan, Fazidah Othman, R.Z.Raji, S.M.Mohammed, M.M.Abdulrazzaq, “Quality of Image vs. Quantity of Data Hidden in the Image”, International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV'09), July 13-16 (2009), Las Vigas, Nevada, USA.

[8] B.B.Zaidan, A.A.Zaidan, Fazidah. Othman, Ali Rahem,“ Novel Approach of Hidden Data in the (Unused Area 1 within EXE File) Using Computation Between Cryptography and Steganography ”, Academic and Scientific Research Organizations (WASET), International Conference on Cryptography, Coding and Information Security (ICCCIS09), Vol.41, Session 24, ISSN: 2070-3740.

AUTHORS PROFILE

Dr. Ahmed Wathik Naji obtained his 1st Class Master degree in Computer Engineering from University Putra Malaysia, followed by a PhD in Communication Engineering, also from University Putra Malaysia. He has supervised many postgraduate students and led many funded research projects, with more than 50 international papers. He has more than 10 years of industrial and educational experience. He is currently Senior Assistant Professor, Department of Electrical and Computer Engineering, International Islamic University Malaysia, Kuala Lumpur, Malaysia.

Aos Alaa Zaidan - He obtained his 1st Class Bachelor degree in Computer Engineering from the University of Technology, Baghdad, followed by a Master in data communication and computer networks from the University of Malaya. He has led or been a member of many funded research projects and has published more than 40 papers at various international and national conferences and journals. He has done many projects on steganography for data hiding through different multimedia carriers (image, video, audio, text) and the non-multimedia carrier of unused areas within EXE files, as well as on quantum cryptography and stego-analysis systems; currently he is working on a multi-module for steganography. He is a PhD candidate in the Department of Computer System & Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia.

Bilal Bahaa Zaidan - He obtained his Bachelor degree in Mathematics and Computer Application from Saddam University, Baghdad, followed by a Master from the Department of Computer System & Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia. He has led or been a member of many funded research projects and has published more than 40 papers at various international and national conferences and journals. His research interest is in steganography and cryptography; with his group he has published many papers on data hiding through different multimedia carriers such as image, video, audio and text, and non-multimedia carriers such as unused areas within EXE files. He has done projects on stego-analysis systems and is currently working on Quantum Key Distribution (QKD) and a multi-module for steganography. He is a PhD candidate in the Department of Computer System & Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia.

Dr. Shihab A. Hameed - He obtained his PhD in Software Engineering from UKM. He has three decades of industrial and educational experience. His research interest is mainly in software engineering, software quality, surveillance and monitoring systems, health care and medication. He has supervised numerous funded projects and has published more than 60 papers at various international and national conferences and journals. He is currently a Senior Assistant Professor in the Department of Electrical and Computer Engineering, International Islamic University Malaysia, Kuala Lumpur.

Dr. Teddy Surya Gunawan - He received his B.Eng degree in Electrical Engineering with a cum laude award from Institut Teknologi Bandung (ITB), Indonesia in 1998. He obtained his M.Eng degree in 2001 from the School of Computer Engineering at Nanyang Technological University, Singapore, and his PhD degree in 2007 from the School of Electrical Engineering and Telecommunications, The University of New South Wales, Australia. His research interests are in speech and image processing, biomedical signal processing, and parallel processing. He is currently an Assistant Professor and Academic Advisor in the Department of Electrical and Computer Engineering, International Islamic University Malaysia.

Othman O. Khalifa received his Bachelor's degree in Electronic Engineering from Garyounis University, Libya, in 1986. He obtained his Master degree in Electronics Science Engineering and his PhD in Digital Image Processing from Newcastle University, UK, in 1996 and 2000 respectively. He worked in industry for eight years and is currently a Professor and Head of the Department of Electrical and Computer Engineering, International Islamic University Malaysia. His areas of research interest are communication systems, information theory and coding, digital image/video processing, coding and compression, wavelets, fractals and pattern recognition. He has published more than 130 papers in international journals and conferences. He is a Senior IEEE member and a member of the IEEE Computer, Image Processing and Communication Societies.

Dr. Wajdi Fawzi Al-Khateeb - He received his PhD from the International Islamic University Malaysia and his MSc from the Technical University of Berlin, Germany. His research interest is mainly in reliability engineering, fault-tolerant systems, QoS networking, and microwave radio links. He is currently an Assistant Professor in the Department of Electrical and Computer Engineering, International Islamic University Malaysia.


A Secure Multi-Party Computation Protocol for Malicious Computation Prevention for Preserving Privacy during Data Mining

Dr. Durgesh Kumar Mishra
Professor (CSE) and Dean (R&D), Acropolis Institute of Technology & Research, Indore, MP, India
[email protected]

Neha Koria, Nikhil Kapoor, Ravish Bahety
Computer Science Dept. (R&D), Acropolis Institute of Technology & Research, Indore, MP, India
[email protected], [email protected], [email protected]

ABSTRACT - Secure Multi-Party Computation (SMC) allows parties with similar backgrounds to compute results upon their private data while minimizing the threat of disclosure. The exponential increase in sensitive data that must be passed between networked computers and the stupendous growth of the internet have opened vast opportunities for cooperative computation, where parties come together to facilitate computations and draw out mutually beneficial conclusions while aspiring to keep their private data secure. These computations are generally required to be done between competitors, who are understandably wary of each other's intentions. SMC caters not only to the needs of such parties but also provides plausible solutions to individual organizations for problems like privacy-preserving database query, privacy-preserving scientific computation, privacy-preserving intrusion detection and privacy-preserving data mining. This paper is an extension of a previously proposed protocol, Encrypto_Random, which presented a simple yet effective approach to SMC and also put forward an aptly crafted architecture whereby such an efficient protocol, involving the parties that have come forward for joint computation and the third party who undertakes such computation, can be developed. Through this extended work an attempt has been made to further strengthen the existing protocol, thus paving the way for a more secure multi-party computational process.

KEYWORDS - Complexity, Encryption, Decryption, Encrypto_Random, Extended Encrypto_Random, Pool of functions, Random Dissemination, Secure Multi-Party Computation (SMC), Trusted Third Party (TTP).

1. INTRODUCTION

SMC is a problem that has attracted the attention of scholars and industry for quite some time. Although a vast amount of work has been done on the subject, practical implementation of these efforts has remained stubbornly difficult. Having said that, it should be acknowledged that computing results upon data whose source is not known is no trivial task, and the works undertaken so far have served a great purpose in enlightening the community about the subtleties of this so-called SMC problem.

Motivated by the intention of solving the SMC problem, we proposed a new protocol, Encrypto_Random, through which we put forward what we perceived to be the most appropriate and plausible solution to the SMC conundrum. The methodology followed was quite elementary yet very comprehensible. Encrypto_Random worked on a two-layer basis; it consisted of the parties (1st layer) who aspire to draw out a result collectively and, being apprehensive of each other's intentions, appoint an assumedly unbiased third party (2nd layer) to carry out the computation and announce the result.

In Extended Encrypto_Random the domain of the 2nd layer has been extended from a single third party to multiple third parties, from whom a single entity is chosen at run time and given the responsibility of performing the required computation. A proposal sounds overtly hyperbolic without a thorough layout of the architecture to implement it; thus, we also present a meticulously worked-out architecture to realize the protocols and to answer the pertinent queries that are bound to arise in the minds of the audience. The modus operandi of the protocol deters the bodies involved from exhibiting any malicious conduct by presenting thoroughly planned impediments in the path of the transfer of data among themselves. The security of the information of the parties is of utmost importance in any approach seeking to solve the SMC enigma. In our protocols we have taken adequate precautions so as to guarantee the security of the data of the involved parties. Instead of sending entire data blocks, the parties break them into packets and randomly distribute these amongst themselves a stipulated number of times. Provisions have been made to ensure that the parties do not get to know whose data packets they are forwarding and, in stark contrast, the third party does not have even a Lilliputian hint as to whose data packet a particular party is sending. This necessitates a secure channel to transfer the data packets, which has been dealt with in the deftly formed and apposite architecture. To further conceal the identity of the data packets we apply an encrypting function upon them; these encrypting functions also reach the third party through the same path and are used to decode the packets and rearrange them into data blocks.

2. BACKGROUND

SMC came to the forefront as the Millionaires' Problem described by Yao in [8]. U. Maurer considered general SMC protocols [7]. For specific tasks like online auctions, public voting or online updating of data there exist very efficient and effective protocols; general SMC protocols are less effective than such special-purpose protocols. Maurer also defined the different types of security in databases [5]. Privacy-preserving data mining using SMC has great importance, and many applications have been developed [9, 14]. Du et al. reviewed various industrial problems and listed them in [1, 3]. Some of the existing protocols take the form of circuit-evaluation protocols or encryption with homomorphic schemes. The first general constant-round protocol for secure two-party computation was given by Yao [16]. Yao's original protocol considered only the case of semi-honest parties; an extension to the case of malicious parties was given by Lindell [17]. Goldreich et al. showed the existence of a secure solution to the SMC problem [6]. The size of such protocols depends upon the number of parties involved in the computation process.

A new concept was put forward by D.K. Mishra and M. Chandwani [18] through their multi-layer protocols. Initially a two-layer protocol, along with a tentative architecture for its implementation, was proposed. This two-layer protocol was improved upon by a three-layer protocol in which an anonymizer layer was added between the participating parties and the third party. This new layer hid the information of the parties from the third party, who computes upon the data and provides the result. In the next paper [19], this three-layer protocol was further extended into a four-layer protocol in which a packet layer was introduced. This new concept protected the data from most malicious activity, even if the third party is not a trusted one.

The magnitude and the complexity of this protocol present an unusual paradox. On one hand, owing to its compounded and uncanny nature, the protocol prevents most unscrupulous activities; on the other hand, its intricate technicalities make it very difficult to actually implement. Another disadvantage of this protocol is that the anonymizer layer has been assumed to be incorruptible; if it becomes malicious, then many information leaks can occur.

To solve such problems we proposed a new protocol, Encrypto_Random [20], which involved only two layers: the 1st layer consisted of the parties that wish to compute results, and the 2nd layer was made up of the Trusted Third Party (TTP) that computes the result for these parties. Here we put forward an extension with the aim of further strengthening the existing protocol. We consider the third party trusted in the sense that it computes the results correctly.

3. INFORMAL DESCRIPTION

3.1 Encrypto_Random

The previous attempts to solve the SMC problem were either so simple or gullible in their outlook that it was quite easy for the various parties involved in the computation process to leak the data, or too complicated to be realized in the real world.

Our protocols dare to rectify the fallacies of these preceding works by proposing a new scheme for solving the SMC problem. The main aspect of SMC is the computation of data in a secure and private manner, so the handling of the data is the major concern. Keeping this vital point of SMC in mind, we put forward a simple but effective two-layer protocol. The layout of our protocol is straightforward, which makes it easy to implement, but we have put various checks at subsequent levels so as to ensure the security of the data of the participating parties. Encrypto_Random works on the simple premise that 'n' parties decide to coordinate and thus need to put forward their data, obviously aspiring not to give undue advantage to their competitors by revealing their information. They therefore appoint a third party to compute upon their collective data and announce the result publicly. Encrypto_Random exhibits a two-layer architecture, with the activation of each layer an alternating phenomenon. At the 1st layer the Parties (who wish to perform the computations) are active, while at the 2nd layer the Third Party (who computes the result for the parties) is active. The parties break their data blocks into packets (the number of packets formed by each party is fixed, as decided by the TTP). Then an encrypting function is used by each party to encode its data packets. This encrypting function is drawn randomly by each party from a pool maintained by the Third Party. These encrypted packets are then randomly sent to other participating parties. This random distribution is synchronized, and is realized by using a random function which can be initiated by any of the participating parties at run time. This process of random dissemination of the data packets is carried out many times, which further guarantees the confidentiality of the data blocks. After a certain number of these distributions, the data packets are sent to the third party, who has the resources to store all the incoming data packets. The TTP then decodes the received data packets by using the pool of encrypting functions maintained by it, consequently rearranging them into whole data blocks. Now the computations are carried out as per the collective requirements of the parties and the result is announced publicly.

3.2 Extended Encrypto_Random

Encrypto_Random gives an efficient and secure way of carrying out multi-party computations, and requisite measures were taken to check the flow of data. Still, a certain amount of hesitancy creeps into the mind of the observer that too much information is being leveraged upon the TTP. To take care of such doubts we here propose an extension to our existing work, wherein the computational layer consists of not one but multiple third parties. One of these many TTPs is chosen at run time, and all data packets are then forwarded by the parties on the 1st layer to this TTP. This TTP then rearranges the data packets into data blocks and consequently calculates and announces the result. The choice of the TTP is made on a random basis and is undertaken by another randomization function built into the architecture of the protocol.

As the TTP is not predetermined and is unknown until the time of the computation, the possibility of joint malicious conduct by the parties and the TTP decreases. Also, since most of the activities take place at run time, the protocol guards against unscrupulous activities (joint or individual) by the various entities as well. The encrypting functions are selected from a pool maintained by each TTP, so the parties do not individually need to inform the TTP which function they have used. The data packets are distributed randomly amongst the parties, several times over, so that when the parties finally forward the data each forwards a combination of various parties' packets and the identity of each packet's owner is hidden.

Special care has been taken to keep the process very straightforward for the parties. The chosen TTP, however, has to undertake a great deal of effort to decrypt the data packets and to rearrange them back into whole data blocks. But this is a small price to pay for maintaining the secrecy of information, which is of utmost importance for every organization.

4. ASSUMPTIONS

The following assumptions have been stipulated:

1. The TTP computes the result from the data provided to it by the parties.

2. It does so by using a function, for which it brings into effect a function generator.

3. The TTP has the ability to announce the result of the computation publicly, although this will not be desirable in most cases.

4. Each party having an input can communicate over a trusted network connection.

5. The communication channels used by the input-providing parties to communicate with the third party are secure; no intruder can intercept the data transferred between them.

6. The functions given to the domain of the parties are not the same; they are different.

7. A minimum of three parties should be involved in the SMC.

8. The number of packets generated by each party is the same, as decided by the TTP.

9. The packet sizes of the parties are equal, as decided by the TTP.

10. The TTP has the resources to store all the incoming data packets and the encoded functions.

5. ARCHITECTURE

5.1 Encrypto_Random:

Figure 1. The Existing Architecture

Figure 1 depicts the simple architecture of Encrypto_Random. The 1st layer, i.e. the Input Layer, comprises all the parties involved in the computation process. Since all these parties are inter-connected, the data packets of the respective parties are randomly distributed amongst them a stipulated number of times. Thus, when the parties forward the data packets, the Third Party receives data packets that belong to some party other than the one who forwarded them. This ensures that the identity of the parties is hidden in terms of which data packets belong to them.

The Third Party exists at the 2nd layer, i.e. the Computation Layer, where the computations are carried out. After receiving the data packets from the parties (at layer 1), the Third Party rearranges them using the Pool of Functions. Once the data packets are reassembled, the computation is carried out and the result is obtained. It is a simple architecture, but it provides various checks that keep in mind the fulfillment of all the constraints of SMC.

5.2 Extended Encrypto_Random:

Figure 2. The Proposed Architecture

Figure 2 depicts the extended architecture of our protocol. Instead of a single TTP, the computational layer (2nd layer) consists of a pool of TTPs. Each TTP has the same pool of functions, which were used previously by the parties (1st layer) to encrypt their data packets. One of these many TTPs is chosen at run time and all data packets are forwarded to it, making it responsible for the regrouping of data packets into data blocks and the subsequent required calculations.

6. FORMAL DESCRIPTION

Extended Encrypto_Random

The extended protocol, like its predecessor, is based on the simple pretext of a group of parties (P1, P2, ..., Pn) who assign the task of computation to a TTP. Each party Pn breaks its data block into a number of packets (PnKr) and applies an encrypting function (F1, F2, ..., Fn) chosen from the pool of functions (D). This breaking of the data blocks into packets and the application of the encrypting functions is facilitated by provisions already incorporated in the architecture of the protocol. These encrypted packets (Snr = PnKr + VrFn) are randomly distributed among the parties. A TTP is chosen at run time using a randomization function (Rf). All data is sent to this TTP, who then decodes the data packets, rearranges them into whole data blocks, computes, and announces the result publicly.

Algorithm Extended Encrypto_Random:

1. Define P1, P2, ..., Pn as parties;
2. Define D as the function pool;
3. Define F1, F2, ..., Fn as encrypting functions;
4. For party P1 to Pn do
   begin
      Break the data block into packets (PnK1, PnK2, ..., PnKr);
         /* K1, K2, ..., Kr designate the packets of a party */
      Select a function from pool D;
      Attach the encrypting value of the function to each packet;
      Compute Snr = PnKr + VrFn;
         /* V1, V2, ..., Vr are the values of function Fn */
   end;
5. For r = 1 to s do
   begin
      Send Snr randomly to Pn;
   end;
6. Repeat step 5 n times.
7. Select the TTP using Rf.
8. For P1 to Pn do
   begin
      Send Snr to the TTP;
   end;
9. The TTP decodes Snr using Fn from D and rearranges PnKr into data blocks;
10. The TTP computes and announces the result;
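For concreteness, here is a minimal, runnable Python simulation of the flow above. The toy XOR "encrypting functions", the blinded block identifiers, and the party/TTP names are illustrative assumptions, since the protocol leaves these details abstract:

    import random

    FUNCTION_POOL = {f"F{i}": 3 + 7 * i for i in range(1, 6)}  # pool D with toy values V
    TTP_POOL = ["TTP1", "TTP2", "TTP3", "TTP4"]
    PACKETS_PER_PARTY = 4                                      # fixed by the TTP

    def break_and_encode(block, blind_id):
        """Split a block into packets and apply a pool function (toy Snr = PnKr XOR V)."""
        fn = random.choice(list(FUNCTION_POOL))
        size = -(-len(block) // PACKETS_PER_PARTY)             # ceiling division
        chunks = [block[i:i + size] for i in range(0, len(block), size)]
        return [(blind_id, r, fn, bytes(b ^ FUNCTION_POOL[fn] for b in chunk))
                for r, chunk in enumerate(chunks)]

    blocks = [b"sales figures", b"payroll data", b"client list"]  # one block per party
    packets = [p for i, blk in enumerate(blocks) for p in break_and_encode(blk, i)]
    for _ in range(len(blocks)):                               # repeated random dissemination
        random.shuffle(packets)

    ttp = random.choice(TTP_POOL)                              # Rf: TTP picked at run time
    rebuilt = {}
    for blind_id, r, fn, data in packets:                      # chosen TTP decodes, regroups
        rebuilt.setdefault(blind_id, {})[r] = bytes(b ^ FUNCTION_POOL[fn] for b in data)
    for blind_id in sorted(rebuilt):
        parts = rebuilt[blind_id]
        print(ttp, b"".join(parts[r] for r in sorted(parts)))

In this sketch the function name tag travels with each packet so the TTP knows which pool entry to invert; in the protocol proper this association is mediated by the architecture rather than carried in the clear.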

7. ANALYSIS AND PERFORMANCE

Extended Encrypto_Random

Case 1: Joint malicious conduct by certain Parties.

The parties receive the data packets PnKr and the function values VrFn. At a given instant, if m parties get together, they will have the following information at their disposal: (1) m * Xr packets, where Xr is the number of packets generated by a single party, as fixed by the TTP; and (2) m encrypting functions. This information is useless to them, because they will not have the encrypting functions of the other parties needed to reconstruct whole blocks of data. Moreover, it is highly improbable that these m parties receive all the packets of a given party, making the information available to them redundant.


Case 2: The chosen TTP becomes malicious.

The TTP receives the packets PnKr and the function values VrFn; it also has the pool of random functions. At a given instant it has the set (PnKr, VrFn, pool of functions). The TTP can assemble the data packets to form whole data blocks, but in no case can it relate any data block thus formed to a certain party. Further, if there are a large number of participating parties (which can be assumed to be the real-world scenario at which our protocol is aimed), it becomes nearly impossible to decisively relate a particular data block to a particular party. This is also evident from the probability curve, which falls off as the inverse square of the number of parties and tends to 0 as the number of parties increases.

Case 3: Joint malicious conduct by some Party and the TTP.

In the previous protocol there was a predetermined TTP, so there was a possibility of some party goading the TTP into corrupt behavior and consequently gaining undue advantage when the TTP revealed the regrouped data blocks. This problem is taken care of in this extended work, as the TTP is decided only at run time by the protocol. It thus becomes virtually impossible for any party motivated by vindictive intent to collaborate with the as-yet-unknown TTP in joint malicious conduct.

8. PROBABILISTIC EVIDENCE

Let P(xn) be the probability that a party turns malicious, where n is the number of parties.

Thus, P(x1) = P(x2) = P(x3) = P(x4) = . . . =P(xn) = 1/n (equal for all parties).

If at a given instant a party Pr (say) intends to leak the information of the other parties, the probability of the number of packets that party can decrypt is given by

  (number of packets of that party) / (total number of packets) = Xr / (Σ_{r=1}^{n} Xr)

Therefore, the probability that one party exhibits malicious conduct is

  [1/n] * [Xr / (Σ_{r=1}^{n} Xr)]

Now suppose r parties become malicious. The probability of a leak of the packets is then

  [r/n] * [(Σ_{r=1}^{r} Xr) / (Σ_{r=1}^{n} Xr)]

Also, let P(tm) be the probability that the chosen TTP turns malicious, where m is the total number of TTPs out of which one is chosen.

Thus, P(t1) = P(t2) = P(t3) = P(t4) = ... = P(tm) = 1/m (equal for all TTPs).

Therefore, the total probability of a leak of the packets is given by

  [1/m] * [r/n] * [(Σ_{r=1}^{r} Xr) / (Σ_{r=1}^{n} Xr)]
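As a quick numeric illustration of the expression above (a sketch, assuming every party generates the same fixed number of packets per assumptions 8 and 9, so the packet ratio reduces to r/n):

    def leak_probability(n: int, m: int, r: int) -> float:
        """Total P(leak) = (1/m) * (r/n) * (r*X / (n*X)) = r^2 / (m * n^2)."""
        return (1.0 / m) * (r / n) * (r / n)

    # With m = 4 TTPs and a single malicious party (r = 1), the probability
    # decays as 1/(m * n^2), matching the curve in Figure 3.
    for n in range(2, 8):
        print(n, leak_probability(n=n, m=4, r=1))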

[Figure 3 plot: Total Probability of Malicious Conduct (y-axis, 0 to 1.2) versus No. of parties (x-axis, 1 to 7), with curves for Encrypto_Random and Extended Encrypto_Random.]

Figure 3. Total Probability for Leak of Data. Here n = number of parties coming together for joint computation and m = number of TTPs from which one is chosen. To construct the graph, Xr = 4 (the fixed number of packets made by each of the n parties, as decided by the TTP) and m = 4 have been taken as example values. Suppose one party out of n decides to decrypt its data packets; the probability that it succeeds in exhibiting such malicious conduct is (1/m) * (1/n²). The curve follows an inverse-hyperbolic path and substantially proves that, as the number of participating parties increases, the probability of malicious conduct and the subsequent leakage of data decreases, ultimately tending to zero when the parties are numerous.

9. CONCLUSION

Secure Multi-Party Computation is a well-researched topic. Quite a few protocols already exist, and work is ongoing on a handful of others. Through Extended Encrypto_Random we have endeavored to present a concept that emphasizes the need to keep the structure of the proposed solution very forthright, so as to avoid ambiguities, while at the same time ensuring the security of information by taking efficient and intricate measures. The data is first distributed and then sent forward, assuring that no party falls victim to sabotage by other parties and that no party gets undue privilege, as the sole responsibility of the computation process is not vested in a single entity. The encrypted nature of the data further hinders any possibility of spiteful conduct. The possibility of collaborative malefic behavior by some party and the TTP has been completely curbed by concealing the identity of the TTP until run time. Our protocol also reduces the complexities that are encountered in three- and four-layer protocols.

10. FUTURE SCOPE

The function domain is being further developed, and the transforming functions that leverage the proposed architecture in different areas are being fine-tuned. A subsequent enhancement of the protocol is expected in which, instead of making a single TTP responsible for calculating the result, multiple TTPs are given the same set of data packets, upon which each performs the same computations. The results are then compared to decide which of them agree; if more than half of the results are found identical, such a result can be authenticated (see the sketch below).
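A small sketch of the envisioned majority check, with hypothetical names; it accepts a result only when more than half of the TTPs agree:

    from collections import Counter

    def authenticated_result(results):
        """Accept a result only if more than half of the TTPs agree on it."""
        value, votes = Counter(results).most_common(1)[0]
        return value if votes > len(results) / 2 else None

    print(authenticated_result([42, 42, 42, 7]))   # -> 42 (3 of 4 agree)
    print(authenticated_result([1, 2, 3, 4]))      # -> None (no majority)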

REFERENCES

[1] W. Du and M. J. Atallah, “Secure Multi-Party Computation Problems and Their Applications: A Review and Open Problems.” CERIAS Tech Report 2001-51, Center for Education and Research in Information Assurance and Security and Department of Computer Sciences, Purdue University, West Lafayette, IN 47906, 2001.

[2] J. Vaidya and C. Clifton, “Leveraging the Multi in Secure Multi-Party Computation.” WPES'03, October 30, 2003, Washington DC, USA, ACM, 2003, pp 120-128.

[3] M. J. Atallah and W. Du, “Secure Multi-Party Computational Geometry.” Seventh International Workshop on Algorithms and Data Structures (WADS 2001), Providence, Rhode Island, USA, Aug 8-10, 2001, pp 136-152.

[4] W. Du and M. J. Atallah, “Privacy-Preserving Cooperative Scientific Computations.” In 14th IEEE Computer Security Foundations Workshop, Nova Scotia, Canada, June 11-13, 2001.

[5] U. Maurer, “The Role of Cryptography in Database Security.” SIGMOD 2004, Paris, France, June 13-18, 2004, pp 29-35.

[6] O. Goldreich, S. Micali, and A. Wigderson, “How to Play Any Mental Game: A Completeness Theorem for Protocols with Honest Majority.” 19th ACM Symposium on the Theory of Computing, 1987, pp 218-229.

[7] U. Maurer, “Secure Multi-Party Computation Made Simple.” Security in Communication Networks (SCN'02), G. Persiano (Ed.), Lecture Notes in Computer Science, Springer-Verlag, Vol. 2576, 2003, pp 14-28.

[8] A. C. Yao, “Protocols for Secure Computations.” In Proc. 23rd IEEE Symposium on the Foundations of Computer Science (FOCS), IEEE, 1982, pp 160-164.

[9] R. Agrawal, A. Evfimievski, and R. Srikant, “Information Sharing Across Private Databases.” SIGMOD 2003, San Diego, CA, June 9-11, 2003, pp 109-115.

[10] R. Canetti, “Security and Composition of Multi-Party Cryptographic Protocols.” Journal of Cryptology, Vol. 13, No. 1, 2000, pp 143-202.

[11] B. Pfitzmann, M. Schunter, and M. Waidner, “Secure Reactive Systems.” IBM Research Report RZ 3206, Feb 14, 2000.

[12] J. Vaidya and C. Clifton, “Privacy Preserving Association Rule Mining in Vertically Partitioned Data.” 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2002), pp 639-644.

[13] Y. Lindell and B. Pinkas, “Privacy Preserving Data Mining.” Advances in Cryptology, CRYPTO 2000, Springer-Verlag, Aug 20-24, 2000, pp 36-54.

[14] H. Kargupta, B. Park, D. Hershberger, and E. Johnson, “Collective Data Mining: A New Perspective Toward Distributed Data Mining.” In Advances in Distributed and Parallel Knowledge Discovery, AAAI/MIT Press, 1999.

[15] Ivan Damgård and Yuval Ishai, “Constant-Round Multiparty Computation Using a Black-Box Pseudorandom Generator.” Aug 10, 2005.

[16] A. C. Yao, “How to Generate and Exchange Secrets.” April 16, 2004.

[17] Yehuda Lindell, “Parallel Coin-Tossing and Constant-Round Secure Two-Party Computation.”

[18] D. K. Mishra and M. Chandwani, “Secure Multi-Party Computation Protocol using Ambiguous Identity,” accepted for publication.

[19] D. K. Mishra and M. Chandwani, “Extended Protocol for Secure Multi-party Computation using Ambiguous Identity.” WSEAS Transactions on Computer Research, Vol. 2, Issue 2, Feb 2007.

[20] Neha Koria, Nikhil Kapoor, Ravish Bahety, and D. K. Mishra, “Complexity Minimization in Secure Multi-Party Computation for Preserving Privacy during Data Mining.” Computing for National Development, INDIACOM'09, 3rd National Conference, 26-27 Feb 2009.

AUTHORS PROFILE

Dr. Durgesh Kumar Mishra

Biography: Dr. Durgesh Kumar Mishra received his M.Tech. degree in Computer Science from DAVV, Indore in 1994 and his PhD degree in Computer Engineering in 2008. Presently he is working as Professor (CSE) and Dean (R&D) at the Acropolis Institute of Technology and Research, Indore, MP, India. He has around 20 years of teaching experience and more than 5 years of research experience. He completed his research work with Dr. M. Chandwani, Director, IET-DAVV Indore, MP, India, in Secure Multi-Party Computation. He has published more than 60 papers in refereed international/national journals and conferences, including IEEE and ACM venues. He is a senior member of IEEE and Secretary of the IEEE MP Subsection under the Bombay Section, India. Dr. Mishra has delivered tutorials at IEEE international conferences in India as well as in other countries. He is also a programme committee member of several international conferences. He has visited and delivered invited talks in Taiwan, Bangladesh, the USA, the UK and elsewhere on Secure Multi-Party Computation and information security. He is the author of one book and a reviewer for three international journals of information security. He is Chief Editor of the Journal of Technology and Engineering Sciences. He has been a consultant to industries and government organizations such as the Sales Tax and Labour Departments of the Government of Madhya Pradesh, India.

Nikhil Kapoor (India) - He completed his BE in Computer Engineering and Science in 2009. He has published papers in refereed international/national journals and conferences.


Neha Koria (India) - She completed her BE in Computer Engineering and Science in 2009. She has published papers in refereed international/national journals and conferences.

Ravish Bahety (India) - He completed his BE in Computer Engineering and Science in 2009. He has published papers in refereed international/national journals and conferences.


Efficient methodology for implementation of Encrypted File System in User Space

Dr. Shishir Kumar
Department of CSE, Jaypee Institute of Engg. & Technology, Guna (M.P.), India
[email protected]

U.S. Rawat
Department of CSE, Jaypee Institute of Engg. & Technology, Guna (M.P.), India
[email protected]

Sameer Kumar Jasra
Department of CSE, Jaypee Institute of Engg. & Technology, Guna (M.P.), India

Akshay Kumar Jain
Department of CSE, Jaypee Institute of Engg. & Technology, Guna (M.P.), India

Abstract— The Encrypted File System (EFS) pushes encryption services into the file system itself. EFS supports secure storage at the system level through a standard UNIX file system interface to encrypted files. Users can associate a cryptographic key with the directories they wish to protect. Files in these directories (as well as their pathname components) are transparently encrypted and decrypted with the specified key without further user intervention; clear text is never stored on a disk or sent to a remote file server. EFS can use any available file system for its underlying storage without modification, including remote file servers such as NFS. System management functions, such as file backup, work in the normal manner and without knowledge of the key. Performance is an important factor to users, since encryption can be time consuming.

This paper describes the design and implementation of EFS in user space using faster cryptographic algorithms on the UNIX operating system. Implementing EFS in user space makes it portable and flexible; the kernel size also does not increase, resulting in a more reliable and efficient operating system. Encryption techniques for file-system-level encryption are described, and general issues of cryptographic system interfaces to support routine secure computing are discussed.

Keywords- Advanced Encryption Standard, Electronic code book mode, EFS daemon, Initialization vector, Network File System.

I. INTRODUCTION

The Encrypted File System is an interface that assures the user that the data stored on the hard disk is secure and cannot be accessed by any other user without the permission of the owner. It ensures that the original data does not reside on the hard disk in normal plaintext form, but is always stored in an encrypted form which cannot be understood by an intruder. In a conventional file system, data is normally stored in plaintext on the hard disk, so if someone gains access to the disk, that person can easily read the data. But if the file is stored in an encrypted form on the hard disk, such an attack will not be nearly as effective.

Users need not be aware of the location where encryption and decryption take place. By using encryption and decryption methodologies, a user can secure his data and store it on the hard disk in an unreadable format. Several recent incidents accentuate the need for a cohesive solution to the problem of storage security that protects data using strong cryptographic methods in both personal and organizational scenarios. This paper investigates the implications of cryptographic protection as a basic feature of the file system interface.

II. RELATED WORK

Many architectures and procedures in this area have already been implemented. Very few of them are implemented in user space; most are in kernel space. Each has certain advantages and limitations. The crucial issues of both system-level and user-level cryptography are discussed below.

A. ISSUES WITH USER LEVEL CRYPTOGRAPHY

The simplest approach to file encryption is a tool, such as the UNIX crypt program, that enciphers (or deciphers) a file or data stream with a specified key. Depending on the particular software, the program may or may not automatically delete the clear text while encrypting, and such programs can usually be used as cryptographic "filters" in a command pipeline.

Another approach is integrated encryption in application software, where each program that manipulates sensitive data has built-in cryptographic facilities. For example, a text editor could ask for a key when a file is opened and automatically encrypt and decrypt the file's data as they are written and read. All applications that operate on the same data must then include the same encryption engine. An encryption filter, such as crypt, might also be provided to allow data to be imported into and exported out of other software.

Unfortunately, neither approach is entirely satisfactory in terms of security, generality or convenience. The former approach, while allowing great flexibility in its application, invites mistakes; the user could inadvertently fail to encrypt a file, leaving it in the clear, or could forget to delete the clear-text version after encryption. The manual nature of the encryption and the need to supply the key several times whenever a file is used make encryption too cumbersome. More seriously, even when used properly, manual encryption programs open a window of vulnerability while the file is in clear form. It is almost impossible to avoid occasionally storing clear text on the disk and, in the case of remote file servers, sending it over the network. Some applications simply expect to be able to read and write ordinary files. In the application-based approach, each program must have built-in encryption functionality. Although encryption takes place automatically, the user still must supply a key to each application, typically when it is invoked or when a file is first opened. Software without encryption capability cannot operate on secure data without the use of a separate encryption program, making it hard to avoid all the problems outlined above. Furthermore, rather than being confined to a single program, encryption is spread among multiple applications, each of which must be trusted to interoperate securely and correctly with the others. A single poorly designed component can introduce a significant and difficult-to-detect window of vulnerability. Changing the encryption algorithm entails modification of every program that uses it, creating many opportunities for implementation errors. Finally, multiple copies of user-level cryptographic code can introduce a significant performance penalty. [1]

B. ISSUES WITH SYSTEM LEVEL CRYPTOGRAPHY

One way to avoid many of the pitfalls of user-level encryption is to make cryptographic services a basic part of the underlying system. In designing such a system, it is important to identify exactly what is to be trusted with clear text and what requires cryptographic protection; i.e., we must understand which components of the system are vulnerable to compromise.

For files, we are usually interested in protecting the physical media on which sensitive data are stored. This includes online disks as well as backup copies, which may persist long after the online versions have been deleted. In distributed, file-server-based systems, it is often also desirable to protect the network connection between client and server, since these links may be easy targets for interception attacks [1].

Physical media can be protected by specialized hardware. Disk controllers are commercially available with embedded encryption hardware that can be used to encipher entire disks or individual file blocks with a specified key. Once the key is provided to the controller hardware, encryption is completely transparent. This approach has a number of limitations for general use. The granularity of encryption keys must be compatible with the hardware; often, the entire disk must be treated as a single protected entity. It is difficult to share resources among users who are not willing to trust one another with the same key. Obviously, this approach is only applicable when the required hardware is available.

Network connections between client machines and file servers can be protected with end-to-end encryption. Specialized hardware may be employed for this purpose, depending on the particular network involved, or it may be implemented in software. However, not all networks support encryption, and among those that do, not all system vendors supply working implementations of encryption as a standard product. [8]

Even when the various problems with media- and network-level encryption are ignored, the combination of the two approaches may not be adequate for the protection of data in modern distributed systems. In particular, even though clear text may never be stored on a disk or sent "over the wire", sensitive data can be leaked if the file server itself is compromised, since at some point the file server must maintain the keys used to encipher both the disk and the network. Even if the server can be completely trusted, direct media encryption on top of network encryption has a number of shortcomings from the point of view of efficient distributed system design [2].

The alternative approach, taken by the Encrypted File System (EFS), is described next. EFS pushes file encryption entirely into the client file system interface, and therefore does not suffer from many of the difficulties inherent in user-level encryption and in disk- and network-based system-level encryption.

III. CRYPTOGRAPHIC SERVICES IN THE FILE SYSTEM

The main design question for EFS is where in the system file encryption should be performed. If it is at too low a level, trust in components is removed from the user's control. If it is too close to the user, frequent human interaction may lead to error.

A. DESIGN GOALS

EFS occupies something of a middle ground between low-level and user-level cryptography. It aims to protect exactly those aspects of file storage that are vulnerable to attack, in a way that is convenient enough to use routinely. In particular, we are guided by the following specific goals:

Key management scheme: The sensitive information in the encrypted file system is accessed via a key, which is taken as input from the user as a passphrase. This key is used to encrypt the contents of the file and also to recover the original contents. There must be a way to obtain this key from the user; it is taken in the form of a passphrase. It is the most crucial input to the encrypted file system, on which the whole security of the system relies.

Transparent access semantics: Encrypted files should support the same access methods available on the underlying storage system. All system calls should work normally, and it should be possible to compile and execute in a completely encrypted environment.

Transparent performance: Encryption algorithms are computationally intensive, but the overhead should not be so high as to discourage the use of the encrypted file system in real scenarios.

Security of file contents: The file contents should be effectively secured so that no other person learns anything about the data without knowledge of the key. Structural data should also be protected; e.g., identical information in the header and footer of a file should not generate the same cipher text.

Natural key granularity: It should be easy to protect related files under the same key, and it should be easy to create new keys for other files. The UNIX directory structure is a flexible, natural way to group files.

Compatibility with underlying system services: The files and directories generated by the encrypted file system should behave normally and should be manageable like normal files in the file system.

Portability: The encrypted file system should use available functionality and features. Files and directories should appear normally whenever the key is supplied to the file system.

Scalability: The encryption engine should not place an unusual load on any shared component of the system. File servers in particular should not be required to perform any special additional processing for clients who require cryptographic protection.

Compatibility with future technologies: Several emerging technologies have potential applicability for protecting data. In particular, keys could be contained in or managed by "smart cards" that would remain in the physical possession of authorized users. An encryption system should support, but not require, novel hardware of this sort.

B. EFS FUNCTIONALITY AND USER INTERFACE

The Encrypted File System interacts with the standard UNIX file system through system calls and treats all files in the same manner, irrespective of whether a file is encrypted or a normal file of the standard file system. It spares the user from entering the same key several times. EFS attaches a key to a directory, and all the files within that directory are automatically encrypted. When this directory is attached to the Encrypted File System directory, all operations on its files can be executed. Files are automatically decrypted when they are read and encrypted when a write operation is performed. No modifications are required to the file system on which the encrypted files are stored.

EFS provides a "Virtual File System" on the client's machine, typically mounted on /crypt, through which users access their encrypted files. All files are stored in encrypted form, with encrypted path names, in an associated directory. These files are not visible to the user until the associated directory is attached to /crypt. The underlying encrypted directories can reside on any accessible file system, including remote file servers such as Sun NFS [6]. No space needs to be pre-allocated for EFS directories, and the user controls EFS through commands such as create, attach and detach.

To use the Encrypted File System, the user creates an EFS directory by issuing the emkdir command and associates with it a passphrase, i.e. the key that EFS uses to encrypt all files within that directory. The passphrase should be at least 16 characters long; for instance, it could be "This is Encrypted File System". emkdir works like a standard mkdir, except that the user must supply the passphrase in order to make the directory secure.

E.g. $ emkdir /user/jas/efs (name of the encrypted directory)
(user enters passphrase, which does not echo)
(same phrase entered again to prevent errors)

In order to use the files in the directory in their normal form, we have to supply the key to EFS. This is achieved by the attach command, which takes three parameters:

1. Passphrase

2. Name of directory created

3. New name of directory

$ attach /user/jas/efs aks
Key: (same key used in the emkdir command)

If the key is supplied correctly, the user "sees" /crypt/aks as a normal directory; all standard operations (creating, reading, writing, compiling, executing, cd, mkdir, etc.) work as expected. The actual files are stored under /user/jas/efs, which would not ordinarily be used directly. Access to attached directories is controlled by restricting the virtual directories created under /crypt using the standard UNIX file protection mechanism. Only the user who issued the attach command is permitted to see or use the clear-text files. This is based on the uid of the user; an attacker who can obtain access to a client machine and compromise a user account can use any of that user's currently attached directories. If this is a concern, the attach name can be marked obscure, which prevents it from appearing in a listing of /crypt. When an attach is made obscure, the attacker must guess its current name, which can be randomly chosen by the real user. Of course, attackers who can become the "super user" on the client machine can thwart any protection scheme, including this one; such an intruder has access to the entire address space of the kernel and can read (or modify) any data anywhere in the system.
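A tiny illustrative helper for the "obscure" attach names just described; the function name is hypothetical, as EFS leaves the choice of random name to the user:

    import secrets

    def obscure_attach_name(prefix: str = "a") -> str:
        """Return a hard-to-guess name for an attach point under /crypt."""
        return prefix + secrets.token_hex(8)   # e.g. 'a3f9c2...' (16 hex chars)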

In order to remove the directory and its files from /crypt, we use the detach command, which removes the entry from /crypt:

$ detach aks

File names are encrypted and encoded in an ASCII representation of their binary encrypted value, padded out to the cipher block size of eight bytes.

Some data are not protected. File sizes, access times, and the structure of the directory hierarchy are all kept in the clear text. (Symbolic link pointers are, however, encrypted.) This makes EFS vulnerable to traffic analysis from both real time observation and snapshots of the underlying files; whether this is acceptable must be evaluated for each application.

IV. FILE ENCRYPTION METHODOLOGY

EFS uses the Advanced Encryption Standard (AES) [4] to encrypt file data. AES has various modes of operation; one of them is Electronic Code Book (ECB) mode [5], in which each block is encrypted individually. The main shortcoming of ECB is that it produces the same cipher text for identical plaintext blocks, which helps a cryptanalyst find structural similarities in the data and decrypt the text more easily.

Other AES modes are various chaining ciphers. These modes reduce the shortcomings of ECB mode and overcome the problem of structural analysis. The problem with chaining, however, is that random access to the file becomes difficult, because every block depends on the preceding cipher block.

A 56-bit key (as used by DES) is vulnerable to exhaustive search of the key space, so we use multiple AES passes to provide more security to the data. To remove both shortcomings, loss of random access and vulnerability to structural analysis, we use two modes of AES together. The 128-bit supplied key is crunched into two subkey halves. The first subkey is used to calculate a long initial block with AES in OFB mode. Whenever a file is to be written, it is XORed with this initial block and then encrypted with the second subkey using AES in ECB mode. When reading, the cipher is reversed in the obvious manner: first decrypt in ECB mode, then XOR with the initial block.

This method overcomes both the random access and the structural analysis problems. The protection against attack is at least as strong as a single AES pass in ECB mode, and may be as strong as two passes with an AES stream-mode cipher. The scheme may be weakened if there are several known-plaintext files encrypted with the same key; in such a situation the attacker might be able to search for the two AES subkeys independently. In chaining modes, as long as the Initialization Vector (IV) is different, the cipher text of identical blocks will differ; for this purpose the IV can be attached to the beginning of the file.
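A minimal sketch of the two-pass scheme just described, using PyCryptodome as an illustrative library; the SHA-256 key-crunching step and the 128-bit subkey sizes are assumptions (the 56-bit figure above applies to DES rather than AES, whose keys are at least 128 bits):

    from Crypto.Cipher import AES
    from Crypto.Hash import SHA256

    BLOCK = 16  # AES block size in bytes

    def derive_subkeys(supplied_key: bytes):
        """Crunch the supplied key into two independent AES-128 subkeys (assumed KDF)."""
        digest = SHA256.new(supplied_key).digest()
        return digest[:16], digest[16:]                     # k_mask, k_ecb

    def initial_block(k_mask: bytes, length: int) -> bytes:
        """The long pseudorandom mask, generated with AES in OFB mode (first pass)."""
        return AES.new(k_mask, AES.MODE_OFB, iv=b"\x00" * BLOCK).encrypt(b"\x00" * length)

    def write_block(k_mask: bytes, k_ecb: bytes, plaintext: bytes) -> bytes:
        """XOR with the mask, then encrypt under the second subkey in ECB mode."""
        assert len(plaintext) % BLOCK == 0
        mask = initial_block(k_mask, len(plaintext))
        masked = bytes(p ^ m for p, m in zip(plaintext, mask))
        return AES.new(k_ecb, AES.MODE_ECB).encrypt(masked)

    def read_block(k_mask: bytes, k_ecb: bytes, ciphertext: bytes) -> bytes:
        """Reverse: decrypt in ECB mode, then XOR with the same mask."""
        masked = AES.new(k_ecb, AES.MODE_ECB).decrypt(ciphertext)
        mask = initial_block(k_mask, len(masked))
        return bytes(c ^ m for c, m in zip(masked, mask))

    k1, k2 = derive_subkeys(b"This is Encrypted File System")
    ct = write_block(k1, k2, b"sixteen-byte blk")           # length: multiple of 16
    assert read_block(k1, k2, ct) == b"sixteen-byte blk"

Because the mask is position-dependent but precomputable, any block can still be read or written independently, which preserves random access while hiding structural repetition.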

Encryption of pathname components uses a similar scheme, with the addition that the high-order bits of the clear-text name are set to a simple checksum computed over the entire name string.

It is important to emphasize that EFS protects data only in the context of the file system. It is not, in itself, a complete, general-purpose cryptographic security system. Once bits have been returned to a user program, they are beyond the reach of EFS's protection. This means that even with EFS, sensitive data might be written to a paging device when a program is swapped out, or revealed in a trace of a program's address space. Systems where the paging device is on a remote file system are especially vulnerable to this sort of attack. (It is theoretically possible to use EFS as a paging file system, although the current implementation does not readily support this in practice.) It should also be noted that EFS does not protect the links between users and the client machines on which EFS runs; users connected via networked terminals remain vulnerable if these links are not otherwise secured.

A. PROPOSED ARCHITECTURE

The EFS prototype has been implemented entirely at user level, communicating with the UNIX kernel via the NFS interface. Each client machine runs a special NFS server, efsd (EFS Daemon), on its localhost interface, that interprets EFS file system requests. At boot time, the system invokes efsd and issues an NFS mount of its localhost interface on the EFS directory (/crypt) to start EFS. (To allow the client to also work as a regular NFS server, EFS runs on a different port number from standard NFS.) The NFS protocol is designed for remote file servers, and so assumes that the file system is very loosely coupled to the client (even though, in EFS's case, they are actually the same machine) [6]. The client kernel communicates with the file system through remote procedure calls (RPCs) that implement various file-system-related primitives (read, write, etc.). The server is stateless, in that it is not required to maintain any state data between individual client calls. All communication is initiated by the client, and the server can simply process each RPC as it is received and then wait for the next. Most of the complexity of an NFS implementation is in the generic client side of the interface, and it is therefore often possible to implement new file system services entirely by adding a simple NFS server.

efsd is implemented as an RPC server for an extended version of the NFS protocol. Additional RPCs attach, detach, and otherwise control encrypted directories. Initially, the root of the EFS file system appears as an empty directory. The attach command sends an RPC to efsd with arguments containing the full path name of a directory, the name of the "attach point", and the key. If the key is correct, efsd computes the cryptographic mask (described in the previous section) and creates an entry in its root directory under the specified attach point name. The attach point entry appears as a directory owned by the user who issued the attach request, with a protection mode that prevents others from seeing its contents.

Encryption of pathname components uses a similar scheme, with the addition that the high order bits of the clear text name (which are normally zero) are set to a simple checksum computed over the entire name string. This frustrates structural analysis of long names that differ only in the last few characters. The same method is used to encrypt symbolic link pointers.
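An illustrative sketch of this name encoding, under stated assumptions: the particular checksum, its placement in the high bits, and the Base32 output encoding are simplified stand-ins, and encrypt_block is a placeholder for the component block cipher:

    import base64

    BLOCK = 8  # the text pads names to an 8-byte cipher block

    def stamp_checksum(name: bytes) -> bytes:
        """Spread a simple checksum over the (normally zero) high bits of the name."""
        csum = sum(name) % 256
        return bytes(b | (((csum >> (i % 8)) & 1) << 7) for i, b in enumerate(name))

    def encode_name(name: str, encrypt_block) -> str:
        """Checksum-stamp, pad to the block size, encrypt per block, ASCII-encode."""
        raw = stamp_checksum(name.encode("ascii"))
        raw += b"\x00" * (-len(raw) % BLOCK)                 # pad to the block size
        enc = b"".join(encrypt_block(raw[i:i + BLOCK])       # must return 8 bytes/block
                       for i in range(0, len(raw), BLOCK))
        return base64.b32encode(enc).decode().rstrip("=")    # safe as a directory entry

Folding the checksum into the high bits means that names differing only in their last few characters no longer encrypt to near-identical strings, which frustrates structural analysis.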

For each encrypted file accessed through an attach point, efsd generates a unique file handle that is used by the client NFS interface to refer to the file. For each attach point, the EFS daemon maintains a table of handles and their corresponding underlying encrypted names. When a read or write operation occurs, the handle is used as an index into this table to find the underlying file name. efsd uses regular Unix system calls to read and write the file contents, which are encrypted before writing and decrypted after reading, as appropriate. To avoid repeated open and close calls, efsd also maintains a small cache of file descriptors for files on which there have been recent operations. Directory and symbolic link operations, such as readdir, readlink, and lookup, are similarly translated into appropriate system calls and encrypted and decrypted as needed. To prevent intruders from issuing RPC calls to EFS directly (and thereby thwarting the protection mechanism), efsd only accepts RPCs that originate from a privileged port on the local machine. Responses to the RPCs are also returned only to the localhost port, and file handles include a cryptographic component selected at attach time to prevent an attacker on a different machine from spoofing one side of a transaction with the server.
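The following Python sketch illustrates the assumed structure of such a handle table, with the descriptor cache and the decrypt-after-read behavior described above. It is our own illustration, not the actual efsd code.

```python
# Illustrative sketch of a handle table mapping NFS file handles to
# underlying encrypted path names (structure assumed, not from efsd).
import os

class HandleTable:
    def __init__(self):
        self.names = {}     # file handle -> underlying encrypted path name
        self.fd_cache = {}  # small cache of recently used file descriptors

    def read(self, handle: bytes, offset: int, length: int, decrypt):
        path = self.names[handle]          # handle indexes the table
        fd = self.fd_cache.get(path)
        if fd is None:                     # avoid repeated open/close calls
            fd = os.open(path, os.O_RDONLY)
            self.fd_cache[path] = fd
        os.lseek(fd, offset, os.SEEK_SET)
        ciphertext = os.read(fd, length)   # regular Unix system call
        return decrypt(ciphertext, offset) # decrypted after reading
```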

It is instructive to compare the flow of data under EFS with that under the standard, unencrypted file system interface. Figure 1 shows the architecture of the interfaces between an application program and the ordinary Sun "vnode-based" Unix file system [3]. Each arrow between boxes represents data crossing a kernel, hardware, or network boundary; the diagram shows that data written from an application are first copied to the kernel and then to the (local or remote) file system. Figure 2 shows the architecture of the user-level EFS prototype. Data are copied several extra times: from the application, to the kernel, to the EFS daemon, back to the kernel, and finally to the underlying file system. Since EFS uses user-level system calls to communicate with the underlying file system, each file is cached twice, once by EFS in clear form and once by the underlying system in encrypted form. This effectively reduces the available file buffer cache space by a factor of two.

The architecture described above helps in analyzing the efficiency of the algorithm. To analyze an algorithm is to determine the amount of resources (such as time and storage) necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or complexity of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).

Figure 1: The Architecture of normal V-node file system in Unix

Figure 2: The Architecture of the user level EFS prototype

Therefore this algorithm has complexity O(n²). It is worth noting that, when analyzing algorithms, one often measures complexity via the number of comparisons that occur, ignoring operations such as assignments. It is nevertheless useful to keep track of constant factors, in particular the coefficient of the dominating subterm. In the DES, the coefficient applied to the dominating subterm, namely n², was 3/2, and by coincidence this was also the coefficient of the second term, namely n. It is obvious that a linear algorithm will perform better than a quadratic one, provided the size of the problem is large enough; but if the problem is known to have a size of, say, at most 100, then a complexity of (1/10)n² might be preferable to one of 1000000n.
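A quick, purely illustrative sanity check of this trade-off:

```python
# A quadratic algorithm with a small constant beats a linear one with a
# huge constant for small inputs.
quad = lambda n: 0.1 * n * n
lin  = lambda n: 1_000_000 * n

print(quad(100), lin(100))   # 1000.0 vs 100000000 -> quadratic wins at n=100
# The linear algorithm only wins once 0.1*n*n > 1_000_000*n, i.e. n > 10**7.
```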

V. RESULTS & PERFORMANCE EVALUATION

After implementing the encrypted file system, we measured the change in file size under encryption, i.e., how the size of the encrypted file differs from that of the original file. Files of specific sizes were encrypted using the encrypted file system. Table 1 shows the variation in size of the original file when encrypted by EFS: the encrypted file is approximately 2.5 times the size of the original file. As the encryption algorithm is computationally intensive, the computation time was also measured for files of the same sizes; the time taken by EFS to encrypt a file of a given size is shown in Table 2. The overall result is that both the time taken and the size of the encrypted file increase with the size of the input file.
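The ~2.5x expansion can be checked directly against the Table 1 figures (sizes converted to KB; 909 bytes is taken as 0.909 KB):

```python
# Quick check of the expansion ratio claimed for Table 1.
pairs = [(0.909, 2.3), (3.6, 9.3), (9.5, 24.5), (10.7, 27.5), (15.6, 39.9)]
for orig, enc in pairs:
    print(f"{orig:5.1f} KB -> {enc:5.1f} KB  (x{enc / orig:.2f})")
# The ratios cluster around 2.5-2.6, consistent with the text.
```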

For the purpose of comparison, the standard UNIX utility crypt, which encrypts and decrypts files under UNIX [7], has been used. Input files of the same sizes were encrypted, and the output sizes and times were analyzed to obtain the variation in time and space, which was then compared with that of the encrypted file system. The variation in size is shown in Table 3 and the variation in time in Table 4.

Table-1: Difference in size of original and encrypted file by EFS

Size of Original File    Size of Encrypted File
909 bytes                2.3 KB
3.6 KB                   9.3 KB
9.5 KB                   24.5 KB
10.7 KB                  27.5 KB
15.6 KB                  39.9 KB

Figure 3: Variation in size of original file

Table-2: Time taken by EFS to encrypt a file

Size of Original File    Time Taken by EFS to Encrypt
909 bytes                278 ms
3.6 KB                   304 ms
9.5 KB                   358 ms
10.7 KB                  381 ms
15.6 KB                  410 ms

Figure 4: Variation in time according to size

Table-3: Difference in size of original and encrypted file by crypt

Size of Original File    Size of Encrypted File by crypt
909 bytes                1.6 KB
3.6 KB                   4.5 KB
9.5 KB                   14.5 KB
10.7 KB                  16.3 KB
15.6 KB                  22.5 KB


Figure 5: Variation of size of original file by Crypt

Table-4: Time taken by Crypt to encrypt a file

Size of Original File    Time Taken by crypt to Encrypt
909 bytes                ≈ 0 ns
3.6 KB                   ≈ 0 ns
9.5 KB                   10000 ns
10.7 KB                  10000 ns
15.6 KB                  20000 ns

VI. CONCLUSIONS & FUTURE WORK

The proposed model of EFS provides a simple mechanism to protect data written to disks and sent to networked file servers. Although experience with the proposed model of EFS is still limited to the research environment, its performance on modern workstations appears to be within a range that allows routine use, even though it has the shortcomings of a user-level NFS-server-based implementation. The client file system interface appears to be the right place to protect file data. Considering the alternatives: encrypting at the application layer is inconvenient, since application-based encryption leaves windows of vulnerability while files are in the clear, or requires the exclusive use of special-purpose applications for all encrypted files. At the disk level, the granularity of encryption may not match the user's security requirements, especially if different files are to be encrypted under different keys. Encrypting the network in distributed file systems, while useful in general against network-based attacks, does not protect the actual media and therefore still requires trusting the server not to disclose file data. The main focus of EFS is to reduce the barriers that stand in the way of effective and ubiquitous use of file encryption. This is especially relevant as physical media remain exposed to theft and unauthorized access. Whenever sensitive data are handled, the modus operandi should be that the data are encrypted at all times except when being directly accessed in an authorized manner by applications. EFS can be implemented on modern operating systems without changing the rest of the system. Better performance and stronger security may be achieved by running the file system in the kernel. The proposed model of EFS is more portable than other kernel-based file systems because it interacts with a standard vnode interface, as demonstrated by quick ports to Linux.

REFERENCES

[1] M. Blaze, “A Cryptographic File System for UNIX,” in Proceedings of the ACM Conference on Computer and Communications security, Fairfax, VA, USA, Nov. 1993, pp. 9–16.

[2] Howard, J.H., Kazar, M.L., Menees, S.G., Nichols, D.A., Satyanaryanan, M. & Sidebotham, R.N. "Scale and Performance in Distributed File Systems." ACM Trans. Computing Systems, Vol. 6, No. 1, (February), 1988.

[3] Kleiman, S.R., "Vnodes: An Architecture for Multiple File System Types in Sun UNIX." Proc. USENIX, Summer, 1986.

[4] National Institute of Standards and Technology, "Advanced Encryption Standard (AES)," FIPS Publication #197, NTIS, 26 Nov. 2001.

[5] National Bureau of Standards, "Data Encryption Standard Modes of Operation." FIPS Publication #81, NTIS, Dec. 1980.

[6] Sandberg, R., Goldberg, D., Kleiman, S., Walsh, D., & Lyon, B. "Design and Implementation of the Sun Network File System." Proc. USENIX, Summer, 1985.

[7] dm-crypt: a device-mapper crypto target for Linux. [Online] Available:http://www.saout.de/misc/dm-crypt/

[8] Encfs -Virtual Encrypted Filesystem for Linux. [Online]. Available: http://encfs.sourceforge.net/

AUTHORS PROFILE

Dr. Shishir Kumar is currently working as Associate Professor and Head of the Dept. of Computer Science & Engineering, Jaypee Institute of Engineering & Technology, Raghogarh, Guna, India. He completed his PhD in the area of Computer Science in 2005. He has around 12 years of teaching experience and has published several papers in international/national refereed journals and conferences. His areas of interest are Network Security & Image Processing.

Mr. U.S. Rawat

U.S. Rawat is currently working as Sr. Lecturer in the Department of CSE, Jaypee Institute of Engineering & Technology, Raghogarh, Guna, India. He received his B.E. in 1999 from Amravati University, Amravati, India, and his M.E. in 2003 from S.G.S.I.T.S, Indore, India. He is working towards his PhD at JUIT Waknaghat, India. He has eight years of teaching experience. His areas of interest are Information Systems & Network Security.


A new approach to services differentiation between mobile terminals of a wireless LAN

Maher BEN JEMAA
ReDCAD Research Unit, National School of Engineering of Sfax, BP 1173-3038 Sfax, Tunisia
[email protected]

Maryam KALLEL ZOUARI
ReDCAD Research Unit, National School of Engineering of Sfax, BP 1173-3038 Sfax, Tunisia
[email protected]

Bachar ZOUARI
ReDCAD Research Unit, National School of Engineering of Sfax, BP 1173-3038 Sfax, Tunisia
[email protected]

Abstract— This study aims to identify the advantages and disadvantages of several mechanisms for service differentiation between mobile terminals of a wireless LAN, in order to establish a better and more efficient network. Based on an analysis of the available approaches to quality of service in the IEEE 802.11 standard, the objective of this paper is to propose a new method named DF-DCF, "Differentiated Frame DCF". DF-DCF can be regarded as an implementation of the EDF (Earliest Deadline First) algorithm; a system using DF-DCF is a system with time-dependent priorities. The evaluation of the suggested method in a Network Simulator (NS) environment allowed its validation through a set of testing and simulation scenarios. Simulation results have shown that the DF-DCF method is well suited to mobile nodes in a wireless communication network.

Keywords- Service Differentiation, Wireless LAN, mobility, DCF, DF-DCF, NS.

I. INTRODUCTION

Wireless LANs based on the IEEE 802.11 standard offer speeds of up to 11 Mbit/s (about 6.5 Mbit/s in practice). In addition to extending wired LANs, WLANs have generated new markets, including public access networks or public hot spots. From the standpoint of standardization, there are currently two standards for WLANs: High Performance Radio LAN (HiperLAN) [1, 2] and IEEE 802.11 [3]. However, due to the emergence of WiFi [4, 5] and WiFi5 [6] products, the IEEE 802.11 standard has recently had a major success that continues to grow. In this work we are interested in studying the quality of service in this family of networks. Indeed, access to multimedia content can be achieved only if these networks offer guarantees in terms of delay, jitter or loss rate. The IEEE 802.11 MAC layer includes a large number of management features, such as frame addressing, frame formatting, error checking, fragmentation and frame reassembly, the management of terminal mobility (association, reassociation, disassociation) and security services (authentication, deauthentication, privacy). Aside from these management features, one of the characteristics of the 802.11 MAC layer is that it defines two different methods of access to the medium. The first is the Distributed Coordination Function (DCF), an access method similar to that of Ethernet. The DCF was designed to support the transport of asynchronous data and to give all users wishing to transmit data the same opportunity to access the medium. In the second, the Point Coordination Function (PCF), the various data transmissions between the network terminals are managed by a central coordination point, usually located in the access point. The PCF has been designed to enable the transmission of time-sensitive data. We found that the different proposals for introducing quality of service in WLANs have some inefficiencies: on the one hand, centralized access methods are complex to implement; on the other hand, completely distributed methods provide effective differentiated service only in the absence of TCP traffic. To address these limitations, we propose in this paper a new service differentiation mechanism, DF-DCF. This mechanism is based on the idea that any given real-time flow is associated with a deadline beyond which it becomes unusable. Based on this fact, we propose to choose the differentiated service parameters applied to a frame according to the time it has spent in the queue; if this delay grows too large, the frame is simply dropped. Since DIFS differentiation is the one that provides the best results in terms of service differentiation, we propose to improve it by integrating the queuing time into the calculation of an instantaneous, per-frame DIFS value. The paper is organized as follows. In Section 2, we describe our proposed approach, DF-DCF, designed to better meet the requirements of different qualities of service. A performance analysis and a comparative study against the DIFS approach are presented in Section 3. Finally, a conclusion and perspectives close this paper.


II. PROPOSED APPROACH: DF-DCF

With the prospect of improving the differentiation of services for both TCP and UDP traffic, we propose a new per-frame service differentiation mechanism for WLANs, named DF-DCF. The medium access method in DF-DCF is based on using the queuing time of frames in the calculation of the instantaneous DIFS values applied to these data frames. The basic idea is to take into account the delay already experienced by the frames belonging to a flow when calculating their priority [14]. We associate with each frame an expiry time, which is the time by which the frame must be transmitted. If the frame is not transmitted before the expiration of its lifetime, it is eliminated. The service level of the frame is calculated on the basis of its residual lifetime, through a function called FSL, for "Frame Service Level". Service differentiation is based primarily on this per-frame service level: the frame with the smallest deadline is transmitted first. This level of service is implemented in the MAC layer by calculating a DIFS value for each frame depending on its service level FSL. The two following sections describe the two essential components of the proposal: (i) calculating the level of service per frame and (ii) calculating the DIFS value corresponding to the class of service to which the frame belongs.

A. Level of service per frame

Let Temax_j be the maximum delay allowed for a frame belonging to service class j. The level of service per frame, FSL_j(t), calculated at time t, is then defined by equation (1).

FSL_j(t) = (Temax_j + τ − t) / Temax_j        (1)

where τ is the arrival time of the frame. Note that the function FSL_j(t) thus defined can be regarded as an implementation of the EDF (Earliest Deadline First) algorithm; a system using FSL_j(t) is described as a system with time-dependent priorities [15]. Fig. 1 illustrates how service levels are calculated for each frame. In this example, we consider the arrivals, at times t1 and t2 respectively, of two frames F11 and F21 belonging to two different service classes 1 and 2, with maximum time constraints Temax_1 and Temax_2 < Temax_1. The first frame to be processed in a system with time-dependent priorities is the one with the highest instantaneous level of service (the one whose value of FSL_j(t) is minimal). In the example shown in Fig. 1, frame F11 is processed before F21 if the frame processing time falls in the interval [t1, tth], where tth corresponds to the moment when FSL_j(t) has the same value for both frames (FSL_1(tth) = FSL_2(tth)) and therefore when they have the same level of service. It is important to note that during this time interval, frame F11 is chosen despite the fact that it belongs to the lower-priority class. For any time greater than or equal to tth, frame F21 is chosen first. Note that in such a system, a frame is removed as soon as its function FSL_j(t) reaches zero.

Figure 1. Per-frame service level function FSL_j(t)
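The following is a minimal sketch of equation (1) and the EDF-style selection it induces, reproducing the Fig. 1 scenario. The variable names and the numeric values are our own illustration, not taken from the NS implementation.

```python
# Minimal sketch of the per-frame service level of equation (1) and the
# EDF-style choice it induces (times in seconds, values illustrative).

def fsl(t, tau, temax):
    """FSL_j(t) = (Temax_j + tau - t) / Temax_j; the frame is dropped at 0."""
    return (temax + tau - t) / temax

# Frames F11 and F21 as in Fig. 1: class 2 has the tighter deadline.
t1, temax1 = 0.00, 0.150   # arrival time and Temax of F11 (class 1)
t2, temax2 = 0.05, 0.050   # arrival time and Temax of F21 (class 2)

for t in (0.06, 0.09):     # the crossover tth is at t = 0.075 here
    f11, f21 = fsl(t, t1, temax1), fsl(t, t2, temax2)
    chosen = "F11" if f11 < f21 else "F21"   # smallest FSL is served first
    print(f"t={t:.2f}: FSL(F11)={f11:.2f}, FSL(F21)={f21:.2f} -> {chosen}")
```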

B. Calculating the DIFS

The function FSL_j(t) gives the service level of a frame according to the waiting time it has spent in the queue. The service classes are differentiated by the lifetime limit of a frame, Temax_j. The question we now answer is: how can this service level be used to obtain service differentiation at the MAC layer? To answer this question, we were inspired by the results obtained in [8, 9]. Specifically, we propose to extend DIFS differentiation by introducing the service level of each frame, FSL_j(t), into the calculation of the DIFS value, which then becomes instantaneous and per-frame. A class of service is defined by the triplet (Temax_j, DIFS_j^min, DIFS_j^max), where DIFS_j^min and DIFS_j^max are the minimum and maximum DIFS values that a frame of that class of service can take:

DIFS_j = SIFS + nbSlotDIFS_j * SlotTime        (2)

where nbSlotDIFS_j is the service differentiation parameter. In the standard DCF approach, this parameter is 2. Our proposal is to calculate the DIFS value for a frame belonging to the jth class of service at time t, when the frame is selected for transmission by the MAC layer, as follows:


DIFS_j(t) = SIFS + (nbSlotDIFS_j^min + ⌊(nbSlotDIFS_j^max − nbSlotDIFS_j^min) * FSL_j(t)⌋) * SlotTime        (3)

The two parameters nbSlotDIFS_j^max and nbSlotDIFS_j^min are derived from DIFS_j^max and DIFS_j^min such that:

DIFS_j^min = SIFS + nbSlotDIFS_j^min * SlotTime        (4)

DIFS_j^max = SIFS + nbSlotDIFS_j^max * SlotTime        (5)

Equation (3) shows that the value DIFS_j(t) calculated at time t for a frame of class j is governed by the service level of that frame, FSL_j(t): the smaller the value of FSL_j(t), the smaller the value of DIFS_j(t). Hence, the longer a frame waits for transmission, the closer its deadline approaches, the higher its service level becomes, and the better its chance of accessing the medium once it is handled by the MAC sublayer. In DF-DCF, the higher-priority class is the one with the smallest value of Temax_j, since it has the hardest deadline. When two classes have the same value of Temax_j, the higher-priority class is the one with the smallest DIFS_j^min and then, if still equal, the one with the smallest DIFS_j^max.
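The sketch below puts equations (1) and (3) together, using as an example DIFS bounds of 50 µs / 130 µs and Temax = 150 ms (these match the CBR1 class parameters used later in TABLE I). The SIFS (10 µs) and SlotTime (20 µs) values are the classic 802.11b DSSS ones and are our assumption, not stated in the paper.

```python
# Sketch of equation (3): the DIFS shrinks as the frame's deadline nears.
import math

SIFS, SLOT = 10e-6, 20e-6     # assumed 802.11b DSSS values

def difs(t, tau, temax, n_min, n_max):
    fsl = max((temax + tau - t) / temax, 0.0)       # equation (1)
    n = n_min + math.floor((n_max - n_min) * fsl)   # equation (3)
    return SIFS + n * SLOT

# DIFS in [50 us, 130 us] -> nbSlot in [2, 6]; Temax = 150 ms.
for wait in (0.0, 0.075, 0.149):
    print(f"waited {wait*1e3:5.1f} ms -> "
          f"DIFS = {difs(wait, 0.0, 0.150, 2, 6)*1e6:.0f} us")
# Output: 130 us for a fresh frame, 90 us halfway, 50 us near the deadline.
```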

III. PERFORMANCE EVALUATION OF DF-DCF AND COMPARISON WITH THE EXISTING DIFS APPROACH

To study the performance of DF-DCF, we have extended the implementation of the IEEE 802.11 standard available in NS [16], including the optional use of the DF-DCF and DIFS access methods as alternatives to DCF. In this implementation, the capacity of the radio link is 1 Mbps. The simulation results obtained using DF-DCF are compared with those obtained with DIFS alone. This comparison is made for several possible scenarios, all using the same topology: three stations (STA1, STA2 and STA3) are evenly distributed around an access point (AP). These three stations send three streams to a fixed wired terminal connected to the AP, as illustrated in Figure 2.

Figure 2. Simulation Topology

The three flows from the three stations compete for access to the medium. Each of these three streams is assigned to a particular class of service. For each scenario, the transmission of the three streams begins at 50 s, 100 s and 150 s respectively, in a simulation lasting 250 s. To demonstrate the contribution of our approach, its results are compared to those of DIFS differentiation. A class of service is defined by the triplet (DIFS_j^min, DIFS_j^max, Temax_j) for DF-DCF and by the parameter DIFS_j for DIFS differentiation.

TABLE I. PARAMETERS OF THE SERVICE CLASSES FOR DF-DCF AND DIFS DIFFERENTIATION

        DF-DCF                              DIFS
        DIFS_j^min / DIFS_j^max   Temax_j   DIFS_j
CBR1    50 µs / 130 µs            150 ms    50 µs
CBR2    130 µs / 210 µs           250 ms    130 µs
CBR3    210 µs / 290 µs           350 ms    210 µs

A. UDP Traffic

In the first experiment, the three mobiles send three UDP flows at constant bit rate: CBR1, CBR2 and CBR3. Each of the three streams saturates the radio link by transmitting at a rate exceeding the capacity of the channel: each flow sends packets of 2312 bytes every 0.02 s. The highest priority is given to CBR1, a medium priority to CBR2 and the lowest priority to CBR3. The parameters representing the respective priorities of these three streams for each simulated access method are given in TABLE I.

(a) Delay using three different flows CBR/UDP: case of DIFS

(b) Delay using three different flows CBR/UDP: case of DF-DCF

Figure 3. Delay introduced by the MAC layer: (a) DIFS, (b) DF-DCF.

Figures 3(a) and 3(b) show the delays experienced by the three flows. The curves clearly show that the service differentiation obtained using DF-DCF (Figure 3(b)) is better than that obtained using DIFS (Figure 3(a)). In addition, the comparison between the two mechanisms for each flow shows that the delays obtained with our scheme are lower than those obtained with DIFS. This result was expected, because in DF-DCF we try to keep the waiting time in the terminal's queue below Temax_j. Our scheme reduces delays by eliminating frames whose lifetime has expired, which allows the remaining frames to be transmitted more quickly.

(a) Jitter using three different flows CBR/UDP: case of DIFS

(b) Jitter using three different flows CBR/UDP: case of DF-DCF

Figure 4. Jitter introduced by the MAC layer: (a) DIFS, (b) DF-DCF

This last point means fewer collisions and hence fewer retransmissions. Indeed, the IEEE 802.11 transmission mechanism, based on an exponential binary backoff window, may lead to significant delays and jitter, a feature that is incompatible with the constraints of real-time applications [18]. We also note that the operation of DF-DCF stabilizes the jitter of the highest-priority flows (Figures 4(a)/(b)). The evaluation we conducted with UDP flows only shows a great improvement in service differentiation compared to DIFS. This improvement is expressed in terms of reduced delays for the highest-priority flows, achieved through the loss of frames whose deadline has been reached. Contrary to what one might think, these losses do not impair the overall loss rate of each stream. In fact, our mechanism merely anticipates the loss of a few hundred frames. This anticipation allows other frames to be served more quickly: not only do those frames spend less time in the queues, but fewer frames compete for access to the medium.

B. TCP Traffic

Other interesting results were obtained by replacing the CBR/UDP flows with FTP/TCP. In this scenario, each station sends FTP packets of 1100 bytes using the TCP transport protocol. The parameters representing the priorities of the three flows for each of the simulated access methods are the same as those used previously for the three UDP flows (see TABLE I). The flow FTP1, the one with the highest priority (the smallest value of Temax_j), obtains a very poor quality of service: the value of Temax_j is so small that a large number of frames are lost due to the expiration of their lifetime. Consequently, this flow regularly re-enters the Slow-Start phase, and lost frames must be retransmitted by the TCP transport layer. Furthermore, this situation is accentuated when the second terminal begins to transmit: the priority between the two flows is no longer always visible. When the third terminal begins its transmission, the situation deteriorates to the point that all frames sent by station STA1, supposedly the highest priority, are lost, forcing TCP to abort the FTP1 transfer after a few seconds. This is due to the chosen values of Temax_j, which are too low to allow normal operation of TCP. We doubled this value for each flow (TABLE II), but the result remained exactly the same. In fact, larger values of Temax_j give frames more chance to be transmitted and therefore penalize a TCP flow less; similarly, assigning a low value of Temax_j to a class of service that one wants to favor penalizes it rather than giving it priority. In other words, the smaller Temax_j is, the greater the number of lost frames, causing more TCP retransmissions and a radical decrease in throughput, leading to poor service differentiation. This does not mean that DF-DCF is inadequate for TCP flows; it demonstrates only that the notion of priority cannot be the same for UDP flows and TCP flows.

TABLE II. PARAMETERS OF DF-DCF

        DIFS_j^min / DIFS_j^max   Temax_j
FTP1    50 µs / 130 µs            300 ms
FTP2    130 µs / 210 µs           500 ms
FTP3    210 µs / 290 µs           700 ms

UDP is usually used by real-time flows with time constraints, but which are often not very sensitive to losses. The delay constraint may therefore force the MAC layer to eliminate frames, which guides the choice of the value Temax_j. Recall that these frame losses ultimately lead to a decrease in the delay experienced by the priority flow. The rule for UDP flows is therefore: the smaller Temax_j, the higher the flow's priority. The values DIFS_j^min and DIFS_j^max must then be chosen according to the choice of Temax_j. Indeed, if Temax_j is chosen small while DIFS_j^min and DIFS_j^max are chosen high, excessive losses can occur in the class: choosing DIFS_j^min and DIFS_j^max high while Temax_j is small increases the medium access period for each frame and therefore the waiting time in the queue, leading to excessive losses. On the other hand, TCP flows are generally less sensitive to delay than to loss; therefore we must eliminate the frames of a TCP flow only when it has the lowest priority and we want to penalize it. Penalizing the frames of a TCP flow consists of increasing their waiting time, which is done through an appropriate choice of the parameters DIFS_j^min and DIFS_j^max. Regarding Temax_j, we can adopt the rule of taking the same value of Temax_j for all classes of service. This value must be chosen appropriately so that, in case of contention with a higher-priority flow, the lost frames are those belonging to the lower-priority flows, leading to a strict differentiation of services. Following the rules established above, we assigned new parameters to the different classes of service (see TABLE III). As for the UDP flows, the simulation results of DF-DCF are compared to those of DIFS.

TABLE III. PARAMETERS OF DF-DCF AND DIFS

        DF-DCF                              DIFS
        DIFS_j^min / DIFS_j^max   Temax_j   DIFS_j
FTP1    50 µs / 130 µs            375 ms    50 µs
FTP2    130 µs / 210 µs           375 ms    130 µs
FTP3    210 µs / 290 µs           375 ms    210 µs

Figure 5 shows a better differentiation of services between the TCP flows. In addition, the priorities between flows are more strictly enforced than those obtained with DIFS. Indeed, Figure 5(b) shows that, unlike with DIFS differentiation (Figure 5(a)), the RTT is clearly differentiated between the three flows. The explanation for this clear improvement under DF-DCF lies in the evolution of the contention window of each TCP flow. We note that the lowest-priority flow remains mostly in the Slow-Start phase, which restricts its throughput compared to the other flows. FTP3, with a lower bit rate, imposes fewer constraints; this improves the RTT of the other two flows, and the improved RTT in turn improves their throughput. With DIFS differentiation, the differentiation of services between flows is almost invisible (Figure 5(a)) because, owing to TCP's adaptive behavior, no loss is observed by TCP. Indeed, TCP is designed to adapt to the available throughput. Through the MAC-level retransmission mechanisms, any frame lost due to collision is retransmitted directly by the MAC layer; since it is delivered before the TCP timer expires, no loss is perceived at the TCP level. With each new arrival of a flow, the rate available to each flow on the radio link decreases, causing a decrease in the slope of TCP's congestion window. This decrease in slope does not prevent the congestion window from growing: it grows less rapidly, but continues to grow, since no loss is signaled to it by a missing TCP-ACK. The decrease in the slope of the congestion window is due, in turn, to the increased RTT caused by the arrival of a new flow. The attenuation of service differentiation between TCP flows under DIFS differentiation is thus due to the fact that this mechanism avoids losses at the transmitter. In our mechanism, these losses are introduced through the timers Temax_j, which no longer have the same meaning as for real-time flows. We note finally, as in the case of UDP flows, that no loss of total throughput is observed here, because everything lost by the lower-priority flows is gained by the highest-priority flow.

(a) RTT using three different FTP/TCP flows: case of DIFS

(b) RTT using three different FTP/TCP flows: case of DF-DCF

Figure 5. RTT introduced by the MAC layer: (a) DIFS, (b) DF-DCF.

The performance evaluation conducted with TCP traffic only allows us first to deduce a rule for choosing the service differentiation parameters. Indeed, a bad choice of the parameter Temax_j may drive a TCP flow into excessive frame eliminations. We therefore concluded that Temax_j must be chosen high, so as not to disadvantage the highest-priority flows, and that the same value of Temax_j should be taken for all classes of service; the choice of the parameters DIFS_j^min and DIFS_j^max then sets the priority of each class. This rule differs from the one derived for UDP flows. Indeed, UDP is usually used by flows with real-time constraints; there, the smaller the value of Temax_j, the higher the priority of the service class, and the parameters DIFS_j^min and DIFS_j^max are chosen according to the choice made for Temax_j. The evaluation under this rule demonstrates the effectiveness of the service differentiation offered by DF-DCF compared to that obtained with DIFS. This is achieved through the loss of some frames belonging to the lower-priority flows and the resulting TCP retransmissions. These losses force TCP to reduce the throughput of the corresponding (lower-priority) flow; the throughput lost by those flows is gained by the highest-priority flows, along with an improvement of their RTT.

C. Mixing UDP and TCP traffic

Having demonstrated the effectiveness of DF-DCF for service differentiation among UDP flows and among TCP flows, we propose in this section to analyze the behavior of DF-DCF with a mix of UDP and TCP traffic. Since UDP is generally used for real-time traffic, we give priority to the UDP flows over the TCP flows and therefore focus on analyzing the QoS obtained by the UDP flows when they compete with TCP flows. To this end, we simulated the following scenario: one FTP/TCP flow and two CBR/UDP flows.

TABLE IV. PARAMETERS OF DF-DCF AND DIFS

        DF-DCF                              DIFS
        DIFS_j^min / DIFS_j^max   Temax_j   DIFS_j
CBR3    50 µs / 130 µs            150 ms    50 µs
CBR2    130 µs / 210 µs           250 ms    130 µs
FTP1    210 µs / 290 µs           1 s       210 µs

The three flows generated are denoted FTP1, CBR2 and CBR3. FTP1 has the lowest priority, CBR3 the highest and CBR2 a medium priority. The parameters representing the level of service for each of the access methods used are given in TABLE IV. Note that the value of Temax_j corresponding to flow FTP1 is chosen sufficiently large (1 s) so that the TCP flow is not too disadvantaged compared to the UDP flows. Figure 6(b) shows clearly that the delay of flow FTP1 increases whenever a new CBR flow is initiated, leading to a clearer differentiation of services than with DIFS. Indeed, we see in Figure 6(a) that no differentiation is achieved between the flows FTP1 and CBR2 once the flow CBR3 begins its transmission. DF-DCF also reduces the delays of the three flows (Figure 6(b)) compared with those obtained with DIFS (Figure 6(a)). We also note that the CBR flows obtain the same quality of service whether or not they are competing with a TCP flow: the delays of the CBR flows are still governed by the deadline Temax_j. This comes at the cost of an increase in both the delay of flow FTP1 and the variation of this delay, but this increase is not critical to the operation of FTP1. The results obtained with TCP traffic only are thus confirmed in this section as well: the performance evaluation conducted here shows a clear service differentiation between the different flows. Unlike DIFS differentiation, the existence of TCP traffic does not affect the performance obtained by the UDP traffic under DF-DCF.

(a) Delay using flow FTP/TCP and two different flows CBR/UDP: case of DIFS

(b) Delay using flow FTP/TCP and two different flows CBR/UDP: case of DF-DCF

Figure 6. Delays introduced by the MAC layer: (a) DIFS, (b) DF-DCF

IV. CONCLUSION AND FURTHER WORK

In this article, we discussed the main approaches proposed for supporting quality of service in IEEE 802.11 networks. This study identified a number of findings: on the one hand, the management complexity introduced by centralized access methods makes them hardly viable in practice; on the other hand, distributed service differentiation mechanisms are inefficient with respect to some requirements. A new mechanism, DF-DCF, was proposed to provide effective service differentiation for all flows. It is based on an extension of the DCF access method of the IEEE 802.11 network. The DF-DCF method associates with each frame a level of service depending on its waiting time in the queue, in the manner of the EDF scheduling policy. This mechanism also has the advantage of being distributed by definition, thus avoiding the signaling overhead induced by centralized methods. The performance of DF-DCF gives very promising results. Indeed, we have validated by simulation that the differentiation of services between flows belonging to different classes of service is clearly improved compared to simple DIFS differentiation. Unlike other mechanisms, the service differentiation offered by DF-DCF is effective regardless of the type of traffic traversing the network: UDP flows, TCP connections or a mix of UDP and TCP flows. This differentiation is achieved in particular through a judicious choice of the parameters representing the classes of service. Since TCP is generally used by flows that are more delay-tolerant but very sensitive to losses, our approach tries to reduce the rate of elimination of frames belonging to a TCP flow. One last very important feature of DF-DCF concerns the throughput achieved: the DF-DCF method appears to offer the same performance in terms of saturation throughput as other distributed access methods [8, 9, 17]. At the same time, a control algorithm sharing the bandwidth available among the classes should be used to maintain the loss rate below a certain threshold; an admissible loss rate helps maintain a satisfactory quality level for real-time applications. Among further works, extending this study to scalability seems very interesting, in order to identify the influence of the number of nodes and of mobility on the proposed solution. In addition, the performance evaluation of the QoS concept is under study following its implementation in a real mobile environment. We also suggest coupling service differentiation mechanisms at the MAC and network layers to build a secure end-to-end QoS scheme. A research topic that remains open, and that is a direct result of our work, is optimizing the intercellular handover to minimize its negative effects on service differentiation. A good handover management approach will reduce the additional delays that lead to broken communications and to losses, making this QoS mechanism more suitable for Voice over IP communication.

REFERENCES

[1] Broadband Radio Access Networks (BRAN), "High Performance Radio LAN (HiperLAN) Type 1: Functional Specification", ETSI BRAN standardisation project, EN 300 652, July 1998.
[2] Broadband Radio Access Networks (BRAN), "High Performance Radio LAN (HiperLAN) Type 2: System Overview", ETSI BRAN standardisation project, TR 101 683, April 2000.
[3] LAN MAN Standards Committee of the IEEE Computer Society, "Wireless LAN medium access control (MAC) and physical layer (PHY) specification", IEEE Standard 802.11, 1997.
[4] Web site of the Wi-Fi Alliance, www.wi-fi.org.
[5] S. Kapp, "802.11: leaving the wire behind", IEEE Internet Computing, Vol. 6, No. 1, pp. 82-85, January/February 2002.
[6] S. Kapp, "802.11a. More bandwidth without wires", IEEE Internet Computing, Vol. 6, No. 4, pp. 75-79, July/August 2002.
[7] IEEE 802.11 Task Group e, "Draft Supplement to IEEE Std 802.11 - Part 11: Wireless MAC and Physical Layer Specifications: MAC Enhancements for Quality of Service", 2002.
[8] I. Aad and C. Castelluccia, "Differentiation mechanisms for IEEE 802.11", IEEE Infocom'01, Anchorage, Alaska, April 2001.
[9] I. Aad and C. Castelluccia, "Priorities in WLANs", Computer Networks, Vol. 41, No. 4, pp. 505-526, March 2003.
[10] A. Veres, A.T. Campbell, M. Barry and L.H. Sun, "Supporting Service Differentiation in Wireless Packet Networks using Distributed Control", IEEE Journal on Selected Areas in Communications (JSAC), Vol. 19, No. 10, pp. 2094-2104, October 2001.
[11] N.H. Vaidya, P. Bahl and S. Gupta, "Distributed fair scheduling in a wireless LAN", Sixth Annual International Conference on Mobile Computing and Networking, Boston, USA, August 2000.
[12] A. Lindgren, A. Almquist and O. Schelen, "Quality of Service Schemes for IEEE 802.11 Wireless LANs - An Evaluation", ACM/Kluwer Journal of Special Topics in Mobile Networking and Applications (MONET), Vol. 8, No. 3, pp. 223-235, June 2003.
[13] I. Niang, B. Zouari, H. Afifi and D. Seret, "Amélioration de schémas de QoS dans les réseaux sans fil 802.11" (Improving QoS schemes in 802.11 wireless networks), Colloque Francophone sur l'Ingénierie des Protocoles, CFIP'02, Montréal, Canada, May 2002.
[14] Y.M. Ghamri-Doudane, R. Naja, G. Pujolle and S. Tohme, "P3-DCF: Service Differentiation in IEEE 802.11 WLANs using Per-Packet Priorities", IEEE Semi-annual Vehicular Technology Conference, VTC'03-Fall, Orlando, Florida, October 2003.
[15] L. Kleinrock, "Queueing Systems, Volume II: Computer Applications", John Wiley and Sons, Wiley Interscience, New York, 1976.
[16] Network Simulator, ns-2.1b9, http://www.isi.edu/nsnam/ns/
[17] G. Bianchi, "Performance Analysis of the IEEE 802.11 Distributed Coordination Function", IEEE Journal on Selected Areas in Communications (JSAC), Vol. 18, No. 3, pp. 535-547, March 2000.
[18] W.-T. Chen, B.-B. Jian and S.-C. Lo, "An adaptive retransmission scheme with QoS support for the IEEE 802.11 MAC enhancement", IEEE Vehicular Technology Conference Spring 2002, IEEE VTC Spring'02, Birmingham, Alabama, May 2002.


Authentication without Identification Using Anonymous Credential System

Dr. A. Damodaram
Prof. of CSE Dept. & Director, UGC-ASC, JNTUH, Hyderabad, India
[email protected]

H. Jayasri
Asst. Prof., ATRI, Hyderabad, India
[email protected]

Abstract

  Privacy and security are often intertwined. For example, identity theft is rampant because we have become accustomed to authentication by identification. To obtain some service, we provide enough information about our identity for an unscrupulous person to steal it (for example, we give our credit card number to Amazon.com). One of the consequences is that many people avoid e-commerce entirely due to privacy and security concerns. The solution is to perform authentication without identification. In fact, all on-line actions should be as anonymous as possible, for this is the only way to guarantee security for the overall system. A credential system is a system in which users can obtain credentials from organizations and demonstrate possession of these credentials. Such a system is anonymous when transactions carried out by the same user cannot be linked. An anonymous credential system is of significant practical relevance because it is the best means of providing privacy for users.

Keywords: Pseudonyms

1. Introduction

As information becomes increasingly accessible, protecting the privacy of individuals becomes a more challenging task. To solve this problem, an application that allows the individual to control the dissemination of personal information is needed. In this paper we discuss the best-known idea for such a system, called the Anonymous Credential System. An anonymous credential (AC) is a vector of attributes certified by a trusted certification authority. Anonymous credentials are used as a way to prevent disclosure of too much information about a user during the authentication process. Access management systems create profiles for each user who has been granted access. Additionally, some systems use digital certificates to further verify the user's identity. Depending on the system, these digital certificates may contain a lot of information about an individual user's identity. Since the entire digital certificate is used during authentication, compromising it could lead to a breach of sensitive information about the user, some of which could later be used to steal the legitimate user's identity or authentication credentials for malicious access. The technology has also been called minimal disclosure certificates by Stefan Brands. Here is a scenario to explain how it works: someone goes into a bar and the bartender asks for the person's driver's license to verify whether he or she is of legal age to drink. Most likely, the bartender just looks at the person's date of birth and isn't interested in the name, address or other personal information. Once the bartender is satisfied, the person puts the license away and is allowed to stay in the bar. But on open networks -- like the Web and the Internet -- an entire digital certificate may be exposed to the whole world over the wire, where its contents can be sniffed and stolen by hackers interested in stealing authentication credentials. Minimal disclosure solves that problem by providing only enough information from the user's digital certificate to grant access to a system for a specific requirement; the user's whole identity or credentials are not served up to the system requesting authentication. An anonymous credential system consists of users and organizations. Organizations know the users only by pseudonyms, and different pseudonyms of the same user cannot be linked. Yet an organization can issue a credential to a pseudonym, and the corresponding user can prove possession of this credential to another organization (who knows her by a different pseudonym) without revealing anything more than the fact that she owns such a credential. Credentials can be for unlimited use (multiple-show credentials) or for one-time use (one-show credentials).

2. Motivation

The Internet, by design, lacks provisions for identifying who communicates with whom; it lacks a well-designed identity infrastructure. As a result, enterprises, governments and individuals have over time developed a bricolage of isolated, incompatible, partial solutions to meet their needs in communications and transactions. The overall result of these unguided developments is that enterprises and governments have a problem in identifying their communication partners at the individual level. Given the lack of a proper identity infrastructure, individuals often have to disclose more personal data than strictly required: in addition to name and address, contact details such as multiple phone numbers (home, work, mobile) and e-mail addresses are requested. The amount and nature of the data disclosed exceed what is usually required in real-world transactions, which can often be conducted anonymously; in many cases the service could be provided without any personal data at all. Over the long run, the inadequacy of the identity infrastructure affects individuals' privacy. The availability of abundant personal data to enterprises and governments has a profound impact on the individual's right to be let alone, as well as on society at large.

3. Desirable Properties

3.1 Basic Desirable Properties

i) It should be possible for a user to selectively disclose attributes.
ii) An AC must be hard to forge.
iii) A user's transactions must be unlinkable.
iv) An AC must be revocable.

3.2 Additional Desirable Properties

i) Users should be discouraged from sharing their pseudonyms and credentials with other users (PKI-assured non-transferability or all-or-nothing non-transferability).
ii) It may be desirable to have a mechanism for discovering the identity of a user whose transactions are illegal.
iii) It can also be beneficial to allow one-show credentials, i.e., credentials that should only be usable once and should incorporate an offline double-spending test.

4. Requirements

A basic credential system has users, organizations, and verifiers as types of players. Users are entities that receive credentials. The set of users in the system may grow over time. Organizations are entities that grant and verify the credentials of the users. Each organization grants a unique (for simplicity of exposition) type of credential. Finally, verifiers are entities that verify credentials of the users. For the purposes of non-transferability, we can add a CA (Certification Authority) to the model, which verifies that users entering the system possess an external public and secret key. This CA is trusted to do its job properly. To allow revocable anonymity, an anonymity revocation manager can be added. This entity is trusted not to use its ability to find out a user's identity or pseudonym unless dictated to do so. The user is anonymous until the revocation manager exposes his or her identity; usually this is followed by entering the user ID into a revocation list. Revocation may be partial or total: in the former, a subset of the entries in the attribute vector is revoked, while in the latter the whole vector is revoked (i.e., the user is revoked). Ideally, the revocation authority should not be able to revoke capriciously. Finally, a credential may include an attribute, such as an expiration date.

5. Related Work

The scenario with multiple users who, while remaining anonymous to the organizations, manage to transfer credentials from one organization to another was first introduced by Chaum [6]. Subsequently, Chaum and Evertse [5] proposed a solution based on the existence of a semi-trusted third party who is involved in all transactions; however, the involvement of a semi-trusted third party is undesirable. The scheme later proposed by Damgard [4] employs general complexity-theoretic primitives (one-way functions and zero-knowledge proofs) and is therefore not applicable in practice; moreover, it does not protect organizations against colluding users. The scheme proposed by Chen [3] is based on discrete-logarithm-based blind signatures. It is efficient but does not address the problem of colluding users. Another drawback of her scheme, and of the other practical schemes previously proposed, is that to use a credential several times, a user needs to obtain several signatures from the issuing organization. Lysyanskaya, Rivest, Sahai, and Wolf [1] propose a general credential system. While their general solution captures many of the desirable properties, it is not usable in practice because their constructions are based on one-way functions and general zero-knowledge proofs. Their practical construction, based on a non-standard discrete-logarithm-based assumption, has the same problem as the one due to Chen [3]: a user needs to obtain several signatures from the issuing organization in order to use a credential several times unlinkably. Brands provides a certificate system in which a user has control over what is known about the attributes of a pseudonym. Although a credential system with one-show credentials can be inferred from his framework, obtaining a credential system with multi-show credentials is not immediate and may in fact be impossible in practice. Another inconvenience of these and the other discrete-logarithm-based schemes mentioned above is that all the users and the certification authorities in these schemes need to share the same discrete-logarithm group.

Jan Camenisch and Anna Lysyanskaya [2] propose a practical anonymous credential system based on the strong RSA assumption and the Diffie-Hellman assumption. They gave the first practical solution that allows (1) a user to unlinkably demonstrate possession of a credential as many times as necessary without involving the issuing organization, and (2) optional anonymity revocation for particular transactions, to prevent misuse of anonymity.

6. Concluding Remarks

It appears that a compromise is required, either in the security requirements or in the amount of trust bestowed on the participants, in order to achieve a practical and efficient anonymous credential system.

7. References

[1] A. Lysyanskaya, R. Rivest, A. Sahai, and S. Wolf: Pseudonym systems. In Selected Areas in Cryptography, vol. 1758 of LNCS. Springer Verlag, 1999.
[2] J. Camenisch and A. Lysyanskaya: An efficient system for non-transferable anonymous credentials with optional anonymity revocation. In EUROCRYPT 2001, vol. 2045 of LNCS, pp. 93-118. Springer Verlag, 2001.
[3] L. Chen: Access with pseudonyms. In Cryptography: Policy and Algorithms, vol. 1029 of LNCS, pp. 232-243. Springer Verlag, 1995.
[4] I. Damgard: Payment systems and credential mechanisms with provable security against abuse by individuals. In CRYPTO '88, vol. 403 of LNCS, pp. 328-335.
[5] D. Chaum and J.-H. Evertse: A secure and privacy-protecting protocol for transmitting personal information between organizations. In CRYPTO '86, vol. 263 of LNCS, pp. 118-167. Springer-Verlag, 1987.
[6] D. Chaum: Security without identification: transaction systems to make big brother obsolete. Communications of the ACM, 28(10):1030-1044, 1985.


Dr. A. Damodaram obtained his B.Tech. degree in Computer Science and Engineering in 1989, M.Tech. in CSE in 1995 and Ph.D. in Computer Science in 2000, all from Jawaharlal Nehru Technological University, Hyderabad. His areas of interest are Computer Networks, Software Engineering, Data Mining and Image Processing. He has presented more than 44 papers in various national and international conferences and has 7 publications in journals. He has guided 3 Ph.D., 3 M.S. and more than 100 M.Tech./MCA students. He joined the Faculty of Computer Science and Engineering in 1989 at JNTU, Hyderabad, and has worked at JNTU in various capacities since then. Presently he is a Professor in the Computer Science and Engineering Department. In his 19 years of service, Dr. A. Damodaram has served as Head of the Department and Vice-Principal, and is presently the Director of the UGC Academic Staff College of JNT University, Hyderabad. He was Board of Studies chairman for the JNTU Computer Science and Engineering branch (JNTUCEH) for a period of 2 years. He is a life member of various professional bodies, a member of academic councils at various universities, and a UGC-nominated member of various expert/advisory committees of universities in India. He was a member of an NBA (AICTE) sectoral committee and of various committees in the State and Central Governments. He is an active participant in various social/welfare activities. He has also served as Secretary General and Chairman of the AP State Federation of University Teachers Associations, and as Vice President of the All India Federation of University Teachers Associations. He is the Vice President of the All India Peace and Solidarity Organization from Andhra Pradesh.

H. Jayasri obtained her B.E. (CSE) from Bangalore University and M.Tech. (CSE) from JNTU, Hyderabad, in 2001 and 2006 respectively. She is pursuing a Ph.D. in the Department of CSE, JNTU, Hyderabad. She has 7 years of teaching experience in various colleges of Hyderabad and Bangalore. Her areas of research interest are Network Security and Computer Networks.


Transmission Performance Analysis of Digital Wire and Wireless Optical Links in Local and Wide Area Optical Networks

Abd El-Naser A. Mohamed (1), Mohamed M. E. El-Halawany (2), Ahmed Nabih Zaki Rashed (3*), and Amina E. M. El-Nabawy (4)

(1,2,3,4) Electronics and Electrical Communication Engineering Department, Faculty of Electronic Engineering, Menouf 32951, Menoufia University, EGYPT

(1) E-mail: [email protected], (3*) E-mail: [email protected]

Tel.: +2 048-3660-617, Fax: +2 048-3660-617

Abstract—In the present paper, the transmission performance of digital wire and wireless optical links in local and wide area optical networks has been modeled and parametrically investigated over a wide range of the affecting parameters. Moreover, we have analyzed the basic equations of a comparative study of the performance of digital fiber optic links against wireless optical links. The development of optical wireless communication systems is accelerating, as they are a cost-effective alternative to wired fiber optic links. Optical wireless technology is used mostly in wide bandwidth data transmission applications. Finally, we have investigated the maximum transmission distances and data transmission bit rates that can be achieved with digital wire and wireless optical links for local and wide area optical network applications. Keywords—Wireless fiber optics; Transmission distance; Transmission bit rate; Radio frequency; Bit error rate; Digital optical links; Local area network; Wide area network.

I. INTRODUCTION AND BACKGROUND

Optical wireless communication, also known as free-space optics (FSO), has emerged as a commercially viable alternative to RF and millimeter-wave wireless for reliable and rapid deployment of data and voice networks [1]. RF and millimeter-wave technologies allow rapid deployment of wireless networks with data rates from tens of Mb/s (point-to-multipoint) up to several hundred Mb/s (point-to-point). However, spectrum licensing issues and interference at unlicensed bands will limit their market penetration [2]. Though emerging license-free bands appear promising, they still have certain bandwidth and range limitations. Optical wireless can augment RF and millimeter-wave links with very high (>1 Gb/s) bandwidth; in fact, it is widely believed that optical wireless is best suited for multi-Gb/s communication. The biggest advantage of optical wireless communication is that an extremely narrow beam can be used, so that space loss can be virtually eliminated (<10 dB). In practice, however, few vendors take advantage of this; most use a wide beam to ensure that enough signal is received on the detector even as the transceivers' pointing drifts. This scheme is acceptable for low data rates, but becomes increasingly challenging at multi-Gb/s rates. Our approach has been to shift the burden from the communication system to a tracking system that keeps the pointing jitter/drift to less than 100 µrad. With such small residual jitter, sub-milliradian transmitted beam widths can be used. In so doing, the data communication part of the system is relatively simple and allows us to scale up to, and even beyond, 10 Gb/s. The main challenge for optical wireless is atmospheric attenuation: attenuation as high as 300 dB/km in very heavy fog is occasionally observed in some locations around the world [3]. No communication system can tolerate hundreds of dB of attenuation, so either link distance or link availability has to be compromised; the more link margin that can be allotted to atmospheric attenuation, the better the compromise. As a result, in the presence of severe atmospheric attenuation, an optical link with a narrow beam and tracking has an advantage over a link without tracking [4]. Recent years have seen a widespread adoption of optical technologies [5] in core and metropolitan area networks. Wavelength Division Multiplexing (WDM) transmission systems can currently support Tb/s capacities. Next generation Fiber-to-the-Home (FTTH) access networks are expected to rely on Passive Optical Networks (PONs) in order to deliver reliable, multi-megabit rates to the buildings serviced by the network. Time Division Multiplexing PON (TDM/PON) and Wavelength Division Multiplexing PON (WDM/PON) may constitute a reliable alternative to the active PON, where routing is done using a large Ethernet switch. However, as optical technologies start to migrate towards the access networks, the cost factor is a vital issue [6] for the economic prospects of the investments. Unless significant progress is achieved in optical component integration in the near future, in terms of the scale of integration and functionality, the cost of the optoelectronic components is not expected to diminish in view of the light specifications placed by TDM/PON and WDM/PON. More importantly, if the existing duct availability is limited, one may expect large investment costs due to the enormous fiber roll-out required. Free-Space Optics (FSO) is being considered as an attractive candidate for establishing ultra-high-rate Gb/s wireless connections. FSO systems are classified as indoor (optical LANs) and outdoor systems.


FSO is sometimes referred to as optical wireless since it basically consists of transmitting the optical signal directly into the atmosphere without the use of an optical fiber. FSO systems have high bit rates (1 Gb/s is already commercially available, while 10 Gb/s systems may soon appear). WDM technology may also provide a further increase in the aggregate transmission capacity, exceeding 100 Gb/s. However, as light is no longer guided by an optical fiber, the performance of outdoor FSO systems is mainly limited by environmental factors. It is widely recognized that fog is the worst weather condition for FSO systems, causing attenuation that may well exceed 100 dB/km under heavy fog conditions. Atmospheric scintillation, i.e. the change of light intensity in time, is another limiting factor. The scintillations, caused by random, thermally induced fluctuations of the refractive index along the propagation path, result in bit error rate penalties, and FSO systems are therefore designed to have a power margin. Another important issue is the misalignment caused by building sway [7] due to thermal expansion, wind and vibration. Using systems with larger beam divergence may mitigate some of these effects, and automatic tracking techniques have been developed to deal with this problem. In the present study, we have investigated the transmission performance of digital wire and wireless optical links in local and wide area optical networks over a wide range of the affecting parameters. Moreover, we have analyzed parametrically and numerically the maximum transmission distances and transmission bit rates that can be achieved with digital wire and wireless optical links for optical networks.

II. SIMPLIFIED OPTICAL NETWORK ARCHITECTURE WITH WIRE AND WIRELESS OPTICAL LINKS

Figure 1. Simplified optical network architecture model with wire and wireless optical links.

The architecture model of a passive optical network with different optical links is shown in Fig. 1. The PON consists of laser diodes as sources of optical signals, which convert the electrical signal from the information source into an optical signal; a multiplexer (Mux) in the OLT; different optical fiber links; a demultiplexer (Demux) and optical network units (ONUs) in the remote node (RN); and optical detectors, which convert the optical signal back to an electrical signal for processing at the ONU, which connects to the supported number of users. In the transmission direction, the information source (an electrical signal) is transmitted from the backbone network to the OLT; according to the user and location, an optical source (laser diode or light emitting diode) converts it into an optical signal, which is transmitted on the corresponding wavelength and multiplexed by the Mux. When the traffic arrives at the RN, the wavelengths are demultiplexed by the Demux and sent to an optical detector (avalanche photodiode or PIN photodiode), which converts the optical signal into an electrical signal; the signal is then sent to the ONUs and distributed to the supported users. The wavelength division multiplexing passive optical network has been regarded as a promising technology to meet the demands of various customers, covering many kinds of broadband data such as high speed internet, high data rate wireless transmission, and real-time video service. Recently, various access network techniques have been presented to increase transmission capacity and transmission distance and to reduce the cost per supported user.



II. 1. Simplified basic configuration of digital wireless optical communication system

Figure 2. Basic configuration of wireless optical communication system.

Figure 3. Schematic view of the configuration of wireless optical link.

Figures (2, 3) show the basic configuration of a wireless optical communication link. The link consists of an electrical/optical (E/O) conversion device and an optical/electrical (O/E) conversion device. The E/O conversion is accomplished by either the laser diode or an external modulator, while the O/E conversion is performed by a photodiode such as a PIN diode or an APD. The transmission data rate depends on the modulation speed of the E/O devices. Wavelength division multiplexing (WDM) technology can increase the transmission capacity by using a number of laser diodes and photodiodes with multiplexers. The wireless optical link is used for point-to-point applications such as the access link between a hub station and a subscriber terminal. An important parameter for the wireless optical communication link is the optical wavelength. Short wavelengths in the range of 0.78-0.8 µm were first introduced to transmit lower data rates. The long wavelengths used for fiber optic systems are in the range of 1.3-1.5 µm [8]. The advantage of the long wavelengths is that optical amplifiers are now available, and because of the amplification of the optical carrier the transmission distance between a hub station and a subscriber terminal can be increased. The output power of the link should be designed taking eye safety into account. Optical fiber maintenance is also a very important issue to be considered in developing a high quality and reliable passive optical network [9].

II. 2. Simplified basic configuration of digital wire optical cable link

Figure 4. Basic configuration of digital optical link.



Fig. 4 shows the basic configuration of the digital optical link. Digital communications systems have many advantages over analogue systems, brought about by the need to detect only the presence or absence of a pulse rather than measure the absolute pulse shape. Such a decision can be made with reasonable accuracy even if the pulses are distorted and noisy. For single wavelength systems, repeaters allow new clean pulses to be generated if required, preventing the accumulation of distortion and noise along the path. In optical communications systems, the pulse sequence is formed by turning an optical source on and off, either directly or using an external modulator. The presence of a light pulse corresponds to a binary 1 and its absence to a binary 0. The two commonly used techniques for representing the digital pulse train are non-return-to-zero (NRZ) and return-to-zero (RZ). In the case of NRZ, the duration of each pulse is equal to twice the duration of the equivalent RZ pulse. The choice of scheme depends on several factors such as synchronisation, drift, etc. An AC-coupled photoreceiver will generally not pass a signal with long sequences of '1's or '0's, and so some form of RZ coding scheme would be required [10].
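As a toy illustration of the two line codes just described, the sketch below (ours, not the authors') generates NRZ and RZ sample streams for a short bit sequence; the samples-per-bit parameter spb is an arbitrary choice.

```python
# Illustrative sketch: NRZ vs. RZ line coding of a bit stream.
# In RZ, a '1' returns to zero for the second half of its bit period,
# so an RZ pulse lasts half as long as the equivalent NRZ pulse.

def nrz(bits, spb=8):
    """Non-return-to-zero: the level is held for the whole bit period."""
    return [b for b in bits for _ in range(spb)]

def rz(bits, spb=8):
    """Return-to-zero: a '1' occupies only the first half of the period."""
    wave = []
    for b in bits:
        wave += [b] * (spb // 2) + [0] * (spb - spb // 2)
    return wave

if __name__ == "__main__":
    data = [1, 0, 1, 1, 0]
    print("NRZ:", nrz(data, spb=4))
    print("RZ: ", rz(data, spb=4))
```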

III. BASIC SYSTEM MODEL AND EQUATIONS ANALYSIS

There are several important system issues that need to be considered in the theoretical analysis of such an arrangement of digital wire or wireless optical cable links:

i) Optical signal wavelength: most installed fibre is designed for use at 1.3 µm. If long-haul links are necessary and multiple wavelength channels are needed, then the wavelength must be in the 1.55 µm region. Single-channel links can be implemented with a 1.3 µm system using optical repeaters to extend the reach.

ii) Digital wireless: wireless modulation avoids the requirement for samplers and digitisers at each telescope site, allowing these to be situated at the correlator. Digital systems are much less prone to noise and non-linear effects and so can offer better quality signals over larger distances.

iii) Length of link: this will have an impact on the modulation technique used and the need for mid-span optical amplifiers (or electrical repeaters for single wavelength channels).

iv) Data rate: if a digital implementation is chosen, the data rate will define the maximum instantaneous bandwidth and hence the sensitivity of the radio astronomy measurement. Commercial equipment will require standard data rates to be used (2.5 Gbps, 10 Gbps), which may not be compatible with the radio astronomy front and back ends.

III. 1. Wireless optical link design

In the design of a wireless optical link system, it is important to determine the link budget equation. The general link budget equation is given by [11]:

\[ P_{received} = P_{transmit}\,\frac{57.295^{2}\,A_{receiver}}{(\theta L)^{2}}\,e^{-\alpha L} \qquad (1) \]

where P_received is the power at the receiver (watt), P_transmit is the transmitted power (watt), A_receiver is the receiver effective area (m²), θ is the beam divergence (degrees), L is the length of the optical link (m), and α is the atmospheric absorption (dB/km). The total loss coefficient is determined by:

\[ \sigma = \sigma_{rain} + \sigma_{fog} + \sigma_{snow} + \sigma_{scin} \qquad (2) \]

where σ_rain is the absorption due to rain (km⁻¹), σ_fog is the absorption due to fog (km⁻¹), σ_snow is the absorption due to snow (km⁻¹), and σ_scin is the absorption due to scintillation (km⁻¹). A variety of models exist for the calculation of these absorption coefficients. In the case of fog, the Kruse model applies:

\[ \sigma_{fog}(\mathrm{km}^{-1}) = \frac{3.912}{V}\left(\frac{\lambda}{\lambda_{0}}\right)^{-q} \qquad (3) \]

where V is the visibility (km) at λ = λ0, λ is the actual wavelength of the beam (µm), λ0 is the reference wavelength (µm) for the calculation of V, and the exponent q describes the size distribution of the scattering particles: q = 1.3 if 6 km < V < 50 km, and q = 0.585 V^{1/3} for low visibility (V < 6 km). The optical losses due to snow can be calculated with the empirical formula:

\[ \sigma_{snow}(\mathrm{dB/km}) = A\,S^{b} \qquad (4) \]

where S is the snowfall rate (mm/hour), A = 5.42×10⁻⁵ λ + 5.9458, and b = 1.38. In the same way, the optical losses due to rain can be calculated with the empirical formula:

\[ \sigma_{rain}(\mathrm{dB/km}) = 1.076\,R^{2/3} \qquad (5) \]

where R is the rainfall rate (mm/hour). Finally, the optical loss due to scintillation is calculated using the following expression [11]:

\[ \sigma_{scin} = 23.17\left(\frac{2\pi}{\lambda}\times 10^{9}\right)^{7/6} C_{n}^{2}\, L^{11/6} \qquad (6) \]

where C_n² is the scintillation strength (m^{-2/3}). It should be noted that, for a wireless optical link system, fog-induced absorption is the most severe impairment and can significantly affect the performance of the system. A link budget for a wireless optical link using one lens in the transmitter and one lens in the receiver is calculated, accounting for the different kinds of losses that may reduce the power during transmission [11]. The factors that cause the majority of the losses for the system are the atmospheric attenuation and the ray losses. Equation (7) shows that the ray losses of the system depend on the radius of the receiver lens and the beam radius at the receiver unit. A Gaussian beam intensity distribution is assumed [12]:

\[ F_{S} = 10\log\frac{P_{receiver}}{P_{total}} = 10\log\left(1 - e^{-2R^{2}/w^{2}(L)}\right) \qquad (7) \]


where L is the link distance (km), F_S is the ray loss (dB), P_total is the total beam power at L (watt), R is the lens radius (mm), and w(L) is the beam radius (mm). Geometrical losses occur due to the divergence of the optical beam. These losses can be calculated using the following formula [12]:

\[ \frac{A_{R}}{A_{T}} = \left(\frac{D_{R}}{D_{T} + \dfrac{d\,\theta}{57.295}}\right)^{2}\times 100\,\% \qquad (8) \]

where A_R is the effective area of the receiver lens, A_T is the effective area of the transmitter lens, D_R is the diameter of the receiving lens, D_T is the diameter of the transmitting lens, d is the distance between the wireless optical transmitter and receiver, and θ is the divergence of the transmitted laser beam (degrees). Based on a curve-fitting MATLAB program, the fitting equations relating the optical signal to noise ratio (OSNR) to the wireless optical link length L (Eq. 9) and to the operating signal wavelength λ (Eq. 10) are [13]:

\[ OSNR = 17.35 - 12.27\,L + 7.05\,L^{2} - 5.87\,L^{3} \qquad (9) \]

\[ OSNR = 3.85 - 10.73\,\lambda + 2.13\,\lambda^{2} + 9.75\,\lambda^{3} \qquad (10) \]

The radio frequency transmission response gives the relative loss or gain in a wireless communication link with respect to the signal frequency. Any signal attenuation due to the wireless communication link can be expressed as follows [12]:

\[ Transmission\,(\mathrm{dB}) = 10\log\left(\frac{P_{transmitter}}{P_{incident}}\right) \qquad (11) \]

where P_transmitter is the radio frequency power measured at the output of the receiver, and P_incident is the radio frequency power measured at the input to the laser transmitter. Based on a curve-fitting MATLAB program, the fitting equations relating the transmission response to the operating radio frequency f, without and with amplification, are [12]:

\[ Transmission\,(\mathrm{dB}) = 10.82 - 2.05\,f + 7.42\,f^{2} - 4.23\,f^{3} \quad \text{(without amplification)} \qquad (12) \]

\[ Transmission\,(\mathrm{dB}) = 3.09 + 13.65\,f - 2.56\,f^{2} + 1.85\,f^{3} \quad \text{(with amplification)} \qquad (13) \]

The Shannon capacity theorem gives the maximum data transmission bit rate, or maximum channel capacity, for the wireless optical link:

\[ C = B.W\,\log_{2}\left(1 + OSNR\right)\ \ \mathrm{bits/sec} \qquad (14) \]

III. 2. Digital wire optical cable link design

As noted above, the key advantage of a digital link is that the receiver need only detect the presence or absence of a pulse, and repeaters can regenerate clean pulses along the path. Chromatic dispersion is caused by a variation of the group velocity in a fibre with changes in optical frequency. Each pulse generated by a laser contains, by virtue of the laser linewidth and the signal modulation, a spectrum of wavelengths. As it traverses the fibre, the shorter wavelength components travel faster than the longer wavelength components and, as a result, each pulse experiences broadening. By the time the pulses reach the receiver, they may have broadened over several bit periods and become a source of errors (inter-symbol interference). The measure of chromatic dispersion is D, in units of psec/nm.km, which is the amount of broadening in picoseconds that occurs in a pulse with a bandwidth of 1 nm while propagating through 1 km of fibre. The chromatic dispersion factor is given by [13]:

\[ \gamma = \frac{\pi\,\lambda^{2}\,B^{2}\,D\,L}{c} \qquad (15) \]

where B is the data rate, L is the fiber path length, and c is the speed of light in vacuum. As the polarisation components of the optical signal propagate through the fibre, the inherent birefringence causes one of the components to be delayed with respect to the other. In high bit rate systems, this differential group delay can lead to signal distortions and hence a degradation in the BER of the received signal. The group delay between the two polarisation components is called the differential group delay, ∆τ. Its average is the polarization mode dispersion (PMD) delay in psec and is expressed through the PMD coefficient in ps/km^{1/2}. PMD does not increase linearly with transmission distance, but with its square root:

\[ \Delta\tau = \Delta\tau_{coeff}\,\sqrt{L} \qquad (16) \]

where L is the transmission distance and ∆τ_coeff is the PMD coefficient. Taking into account the statistical character of PMD variations, if a 1 dB power penalty due to PMD can be accepted, then:

\[ \Delta\tau_{max} \le \frac{T}{10} \qquad (17) \]

where T is the bit period. Setting T = 1/B_0, we obtain:

\[ B_{0}^{2}\,L\,\Delta\tau_{coeff}^{2} \le \frac{1}{100} \qquad (18) \]

where B_0 is the bit rate. The receiver sensitivity is defined as the minimum number of photons per bit, n̄_0, necessary to guarantee that the bit error rate (BER) is smaller than 10⁻⁹. This sensitivity corresponds to an optical energy hν n̄_0 per bit and an optical received power:

\[ P_{receiver} = \bar{n}_{0}\,h\nu\,B_{0} \qquad (19) \]

This power is proportional to the total bit rate B_0. In a saturated, attenuation-limited link, the link budget in dBm units is as follows [14, 15]:

\[ P_{receiver} = P_{s} - P_{m} - P_{c} - \alpha L, \qquad \alpha\ \mathrm{in\ dB/km} \qquad (20) \]

where P_s is the source power, P_m is the modulator power, α is the fiber loss in dB/km, P_c is the coupling loss, and L is the fiber length. When P_receiver is expressed in dB, it is evident that it increases logarithmically with the data rate B_0; therefore, as the bit rate increases, the power required to maintain the desired BER also increases. With this in mind, the maximum length of the digital optical link is derived as [15]:

\[ L = \frac{1}{\alpha}\left[P_{s} - P_{m} - P_{c} - 10\log\left(\bar{n}_{0}\,h\nu\,B_{0}\times 10^{3}\right)\right]\ \ \mathrm{km} \qquad (21) \]


Figure 5. Variations of the signal attenuation (dB/km) with visibility (km) for laser diode wavelengths of 0.85, 1.3 and 1.55 µm, at the assumed set of parameters.

Figure 6. Variations of the ray losses (dB) with beam diameter at the receiver (mm) for lens diameters of 50, 100 and 150 mm, at the assumed set of parameters.

Figure 7. Variations of the optical signal to noise ratio (OSNR, dB) with wireless optical link distance (km) for wavelengths of 1.5, 1.53 and 1.55 µm, at the assumed set of parameters.

Figure 8. Variations of the signal transmission (dB) with transmitted radio frequency (MHz), for the wireless link with and without amplification, at the assumed set of parameters.


Figure 9. Variations of the transmission data rate with transmitted radio frequency (MHz), without amplification, for wireless link distances of 0.02, 0.18 and 0.4 km, at the assumed set of parameters.

Figure 10. Variations of the transmission data rate with transmitted radio frequency (MHz), with amplification, for wireless link distances of 0.6, 3 and 5 km, at the assumed set of parameters.

Figure 11. Variations of the transmission distance (km) for the digital optical link with transmission data rate (Mbit/sec), for wavelengths of 0.85, 1.3 and 1.55 µm, at the assumed set of parameters.

Figure 12. Variations of the transmission distance (km) for the digital optical link (multi-mode fiber) with transmission data rate (Mbit/sec), for the attenuation limits with LD/APD and LED/PIN, at the assumed set of parameters.


Figure 13. Variations of the transmission distance (km) for the digital optical link (single-mode fiber) with transmission data rate (Mbit/sec), for the attenuation limits with LD/APD and LED/PIN, at the assumed set of parameters.

Figure 14. Variations of the transmission distance (km) for the digital optical link (single-mode fiber) with transmission data rate (Mbit/sec), for the NRZ and RZ dispersion limits, at the assumed set of parameters.

The total rise time depends on the transmitter rise time (t_tx), the group velocity dispersion rise time (t_GVD), the modal dispersion rise time (t_mod), and the receiver rise time (t_rx); the total rise time t_sys for the system is therefore [15]:

\[ t_{sys} = \left[\sum_{i=1}^{n} t_{i}^{2}\right]^{1/2} \qquad (22) \]

The total rise time of a digital optical link should not exceed 70% of the bit period for a non-return-to-zero (NRZ) code, and 35% of the bit period for a return-to-zero (RZ) code. Assuming both transmitter and receiver behave as first-order low-pass filters, the transmitter and receiver rise times are given by:

\[ t_{tx} = \frac{350}{B_{tx}},\qquad t_{rx} = \frac{350}{B_{rx}}\ \ \mathrm{nsec} \qquad (23) \]

where B_tx and B_rx are the transmitter and receiver bandwidths in MHz. The bandwidth B_M(L) due to modal dispersion of a digital optical link of length L is empirically given by:

\[ B_{M}(L) = \frac{B_{0}}{L^{q}} \qquad (24) \]

where B_0 is the bandwidth per km (MHz-km product) and 0.5 < q < 1 is the modal equilibrium factor. The modal dispersion rise time is then given by:

\[ t_{mod} = \frac{0.44}{B_{M}} = \frac{440\,L^{q}}{B_{0}}\ \ \mathrm{nsec} \qquad (25) \]

\[ t_{GVD} = D\,\sigma_{\lambda}\,L\ \ \mathrm{nsec} \qquad (26) \]

where D is the chromatic dispersion parameter (nsec/nm.km), σ_λ is the half-power spectral width of the source (nm), and L is the optical link distance in km. Therefore the total system rise time is given by [15]:

\[ t_{sys} = \left[t_{tx}^{2} + t_{rx}^{2} + D^{2}\sigma_{\lambda}^{2}L^{2} + \left(\frac{440\,L^{q}}{B_{0}}\right)^{2}\right]^{1/2}\ \ \mathrm{nsec} \qquad (27) \]

IV. RESULTS AND DISCUSSIONS

IV. 1. Wireless optical link

The main objective of the wireless optical link design is to get as much light as possible from one end to the other, in order to receive a stronger signal that would result in a higher link margin and greater link availability. Table 1 lists the proposed wireless optical link parameters used to achieve the maximum transmission link distance and transmission data rate.

TABLE 1. PROPOSED WIRELESS OPTICAL LINK DESIGN PARAMETERS.

Power transmitted (P_T): 100 mWatt
Operating wavelength range (λ): 0.85 µm to 1.55 µm
Transmitter beam divergence (θ): 115 degree
Receiver diameter (D_R): 0.1-0.5 m
Link distance range: 0.1 to 10 km
Receiver sensitivity (S_R) or power received: 2 µWatt
Transmitter and receiver losses (η): 50 %



Based on the assumed set of controlling parameters for the wireless optical link design to achieve the best transmission bit rates and transmission distances, and on the set of figures (5-10), the following facts are observed:

1) Fig. 5 indicates that as the transmission distance (visibility) increases, the signal attenuation decreases at the same optical signal wavelength, while as the optical signal wavelength increases, the signal attenuation decreases at the same transmission distance.

2) As shown in Fig. 6, as the beam diameter at the receiver increases, the ray losses also increase at the same lens diameter, while as the lens diameter increases, the ray losses decrease at the same beam diameter at the receiver.

3) Fig. 7 demonstrates that as the wireless optical link distance increases, the optical signal to noise ratio (OSNR) decreases at the same optical signal wavelength. Moreover, as the optical signal wavelength increases, the OSNR also increases at the same wireless optical link distance.

4) As shown in Fig. 8, as the transmitted radio frequency increases, the signal transmission also increases both with and without amplification, but amplification offers a higher signal transmission.

5) Figs. (9, 10) indicate that as the transmitted radio frequency increases, the transmission data rate also increases, both with and without amplification, at the same wireless link distance, while as the wireless link distance increases, the transmission data rate decreases at the same transmitted radio frequency. Moreover, amplification offers both a longer transmission link distance and a higher transmission data rate.

IV. 2. Wire optical cable link

The main goal is to develop a simple point-to-point digital wire optical cable link design, taking into account link power budget calculations and link rise time calculations. A link should satisfy both of these budgets for the required transmission distance and data rate at a given BER. The data transmission bit rate and the transmission distance are the major factors of interest in designing the digital wire optical cable link. Table 2 shows the proposed digital wire optical cable link parameters used to calculate both transmission distances and data rates.

TABLE 2. PROPOSED DIGITAL WIRE OPTICAL CABLE LINK DESIGN PARAMETERS.

Power transmitted (P_T): 100 mWatt
Power received (P_receiver): 2 µWatt
Fiber loss: 3.5 dB/km
Couplers [LED-PIN]: 1.5 dB
Bandwidth per km (B_0): 900 MHz-km
Modal equilibrium factor (q): 0.7
LED [σ_λ]: 50 nm
LD [σ_λ]: 1 nm
Couplers [LD-APD]: 8 dB
Material dispersion (D_mat): 0.07 nsec/nm.km

In the same way, based on the assumed set of controlling parameters for the wire optical cable link design to achieve the best transmission bit rates and transmission distances, and on the set of figures (11-14), the following facts are observed:

6) As shown in Fig. 11, as the transmission data rate increases, the transmission distance decreases at the same optical signal wavelength. Moreover, as the optical signal wavelength increases, the transmission distance also increases at the same transmission data rate.

7) Figs. (12, 13) demonstrate that as the transmission data rate increases, the transmission distance decreases at the same attenuation limit, for both LD/APD and LED/PIN and for both single-mode and multi-mode fibers, while as the attenuation limit with LD/APD or LED/PIN decreases, the transmission distance increases at the same transmission data rate.

8) Fig. 14 shows that as the transmission data rate increases, the transmission distance decreases at the same dispersion limit, for both return-to-zero (RZ) and non-return-to-zero (NRZ) codes. Moreover, as the dispersion limit for RZ or NRZ decreases, the transmission distance increases at the same transmission data rate for the single-mode fiber link.

V. CONCLUSIONS

In summary, we have investigated and analyzed the transmission performance characteristics of both digital wire and wireless optical links in local and wide area optical networks. We have demonstrated that the larger the optical signal wavelength, the longer the transmission distance for both wireless and wire digital optical links. Moreover, we have demonstrated that with amplification techniques, which add extra cost to the wireless system, the wireless optical link offers both long transmission distances and high transmission data rates. In the normal case (without amplification), the digital wire optical cable link offers higher transmission distances and data rates than the wireless optical link with amplification. Also, we have shown that LD/APD couplers offer greater maximum transmission distances and data rates than LED/PIN couplers. Therefore, it is evident that digital wire optical cable links offer the best performance in cost, transmission distance and transmission data rate over wireless optical links.


REFERENCES

[1] C. Lee, W. V. Sorin, and B. Y. Kim, "Fiber to the Home Using a PON Infrastructure," IEEE J. Lightwave Technology, Vol. 24, No. 2, pp. 4568-4583, 2006.
[2] R. E. Wagner, J. R. Igel, R. Whitman and M. D. Vaughn, "Fiber-Based Broadband-Access Deployment in the United States," IEEE J. Lightwave Technology, Vol. 24, No. 3, pp. 4526-4540, 2006.
[3] D. P. Shea and J. E. Mitchell, "Architecture to Integrate Multiple Passive Optical Networks (PONs) with Long Reach DWDM Backhaul," IEEE Journal on Selected Areas in Communications, Vol. 27, No. 2, pp. 126-133, Feb. 2009.
[4] Yong-Yuk Won, Hyuk-Choon Kwon, Moon-Ki Hong, and Sang-Kook Han, "1.25-Gb/s Wireline and Wireless Data Transmission in Wavelength Reusing WDM Passive Optical Networks," Microwave and Optical Technology Letters, Vol. 51, No. 3, pp. 627-629, March 2009.
[5] M. S. Ab-Rahman, H. Guna, M. H. Harun, and K. Jumari, "Cost-Effective Fabrication of Self-Made 1x12 Polymer Optical Fiber-Based Optical Splitters for Automotive Application," American J. of Engineering and Applied Sciences, Vol. 2, No. 2, pp. 252-259, 2009.
[6] D. Kedar and S. Arnon, "Urban Optical Wireless Communication Networks: The Main Challenges and Possible Solutions," IEEE Comm. Magazine, Vol. 42, No. 3, pp. 3-8, 2006.
[7] J. Schuster, H. Willebrand, S. Bloom, and E. Korevaar, "Understanding the Performance of Free Space Optics," Journal of Optical Networking, Vol. 3, No. 2, pp. 34-45, 2005.
[8] K. Kiasaleh, "Performance of APD-Based, PPM Free-Space Optical Communication Systems in Atmospheric Turbulence," IEEE Trans. Commun., Vol. 53, No. 2, pp. 1455-1461, 2005.
[9] J. Mietzner and P. Hoeher, "Boosting the Performance of Wireless Communication Systems: Theory and Practice of Multiple Antenna Technologies," IEEE Comm. Mag., Vol. 42, No. 3, pp. 40-46, 2007.
[10] E. I. Ackerman, G. E. Betts, W. K. Burns, J. C. Campbell, C. H. Cox, N. Duan, J. L. Prince, M. D. Regan, and H. V. Roussell, "Signal-to-Noise Performance of Two Photonic Links using Different Noise Reduction Techniques," IEEE Int. Microwave Symposium, pp. 51-56, 2008.
[11] T. Kamalakis, A. Tsipouras, and S. Pantazis, "Hybrid Free Space Optical/Millimeter Wave Outdoor Links for Broadband Wireless Access Networks," 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pp. 51-57, 2008.
[12] H. Refai and M. Atiquzzaman, "Comparative Study of the Performance of Analog Fiber Optic Links Versus Free Space Optical Links," J. of Optical Engineering, Vol. 45, No. 2, pp. 25-32, Feb. 2006.
[13] S. Lee, et al., "Pointing and Tracking Subsystem Design for Optical Communications Link Between the International Space Station and Ground," SPIE Proceedings, Vol. 33, pp. 5-9, 2003.
[14] A. F. Elrefaie, et al., "Chromatic Dispersion Limitations in Coherent Lightwave Transmission Systems," J. Lightwave Technology, Vol. 6, No. 5, pp. 704-709, May 2004.
[15] B. Smith, R. E. Spencer, D. C. Brown, and M. Bentley, "Optical Fibre Communications Between Radio Telescopes in the European VLBI Network," TMR-LSF RTD Sub-Project 4, June 2003.

Abd-Elnaser A. Mohammed received the Ph.D. degree from the Faculty of Electronic Engineering, Menoufia University, in 1994. He is currently an Associate Professor in the Electronics and Electrical Communication Engineering Department. His research interests include passive optical communication networks, analog-digital communication systems, optical systems, and advanced optical communication networks.

Ahmed Nabih Zaki Rashed was born in Menouf, Menoufia State, Egypt, in 1976. He received the B.Sc. and M.Sc. degrees from the Electronics and Electrical Communication Engineering Department, Faculty of Electronic Engineering, Menoufia University, in 1999 and 2005, respectively. He is currently working toward the Ph.D. degree in active and passive optical networks (PONs). His theoretical and practical research focuses on the transmission data rates and distances of optical access networks.


Automatic local Gabor features extraction for face recognition

Yousra BEN JEMAA
National Engineering School of Sfax, Signal and System Unit, Tunisia
[email protected]

Sana KHANFIR
National Engineering School of Sfax, Tunisia

Abstract—We present in this paper a biometric system for face detection and recognition in color images. The face detection technique is based on skin color information and fuzzy classification. A new algorithm is proposed in order to automatically detect the face features (eyes, mouth and nose) and extract their corresponding geometrical points. These fiducial points are described by sets of wavelet components which are used for recognition. To achieve face recognition, we use neural networks and we study their performance for different inputs. We compare the two types of features used for recognition, geometric distances and Gabor coefficients, which can be used either independently or jointly. This comparison shows that Gabor coefficients are more powerful than geometric distances. We show with experimental results how the high recognition rate makes our system an effective tool for automatic face detection and recognition.

Keywords—face recognition; feature extraction; Gabor wavelets; geometric features

I. INTRODUCTION

Face recognition is a very challenging area in computer vision and pattern recognition due to variations in facial expressions, poses and illumination. It is largely motivated by the need for access control, surveillance and security, telecommunication and digital libraries [10]. Face detection is the first stage of an automated face recognition system, since a face has to be located before it is recognized [11]. Consequently, the performance of the face detection step certainly affects the performance of the recognition system. The problem of face detection is very challenging due to the diverse variation of faces and the complexity of backgrounds in images. Different methods have been proposed; they can be classified into four categories: knowledge-based methods, feature-based methods, template-based methods and appearance-based methods [12]. We present a face detection technique based on skin color classification. When skin regions are detected, facial feature extraction attempts to find the most appropriate representation of the face images for recognition. There are mainly two approaches: general systems and geometric feature-based systems [10]. In general methods, faces are treated as a whole object. In order to reduce the dimensionality of the face representation, principal component analysis (PCA) or eigenfaces [8] and neural networks are extensively used. Geometrical methods are based on measures extracted between facial features such as eyes, mouth, nose and chin. Representative works include the hidden Markov model (HMM) [19], the elastic bunch graph matching algorithm [3] and local feature analysis. Geometrical techniques are usually computationally more expensive than global techniques but are more robust to variations (size, orientation, etc.). We use the two techniques: we first locate the feature points and then apply Gabor filters at each point in order to extract a set of Gabor wavelet coefficients. We use wavelet analysis because it can localize, in space-frequency, characteristics of images, and it can represent faces at different spatial resolutions and orientations [7] [17]. For example, Zhang [9] proposed a face recognition system based on a Gabor wavelet associative memory (a neural network) [6]. Wiskott [3] proposed a face recognition system based on elastic bunch graph matching, in which he represents the local features of faces (eyes, mouth, etc.) by sets of Gabor wavelets. The proposed face recognition system detects the face in an image and automatically localizes 10 fiducial points. After that, it characterizes the face either by the geometric distances between the extracted points or by applying a set of Gabor wavelets (filters) at each fiducial point. These two types of features can be used either independently or jointly. They are used as input data to neural networks for classification. The recognition performances with the different types of features are compared. The paper is organized as follows: the face detection system is described in section 2. Section 3 presents the algorithm of facial feature localization and the extraction of their characteristic points. The two types of features extracted from the face are described in section 4. The face recognition system based on neural networks is presented in section 5. Section 6 provides the experimental results and the comparison between the two types of features. Finally, we conclude in section 7.


II. THE FACE DETECTION SYSTEM

Face detection is a very important stage before face recognition: to identify a person, it is necessary to localize his face in the image. Our detection system supposes that only one face is present in the field of the camera. It is described in fig 1.

Figure 1. Steps of face detection

A. Preprocessing and skin color extraction

In the first step, an average filter (a low-pass filter) is applied to the image to attenuate the noise. In the second step, face detection is performed by the skin color technique, since it is invariant to changes in size, orientation and occlusion. Many color spaces have been proposed in the literature for skin detection [13]. We propose to use the YCbCr color space for our system, because the Cb and Cr (chrominance) components are largely independent of the person's skin tone and of the lighting conditions: it is in the YCbCr color space that the luminance is decoupled from the color information. Furthermore, Cb and Cr are the chrominance components used in MPEG and JPEG [20]. Fig 2 illustrates how the two chrominance planes separate the skin color from the background.

Figure 2. Conversion of an RGB image (a) into Cb (b) and Cr (c) space

In order to classify each pixel as skin or non-skin, the most suitable ranges that we found for all input images in the database are [4]: Cb in [77, 127] and Cr in [133, 173]. These ranges alone are not sufficient for a good classification. Fig 3 shows the limitation of this approach: a big part of the face is considered as background.

Figure 3. Classification of the image into skin region and non-skin region

To overcome the problem listed above, we propose to apply a fuzzy approach for pixel classification. This is a good solution since fuzzy set theory can represent and manipulate uncertainty and ambiguity [15]. We use the Takagi-Sugeno fuzzy inference system (FIS). This system is composed of two inputs (the two components Cb and Cr) and one output (the decision: skin or non-skin color). Each input has three sub-sets: light, medium and dark. Our algorithm uses fuzzy logic IF-THEN rules; these rules are applied to each pixel in the image in order to decide whether the pixel represents a skin or non-skin region [21]. The fuzzy logic rules applied for skin detection are the following:

1. IF Cb is Light AND Cr is Light THEN pixel = 0
2. IF Cb is Light AND Cr is Medium THEN pixel = 0
3. IF Cb is Light AND Cr is Dark THEN pixel = 0
4. IF Cb is Medium AND Cr is Light THEN pixel = 0
5. IF Cb is Medium AND Cr is Medium THEN pixel = 1
6. IF Cb is Medium AND Cr is Dark THEN pixel = 1
7. IF Cb is Dark AND Cr is Light THEN pixel = 0
8. IF Cb is Dark AND Cr is Medium THEN pixel = 1
9. IF Cb is Dark AND Cr is Dark THEN pixel = 0

The first step is to determine, for each input, the degree of membership in the appropriate fuzzy sets via membership functions. Once the inputs have been fuzzified, the final decision of the inference system is the average of the outputs (z_i, 1 or 0) corresponding to the rules (r_i), weighted by the normalized firing degrees p_i of the rules. Fig 4 represents the input image and the output one after YCbCr color space conversion and skin region extraction using fuzzy classification.

Figure 4. Classification of the image into skin region and non-skin region using the YCbCr space and fuzzy classification
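A minimal sketch of the Takagi-Sugeno skin classifier described above. The rule table is taken from the paper; the triangular membership functions and their light/medium/dark breakpoints are our assumptions, since the paper does not publish them.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function (shape parameters are assumed)."""
    return max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

# Assumed sub-set breakpoints for Cb and Cr ("light", "medium", "dark").
SETS = {"light": (0, 60, 120), "medium": (90, 128, 166), "dark": (136, 196, 255)}

# Rule table from the paper: output 1 only for (medium, medium),
# (medium, dark) and (dark, medium); all other combinations give 0.
SKIN_RULES = {("medium", "medium"), ("medium", "dark"), ("dark", "medium")}

def skin_degree(cb, cr):
    """Zero-order Takagi-Sugeno inference: weighted average of rule outputs."""
    num = den = 0.0
    for name_cb, p_cb in SETS.items():
        for name_cr, p_cr in SETS.items():
            w = tri(cb, *p_cb) * tri(cr, *p_cr)   # rule firing strength
            z = 1.0 if (name_cb, name_cr) in SKIN_RULES else 0.0
            num += w * z
            den += w
    return num / den if den else 0.0

# A pixel is classified as skin when the defuzzified output exceeds 0.5.
print(skin_degree(cb=100.0, cr=150.0))
```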


A comparison of figures 3 and 4 proves the importance of fuzzy logic in skin detection: the number of skin pixels detected by fuzzy logic is greater than with the classic classification. Skin regions are represented in white and non-skin regions in black. This step detects skin regions in images, but it is not sufficient to detect the external edge of the face.

B. Face localization and normalization

Canny edge detection [16] is selected in this process to detect the external edge of the face. The method differs from other edge detection methods in that it includes weak edges in the output only if they are connected to strong edges. After that, we extract the extreme points of this edge in the horizontal sense and the highest point of the face. The lowest point of the face is determined by assuming that the height of the face is about 1.3 times its width. Finally, the rectangle that delimits the face is defined by these four points, as shown in fig 5. A standard size for all face examples in the database is necessary for a relatively correct recognition operation; in this work, images have a standard size of 50×50.

Figure 5. Face detection steps: original image; skin detection and point localization; localized face.

The following step consists of extracting the face characteristic vector from the normalized face.
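A rough OpenCV sketch of these localization and normalization steps; the Canny thresholds are illustrative, and the skin mask is expected as an 8-bit binary image from the classification stage.

```python
import cv2
import numpy as np

def localize_face(skin_mask, image, ratio=1.3, size=(50, 50)):
    """Bound the face from a binary skin mask and normalize it to 50x50."""
    edges = cv2.Canny(skin_mask, 50, 150)        # external edge of the face
    ys, xs = np.nonzero(edges)
    if xs.size == 0:
        return None
    left, right = xs.min(), xs.max()             # extreme points, horizontal sense
    top = ys.min()                               # highest point of the face
    bottom = top + int(ratio * (right - left))   # face height ~ 1.3 x width
    face = image[top:bottom, left:right]
    return cv2.resize(face, size)                # standard 50x50 face image
```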

III. AUTOMATIC FACIAL FEATURE LOCALIZATION AND EXTRACTION OF THE FACE CHARACTERISTIC POINTS

The facial features used in our system of face recognition are: eyes, mouth and nose. The system is described in fig 6.

Figure 6. System of features localization.

Analysis of the chrominance components indicates that the eyes present high values of Cb and small values of Cr [5]. So, we first extract the chrominance component Cb and then find its maximal value. We then apply a threshold in order to obtain a black and white image. The zone of the eyes is divided into two equal parts; after that, we find in each zone the white stain of maximal area. Finally, we apply a dilation by a flat disk-shaped structuring element. We extract four points representing the external points of each maximal-area stain representing an eye. We also determine the center of gravity of these white stains, which represents the center of each eye. In total, we obtain 6 points characterizing the two eyes. The region of the mouth is in the bottom part of the face. Since the chrominance component Cr depends on the red color [5], we use it to localize the mouth. In order to localize its geometrical points, we apply a Sobel filter to detect its contour and then extract its extreme points [2]. After the detection of the mouth and eyes, the localization of the nose becomes straightforward since it is located between them. Its geometrical points are also extracted using a Sobel filter [2]. Fig 7 shows the experimental results when applying our system of features localization.

Figure 7. Facial features localization and extraction of the characteristic points
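The eye-localization step could be sketched as follows; the 0.9 threshold factor, the 5x5 disk, and the restriction of the search to the upper half of the face are our illustrative choices, not the paper's published values.

```python
import cv2
import numpy as np

def eye_centers(face_bgr):
    """Locate the two eye centers from the Cb plane, as described above."""
    ycrcb = cv2.cvtColor(face_bgr, cv2.COLOR_BGR2YCrCb)
    cb = ycrcb[:, :, 2].astype(np.float32)       # OpenCV channel order: Y, Cr, Cb
    mask = (cb >= 0.9 * cb.max()).astype(np.uint8) * 255
    disk = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.dilate(mask, disk)                # flat disk-shaped dilation
    h, w = mask.shape
    centers = []
    for x0 in (0, w // 2):                       # left / right eye zones
        half = mask[: h // 2, x0 : x0 + w // 2]  # eyes lie in the upper face
        n, _, stats, cents = cv2.connectedComponentsWithStats(half)
        if n > 1:
            biggest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
            cx, cy = cents[biggest]              # center of gravity of the stain
            centers.append((cx + x0, cy))        # back to full-image coordinates
    return centers
```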

IV. FACIAL FEATURES

A. Geometric distances

The elements of the facial feature vector represent significant distances between all extracted points. As shown in fig 8, these distances are:

• Deye: The mean of the two distances P1P2 and P3P4.

• Dcenter_ eye: The distance between the centers of the two eyes: P5P6.

• Dinterior_ eye: Distance between the two eyes: P2P3.

• Dnose: The width of the nose: P9P10.

• Deye_nose: The height of the nose.

• Dmouth: The width of the mouth: P7P8.

• Dnose_mouth: The distance between the mouth and the nose.



The facial feature vector is then: V = [Dcenter_eye; Deye; Dinterior_eye; Dnose; Deye_nose; Dmouth; Dnose_ mouth]

Figure 8. Components of facial features vector

This vector represents each person in a unique way, so it is used in the recognition step.
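A small sketch assembling V from the ten extracted points. The indexing of P1..P10 and the use of vertical distances for D_eye_nose and D_nose_mouth are assumptions made for illustration.

```python
import numpy as np

def feature_vector(pts):
    """Build V from the ten fiducial points P1..P10 (list indices 0..9).

    Assumed numbering: P1..P4 eye corners, P5/P6 eye centers,
    P7/P8 mouth corners, P9/P10 nose sides.
    """
    d = lambda i, j: float(np.linalg.norm(np.subtract(pts[i], pts[j])))
    eyes_y = (pts[4][1] + pts[5][1]) / 2.0
    nose_y = (pts[8][1] + pts[9][1]) / 2.0
    mouth_y = (pts[6][1] + pts[7][1]) / 2.0
    return np.array([
        d(4, 5),                    # D_center_eye : P5P6
        (d(0, 1) + d(2, 3)) / 2.0,  # D_eye        : mean of P1P2 and P3P4
        d(1, 2),                    # D_interior_eye : P2P3
        d(8, 9),                    # D_nose       : P9P10
        abs(nose_y - eyes_y),       # D_eye_nose   : height of the nose
        d(6, 7),                    # D_mouth      : P7P8
        abs(mouth_y - nose_y),      # D_nose_mouth
    ])
```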

B. Gabor features

Since feature extraction for face recognition using Gabor filters is reported to yield good results [10], we use a Gabor-wavelet-based feature extraction technique here. We use the following family of two-dimensional Gabor kernels [1] [14]:

\[ g(x,y) = \exp\left(-\frac{x'^{2} + \gamma^{2} y'^{2}}{2\sigma^{2}}\right)\cos\left(2\pi\frac{x'}{\lambda} + \varphi\right), \qquad x' = x\cos\mu + y\sin\mu, \quad y' = -x\sin\mu + y\cos\mu \]

where (x, y) specify the position of a light impulse in the visual field, and µ, ϕ, γ, λ, σ are the parameters of the wavelet (orientation, phase offset, spatial aspect ratio, wavelength, and envelope width, respectively). We have chosen the same parameters used by Wiskott [3] (TABLE I).

TABLE I. PARAMETERS OF GABOR WAVELETS

A set of Gabor filters is used with 5 spatial frequencies and 8 distinct orientations, which makes 40 different Gabor filters, represented in fig 9.

Figure 9. The Gabor filters [3]

When convolving these Gabor filters with a simple face image, we obtain the filter responses. We find that these representations display the desired locality and orientation selectivity. We have selected 5 sets of Gabor filters with different orientations (TABLE II).

TABLE II. SETS OF GABOR FILTERS FOR DIFFERENT ORIENTATIONS

When Gabor filters are applied to each pixel of the image, the dimension of the filtered vector can be very large (proportional to the image dimension), which leads to expensive computation and storage costs. To alleviate this problem and make the algorithm robust, Gabor features are computed only at the ten extracted fiducial points. If we denote by C the number of Gabor filters, each point is represented by a vector of C components called a "jet". Applying the Gabor filters to all fiducial points yields a jet vector of 10 × C real coefficients characterizing the face.
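A minimal sketch of the jet extraction: real Gabor kernels of the form given above are evaluated at each fiducial point. The five wavelengths, the 31x31 kernel size and the sigma = 0.56*lambda envelope are assumed stand-ins for the Wiskott parameters of Table I, which are not reproduced in the text.

```python
import numpy as np

def gabor_kernel(lam, theta, phi=0.0, gamma=0.5, sigma=None, size=31):
    """Real 2-D Gabor kernel with the parameterization given above."""
    if sigma is None:
        sigma = 0.56 * lam               # common bandwidth choice (assumption)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2)) \
         * np.cos(2 * np.pi * xr / lam + phi)

def jets(image, points, lams=(4, 6, 8, 11, 16), n_orient=8, size=31):
    """Jet vector: responses of 5x8 = 40 filters at each fiducial point."""
    half = size // 2
    pad = np.pad(image.astype(np.float32), half, mode="edge")
    feats = []
    for lam in lams:
        for k in range(n_orient):
            g = gabor_kernel(lam, theta=k * np.pi / n_orient, size=size)
            for (px, py) in points:              # (column, row) pixel positions
                patch = pad[py:py + size, px:px + size]
                feats.append(float((patch * g).sum()))
    return np.asarray(feats)                     # 10 points x 40 filters = 400
```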

V. FACE RECOGNITION

Face recognition is performed by a non-linear classifier: a neural network. The advantage of using neural networks for recognition is the feasibility of training a system under very complex conditions (rotation, lighting). However, the network architecture has to be varied (number of layers and nodes) to obtain good performance. We use a multi-layer perceptron architecture [18]. In order to describe the proposed architecture, we consider P


different persons, and hence P classes: there are as many classes as persons to identify. For each person, we have different samples obtained by rotation, by translation and by variation of the lighting. The classifier is composed of a set of neural networks, one per person to identify (P is the number of classes in the database). Each neural network has N input nodes corresponding to the N extracted parameters and one output node which can be active (output = 1) or inactive (output = 0). The number of extracted parameters N is fixed when choosing the types of face features. In our study, we have used three types of features (a minimal code sketch of this architecture is given after Fig. 10):

• Geometric distances between fiducial points.

• Gabor coefficients.

• Combined information between Gabor coefficients and geometric distances.

Figure 10. Architecture of the proposed neural network
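A compact sketch of the one-network-per-person architecture, using scikit-learn's MLPClassifier as a stand-in for the paper's multi-layer perceptrons; the hidden-layer size and iteration budget are illustrative choices.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

class OnePerClassNet:
    """One small binary MLP per person, as in the architecture above."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.nets_ = []
        for c in self.classes_:
            net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000)
            net.fit(X, (y == c).astype(int))   # output active only for class c
            self.nets_.append(net)
        return self

    def predict(self, X):
        # Identify each sample with the network whose output is most active.
        scores = np.column_stack([n.predict_proba(X)[:, 1] for n in self.nets_])
        return self.classes_[np.argmax(scores, axis=1)]
```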

VI. EXPERIMENTAL RESULTS

A. Face database

To test the performance of our recognition system, we carried out experiments on the FERET database [22]. The database is split at random into a training subset and a testing subset, with three different proportions:

• 60% of the database used for training and 40% for testing,

• 50% used for training and 50% for testing,

• 30% used for training and 70% for testing.

Each of these percentages has been tested on five random combinations of face samples.

B. Performance of our method: comparison of different features

Experimental results obtained when varying the system parameters are shown in tables III-V. In particular, we give the average recognition rate of our system using, respectively, geometric distances, Gabor coefficients, and the fusion of the two features.

TABLE III. AVERAGE RECOGNITION RATE USING GEOMETRIC DISTANCES

TABLE IV. AVERAGE RECOGNITION RATE USING THE GABOR COEFFICIENTS

TABLE V. AVERAGE RECOGNITION RATE USING THE FUSION OF GABOR COEFFICIENTS AND GEOMETRIC DISTANCES

Gabor wavelets can represent feature points at specific spatial frequencies and different orientations. Therefore, the recognition rates given by the neural network increase when the number of Gabor wavelets and the size of the training set increase, as shown in fig 11.


Figure 11. The recognition rate using (1) first row: the Gabor coefficients (2) second row: the Gabor coefficients and geometric distances.

Figure 12. The recognition rate versus the number of Gabor wavelets for different facial feature vectors.

According to fig 12, we deduce that:

• Gabor coefficients are much more powerful than geometric distances.

• Among the three types of features, the fusion of Gabor coefficients and geometric distances achieves the highest recognition rate because it combines the merits of Gabor wavelets and local features.

• The more the number of Gabor wavelets increases, the more the recognition rate increases.

VII. CONCLUSION

We propose a new system for human face detection and recognition. A contribution of the method is the use of fuzzy classification in order to detect faces. Another contribution is the use of Gabor wavelets to represent a face, since they can represent images at different frequencies and orientations. A face is represented by its own Gabor coefficients computed at the fiducial points (points of the eyes, mouth and nose). We have implemented and studied a neural network architecture trained with three characteristic vectors: the first is composed of the geometrical distances between the fiducial points, automatically extracted by our system; the second is composed of the responses of the Gabor wavelets applied at the fiducial points; and the third is composed of the combined information of the previous two vectors. A comparative study between them shows that the third type of features achieves the highest recognition rate (99.98%) for faces at different degrees of rotation and different lighting, thanks to the fusion of Gabor wavelets and local features. Our future work concerns applying the proposed approach, Gabor wavelets at the fiducial points, to 3D faces in order to take more features into account for better recognition.

REFERENCES
[1] N. Petkov and P. Kruizinga, "Computational models of visual neurons specialised in the detection of periodic and aperiodic oriented visual stimuli: bar and grating cells", Biological Cybernetics, pp. 83-96, 1997.
[2] S. Khanfir and Y. Ben Jemaa, "Automatic facial features extraction for face recognition by neural networks", International Symposium on Image/Video Communications over fixed and mobile networks (ISIVC), 2006.
[3] L. Wiskott, J. M. Fellous, N. Kruger and C. V. D. Malsburg, "Face Recognition by Elastic Bunch Graph Matching", Intelligent Biometric Techniques in Fingerprint and Face Recognition, Chapter 11, pp. 355-396, 1999.
[4] T. Sawangsri, V. Patanavijit and S. S. Jitapunkul, "Face segmentation using novel skin-color map and morphological technique", Transactions on Engineering, Computing and Technology, vol. 2, December 2004.
[5] R. L. Hsu, M. A. Mottaleb and A. K. Jain, "Face detection in color images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 696-706, May 2002.
[6] B. L. Zhang, H. Zhang and S. S. Ge, "Face recognition by applying wavelet subband representation and kernel associative memory", IEEE Transactions on Neural Networks, vol. 15, no. 1, January 2004.
[7] C. Lui and H. Wechsler, "Independent component analysis of Gabor features for face recognition", IEEE Transactions on Neural Networks, vol. 14, no. 4, July 2003.
[8] P. Belhumeur, J. Hespanha and D. Kriegman, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, July 1997.
[9] H. Zhang, B. Zhang, W. Huang and Q. Tian, "Gabor wavelet associative memory for face recognition", IEEE Transactions on Neural Networks, vol. 16, no. 1, January 2005.
[10] R. Chellappa, C. L. Wilson and S. Sirohey, "Human and machine recognition of faces: a survey", Proceedings of the IEEE, vol. 83, pp. 705-740, May 1995.
[11] M. H. Yang, D. Kriegman and N. Ahuja, "Detecting faces in images: a survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34-58, 2002.
[12] M. H. Yang, N. Ahuja and D. Kriegman, "Detecting faces in color images", IEEE Conference on Image Processing, pp. 127-139, October 1998.
[13] A. Albiol, L. Torres and E. Delp, "An unsupervised color image segmentation algorithm for face detection applications", IEEE Conference on Image Processing, October 2001.
[14] http://en.wikipedia.org/wiki/Gabor-filter


[15] X. Q. Li, Z. W. Zhao, H. D. Cheng, C. M. Huang and R. W. Harris, "A fuzzy logic approach to image segmentation", International Conference on Pattern Recognition, pp. 337-341, October 1994.
[16] J. Canny, "A computational approach to edge detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679-698, 1986.
[17] T. S. Lee, "Image representation using 2D Gabor wavelets", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 10, pp. 859-970, 1996.
[18] H. Rowley, S. Baluja and T. Kanade, "Neural network-based face detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, January 1998.
[19] F. Samaria, "Face segmentation for identification using Hidden Markov Models", British Machine Vision Conference, 1993.
[20] H. Wang and S. F. Chang, "A highly efficient system for automatic face region detection in MPEG video", IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, no. 4, pp. 615-628, August 1997.
[21] M. Ben Hmida and Y. Ben Jemaa, "Fuzzy classification, image segmentation and shape analysis for human face detection", IEEE Conference on Electronics, Circuits and Systems (ICECS), pp. 640-643, December 2006.
[22] http://www.itl.nist.gov/iad/humanid/feret/feret_master.html


Intelligent Advisory System for Supporting University Managers in Law

A. E. E. ElAlfi Dept. of Computer Science

Mansoura University Mansoura Egypt, 35516

[email protected]

M. E. ElAlami Dept. of Computer Science

Mansoura University Mansoura Egypt, 35516

[email protected]

Abstract— The rights and duties of both staff members and students are regulated by a large number of different legal regulations and rules. This large number of rules and regulations makes the decision-making process time consuming and error prone. Smart advisory systems can provide rapid and accurate advice to managers and give the arguments for this advice. This paper presents an intelligent advisory system in law to assist the legal educational processes in universities and institutes. The aims of the system are: to provide smart legal advice in universities and institutes; to integrate the rules and regulations of universities and institutes into e-government; to ease the burden on the legal advisor and provide consulting services to users; to achieve accurate and timely presentation of the legal opinion on a given problem; and to assure flexibility in accepting changes in the rules and legal regulations. The system is based on experienced jurists and on the rules and regulations of the law organizing Saudi Arabian universities and institutes.

Keywords: decision support systems, advisory systems, rule-based systems, university rules and regulations, e-government.

I. INTRODUCTION Decision making, often viewed as a form of reasoning

towards action, has raised the interest of many scholars, including philosophers, economists, psychologists, and computer scientists, for a long time. Any decision problem aims to select the "best" or sufficiently "good" action(s) that are feasible among different alternatives, given some available information about the current state of the world and the consequences of potential actions [1]. Advisory systems provide advice and assistance for solving problems that are normally solved by human experts. They can be classified as a type of expert system [2,3]. Both advisory systems and expert systems are problem-solving packages that mimic a human expert in a special area. These systems are constructed by eliciting knowledge from human experts and coding it into a form that can be used by a computer in the evaluation of alternative solutions to problems within that domain of expertise. Advisory systems do not make decisions but rather help guide the decision maker in the decision-making process, while leaving the final decision-making authority up to the human user [4]. The decision maker works in collaboration with the advisory system to

identify problems that need to be addressed, and to iteratively evaluate the possible solutions to unstructured decisions. For example, a manager of a firm could use an advisory system that helps assess the impact of a management decision on firm value [5] or an oncologist can use an advisory system to help locate brain tumors [6]. In these two examples, the manager and the oncologist are ultimately (and legally) accountable for any decisions/diagnoses made. Traditionally rule-based expert systems operate best in structured decision environments, since solutions to structured problems have a definable right answer, and the users can confirm the correctness of the decision by evaluating the justification provided by explanation facility [7]. Luger [8] has presented some limitations of current expert systems.

Advisory systems are designed to support decision making in more unstructured situations which have no single correct answer. In unstructured situations cooperative advisory systems that provide reasonable answers to a wide range of problems are more valuable and desirable than expert systems that produce correct answers to a very limited number of questions [9].

Advisory systems support decisions that can be classified as either intelligent or unstructured, and are characterized by novelty, complexity, and open-endedness [10]. In addition to these characteristics, contextual uncertainty is ubiquitous in unstructured decisions, which when combined exponentially increases the complexity of the decision-making process. Because of the novel antecedents and lack of definable solution, unstructured decisions require the use of knowledge and cognitive reasoning to evaluate alternative courses of action to find one that has the highest probability of desirable outcome [11]. The more context-specific knowledge acquired by the decision maker in these unstructured decision-making situations, the higher the probability that they will achieve the desirable outcome [4].

The decision-making process that occurs when users utilize advisory systems is similar to that of the judge-advisor model developed in organizational behavior [12,13]. Under this model, there is a principal decision maker who solicits advice from many sources. However, the decision maker "holds the ultimate authority for the final decision and is made accountable for it" [14]. The judge-advisor model suggests that decision makers are motivated to seek advice from others for decisions that are important, unstructured, and involve uncertainty.


Universities have made great strides in many areas related to e-government systems, but legal advice to decision makers in universities still depends largely on human legal advisors. Fortunately, law rules are fertile ground for building knowledge-based systems that can serve as high-level advisors in law [15]. The paper is organized as follows: Section 2 presents the system design and development. Section 3 presents a case study. Section 4 is devoted to system flexibility and merits. The paper ends with concluding remarks and perspectives, summarizing the obtained results and proposing problems for future work.

II. SYSTEM DESIGN AND DEVELOPMENT The intelligent advisory system (IAS) must provide

assistance for the decision making process. Its aim is to capture the expertise in a form that others can use, and to act as an operational guide without limiting the independent exploration of the user.

The three main processes in advisory systems are knowledge acquisition, cognition, and interface. The user interface allows users to access the IAS and includes multiple windows to visualize how the main parameters interrelate with each other. Input data such as certificates, student grade, age etc., are introduced through the user

interface. After the input details have been entered, output parameters, such as whether the student is accepted or rejected, are displayed. Advice messages are provided to the decision maker during the decision-making process. They indicate the next action to be performed every time the IAS program is executed. These messages appear in windows until the decision-making process constraints are satisfied.

A. Proposed system architecture and design
The iterative support of advisory systems in the decision-making process is shown in figure 1. Knowledge is acquired by knowledge engineers from the experts and from the documents of rules and regulations. The cognition is inferred by the inference engine. The system has a monitoring agent to identify unstructured decisions that need to be addressed. The decision maker uses the user interface to communicate with the system. There is an explanation facility to display the arguments for any decision. These are displayed in figure 1 as the flow of information from domain variables to the inference engine. If environmental domain variables exceed expected norms, the system notifies the user that there is a situation which needs to be addressed and begins the iterative decision-making process by offering a suggested course of action.

Figure 1. Proposed advisory system architecture
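A minimal Python sketch of the monitoring-agent behaviour just described: watch domain variables and, when one exceeds its expected norm, start the iterative advice loop. All names here (DomainVariable, suggest_action, refine, and the interface objects) are illustrative assumptions, not the paper's implementation.

from dataclasses import dataclass

@dataclass
class DomainVariable:
    name: str
    value: float
    norm: float  # expected upper bound for this variable

def monitor(variables, inference_engine, user_interface):
    for var in variables:
        if var.value > var.norm:
            # Notify the user that a situation needs to be addressed.
            user_interface.notify(f"Situation needs attention: {var.name}")
            advice = inference_engine.suggest_action(var)
            # Iterate until the decision maker accepts a course of action.
            while not user_interface.accept(advice):
                advice = inference_engine.refine(advice, user_interface.feedback())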

B. Cognition
Problem solving varies in its external factors, including problem type and representation, and in the internal characteristics of the problem solver. Structured, simple problems can be solved with regular rules and principles. They have knowable, comprehensible solutions, where the relationship between decision choices and all problem states is known or probabilistic. Unstructured, complex problems possess multiple solutions, multiple solution paths, or no solution at all. An unstructured problem possesses multiple criteria for evaluating solutions, so it is uncertain which concepts, rules, and principles are necessary for its solution and how they should be organized. It is often necessary for problem solvers to make judgments and express personal opinions or beliefs about the problem; unstructured problems are thus uniquely human and interpersonal activities. Therefore, the frame or scenario-based case representation is suitable for well-structured problem solving, since the rules and principles of problem solving are well defined. This means that similar cases retrieved based on certain inputs or states can be applied to new problems. One of the knowledge acquisition frames, designed for the appointment of a demonstrator in the university, is shown in figure 2.


Knowledge acquisition frame 1: Appointment of the Demonstrator
  Certificates (Bachelor): Yes/No
  Equivalent: Yes/No
  University (recognized): Yes/No
  Estimation: good or higher
  Study period: 4, 5, 6 or 7
  Other conditions: age ... the health situation ... marital status
  Conditions of the Council of the dept.: ...
  Conditions of the Council of the faculty: ...
  Conditions and exceptions of the Council of the university: ...some medical specializations...
  Steady Committee for the appointment of repeaters, lecturers, language teachers, researchers assistants, recommendation: Yes/No
  The opinion and recommendation of the University Council: appoint the person
  Domain expert: Name .......... Signature ( )

Figure 2. Frame for problem solving in the appointment of demonstrator

Different knowledge acquisition frames are designed to acquire knowledge in the different regulations of the university. The next stage is the knowledge representation.

C. Representation of knowledge
One important class of architectural properties revolves around the representation of knowledge. Semantic networks encode both generic and specific knowledge in a declarative format that consists of nodes (for concepts or entities) and links (for relations between them). Figure 3 shows the semantic network for the acceptance of a new student in the university. Frames and schemas offer structured declarative formats that specify concepts in terms of attributes (slots) and their values (fillers).

Figure 3. Semantic network for the acceptance of students in university

Table 1 shows the frames representing the semantic network shown in figure 3.

TABLE I. THE FRAMES OF STUDENTS IN UNIVERSITY

Frame name                 Slot          Slot value
Student                    Has a         Behavior
Student                    Has a         Certificate (education)
Student                    Has a         Job
Student                    Get           Personal interview
Student                    Get           Health status
Behavior                   Decision is   Not or OK
Certificate                Is            Up to date
Personal interview         Decision is   Not or OK
Health status              Decision is   Not or OK
Job                        Belongs to    Affiliation
Affiliation                Approve       The study in university
The study in university    Decision is   Not or OK
OK                         Give the      Legal authority
Not                        Give the      Legal authority
Legal authority            -             -

The flowchart shown in figure 4 explains the decisions applied for the acceptance of a new student in the university, according to the rules in the study and testing regulation.

Figure 4. Flowchart for accepting a student in the university
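As an illustration only (not part of the paper's implementation), the (frame, slot, value) triples of Table 1 could be encoded as plain Python dictionaries, mirroring the semantic network of figure 3:

# Each frame maps slot names to slot values; multi-valued slots use lists.
student_frames = {
    "Student": {
        "has_a": ["Behavior", "Certificate (education)", "Job"],
        "get": ["Personal interview", "Health status"],
    },
    "Behavior": {"decision_is": "Not or OK"},
    "Certificate": {"is": "Up to date"},
    "Personal interview": {"decision_is": "Not or OK"},
    "Health status": {"decision_is": "Not or OK"},
    "Job": {"belongs_to": "Affiliation"},
    "Affiliation": {"approve": "The study in university"},
    "The study in university": {"decision_is": "Not or OK"},
    "OK": {"give_the": "Legal authority"},
    "Not": {"give_the": "Legal authority"},
}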


The knowledge base is implemented using CLIPS [16]. A sample of the rules included in the knowledge base is given in figure 5.

(defmodule MAIN (export ?ALL))

(defglobal ?*Decision_OK* = 0) ; 0 = no selection, 1 = true selection, 2 = false selection
(defglobal ?*Decision_Causes* = "")
(defglobal ?*Decision_Law_Text* = "")
(defglobal ?*Decision_Law_Link* = "")

; definition of CLASSES
(defclass MAIN::Final_Decision (is-a USER)
  (role concrete)
  (pattern-match reactive)
  (slot Decision_OK (create-accessor read-write) (type INTEGER)) ; 0 = no selection, 1 = true, 2 = false
  (slot Decision_Causes (create-accessor read-write) (type STRING))
  (slot Decision_Law_Text (create-accessor read-write) (type STRING))
  (slot Decision_Law_Link (create-accessor read-write) (type STRING)))

; ===== General rules =====
(defrule MAIN::List_Focus_01
  (List 01 ?n)
  =>
  (switch ?n
    (case 01 then (focus LIST_01_01))
    (case 02 then (focus LIST_01_02))
    (case 03 then (focus LIST_01_03))
    (case 04 then (focus LIST_01_04))
    (case 05 then (focus LIST_01_05))
    (case 06 then (focus LIST_01_06))))

(defrule MAIN::ConverFacts
  (SelGUI ?idx ?val ?ena ?stl ?tag)
  =>
  (assert (Sel ?idx ?val ?ena ?stl ?tag)))

(defmodule LIST_01_01 (import MAIN ?ALL) (export ?ALL))

; case of student acceptance
(defrule LIST_01_01::00
  (declare (salience 100))
  (Sel ? ?val ?ena ?stl ?tag)
  =>
  (bind ?*Decision_Causes* "accept student")
  (bind ?*Decision_Causes* (str-cat ?*Decision_Causes* " The differentiation between applicants, who apply to them all the conditions and according to their grades in the secondary school certificate test, personal interview and admission tests if any. "))
  (bind ?*Decision_Law_Text* "|rule 3|rule 4")
  (bind ?*Decision_Law_Link* "102-1-3|102-1-4"))

(defrule LIST_01_01::99
  (declare (salience -90))
  (Sel ? ?val ?ena ?stl ?tag)
  =>
  (make-instance CaseDecision of Final_Decision
    (Decision_OK ?*Decision_OK*)
    (Decision_Causes ?*Decision_Causes*)
    (Decision_Law_Text ?*Decision_Law_Text*)
    (Decision_Law_Link ?*Decision_Law_Link*)))

Figure 5. Samples of rules in the Knowledge base

III. CASE STUDY The higher education and universities council's law and

its executive regulations in Saudi Arabia form a multi-criteria system. It consists of 8 regulations, each of which includes more than 7 subsystems. The numbers of rules in the regulations are listed in table 2.

TABLE II. RULES AND REGULATIONS OF HIGHER EDUCATION AND UNIVERSITIES COUNCIL'S LAW

No.  Regulation Name                                                Number of rules in regulation
1    Study and testing                                              53
2    Financial Affairs                                              52
3    The employment of non-Saudis in the universities               60
4    Scholarships and training for the associates of universities   41
5    Affairs of graduate study                                      68
6    Saudi university employees                                     106
7    Scientific Research                                            51
8    Scientific societies                                           51

Figure 6. The main window of the proposed advisory system

The user interface of the proposed system is shown in figure 6. The decision-making process in any subsidiary regulation needs a series of queries. The answer to each query has a binary value, yes or no. The answer in each case is followed by a decision or by another query. All of these answers are displayed in a main window and sometimes in an accompanying dialogue window (exceptions). A part of this system is shown in figure 7. The figure shows the decision and


the rules that led to it. Many other windows have been developed for each criterion in the project.

Figure 7. The decision, exceptions and the arguments window (showing the decision "Accept a new student" together with the rules that cause the decision)

IV. SYSTEM FLEXIBILITY AND MERITS Flexibility has become a key characteristic desired in

both software systems and business processes. Software system flexibility is a two-dimensional construct composed of structural and process flexibility. Structural flexibility is the capability of the design and organization of a software application to be successfully adapted to business changes. Process flexibility is the ability of people to make changes to the technology using management processes that support business changes. The determinants of structural and process flexibility are based on measures of flexibility in the behavioral psychology and software engineering literature [17]. Change acceptance, modularity, and consistency are the measures used for structural flexibility in the proposed system. Change acceptance is the degree to which a system contains built-in capacity for change. Modularity is the degree of formal design separation within a software system. Consistency is the degree to which data and components are integrated consistently across the software. The proposed system includes the possibility of amending data that may change in future, which assures change acceptance. The system includes three main modules (scholarships and training, employment of non-Saudis, and studies and tests), which assures the system's modularity. Figure 8 shows both the change acceptance and the system modularity. The system consistency is assured by integrating the entire regulation and system definition of the Higher Education and Universities Council's Law and its Executive Regulations in Saudi Arabia, as shown in figure 6. The process flexibility is measured by rate of response, expertise, and coordination of action. The proposed system accepts changes in a timely manner, which satisfies a high rate of response. One of the major advantages of the proposed advisory system is its

ability to keep its knowledge up to date, which satisfies the expertise requirement.

Figure 8. The setting window for the regulation

V. CONCLUSION AND FUTURE WORK
Intelligent advisory systems support decision makers in different domains, especially in law. This paper presents an intelligent advisory system based on the executive regulations and rules that govern universities and institutes. The system provides legal advice to managers in universities and institutes. It does not substitute for human advisors in law, but it alleviates the burden placed upon them. The advice is given automatically, together with the legal causes and arguments. The system includes a database which consists of a large number of rules and regulations. It is also flexible enough to accept new settings without affecting the knowledge base. Our future work will concentrate on adapting the system to work online with different languages. Also, we will add additional knowledge in other domains to assist university managers.

REFERENCES
[1] Leila Amgoud and Henri Prade, "Using arguments for making and explaining decisions", Artificial Intelligence, vol. 173, pp. 413-436, 2009.

[2] Forslund, G., "Toward Cooperative Advice-Giving Systems: A Case Study in Knowledge-Based Decision Support", IEEE Expert, pp. 56-62, 1995.

[3] Vanguard Software Corporation, Decision Script, 2006. Accessed via www.vanguardsw.com/decisionscript/jgeneral.htm

[4] Aronson, J. and E. Turban, Decision Support Systems and Intelligent Systems. Upper Saddle River, NJ: Prentice-Hall, 2001.

[5] Magni, C.A., S. Malagoli and G. Mastroleo, “An Alternative Approach to Firms’ Evaluation: Expert Systems and Fuzzy Logic,” Int J Int Tech Decis, 5(1), 2006.

[6] Demir, C., S.H. Gultekin and B. Yener, “Learning the Topological Properties of Brain Tumors,” IEEE ACM T Comput, 1(3), 2005.


[7] Gefen, D., E. Karahanna and D.W. Straub, “Trust and TAM in Online Shopping: An Integrated Model,” MIS Quart, 27(1), 2003.

[8] Luger G., Artificial Intelligence: Structures and Strategies for Complex Problem Solving. Addison Wesley, 2005.

[9] Gregg, D. and S. Walczak, “Auction Advisor: Online Auction Recommendation and Bidding Decision Support System,” Decis Support Syst, 41(2), 2006.

[10] Mintzberg, H., D. Raisinghani and A. Theoret, “The Structure of ‘Unstructured’ Decision Processes,” Admin Sci Quart, 21(2), 1976.

[11] Chandler, P. R. and M. Pachter, “Research Issues in Autonomous Control of Tactical UAVs,” in Proceedings of the American Control Conference, 1998.

[12] Sniezek J.A. and T. Buckley, “Cueing and Cognitive Conflict in Judge-Advisor Decision Making,” Organ Behav Hum Dec, 62(2), 1995.

[13] Arendt, L.A., R.L. Priem and H.A. Ndofor, “A CEO-Advisor Model of Strategic Decision Making,” J Manage, 31(5), 2005.

[14] Sniezek, J.A., “Judge Advisor Systems Theory and Research and Applications to Collaborative Systems and Technology,” in Proceedings of the 32nd Hawaii International Conference on System Sciences, 1999.

[15] University of Texas System, http://www.utsystem.edu/News/mission.htm

[16] Joseph C. Giarratano, "CLIPS User's Guide", version 6.2, March 31, 2002.

[17] Nelson, K. M. and Ghods, M., "Evaluating the Contributions of a Structured Software Development and Maintenance Methodology", Information Technology and Management, Winter 1998.


A Hop-by-Hop Congestion-Aware Routing Protocol for Heterogeneous Mobile Ad-hoc Networks

S.Santhosh baboo PG & Research Department of computer application

DG Vaishnav College, Arumbakkam Chennai, India

B.Narasimhan Department of BCA

K.G.College of Arts& Science, Saravanam Patti Coimbatore-35, India

[email protected]

Abstract—In heterogeneous mobile ad hoc networks (MANETs), congestion occurs with limited resources. Due to the shared wireless channel and dynamic topology, packet transmissions suffer from interference and fading. In heterogeneous ad hoc networks, the throughput via a given route depends on the minimum data rate of all its links. In a route of links with various data rates, if a high data rate node forwards more traffic to a low data rate node, there is a chance of congestion, which leads to long queuing delays on such routes. Since hop count is used as the routing metric in traditional routing, it does not adapt well to mobile nodes. A congestion-aware routing metric for MANETs should incorporate transmission capability, reliability, and the congestion around a link. In this paper, we propose a hop-by-hop congestion-aware routing protocol which employs a combined weight value as a routing metric, based on the data rate, queuing delay, link quality and MAC overhead. Among the discovered routes, the route with the minimum cost index, based on the node weights of all the in-network nodes, is selected. Simulation results show that our proposed routing protocol attains high throughput and packet delivery ratio, while reducing packet drop and delay.

Keywords-MANETs; Routing Protocol; Overhead; Congestion.

I. INTRODUCTION

A. Heterogeneous Ad-hoc Wireless Networks
An ad hoc network, also called an infrastructureless network, is a collection of mobile nodes that forms a temporary network without the help of any central administration or the standard support devices regularly available in conventional networks. Mobile ad hoc wireless networks can establish networks at any time and anywhere, which gives them great promise for the future. Because these networks do not depend on pre-existing hardware, they are ideal candidates for rescue and emergency operations. The constituent wireless nodes of these networks build, operate and maintain them. Each node asks for the help of its neighboring nodes to forward packets, because nodes usually have only a limited transmission range.

A homogeneous ad hoc network suffers from poor scalability because network performance degrades quickly as the number of nodes increases. In realistic ad hoc networks, the nodes are usually heterogeneous. For example, in a battlefield network, portable wireless devices are carried by soldiers, while more powerful and reliable communication devices are carried by vehicles, tanks, aircraft, and satellites; these devices/nodes have different communication characteristics in terms of transmission power, data rate, processing capability, reliability, etc. Hence it is more realistic to model these network elements as different types of nodes [1]. Nodes in such heterogeneous networks may transmit at different power levels, which gives rise to communication links of varying ranges.

B. Routing in Mobile Ad-hoc Wireless Networks
Specially designed routing protocols are employed in order to establish routes between nodes that are more than a single hop apart. The ability to trace routes in spite of a dynamic topology is the distinctive feature of these protocols. These protocols can be categorized into two main types: reactive (on-demand) and proactive (table-driven). Proactive protocols evaluate routes continuously within the network, so when a packet needs to be forwarded the route is already known and can be used immediately. Reactive protocols invoke a route determination procedure on demand only.

C. Congestion in Mobile Ad-hoc Wireless Networks
In mobile ad hoc networks (MANETs), congestion occurs with limited resources. Due to the shared wireless channel and dynamic topology, packet transmissions in such networks suffer from interference and fading. Transmission errors further burden the network load. Recently, there has been an increasing demand for support of multimedia communications in MANETs. Large amounts of real-time traffic involve high bandwidth and are liable to congestion. Congestion leads to packet losses and bandwidth degradation, and wastes time and energy on congestion recovery.

II. RELATED WORK
Xiaoqin Chen et al. [2] have proposed a congestion-aware routing metric which employs data-rate, MAC overhead, and buffer queuing delay, with preference given to less congested, high-throughput links to improve channel utilization; they also proposed the Congestion-Aware Routing protocol for Mobile ad hoc networks (CARM). CARM applies a link data-rate categorization approach to prevent routes with mismatched link data-rates. CARM was discussed and simulated only in relation to IEEE 802.11b networks; however, it can be applied to any multi-rate ad hoc network.


Ming Yu et al. [3] have proposed a link availability-based QoS-aware (LABQ) routing protocol for mobile ad hoc networks based on mobility prediction and link quality measurement, in addition to energy consumption estimate. They have provided highly reliable and better communication links with energy-efficiency.

Yung Yi and Sanjay Shakkottai [4] have developed a fair hop-by-hop congestion control algorithm in which the MAC constraint is imposed in the form of a channel access time constraint, using an optimization-based framework. In the absence of delay, they showed that this algorithm is globally stable using a Lyapunov-function-based approach; in the presence of delay, they showed that the hop-by-hop control algorithm has the property of spatial spreading.

R. Asokan et al. [5] have extended the scope to a QoS routing procedure that informs the source about the QoS available to any destination in the wireless network. However, existing QoS routing solutions deal with only one or two of the QoS parameters. It is important that MANETs provide QoS routing support, such as acceptable delay, jitter and energy, for multimedia and real-time applications. They have proposed a QoS Dynamic Source Routing (DSR) protocol using Ant Colony Optimization (ACO), called Ant DSR (ADSR).

Lei Chen and Wendi B. Heinzelman [6] have proposed a QoS-aware routing protocol that incorporates an admission control scheme and a feedback scheme to meet the QoS requirements of real-time applications. The novel part of this QoS-aware routing protocol is the use of approximate bandwidth estimation to react to network traffic. They implemented these schemes using two bandwidth estimation methods to find the residual bandwidth available at each node to support new streams.

Chenxi Zhu and M. Scott Corson [7] have developed a QoS routing protocol for ad hoc networks using TDMA. Their objective was to establish bandwidth-guaranteed QoS routes in small networks whose topologies change at a low to medium rate. The protocol is based on AODV and builds QoS routes only as needed. They start with the problem of calculating the available bandwidth on a given route, develop an efficient algorithm for it, and then use the algorithm in conjunction with AODV to perform QoS routing.

Duc A. Tran and Harish Raghavendra [8] have proposed CRP, a congestion-adaptive routing protocol for MANETs. CRP tries to prevent congestion from occurring in the first place, rather than dealing with it reactively. A key element of the CRP design is the bypass concept. A bypass is a sub-path connecting a node and the next non-congested node. If a node is aware of a potential congestion ahead, it finds a bypass to be used in case the congestion actually occurs. Part of the incoming traffic is sent on the bypass, reducing the traffic arriving at the potentially congested node; the congestion is avoided as a result.

RamaChandran and Shanmugavel [11] have proposed and studied three cross-layer designs among physical, medium access control and routing (network) layers, using Received

Signal Strength (RSS) as cross-layer interaction parameter for energy conservation, unidirectional link rejection and reliable route formation in mobile ad hoc networks.

Jitendra Padhye et al. [12] have considered the problem of estimating pairwise interference among links in a multi-hop wireless testbed. Using experiments done in a 22-node, 802.11-based testbed, they have shown that some of the previously proposed heuristics for predicting pairwise interference are inaccurate. They then proposed a simple, empirical methodology to estimate pairwise interference using only O(n^2) measurements. They have shown that their methodology accurately predicts pairwise interference among links in their testbed in a variety of settings. Their methodology is applicable to any 802.11-based wireless network where nodes use omni-directional antennas.

Xinsheng Xia et al. [14] have introduced a method for cross-layer design in mobile ad hoc networks. They have used fuzzy logic system (FLS) to coordinate physical layer, datalink layer and application layer for cross-layer design. Ground speed, average delay and packets successful transmission ratio were selected as antecedents for the FLS. The output of FLS has provided adjusting factors for the AMC (Adaptive Modulation and Coding), transmission power, retransmission times and rate control decision.

Congestion in mobile ad hoc networks leads to transmission delays and packet loss, and causes wastage of time and energy on recovery. Routing protocols which are adaptive to the congestion status of a mobile ad hoc network can greatly improve the network performance. Xiaoqin Chen et al. [15] have proposed a congestion-aware routing protocol for mobile ad hoc networks which has used a metric incorporating data-rate, MAC overhead, and buffer delay to combat congestion. This metric was used, together with the avoidance of mismatched link data-rate routes, to make mobile ad hoc networks robust and adaptive to congestion.

Ming Yu et al. [16] have proposed a link availability-based QoS-aware (LABQ) routing protocol for mobile ad hoc networks based on mobility prediction and link quality measurement, in addition to an energy consumption estimate. Their goal is to provide highly reliable, better communication links with energy efficiency. The protocol also reduces the average end-to-end delay for data transfer and enhances the lifetime of nodes by making energy-efficient routing decisions.

III. PROPOSED WORK

A. Congestion Control in Mobile Ad hoc Networks
Congestion in wireless networks is slightly different from that in wired networks. The following are the general causes of congestion:

1. The throughput of all nodes in a particular area gets reduced because many nodes within range of one another attempt to transmit simultaneously, resulting in losses.

2. The queue or buffer used to hold packets to be transmitted overflows within a particular node. This also causes losses.


3. Moreover, in a heterogeneous network, different data-rates will almost certainly lead to some routes having links with quite different data-rates. Packets will build up at the node heading the lower data-rate link, which leads to long queuing delays. Link reliability is an additional cause of congestion: if a link breaks, congestion is increased due to packet retransmissions [2].

Since hop count is used as the routing metric in traditional routing, it does not adapt well to mobile nodes. To counter node mobility, many existing routing schemes use message exchanges, such as hello packets. These schemes do not change the routes unless a link is broken, rather than taking precautions to make sure the link does not break [3].

B. Protocol Overview A congestion-aware routing metric for MANETs should

incorporate transmission capability, reliability, and congestion around a link.

In this paper, we propose to develop a Hop-by-hop congestion aware routing protocol which employs the following routing metrics:

• Data-rate
• Buffer queuing delay
• Link quality
• MAC overhead

Preference is given to less congested, high-throughput links to improve channel utilization.

In our proposed routing protocol, after estimating the above metrics, a combined weight value is calculated for each node. We select a multipath on-demand routing protocol, which discovers multiple disjoint routes from a source to a destination, as our basis. Among the discovered routes, the route with the minimum cost index, based on the node weights of all the in-network nodes on the path from the source node to the destination node, is selected. A node's cost index is calculated in a backward-propagating way: the cost indices of a node's possible downstream neighbors are obtained from the feedback of its downstream neighbors.

C. Link Quality Estimation
To be able to see that a node is moving and a route is about to break, we rely on the fact that communication is based on radio signals. It is therefore possible to measure the quality of the signal and, based on that, estimate whether the link is about to break. The physical layer can use this to indicate to the upper layer, when a packet is received from a host sending with a signal strength lower than a specific value, that the node is in the pre-emptive zone [9], [10]. Thus, using the received signal strength from the physical layer, link quality can be predicted, and links with low signal strength are discarded from the route selection.

When a sending node broadcasts an RTS packet, it piggybacks its transmission power $P_t$. On receiving the RTS packet, the intended node measures the received signal strength $P_r$, which, for the free-space propagation model, satisfies [11]

$P_r = P_t \cdot (\lambda / 4\pi d)^2 \cdot G_t \cdot G_r$    (1)

where $\lambda$ is the carrier wavelength, $d$ is the distance between sender and receiver, and $G_t$ and $G_r$ are the unity gains of the transmitting and receiving omnidirectional antennas, respectively. The effects of noise and fading are not considered.

So, the link quality is

$L_q = P_r$    (2)
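A small numeric sketch of equations (1)-(2) in Python: received power under the free-space model, used directly as the link quality L_q. The frequency, gains and the values in the usage line are illustrative assumptions only.

import math

def received_power(p_t, d, freq_hz=2.4e9, g_t=1.0, g_r=1.0):
    """P_r = P_t * (lambda / (4*pi*d))**2 * G_t * G_r  -- equation (1)."""
    wavelength = 3e8 / freq_hz
    return p_t * (wavelength / (4 * math.pi * d)) ** 2 * g_t * g_r

def link_quality(p_t, d):
    """L_q = P_r  -- equation (2)."""
    return received_power(p_t, d)

print(link_quality(0.1, 100.0))  # e.g. 100 mW transmitted over 100 m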

D. Estimating MAC Overhead In this network, we consider IEEE 802.11 MAC with the

distributed coordination function (DCF). It uses the packet sequence request-to-send (RTS), clear-to-send (CTS), data, acknowledge (ACK). The amount of time between the receipt of one packet and the transmission of the next is called a short inter-frame space (SIFS). The channel occupation due to MAC contention is then

$C_{occ} = t_{RTS} + t_{CTS} + 3 t_{SIFS}$    (3)

where $t_{RTS}$ and $t_{CTS}$ are the times consumed on RTS and CTS, respectively, and $t_{SIFS}$ is the SIFS period.

The MAC overhead $OH_{MAC}$ can then be represented as

$OH_{MAC} = C_{occ} + t_{acc}$    (4)

where $t_{acc}$ is the time taken due to access contention.

The amount of MAC overhead depends mainly on the medium access contention and the number of packet collisions; that is, $OH_{MAC}$ is strongly related to the congestion around a given node. $OH_{MAC}$ can become relatively large if congestion is incurred and not controlled, and it can dramatically decrease the capacity of a congested link.
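A sketch of equations (3)-(4) in Python. The 802.11b timing constants below are commonly cited values and are assumptions on our part, not taken from the paper.

T_SIFS = 10e-6   # SIFS period (s), 802.11b
T_RTS = 352e-6   # time to transmit an RTS at 1 Mbps incl. PLCP overhead (s)
T_CTS = 304e-6   # time to transmit a CTS at 1 Mbps incl. PLCP overhead (s)

def mac_overhead(t_acc):
    """OH_MAC = C_occ + t_acc, with C_occ = t_RTS + t_CTS + 3*t_SIFS."""
    c_occ = T_RTS + T_CTS + 3 * T_SIFS   # equation (3)
    return c_occ + t_acc                 # equation (4)

# t_acc grows with contention, so OH_MAC rises around congested nodes.
print(mac_overhead(t_acc=2e-3))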

E. Estimating End to End Delay There is a significant variation between the end-to-end

delay reported by RREQ-RREP measurements and the delay experienced by actual data packets. We address this issue by introducing a DUMMY-RREP phase during route discovery. The source saves the RREP packets it receives in a RREP TABLE and then takes the RREP for a route from this table to send a stream of DUMMY data packets along the path traversed by this RREP. DUMMY packets efficiently imitate real data packets on a particular path, since they have the same size, priority and data rate as real data packets. Let H be the hop count reported by the RREP; the number of packets in every stream is 2H. The destination computes the average delay $D_{avg}$ of all DUMMY packets received, which is sent


through a RREP to the source. The source selects this route and sends data packets only when the average delay reported by this RREP is inside the bound requested by the application. When the delay exceeds the required limit, the source performs a linear back-off and sends the DUMMY stream on a different route selected from its RREP TABLE.
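A schematic Python sketch of this DUMMY-RREP phase: probe a saved route with 2H dummy packets, accept it if the averaged delay is within the bound, and otherwise fall back to the next saved route. The helpers send_dummy_stream and rrep_table are hypothetical placeholders, not the paper's API.

def probe_route(rrep, delay_bound, rrep_table):
    """Return an accepted path, or None if no saved route meets the bound."""
    n_packets = 2 * rrep.hop_count                        # stream size is 2H
    avg_delay = send_dummy_stream(rrep.path, n_packets)   # D_avg, returned via RREP
    if avg_delay <= delay_bound:
        return rrep.path                                  # route accepted for data
    # Linear back-off, then try the next route saved in the RREP TABLE.
    next_rrep = rrep_table.next_candidate()
    return probe_route(next_rrep, delay_bound, rrep_table) if next_rrep else None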

F. Estimating Data Rate
In heterogeneous ad hoc networks, the throughput through a given route depends on the minimum data rate of all its links. In a route of links with various data rates, if a high data rate node forwards more traffic to a low data rate node, there is a chance of congestion, which leads to long queuing delays on such routes.

Since congestion significantly reduces the effective bandwidth of a link, the effective link data-rate is given by

$D_{rate} = D_{Size} / C_{delay}$    (5)

where $D_{Size}$ is the data size and $C_{delay}$ is the channel delay.

IV. CONGESTION AWARE ROUTING PROTOCOL (CARP) CARP is an on-demand routing protocol that aims to create

congestion-free routes by making use of information gathered from the MAC layer. CARP employs a combined weight metric in its standard cost function to account for the congestion level.

For establishing multiple disjoint paths, we adapt the idea from the Adhoc On-Demand Multipath Distance Vector Routing (AOMDV) [13]. The multiple paths are computed during the route discovery.

We now calculate the node weight metric $NW$, which assigns a cost to each link in the network. The node weight combines the link quality $L_q$, the MAC overhead $OH_{MAC}$, the effective data rate $D_{rate}$ and the average delay $D_{avg}$, so as to select maximum-throughput paths while avoiding the most congested links.

For an intermediate node $i$ with established transmissions to several of its neighbours, the $NW$ for the link from node $i$ to a particular neighbouring node is given by

$NW = (L_q \times D_{rate}) / (OH_{MAC} \times D_{avg})$    (6)
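A small numeric sketch of equations (5) and (6), assuming the ratio form of (6) shown above: good links (high quality, high effective data rate) receive a large weight, congested links (high MAC overhead, high average delay) a small one. The numbers in the usage comment are illustrative only.

def effective_data_rate(d_size, c_delay):
    """D_rate = D_Size / C_delay  -- equation (5)."""
    return d_size / c_delay

def node_weight(l_q, d_rate, oh_mac, d_avg):
    """NW = (L_q * D_rate) / (OH_MAC * D_avg)  -- equation (6), assumed ratio form."""
    return (l_q * d_rate) / (oh_mac * d_avg)

# Example (hypothetical values):
# nw = node_weight(l_q=1e-9, d_rate=effective_data_rate(512, 0.002),
#                  oh_mac=7e-4, d_avg=0.05)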

A. Route Request
Consider the following scenario, with the route

$S - N_1 - N_2 - N_3 - D$

To initiate congestion-aware route discovery, the source node $S$ sends a RREQ. When the intermediate node $N_1$ receives the RREQ packet, it first estimates all the metrics described in the previous section. Node $N_1$ then calculates its node weight $NW_{N_1}$ using (6):

$N_1 \xrightarrow{RREQ,\ NW_{N_1}} N_2$

$N_2$ then calculates its weight $NW_{N_2}$ in the same way and adds it to the weight of $N_1$. $N_2$ then forwards the RREQ packet with this accumulated weight:

$N_2 \xrightarrow{RREQ,\ NW_{N_1} + NW_{N_2}} N_3$

Finally, the RREQ reaches the destination node $D$ with the sum of the node weights:

$N_3 \xrightarrow{RREQ,\ NW_{N_1} + NW_{N_2} + NW_{N_3}} D$

B. Route Reply
The destination node $D$ sends the route reply packet RREP, along with the total node weight, to the immediate upstream node $N_3$:

$D \xrightarrow{RREP,\ NW_{N_1} + NW_{N_2} + NW_{N_3}} N_3$

Now $N_3$ calculates its cost $C$ based on the information in the RREP as

$C_{N_3} = (NW_{N_1} + NW_{N_2} + NW_{N_3}) - (NW_{N_1} + NW_{N_2})$    (7)

Proceeding in the same way, each intermediate host calculates its cost. On receiving the RREPs from all the routes, the source selects the route with the minimum cost value.
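A schematic Python sketch of this RREQ/RREP exchange: each hop adds its node weight to the RREQ on the forward path, and on the way back each node recovers its cost as the difference in equation (7). The weight values are illustrative.

def forward_rreq(route_weights):
    """Accumulate node weights hop by hop; returns the running sums."""
    totals, acc = [], 0.0
    for w in route_weights:          # weights of N1, N2, ..., Nk
        acc += w
        totals.append(acc)
    return totals                    # the destination sees totals[-1]

def cost_at_hop(total, upstream_sum):
    """C_Ni = total - upstream_sum, cf. equation (7)."""
    return total - upstream_sum

weights = [0.4, 0.7, 0.5]            # NW_N1, NW_N2, NW_N3 (illustrative)
totals = forward_rreq(weights)
c_n3 = cost_at_hop(totals[-1], totals[-2])   # equals NW_N3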

V. SIMULATION RESULTS

A. Simulation Model and Parameters
We use NS2 to simulate our proposed protocol. In our simulation, the channel capacity of the mobile hosts is set to the same value: 2 Mbps. We use the distributed coordination function (DCF) of IEEE 802.11 for wireless LANs as the MAC layer protocol. It has the functionality to notify the network layer about link breakage.

In our simulation, 50 mobile nodes move in a 1500 meter x 300 meter rectangular region for 100 seconds of simulation time. We assume each node moves independently with the same average speed. All nodes have the same transmission range of 250 meters. The speed is set to 5 m/s. The simulated traffic is constant bit rate (CBR). The pause time of the mobile nodes is varied as 0, 10, 20, 30 and 40 s.

Our simulation settings and parameters are summarized in table 1.


TABLE I. SIMULATION SETTINGS

No. of Nodes       50
Area Size          1500 X 300
MAC                802.11
Radio Range        250 m
Simulation Time    100 sec
Traffic Source     CBR
Packet Size        512
Mobility Model     Random Way Point
Speed              5 m/s
Pause Time         0, 10, 20, 30 and 40

B. Performance Metrics
We compare our CARP protocol with the AOMDV [13] protocol. We evaluate the performance mainly according to the following metrics, varying the pause time as 0, 10, 20, 30 and 40 s.

Control overhead: The control overhead is defined as the total number of routing control packets normalized by the total number of received data packets.

Average end-to-end delay: The end-to-end-delay is averaged over all surviving data packets from the sources to the destinations.

Average Packet Delivery Ratio: It is the ratio of the number of packets received successfully and the total number of packets sent

Throughput: It is the number of packets received successfully.

Drop: It is the number of packets dropped.

C. Results
A. Based on Pause Time
In our initial experiment, we vary the pause time as 0, 10, 20, 30 and 40 s.

Figure 1. Pause time vs. throughput (throughput in packets vs. pause time in seconds; CARP and AOMDV)

Figure 2. Pause time vs. packets dropped (packets dropped vs. pause time in seconds; CARP and AOMDV)

Figure 3. Pause time vs. overhead (overhead in packets vs. pause time in seconds; CARP and AOMDV)

Figure 4. Pause time vs. delivery ratio (delivery ratio vs. pause time in seconds; CARP and AOMDV)


Figure 5. Pause time vs. end-to-end delay (delay in seconds vs. pause time in seconds; CARP and AOMDV)

Figure 1 gives the throughput of both protocols as the pause time is increased. As we can see from the figure, the throughput is higher for CARP than for AOMDV.

From Figures 2 and 3, we can see that the packets dropped and the control overhead are lower for CARP than for AOMDV.

Figure 4 presents the packet delivery ratio of both protocols. Since the packet drop is lower and the throughput is higher, CARP achieves a better delivery ratio than AOMDV.

From Figure 5, we can see that the average end-to-end delay of the proposed CARP protocol is lower than that of the AOMDV protocol.

B. Based On Number of Nodes In the second experiment, we vary the number of nodes as

25, 50, 75 and 100.

Figure 6. Nodes vs. overhead (CARP and AOMDV)

Figure 7. Nodes vs. delivery ratio (CARP and AOMDV)

Figure 8. Nodes vs. delay in seconds (CARP and AOMDV)

From Figure 6, we can see that the control overhead is lower for CARP than for AOMDV.

Figure 7 presents the packet delivery ratio of both protocols. From the figure, we can observe that CARP achieves a better delivery ratio than AOMDV.

From Figure 8, we can see that the average end-to-end delay of the proposed CARP protocol is lower than that of the AOMDV protocol.

VI. CONCLUSION In heterogeneous mobile ad hoc networks (MANETs)

congestion occurs with limited resources, and the throughput via a given route depends on the minimum data rate of all its links. In a route of links with various data rates, if a high data rate node forwards more traffic to a low data rate node, there is a chance of congestion, which leads to long queuing delays on such routes. Traditional routing protocols, using hop count as a routing metric, do not adapt well to mobile nodes. So there is a need for a congestion-aware routing metric which incorporates transmission capability, reliability, and the congestion around a link.

In this paper, we have developed a hop-by-hop congestion aware routing protocol which employs a combined weight value as a routing metric, based on the data rate, queuing


delay, link quality and MAC overhead. We have used a multipath on-demand routing protocol, which discovers multiple disjoint routes from a source to a destination, as our basis. Among the discovered routes, the route with the minimum cost index, based on the node weights of all the in-network nodes from the source node to the destination node, is selected. Simulation results show that our proposed routing protocol attains high throughput and packet delivery ratio, while reducing packet drop and delay.

REFERENCES
[1] Xiaojiang (James) Du, Dapeng Wu, Wei Liu and Yuguang Fang, "Multiclass Routing and Medium Access Control for Heterogeneous Mobile Ad Hoc Networks", IEEE Transactions on Vehicular Technology, vol. 55, no. 1, January 2006.
[2] Xiaoqin Chen, Haley M. Jones and A. D. S. Jayalath, "Congestion-Aware Routing Protocol for Mobile Ad Hoc Networks", in Proceedings of the IEEE Conference on Vehicular Technology, pp. 21-25, 2007, doi:10.1109/VETECF.2007.21.
[3] Ming Yu, Aniket Malvankar, Wei Su and Simon Y. Foo, "A link availability-based QoS-aware routing protocol for mobile ad hoc sensor networks", Computer Communications, vol. 30, no. 18, pp. 3823-3831, 2007.
[4] Yung Yi and Sanjay Shakkottai, "Hop-by-Hop Congestion Control Over a Wireless Multi-Hop Network", IEEE/ACM Transactions on Networking, vol. 15, no. 1, February 2007.
[5] R. Asokan, A. M. Natarajan and C. Venkatesh, "Ant Based Dynamic Source Routing Protocol to Support Multiple Quality of Service (QoS) Metrics in Mobile Ad Hoc Networks", International Journal of Computer Science and Security, vol. 2, no. 3, pp. 48-56, May/June 2008.
[6] Lei Chen and Wendi B. Heinzelman, "QoS-Aware Routing Based on Bandwidth Estimation for Mobile Ad Hoc Networks", IEEE Journal on Selected Areas in Communications, vol. 23, no. 3, March 2005.
[7] Chenxi Zhu and M. Scott Corson, "QoS routing for mobile ad hoc networks", in Proceedings of the Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 2, pp. 958-967, 2002, doi:10.1109/INFCOM.2002.1019343.
[8] Duc A. Tran and Harish Raghavendra, "Congestion Adaptive Routing in Mobile Ad Hoc Networks", IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 11, November 2006.
[9] Mads Østerby Jespersen, Kenneth-Daniel Nielsen and Jacob Frølund, "Optimising performance in AOMDV with pre-emptive routing", Technical Report, May 2003.
[10] Tom Goff, Nael B. Abu-Ghazaleh, Dhananjay S. Phatak and Ridvan Kahvecioglu, "Preemptive Routing in Ad Hoc Networks", Journal of Parallel and Distributed Computing, vol. 63, no. 2, pp. 123-140, 2003.
[11] RamaChandran and Shanmugavel, "Received Signal Strength-based Cross-layer Designs for Mobile Ad Hoc Networks", IETE Technical Review, vol. 25, no. 4, pp. 192-200, August 2008.
[12] Jitendra Padhye, Sharad Agarwal, Venkata N. Padmanabhan, Lili Qiu, Ananth Rao and Brian Zill, "Estimation of Link Interference in Static Multi-hop Wireless Networks", in Proceedings of the 5th ACM SIGCOMM Conference on Internet Measurement, Berkeley, CA, 2005.
[13] Marina and Das, "On-demand multipath distance vector routing in ad hoc networks", in Proceedings of the Ninth International Conference on Network Protocols, pp. 14-23, 2001.
[14] Xinsheng Xia, Qingchun Ren and Qilian Liang, "Cross-layer design for mobile ad hoc networks: energy, throughput and delay-aware approach", in Proceedings of the IEEE Conference on Wireless Communications and Networking, vol. 2, pp. 770-775, 2006.
[15] Xiaoqin Chen, Haley M. Jones and Jayalath, "Congestion-Aware Routing Protocol for Mobile Ad Hoc Networks", IEEE 66th Vehicular Technology Conference, pp. 21-25, October 2005.
[16] Ming Yu, Aniket Malvankar, Wei Su and Simon Y. Foo, "A link availability-based QoS-aware routing protocol for mobile ad hoc sensor networks", Computer Communications, vol. 30, pp. 3823-3831, 2007.

AUTHORS PROFILE

Lt. Dr. S. Santhosh Baboo, aged forty, has around seventeen years of postgraduate teaching experience in Computer Science, which includes six years of administrative experience. He is a member of the board of studies in several autonomous colleges, and designs the curriculum of undergraduate and postgraduate programmes. He is a consultant for starting new courses, setting up computer labs, and recruiting lecturers for many colleges. Equipped with a Masters degree in Computer Science and a Doctorate in Computer Science, he is a visiting faculty to IT companies. It is customary to see him at several national/international conferences and training programmes, both as a participant and as a resource person. He has been keenly involved in organizing training programmes for students and faculty members. His good rapport with the IT companies has been instrumental in on/off campus interviews, and has helped the postgraduate students to get real-time projects. He has also guided many such live projects. Lt. Dr. Santhosh Baboo has authored a commendable number of research papers in international/national conferences/journals and also guides research scholars in Computer Science. Currently he is a Senior Lecturer in the Postgraduate and Research Department of Computer Science at Dwaraka Doss Goverdhan Doss Vaishnav College (accredited at 'A' grade by NAAC), one of the premier institutions in Chennai.

B. Narasimhan completed his undergraduate degree at Annamalai University and his postgraduate and Master of Philosophy degrees at the School of Computer Science and Engineering, Bharathiar University. He is currently pursuing his Ph.D. in Computer Science at Dravidian University, Kuppam, Andhra Pradesh. He is working as a Lecturer in the Department of BCA, KG College of Arts and Science. He has more than one year of research experience and six months of teaching experience. His research interests include Mobile Ad-Hoc Networks and Soft Computing.


Enhanced Algorithm for Link to System Level Interface Mapping

Shahid Mumtaz, Atílio Gameiro and Rasool Sadeghi
Institute of Telecommunications, Aveiro, Portugal
[email protected], [email protected]

Abstract—The current SINR reporting mechanism does not provide the base station (BS) with any knowledge of the frequency selectivity of the channel from the mobile service station (MSS). This knowledge is important since, contrary to the AWGN channel, in a frequency selective channel there is no longer a one-to-one relation between the amount of increase in power and the amount of improvement in "effective SINR". Furthermore, the relation depends on the MCS level. This lack of knowledge on the BS side results in larger fade margins, which translates directly into a reduction in capacity. In this paper we propose an enhanced algorithm based on the EESM model with weighted beta (β) that provides the BS with sufficient knowledge of the channel-dependent relationship between power increase, MCS change and improvement in effective SINR.

Keywords—SINR, EESM, channel, BS, MSS

I. INTRODUCTION

A great deal can be learned about an air interface technology by analyzing its performance in a link level setting consisting of one base station and one mobile station. This link level analysis is of fundamental importance for the evaluation of the technologies associated with the given air interface, namely for the study of the variation of the Bit Error Rate (BER) with the Signal to Noise Ratio (SNR) per bit sent along the transmission chain, under the influence of such an aggressive medium for signal transmission as the wireless mobile channel. In the real world, where multiple base stations are deployed in a service area and operate in the presence of a large number of active mobile users, the system performance can only be evaluated through a system-level analysis, where the point-to-point radio link communication scenario is replaced by one in which all radio links among the mobile and base stations must be considered. Typically, network simulations are divided into two parts: link and system level simulations. Although a single simulator approach would be preferred, the complexity of such a simulator (covering everything from transmitted waveforms to a multi-cell network) is far too high given the required simulation resolutions and simulation times. Therefore, separate link and system level simulations are needed. Typically, the link level simulator is able to predict

the receiver Frame Erasure Rate/Bit Error Rate (FER/BER) performance, taking into account channel estimation, interleaving and decoding, and is needed to build a model for the system level simulator, which in turn is needed to model a system with a large number of mobile and base stations and the algorithms operating in such a system. In system level simulations, we focus on making transmission adaptations to optimize system performance and on getting a better understanding of the user performance in various deployment scenarios. For complexity reasons, system level evaluations have to rely on simplified Physical (PHY) layer models that still must be accurate enough to capture the essential behavior. The modeling method of the link layer is therefore essential. The block error rate (BLER) performance versus signal to interference and noise ratio (SINR), averaged over all channel realizations of one specific channel model, has been widely used as the interface between the PHY- and system-level simulators. But in many cases, the specific channel realization encountered may perform significantly differently from the average. Consequently, many novel modeling approaches accounting for the instantaneous channel and interference conditions have been brought forward, such as CESM (Capacity based Effective SINR Mapping), EESM (Exponential Effective SINR Mapping) and MIESM (Mutual Information based Effective SINR Mapping). The abstraction method adopted in this paper is based on the EESM algorithm over the link layer of the 802.16e system, and some modifications have been made to improve the performance prediction accuracy. This paper is organized as follows. Section II explains the simulation scenario used in this paper, Section III explains the link layer abstraction based on EESM, Section IV presents the enhanced algorithm based on EESM, and conclusions are presented in Section V.

II. SIMULATION SCENARIO

The link level simulation chain is shown in Figure 1; it is written in SystemC and runs under the Linux platform with gcc compiler 1.95 or later. Randomization is a process to systematically or randomly reorder the transmitted data. It is employed to minimize the possibility of transmission of an un-modulated



carrier and to ensure an adequate number of bit transitions to support clock recovery. Randomization is achieved by XORing the data blocks with a pseudo-random binary sequence (PRBS) generated using a certain polynomial [1]. Another purpose of randomization is to encrypt the transmitted data blocks to prevent any unintended receiver from decoding the data.
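A minimal Python sketch of this scrambling step is shown below. It assumes the 15-bit LFSR generator polynomial 1 + x^14 + x^15 commonly cited for the 802.16 randomizer; the seed shown is an arbitrary placeholder, not the standard's initialization vector.

```python
def prbs_bits(n_bits, seed=0b101010101010101):
    """Generate n_bits of a PRBS from a 15-bit LFSR with taps at
    stages 14 and 15 (polynomial 1 + x^14 + x^15, assumed here)."""
    reg = [(seed >> i) & 1 for i in range(15)]   # LFSR state, reg[0] = stage 1
    out = []
    for _ in range(n_bits):
        bit = reg[13] ^ reg[14]                  # XOR of stages 14 and 15
        out.append(bit)
        reg = [bit] + reg[:-1]                   # shift in the feedback bit
    return out

def randomize(data_bits, seed=0b101010101010101):
    """XOR a data block with the PRBS, as done before FEC encoding."""
    return [d ^ p for d, p in zip(data_bits, prbs_bits(len(data_bits), seed))]

# De-randomizing with the same seed recovers the original block
block = [1, 0, 1, 1, 0, 0, 1, 0]
assert randomize(randomize(block)) == block
```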

Figure 1: Link Level Simulation Chain

In encoding, a FEC (Forward Error Correction) process is used to maximize the possibility of detecting and possibly recovering corrupted received data by adding redundancy to the transmitted data. The WiMAX-OFDM standard specifies three methods of FEC: Reed-Solomon concatenated with convolutional coding (RS-CC), block turbo coding (BTC), and convolutional turbo coding (CTC). WiMAX-OFDMA specifies five methods of channel coding: convolutional coding (CC) with tail biting, block turbo coding (BTC), convolutional turbo coding (CTC), low density parity check coding (LDPCC), and CC with zero tailing. The most common channel coding method is CTC. The encoded data from the previous step go through a two-step process. The first step ensures that adjacent encoded bits are mapped onto non-adjacent subcarriers to provide frequency diversity and to improve the performance of the decoder. The second step maps the adjacent bits to the less and more significant bits of the constellation. The modulation of data bits depends on the modulation scheme used. WiMAX takes channel quality into consideration to choose the correct modulation scheme. The modulation scheme is selected per subscriber to achieve the best performance possible. The number of bits per symbol (time) depends on the modulation scheme used: for QPSK (Quadrature Phase Shift Keying) it is 2, for 16-QAM (Quadrature Amplitude Modulation) it is 4, and for 64-QAM it is 6. Once the signal has been coded, it enters the modulation block. All wireless communication systems use a modulation scheme to map coded bits to a form that can be effectively transmitted over the communication channel. Thus, the bits are mapped to a subcarrier amplitude and phase, which is represented by a complex in-phase and quadrature-phase (IQ) vector. WiMAX specifications for the FFT OFDM PHY layer

define three types of subcarriers: data, pilot and null, as shown in Figure 2. Each OFDM symbol is composed of data subcarriers, a zero DC subcarrier, pilot subcarriers, and guard carriers according to the permutation scheme used. For example, Table I shows the downlink PUSC configuration. Furthermore, preambles consisting of training sequences are appended at the beginning of each burst. These training sequences are used for performing an estimation of the channel coefficients at the receiver. The signal is converted to the time domain by means of the inverse fast Fourier transform (IFFT) algorithm, and finally, a cyclic prefix (CP) is added with the aim of preventing inter-symbol interference.

Figure 2: OFDM Symbol Structure in Time Domain

III. LINK LAYER ABSTRACTION

In order to simulate WiMAX, we need to simulate 1024 (or more) subcarriers, the effect of noise on each of these subcarriers and their effect on the received FEC blocks. Such a simulation can be very complex and time consuming. This complexity can be avoided by modeling the channel as an additive white Gaussian noise (AWGN) channel with a single effective SINR. Wireless researchers have developed several ways to combine the SINRs of multiple subcarriers into an effective SINR. One of the commonly used methods is the so-called "Exponential Effective SINR Mapping" or EESM. EESM is used to map the instantaneous values of


SINRs to the corresponding BLER (Block Error Rate) value. Although EESM was introduced to work with SIR (Signal to Interference Ratio), it works with SNR as well. EESM is a simple mapping method used when all the subcarriers of a specific subscriber are modulated using the same Modulation and Coding Scheme (MCS) level. The basic idea of EESM is to find a compression function that maps the set of SINRs to a single value that is a good predictor of the

actual BLER [2]. Figure 3 shows the main purpose behind using the EESM function. Here, BLER refers to block error rate and PER refers to packet error rate. Note that the average SINR is not a good predictor of the actual BLER or PER (Packet Error Rate).

Figure 3: SINR Compression

EESM is a channel-dependent formula that maps power level as well as MCS level to SINR values in the AWGN (Additive White Gaussian Noise) channel domain. Such a function allows its mapping, along with AWGN assumptions (such as the effect of an increase in power, and CINR/MCS threshold tables), to predict the effect of MCS and boosting modifications. The method has been shown to yield an accurate estimation of the AWGN-equivalent SINR (henceforth referred to as "effective SINR") for frequency selective channels [3]. In the case of multi-carrier transmission, as in WiMAX, the set of subcarrier SINRs is mapped with the help of the EESM formula

into a scalar instantaneous effective SINR value. An estimate of the BLER value is then obtained, using the effective SINR value, from basic AWGN link-level performance. The mapping of the effective SINR value to the corresponding BLER value will use either a look-up table for the mapping function or use an approximate analytical expression if available. The EESM method estimates the effective SINR using the following formula

$$\mathrm{SINR}_{eff} = \mathrm{EESM}(\gamma, \beta) = -\beta \ln\!\left(\frac{1}{N}\sum_{i=1}^{N} e^{-\gamma_i/\beta}\right) \qquad (1)$$

Where γ is the vector [γ1, γ2, …, γN] of the per-subcarrier SINR values, which are typically different in a frequency selective channel; β is the parameter to be determined for each Modulation and Coding Scheme (MCS) level; and N is the number of data subcarriers (720 in the case of downlink PUSC). The β value is used to adjust the EESM function to compensate for the difference between the actual BLER and the predicted BLER. To obtain the β value, several realizations of the channel have to be conducted using a given channel model (e.g., Pedestrian B (Ped B) or Vehicular A (Veh A)). Then the BLER for each channel realization is determined using the simulation. Using the AWGN reference curves generated for each MCS level, the BLER values of each MCS are mapped to an AWGN-equivalent SINR. These AWGN SINRs for n realizations can be represented by an n-element vector SINRAWGN. Using a particular β value and the vector γ of subcarrier SINRs, an effective SINR is computed for each realization. For n realizations, we get a vector of computed effective SINRs denoted by SINREESM. The goal is to find the best possible β value that minimizes the difference between computed and actual effective SINRs:

$$\beta = \arg\min_{\beta} \left\| \mathrm{SINR}_{AWGN} - \mathrm{SINR}_{EESM}(\beta) \right\| \qquad (2)$$

The four steps to obtain the beta value are as follows. First, generate an AWGN curve for a specific MCS level. Second, measure the SINR per tone (subcarrier) values for the same MCS level using the desired channel model (for instance Ped A or Ped B). Many channel realizations are required, and the SINR per tone values should be converted to one scalar value representing the channel SINR using the EESM formula. The third step is to compare the two values gained from the previous steps (SINREESM and SNRAWGN). The comparisons for many SNR-pair values will yield a mean squared difference for a given beta value. The beta value that gives the minimum difference is selected as the optimal value. In the first step the AWGN channel model is used to generate the reference curve. The BLER values that are of interest are those that result in satisfactory operation. This range includes small BLER values close to zero. Figure 4 shows an example of an AWGN reference curve generated by the simulation process for QPSK with coding rates 1/2, 2/3 and 3/4. The second step is to get the SINR per tone values. First, all data subcarrier SNR values are stored for a single realization. Then a set of Gaussian random numbers with length equal to the number of data subcarriers is generated.
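The procedure above maps directly onto a few lines of code. The following Python sketch is a minimal illustration, not the authors' simulator; it assumes per-subcarrier SINRs and β in linear scale, computes the effective SINR of equation (1), and fits β per equation (2) by one-dimensional minimization.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def eesm(gamma, beta):
    """Effective SINR per equation (1); gamma is the per-subcarrier
    SINR vector and beta the MCS-dependent parameter (linear scale)."""
    return -beta * np.log(np.mean(np.exp(-gamma / beta)))

def fit_beta(snr_awgn, gamma_realizations):
    """Fit beta per equation (2): minimize the gap between the
    AWGN-equivalent SINRs and the EESM-computed effective SINRs."""
    def cost(beta):
        sinr_eesm = np.array([eesm(g, beta) for g in gamma_realizations])
        return np.sum((snr_awgn - sinr_eesm) ** 2)
    return minimize_scalar(cost, bounds=(1e-2, 1e3), method='bounded').x
```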

TABLE I: DOWNLINK PUSC CONFIGURATION

Parameter Name                           Value
System Channel Bandwidth (MHz)           10
Sampling Frequency (Fp in MHz)           11.2
Subcarrier Frequency Spacing (Δf, kHz)   10.94
FFT Size (NFFT)                          1024
UL/DL                                    DL
Null Subcarriers                         184
Pilot Subcarriers                        120
Data Subcarriers                         720
Data Subcarriers per Subchannel          24
Number of Subchannels (Ns)               30
Useful Symbol Time (Tb = 1/Δf) in µs     91.4
Guard Time (Tg = Tb/8) in µs             11.4


The sum of the two sets represents the SINR per tone values. To get one scalar value that represents the channel SINR (or effective SINR), the EESM formula is used as shown in equation (1). Since the EESM SINR value depends on the chosen beta value, the calibration process using equation (2) is used to minimize the difference between the expected and the simulated value of SNR.

Figure 4: AWGN reference curve for QPSK

Beta values of different formats are trained on the PB and VA channels respectively through adequate link layer simulation of the 802.16e system. The obtained beta values for lookup are shown in Table II and Table III. Simulations are done using a SISO channel, PUSC mode, the SCM channel model with velocities of 3 km/h and 60 km/h, and 100 independent channel realizations with CTC; ideal channel estimation is assumed. Beta values trained for the PB and VA channels are quite similar in most cases, coinciding with the theory that the beta training should be independent of channel realizations. There are some differences when higher order modulation is adopted; therefore, two beta tables are presented for the different models in order to guarantee higher reliability of the abstraction, especially for higher order modulation. Figure 5 and Figure 6 show the beta training for the VA and PB channels.

IV. ENHANCED ALGORITHM

The current 802.16e SINR reporting mechanism requires the MSS (Mobile Station) to report a straightforward CINR (Carrier to Interference plus Noise Ratio) measurement. This mechanism does not provide the BS with any knowledge of the frequency selectivity of the channel and noise (especially prominent with partially loaded cells and with multipath). This knowledge is important since:
• Two channel realizations with the same average CINR may cause substantially different frame error rates (FER) depending on the instantaneous channel variation. Without a proper metric to reflect the channel realization, the base station is unable to provide accurate link adaptation.
• Contrary to the AWGN channel, in a frequency selective channel there is no longer a one-to-one relation between the amount of

increase in power and the amount of improvement in "effective SINR". Furthermore, the relation is dependent on the modulation and coding scheme (MCS) level. This lack of knowledge on the BS side results in larger fade margins. Thus the current channel quality report scheme leads to a reduction in system capacity.

Figure 5: Predicted BLER vs. simulated BLER (abstraction performance of format 1, VA channel; BLER versus effective SNR in dB, curves "from abstraction" and "from AWGN")

Figure 6: Predicted BLER vs. simulated BLER (abstraction performance of format 1, PB channel; BLER versus effective SNR in dB, curves "from abstraction" and "from AWGN")

In general, we would like the MSS to report the effective SINR to the BS, and have the BS decide what modulation and coding to use and with what power boosting. This is complicated by the fact that the relationship between an increase in power and an increase in effective SINR is both channel-dependent and MCS-dependent. In the context of EESM, this implies that for each MCS a different β should be utilized, and for each such β, different boosting should be considered. It is well known that the influence of SNR distance on the PER performance varies a lot in different parts of the performance curve, and when the SNR value gets higher from a


low start point, PER performance is more sensitive to the SNR difference. The square difference between eesmSNR and the effective SNR can therefore be weighted by the relative difference between the current effective SNR value and the SNR value at which the BLER begins to drop on the AWGN performance curve, which highlights the influence of the SNR differences in the high SNR region on the PER performance. The cost function to be minimized is expressed in equation (3), where SNReff and SNReesm are vectors with a size equal to the number of simulated channel realizations, and W is the weight vector of the same size, expressed in equation (4). SNRstart is the SNR point where the BLER begins to drop from 1 (BLER = 0.99 is assumed) on the AWGN performance curve. The minimization algorithm is implemented based on golden section search and parabolic interpolation.

$$F(\beta) = \left\| \left( \mathrm{SNR}_{eff} - \mathrm{SNR}_{eesm}(\beta) \right) \cdot W \right\|^2 \qquad (3)$$

$$W = \left( \frac{\mathrm{SNR}_{eff} - \mathrm{SNR}_{start}}{\mathrm{SNR}_{start}} \right)^2 \qquad (4)$$
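A sketch of this weighted calibration is given below, under the same assumptions as the earlier eesm() helper (which it re-defines for self-containment). Note that minimize_scalar's Brent method combines golden-section search with parabolic interpolation, matching the minimization described above.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def eesm(gamma, beta):
    return -beta * np.log(np.mean(np.exp(-gamma / beta)))

def fit_beta_weighted(snr_eff, gamma_realizations, snr_start):
    """Weighted beta calibration per equations (3) and (4)."""
    snr_eff = np.asarray(snr_eff, dtype=float)
    w = ((snr_eff - snr_start) / snr_start) ** 2        # equation (4)
    def cost(beta):
        snr_eesm = np.array([eesm(g, beta) for g in gamma_realizations])
        return np.sum(((snr_eff - snr_eesm) * w) ** 2)  # equation (3)
    # Brent = golden-section search plus parabolic interpolation
    return minimize_scalar(cost, bracket=(0.5, 50.0), method='brent').x
```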

As a result, the BS is required to know the dependence of the effective SINR on the weighted β and the power increase; thus the computation of the equivalent SNR can no longer remain solely in the MSS's territory. The increase of γeff due to boosting is weighted-β dependent, as shown in equation (5), where Bφ denotes the weighted boost ratio.

$$\mathrm{EESM}(\gamma \cdot B_{\varphi}, \beta_W) = -\beta_W \ln\!\left(\frac{1}{N}\sum_{i=1}^{N} e^{-\gamma_i B_{\varphi}/\beta_W}\right) \neq B_{\varphi} \cdot \mathrm{EESM}(\gamma, \beta_W) \qquad (5)$$

This implies that EESM is a two-dimensional mapping of the weighted boost level and an MCS-dependent quantity (the weighted β) to the effective SINR. However, we can simplify by observing equation (6) below, which shows that, given an SINR-per-tone vector, it is sufficient for the BS to know the MSS-specific curve relating EESM to the weighted β. Both boosting and rate adaptation can then be done based on the same curve, thus reducing the mapping problem to one dimension. We plot EESM as a function of the weighted β for different cases. The first graph plots EESM for 4 different γ vectors, drawn from 24 independent Rayleigh distributions. Both EESM and the weighted β are plotted in dB. It can be seen that the graphs can be approximated locally as linear (in dB => dB), and have an overall linear shape with saturation at weighted β > 15 dB. Saturation occurs for practically unachievable weighted β values. This linear shape may be used for compressing the curve for transmission to the BS.

For the purpose of fast MCS adaptation or Hybrid ARQ, the MSS needs to provide the instantaneous effective SINR, and the BS may decide the MCS and boosting according to the MSS's instantaneous effective SINR. However, the number of relevant rates is limited and their weighted β values are close. Furthermore, the boosting range is limited, so we are typically interested in a narrow region of the weighted β axis. Thus a local linear approximation suffices, and the graph may be compressed effectively.

$$\mathrm{EESM}(\gamma \cdot B_{\varphi}, \beta_W) = -\beta_W \ln\!\left(\frac{1}{N}\sum_{i=1}^{N} e^{-\gamma_i B_{\varphi}/\beta_W}\right) = B_{\varphi}\left(-\frac{\beta_W}{B_{\varphi}} \ln\!\left(\frac{1}{N}\sum_{i=1}^{N} e^{-\gamma_i/(\beta_W/B_{\varphi})}\right)\right) = B_{\varphi} \cdot \mathrm{EESM}\!\left(\gamma, \frac{\beta_W}{B_{\varphi}}\right) \qquad (6)$$
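The identity in equation (6) is easy to verify numerically; the short check below (illustrative values only) confirms that boosting the per-tone SINRs by B is equivalent to scaling EESM evaluated at β/B.

```python
import numpy as np

def eesm(gamma, beta):
    return -beta * np.log(np.mean(np.exp(-gamma / beta)))

rng = np.random.default_rng(0)
gamma = rng.exponential(scale=1.0, size=720)  # Rayleigh-faded per-tone SINRs
beta_w, boost = 3.2, 2.0                      # illustrative values

lhs = eesm(gamma * boost, beta_w)             # EESM of the boosted tones
rhs = boost * eesm(gamma, beta_w / boost)     # boost times EESM at beta/B
assert np.isclose(lhs, rhs)                   # equation (6) holds exactly
```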

This implies one straightforward solution: the MSS can initially (e.g. on handover to a new cell) send a table of EESM SINR thresholds and β values for each MCS, and then at a higher rate transmit a local linear approximation of the EESM(β) curve. The accuracy of the EESM modeling technique as a predictor of the AWGN-equivalent SINR was analyzed extensively for OFDM in [4][5][6]. In addition, we performed a short examination in order to validate the accuracy of EESM for 802.16. The following methodology was used. First, optimal β values were estimated for each MCS level. Then, the accuracy of EESM was evaluated. Figures 7, 8, 9 and 10 show, for each MCS (QPSK, 16-QAM), the distribution of the EESM fit error (on the left) and the mean SINR vs. EESM prediction error (on the right) for the channel realizations.

Figure 7: QPSK EESM fit error


The proposed mechanism is as follows:
a. The MSS computes SINR-per-tone vectors for the purpose of EESM.
b. The MSS computes the curve parameters of EESM(β) in the weighted β range of interest. The range of interest depends on the current MCS level; for example, an MSS that operates in the QPSK area should compute the local slope for the QPSK range of weighted βs rather than for the 64-QAM range of βs.
c. The MSS sends the curve parameters to the BS, and updates the BS whenever these parameters change (due to a change in channel conditions) – a slow update. The MSS uses β values from a table of β per MCS (provided by the BS) to compute the CINR measurement based on the EESM formula. These measurements are averaged.
d. The MSS compensates for implementation losses so that the transmitted CINR values are aligned with the normalized threshold levels supplied by the BS.
e. A CINR report consists of a single CINR value. The MSS sends the CINR measurement that corresponds to one of the βs; this weighted β is selected using a rule which ensures that the BS knows its value.
The BS now has all the needed information (the EESM CINR value, the β for which it was computed, and a local-linear approximation of EESM(β)) in order to predict the effect of boosting and of a change of MCS level under the MSS's current channel conditions. A sketch of this BS-side prediction follows.
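As an illustration of step (b) and of the BS-side use of the reported parameters, the sketch below uses hypothetical helper names and handles β in dB, as in the plots above; it fits the local slope of EESM(β) around the current MCS's β and predicts the effective SINR at a neighbouring MCS's β.

```python
import numpy as np

def eesm(gamma, beta):
    return -beta * np.log(np.mean(np.exp(-gamma / beta)))

def local_linear(gamma, beta_db, step_db=0.5):
    """MSS side: anchor point and local slope (dB/dB) of EESM(beta)."""
    b0 = 10 ** ((beta_db - step_db) / 10)
    b1 = 10 ** ((beta_db + step_db) / 10)
    e0 = 10 * np.log10(eesm(gamma, b0))
    e1 = 10 * np.log10(eesm(gamma, b1))
    anchor = 10 * np.log10(eesm(gamma, 10 ** (beta_db / 10)))
    return anchor, (e1 - e0) / (2 * step_db)

def bs_predict(anchor_db, slope, beta_db, new_beta_db):
    """BS side: predicted effective SINR (dB) at another MCS's beta."""
    return anchor_db + slope * (new_beta_db - beta_db)
```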

V. CONCLUSIONS

In system level simulations, we focus on making transmission adaptations to optimize system performance and on getting a better understanding of the user performance in various deployment scenarios. For complexity reasons, system level evaluations have to rely on simplified Physical (PHY) layer models that still must be accurate enough to capture the essential behavior, so the modeling method of the link layer is essential. In this paper we presented an enhanced EESM modeling method that can be used for accurate link adaptation and accurate power boosting. The method provides the BS with sufficient knowledge of the channel-dependent relationship between MCS, power increase and effective SINR, and we described in detail how to calculate beta.

REFERENCES

[1] Carl Eklund, et al., "WirelessMAN: Inside the IEEE 802.16 Standard for Wireless Metropolitan Area Networks", IEEE Press, 2006.

[2] S. N. Moiseev, et al., "Analysis of the Statistical Properties of the SINR in the IEEE OFDMA 802.16 Network", Proceedings of ICC 2006.

[3] R. Yaniv, D. Stopler, T. Kaitz, K. Blum, "CINR Measurements using the EESM Method", IEEE 802.16e Contribution #141 revision 1, 2005, http://www.ieee802.org/16/tge/contrib/C80216e-05_141r1.pdf

[4] "Considerations on the System-Performance Evaluation of HSDPA using OFDM Modulation", Ericsson, 3GPP TSG-RAN WG1 #34, R1-030999, October 2003.

[5] "System-level Evaluation of OFDM – Further Considerations", Ericsson, 3GPP TSG-RAN WG1 #35, R1-031303, November 2003.

[6] "OFDM EESM Simulation Results for System-Level Performance Evaluations, and Text Proposal for Section A.4.5 of TR 25.892", Nortel Networks, R1-04-0089, January 2004.

Shahid Mumtaz received his Masters degree in Electrical Engineering from the Blekinge Institute of Technology, Karlskrona, Sweden, in 2005. He is working as a Research Engineer at the Instituto de Telecomunicações, Pólo de Aveiro, Portugal. His research interests include QoS in 3G/4G networks and radio resource management for wireless systems. His current research activities involve cross-layer based dynamic radio resource allocation for WANs.

Atílio Gameiro received his Licenciatura (five-year course) and his PhD from the University of Aveiro in 1985 and 1993 respectively. He is currently a Professor in the Department of Electronics and Telecommunications of the University of Aveiro, and a researcher at the Instituto de Telecomunicações – Pólo de Aveiro, where he is head of group. His main interests lie in signal processing techniques for digital communications and communication protocols.

Rasool Sadeghi received his M.Sc. in Telecommunication Engineering from Shiraz University, Iran, in 2004. He then joined ITMC (a Siemens partner in Iran) and worked on the TMN (Telecommunication Management Networks) project. In 2006-2007, he worked for TCE as a network and switching engineer for GSM. Since December 2007, he has been a Ph.D. student at the Institute of Telecommunications at Aveiro University, Portugal. His research interests are network and


radio resource management algorithms for wireless systems and cooperative diversity.

Figure 9: QAM16 EESM fit error

Figure 8: QPSK Mean SINR and prediction error per channel realization

Figure 10: QAM16 Mean SINR and prediction error per channel realization

Table II: Beta values for the PB channel (3 km/h)

Format    1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
Beta(dB)  2.46  2.28  2.27  2.18  2.05  2.00  2.03  2.04  1.98  2.56  2.43  2.46  2.41  2.41  2.38  7.45

Format    17    18    19    20    21    22    23    24    25    26    27    28    29    30    31    32
Beta(dB)  7.14  7.00  7.34  6.89  8.93  8.87  8.85  11.31 11.11 11.09 13.80 13.69 14.71 14.59 15.32 15.29

Table III: Beta values for the VA channel (60 km/h)

Format    1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
Beta(dB)  2.54  2.26  2.26  2.12  2.07  2.06  2.02  2.01  2.01  2.50  2.43  2.44  2.39  2.41  2.37  7.48

Format    17    18    19    20    21    22    23    24    25    26    27    28    29    30    31    32
Beta(dB)  7.14  6.92  7.53  6.82  8.93  8.87  8.90  11.43 11.16 11.01 13.74 13.70 14.68 14.55 15.17 15.27


FPGA-based Controller for a Mobile Robot

Ms. Shilpa Kale
Dept. of Electronics & Telecommunication Engg., Nagpur - 440025, India
E-mail: [email protected]

Mr. S. S. Shriramwar
Dept. of Electronics & Telecommunication Engg., Nagpur - 440025, India
E-mail: [email protected]

Abstract: With applications in robotics and automation, it becomes increasingly necessary to develop systems based on methodologies that facilitate future modifications, updates and enhancements of the originally designed system. This paper presents a conception of mobile robots using rapid prototyping, distributing the several control actions in growing levels of complexity, with a computing proposal oriented towards embedded systems implementation. This kind of controller can be tested on different platforms representing the mobile robot using reprogrammable logic components (FPGA). The mobile robot will detect obstacles and will also be able to control its speed. The different modules are actuators, sensors and wireless transmission, all interfaced through the FPGA controller. The aim is to construct a mechanically simple robot model which can measure the distance to an obstacle with the aid of a sensor and accordingly control the speed of its motors.


Keywords: Field Programmable Gate Array (FPGA), mobile robot, L293D Driver, GP2D12 Distance Measurement Sensor.

I. INTRODUCTION

The emergence of reconfigurable Field Programmable Gate Arrays (FPGA) has given rise to a new platform for complete mobile robot control systems. With FPGA devices, it is possible to tailor the design to fit the requirements of applications (for example, exploration and navigation functions for a robot). General-purpose computers can provide acceptable performance when tasks are not too complex, but a single processor system cannot guarantee real-time response (particularly in the absence of considerable additional hardware) if the environment is dynamic or semi-dynamic. This paper only focuses on the study of the mobile robot platform, with two driving wheels mounted on the same axis and a free front wheel. An FPGA-based robotic system can be designed to handle tasks in parallel. An FPGA-based robot also improves upon the single general purpose processor/computer based robot in the following areas:

1. Enhanced I/O channels. One can directly map the logical design to the computing elements in FPGA devices. 2. Low power consumption compared to desktops/laptops. 3. Support for the logical design of the non-Von Neumann computational models. 4. Support for easy verification of the correctness of the logical design modules.

Wheeled mobile robots (WMRs) are more energy efficient than legged or treaded robots on hard, smooth surfaces [Bekker60, Bekker69], and will potentially be the first mobile robots to find widespread application in industry, because of the hard, smooth plant floors in existing industrial environments. WMRs require fewer and simpler parts and are thus easier to build than legged or treaded mobile robots. Wheel control is less complex than the actuation of multi-joint legs, and wheels cause minimal surface damage in comparison with treads. The mobile robot consists of many units:

• mechanics (chassis, housing, wheels) • electromechanical parts • sensors

Robots carry out many various tasks. During these tasks the robot moves and orients itself. While navigating, it uses signals from the environment and the contents of its own memory to make the correct decisions. This form of navigation may be manifold depending on the given task and problem. Often the goal can be sensed and there is no obstacle between the goal and the robot, but there are numerous times when this is not the case; then the marking points must be sensed and the route known. In order for the robot to be able to do this, it must contain two main components:
• drive and motion, and
• control and direction.

II. WHEELED MOBILE ROBOT

A wheeled mobile robot is a robot capable of locomotion on a surface solely through the actuation of wheel assemblies mounted on the robot and in contact with the surface. A wheel assembly is a device which provides or allows relative motion between its mount and a surface on which it is intended to have a single point of rolling contact.

Page 153: International Journal of Computer Science July 2009

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 3, No. 1, 2009

144

The simplest cases of mobile robots are wheeled robots, as shown in Figure 1. Wheeled robots comprise one or more driven wheels (drawn solid in the figure) and have optional passive or caster wheels (drawn hollow) and possibly steered wheels (drawn inside a circle). Most designs require two motors for driving (and steering) a mobile robot. The design on the left-hand side of Figure 1 has a single driven wheel that is also steered. It requires two motors, one for driving the wheel and one for turning. The advantage of this design is that the driving and turning actions have been completely separated by using two different motors.

Figure 1: Wheeled Robots

The robot design in the middle of Figure 1 is called "differential drive" and is one of the most commonly used mobile robot designs. The combination of two driven wheels allows the robot to be driven straight, in a curve, or to turn on the spot. Finally, on the right-hand side of Figure 1 is the so-called "Ackermann steering", which is the standard drive and steering system of a rear-driven passenger car. It has one motor for driving both rear wheels via a differential box and one motor for the combined steering of both front wheels.

III. HARDWARE DESCRIPTION OF MOBILE ROBOT

A. Architecture of Mobile Robot

Within the proposed mobile robotics platform, the use of an FPGA controller with control software especially developed for the necessary applications is considered. Using structured libraries for design, simulation and verification with SIMULINK, we convert the model to a functional prototype on FPGA hardware.

Figure 2: Block Diagram of mobile robot

Figure 2 shows the overall flow of designing the system.

Figure 3: Mobile Robot Platform and elements

This system includes both hardware and software development. The output of the ADC was connected to the FPGA board and used as the input of the source code. The hardware description language used in this system is Verilog HDL. After the simulation and synthesis process, the program was implemented on the FPGA board. Figure 3 shows the mobile robot platform and its elements.

B. Interfacing the FPGA with the L293D for Motor Control

The FPGA runs the PWM program, and its output is applied to the enable pins of the L293D, a quadruple high-current half-H driver chip (L293 datasheet), which controls the speed of the motor. Table 1 shows the truth table used to make the L293 perform the different movement operations such as "Forward"; a software sketch of the duty-cycle mapping follows the table.

Figure 4: PWM schematic symbol

FUNCTION   INPUTS (ENA ENB 1A 2A 3A 4A)
Forward    1 1 1 0 1 0
Reverse    1 1 0 1 0 1
Left       0 1 0 0 1 0
Right      1 0 1 0 0 0

Table 1: Logic inputs to activate the L293 chip
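As a software illustration of the enable-pin PWM (the paper implements this in VHDL on the FPGA; the linear scaling below is an assumption for illustration only), the duty cycle can be derived from the 8-bit ADC reading as follows.

```python
def pwm_duty_from_adc(adc_value, adc_bits=8):
    """Map an ADC reading to a PWM duty cycle in [0, 1].
    Illustrative linear scaling; the actual mapping is design-specific."""
    return adc_value / (2 ** adc_bits - 1)

def pwm_period(duty, period_ticks=256):
    """One PWM period as 0/1 samples driving the L293D enable pin."""
    high = round(duty * period_ticks)
    return [1] * high + [0] * (period_ticks - high)

# Example: a mid-range ADC reading gives roughly a 50% duty cycle
assert sum(pwm_period(pwm_duty_from_adc(128))) == 129
```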

C. Interfacing the FPGA with the ADC0809

An A/D converter translates an analog signal into a digital value. An 8-channel, 8-bit A/D input is available to read analog voltages between 0 and 5 Volts. Devices such as an analog joystick or potentiometers can be connected to one of the ADC channels, and the converted digital output can be read and is



sent back to the FPGA board to control the speed of the DC motor. Figure 5 shows the construction of the ADC. The characteristics of an A/D converter include:
• Accuracy, expressed in the number of bits it produces per value (for example, a 10-bit A/D converter)
• Speed, expressed in maximum conversions per second (for example, 500 conversions per second)
• Measurement range, expressed in volts (for example, 0-5 V)

Figure 5: ADC system shown in the Xilinx ISE environment

1) Distance Measurement Sensor

The analog sensor Sharp GP2D12 simply returns a voltage level in relation to the measured distance. In Figure 6, the relationship between the sensor read-out (raw data) and the actual distance can be seen. From this diagram it is clear that the sensor does not return a value linear or proportional to the actual distance, so some post-processing of the raw sensor value is necessary. The simplest way of solving this problem is to use a lookup table, which can be calibrated for each individual sensor; a sketch of this conversion follows Figure 6.

Figure 6: Analog output voltage vs. distance to reflective objects
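A minimal sketch of such a lookup-table conversion is shown below; the calibration points are hypothetical placeholders and must be replaced by values measured for the individual sensor.

```python
# Hypothetical (voltage V, distance cm) calibration points for one GP2D12;
# entries must be measured per sensor and listed in decreasing voltage.
CAL = [(2.55, 10), (2.00, 15), (1.55, 20), (1.25, 25),
       (1.05, 30), (0.80, 40), (0.60, 55), (0.45, 75)]

def distance_cm(voltage):
    """Convert sensor output voltage to distance by linear interpolation
    between calibration points (output voltage falls with distance)."""
    if voltage >= CAL[0][0]:
        return CAL[0][1]          # closer than the first calibrated point
    if voltage <= CAL[-1][0]:
        return CAL[-1][1]         # farther than the last calibrated point
    for (v1, d1), (v2, d2) in zip(CAL, CAL[1:]):
        if v2 <= voltage <= v1:   # interpolate inside this segment
            return d1 + (v1 - voltage) / (v1 - v2) * (d2 - d1)
```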

IV. RESULTS

A. Synthesis Results

The full VHDL system consists of all the previous blocks connected together, and was simulated as a whole before being downloaded to the FPGA. Figure 8 shows the simulation results obtained from the ADC block. Here the START and ALE signals must be high for at least 100 ns to start the conversion process of the ADC. When the conversion is completed, the EOC and OE signals are pulsed high. The output in digital form is then given to the FPGA as an input. Depending on the digital input, the PWM output is calculated. Figure 9 shows the simulation result of the PWM block.

Other simulation results are of the seven segment display, as shown in Figure 10, which displays the correct numerals when the adc_in[0:3] input is between 0 and 9, i.e., binary "0000" to "1001". It displays the distance in cm measured by the GP2D12 sensor. Figure 11 shows the prototype board built with the FPGA XC2S50, with the simulation result shown in Figure 12.

Figure 7: RTL Schematic of Mobile Robot

Figure 8: Simulation result of ADC VHDL code

Figure 9: Simulation result of PWM VHDL code

Figure 10: Simulation result of 7-segment VHDL code



Figure 11: Prototype board built with FPGA XC2S50

Figure 12 : Simulation result of Full system VHDL code

V. SUMMARY

This paper describes the design and implementation of VHDL code for a simple mobile robot. The design may be described as a mapping from the input sensors to the actuators which control the robot's motions. It is shown that an FPGA can be configured to implement the design successfully.

This paper also discussed the methods used in this project. There are four phases in completing this project. Phase one is design entry, which includes both software and hardware development. The hardware developments are the driver circuit for the motor and the ADC circuit, and the software used is the ISE Simulator. The simulation process is used to verify the design, and the synthesis process is used to produce the block diagram.

VI. CONCLUSIONS & FUTURE WORK

Robotic platform engineering is necessary in teaching and research institutions for knowledge consolidation in several areas of teaching and research, such as modeling, control, automation, power systems, embedded electronics and software. The use of mobile robots for this purpose appears to be quite an attractive solution. It allows the integration of several important areas of knowledge at a low cost, and has already been adopted with success by other research institutions. The main objective of this work was to propose a generic platform for a robotic mobile system, seeking to obtain a support tool for undergraduate and graduate activities. Another objective was to gather knowledge in the mobile robotics area, aiming at presenting practical solutions for industrial problems, such as maintenance, supervision and transport of materials.

Mobile robot systems are of growing importance these days, so dealing with them in higher education is necessary. Autonomous mobile robots can be used to deliver parts in factories, serve as complementary platforms in a security system, and operate in hazardous areas where humans cannot stay. A wireless channel may also be added to increase system flexibility. The proposed framework remains simple and user friendly; additionally, it provides enough flexibility for the specific application. Our approach can be extended to more demanding applications by adding more modules or other peripheral interfaces. Currently I am working on the development of VHDL code which integrates all the modules discussed above.

ACKNOWLEDGEMENT

I would like to express my gratitude for the support and resources for this project obtained from Prof. S. S. Shriramwar of the Priyadarshani College of Engg, Nagpur University. I would like to thank my guide for his help in setting up the design system and for his expert advice on FPGA.

REFERENCES

1) Chris Lo, "Dynamic Reconfiguration Mechanism For Robot Control Software", Department of Electrical and Computer Engineering, University of Auckland, Auckland, New Zealand.

2) J. M. Rosario, R. Pegoraro, H. Ferasoli, "Conception of Wheeled Mobile Robots with Reconfigurable Control using Integrated Prototyping", Laboratory of Automation and Robotics, Sao Paulo, Brazil.

3) István Matijevics, "Microcontrollers, Actuators and Sensors in Mobile Robots", in Proceedings of the 4th Serbian-Hungarian Joint Symposium on Intelligent Systems (SISY 2006), September 29-30, 2006, Subotica, Serbia.

4) Volnei A. Pedroni, "Circuit Design with VHDL", MIT Press, Cambridge, Massachusetts, London, England.

5) Prabhas Chongstitvatana, "An FPGA-based Behavioral Control System for a Mobile Robot", IEEE Asia-Pacific Conference on Circuits and Systems, Chiangmai, Thailand, 1998.

6) Thomas Bräunl, "Embedded Robotics", Springer-Verlag Berlin Heidelberg, 2003, 2006, printed in Germany.

7) J. Borenstein, H. R. Everett, L. Feng, "Where Am I? Sensors and Methods for Mobile Robot Positioning", University of Michigan, Michigan, 1996.

8) K. Sridharan and P. Rajesh Kumar, "Design and Development of an FPGA-based Robot".

9) www.springerlink.com, Berlin Heidelberg, 2008.

10) Application Note: Motor Control using FPSLIC™/FPGA by ATMEL.

11) MicroCamp: ATmega8 Activity Kit Manual, building a robot with the MicroCamp kit.


Topological Design of Minimum Cost Survivable Computer Communication Networks: Bipartite Graph Method

Kamalesh V.N
Research Scholar, Department of Computer Science and Engineering, Sathyabama University, Chennai, India
[email protected]

S K Srivatsa
Senior Professor, Department of Instrumentation and Electronics Engineering, St. Joseph College of Engineering, Chennai, India
[email protected]

Abstract- A good computer network is hard to disrupt. It is desired that the computer communication network remain connected even when some of the links or nodes fail. Since communication links are expensive, one wants to achieve these goals with fewer links. A computer communication network is fault tolerant if it has alternative paths between vertices; the more disjoint paths, the better the survivability. This paper presents a method for generating a k-connected computer communication network with an optimal number of links using the bipartite graph concept.

Keywords: computer network, link deficit algorithm, wireless network, k-connected networks, survivable network, bipartite graph.

I INTRODUCTION

The topological design of a network assigns the links and link capacities for connecting the network nodes. This is a critical phase of network synthesis, partly because the routing, flow control and other behavioral design algorithms rest largely on the given network topology. The topological design also has several performance and economic implications. The node locations, link connections and link speeds directly determine the transit time through the network. For reliability or security considerations, some networks may be required to provide more than one distinct path for each node pair, thereby resulting in a minimum degree of connectivity between the nodes [10]. The performance of a fault-tolerant system should include two aspects: computational efficiency and network reliability. When a component or link fails, its duties must be taken over by other fault-free components or links of the system. The network must continue to work in case of node or edge failure. Different notions of fault tolerance exist, the simplest one corresponding to the connectivity of the network, that is, the minimum number of nodes which must be deleted in order to destroy all paths between a pair of nodes. Maximum connectivity is desirable since it corresponds not only to the maximum fault tolerance of the network but also to the maximum number of internally disjoint


paths between any two distinct vertices. However, the connectivity number can be at most equal to the minimum degree of the network graph [7, 8]. The goal of the topological design of a computer communication network is to achieve a specified performance at a minimal cost [5]. Unfortunately, the problem is completely intractable [1]. If the network under consideration has n distinct nodes and p distinct possible bandwidths, then the size of the space of potential topologies is (p+1)^(n(n-1)/2); for the values n = 10 and p = 3 the size of the search space is about 1.2 × 10^27. The fastest available computers cannot optimize a 25 node network, let alone a 100 node network. A reasonable approach is to generate a potential network topology (starting network) and see if it satisfies the connectivity and delay constraints. If not, the starting network topology is subjected to a small modification ("perturbation"), yielding a slightly different network, which is then checked to see if it is better. If a better network is found, it is used as the base for more perturbations. If the network resulting from a perturbation is not better, the original network is perturbed in some other way. This process is repeated until the computing budget is used up [2, 3, 5]. A fundamental problem in network design is to construct a minimum cost network that satisfies some specified connectivity requirements between pairs of nodes. One of the most well studied problems in this framework is the survivable network design problem, where we are given a computer communication network with costs on its edges, and a connectivity requirement rij for each pair i, j of nodes. The goal is to find a minimum cost subset of edges that ensures that there exist rij disjoint paths for each pair i, j of nodes. When all rij ∈ {0, k} for some integer k, we refer to these problems as k-connectivity of a computer communication network and k-vertex connectivity respectively [13]. Because of the importance of the problem, a few methods for generating k-connected networks have been proposed in the literature. In the method due to Steiglitz et al. [4], the heuristic begins by numbering the nodes at random. This randomization lets the


heuristic generate many topologies from the same input data. Further, this method involves repeated searching of nodes when conflicts occur, which demands more computational effort. In the method of [9], the nodes are numbered arbitrarily. The decimal number of each node is converted into a k-bit Gray code, so each node has a Gray code associated with it. Any two nodes whose Gray codes differ in exactly one position are connected. Thus every node gets connected to k nodes and has a degree of k. However, the limitation of this method is again the arbitrary numbering of nodes, and the method is applicable only when the number of nodes in the network is 2^k. In the method of [11], the nodes are numbered arbitrarily and it is assumed that the nodes of the network are equispaced and lie on a circle; that is, the method is applicable only when the nodes of the network form a regular polygon.

In our earlier method [12] for generating a k-connected survivable network topology, the geographical positions of the nodes are given. To start with, the nodes are labeled using some symbols. Given the cost of establishing a link between each pair of nodes, the costs are represented in the form of a matrix. The accumulated cost for every node is computed and sorted in increasing order. The index value in the sorted list is assigned as the representative number of each node, and links are established between nodes. The details can be found in [12]. However, in this method redundant links are more in number. In view of overcoming the above limitations, in this paper we propose a novel technique based on bipartite graphs which ensures the generation of minimum cost k-connected survivable network topologies. Section II explains the proposed method, Section III gives an illustration of the proposed method, and finally the paper concludes in Section IV.

II PROPOSED METHOD

This section presents the proposed technique for generating k-connected survivable network topologies. The geographical positions of the nodes are given. To start with, the nodes are labeled using some symbols. Given the cost of establishing a link between each pair of nodes, the costs are represented in the form of a matrix. The accumulated cost for every node is computed, and the accumulated costs are sorted in increasing order. The index value in the sorted list is assigned as the representative number of each node. Partition the node set of the network into two sets V1 and V2 such that |V1| = k and |V2| = n-k, i.e., V1 = {1, 2, 3, 4, …, k} and V2 = {k+1, k+2, …, n}, where n > k. Construct a complete bipartite graph G(V1 : V2, E) with node sets V1 and V2. The network graph so obtained is a k-connected network graph.

Algorithm: Generate minimum cost k-connected survivable network topology.
Input:
(i) n - nodes of the network
(ii) Cost associated with each pair of nodes
(iii) k - connectivity number (k < n)
Output: Minimum cost k-connected survivable network topology.
Method:
1. The geographical positions of the n distinct nodes are given.
2. Name the nodes using some symbols.
3. Construct a cost matrix using the cost associated with each pair of nodes.
4. Compute the accumulated cost for every node.
5. Sort the result of step 4 in increasing order using an appropriate sorting technique.
6. Assign the index value in the sorted list as the representative number of each node.
7. Partition the node set V into two sets V1 and V2 such that |V1| = k and |V2| = n-k, with V1 = {1, 2, 3, 4, …, k} and V2 = {k+1, k+2, …, n}.
8. Construct a complete bipartite graph with node sets V1 and V2, i.e.,
   for i = 1 to k
       for j = k+1 to n
           establish a link between i and j
       end for
   end for
The network graph obtained is a k-connected network graph.

Algorithm end

This method is link optimal compared to the method of [12]. The number of links required for the generation of a k-connected network with n nodes in this method is k(n-k), whereas in the method of [12] the number of links required is (n-1)+(n-2)+…+(n-k). Further, (n-1)+(n-2)+…+(n-k) > (n-k)+(n-k)+…+(n-k) for k > 1, i.e., (n-1)+(n-2)+…+(n-k) > k(n-k). Hence the number of links used in the method of [12] is greater than the number of links used in the proposed method. This method is also link optimal compared to the methods of [4, 11] for k > n/2: in the methods of [4, 11], the number of links required for the generation of a k-connected network with n nodes is nk/2, and k(n-k) < kn/2 for all k > n/2.
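A compact Python sketch of the whole procedure (steps 3-8) is given below, using the 7-node cost matrix of Table II as a usage example; node numbering follows the sorted accumulated costs, as in Table III.

```python
def k_connected_topology(cost, k):
    """Steps 3-8: number nodes by increasing accumulated cost, split them
    into V1 (first k) and V2 (rest), and return the K(k, n-k) links."""
    n = len(cost)
    acc = [sum(row) for row in cost]                 # accumulated cost per node
    order = sorted(range(n), key=lambda i: acc[i])   # node number -> node index
    v1, v2 = order[:k], order[k:]                    # |V1| = k, |V2| = n - k
    return [(a, b) for a in v1 for b in v2]          # exactly k(n-k) links

# Usage with the cost matrix of Table II (nodes A..G) and k = 3
cost = [[0, 4, 2, 4, 3, 1, 5],   # A
        [4, 0, 4, 5, 2, 3, 2],   # B
        [2, 4, 0, 1, 4, 3, 1],   # C
        [4, 5, 1, 0, 2, 2, 4],   # D
        [3, 2, 4, 2, 0, 1, 10],  # E
        [1, 3, 3, 2, 1, 0, 3],   # F
        [5, 2, 1, 4, 10, 3, 0]]  # G
links = k_connected_topology(cost, k=3)
print(len(links))                # 12 = k(n-k) = 3 * 4, as in Figure 4
```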


The comparative analysis for link optimization of the various methods is tabulated in Table I.

TABLE-I: COMPARATIVE ANALYSIS FOR LINK OPTIMIZATION

Method                             | Links used to generate a k-connected network with n nodes | Comparative analysis
The method presented in this paper | k(n-k)                                                     | -
Kamalesh V.N et al. [12]           | (n-1) + (n-2) + ... + (n-k)                                | k(n-k) < (n-1)+(n-2)+...+(n-k) for all n and k > 1. Hence, the method presented in this paper is optimal compared to the method due to Kamalesh V.N et al. [12].
S. Latha et al. [11]               | kn/2                                                       | k(n-k) < kn/2 for all k > n/2. Hence, the method presented in this paper is optimal compared to the method due to S. Latha et al. [11], for all k > n/2.
K. Steiglitz et al. [4]            | kn/2                                                       | k(n-k) < kn/2 for all k > n/2. Hence, the method presented in this paper is optimal compared to the method due to K. Steiglitz et al. [4], for all k > n/2.

III RESULTS ILLUSTRATION

In this section we illustrate the proposed method for 7 distinct nodes. The network connectivity k is assumed to be 3. Let us consider the 7 nodes with some labels as shown in Figure 1.

Figure 1: Name the nodes of the given network with some symbols

Table II gives the cost matrix constructed from the cost associated with each pair of nodes. The last column of Table II gives the accumulated cost for every node.

TABLE-II: COST ASSOCIATED WITH EVERY PAIR OF NODES

      A   B   C   D   E   F   G   Accumulated Cost
A     0   4   2   4   3   1   5   19
B     4   0   4   5   2   3   2   20
C     2   4   0   1   4   3   1   15
D     4   5   1   0   2   2   4   18
E     3   2   4   2   0   1   10  22
F     1   3   3   2   1   0   3   13
G     5   2   1   4   10  3   0   25

Table III gives the numbers associated with every node based on the sorted accumulated cost. Figure 2 shows the nodes labeled using the sorted accumulated cost, whereas Figure 3 shows the minimum cost 1-connected network topology.

TABLE-III: NODE NUMBERING

Node   Accumulated Cost   Node Number
A      19                 4
B      20                 5
C      15                 2
D      18                 3
E      22                 6
F      13                 1
G      25                 7



Since k = 3, |V1| = 3 and |V2| = 7 - 3 = 4.

Figure 3: The node set is partitioned into two sets V1 and V2

Construct the bipartite graph G(V1 : V2, E).


Figure 4: Complete bipartite graph

The resultant 3-connected network graph is shown in Figure 5.

Figure 5: 3-connected network graph

IV CONCLUSIONS

The topological design of a network assigns the links and link capacities for connecting the network nodes. This is a critical phase of network synthesis, partly because the routing, flow control and other behavioral design algorithms rest largely on the given network topology. The goal of the topological design of a computer communication network is to achieve a specified performance at a minimal cost. In this paper we presented a generic method to generate a minimum cost k-connected survivable network topology. The main strength of the proposed method is that it is very simple and straightforward. Also, unlike other existing methods, the proposed method does not make any specific assumptions to generate a network. Further, in this method the number of links used is optimal compared to [11, 12].

V REFERENCES

[1] A. V. Aho, J. E. Hopcroft and J. D. Ullman, Design and Analysis of Computer Algorithms, Addison-Wesley, Massachusetts, 1974.

[2] M. Gerla, H. Frank and J. Eckl, A cut saturation algorithm for topological design of packet switched communication networks, Proc. NTC, 1974, pp. 1074-1085.

[3] A. Lavia and E. G. Manning, Perturbation techniques for topological optimization of computer networks, Proc. Fourth Data Communications Symp., 1975, pp. 4.16-4.23.

[4] K. Steiglitz, P. Weiner and D. J. Kleitman, The design of minimum cost survivable networks, IEEE Trans. Circuit Theory, 1969, pp. 455-460.

[5] Andrew S. Tanenbaum, Computer Networks, Prentice Hall, Englewood Cliffs, 1987.

[6] S. K. Srivatsa and P. Seshaiah, On topological design of computer networks, Computer Networks and ISDN Systems, Vol. 27, 1995, pp. 567-569.

[7] Douglas B. West, Introduction to Graph Theory, Pearson Education Inc., 2nd Edition, 2001.

[8] Junming Xu, Topological Structure and Analysis of Interconnection Networks, Kluwer Academic Publishers, 2002.

[9] S. Latha and S. K. Srivatsa, Topological design of a k-connected network, WSEAS Transactions on Communications, Vol. 6, No. 4, 2007, pp. 657-662.

[10] Kamalesh V. N and S. K. Srivatsa, Numbering of nodes in computer communication networks: A brief review, National Conference on Advanced Network Technologies and Security Issues, FISAT, Cochin, Kerala, India, 8-10 August 2007, pp. 278-282.

[11] S. Latha and S. K. Srivatsa, On some aspects of design of cheapest survivable networks, International Journal of Computer Science and Network Security, Vol. 7, No. 11, 2007, pp. 210-211.

[12] Kamalesh V. N and S. K. Srivatsa, On the design of minimum cost survivable network topologies, 15th National Conference on Communications, IIT Guwahati, India, 16-18 January 2009, pp. 394-397.

[13] Tanmoy Chakraborty, Julia Chuzhoy and Sanjeev Khanna, Proceedings of the 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, 17-20 May 2008, pp. 167-176.

VI BIOGRAPHY

Prof. Kamalesh V. N received the Bachelor of Science degree from the University of Mysore, India. Subsequently, he received the Master of Science in Mathematics degree and the Master of Technology in Computer Science & Technology degree from the University of Mysore, India. He secured 14th rank in Bachelor of Science and 4th rank in Master of Science from the University of Mysore. Further, he was a National Merit Scholar and subject scholar in Mathematics. He is working as Head, Department of Computer Science & Engineering at JSS Academy of Technical Education, Bangalore, affiliated to Visvesvaraya Technological University, Belgaum, Karnataka, India. He has taught around fifteen different courses at both undergraduate and postgraduate level in Mathematics and Computer Science and Engineering. His current research activities pertain to computer networks, design and analysis of algorithms, graph theory and combinatorics, and finite automata and formal languages. His paper entitled "On the assignment of node number in a computer communication network" was awarded a certificate of merit at the World Congress on Engineering and Computer Science 2008, organized by the International Association of Engineers at UC Berkeley, San Francisco, USA. He is currently pursuing his Ph.D. degree at Sathyabama University, Chennai, India.

Dr. S. K. Srivatsa received the Bachelor of Electronics and Telecommunication Engineering degree from Jadavpur University, Calcutta, India, and the Master's degree in Electrical Communication Engineering and the Ph.D. from the Indian Institute of Science, Bangalore, India. He was a Professor of Electronics Engineering at Anna University, Chennai, India, and a Research Associate at the Indian Institute of Science. Presently he is working as Senior Professor in the Department of Instrumentation & Control Engineering at St. Joseph's College of Engineering, Chennai, India. He has taught around twenty different courses at undergraduate and postgraduate level. His current research activities pertain to computer networks, design and analysis of algorithms, coding theory, and artificial intelligence & robotics.


TOWARDS AMELIORATING CYBERCRIME AND CYBERSECURITY

Azeez Nureni Ayofe Department of Computer Science,

College of Natural and Applied Sciences,

Fountain University, Osogbo,

Osun State, Nigeria.

E-mail address: [email protected]

Osunade Oluwaseyifunmitan Department of Computer Science,

University of Ibadan, Nigeria.

[email protected]

[email protected]

ABSTRACT-- Cybercrime is becoming ever more serious. Findings from the 2002 Computer Crime and Security Survey show an upward trend that demonstrates a need for a timely review of existing approaches to fighting this new phenomenon in the information age. In this paper we provide an overview of cybercrime and present an international perspective on fighting it. The work defines the concept of cybercrime, identifies reasons for cybercrime and ways it can be eradicated, examines those involved and the reasons for their involvement, looks at how best to detect a criminal mail, and in conclusion proffers recommendations that would help in checking the increasing rate of cybercrimes and criminals.

Keywords: Cyber security; information; Internet; technology; people

1. INTRODUCTION

Over the past twenty years, unscrupulous computer users have continued to use the computer to commit crimes; this has greatly fascinated people and evoked a mixed feeling of admiration and fear. The phenomenon has recently grown sophisticated and unprecedented in scale, calling for a quick response in providing laws that would protect cyberspace and its users. The level of sophistication has risen to the point of using the system to commit murder and other havoc. The first recorded cyber murder, committed in the United States seven years ago according to the Indian Express of January 2002, "has to do with an underworld don in hospital to undergo a minor surgery. His rival went ahead to hire a computer expert who altered his prescriptions through hacking the hospital's computer system. He was administered the altered prescription by an innocent nurse; this resulted in the death of the patient" [1].

This work gives a brief overview of cybercrime, explains why people are involved in cybercrime, examines those involved and the reasons for their involvement, looks at how best to detect a criminal mail, and in conclusion proffers recommendations that would help in checking the increasing rate of cybercrimes and criminals. These guides provide general outlines as well as specific techniques for implementing cyber security.

2. METHODOLOGY

This study was carried out to explain clearly the concepts of cybercrime and cybersecurity and to provide adequate and sufficient ways out of these problems in the present days of internet usage and applications. The instruments used were questionnaires, personal interviews, observation and information on the internet, as well as reports from both radio and electronic media. The authors conducted personal interviews with twenty-two internet users to gather their views on the causes of, and their experiences with, cybercrime and cybersecurity. In addition, fifty-three questionnaires were distributed to the following categories of internet users: bankers, students, directors and university lecturers, with the aim of seeking their views and opinions on these issues. The information gathered through all the above instruments was then analyzed, and approaches towards ameliorating these phenomena were proffered for implementation by both government and corporate bodies.


3. WHAT IS CYBERCRIME?

Cybercrime by definition is any harmful act committed from or against a computer or network. It differs, according to McConnell International, "from most terrestrial crimes in four ways: they are easy to learn how to commit, they require few resources relative to the potential damages caused, they can be committed in a jurisdiction without being physically present in it and fourthly, they are often not clearly illegal" [2]. Another definition, given by the Director of the Computer Crime Research Centre (CCRC) during an interview on 27th April 2004, is that "cyber-crime ('computer crime') is any illegal behavior directed by means of electronic operations that targets the security of computer systems and the data processed by them" [3]. In essence, cybercrime is crime committed in a virtual space, and a virtual space is fashioned in such a way that information about persons, objects, facts, events, phenomena or processes is represented in mathematical, symbolic or any other form and transferred through local and global networks. From the above we can deduce that cybercrime has to do with wreaking havoc on computer data or networks through interception, interference or destruction of such data or systems. It involves committing crime against computer systems or the use of the computer in committing crimes. This is a broad term that describes everything from electronic cracking to denial-of-service attacks that cause electronic commerce sites to lose money. Mr. Pavan Duggal, president of www.cyberlaws.net and consultant, has in a report clearly defined the various categories and types of cybercrime. Cybercrimes can basically be divided into three major categories: (1) cybercrimes against persons, (2) cybercrimes against property, and (3) cybercrimes against government.

3.1 Cybercrimes against persons: Cybercrimes committed against persons include crimes like the transmission of child pornography and harassment of anyone with the use of a computer, such as by e-mail. The trafficking, distribution, posting and dissemination of obscene material, including pornography and indecent exposure, constitutes one of the most important cybercrimes known today. The potential harm of such a crime to humanity can hardly be overstated. This is one cybercrime which threatens to undermine the growth of the younger generation, and to leave irreparable scars and injury on it, if not controlled. A minor girl in Ahmedabad was lured to a private place through cyberchat by a man who, along with his friends, attempted to gang-rape her; she was rescued when some passersby heard her cry. An example where the damage was done not to one person but to the masses is the case of the Melissa virus. The Melissa virus first appeared on the internet in March 1999. It spread rapidly throughout computer systems in the United States and Europe, and it is estimated that the virus caused 80 million dollars in damages to computers worldwide.

In the United States alone, the virus made its way through 1.2 million computers in one-fifth of the country's largest businesses. David Smith pleaded guilty on December 9, 1999 to state and federal charges associated with his creation of the Melissa virus. There are numerous examples of such computer viruses, among them "Melissa" and "love bug".

Cyber harassment is a distinct cybercrime. Various kinds of harassment can and do occur in cyberspace, or through the use of cyberspace; harassment can be sexual, racial, religious or other. Persons perpetrating such harassment are also guilty of cybercrimes. Cyber harassment as a crime also brings us to a related area: the violation of the privacy of citizens. Violation of the privacy of online citizens is a cybercrime of a grave nature. No one likes any other person invading the invaluable and extremely touchy area of his or her own privacy, which the medium of the internet grants to the citizen.

3.2 Cybercrimes against property: The second category of cybercrime is that of cybercrimes against all forms of property. These crimes include computer vandalism (destruction of others' property) and the transmission of harmful programmes.

A Mumbai-based upstart engineering company lost a say and much money in the business when a rival company, an industry major, stole the technical database from their computers with the help of a corporate cyberspy.


3.3 Cybercrimes against government: The third category of cybercrime relates to cybercrimes against government. Cyberterrorism is one distinct kind of crime in this category. The growth of the internet has shown that the medium of cyberspace is being used by individuals and groups to threaten international governments and to terrorize the citizens of a country. This crime manifests itself as terrorism when an individual "cracks" into a government- or military-maintained website.

In a report of expressindia.com, it was said that the internet was becoming a boon for terrorist organizations. According to Mr. A. K. Gupta, Deputy Director (Co-ordination), CBI, terrorist outfits are increasingly using the internet to communicate and move funds. "Lashker-e-Toiba is collecting contributions online from its sympathizers all over the world. During the investigation of the Red Fort shootout in Dec. 2000, the accused Ashfaq Ahmed of this terrorist group revealed that the militants are making extensive use of the internet to communicate with the operatives and the sympathizers and also using the medium for intra-bank transfer of funds". Cracking is amongst the gravest cybercrimes known to date. It is a dreadful feeling to know that a stranger has broken into your computer systems without your knowledge and consent and has tampered with precious confidential data and information. Coupled with this is the actuality that no computer system in the world is cracking-proof; it is unanimously agreed that any and every system in the world can be cracked. The recent denial-of-service attacks on popular commercial sites like eBay, Yahoo, Amazon and others are a new category of cybercrime which is slowly emerging as extremely dangerous.

Cyber crime can be broadly defined as criminal activity in which computers or computer networks are a tool, a target or a medium for the crime.

4. VARIOUS TYPES OF CYBER CRIME

4.1 Unauthorized access to hosts - more commonly known as hacking. Hacking can take various forms, some of which might not always involve deep technical knowledge.

Social engineering involves "talking" your way into being given access to a computer by an authorized user.

A divide exists between individuals who illegally break into computers with malicious intent, or to sell information garnered from the compromised computer - known as "crackers" or "black hats" - and those who do it out of curiosity or to enhance their technical prowess - known as "hackers" or "white hats".

4.2 Spamming - involves mass amounts of email being sent in order to promote and advertise products and websites.

Email spam is becoming a serious issue for businesses, due to the cost overhead it causes, not only in bandwidth consumption but also in the amount of time spent downloading and eliminating spam mail.

Spammers are also devising increasingly advanced techniques to avoid spam filters, such as permutation of the email's contents and the use of imagery that cannot be detected by spam filters.
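To make the evasion point concrete, the sketch below (illustrative only; the keyword list is invented, and production filters are statistical rather than keyword-based) shows how a naive keyword filter catches a plain spam message but misses a trivially permuted spelling of the same content.

    # Naive keyword spam filter - easily evaded by content permutation.
    SPAM_KEYWORDS = {"viagra", "lottery", "winner", "free money"}

    def is_spam(message: str) -> bool:
        text = message.lower()
        return any(keyword in text for keyword in SPAM_KEYWORDS)

    print(is_spam("You are a lottery WINNER!"))      # True: keyword match
    print(is_spam("You are a l0ttery w-i-n-n-e-r"))  # False: permuted text evades

Statistical filters mitigate this by scoring many weak features of a message rather than matching exact strings.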

4.3 Computer Fraud / "Phishing" scams - South Africa has recently been afflicted by an onset of intricate scams that attempt to trick online banking subscribers into divulging credit-card and banking information.

These are commonly called "phishing" scams, and they involve a level of social engineering, as they require the perpetrators to pose as a trustworthy representative of an organization, commonly the victim's bank.

4.4 Denial of Service Attacks - not to be confused with unauthorized computer access and hacking.

Denial of Service (DoS) attacks involve large volumes of traffic being sent to a host or network, rendering it inaccessible to normal users through sheer consumption of resources.

Distributed Denial of Service (DDoS) attacks involve multiple computers being used in an attack, in many cases through the use of "zombie" machines: trojanised programs that attackers install on various computers.

Often legitimate computer users have no idea they are involved in denial-of-service attacks, due to the stealthy nature of the zombie software.
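A crude way to spot the traffic pattern described above is to count requests per source address over a time window and flag sources that exceed a threshold. The sketch below is a toy with an arbitrary threshold, not a production defence.

    # Toy volume-based DoS indicator: flag sources sending abnormally many
    # requests within one observation window.
    from collections import Counter

    def flag_heavy_sources(request_log, threshold=1000):
        """request_log: iterable of (source_ip, timestamp) pairs in one window."""
        counts = Counter(ip for ip, _ in request_log)
        return {ip: n for ip, n in counts.items() if n > threshold}

    log = [("10.0.0.5", t) for t in range(5000)] + [("10.0.0.9", 0)]
    print(flag_heavy_sources(log))   # {'10.0.0.5': 5000}

Note that a distributed attack spreads its load across many zombie sources precisely to stay under such per-source thresholds, which is one reason DDoS traffic is much harder to filter.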

4.5 Viruses, Trojans and Worms - these three fall into a similar category, as all are software designed to "infect" computers, that is, to install themselves onto a computer without the user's permission; however, each operates very differently.

Many computer users have experienced the frustration of having a malicious virus wreak havoc upon their computers and data, but not all viruses have a malicious payload.

A trojan is a program that allows remote access to the computer it is installed on. Trojans exist for multiple platforms and vary in complexity.

Worms make use of known vulnerabilities in commonly used software and are designed to traverse networks, not always with destructive ends; historically, however, worms have had devastating effects, such as the infamous Code Red and Melissa worms.

Intellectual Property Theft - intellectual property theft in relation to cyber crime deals mainly with the bypassing of measures taken to enforce copyright, usually but not restricted to software.

4.6 Other types of cyber crime can be categorized under the following headings (the figures are reported case counts):

1. Unlawful access to computer information - 8002 crimes.

2. Creation, use and distribution of malware or machine carriers of such programs - 1079.

3. Violation of the operation rules of computers, computer systems or networks - 11.

4. Copyright and adjacent rights violations - 528.

5. CAUSES OF CYBERCRIME

There are many reasons why cyber-criminals commit cybercrime; chief among them are the three listed below.

Cybercrimes can be committed for the sake of recognition. This is basically done by youngsters who want to be noticed and to feel among the group of the big and tough guys in society. They do not mean to hurt anyone in particular; they fall into the category of the idealists, who just want to be in the spotlight.

Another cause of cybercrime is to make quick money. This group is greed-motivated; its members are career criminals who tamper with data on the net or system, especially e-commerce and e-banking data, with the sole aim of committing fraud and swindling money from unsuspecting customers.

Thirdly, cybercrime can be committed to fight a cause one believes in, to cause threats and, most often, damage that affects the recipients adversely. This is the most dangerous of all the causes of cybercrime. Those involved believe that they are fighting a just cause and so do not mind who or what they destroy in their quest to achieve their goals. These are the cyber-terrorists.

6. HOW TO ERADICATE CYBERCRIME

Research has shown that no law can be put in place to effectively eradicate the scourge of cybercrime. Attempts have been made locally and internationally, but these laws still have shortcomings. What constitutes a crime in one country may not in another, and this has always made it easy for cyber criminals to go free after being caught. These challenges notwithstanding, governments should, in the case of the idealists, fight them through education rather than law. It has been proven that they help big companies and governments see the security holes which career criminals or even cyber-terrorists could use to attack them in the future. Most often, companies engage them as consultants to help them build solid security for their systems and data. "The Idealists often help the society: through their highly mediatised and individually harmless actions, they help important organizations to discover their high-tech security holes..." [4]. The enforcement of law on them can only trigger trouble, because they would not stop but would want to defy the law. "Moreover, if the goal of the cyber-crime legislation is to eradicate cyber-crime, it might well eradicate instead a whole new culture..." [5]. Investment in education is a much better way to prevent their actions. Another means of eradicating cybercrime is to harmonize international cooperation and law; this goes for the greed-motivated and the cyber-terrorists. They cannot be fought through education, because they are already established criminals who will not change their behaviour. The only appropriate way to fight them is by enacting new laws, harmonizing international legislation, and encouraging coordination and cooperation between national law enforcement agencies.

7. WHO ARE INVOLVED

Those involved in committing cybercrimes fall into three categories:

7.1 THE IDEALISTS (Teenagers). They are usually not highly trained or skilful, but youngsters between the ages of 13 and 26 who seek social recognition. They want to be in the spotlight of the media. Their actions are globally damaging but individually negligible. "Like denying a lot of important e-commerce servers in February, 2000 is said to have caused high damages to these companies" [6]. Most often they attack systems with viruses they created; their actual harm to each individual is relatively negligible. By around the age of 26, when they have matured and understood the weight of their actions, they lose interest and stop.

7.2 THE GREED-MOTIVATED (Career Criminals). This type of cyber-criminal is dangerous because they are usually unscrupulous and ready to commit any type of crime as long as it brings them money. "They started the child pornography often called cyber-pornography which englobes legal and illegal pornography on the internet" [7]. They are usually very smart and organized, and they know how to escape the law enforcement agencies. These cyber-criminals are committing grievous crimes and damage, and their unscrupulousness, particularly in child pornography and cyber-gambling, is a serious threat to society. One example of how serious a threat they pose is that "the victims of the European bank of Antigua are said to have lost more than $10 million" [8]. "...theft of valuable trade secrets: the source code of the popular Microsoft Windows exploration system by a Russian based hacker could be extremely dangerous... the hackers could use the code to break all firewalls and penetrate remotely every computer equipped with Windows. Another usage could be the selling of the code to competitors" [9].

7.3 THE CYBER-TERRORISTS. They are the newest and most dangerous group. Their primary motive is not just money but also a specific cause they defend. They usually engage in sending threatening mails and destroying data stored mainly in government information systems, just to score their point. The threat of cyber-terrorism can be compared to that of nuclear, bacteriological or chemical weapons. A disheartening aspect is that these criminals respect no state frontiers and can operate from anywhere in the world, which makes it difficult for them to get caught. The most wanted cyber-terrorist is Osama Bin Laden, who is said to "use steganography to hide secret messages within pictures; for example, a picture of Aishwarya Rai hosted on a website could contain a hidden message to blow up a building" [10]. A surprising fact is that these hidden messages do not alter the shape, size or look of the original pictures in any way.

8. A CRIMINAL MAIL

Another type of cybercrime, currently being researched but not as well known as those stated above, is the criminal mail. A criminal mail is usually sent to networks with the aim of either corrupting the system or committing fraud. The way to detect such mails is to put security measures in place which detect criminal patterns in the network; a small illustrative sketch follows the list of examples below. A news story by Paul Roberts of IDG News Service says that Unisys has a suite called the "Unisys Active Risk Monitoring System (ARMS), which helps banks and other organizations spot patterns of seemingly unrelated events that add up to criminal activity" [11]. Actimize Technology Ltd., based in New York, has developed technology that enables organizations to do complex data mining and analysis on stored information and transaction data without needing to copy it to a separate data warehouse. "The Actimize software runs on the Microsoft Corp. Windows NT or Windows 2002 platform and can be developed on standard server hardware with either four to eight processors, Katz said" [12]. Eric J. Sinrod, in his article 'What's Up With Government Data Mining', states that the United States "Federal Government has been using data mining techniques for various purposes, from attempting to improve service to trying to detect terrorists patterns and activities" [13]. The most effective way to detect criminal mails is to provide security gadgets, educate employees on how to use them, be alert for such mails and, above all, make sure no security hole is left unattended.

Cybercrime has taken deep root the world over, and the use of cyberspace by sophisticated cyber criminals has assumed serious proportions today. Criminals and terrorists associated with drug trafficking and terrorist outfits are employing the internet for anti-social, anti-national and criminal activities with impunity. Terrorist groups are deftly using the internet for passing on information with regard to executing various terrorist acts having serious negative impact on human life. The cyber-terrorists have even acquired the capability to penetrate computer systems using "logic bombs" (coded devices that can be remotely detonated), electromagnetic pulses and high-emission radio frequency guns, which blow a devastating electronic wind through a computer system. The hackers have gone to the extent of distributing free hacking software (Rootkit, for instance) to enable an intruder to get root access to a network and then control it as though they were the system's administrators. Cyber crime levels are on the rise in Nigeria; examples of large-scale cyber crimes over the past few years include:

• The phishing scams that have recently afflicted many of Nigeria’s larger banks and their clients.

• Key logging software that was able to capture banking and other details of online bankers.
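Returning to the pattern-detection idea of the criminal mail discussion, the following sketch scores an incoming mail against a few phishing indicators. The rules and weights are invented for illustration; this is not the ARMS or Actimize products mentioned above.

    # Hypothetical rule-based scorer for suspicious ("criminal") mail.
    import re

    RULES = [
        (re.compile(r"verify your account", re.I), 2),
        (re.compile(r"https?://\d+\.\d+\.\d+\.\d+", re.I), 3),  # links to raw IPs
        (re.compile(r"urgent|immediately", re.I), 1),
    ]

    def suspicion_score(mail_body: str) -> int:
        """Sum the weights of every rule that matches the mail body."""
        return sum(weight for pattern, weight in RULES if pattern.search(mail_body))

    mail = "URGENT: verify your account at http://203.0.113.7/login"
    print(suspicion_score(mail))   # 6, above an alert threshold of, say, 4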

Statistical research performed in the UK revealed that cyber crime and software flaws were costing Britain up to £10 billion in losses annually. According to the survey, 50% of all businesses were affected by cyber crime (source: The Law and Technology of Software Security; Nigerian Cybercrime Working Group (NCWG), 8th Nigerian Software Exhibition - NISE 2004), showing a giant increase in cyber crime occurrences in the UK compared to a 2000 survey, in which only 25% of respondents had been cyber crime victims.

Table 1: World internet usage and population statistics as at March 30, 2009.

World Region               Population (2008 Est.)  Population % of World  Internet Usage (Latest Data)  Penetration (% Population)  Usage % of World  Usage Growth 2008
Africa                     941,249,130             14.2%                  44,361,940                    4.7%                        3.4%              882.7%
Asia                       3,733,783,474           56.6%                  510,478,743                   13.7%                       38.7%             346.6%
Europe                     801,821,187             12.1%                  348,125,847                   43.4%                       26.4%             231.2%
Middle East                192,755,045             2.9%                   33,510,500                    17.4%                       2.5%              920.2%
North America              334,659,631             5.1%                   238,015,529                   71.1%                       18.0%             120.2%
Latin America/Caribbean    569,133,474             8.6%                   126,203,714                   22.2%                       9.6%              598.5%
Australia                  33,569,718              0.5%                   19,175,836                    57.1%                       1.5%              151.6%
World Total                6,606,971,659           100.0%                 1,319,872,109                 20.0%                       100.0%            265.6%
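The derived columns of Table 1 follow directly from the raw counts: penetration is users divided by regional population, and usage share is users divided by world users. A quick check with the Africa row reproduces the tabled 4.7% and 3.4%.

    # Recompute Table 1's derived columns for Africa from its raw counts.
    users, population, world_users = 44_361_940, 941_249_130, 1_319_872_109
    print(f"penetration: {100 * users / population:.1f}%")    # 4.7%
    print(f"usage share: {100 * users / world_users:.1f}%")   # 3.4%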

8.1 Technology Viewpoint

• Advances in high-speed telecommunications, computers and other technologies are creating new opportunities for criminals, new classes of crimes, and new challenges for law enforcement.

8.2 Economy Viewpoint

• Possible increases in consumer debt may affect bankruptcy filings.


• Deregulation, economic growth, and globalization are changing the volume and nature of anticompetitive behaviour.

• The interconnected nature of the world’s economy is increasing opportunities for criminal activity.

8.3 Government Viewpoint

• Issues of criminal and civil justice increasingly transcend national boundaries, require the cooperation of foreign governments, and involve treaty obligations, multinational environmental and trade agreements, and other foreign policy concerns.

8.4 Social-Demographic Viewpoint

• The numbers of adolescents and young adults, now the most crime-prone segment of the population, are expected to grow rapidly over the next several years.

8.5 Computer as an instrument facilitating crime

The computer is used as an instrument facilitating crime. The most vivid example of computers being used as an instrument of cybercrime has been the recent attack on parliament, where computers and the internet were used in a variety of ways to perpetrate the crime. Terrorists and criminals use internet methods such as e-mail and flash encrypted messages around the globe. Frauds related to electronic banking or electronic commerce are other typical examples. In these crimes, computer programmes are manipulated to facilitate the offences, namely:

a. fraudulent use of Automated Teller Machine (ATM) cards and accounts;

b. credit card frauds;

c. frauds involving electronic funds transfers;

d. telecommunication frauds; and

e. frauds relating to electronic commerce and electronic data interchange.

The information technology (IT) infrastructure, which is now vital for communication, commerce and control of our physical infrastructure, is highly vulnerable to terrorist and criminal attacks. The private sector has an important role in securing the nation's IT infrastructure by deploying sound security products and adopting good security practices. But the Federal government also has a key role to play, by supporting the discovery and development of the cyber security technologies that underpin these products and practices. Improving the nation's cyber security posture requires highly trained people to develop, deploy and incorporate new cyber security products and practices. The number of such highly trained people is too small given the magnitude of the challenge, and the situation has been exacerbated by insufficient and unstable funding levels for long-term, civilian cyber security research, which universities depend upon to attract and retain faculty.

8.6 Software vulnerability

Network connectivity provides "door-to-door" transportation for attackers, but vulnerabilities in the software residing in computers substantially compound the cyber security problem. The software development methods that have been the norm fail to provide the high-quality, reliable and secure software that the IT infrastructure requires. Software development is not yet a science or a rigorous discipline, and the development process by and large is not controlled to minimize the vulnerabilities that attackers exploit. Today, as with cancer, vulnerable software can be invaded and modified to cause damage to previously healthy software, and infected software can replicate itself and be carried across networks to cause damage in other systems. Like cancer, these damaging processes may be invisible to the lay person even though experts recognize that their threat is growing. And, as with cancer, both preventive actions and research are critical: the former to minimize damage today, and the latter to establish a foundation of knowledge and capabilities that will assist the cyber security professionals of tomorrow in reducing risk and minimizing damage for the long term.

8.7 Domestic and international law enforcement

A hostile party using an internet-connected computer thousands of miles away can attack an internet-connected computer in the United States as easily as if he or she were next door. It is often difficult to identify the perpetrator of such an attack, and even when a perpetrator is identified, criminal prosecution across national boundaries is problematic.

8.8 Education

We need to educate citizens that if they are going to use the internet, they need to continually maintain and update the security on their systems so that they cannot be compromised, for example, to become agents in a DDoS attack or for "spam" distribution. We also need to educate corporations and organizations in best practice for effective security management. For example, some large organizations now have a policy that all systems in their purview must meet strict security guidelines: automated updates are sent to all computers and servers on the internal network, and no new system is allowed online until it conforms to the security policy.

8.9 Information security

Information security refers to measures taken to protect or preserve information on a network as well as the network itself. The alarming rise of premeditated attacks with potentially catastrophic effects on interdependent networks and information systems across the globe has demanded that significant attention be paid to critical information infrastructure protection initiatives. For many years governments have been protecting strategically critical infrastructures; in recent times, however, the information revolution has transformed all areas of life. The way business is transacted, government operates and national defence is conducted has changed, and these activities now rely on an interdependent network of information technology infrastructures. This increases our exposure to a wide range of new vulnerabilities and threats to the nation's critical infrastructures. These new cyber threats are in many ways significantly different from the more traditional risks that governments have been used to addressing: exploiting security flaws appears now to be far easier, less expensive and more anonymous than ever before. The increasing pervasiveness, connectivity and globalization of information technology, coupled with the rapidly changing, dynamic nature of cyber threats and our commitment to the use of ICT for socio-economic development, brings about the critical need to protect the critical information infrastructures. This means that governments must adopt an integrated approach to protect these infrastructures from cyber threats.

9. RESULTS AND DISCUSSION

The study revealed that three categories of people are involved in committing cybercrime (the idealists, the greed-motivated and the cyber-terrorists), and that these categories of people have contributed in no small measure to cyber terrorism. During the interviews it was learnt that four of the twenty-two people interviewed were victims of cybercrime, and seven others had relatives affected in one way or another. It is equally obvious that cybercrime committed against persons, property and government has claimed millions of US dollars [5] and has affected up to 56% of e-commerce globally [4]. Against this backdrop, the authors offer the recommendations in this paper as a panacea for cybercrime and cybersecurity, with a view to reliable and consistent internet usage in the world.

10. CONCLUSION

Cybercrime and cybersecurity have become subjects of great concern to all governments of the world. Nigeria, representing the single largest concentration of people of African descent, has an important role to play. The situation has almost reached an alarming point, according to various studies, and countries which neglect or fail to respond timely and wisely will pay very dearly for it. It has been deduced from this study that reliance on terrestrial laws is still an untested approach: despite progress being made in many countries, they still rely on standard terrestrial laws to prosecute cyber crimes, and these laws are archaic statutes that existed before the coming of cyberspace. Weak penalties also limit deterrence: even countries with updated criminal statutes still attach weak penalties to them, which cannot deter criminals from committing crimes that have large-scale economic and social effects on society. Further, a global patchwork of laws creates little certainty, as little consensus exists among countries regarding which crimes need to be legislated against. Self-protection remains the first line of defense, and a model approach is needed by most countries, especially those in the developing world looking for a model to follow; they recognize the importance of outlawing malicious computer-related acts in a timely manner in order to promote a secure environment for e-commerce. Cybercrime, with its complexities, has proven difficult to combat due to its nature.


Extending the rule of law into cyberspace is a critical step towards creating a trustworthy environment for people and businesses. Since the provision of such laws to effectively deter cybercrime is still a work in progress, it becomes necessary for individuals and corporate bodies to fashion out ways of providing security for their systems and data. To provide this self-protection, organizations should focus on implementing cyber-security plans addressing people, process and technology issues; more resources should be put into educating employees of organizations on security practices, and organizations should "develop thorough plans for handling sensitive data, records and transactions and incorporate robust security technology - such as firewalls, anti-virus software, intrusion detection tools and authentication services" [14].

11. RECOMMENDATION

By way of recommendation, the following kinds of actions (in the form of security, education and legislation) are suggested, given the weak nature of global legal protection against cyber crime:

A. Legislation approach:

• Laws should apply to cyber-crime. National governments are still the major authority that can regulate criminal behavior in most places in the world, so a conscious effort by government to put laws in place to tackle cyber-crimes is quite necessary.

• Review and enhance Nigeria cyber law to address the dynamic nature of cyber security threats;

• Ensure that all applicable local legislation is complementary to and in harmony with international laws, treaties and conventions;

• Establish progressive capacity building programmes for national law enforcement agencies;

• There should be a symbiotic relationship between firms, government and civil society to strengthen legal frameworks for cyber-security. An act has to be a crime in each jurisdiction before it can be prosecuted across a border, so nations must define cyber-crimes in a similar manner, to enable them to pass legislation that would fight cyber-crimes locally and internationally.

B. Security approach:

• Strengthening the trust framework, including information security and network security, authentication, privacy and consumer protection, is a prerequisite for the development of the information society and for building confidence among users of ICTs;

• A global culture of Cyber security needs to be actively promoted, developed and implemented in cooperation with all stakeholders and international expert bodies;

• Streamlining and improving the co-ordination on the implementation of information security measures at the national and international level;

• Establishment of a framework for implementation of information assurance in critical sectors of the economy such as public utilities, telecommunications, transport, tourism, financial services, public sector, manufacturing and agriculture and developing a framework for managing information security risks at the national level;

• Establishment of an institutional framework that will be responsible for the monitoring of the information security situation at the national level, dissemination of advisories on latest information security alerts and management of information security risks at the national level including the reporting of information security breaches and incidents;

• Promote secure e-commerce and e-government services;

• Safeguarding the privacy rights of individuals when using electronic communications;

• Develop a national cyber security technology framework that specifies cyber security requirement controls and baselines for individual network users;

• Firms should secure their network information. When organizations provide security for their networks, it becomes possible to enforce property rights laws and to punish whoever interferes with their property.

C. Education/Research


• Improving awareness and competence in information security and sharing best practices at the national level, through the development of a national culture of cybersecurity.

• Formalize the coordination and prioritization of cyber security research and development activities; disseminate vulnerability advisories and threat warnings in a timely manner.

• Implement an evaluation/certification programme for cyber security products and systems;

• Develop, foster and maintain a national culture of security; standardize and coordinate cybersecurity awareness and education programmes;

REFERENCES

[1] Cyber Crime is here to stay. Indian Express, January 2002. http://www.asianlaws.org/press/cybercrime.htm

[2] Cyber-Crime... and Punishment? Archaic Laws Threaten Global Information. December 2000. www.mcconnellinternational.com/services.cybercrime.htm

[3] Golubev's interview, April 2004. http://www.crime-research.org/Golubev_interview_052004/

[4] Prof. Allen Hammond. The 2001 Council of Europe Convention on Cyber-Crime: an Efficient Tool to Fight Crimes in Cyber-Space? June 2001. http://www.magnin.org/Publications/2001.06.SCU.LLMDissertation.PrHammond.COEConvention.Cyber-crime.pdf

[5] Anusuya Sadhu. The Menace of Cyber Crime.

[6] Cybercrime and Punishment. http://www.cyberp.org/Publications/2004.06.SCU.LLMDissertation.Pdf

[7] Towards a Culture of Cybersecurity Research. http://www.cybesecurity.org/Research/2004.06.issertation.Pdf

[8] Whitney, D. E., Nevings, J. L., Defazn, T. L. and Gustavson, R. E. Problems and Issues in Cybersecurity and Cybercrime. 1996.

[9] Scacchi, Walt and Mi, Peiwei. Modelling, Integrating and Enacting Complex Organizational Processes. 5th International Symposium on Cybercrime and Information Dissemination, CA, December 1993.

[10] Cyber Crime is here to stay. Indian Express, January 2002. http://www.asianlaws.org/press/cybercrime.htm

[11] Katz, Eli. Unisys Suite Aims To Detect Criminal Patterns. June 10, 2003. http://www.computerworld.com/industrytopics/financial/story/0,10801,81979,00.html

[12] Katz, Eli. Unisys Suite Aims To Detect Criminal Patterns. June 10, 2003. http://www.computerworld.com/industrytopics/financial/story/0,10801,81979,00.html

[13] Sinrod, Eric J. What's Up With Government Data Mining? September 6, 2004. http://www.ustoday.com/tech/columnist/ericjsinrod/2004-06-09-sinrod_x.htm

[14] Cyber-Crime... and Punishment? Archaic Laws Threaten Global Information. December 2000. www.mcconnellinternational.com/services.cybercrime.htm

[15] Schreiber, G., Wielinga, B. and Jansweijer, W. IJCAI Workshop on Eradicating Cybercrime in the World, August 19-20, 1995.

[16] Douglas A. Barnes. Deworming the internet. Texas Law Review, 83: 279-329, November 2004.

[17] Aaron J. Burstein. Towards a culture of cybersecurity research. Harvard Journal of Law and Technology, 22, 2008.

[18] S. E. Coull, M. P. Collins, C. V. Wright, F. Monrose and M. K. Reiter. On web browsing privacy in anonymised netflows. In Proceedings of the 16th USENIX Security Symposium, pages 339-352, August 2007.

[19] Seymour E. Goodman and Herbert S. Lin, editors. Toward a Safer and More Secure Cyberspace. National Academies Press, 2007.

[20] Joseph P. Liu. The DMCA and the Regulation of Scientific Research. Berkeley Technology Law Journal, 18: 501, 2003.

[21] Daniel J. Solove and Chris Jay Hoofnagle. A model regime of privacy protection. University of Illinois Law Review, pages 1083-1167, 2002.

[22] Eugene Volokh. Crime-facilitating speech. Stanford Law Review, 57: 1095-1222, March 2005.

[23] Mark Allman and Vern Paxson. Issues and etiquette concerning use of shared measurement data. In Proceedings of IMC '07, pages 135-140, October 2007.

[24] Braa, J. and Nermunk, C. (2000). Health Information System in Mongolia: a difficult process of change. In C. Avgerou and G. Walsham (Eds.), Information Technology in Context: Perspectives from Developing Countries. UK: Ashgate.


Knowledge Elicitation for Factors Affecting Taskforce Productivity - Using a Questionnaire

Muhammad Sohail* Institute of Information Technology

Kohat University of Science & Technology (KUST) Kohat, Pakistan

[email protected]

Abdur Rashid Khan Institute of Computing & Information Technology

Gomal University Dera Ismail Khan, Pakistan

[email protected]

Abstract—In this paper we present the process of knowledge elicitation through a structured questionnaire technique, as applied to the problem domain "investigation of factors affecting taskforce productivity". The problem is to be solved using expert system technology, and this study is the very first step: acquiring knowledge from the domain experts. Knowledge elicitation is one of the most difficult tasks in forming the knowledge base, which is a key component of an expert system. The questionnaire was distributed among 105 domain experts in public and private organizations (education institutions, industries, research organizations, etc.) in Pakistan, and a total of 61 responses were received. All the experts were well qualified and highly experienced, and had served a number of times as members of selection committees for different posts. The facts acquired were analyzed, and knowledge was extracted and elicited from them. The questionnaire was given a standard shape for further research as a knowledge acquisition tool; it may be used as a standard document for the selection and promotion of employees.

Keywords- Expert System; Knowledge Acquisition; Knowledge Elicitation; Questionnaire; Taskforce Productivity; domain experts

I. INTRODUCTION

In the mid-sixties the Artificial Intelligence community developed expert systems technology [16]. Expert systems consist of tools used for decision making that employ the reasoning mechanisms of human experts in their areas of expertise [1]. Expert systems are used and developed in great numbers in many fields of life. The knowledge base is the most important component of an expert system [9]. Knowledge acquisition (KA) is the process of gathering problem-solving expertise from human experts and transforming it into a computer program [16]; the process of KA from domain experts is also called knowledge elicitation. Knowledge acquisition and elicitation is one of the most difficult tasks of expert system development [1, 11, 12, 16, 17].

A number of KA techniques are found in the literature [1, 8, 11, 12, 16]. The selection of a KA technique depends upon the nature of the problem, while the accessibility and availability of the domain experts is another aspect to be considered. In this problem domain a questionnaire was used as the knowledge acquisition tool. This technique is most suitable when knowledge has to be acquired from domain experts and a reasonable period of time can be spared.

To solve the problem of finding factors affecting taskforce productivity, domain experts were selected from universities, industries, and semi-government and private organizations in Pakistan.

II. KNOWLEDGE ACQUISITION & ELICITATION

Knowledge elicitation is known to be one of the major bottlenecks in knowledge base development [11].

Acquiring knowledge from human experts and learning from data is knowledge elicitation [6]. The knowledge engineer works with a domain expert who has expertise in the specific domain area, applying knowledge elicitation techniques to acquire knowledge. The knowledge is then coded in computer format to be used as a knowledge base for inference, decisions and deriving new knowledge.

A. Knowledge Elicitation Techniques

Knowledge elicitation techniques are used by the knowledge engineer to acquire knowledge from human experts in order to solve a problem. There are a number of such techniques, for example the interview, the structured interview, the questionnaire, protocol analysis, concept sorting, simulation and prototyping [1, 6, 7, 17]. A technique is adopted according to the nature of the problem.

III. APPROACH

The objective of this study is to find out those factors which truly affect the productivity of the taskforce. This required a complete knowledge elicitation process. The study was completed through the following steps: problem identification, domain expert selection, adoption of a suitable knowledge elicitation technique, questionnaire design and distribution, analysis, and finally the conclusion. In this problem domain, experts were available only in remote areas of the country. Owing to this basic nature of the problem domain, a questionnaire was selected for knowledge acquisition and elicitation.

* This is a part of my MS research work.

A. Problem Identification

The first step in knowledge elicitation was to identify the factors affecting taskforce productivity. Various organizations' Annual Confidential Reports (ACRs) were studied, domain experts were contacted, and factors were chosen for a complete crude questionnaire.

The literature reveals that productivity is directly proportional to the selection process: in order to manage human resources and quality in human resource management, selecting the right talent is important [14, 15].

With an effective personnel selection process an organization can improve its productivity. This happens by selecting the right talent for the right job.

Ref. [2] states that factors like an employee's age, gender, marital status, educational background and work experience predict the employee's work performance and retention. It also emphasizes the proper orientation and management of the selection process carried out by the company's human resource department. Motivation theories emphasize the employee's basic needs, such as food, drink and sleeping hours, and, in a broader scenario, factors like security, love, status, self-respect, growth and accomplishment. Factors like supervision, relationships with supervisors, salary, working conditions, and company policy and administration all affect the productivity of the organization and individual performance [3, 4, 7].

Thus there is a need for a systematic and strong intelligent knowledge base that can assist us in selecting the right talent for the right job and further be used for decision making to improve individual work performance.

B. The Selection Process

Ref. [2] has reviewed how factors like the person, the organization, society, law, market influence, the nature and analysis of work behavior, and information technology achievements affect the selection process and the productivity of human resources.

The application of expert systems and decision support systems to the selection process for finding the right talent is increasing [5].

The organization's objective is very simple: the selection of the right talent for a long period of time, bringing improvements to the current status. The goal of right-talent selection is therefore to apply a valid and effective method that reduces the risk of selecting an unsuitable person and increases the opportunity of finding an eligible employee who can enhance the productivity of the organization [2, 3, 7].

C. The Domain Expert Selection

The next task was to select experts in the problem area for knowledge elicitation and problem solution. The collection of expert knowledge for the knowledge base is the main task of expert system development, and the human expert plays the important role in this scenario [9].

The problem domain is related to the areas of psychology, human resource management, education, industry, and almost all professions.

In order to set standard factors affecting taskforce productivity, these experts were traced through 140 organizations in Pakistan. We found 112 domain experts, along with their relevant information: experience, qualifications, contact addresses and areas of interest. One important issue here was finding domain experts; they were found both in service and recently retired from service, because the domain expert's knowledge is the main objective [9]. See Table I for details.

TABLE I. CHARACTERISTICS OF DOMAIN EXPERTS

The domain experts were characterized as follows:

Gender: Male; Female.

Designation & Nature of Job: Doctors; Engineers; Managers; Administrators; Academicians; Professors; Judges.

Qualification: Post Doc; PhD; M.Phil; Master; Bachelors (MBBS, Engineering); Others.

Subject Specialty: Law; Engineering; Education; Human Resource Management; Psychology; Computer Science; Statistics; Health; Agriculture; Others.

Demographics: Federal; Sindh; Punjab; NWFP; Balochistan.

Therefore the experts of those areas were traced and contacted through various means of communication, such as personal contacts, ordinary post and e-mail.

D. Selection of Knowledge Elicitation Technique

To acquire knowledge, a knowledge elicitation technique has to be applied. The literature study revealed that a number of techniques exist for knowledge elicitation, such as the structured interview, questionnaire, protocol analysis, concept sorting, simulation and prototyping [1, 6, 7, 17].

E. Questionnaire Design and Distribution

The main purpose of the questionnaire was to sort out those factors/competencies that affect taskforce productivity. Blank Annual Confidential Reports (ACRs), progress evaluation techniques, and selection and promotion criteria were given due consideration during questionnaire development. Experts' opinions from various fields of study, such as human resource management, psychology, education and computer science, were taken into account. At last the questionnaire took shape, with more than 100 competencies/factors rated on five fuzzy-logic variables and scales, defined as: 5 = Strongly Agree, 4 = Agree, 3 = Neutral, 2 = Disagree and 1 = Strongly Disagree (APPENDIX-A). The first questionnaire was checked by linguistic experts; some factors were found to have the same meanings and related grades, so changes were made accordingly, resulting in 67 factors. The questionnaire was distributed among 105 experts for evaluation, to finalize the importance of the decision-making factors. These experts were given the right to insert, delete and edit factors, along with grading the fuzzy variables. A period of more than three months was allowed, with reminders from time to time, and we received 61 experts' opinions after four months of struggle. The study found that 57 competencies and factors were of importance, so we evaluated them and set them aside for further analysis. The original data may be seen in the MS thesis research report.

F. Analysis and Findings

We received 61 experts' opinions in the form of filled questionnaires with edits and suggestions. The final questionnaire was achieved after analysis: most of the factors were found to be very important, while some were of relatively lesser importance. For example, the factor "interest in job" was found to be critically sound and of greater importance than the "reasoning capability" factor. See Table II (summary of the factors affecting taskforce productivity, APPENDIX-B). Similarly, APPENDIX-C graphically depicts the analysis of experts' opinions on the first and last five competencies.
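As an illustration of how such ratings can be ranked (a sketch with invented sample data, not the thesis' actual analysis), each factor's importance can be taken as the mean of its expert ratings on the 1-5 scale and the factors sorted by that mean.

    # Rank questionnaire factors by mean expert rating
    # (5 = Strongly Agree ... 1 = Strongly Disagree). Sample ratings invented.
    from statistics import mean

    ratings = {
        "interest in job":      [5, 5, 4, 5, 4],
        "reasoning capability": [3, 4, 3, 2, 4],
        "communication skills": [4, 5, 4, 4, 3],
    }

    for factor in sorted(ratings, key=lambda f: mean(ratings[f]), reverse=True):
        print(f"{factor}: {mean(ratings[factor]):.2f}")

This mirrors the finding above that "interest in job" ranks higher than "reasoning capability".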

IV. CONCLUSION AND RECOMMENDATIONS

The resultant questionnaire may be used as a standard model for promotion, job redesign, job rotation, monitoring & control and selection of employees, leading to smooth operations and sustained growth of governmental and private organizations.

It was concluded that if a questionnaire is properly designed and a systematic process is adopted, then knowledge can be extracted easily from multiple experts, even those living in remote places. This tool may be used as a pedagogical device for students and researchers. This work is not limited to Pakistan but can be extended to all countries of the world.

REFERENCES

[1] Awad, E. M. "Building Expert Systems: Principles, Procedures, and Applications," New York: West Publishing Company, 1996.

[2] Chien, C-F. & Chen, L-F. ”Data mining to improve personnel selection and enhance human capital: A case study in high-technology industry.” Expert Systems with Applications. vol. 34. pp.280–290, 2008.

[3] F. Herrera, E. López, C. Mendaña and M. A., Rodríguez, “A linguistic decision model for personnel management solved with a linguistic biobjective genetic algorithm,” Fuzzy Sets and Systems. vol. 118. no.1. pp. 47-64, 2001.

[4] Herzberg, F. “The Managerial Choice to be Efficient and to be human,” Homewood, IL: Dow Jones-Irwin, 1977.

[5] Hooper, R.S., Galvin, T.P., Kilmer, R.A., & Liebowitz, J., "Use of an expert system in a personnel selection process," Expert Systems with Applications, vol. 14, no. 4, pp. 425-432, 1998.

[6] http://intsys.mgt.qub.ac.uk/notes/kelict.html#2.1

[7] J.M. Werner, "Implications of OCB and Contextual Performance for Human Resource Management," Human Resource Management Review, vol. 10, no. 1, pp. 3-24, 2000.

[8] Khan, A.R. “Expert System for Investment Analysis of Agro-based Industrial Sector,” Bishkek –720001, Kyrgyz Republic, p 248, 2005.

[9] Lightfoot, J.M., "Expert knowledge acquisition and the unwilling experts: a knowledge engineering perspective," Expert Systems, vol. 16, no. 3, pp. 141-147, 1999.

[10] Maslow, A.H. “Motivation and Personality”, 2nd edn, New York: Harper & Row, 1970.

[11] Medsker, L. and Liebowitz, J., "Design and Development of Expert Systems and Neural Networks," New York: Macmillan Publishing Company, 1987.

[12] Sagheb-Tehrani, M., "The design process of expert systems development: some concerns," Expert Systems, vol. 23, no. 2, pp. 116-125, 2006.

[13] Miles, J.C.; Moore, C.J.; Hooper, J.N. ”A structured multi expert knowledge elicitation methodology for the development of practical knowledge based systems,” Knowledge Engineering, IEE Colloquium. pp.6/1 - 6/3, 1990.

[14] M. Nussbaum, M. Singer, R. Rosas, M. Castillo, E. Flies, R. Lara and R. Sommers, “Decision support system for conflict diagnosis in personnel selection,” Information & Management, vol. 36. no. 1. pp. 55-62, 1999.

[15] R. Storey Hooper, T.P. Galvin, R.A. Kilmer and J. Liebowitz, “Use of an expert system in a personnel selection process,” Expert Systems with Applications, vol. 14. no.4. pp. 425-432, 1998.

[16] Wagner, W.P., Najdawi, M.K., Chung, Q.B. “Selection of Knowledge Acquisition techniques based upon the problem domain characteristics of the production and operations management expert systems,” Expert system. vol. 18. no. 2. pp. 76-87, 2001.

[17] Wright, M. P., Gardner, M.T. and Moynihan, M.L. “The impact of HR practices on the performance of business units,” Human Resource Management Journal, vol. 13. no.3. pp. 21-36, 2003.

AUTHORS' PROFILES

Muhammad Sohail is currently pursuing his MS degree in Computer Science at the Institute of Information Technology, Kohat University of Science & Technology (KUST), Kohat, Pakistan. His areas of interest include Expert Systems, DSS, Databases and Data Mining.

Abdur Rashid Khan is presently working as an Associate Professor at ICIT, Gomal University, D.I.Khan, Pakistan. He received his PhD degree from the Kyrgyz Republic in 2004. His research interests include AI, Software Engineering, MIS, DSS and Databases.


APPENDIX-A: A SURVEY TO INVESTIGATE FACTORS AFFECTING TASK FORCE PRODUCTIVITY

The purpose of this study is to explore the personal competencies that affect job performance. Your feedback will form the basis of our research; it would therefore be highly appreciated if you could kindly provide accurate information in time to enable us to complete our research work. The information you provide will be kept completely confidential.

Name: Designation:

Address:

Qualification: Gender:

Age: Field: Experience:

Respondents rated each factor on a five-point scale: Strongly Disagree (1), Disagree (2), Neutral (3), Agree (4), Strongly Agree (5).

1. Power of thinking & logic
2. Analytical skill
3. Power of creative thinking
4. Ability to learn and understand
5. Self confidence
6. Ability to work in groups
7. Ability to work independently
8. Labor commitment
9. Decision-making ability under different situations
10. Academic record
11. Highly qualified
12. Presentation and communication skills
13. Regular and punctuality
14. Cooperation
15. Ability to solve problems
16. Ability to motivate others for work
17. Makes sacrifices for others
18. Patriotism and love for country and humanity
19. Care for the rights of others
20. Leadership qualities
21. Judgments and problem understanding
22. Abide by rules and regulations
23. Power of control distribution among others
24. Behaviors as a whole
25. Family background
26. Job knowledge
27. Awards
28. Grants
29. Resources Utilization
30. Power of honesty
31. Power of sincerity
32. Working environment (hotness, cold, peaceful, safety)
33. Availability of basic needs and incentives
34. Working efficiency and effectiveness
35. Care for standards
36. Professional approach to work
37. Keep his promise
38. Secrete Trust
39. Interest in job
40. Job satisfaction
41. Ability to utilize opportunity of job trainings and courses etc.
42. Strong health physique
43. Want to live in a simple way or want luxuries
44. Working ability (hardworking or not?)
45. Religious minded
46. Liberal minded


(Factors continued; same five-point scale.)

47. Careful
48. Work experience
49. Knowledge of various technologies
50. Power of tolerance and patience
51. Power of awareness
52. Technical Skills
53. Information sharing capability
54. Management skill
55. Planning skills
56. Stress management
57. Abide by rules and regulations


APPENDIX-B
TABLE II. RESPONSE SUMMARY FOR FACTORS AFFECTING TASKFORCE PRODUCTIVITY
(Total responses N = 61; 5 = Strongly Agree, 1 = Strongly Disagree)

Factor | Mean | Std. Deviation
Interest in the job | 4.74 | 0.4435
Power of sincerity | 4.66 | 0.4791
Secrete Trust | 4.66 | 0.4791
Professional approach to work | 4.66 | 0.4791
Care for standards | 4.64 | 0.4842
Keep his promise | 4.64 | 0.4842
Knowledge of various technologies | 4.62 | 0.5217
Power of thinking, logic | 4.61 | 0.4926
Availability of basic needs and incentives | 4.61 | 0.6653
Working ability (hardworking or not?) | 4.57 | 0.4986
Planning skills | 4.57 | 0.5310
Abide by rules and regulations | 4.56 | 0.5331
Job satisfaction | 4.56 | 0.5635
Leadership qualities | 4.56 | 0.5635
Technical Skills | 4.56 | 0.8067
Ability to utilize opportunity of job trainings and courses etc. | 4.54 | 0.7433
Information sharing capability | 4.51 | 0.5361
Job knowledge | 4.51 | 0.5951
Stress management | 4.51 | 0.6487
Power of awareness | 4.49 | 0.5951
Work experience | 4.49 | 0.6982
Judgment and problem understanding | 4.49 | 0.7664
Makes sacrifices for others | 4.49 | 0.7879
Resources Utilization | 4.48 | 0.6978
Working efficiency and effectiveness | 4.48 | 0.8681
Power of creative thinking | 4.46 | 0.8078
Ability to solve problems | 4.44 | 0.8471
Awards Holder | 4.41 | 0.9377
Working environment (hotness, cold, peaceful, safety etc.) | 4.39 | 0.8996
Decision-making ability under different situations | 4.38 | 0.7340
Patriotism and love for country and humanity | 4.36 | 0.8570
Self confidence | 4.36 | 0.8950
Academic record | 4.36 | 0.9135
Cooperation | 4.34 | 0.8344
Family background | 4.34 | 0.8344
Power of tolerance and patience | 4.34 | 0.9108
Power of honesty | 4.33 | 0.8108
Careful | 4.33 | 1.1651
Management skill | 4.31 | 0.8858
Grant producer/Receiver | 4.28 | 1.0022
Highly qualified | 4.26 | 0.9815
Labor commitment | 4.26 | 1.0149
Care for the rights of others | 4.26 | 1.0472
Regular and punctuality | 4.25 | 1.0433
Ability to motivate others for work | 4.23 | 1.0707
Presentation and communication skills | 4.21 | 1.0347
Ability to work independently | 4.15 | 0.8131
Behavior as a whole | 4.13 | 1.1471
Analytical skill | 4.08 | 1.3576
Ability to learn and understand | 4.07 | 0.9810
Religious minded | 4.05 | 1.2169
Ability to work in groups | 4.03 | 1.1250
Liberal minded | 3.92 | 1.1874
Power of control distribution among others | 3.82 | 1.2180
Strong health physique | 3.49 | 1.4677
Want to live in a simple way or want luxuries | 3.03 | 1.6730
Reasoning Capability | 3.03 | 1.6730


APPENDIX-C: Graphic Analysis of Experts' Opinions

[Two bar charts of experts' opinions in percentages (%age SD, D, N, A, SA per competency). The first chart covers five of the highest-rated competencies: power of thinking & logic, behaviour as a whole, power of honesty, power of sincerity, and working efficiency and effectiveness. The second covers the lowest-rated competencies (items 51-57): reasoning capability, ability to learn and understand, ability to work in groups, religious minded, strong health physique, careful, and want to live in a simple way or want luxuries.]


A Novel Generic Session Based Bit Level Encryption Technique to Enhance Information Security

Manas Paul (1), Tanmay Bhattacharya (2), Suvajit Pal (3), Ranit Saha (4)

(1) Sr. Lecturer, Department of Computer Application, JIS College of Engineering, Kalyani, West Bengal; e-mail: [email protected]
(2) Sr. Lecturer, Department of Information Technology, JIS College of Engineering, Kalyani, West Bengal; e-mail: [email protected]
(3) Student, Department of Information Technology, JIS College of Engineering, Kalyani, West Bengal; e-mail: [email protected]
(4) Student, Department of Information Technology, JIS College of Engineering, Kalyani, West Bengal; e-mail: [email protected]

Abstract - In this paper a session based symmetric key encryption system, termed the Permutated Cipher Technique (PCT), is proposed. The technique is faster, suitable and secure for larger files. The input file is broken down into blocks of various sizes (of order 2^n) and encrypted by shifting the position of each bit by a certain value a certain number of times. A key is generated randomly, and the length of each block is determined from it. Each block length generates a unique value of "number of bits to be skipped" (NBSk), which determines the new position of the bits to be shifted within the block. After the shifting and inverting, each block is XORed with the SHA-512 digest of the key; the resultant blocks form the cipher text. The key is generated according to the binary value of the input file size. Decryption follows the same process, as the technique is symmetric.

Keywords - Permutated Cipher Technique (PCT); Session Based; Number of Bits to Skip (NBSk); Maximum Iterations (MaxIter); Iterations done for encrypting (eIter); Iterations done for decrypting (dIter); Symmetric Key.

I. INTRODUCTION

In the age of science and technology, everybody uses the internet in almost every discipline of daily life. The internet has made our communication faster and easier; hence, maintaining the security of essential information is of utmost importance, and many researchers are working on encrypting the data transacted through the internet. The encryption process converts the transacted data into cipher text, which is illegible without the proper decryption technique. Many algorithms based on various mathematical models are available, but each has its own share of merits and demerits; no single algorithm is perfect. As a result, continual research is being carried out in the field of cryptography to enhance security even further.

In this paper a new technique is proposed, in which the source message is considered as a stream of binary bits whose positions are shifted [1] to create the encrypted message. The key is generated in a unique random manner [2, 3] (using a cryptographically strong pseudo-random number generator, RFC 1750), and all the variables involved are calculated on the basis of the randomly generated sequence of numbers within the key. Section II of the paper describes the proposed scheme with block diagrams. Section III discusses the key generation, encryption and decryption procedures; Section IV shows the results on different files and a comparison of the proposed technique with TDES [4] and AES [5]; Sections V and VI deal with conclusions and future scope respectively.

II. THE SCHEME

The PCT algorithm consists of three major divisions:

• Key Generation: Input File → Key Generator → Key (K)
• Encryption Mechanism: Input File + Key (K) → Encrypted File
• Decryption Mechanism: Encrypted File + Key (K) → Decrypted File


III. PROPOSED ALGORITHM

Key Generation Algorithm:

1. The total length of the input file is calculated, and the corresponding binary representation of this value is stored in an array.
2. Two index values of the array (mH, mL) are selected randomly.
3. The content of cell mH is decreased by a certain number x, and x multiplied by 2^(mH-mL) is added to the contents of cell mL. These steps are repeated a random number of times.
4. The index values are written into another array, each repeated the number of times given by its cell value.
5. The contents of this new array are swapped a random number of times.
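Note that step 3 is value-preserving: removing x from cell mH and adding x·2^(mH-mL) to cell mL leaves the sum of a[i]·2^i, i.e. the file length, unchanged, so the key stays tied to the input size. A hedged Java sketch of steps 1-5 follows (our own simplifications: x is fixed to 1, and the class and method names are hypothetical, not from the paper):

import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PctKeyGenSketch {

    // Steps 1-5 of the key generation, for a positive file length.
    // Removing 1 from cell mH and adding 2^(mH-mL) to cell mL keeps
    // sum(a[i] * 2^i) equal to fileLength throughout.
    static List<Integer> generateKey(long fileLength, int rounds) {
        SecureRandom rnd = new SecureRandom();               // RFC 1750-style CSPRNG
        int n = Math.max(2, 64 - Long.numberOfLeadingZeros(fileLength));
        long[] a = new long[n];
        for (int i = 0; i < n; i++) a[i] = (fileLength >>> i) & 1L;  // step 1

        for (int r = 0; r < rounds; r++) {                   // steps 2-3
            int mH = 1 + rnd.nextInt(n - 1);                 // higher index
            int mL = rnd.nextInt(mH);                        // lower index
            if (a[mH] == 0) continue;                        // nothing to move
            a[mH] -= 1;                                      // x = 1 for simplicity
            a[mL] += 1L << (mH - mL);
        }

        List<Integer> key = new ArrayList<>();               // step 4: repeat each index
        for (int i = 0; i < n; i++)
            for (long c = 0; c < a[i]; c++) key.add(i);

        Collections.shuffle(key, rnd);                       // step 5: random swaps
        return key;      // random sequence driving the per-block parameters
    }

    public static void main(String[] args) {
        System.out.println(generateKey(221L, 8));            // e.g. a 221-byte input
    }
}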

Encryption Algorithm:

1. A block size is read from the key. If enough bytes are available in the input stream to fill the block, proceed; otherwise the required padding is added to reach the block size.
2. Odd bits are flipped.
3. Only the bits in even positions are now considered; let i denote the even position containing a bit. For each even bit:
   • the bit at position i is extracted from the block;
   • the bit is set to its new position using the bitwise OR (|) operator;
   • i is increased by the number of bits to skip (NBSk).
4. Finally, after the entire block is encrypted, it is XORed with the SHA-512 [6] digest of the key.
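Steps 2 and 4 are concrete enough to sketch directly in Java (the implementation language used in Section IV). The fragment below flips the odd-position bits of a block held as a boolean array and applies the final SHA-512 whitening; the even-bit permutation of step 3 is omitted because its NBSk-driven indexing comes from the session key. Class and method names are ours, not the authors':

import java.security.MessageDigest;

public class PctStepsSketch {

    // Step 2: flip every bit in an odd position (1-based), i.e. 0-based
    // indices 0, 2, 4, ... of the block's bit array.
    static boolean[] flipOddPositions(boolean[] bits) {
        boolean[] out = bits.clone();
        for (int i = 0; i < out.length; i += 2) out[i] = !out[i];
        return out;
    }

    // Step 4: XOR the encrypted block with the SHA-512 digest of the key,
    // repeating the 64-byte digest across longer blocks.
    static byte[] xorWithKeyDigest(byte[] block, byte[] key) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-512").digest(key);
        byte[] out = new byte[block.length];
        for (int i = 0; i < block.length; i++)
            out[i] = (byte) (block[i] ^ digest[i % digest.length]);
        return out;
    }

    public static void main(String[] args) throws Exception {
        byte[] demo = xorWithKeyDigest(new byte[] {0x5A, 0x3C}, "session-key".getBytes());
        System.out.println(demo.length + " bytes whitened");
    }
}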

Decryption Algorithm:

1. A block size is read from the key, and the necessary variables, such as MaxIter, are calculated.


Example (key generation). The binary representation of the file length is stored in an array indexed 8…1:

index: 8 7 6 5 4 3 2 1
value: 1 1 0 1 1 1 0 1

Choosing mH = 5, mL = 1 and x = 1: a[mH] = 1 - 1 = 0 and a[mL] = x · 2^(mH-mL) + a[mL] = 1 · 2^(5-1) + 1 = 17, giving:

index: 8 7 6 5 4 3 2 1
value: 1 1 0 0 1 1 0 17

After repeating a random number of times we have something like:

index: 8 7 6 5 4 3 2 1
value: 3 5 4 17 8 9 11 12

New array (writing the index values, each repeated by its cell value): 1 1 1 2 2 2 2 2 …

Swapping the new array a random number of times gives a random sequence: 2 4 2 1 1 5 4 2 …


2. The block is XORed again with the SHA-512 digest of the key.
3. Only the bits in even positions are considered; let i denote the even position containing a bit. For each even bit:
   • the bit at position i is extracted from the encrypted block;
   • the bit is set back to its correct position using the bitwise OR (|) operator;
   • i is increased by the number of bits to skip (NBSk).
4. Odd bits are flipped again to get back the original bits in the odd positions.

IV. RESULTS AND ANALYSIS

In this section the implementation on different types of files is presented. Files were chosen at random, comprising various file sizes and types. We present an analysis of 20 files of 8 different types, with sizes varying from 330 bytes to 62,657,918 bytes (59.7 MB), for two standard algorithms, Triple-DES 168-bit and AES 128-bit, and the proposed PCT algorithm. The analysis includes comparison of encryption and decryption times, Chi-Square values, Avalanche and Strict Avalanche effects, and Bit Independence. All algorithms and tests were implemented in Java.

IV-I. ENCRYPTION AND DECRYPTION TIME

Tables I and II show the encryption and decryption times against increasing file size for Triple-DES 168-bit, AES 128-bit and the proposed PCT technique. For most file sizes and types the proposed PCT takes less time to encrypt/decrypt than T-DES, and nearly the same time to decrypt as AES. The tables show that as file size increases the proposed PCT performs much better than Triple-DES and its performance matches the AES technique; for even larger files, of size greater than 1 GB, the proposed PCT technique can even outperform AES. Figs. 1 and 2 show the graphical representation of the same on a logarithmic scale.

Table I. File size vs. encryption time in seconds (for Triple-DES, AES and PCT algorithms)

No.  Source File Size (bytes)  File Type  TDES  AES  PCT

1 330 dll 0.001 0.001 0.004

2 528 txt 0.001 0.001 0.008

3 96317 txt 0.034 0.004 0.020

4 233071 rar 0.082 0.011 0.062

5 354304 exe 0.123 0.017 0.081

6 536387 zip 0.186 0.023 0.133

7 657408 doc 0.220 0.031 0.234

8 682496 dll 0.248 0.031 0.066

9 860713 pdf 0.289 0.038 0.114

10 988216 exe 0.331 0.042 0.155

11 1395473 txt 0.476 0.059 0.165

12 4472320 doc 1.663 0.192 0.371

13 7820026 avi 2.626 0.334 0.651

14 9227808 zip 3.096 0.397 0.474

15 11580416 dll 4.393 0.544 0.792

16 17486968 exe 5.906 0.743 1.884

17 20951837 rar 7.334 0.937 1.578

18 32683952 pdf 10.971 1.350 2.077

19 44814336 exe 15.091 1.914 2.974

20 62657918 avi 21.133 2.689 5.870

Table II. File size vs. decryption time in seconds (for Triple-DES, AES and PCT algorithms)

No.  Source File Size (bytes)  File Type  TDES  AES  PCT

1 330 dll 0.001 0.001 0.002

2 528 txt 0.001 0.001 0.006

3 96317 txt 0.035 0.008 0.027

4 233071 rar 0.087 0.017 0.056

5 354304 exe 0.128 0.025 0.069

6 536387 zip 0.202 0.038 0.058

7 657408 doc 0.235 0.045 0.198

8 682496 dll 0.266 0.046 0.128

9 860713 pdf 0.307 0.060 0.088

10 988216 exe 0.356 0.070 0.130

11 1395473 txt 0.530 0.098 0.298

12 4472320 doc 1.663 0.349 0.482

13 7820026 avi 2.832 0.557 0.594

14 9227808 zip 3.377 0.656 0.448

15 11580416 dll 4.652 0.868 0.871

16 17486968 exe 6.289 1.220 1.575

17 20951837 rar 8.052 1.431 1.803

18 32683952 pdf 11.811 2.274 3.312

19 44814336 exe 16.253 3.108 2.948

20 62657918 avi 22.882 4.927 5.300


Fig. 1. Encryption Time (sec) vs. File Size (bytes) in logarithmic scale

Fig. 2. Decryption Time (sec) vs. File Size (bytes) in logarithmic scale

IV-II. STUDIES ON AVALANCHE, STRICT AVALANCHE EFFECTS AND BIT INDEPENDENCE CRITERION

The Avalanche and Strict Avalanche effects and the Bit Independence criterion are measured using statistical analysis of the data: the bit changes among the encrypted bytes for a single-bit change in the original message sequence are observed over the entire file or a relatively large number of bytes. The standard deviation from the expected values is calculated, and the ratio of the calculated standard deviation to the expected value is subtracted from 1.0 to obtain the Avalanche and Strict Avalanche effects on a 0.0-1.0 scale; the closer the value is to 1.0, the better the Avalanche, Strict Avalanche effects and Bit Independence criterion achieved. Up to 5 significant digits to the right of the decimal point are considered for more accurate interpretation. For better visual interpretation, the y-axis scale for the Avalanche and Strict Avalanche graphs runs from 0.9 to 1.0. Tables III, IV and V show the Avalanche, Strict Avalanche and Bit Independence results respectively; Figs. 3, 4 and 5 show the graphical representation of the same.
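The measurement underlying all three criteria is a count of bit differences between the ciphertexts of two plaintexts differing in a single bit. A minimal Java helper for that count is sketched below (names ours; the paper's statistic additionally normalizes a standard deviation over many such trials):

public class AvalancheSketch {
    // Fraction of ciphertext bits that differ between two equal-length
    // ciphertexts, e.g. obtained from plaintexts differing in one bit.
    static double bitChangeRatio(byte[] c1, byte[] c2) {
        int changed = 0;
        for (int i = 0; i < c1.length; i++) {
            changed += Integer.bitCount((c1[i] ^ c2[i]) & 0xFF);
        }
        return changed / (8.0 * c1.length);   // ~0.5 is the ideal for a block cipher
    }
}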

Table III. Avalanche effect achieved for TDES, AES and PCT algorithms

No.  Source File Size (bytes)  File Type  TDES  AES  PCT

1 330 dll 0.99591 0.98904 0.96537

2 528 txt 0.99773 0.99852 0.97761

3 96317 txt 0.99996 0.99997 0.99083

4 233071 rar 0.99994 0.99997 0.99558

5 354304 exe 0.99996 0.99999 0.99201

6 536387 zip 0.99996 0.99994 0.99671

7 657408 doc 0.99996 0.99999 0.99438

8 682496 dll 0.99998 1.00000 0.99285

9 860713 pdf 0.99996 0.99997 0.99485

10 988216 exe 1.00000 0.99998 0.98785

11 1395473 txt 1.00000 1.00000 0.99452

12 4472320 doc 0.99999 0.99997 0.99100

13 7820026 avi 1.00000 0.99999 0.99555

14 9227808 zip 1.00000 1.00000 0.99999

15 11580416 dll 1.00000 0.99999 0.99865

16 17486968 exe 1.00000 0.99999 0.99908

17 20951837 rar 1.00000 1.00000 1.00000

18 32683952 pdf 0.99999 1.00000 0.99996

19 44814336 exe 0.99997 0.99997 0.99986

20 62657918 avi 0.99999 0.99999 0.99994

Table IV. Strict Avalanche effect achieved for TDES, AES and PCT algorithms

No.  Source File Size (bytes)  File Type  TDES  AES  PCT

1 330 dll 0.98645 0.98505 0.89638

2 528 txt 0.99419 0.99311 0.96829

3 96317 txt 0.99992 0.99987 0.97723

4 233071 rar 0.99986 0.99985 0.99236

5 354304 exe 0.99991 0.99981 0.98872

6 536387 zip 0.99988 0.99985 0.99552

7 657408 doc 0.99989 0.99990 0.99121

8 682496 dll 0.99990 0.99985 0.98606

9 860713 pdf 0.99990 0.99993 0.98800

10 988216 exe 0.99995 0.99995 0.97259

11 1395473 txt 0.99990 0.99996 0.99123

12 4472320 doc 0.99998 0.99995 0.98320

13 7820026 avi 0.99996 0.99996 0.99167

14 9227808 zip 0.99997 0.99998 0.99997

15 11580416 dll 0.99992 0.99998 0.99780

16 17486968 exe 0.99996 0.99997 0.99847

17 20951837 rar 0.99998 0.99996 0.99996

18 32683952 pdf 0.99997 0.99998 0.99992

19 44814336 exe 0.99991 0.99990 0.99982

20 62657918 avi 0.99997 0.99998 0.99989


Table V. Bit Independence criterion achieved for TDES, AES and PCT algorithms

No.  Source File Size (bytes)  File Type  TDES  AES  PCT

1 330 dll 0.49180 0.47804 0.39186

2 528 txt 0.22966 0.23056 0.20963

3 96317 txt 0.41022 0.41167 0.42861

4 233071 rar 0.99899 0.99887 0.98364

5 354304 exe 0.92538 0.92414 0.93465

6 536387 zip 0.99824 0.99753 0.99265

7 657408 doc 0.98111 0.98030 0.97279

8 682496 dll 0.99603 0.99560 0.96586

9 860713 pdf 0.97073 0.96298 0.96726

10 988216 exe 0.91480 0.91255 0.92929

11 1395473 txt 0.25735 0.25464 0.24621

12 4472320 doc 0.98881 0.98787 0.95390

13 7820026 avi 0.98857 0.98595 0.96813

14 9227808 zip 0.99807 0.99817 0.99804

15 11580416 dll 0.86087 0.86303 0.86049

16 17486968 exe 0.83078 0.85209 0.85506

17 20951837 rar 0.99940 0.99937 0.99936

18 32683952 pdf 0.95803 0.95850 0.95785

19 44814336 exe 0.70104 0.70688 0.82622

20 62657918 avi 0.99494 0.99451 0.99664

Fig. 3. Comparison of Avalanche effect between TDES, AES and PCT

Fig. 4. Comparison of Strict Avalanche effect between TDES, AES and PCT

Fig. 5. Comparison of Bit Independence criterion between TDES, AES and PCT

IV-III. ANALYSIS OF CHARACTER FREQUENCIES

The distribution of character frequencies is analyzed for a text file under T-DES, AES and the proposed PCT algorithm. Fig. 6 shows the pictorial representation of the distribution of character frequencies for the different techniques. Fig. 6(a) shows the distribution of characters in the source file 'redist.txt' (size 1,395,473 bytes); Figs. 6(b) and 6(c) show the distribution of characters in the files encrypted with T-DES and AES respectively; Fig. 6(d) gives the distribution of characters in the file encrypted using PCT. Both the standard techniques and the proposed PCT technique show a well-distributed spectrum of characters. From this observation it may be concluded that the proposed technique may provide good security.


Fig. 6(a). Distribution of characters in source file

Fig. 6(b): Distribution of characters in TDES

Fig. 6(c). Distribution of characters in AES

Fig. 6(d). Distribution of characters in PCT

Figure 6. Frequency spectrum under different encryption techniques.

IV-IV. TESTS FOR NON-HOMOGENEITY

The well accepted parametric Chi-Square test has been performed to test the non-homogeneity between the source and encrypted files; large Chi-Square values confirm the heterogeneity of the source and the encrypted files. The test has been performed on the source file and on the files encrypted with the proposed PCT technique and the existing TDES and AES techniques. Table VI shows the Chi-Square values for different file sizes.

From Table VI we may conclude that Chi-Square values depend on the file type as well as the file size; text files generally show a large Chi-Square value. In all cases the Chi-Square values of the proposed technique are at par with the values from the standard TDES and AES encryptions. The graphical representation of the Chi-Square values is given in Fig. 7 on a logarithmic scale.
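For reference, the per-byte Chi-Square computation used for this kind of test can be sketched as follows (a hedged illustration with our own names; the paper does not state its exact binning, so the source frequencies are taken here as the expected distribution):

public class ChiSquareSketch {
    // Chi-square of the encrypted byte frequencies against the source
    // frequencies taken as the expected distribution (zero bins skipped).
    static double chiSquare(long[] observed, long[] expected) {
        double chi = 0.0;
        for (int b = 0; b < 256; b++) {
            if (expected[b] == 0) continue;        // avoid division by zero
            double d = observed[b] - expected[b];
            chi += d * d / expected[b];
        }
        return chi;
    }

    // Byte-value histogram (256 bins) of a file's contents.
    static long[] frequencies(byte[] data) {
        long[] f = new long[256];
        for (byte x : data) f[x & 0xFF]++;
        return f;
    }

    public static void main(String[] args) {
        long[] src = frequencies("hello world".getBytes());
        long[] enc = frequencies(new byte[] {12, -7, 55, 55, 0, 3, 9, 81, 2, 44, 17});
        System.out.println("chi-square = " + chiSquare(enc, src));
    }
}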

Table VI. Chi-Square values

No.  Source File Size (bytes)  File Type  TDES  AES  PCT

1 330 dll 922.36 959.92 895.98

2 528 txt 1889.35 1897.77 1940.65

3 96317 txt 23492528.45 23865067.21 20194563.10

4 233071 rar 997.78 915.96 973.19

5 354304 exe 353169.83 228027.38 176908.02

6 536387 zip 3279.56 3510.12 3362.39

7 657408 doc 90750.68 88706.29 88074.14

8 682496 dll 29296.79 28440.42 26668.99

9 860713 pdf 59797.35 60661.50 56190.58

10 988216 exe 240186.48 245090.50 257426.16

11 1395473 txt 5833237390.99 5545862604.40 6778413715.96

12 4472320 doc 102678.48 102581.31 99973.55

13 7820026 avi 1869638.73 1326136.98 808552.25

14 9227808 zip 37593.98 37424.24 36755.77

15 11580416 dll 28811486.61 17081530.73 13773034.65

16 17486968 exe 8689664.61 8463203.56 8003002.32

17 20951837 rar 25615.74 24785.41 26517.84

18 32683952 pdf 13896909.50 13893011.19 15313939.92

19 44814336 exe 97756312.18 81405043.92 500344725.05

20 62657918 avi 3570872.51 3571648.48 4898122.07

Fig. 7. Chi-Square values for TDES, AES and PCT techniques on a logarithmic scale.

V. CONCLUSION

The PCT algorithm has been developed keeping "randomness" and "unpredictability" in mind. It essentially works on session key concepts, with a key and a virtual second key derived from the main key.


Standard language-based secure random number generating functions conforming to RFC 1750 (for example, the SecureRandom class in the Java library) rest on specific algorithms and formulas available for anyone to understand through the open source community, yet they guarantee the generation of unpredictable random numbers, giving an extra edge in security for key generation. Still, the PCT algorithm does not depend totally on these random numbers; it incorporates a unique key generation algorithm which is partially independent of the standard sequence of random numbers.

The proposed technique presented in this paper is simple and easy to implement, and the key space increases with file size. The major requirements of a good block cipher, viz. the Avalanche and Strict Avalanche effects and the Bit Independence criterion, are satisfied and are comparable with the industry-standard Triple-DES and AES algorithms. The shifting of bits itself gives a high degree of diffusion, and the final XOR with the SHA-512 digest of the key gives the desired level of confusion. Performance improves drastically for larger files: it is at par with or better than AES encryption for very large files, and significantly better than TDES.

VI. FUTURE SCOPE

Many experiments can be built from the basic shifting of bits by assigning different values to MaxIter for the corresponding block lengths. For example, we may flip even bits and shift odd bits, or shift both even and odd bits, one to the right and the other to the left. An increase in the number of iterations will increase the security and give better results in terms of the Avalanche, Strict Avalanche and Bit Independence criteria, at the cost of increased encryption time. A correct balance between security and time complexity should be the goal.

REFERENCES

[1] J.K. Mandal, S. Dutta, "A Universal Bit-level Encryption Technique," Seventh Vigyan Congress, Jadavpur University, India, 28 Feb - 1 March, 2000.

[2] J.K. Mandal, P.K. Jha, "Encryption Through Cascaded Recursive Key Rotation of a Session Key with Transposition and Addition of Blocks (CRKRTAB)," Proc. National Conference on Recent Trends in Information Systems (ReTIS-06), IEEE Calcutta Chapter & Jadavpur University, 14-15 July, 2006.

[3] J.K. Mandal, P.K. Jha, "Encryption through Cascaded Arithmetic Operation on Pair of Bits and Key Rotation (CAOPBKR)," National Conference on Recent Trends in Intelligent Computing (RTIC-06), Kalyani Government Engineering College, Kalyani, Nadia, India, 17-19 Nov. 2006, pp. 212-220.

[4] "Triple Data Encryption Standard," FIPS PUB 46-3, Federal Information Processing Standards Publication, Reaffirmed 25 October 1999, U.S. Department of Commerce / National Institute of Standards and Technology.

[5] "Advanced Encryption Standard," Federal Information Processing Standards Publication 197, November 26, 2001.

[6] "Secure Hash Algorithm - 512," Federal Information Processing Standards Publication 180-2, 2001.


An Application of Bayesian Classification to Interval Encoded Temporal Mining with Prioritized Items

C. Balasubramanian*
Department of Computer Science and Engineering, K.S. Rangasamy College of Technology, Namakkal-Dt., Tamilnadu, India
[email protected]

Dr. K. Duraiswamy
Department of Computer Science and Engineering, K.S. Rangasamy College of Technology, Namakkal-Dt., Tamilnadu, India
[email protected]

Abstract - In real life, media information has time attributes, either implicit or explicit, known as temporal data. This paper investigates the usefulness of applying Bayesian classification to an interval encoded temporal database with prioritized items. The proposed method performs temporal mining by encoding the database with weighted items, which prioritizes the items according to their importance from the user's perspective. Naïve Bayesian classification helps in making the resulting temporal rules more effective. The proposed priority based temporal mining (PBTM) method, combined with classification, aids in solving problems in a well informed and systematic manner. The experimental results are obtained from the complaints database of a telecommunications system, showing the feasibility of this method of classification based temporal mining.

Keywords: Encoded Temporal Database; Weighted Items; Temporal Mining; Priority Based Temporal Mining (PBTM); Naïve Bayesian Classification

I. INTRODUCTION

The value of maintaining large databases lies in the meaningful information extracted from them. Databases are large collections of transactions present in organizations, and data mining deals with retrieving valuable information from the enormous amount of data stored in them [1]. These transactions may have occurred at varied time points. Mining that gives importance to the time at which transactions take place is called temporal mining: the data are clustered based on time and the association rules are then determined [2].

A temporal database consists of transactions of the form:

<Transaction-ID, items, valid time>

A temporal association rule is a binary pair (AR, TimeExp), in which the left-hand element "AR" is an association rule expressed as

X ⇒ Y, X ⊂ I, Y ⊂ I, X ∩ Y = ∅

and "TimeExp" is a time expression [3].

*Corresponding author. Tel. +91 4288 274741; fax: +91 4288 274745. E-mail address: [email protected] (C. Balasubramanian).

Data mining, especially association rule discovery, tries to find interesting patterns in databases that represent the meaningful relationships between items. Association rule mining applied to large databases consumes much time, because every transaction has to be scanned at least once by whatever mining method is applied. As a solution to this problem, an encoding method is considered, which reduces the size of the database and hence the processing time required for mining [4].

In this paper, a method of encoding is proposed for a temporal database and thereafter temporal mining is performed using the Priority Based Temporal Mining method. In this method encoding is performed based on the valid time and on the weight assigned to the items in the particular transaction. This minimizes the amount of data processed while the database is scanned for association rules mining. Association rule mining giving varying importance to the different items of the transactions is called weighted mining [5]. Priority based temporal mining involves mining based on the weights assigned to the items in the transaction according to their importance and based on the time at which the transaction had taken place. This method gives better results in terms of time and computation complexity.

Naïve Bayesian classification assumes class conditional independence to simplify the computations involved [6]. In the proposed method this is used to make the resulting temporal association rules more effective in finding solutions.


The rest of the paper is organized as follows. Section 2 gives a summary of the research works carried out in areas related to temporal mining and classification. Section 3 describes the proposed classification based temporal mining methodology. Section 4 briefs the method of data preparation, which involves defining the valid time interval for encoding and temporal mining. Section 5 presents the explanation of the encoding method, which involves the merging concept. Section 6 explains how frequent itemsets are generated using the method under consideration. Section 7 presents the process of association rule mining from an encoded temporal database involving the expansion concept. Section 8 outlines the application of naïve Bayesian classification to temporal mining. Section 9 gives the performance of the above mentioned method when applied on the complaints database of a telecommunications system. Finally, in Section 10 the conclusion and future work are stated, which includes the best features of the described method and the ways in which it can be further improved.

II. RELATED RESEARCH WORKS

Association rule identification is an integral part of any data mining process. An association is said to be present among the items if the presence of some items also means the presence of some other items [7]. Several mining algorithms have been proposed to find the association rules from the transactions [7][8][9][10]. The large itemsets were identified to find the association rules. First, the itemsets which satisfy the predefined minimum support were identified and these were called the large itemsets. Then, the association rules were identified from these itemsets. The association rules which satisfy the predefined minimum confidence were the association rules produced as the output [7]. Also, in the Graph-Based approach for discovering various types of association rules based on large itemsets, the database was scanned once to construct an association graph and then the graph was traversed to generate all large itemsets [11]. This method avoids making multiple passes over the database.

In addition to the above methods of association rule mining, which overlook the time components usually attached to transactions in databases, the concept of temporal mining was proposed to give importance to time constraints [3]. The concept of valid time was used to find the time interval during which a transaction is active. Time interval expansion and mergence, which give importance to the time at which a transaction took place, were performed before applying the graph mining algorithm [11] to identify the temporal association rules. For discovering association rules from temporal databases [12], the enumeration operation of the temporal relational algebra was used to prepare the data; the incremental association rule mining technique was applied to a series of datasets obtained over consecutive time intervals to observe the changes in association rules and their statistics over time. Temporal issues of association rules were also addressed with corresponding algorithms, a language and a system [13][14] for discovering temporal association rules.

Further, to mine rules based on the priority assigned to the elements, weighted mining was proposed to reflect the varying importance of different items [5]. Each item was attached a weight, a number given by users; weighted support and weighted confidence were then defined to determine interesting association rules.

In general, a fuzzy approach leads to results which are similar to human reasoning. A fuzzy approach involving the enhancement over AprioriTid algorithm was identified, which had the advantage of reduced computational time [15]. Also, in the mining of fuzzy weighted association rules [16], great importance had been given to the fuzzy mining concept.

Application of supervised and unsupervised learning approaches and a study of different machine learning algorithms for classification help in applying classification to association rule mining. In the approach in [17], a user account is split into normal and fraudulent activities using a detailed day-by-day characterization; the area under the curve is used as the statistic that exhibits the classification performance of each case, and the characteristics of a user, accumulated in time, yield discriminative results. The approach in [18] begins by studying the content of a large collection of emails which have already been classified as spam or legitimate; four machine learning methods for anti-spam filtering are discussed, and an empirical evaluation is based on a spam-filtering benchmark. The approach in [19] predicts the itemset: in classifying a test object, the first rule in the set of ranked rules that matches the test object's condition performs the classification.

This paper proposes a classification based encoded weighted temporal mining method, which identifies temporal association rules from an encoded temporal database with weighted items. The temporal database consists of transactions with their corresponding valid time intervals. The proposed method classifies the resulting association rules


in such a way that the methodology for obtaining the solution is made easy.

III. THE PROPOSED CLASSIFICATION BASED TEMPORAL MINING METHODOLOGY

The proposed methodology for classification based temporal mining is depicted in Fig. 1.

Fig. 1. The proposed classification based temporal mining methodology. [Flowchart: data preparation → encoding of database → frequent itemset identification (with support and confidence) → temporal association rule mining (with time constraints) → naïve Bayesian classification → temporal rules → rules for easy solution.]

The proposed methodology for classification based temporal mining has the following steps.

1. Data preparation: describes the way in which the time intervals are set up; the choice of the time depends upon the user and the application.
2. Encoding of database: encodes the temporal database for the given time in such a way that redundancy is avoided by merging.
3. Frequent itemset generation: identifies the large itemsets by applying the priority based temporal mining method.
4. Temporal association rule mining: identifies the association rules in each interval and applies the transitive property to find the relationship between the association rules resulting from different time intervals; it also expands the time intervals if they are continuous and hold the same association rules satisfying the minimum confidence.
5. Naïve Bayesian classification: identifies the probabilities for the classes involved, which leads to temporal rules for easy solutions.

IV. DATA PREPARATION

In this step, depending upon the user's choice and the application under consideration, the length of the interval during which data is accumulated and the number of such intervals are decided. Then the subsets of the transactions in the temporal database which conform to each of the time intervals are extracted. The application under consideration is the addressing of complaints from a complaints database of the telecommunications department.

V. THE ENCODING OF DATABASE

The most common approach to interval encoding of temporal databases is to use intervals as codes for one-dimensional sets of time instants [3][12]. The choice of this representation is based on the following empirical observation: sets of time instants describing the validity of a particular fact in the real world can often be described by an interval or a finite union of intervals. For simplicity, a discrete integer-like structure of time is assumed; however, dense time can also be accommodated by introducing open intervals.

Let the interval-based domain be TI and let TP = (T, <) be a discrete linearly ordered point-based temporal domain. The set I(T) is defined as

I(T) = {(a, b) : a ≤ b, a ∈ T ∪ {-∞}, b ∈ T ∪ {∞}}



where < is the order over TP extended with {(-∞, a), (a, ∞), (-∞, ∞) : a ∈ T}. The elements of I(T) are denoted by [a, b], the usual notation for intervals. Four relations on the elements of I(T) are defined as follows, for [a, b], [a', b'] ∈ I(T):

([a, b] <-- [a', b']) ⇔ a < a'
([a, b] <+- [a', b']) ⇔ b < a'
([a, b] <-+ [a', b']) ⇔ a < b'
([a, b] <++ [a', b']) ⇔ b < b'

The structure TI = (I(T), <--, <+-, <-+, <++) is the interval-based temporal domain corresponding to TP.

A concrete (timestamp) temporal database is defined analogously to the abstract (timestamp) temporal database. The only difference is that the temporal attributes range over intervals (TI) rather than over the individual time instants (TP).
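As a concrete illustration, the four interval orders can be coded directly. The sketch below (type and method names are ours) represents I(T) over long time points, encoding -∞ and +∞ as Long.MIN_VALUE and Long.MAX_VALUE:

// A sketch of the interval-based domain I(T) over long time points;
// -infinity and +infinity are encoded as Long.MIN_VALUE / Long.MAX_VALUE.
public record Interval(long a, long b) {
    public Interval {
        if (a > b) throw new IllegalArgumentException("need a <= b");
    }
    public boolean ltMinusMinus(Interval o) { return a < o.a; } // <--  : a < a'
    public boolean ltPlusMinus(Interval o)  { return b < o.a; } // <+-  : b < a'
    public boolean ltMinusPlus(Interval o)  { return a < o.b; } // <-+  : a < b'
    public boolean ltPlusPlus(Interval o)   { return b < o.b; } // <++  : b < b'
}

For example, new Interval(3, 7).ltPlusMinus(new Interval(9, 12)) holds because 7 < 9, i.e. the first interval ends strictly before the second begins.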

For the priority based temporal mining method there are two levels of encoding: encoding based on the valid time, and encoding based on the weights assigned to the items in the transaction under consideration. Every complaint is assigned a unique number (weight) which denotes the priority of the complaint in terms of its critical nature, for example A-0.1, B-0.2, C-0.3, D-0.4, E-0.5 and so on. The encoding of the database for this method is shown in Tables 1 and 2.

TABLE 1: ENCODED DATABASE AT VALID TIME INTERVAL D1

Tid  Complaint weights W  Count  Weighted Support = ∑W×(Count+1)
01   0.2                  3      0.6
02   0.1                  2      0.2
03   0.3                  1      0.3

TABLE 2: ENCODED DATABASE AT VALID TIME INTERVAL D2

Tid  Complaint weights W  Count  Weighted Support = ∑W×(Count+1)
06   0.1                  2      0.2
07   0.5                  4      2.0
09   0.4                  3      1.2

The count field is zero if the set of complaint codes occurs only once. If the same set of complaints occurs more than once, merging takes place: the count field is incremented by 1 for each repetition and the set of complaints appears only once in the database, so redundancy is avoided. The value in the count field plus 1 gives the support value for the specific set of complaints, which is compared with the predefined minimum support value to identify the large itemsets. In this method, the weight of the complaint is given instead of the code, according to the priority. The candidate itemsets satisfying the bounded support are checked for their weighted support. The weighted support is calculated as the product of the summation of the weights and the corresponding count+1 value for each large itemset (obtained from the table), and is compared with the weighted minimum support value to identify the large itemsets.

VI. FREQUENT ITEMSET GENERATION

An association rule that satisfies both the minimum support threshold and the minimum confidence threshold is called a strong association rule. Similarly, a strong temporal association rule is defined as follows. Let min_s and min_c represent the minimum support threshold and minimum confidence threshold respectively. If and only if support ≥ min_s and confidence ≥ min_c during [ts, te], rule X⇒Y is a temporal association rule, described as

X ⇒ Y (support, confidence, [ts, te])

An itemset is a collection of items; an itemset of k items is called a k-itemset. An itemset that satisfies min_s is called a frequent itemset.

Given a database with T transactions belonging to a specified duration [ts, te], the bounded support (BS) of a large k-itemset X is defined with respect to the number of transactions containing X within the specified valid time, and it must satisfy eqn (1), given by:

BS(X) = [ ( ∑_{i=1}^{T} Count_i + T ) × wmnspt ] / ∑_{∀i∈X} W_i   (1)


where ∑_{∀i∈X} W_i is the summation of the weights of all the items in the large k-itemset X in the specific duration [ts, te]. The value of Count is taken from the table, and its summation for a particular duration, added to T, gives the total number of transactions in that duration. The threshold value for the weighted minimum support, denoted wmnspt, is defined by the user. The bounded support value according to eqn (1) is calculated for each large k-itemset so that itemsets not needed for further calculations can be pruned away, saving time. The weighted support is defined by eqn (2) as

WS(X) = ∑_{∀i∈X} W_i × (Count + 1)   (2)

where ∑_{∀i∈X} W_i is the sum of the weights of the items present in itemset X in a given valid time [ts, te], and Count + 1 gives the number of transactions containing the specific itemset within that time interval. The large k-itemsets obtained as a result of pruning are checked for their weighted support: if the weighted support of an itemset is greater than or equal to the threshold wmnspt, it is considered as a candidate for the next step.

Thus the large itemsets for a specific time interval, for transactions with prioritized items, have been identified and these are the itemsets which will be used for generating the association rules by using the minimum confidence threshold value denoted as min_c.

VII. TEMPORAL ASSOCIATION RULE MINING

Association rules are generated from the large itemsets which satisfy the user-defined minimum confidence min_c. The confidence of the association rule X⇒Y is the probability that Y exists given that a transaction contains X, and is given by eqn (3) as

Pr(Y|X) = Pr(X ∪ Y) / Pr(X)   (3)

In large databases, the support of X⇒Y is taken as the fraction of transactions that contain X∪Y, and the confidence of X⇒Y is the number of transactions containing both X and Y divided by the number of transactions containing X.

In the case of priority based temporal mining, the confidence value is given by eqn (4) as

Confidence = Weighted Support(X∪Y) / Weighted Support(X)   (4)

The weighted support value is obtained from the table. This confidence value has to be greater than or equal to the predefined minimum confidence threshold for the corresponding large itemset to be included as an association rule in the output.
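A small sketch of eqns (2) and (4) over the encoded rows follows. The class, record and field names are ours; the row layout follows Tables 1 and 2, where each encoded row carries a summed item weight and a merge count:

public class PbtmSketch {
    // One encoded row: the summed item weights and the merge count,
    // following the layout of Tables 1 and 2.
    record EncodedRow(double weightSum, int count) {}

    // Eqn (2): WS(X) = sum(Wi) * (Count + 1) for one encoded itemset row.
    static double weightedSupport(EncodedRow row) {
        return row.weightSum() * (row.count() + 1);
    }

    // Eqn (4): confidence of X => Y as a ratio of weighted supports.
    static double confidence(EncodedRow xUnionY, EncodedRow x) {
        return weightedSupport(xUnionY) / weightedSupport(x);
    }

    public static void main(String[] args) {
        EncodedRow x  = new EncodedRow(0.3, 2);   // hypothetical encoded rows
        EncodedRow xy = new EncodedRow(0.5, 1);
        System.out.printf("WS(X)=%.2f  confidence=%.2f%n",
                weightedSupport(x), confidence(xy, x));
    }
}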

The property of transitivity and the concept of expansion of time intervals are used to show the relationship between the different valid time intervals, taking into account the association rules resulting from each of the time intervals. This gives importance to the valid time interval as well as to the relationship between the time intervals, and the temporal association rules produced are found to be highly related to the time constraints involved.

VIII. NAÏVE BAYESIAN CLASSIFICATION

Classification is one of the most typical operations in supervised learning, but it has not received much attention in temporal data mining; a comprehensive search of classification-based applications returns relatively few instances of actual uses of temporal information. Since traditional classification algorithms are difficult to apply to sequential examples directly, an interesting improvement consists of applying a pre-processing mechanism to extract relevant features. One approach consists of discovering frequent subsequences and then using them as the relevant features to classify sequences with traditional methods, like Winnow.

Classification is relatively straightforward if generative models are employed to model the temporal data; deterministic and probabilistic models can be applied directly to perform classification. Naïve Bayesian classification is based on probabilities, represented by P, and assumes class conditional independence: it presumes that the values of the attributes are conditionally independent of one another given the class label of the tuple. The naïve Bayesian classifier predicts that tuple X belongs to the class Ci if and only if

P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m, j ≠ i


The class Ci for which P(Ci|X) is maximized is called the maximum posteriori hypothesis (eqn (5)):

P(Ci|X) = P(X|Ci) P(Ci) / P(X)   (5)

Class conditional independence, i.e. no dependence relationships among the attributes, is expressed by eqn (6):

P(X|Ci) = ∏_{k=1}^{n} P(xk|Ci)   (6)

Using the above classification, the resulting temporal rules are made effective, paving the way for easy solving methodologies.
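Eqns (5) and (6) can be combined into a single score per class, since P(X) is common to all classes and can be dropped when comparing them. A minimal hedged sketch follows (priors, likelihoods and all names are hypothetical):

import java.util.Map;

public class NaiveBayesSketch {
    // Unnormalized posterior P(Ci) * prod_k P(xk | Ci) for one class
    // (eqns 5-6); P(X) is omitted because it is the same for every class.
    static double score(double prior, double[] likelihoods) {
        double s = prior;
        for (double p : likelihoods) s *= p;
        return s;
    }

    public static void main(String[] args) {
        // Hypothetical two-class complaint example, two attributes each.
        Map<String, Double> posterior = Map.of(
                "critical", score(0.3, new double[] {0.8, 0.6}),
                "routine",  score(0.7, new double[] {0.2, 0.5}));
        // Predict the class maximizing the (unnormalized) posterior.
        String best = posterior.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get().getKey();
        System.out.println("predicted class = " + best);
    }
}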

IX. PERFORMANCE EVALUATION

The experimental results for the proposed classification based temporal mining methodology are obtained by considering a temporal database of 25,000 records, which are transactions containing the complaints from customers with different valid time intervals. The complaints in each transaction are prioritized by assigning a unique number to each complaint according to its importance from the user's perspective. This large database is encoded using interval encoding of temporal databases.

Database encoding had previously been applied to a static database prior to the application of Apriori or Anti-Apriori. To make it scalable, the same has been applied here to a dynamic database, which involves time constraints. Considering the telecommunication temporal database, which addresses complaints, the performance of the Apriori family of algorithms and the Anti-Apriori algorithm is given in Fig. 2.

Fig. 2. Performance of the Apriori family of algorithms.

Fig. 3 below shows the memory usage before and after encoding. It is also found that the encoding method leads to faster generation of temporal association rules.

Fig. 3. Memory usage graph.

The classification based temporal mining methodology also performs better in terms of the logic used: the performance of the method increases with an increase in the number of tuples within a given time interval, and the computational complexity decreases as the number of transactions increases. This is depicted in Fig. 4.

Fig. 4. Performance of classification based temporal mining.

X. CONCLUSION AND FUTURE ENHANCEMENTS

Classification based temporal mining which involves assigning priorities always leads to more advantages than concepts which treat all items uniformly, since prioritizing reduces the time involved. In a similar manner, prioritizing the items and then encoding the temporal database has led to lower time and computation complexities, and the introduction of Bayesian classification into temporal mining has led to more effective temporal rules. A very important fact is that these results are obtained while honoring the time constraints, i.e. within the specified valid time interval. All applications


that are time based need to satisfy real time constraints. Hence applying classification based temporal mining, which involves the important concepts of priority, encoding, valid time interval, temporal mining and classification, to real time applications that have response time constraints will improve the performance measures in a sharp and distinct manner.

This work may be further extended to improve the performance of systems that involve real time data in the form of audio, video and other multimedia objects stored as data items in a database with valid time constraints, i.e. a temporal database of transactions containing complex forms of data. Classification tools can also be used to categorize the results obtained, in order to identify the methodology for solving problems in a systematic way with less complexity.

REFERENCES

[1] H. Mannila, “Methods and problems in data mining”, The International Conference on Database Theory, 1997.

[2] M.H.Dunham, “Data Mining Introductory and Advanced Topics”, Tsinghua University Press, Beijing, 2003.

[3] H.Ning, H.Yuan, S. Chen, “Temporal Association Rules in Mining Method”, Proceedings of the First International Multi-Symposiums on Computer and Computational Sciences, 2006.

[4] T.Wang, P.L.He, “Database Encoding and an Anti-Apriori Algorithm for association Rules Mining”, Proceedings of the Fifth International Conference on Machine learning and Cybernetics, August 2006.

[5] C.H. Cai, W.C. Fu, C.H. Cheng, W.W. Kwong, “Mining association rules with weighted items”, The International Database Engineering and Applications Symposium, 1998, pp. 68–77.

[6] J. Han, M. Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann Publishers, San Francisco, 2006.

[7] R.Agrawal, R.Srikant, “Fast algorithm for mining association rules”, The International Conference on Very Large Data Bases, 1994, pp. 487–499.

[8] R. Agrawal, T. Imielinksi, A. Swami, “Mining association rules between sets of items in large database”,1993 ACM, SIGMOD Conference, Washington DC, USA, 1993.

[9] R. Agrawal, T. Imielinksi, A. Swami, “Database mining: a performance perspective”, IEEE Trans. Knowledge Data Eng. Vol 5 (6) (1993) 914–925.

[10] R. Agrawal, R. Srikant, Q. Vu, “Mining association rules with item constraints”, The Third International Conference on Knowledge Discovery in Databases and Data Mining, Newport Beach, California, August 1997.

[11] S.J. Yen, A.L.P. Chen, “Graph Based Approach for Discovering Various Types of Association Rules”, IEEE Transactions on Knowledge and data Engineering, Vol.13, No.5, September/October 2001.

[12] A.U.Tansel, S.P.Imberman, “Discovery of Association Rules in Temporal Databases”, International Conference on Information Technology, 2007.

[13] X. Chen, I. Petrounias, H. Heathfield, “Discovering Temporal Association Rules in Temporal Databases”, Proc. of IADT’98, Berlin, Germany, pp.312-319.

[14] X. Chen, I. Petrounias, “Mining Temporal Features in Association Rules”, Proc. of PKDD’99, Prague, Czech Republic, pp.295-300.

[15] T.P. Hong, C.S. Kuo, S.L. Wang, “A Fuzzy AprioriTid mining algorithm with reduced computational time”, Tenth International Conference on Fuzzy Systems, 2004.

[16] D.L.Olson, Y.Li, “Mining Fuzzy Weighted Association Rules”, Proceedings of the 40th Hawaii International Conference on System Sciences, 2007.

[17] C.S. Hilas, P.A. Mastorocostas, “An application of supervised and unsupervised learning approaches to telecommunications fraud detection”, Journal of Knowledge based systems vol.21 (Elsevier) 2008, pp.721-726.

[18] B.Yu, Z. Xu, “A comparative study for content-based dynamic spam classification using four machine learning algorithms”, Journal of Knowledge based systems vol.21 (Elsevier) 2008, pp.355-362.

[19] F.A. Thabtah, P.I. Cowling, “A greedy classification algorithm based on association rule”, Journal of Applied soft computing vol.7 (Elsevier) 2007, pp. 1102-1111.

 


A Proposed Algorithm to Improve Security & Efficiency of SSL-TLS Servers Using Batch RSA Decryption

R.K. Pateriya, J.L. Rana, S.C. Shrivastava, Jaideep Patel
Department of Computer Science & Engineering, Maulana Azad National Institute of Technology, Bhopal, India
Emails: [email protected], [email protected], [email protected], [email protected]

Abstract — Today, the Internet has become an essential part of our lives, and over 90% of e-commerce is conducted on it. A security algorithm is therefore necessary to assure producer-client transactions and the safety of financial applications (credit cards, etc.). The applicability of the RSA algorithm derives from its properties: confidentiality, safe authentication, and data safety and integrity on the Internet. These properties allow users to practically access the immense resources offered by the Internet from short, medium and even long distances, and from different public places (Internet cafes, airports, banks, commercial centers, educational institutes, etc.). RSA encryption on the client side is relatively cheap, whereas the corresponding decryption on the server side is expensive because the private exponent is much larger. SSL/TLS servers therefore become swamped performing private-key decryption operations when the number of simultaneous requests increases quickly. The Batch RSA method is useful for such heavily loaded web servers. Our proposed algorithm improves the performance of SSL-TLS servers by reducing the response time and the client's tolerable waiting time; it should provide a reasonable response time while optimizing server performance significantly. On the encryption side, to withstand attacks such as the brute force attack and subtle attacks, we also adopt a parameter generation method that sieves all parameters strictly and filters out every insecure parameter.

Keywords - Batch RSA, minibatching, tolerable waiting time, response time.

I. INTRODUCTION

The SSL/TLS handshake protocol is a typical key encapsulation approach for secure communication, and the implementation of the RSA algorithm in SSL/TLS is computationally imbalanced between client and server: RSA encryption on the client side is relatively cheap, whereas the corresponding decryption on the server side is expensive because the private exponent is much larger. The previous batch method had some flaws which the proposed algorithm improves upon. The batching parameter is optimized by integrating users' requirements for Internet Quality of Service (QoS): to select the optimal batching parameters, not only the server's performance but also the client's tolerable waiting time is considered. Based on an analysis of the mean queue time, the batch service time and the stability of the system, a novel optimal batch scheduling algorithm deployed in a batching web server is proposed. In addition, the scheme proposed in this paper adds minibatching to the existing algorithm proposed in [4].

The security of the RSA algorithm is based on the difficulty of factoring large numbers, which is practically impossible for 1024-bit numbers. To generate the RSA parameters, one has to decide on the maximum allowed length of each parameter, which is reflected in the security of the system. Initially the prime factors p and q are chosen to generate the modulus n. This n should be of a certain length, controlled by the generating number (n-bit). Varying this length successively changes the lengths of both p and q, since they are the factors of n. We therefore adopt a parameter generation method to make transmission more secure and to withstand attacks such as (1) the brute force attack and (2) subtle attacks.

A. Preliminaries: Batch RSA Decryption

Fig. 1 Batch decryption server

Given b different ciphertexts c1, c2, …, cb encrypted with different public exponents e1, e2, …, eb, the goal of batch decryption is to obtain the corresponding plaintexts m1, m2, …, mb


via one decryption operation. It is well known that the decryption operation is computationally intensive due to the modular multiplication operations involved. By using batch RSA decryption, we aim to improve decryption efficiency.

The system architecture in Fig. 1 consists of two kinds of processes: the batch server process and the web server processes. The batch server process schedules and performs batch RSA decryptions. The web server processes perform the SSL/TLS handshake with each client and send decryption requests to the batch server. The batch server uses a round robin strategy to aggregate the decryption requests and complete the batch decryption. The decryption results are returned to the web server processes, which interact with the SSL/TLS handshake clients [9].

Batching of client requests has two advantages. First, it improves the throughput of RSA decryption. Second, batching significantly improves the behavior of the system when message arrivals are bursty: batching is effective during peak times, in which more messages arrive than the system can handle.
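To make the batching idea concrete, the following minimal Python sketch (a toy illustration, not the implementation used in this paper) decrypts a batch of b = 2 ciphertexts with pairwise-coprime public exponents using a single full-size private exponentiation; the modulus, exponents and messages are illustrative only, and larger batch sizes would use Fiat's tree construction [1].

# Batch decryption of two RSA ciphertexts via one private exponentiation.
# Requires Python 3.8+ for pow(x, -1, m) modular inverses.
def batch_rsa_decrypt_2(c1, e1, c2, e2, n, phi):
    # One full-size exponentiation recovers the product of both plaintexts:
    # M = (c1^e2 * c2^e1)^(1/(e1*e2)) = m1 * m2 (mod n).
    d_batch = pow(e1 * e2, -1, phi)
    M = pow(pow(c1, e2, n) * pow(c2, e1, n) % n, d_batch, n)
    # Split the product using only small-exponent arithmetic:
    # T2 = M^e1 / c1 = m2^e1; with u*e1 = 1 + v*e2, m2 = T2^u / c2^v.
    u = pow(e1, -1, e2)
    v = (u * e1 - 1) // e2
    T2 = pow(M, e1, n) * pow(c1, -1, n) % n
    m2 = pow(T2, u, n) * pow(pow(c2, v, n), -1, n) % n
    # Symmetrically, T1 = M^e2 / c2 = m1^e2, with s*e2 = 1 + t*e1.
    s = pow(e2, -1, e1)
    t = (s * e2 - 1) // e1
    T1 = pow(M, e2, n) * pow(c2, -1, n) % n
    m1 = pow(T1, s, n) * pow(pow(c1, t, n), -1, n) % n
    return m1, m2

# Toy parameters (NOT secure): n = 61 * 53 = 3233, phi = 60 * 52 = 3120.
n, phi = 3233, 3120
c1, c2 = pow(42, 7, n), pow(1234, 11, n)
assert batch_rsa_decrypt_2(c1, 7, c2, 11, n, phi) == (42, 1234)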

II. PROPOSED OPTIMAL BATCH SCHEDULING ALGORITHM

The previous algorithm [4] had a flaw: if the batch queues are not completely filled, the server must wait until every queue holds a client decryption request before it can perform batch decryption, which increases the client's tolerable waiting time and the server's response time.

In our proposed scheme we perform minibatching within the batch queues of the algorithm from [4], so that requests which have been waiting for a long time can still be decrypted by the web server. The tolerable waiting time and the response time thereby decrease, improving the performance of the web server. The value of Tb is taken from [4]:

Tb = (3n^3 + n^2(42b + k(3b^3 + 3b) - 1)) * b * Trsa / (b * (3n^3 + n^2))

Fig. 2 Mean response time speedup of the batching schemes against the non-batching scheme

Figure 2 shows [4] the comparison of the mean response time of the batching schemes with the non-batching scheme. The vertical axis is the mean response time over the batch size divided by the mean response time of the non-batching scheme. The performance of the server can be improved with the proposed algorithm.

Proposed optimal batch scheduling algorithm

Step 1: Find the optimal batch size b
Input: Tt, λ
Output: Tb, optimal batch size
Begin
    Compute the maximum batch size: maxbatchsize = Int(0.4 * λ * Tt + 1)
    If (maxbatchsize <= 1) then
        Conventional_RSA_decryption(); return
    Success = false
    For (b = 2; b <= maxbatchsize; b++)
    {
        Compute Tb;
        If (Tb < b/λ) then { optimalbatchsize = b; Success = true; }
    }
    If (!Success) then return
    If (Success) then go to Step 2
End

Step 2: Optimal batch scheduling
Input: optimal batch size, Tt
Output: optimal batch schedule
Begin
    Construct b queues (Q(e1), Q(e2), …, Q(eb)); for every exponent ei, maxtimer = 0
    Assign the exponents {e1, e2, …, eb} to different clients using a
        round robin strategy in the server hello message
    While (Q(e1) != Null or Q(e2) != Null … or Q(eb) != Null)
    {
        If a client arrives then
            { match client.exponent = ei; Enqueue(Q(ei), client); initialize client timer }
        If (Q(e1) != Null and Q(e2) != Null … and Q(eb) != Null) then
            { do batch_decryption();
              reset server_waiting_time;
              update queues (Q(e1), Q(e2), …, Q(eb)); }
        Else
        {
            For (j = 1; j <= b; j++)
                If (Q(ej) != Null and Q(ej).head.timer >= maxtimer) then
                    { maxtimer = Q(ej).head.timer; }
            If (server_waiting_time >= Tt - maxtimer) then
                { do batch_decryption();
                  reset server_waiting_time;
                  update queues; }
            Else  // some queue is still empty: serve the old requests
            {
                For (j = 1; j <= b; j++)
                    If (Q(ej) != Null and Q(ej).head.timer >= maxtimer) then
                    {
                        make a minibatch of the non-empty queues Q(ej);
                        apply batch_decryption();
                        update queues;
                    }
                Repeat until batch size = 1;
                If (batch size = 1) then
                    do Conventional_RSA_decryption();
            }
        }
    }
End
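The batch-size search of Step 1 can be summarized by the short Python sketch below; the callback compute_Tb, standing in for the Tb expression from [4], is an assumed helper introduced only for the example.

def optimal_batch_size(Tt, lam, compute_Tb):
    # Step 1: find the largest batch size b whose batch service time Tb
    # stays below the mean arrival budget b/lambda.
    max_batch = int(0.4 * lam * Tt + 1)
    if max_batch <= 1:
        return 1  # fall back to conventional RSA decryption
    best = 1
    for b in range(2, max_batch + 1):
        if compute_Tb(b) < b / lam:  # stability condition Tb < b/lambda
            best = b
    return best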

III. RSA PARAMETERS

The RSA algorithm has some important parameters affecting its level of security. It consists of high-order mathematical operations performed on these parameters in a certain order, and the parameters control the level of security of the encrypted data. It was shown in [6] that the complexity of decomposing the modulus into its factors is a function of the modulus length itself. The importance of this length is also reflected in the security of the public key, making it more difficult to detect.

The importance and effect of changing the RSA parameters are analyzed such that one parameter is changed at a time while the others are kept fixed.

It is conjectured that if n is generated by picking two large primes at random and multiplying them, then factoring n is an intractable problem. Computing d given e and n is as hard as factoring n. This is the RSA assumption; clearly, if factoring is easy then the RSA assumption fails. The RSA algorithm provides excellent protection of voice and data.

Changing the modulus length: Changing the maximum length of the generating number (n-bit) used to generate the modulus n affects the other parameters as shown in Table I.

TABLE I. THE EFFECT OF CHANGING THE MODULUS LENGTH

n-bit   p length   q length   n length   e length   d length   C length
 500       76         76        151        21         151        151
 600       91         91        181        21         181        181
 700      106        106        212        21         211        211
 800      121        121        242        20         241        241
 900      136        136        272        20         272        271
1000      151        151        302        20         301        301
1024      155        155        309        20         308        309
1200      181        181        362        21         361        362
1500      227        227        453        21         452        452

It is clear that increasing the maximum limit on the length of the modulus increases the lengths of both the p and q factors. The length of the secret key d and the length of the encrypted message c increase at the same rate, as illustrated in Fig. 3.

Fig. 3 Modulus length vs. RSA parameter lengths

Increasing the n-bit length provides a more secure value for the private key d, since a larger d means more security, whereas the public key e does not carry the same importance here.

IV. KNOWN ATTACKS ON THE RSA ALGORITHM

Methods for attacking the RSA algorithm can be classified into two kinds.

Brute force attacks. This kind of attack does not depend on any special parameters.

(1) Exhaustive attack: This attack traverses all possible values of d, or tries all possible combinations of 1s in d, until the attacker finds the correct d.

(2) Factorization attack: This attack factorizes the RSA modulus to obtain p and q. The most widely used methods at present are the quadratic sieve, the generalized number field sieve, and the special number field sieve.

Subtle attacks. This kind of attack targets the mathematical features of some parameters.

(1) Attack by multiplication of small primes: This attack exploits the absence of a large prime factor in p+1, p-1, q+1 or q-1. By calculating the modular power with the product of a chain of small primes as the exponent, one obtains p and q subtly.

(2) Square attack: This attack works when p and q are too close to each other. By computing √n, one can easily obtain the real p and q via repeated tests.

(3) Iteration attack: This attack repeats the modular power calculation of the encryption procedure again and again to recover the plaintext.

(4) Low private exponent attack: This attack uses continued fractions to approximate the plaintext. It includes the Wiener attack and the Boneh-Durfee attack.

By using the parameter generation algorithm from [3], messages can be protected from these attacks.

Fig. 4 Parameter generation algorithm

Here, p and q are strong primes, which withstands the attack by multiplication of small primes. The difference between p and q is relatively large, which withstands the square attack. That gcd(p-1, q-1) is relatively small avoids the iteration attack. Length(d) >= length(φ(n)) * 0.292 resists the low private exponent attack (resisting the Wiener attack requires length(d) >= length(φ(n)) * 0.25, while resisting the Boneh-Durfee attack requires length(d) >= length(φ(n)) * 0.292). It is enough for length(d) to be larger than 80 bits to withstand the exhaustive attack, since 2^80 rounds of exhaustive search are complex enough. To withstand the combination attack, the Hamming weight of d should be neither too large nor too small: if there are too many or too few 1s in d, one could exhaust every possible position of 0 or 1 and obtain the correct d. Hence, according to the bit length of d, we should guarantee that the Hamming weight of d makes the number of combinations larger than 2^80 (more complex than the exhaustive attack).
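As an illustration of such a sieve, the Python sketch below checks the stated constraints on candidate parameters. The numeric thresholds for |p - q| and gcd(p-1, q-1) are assumptions chosen for the example, and the strong-prime test on p and q is omitted, since it requires the factorizations of p±1 and q±1.

import math

def passes_parameter_sieve(p, q, d):
    # Filter out insecure RSA parameters according to the constraints above.
    n = p * q
    phi = (p - 1) * (q - 1)
    d_bits = d.bit_length()
    w = bin(d).count("1")  # Hamming weight of d
    return (abs(p - q) > 2 ** (n.bit_length() // 2 - 20)  # square attack
            and math.gcd(p - 1, q - 1) < 2 ** 16          # iteration attack
            and d_bits >= int(0.292 * phi.bit_length())   # Boneh-Durfee bound
            and d_bits > 80                               # exhaustive attack
            and math.comb(d_bits, w) > 2 ** 80)           # combination attack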

V. CONCLUSION & FUTURE WORK

This paper proposed a framework for improving the Batch RSA decryption technique. We applied the concept of minibatching in our algorithm, through which the Batch RSA decryption time can be reduced to some extent.

The advantage is that the proposed algorithm reduces the mean response time and the client's tolerable waiting time, so efficiency is improved. With the adopted parameter generation algorithm we can protect our data from many security attacks on the Internet. Our future work is to implement this proposed algorithm.

ACKNOWLEDGMENT

The success of this research work would have been uncertain without the help and guidance of a dedicated group of people at our institute, MANIT Bhopal. We express our true and sincere acknowledgements in appreciation of their contributions, encouragement and support. The researchers also wish to express gratitude and warmest appreciation to everyone who has in any way contributed to and inspired this work.

REFERENCES

[1] A. Fiat, "Batch RSA", Journal of Cryptology, 1997, 10(2):75-88.

[2] H. Shacham, D. Boneh, "Improving SSL handshake performance via batching", Proc. of RSA 2001, LNCS 2020, San Francisco: Springer-Verlag, 2001, pp. 28-43.

[3] J. Peng, Q. Wu, "Research and Implementation of RSA Algorithm in Java", Jiangxi University of Finance & Economics, Nanchang 330013, Jiangxi Province, China.

[4] F. Qi, W.J. Jia, F. Bao, Y.D. Wu, G.J. Wang, "Parameter optimization-based batching TLS protocol", Journal of Software, 2007, 18(6). http://www.jos.org.cn/1000-9825/18/1522.htm

[5] P.M. Fischer, D. Kossmann, "Batched Processing for Information Filters", Swiss Federal Institute of Technology (ETH) Zürich, Switzerland.

[6] A. Mousa, "Sensitivity of Changing the RSA Parameters on the Complexity and Performance of the Algorithm", Journal of Applied Sciences 5(1):60-63, 2005.

[7] Md. Ali-Al-Mamun, M. Motaharul Islam, S.M. Mashihure Romman, A.H. Salah Uddin Ahmad, "Performance Evaluation of Several Efficient RSA Variants", IJCSNS, vol. 8, no. 7, July 2008.

[8] J. Shawe-Taylor, "Proportion of Primes Generated by Strong Prime Methods", Electronics Letters, vol. 28, no. 2, 16 January 1992.

[9] S. Li, Y. Wu, J. Zhou, "A Practical SSL Server Performance Improvement Algorithm Based on Batch RSA Decryption".

[10] F. Qi, W. Jia, F. Bao, Y. Wu, "Batching SSL/TLS Handshake Improved", 2005.

[11] D. Boneh, H. Shacham, "Fast Variants of RSA", CryptoBytes (RSA Laboratories), vol. 5, no. 1, pp. 1-9, 2002.


Log Management Support for Recovery in Mobile Computing Environment

J.C. Miraclin Joyce Pamila, Senior Lecturer, CSE & IT Dept., Government College of Technology, Coimbatore, India. [email protected]

K. Thanushkodi, Principal, Akshaya College of Engineering and Technology, Coimbatore, India. [email protected]

Abstract - Rapid and innovative improvements in wireless communication technologies have led to an increase in the demand for mobile Internet transactions. However, Internet access from mobile devices is very expensive due to the limited bandwidth available on wireless links and the high mobility rate of mobile hosts. When a user executes a transaction with a web portal from a mobile device, a disconnection causes the transaction to fail or forces all steps to be redone after reconnection in order to return to a consistent application state. Considering these challenges of wireless mobile networks, a new log management scheme is proposed for the recovery of mobile transactions.

In the proposed approach, the model parameters that affect application state recovery are analyzed. The proposed scheme is compared with the existing Lazy and Pessimistic schemes, and a trade-off analysis is made between the cost invested in managing the log and the return on investment in terms of improved failure recoverability. From the analysis, the checkpoint interval that yields the best return on investment is identified.

Keywords – log, recovery, mobile environment.

I. INTRODUCTION

Improvements in the quality, security and reliability of cellular services facilitate Internet access from mobile devices. A mobile host (MH) engaged in a client-server application may easily fail because of limited network resources. Due to its potential applicability, failure recovery of client-server applications needs considerable attention.

Checkpoint and message logging protocols are designed to save the execution state of a mobile application, so that when an MH recovers from a failure, the application can roll back to the last saved consistent state and restart execution with recovery guarantees. The existing protocols assume that the MH's disk storage is not stable, and thus checkpoint and log information is stored at the base stations [1], [6].

Two broad categories of mobile checkpoint protocols, coordinated and uncoordinated, have been proposed in the literature. Coordinated protocols are suitable for MHs that run distributed applications; the MHs must coordinate their local checkpoints to ensure a consistent and recoverable global checkpoint [7]. Uncoordinated protocols are more applicable to mobile applications involving only a single client MH, where the MH can independently checkpoint its local state. Pradhan, Krishna, and Vaidya [3] proposed two uncoordinated checkpoint protocols: the No-logging and Logging approaches. The No-logging approach requires the MH to create a new checkpoint every time a write event modifies the state of the application. The Logging approach creates checkpoints only periodically and logs all write events which occur between two checkpoints. When an MH recovers from a failure, it retrieves the checkpoint along with the saved log entries to start the recovery process. A performance analysis of Logging versus No-logging was reported in [3].
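The difference between the two uncoordinated protocols can be sketched in a few lines of Python; the MobileHost container mh and the threshold k are illustrative assumptions, not part of [3].

def on_write_event(mh, event, k):
    # Handle a state-modifying write event under either protocol.
    if mh.scheme == "no-logging":
        mh.take_checkpoint()      # new checkpoint on every write event
    else:                         # "logging" approach
        mh.log.append(event)      # log the write event instead
        if len(mh.log) >= k:      # checkpoint only periodically
            mh.take_checkpoint()
            mh.log.clear()        # entries before a checkpoint are purged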

II. RELATED WORK

Global checkpoint based schemes [1] consider distributed applications running on multiple mobile hosts. Hence asynchronous recovery schemes [5], [6] are better suited than the schemes of [12], which require synchronization messages between participating processes. Lazy and Pessimistic schemes are reported in [8]. In the lazy scheme, logs are stored at the base station (BS), and if the mobile host moves to a new BS, a pointer to the old BS is stored at the new BS. The pointers can then be used to recover the log distributed over several BSs. This scheme has the advantage of incurring relatively little network overhead during handoff, as no log information needs to be transferred, but it has a large recovery time. In the pessimistic scheme, the entire log and checkpoint record, if any, are transferred at each handoff. Hence the recovery is fast, but each handoff requires a large volume of data transfer. The works reported in [2], [9] present schemes based on the mobile host's movement using independent checkpointing and pessimistic logging. In the distance-based scheme, the distributed log contents are unified when the distance covered by the mobile host rises above a predefined value; after unifying the log, the distance or handoff counter is reset. These schemes are a trade-off between the lazy and pessimistic strategies. The schemes discussed so far do not consider the case where a mobile host recovers at a base station different from the one in which it crashed. The mobile agent based framework proposed in [4] addresses this problem and facilitates seamless logging of application activities for recovery from transaction failures.

All the previous works consider storing the log at the Base Station [4]. In the proposed approach, the Base Station Controller is selected for storing the log information rather than the Base Station.

III. REFERENCE ARCHITECTURE

Fig. 1 illustrates the reference architecture. It is a client/server architecture based on GSM [13]. A mobile network contains a fixed backbone network and a wireless network. A host that can move while retaining its network connection is a mobile host. The static network consists of fixed hosts and base stations (BS). A BS interacts with MHs and with the wired network, acting as a gateway between the wired and wireless networks. Two or more base stations are controlled by a Base Station Controller (BSC); similarly, a Mobile Switching Center (MSC) controls two or more BSCs.

A BS comprises all the radio equipment needed for radio transmission and forms a radio cell. It is connected to MHs via radio links and to the BSC via a high-speed wired link. The BS acts as a switch with which a mobile host can communicate within a limited geographical region referred to as a cell. Due to mobility, the MH may cross the boundary between two cells while being active; this process is known as handoff.

The BSC manages the base stations. It reserves radio frequencies, handles the handover from one BS to another within the BSC region, and also performs paging of the MH. The MSC is a high-performance digital ISDN switch equipped with a Home Location Register and a Visitor Location Register for storing the location information of the mobile hosts.

Fig. 1 Reference Architecture

As wireless bandwidth is constrained in cellular networks, HTTP is not feasible for an MH to access the Internet. Therefore, WAP-enabled devices (MHs) communicate via a WAP gateway [14]. The gateway turns the requests into standard web-based requests according to the WAP specifications and, acting as an Internet client, sends the request to a server providing WAP content.

IV. RECOVERY MECHANISM

Several factors affect recovery [1].

A. Failure Rate of the Host
Failures of the MH due to a weak wireless link, low battery power, etc. are purely random in nature. If failures are frequent, the transaction has to be rolled back every time the MH recovers from a failure, and thus the total execution time of the transaction increases. Generally the MH failure rate is approximated with an exponential or Poisson distribution.

B. Log Size
Transmitting data consumes twice as much power as receiving the same amount of data, so only essential write events should be logged to reduce the size of the log.

C. Memory Constraints
Storing the log of each MH at the BSC might use up a lot of memory space on the BSC. It is necessary to evaluate average memory requirements based on the log size and the recovery scheme used.

D. Recovery Time
The time required to recover a process upon failure depends on the recovery scheme and the method used for logging the write events.


E. Log Retrieval Cost
The cost invested in retrieving the log information upon failure of a transaction depends on the degree of log distribution. If the log is distributed over more places, the retrieval cost and the recovery cost increase.

V. PROPOSED LOG MANAGEMENT SCHEME

Here the log information is stored at the BSC. The area covered by a single BSC is referred to as a REGION. It is assumed that a tracking agent present at the BSC queries the HLR or VLR for location updates of the mobile host in which the transaction was initiated. Through this agent, the problem of recovering a mobile host at a BS different from the one in which it crashed is addressed.

A. Intra-BSC Management
When the MH moves from one BS to another BS connected to the same BSC, no log information is transferred, as the log resides at the BSC. The handoff cost is therefore reduced drastically.

B. Inter-BSC Management
Every MH carries the following information for the purpose of registration: 1. the previous BSC identity (PBSCid), and 2. its own identity (MHid).

When an MH registers with a message Connect(MHid, PBSCid) at a new BSC which is not its Home Base Station Controller (HBSC), the new BSC informs the HBSC of its reachability by sending a message containing MHid and its own identity BSCid. On receiving this message, the HBSC transfers the entire log present in it to the current BSC.

C. Log Transfer from the Mobile Cache to the BSC
The MH transfers the entire log to the BSC as follows: 1. If the MH cache is exhausted, the entire log is immediately copied to the current BSC. 2. When the mobile host moves away from the current BSC and the system detects a handoff, the MH copies the entire log to the BSC, where it is appended to the previous log file.
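The log-handling rules of Sections V-A to V-C can be condensed into the following Python sketch; the MH and BSC objects and their method names are hypothetical, introduced only to show the control flow.

def on_handoff(mh, new_bs):
    # Apply the proposed intra-/inter-BSC log management on handoff.
    new_bsc = new_bs.controller
    if new_bsc is mh.current_bsc:
        return  # intra-BSC move: the log already resides at this BSC
    # Inter-BSC move: register, then pull the whole log from the old BSC.
    new_bsc.register(mh.id, mh.current_bsc.id)   # Connect(MHid, PBSCid)
    log = mh.current_bsc.extract_log(mh.id)      # old BSC ships the entire log
    new_bsc.append_log(mh.id, log)               # appended to the previous log
    mh.current_bsc = new_bsc

def flush_cache(mh):
    # Copy the MH's cached log to the current BSC when the cache fills.
    mh.current_bsc.append_log(mh.id, list(mh.cache))
    mh.cache.clear()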

VI. MODELLING AND METRICS

In this section, mathematical equations for the different performance metrics are analyzed [8].

A. Handoff Modeling
The interval between two handoffs is referred to as the handoff interval. A handoff interval can be represented using a three-state discrete Markov chain [8], as presented below.

Fig. 2 Markov Chain Representation

In Fig. 2, state 0 is the initial state when the handoff interval begins. During the handoff interval, the host receives messages. Depending on the state-saving scheme, the host either takes a checkpoint or logs the write events.

A transition from state 0 to state 1 occurs if the handoff interval completes without failure. If a failure occurs during the handoff interval, a transition is made from state 0 to state 2. After state 2 is entered, a transition to state 1 occurs once the handoff interval is completed.

To simplify the analysis, it is assumed that at most one failure occurs during a handoff interval. This assumption does not significantly affect the results when the average handoff interval is small compared to the mean time to failure.

To simplify the analysis, it is assumed that, at most, one failure occurs during a handoff interval. This assumption does not significantly affect the results when the average handoff interval is small, compared to the mean time to failure. B. Terms and Notations

λ - Log arrival rate µ- Handoff rate r - Ratio of the transfer time in the wired

network to the transfer time in the wireless network.

η - Average log size Tc – checkpoint interval. k - Number of write events per checkpoint Nc – Number of checkpoints in t time units. Nl- number of messages logged in time t. Cc – Average transfer cost of a checkpoint

state over one hop of the wired network. Cl - Average transfer cost of an application

message over one hop of the wired network.

Page 202: International Journal of Computer Science July 2009

(IJCSIS) International Journal of Computer Science and Information Security, Vol. 3, No. 1, 2009

193

Cm – Average transfer cost a control message over one hop of the wired network.

T- Time to load last checkpoint T1 – Time to load last log information α – wireless link cost ρ – wired link cost

The probability of a handoff without failure is
P01 = 1 - λ/(λ + µ).    (1)
The probability of a failure within a handoff is
P02 = λ/(λ + µ).    (2)

C. Performance Metrics

1) Handoff Cost: In the proposed scheme, the message log contains the write events that have been processed since the last checkpoint. For each logging operation, there is a cost for the acknowledgement message sent by the BSC to the MH. Thus the handoff cost includes the cost of transferring the checkpoint state, the message log, and an acknowledgement [8].

The average handoff cost is
Ch = η Cl + Cc + Cm.    (3)
Handoffs being a Poisson process,
η = (k - 1)/2.    (4)
Thus the total handoff cost [8] is
C01 = rαCc/k + ρrαCl + ρrαCm + ηCl + Cc + Cm.    (5)

2) Recovery Cost: The recovery cost is the cost of transmitting a request message from the MH to the BSC, plus the cost of transmitting the checkpoint and log over one hop of the wireless network. For Poisson failure arrivals [8], η = (k - 1)/2. Therefore
Cr = r(ηCl + Cc + Cm).    (6)

3) Total Cost: This is the expected cost incurred during a handoff interval, with and without failure. The total cost is determined as
Ct = P01 C01 + P02 Cr.    (7)

D. Failure Recoverability
In this section, the trade-off between the cost invested in maintaining the checkpoint and log and the resulting recovery probability gained when a failure occurs is analyzed. The Failure Recoverability versus Cost Ratio (FRCR) parameter [3] is defined as the ratio of the difference in recovery probability to the difference in cost invested by the two strategies:
FRCR = (Pprop - Plazy) / (Cprop - Clazy).    (8)
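Equations (1)-(7) translate directly into the following Python sketch, which evaluates the expected total cost per handoff interval for a given parameter set; it is purely a restatement of the formulas above.

def total_cost(lam, mu, k, r, alpha, rho, Cc, Cl, Cm):
    # Expected cost per handoff interval, Eqs. (1)-(7).
    eta = (k - 1) / 2                    # Eq. (4): average log size
    P01 = 1 - lam / (lam + mu)           # Eq. (1): handoff without failure
    P02 = lam / (lam + mu)               # Eq. (2): failure within handoff
    C01 = (r * alpha * Cc / k + rho * r * alpha * Cl
           + rho * r * alpha * Cm + eta * Cl + Cc + Cm)   # Eq. (5)
    Cr = r * (eta * Cl + Cc + Cm)        # Eq. (6): recovery cost
    return P01 * C01 + P02 * Cr          # Eq. (7): total cost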

For a given set of parameter values, Pprop and Plazy are calculated. The cost invested by the two strategies is the cost incurred due to handoff. The proposed system transfers all log and checkpoint information to the current BSC, while the Lazy method merely establishes a link to the previous base station.

The average number of checkpoints before failure is given by (1/λ)/Tc. The average number of moves crossing BSC or BS boundaries between two consecutive checkpoints is given by Tc·µ. The total number of log entry transfer operations required by the proposed system between two consecutive checkpoints is given as

Σ (n = 1 to Tc·µ) n·(λ/µ).    (9)

The total cost invested by the proposed system is

Cprop = ((1/λ)/Tc) · Σ (n = 1 to Tc·µ) n·(λ/µ)·Cl.    (10)

The cost invested by the Lazy scheme [8] is

Clazy = ((1/λ)/Tc)·(Tc·λ·Cp).    (11)

VII. PERFORMANCE ANALYSIS

The proposed scheme is implemented in Network Simulator 2 (NS2). In this section the proposed method is compared with the lazy and pessimistic methods.

A. Comparison of Handoff Cost
The handoff cost for the lazy scheme is very low, as no log information is transferred during handoff; irrespective of the mobility rate, the handoff cost remains the same for the lazy scheme. The handoff cost for the pessimistic scheme is the highest of all the schemes, because at every handoff the entire log and checkpoint information must be transferred to the current BSC.

Fig. 3 Handoff cost of three strategies

The handoff cost for the proposed scheme is low compared to the pessimistic scheme, but high compared to the lazy scheme. Since the log and checkpoint information are not carried along with the MH, and the log is purged after every checkpoint, the handoff cost for the proposed scheme is moderate.


As the mobility rate increases, the increase in handoff cost is very small compared with the pessimistic scheme. The trade-off between the lazy, pessimistic and proposed schemes is shown in Fig. 3.

B. Comparison of Recovery Cost
As Fig. 4 shows, the logs must be collected during recovery, which increases the recovery cost of the lazy scheme; this increase grows as the mobility rate increases. The recovery cost for the pessimistic scheme is very low, as the entire log information is present at the current base station. The recovery cost for the proposed scheme is also very low, as the entire log information is present at the current BSC, though it is higher than that of the pessimistic scheme when recovery occurs at the same base station where the mobile node failed. If recovery occurs at a different Base Station Controller, the recovery cost of the proposed scheme is lower than that of the pessimistic scheme.

Fig. 4 Recovery cost of three strategies

C. Comparison of Total Cost
The total cost is the sum of the handoff cost and the recovery cost. The comparison shows that the total cost incurred by the proposed scheme is comparatively very low.

Fig. 5 Total Cost of three strategies

D. Comparison of Recovery Probability

Fig. 6 shows the effect of the log arrival rate on failure recoverability. As observed, the system recovery probability decreases dramatically as the log arrival rate increases for all the schemes.

Fig. 6 Comparison of Recovery Probability

For the proposed scheme the recovery probability also decreases, but the decrease is smaller than for the other schemes. As the log information is stored at the BSC, the probability of recovering within the same BSC is very high, and even if the MH recovers in another BSC, the entire log is present at the previous BSC. So the recovery probability of the proposed scheme is better than that of the other schemes.

E. Failure Recoverability vs. Cost Analysis
If the checkpoint interval is very short, all log entries since the last checkpoint, as well as the last checkpoint itself, are likely to reside at the current BSC, making the failure recoverability of both strategies virtually the same.

Fig. 7 Comparison of Recovery Probability

As the checkpoint interval increases, the number of log entries accumulated between two consecutive checkpoints becomes more substantial, resulting in an increase in FRCR.


Fig. 8 Behavior of FRCR

When the checkpoint interval is very long, however, the improvement in failure recoverability cannot catch up with the increase in the cost investment difference, resulting in a decline in FRCR.

VIII. CONCLUSION

The proposed log management scheme for mobile computing systems reduces the total cost of recovery from failure compared with the existing Lazy and Pessimistic schemes. The proposed technique also ensures recovery at a BS different from the one in which the MH failed, and it controls the handoff cost, the log retrieval cost and the failure recovery time. The analysis shows that the proposed scheme is well suited when the mobility rate of the mobile host is very high.

REFERENCES

[1] S. Gadiraju, V. Kumar, "Recovery in the mobile wireless environment using mobile agents", IEEE Transactions on Mobile Computing, vol. 3, June 2004.
[2] S.E. George, I. Chen, Y. Jin, "Movement-based checkpointing and logging for recovery in mobile computing systems", Proc. ACM MobiDE'06, June 2006.
[3] D.K. Pradhan, P. Krishna, N.H. Vaidya, "Recoverable mobile environments: Design and trade-off analysis", 26th IEEE Fault Tolerant Computing Symp., 1996, pp. 16-25.
[4] Ruchika D., S. Bhandari, "Recovery in Mobile Database System", IEEE Fault Tolerant Computing, 2006.
[5] B. Yao, W.K. Fuchs, K. Ssu, "Message Logging in Mobile Computing", 29th IEEE Fault-Tolerant Computing Symp., 1999, pp. 294-301.
[6] A. Acharya, B.R. Badrinath, "Checkpointing distributed applications on mobile computers", Proc. 3rd Int. Conf. on Parallel and Distributed Information Systems, Austin, Texas, 1994, pp. 73-80.
[7] N. Neves, W.K. Fuchs, "Adaptive recovery for mobile environments", Communications of the ACM, vol. 40, no. 1, pp. 69-75, 1997.
[8] I.-R. Chen, B. Gu, S.E. George, S.-T. Cheng, "On Failure Recoverability of Client-Server Applications in Mobile Wireless Environments", IEEE Transactions on Reliability, vol. 54, March 2005.
[9] V.R. Narasayya, "Distributed Transactions in a Mobile Computing System", Proc. IEEE Workshop on MCSA, June 1994.
[10] T. Park, N. Woo, H.Y. Yeom, "An Efficient Recovery Scheme for Mobile Computing Environments", Proc. 8th Int'l Conf. on Parallel and Distributed Systems, 2001.
[11] T. Park, H.Y. Yeom, "An Asynchronous Recovery Scheme Based on Optimistic Message Logging for Mobile Computing Systems", Proc. 20th Int'l Conf. on Distributed Computing Systems, April 2000, pp. 436-443.
[12] R. Koo, S. Toueg, "Checkpointing and Rollback-Recovery for Distributed Systems", IEEE Trans. Software Engineering, vol. 13, no. 1, pp. 23-31.
[13] J.H. Schiller, "Mobile Communications", 2nd ed., Pearson Education, 2003.
[14] R.K. Nichols, P.C. Lekkas, "Wireless Security: Models, Threats & Solutions", McGraw-Hill Telecom International Edition, 2002.


Complete Security Framework for Wireless Sensor Networks

Kalpana Sharma, M.K. Ghose, Kuldeep
Sikkim Manipal Institute of Technology, Majitar-737136, Sikkim, India
[email protected], [email protected]

Abstract — Security concerns for a sensor network, and the level of security desired, may differ according to the application-specific needs of the environment where the sensor network is deployed. Until now, most security solutions proposed for sensor networks have been layer-wise, i.e. a particular solution is applicable to a single layer only, so integrating them all is a new research challenge. In this paper we take up this challenge and propose an integrated, comprehensive security framework that provides security services for all services of the sensor network. We add one extra component, the Intelligent Security Agent (ISA), to assess the level of security and mediate cross-layer interactions. The framework comprises several further components: an Intrusion Detection System, a Trust Framework, a Key Management scheme and a link layer communication protocol. We have also tested it on three different application scenarios in the Castalia and OMNeT++ simulators.

Keywords: security, sensor networks, key management, application-specific security.

I. INTRODUCTION

Wireless Sensor Networks (WSNs) are being employed in various real-time fields such as military, disaster management, industry, environmental monitoring and agriculture farming. Due to the diversity of these real-time scenarios, security for WSNs becomes a complex issue: for each deployment, different types of attacks are possible and a different security level is demanded. The major challenge in employing an efficient security scheme comes from the resource-constrained nature of WSNs (sensor size, memory, processing power, battery power, etc.) and the easy accessibility of wireless channels to good citizens and attackers alike.

Although research in the sensor network security area is progressing at a tremendous pace [15], there is still no integrated, comprehensive framework which can provide security services to each layer and service of a sensor network. Current research in this area focuses mainly on layered solutions, which provide a security service for one layer only; some solutions also address only particular kinds of attacks.

In the diverse application field of sensor networks, it is specifically the application designer who knows which data need to be secured with which kind of security service [12]. Consider two popular WSN applications, agriculture farming and military surveillance: in agriculture farming only a data integrity check (hash functions) may suffice, whereas military surveillance needs security services such as encryption, authentication and strong resilience to node compromise attacks. By all means, the security setup of an application must always be subjected to a thorough security evaluation in order to justify its security promises and to foster the application developer's awareness of which aspects are secure and which are at risk, thus avoiding a false sense of security. For a reasonable security evaluation, we add another logical component to the sensor node structure, namely the ISA (Intelligent Security Agent), which assesses the security level needs of a particular sensor network deployment.

In this paper, Section 2 describes current approaches to sensor network security and their limitations, Section 3 formulates the security framework problem and its design goals, Section 4 introduces all the components of the framework, and Section 5 describes the simulation results and analysis of the ISA. Finally, Section 6 concludes with future work.

II. RELATED WORKS

Extensive research is being carried out to address security issues in sensor networks, including link layer communication protocols, intrusion detection schemes, secure routing protocols, trust models, freshness transmission and key management schemes. In this section a brief overview of the security solutions available in the literature is given. One common approach to creating secure platforms in WSNs is to provide link layer cryptographic primitives or libraries; TinySec [17], SecureSense and MiniSec [16] are examples of this approach. Although MiniSec provides energy-efficient security compared to other link layer solutions, its main drawback is that it provides the same security level for every application scenario, thus lacking adaptive or scenario-specific security. A very good technique for low-overhead freshness transmission using a Bloom filter and last-bit optimization is given in MiniSec.

TinySec and SecureSense assume a global common secret key among the nodes, which is assigned before the deployment of the network and is used to provide security services such as encryption and authentication in the link layer. The main drawback of this approach is that it is not resistant against node capture attacks, in which an adversary can pollute an entire sensor network by compromising only one single node. In SenSec, there are three types of keys: the global key, the cluster key and the sensor key. The global key is generated by the base station, pre-deployed on each sensor node and shared by all nodes; it is used to broadcast messages in the network. However, this protocol again falls prey to node capture attacks, in which dedicated attackers can find the global key and broadcast commands or data to the network. Providing the maximum level of security for all types of communication in each sensor node, as TinySec and MiniSec do, is not suitable for a general security platform for WSNs, since it can lead to unnecessary waste of system resources and noticeably reduces the network lifetime. Although an attempt has been made in SecureSense to address this issue, its solution cannot be appropriately integrated with higher-level services. In other solutions, such as secure information routing protocols like SPINS [19] and LEAP [25], or security-aware middleware services such as secure localization and secure time synchronization [8], cryptographic key management plays an important role. Generally there are three major approaches to key management in WSNs: deterministic pre-assignment, random pre-distribution and deterministic post-deployment derivation.

Examples of the first approach are SPINS [19] and LEAP, in which unique symmetric keys shared by the nodes with the base station are assigned before the network is deployed. Using this approach, cryptographically strong keys can be generated; however, it involves a significant pre-deployment overhead and is not scalable.

Random key distribution schemes, like those in PIKE [20], probabilistically establish pairwise keys between neighboring nodes in the network; however, in this approach a node has to store a large number of keys. Bhaskaran Raman et al. [26] pointed out that WSN protocols depend very deeply on application scenarios, yet most protocols neither cite nor use any specific application in their design. Current security schemes thus also fall short in providing security for specific scenarios and in assessing their security needs. Some approaches address only the routing problem, such as secure SPIN, secure sensor network routing, and other geographic techniques. Tae Kyung Kim et al. [27] give a simple trust model using fuzzy logic that can effectively address the secure routing problem: it calculates an evaluation value for each path and ensures that a packet is always forwarded along a path with a high evaluation value. A scheme for preventing a compromised node from becoming cluster head, based on a trust factor, is proposed by Garth et al. There have also been initiatives to provide security frameworks which integrate two or more security schemes, such as secure cluster formation [23], key management [18] and secure routing [22], and which combine a link layer secure communication protocol with a key distribution scheme. The security platform proposed in [22] provides defense against node compromise attacks but gives no mechanism to isolate compromised nodes. It supports a holistic security approach to secure WSNs, but the major disadvantage of the holistic approach is that it implements security layer-wise, which results in redundant security.

An example that fully illustrates the concept of redundant security is the black hole attack: the security mechanisms provided in this case are in both the network and the link layer. Without a systematic view, such approaches provide redundant security, wasting resources and unintentionally launching an SSDoS (Security Service DoS) attack. Also, when data is processed layer by layer, there can be different security provisions per layer, again yielding redundant security. Although much progress has been made over the past few years, the field remains fragmented, with contributions dispersed over seemingly disjoint yet actually connected areas.

Currently much work is going into layered security, such as the Holistic Security Approach [2]. A holistic approach aims at improving performance, security and longevity with respect to changing environmental conditions by following some basic principles: for example, in a given network, security is to be ensured for all the layers of the protocol stack, as shown in Fig. 1, and the cost of security should not exceed the assessed security risks. The major disadvantage of holistic security, however, is that it is layered and tries to implement security mechanisms for each layer, which wastes energy, memory and processing power and introduces message delay.

III. PROBLEM FORMULATION

Generally, a security platform that copes with the constrained resources of nodes while being flexible and lightweight eases the application development process and contributes to the widespread deployment of sensor networks. In order to provide such a platform we make a few reasonable assumptions. We suppose that the base station is safe and adversaries cannot compromise it. Our approach places no trust assumption on the communication, apart from the obvious fact that there is a nonzero probability of delivering messages to the related destinations. We introduce the following design goals for a practical security framework in sensor networks.

Robust, Simple and Flexible Design: The security design should build a trustworthy system out of untrustworthy components and should retain the ability to detect and function when the need arises. The design should have minimal software bugs. The security framework should also keep working when new nodes are added to the network, thus providing scalability.

Component-Based Security: Some kind of security measure must be provided to all the components of a system as well as to the network; we should concentrate on securing the whole chain.

Adaptive Security: WSNs embody numerous combinations of sensing, communication and computing technologies, and sensors are deployed from very sparsely to densely. Depending on traffic characteristics and the environment, they have to adapt themselves. For example, in a benign environment where the probability of security attacks is low, a low level of security should be used; in other words, sensor nodes should adapt themselves to the outside environment. We further categorize the notion of adaptive security in the following terms.

a) Application-based: As described in the previous section, each application requires a different level of security (e.g. military surveillance vs. habitat monitoring).

b) Data-based: The level of security also depends on the type of data; for instance, there should be different levels of encryption for routing data, sensed data, control packet data and encryption key information.

c) QoS with Security: One important question is how to trade off QoS parameters against security. Unfortunately, existing security designs can address only a small, fixed threshold number of compromised nodes; the security protection completely breaks down when the threshold is exceeded.

d) Realistic Design: Current security designs fall short of this requirement because they have a fixed, explicit threat model in mind. Real trace analysis has to be done for all kinds of practical attacks possible in a particular real-time scenario.

IV. SOLUTION MODEL

We have already seen the flaws that can occur in implementing a layered security approach, so in this paper we concentrate on a cross-layer security framework. Some recent works give cross-layer implementations of power management schemes, path redundancy based security [21], energy-equivalence routing and various key management schemes. In support of the cross-layer security approach, consider the following points [3].

Figure 1 Node structure with ISA incorporated

1. If routing is to be energy efficient, then we have to take care of routing in the network layer, minimization of the number of control packets and retransmissions in the link layer, and switching the transceivers on/off in the physical layer.

2. Key management schemes make sure that all communicating nodes possess the keys required for encrypted communication. At the same time, to make sure that a packet reaches its destination, a secure link with multipath routing is required.

There are various other reasons for adopting a cross-layer security approach, among them the heterogeneous requirements and services of application domains, cross-layer intrusion detection, detection of selfish nodes, and non-redundant security. Cross-layer security does introduce a significant overhead in maintaining interfaces between the protocol layers for exchanging parameters; however, this overhead is much smaller than in strict layered architectures. To further reduce the overhead created by cross-layer architectures, we introduce the ISA (Intelligent Security Agent), which follows the recommendations given in Sections 3 and 4 and provides energy-efficient, non-redundant security operation while keeping the protocol layer abstraction intact. The ISA is a separate component in the node architecture (see Fig. 1) which can exchange parameters with all protocol layers, like a resource manager.

We know that in a component-based security framework, security is to be ensured for all the components and services in a system. We therefore also address the following requirements of WSN security:

• a robust trust framework using a cross-layer approach;

• trust-based group head election;

• a key management architecture;

• an adaptive secure communication protocol;

• an intrusion detection system.

In addition to the above points, the following assumptions are made for the proposed trust framework.

• We assume TDM (Time Division Multiplexing) scheduling for communication within a group. In TDM, a node transmits in its particular interval and otherwise listens passively in promiscuous mode, so a node can hear its neighborhood's transmissions and receptions.

• Each node in the network is identified by a pair of a group id (8 bits) and a node id (8 bits), i.e. {Groupid, Nodeid}, so node communication is limited to the group.

• Each node has three different types of keys: a node-based key, used to listen to broadcasts made by the group head; pairwise keys, used for communication between pairs of nodes; and a broadcast key, used for broadcasting.

A. Trust Framework and Group Head Election

In our proposed security scheme, the network is divided into various groups and each group has a group head. Normally, to maximize network lifetime, the node with the highest energy is chosen as group head. Because a group head has to perform several extra operations, such as data aggregation, its rate of power consumption is very high. We apply a rotating group head: when a node falls short of energy, it transfers its responsibility to some other node in the group by election or other measures. The security concern which arises here is as follows. Consider the parameter 'available energy' as the measure used to transfer the group head responsibility. A compromised or adversary node would always report a high amount of energy, so there is a high probability of selecting an adversary as group head. Most current clustering techniques assume that all the nodes in the network are trustworthy. Hence we should choose a technique in which the probability of selecting a compromised node as group head is very low. In such a technique, a node continuously monitors its neighbors and maintains a parameter table; all the parameter values are collected from cross-layer interactions. Depending on the table parameters, it computes a trust level for each of its neighbors. The table parameters are given below.

TABLE I. TRUST PARAMETERS USED IN THE TRUST FRAMEWORK

Sl.No   Parameter                                       Node 1 ... Node m
1.      Available Energy (AE)
2.      Packet Signal Strength (PSS)
3.      Control Packets Received for forwarding (CRF)
4.      Control Packets Received and forwarded (CRAF)
5.      Routing Cost (RC)
6.      Number of Packet Collisions (NPC)
7.      Data Packets Received for forwarding (DRF)
8.      Data Packets Received and forwarded (DRAF)
9.      Packets Dropped (PD)
10.     Number of Packets Transmitted (NPT)
11.     Number of Packets Received (NPR)

• CRFi is the number of control packets received for forwarding by a particular node i, where i = 1, 2, …, m. The same notation applies to CRAF, DRF, DRAF, NPT and NPR.

• AEi(T1) is the available energy of node i at time T1. The same notation also applies to PSS and RC.

The trust values are calculated from these parameters as follows:

A1 = (AEi(T1) - AEi(T2)) / AEi(T1), where T1 < T2.
A2 = (PSSi(T1) - PSSi(T2)) / PSSi(T1), where T1 < T2.
A3 = CRAFi / CRFi.
A4 = DRAFi / DRFi.
A5 = 1 - NPCi / NPTi.
A6 = 1 - PDi / NPRi.

Ti, the trust level of the node, is calculated by the node maintaining the above table as

Ti = w1·A1 + w2·A2 + w3·A3 + w4·A4 + w5·A5 + w6·A6,

where w1, w2, w3, w4, w5, w6 are constants whose values are chosen such that Ti < 1.
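For concreteness, the trust computation can be written as the Python sketch below; the dictionary layout used for the sampled parameters is an assumption made for the example, while the formulas are exactly A1-A6 and Ti above.

def trust_level(p, w):
    # Trust level of one neighbor from two samples (T1 < T2) of its
    # Table I parameters p and the weight vector w = (w1, ..., w6).
    A1 = (p["AE_T1"] - p["AE_T2"]) / p["AE_T1"]     # energy drain rate
    A2 = (p["PSS_T1"] - p["PSS_T2"]) / p["PSS_T1"]  # signal strength drift
    A3 = p["CRAF"] / p["CRF"]     # fraction of control packets forwarded
    A4 = p["DRAF"] / p["DRF"]     # fraction of data packets forwarded
    A5 = 1 - p["NPC"] / p["NPT"]  # collision-free transmission ratio
    A6 = 1 - p["PD"] / p["NPR"]   # fraction of received packets not dropped
    return sum(wi * Ai for wi, Ai in zip(w, (A1, A2, A3, A4, A5, A6)))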

More on the trust-based cluster scheme can be found in [23]. For our work we considered some extra parameters, and discarded others, while calculating the trust value, to make the scheme more generalized and robust. After computing the trust level of each neighbor, a node also uses these values for routing purposes. The steps for choosing the group head are as follows (see the sketch after this list).

• Step 1: Whenever a group head finds that it is unable to bear the group head responsibility, for reasons such as low energy, it broadcasts a message calling for a re-election.

• Step 2: When a node receives the election message, it finds the neighbor with the highest trust value and sends this to the current group head as a vote.

• Step 3: The group head then assigns the responsibility to the node with the highest number of votes. For greater integrity a vice group head can also be chosen, though not necessarily; a vice group head is needed because the newly elected group head may fail before the responsibility is transferred. The ids of the group head and vice group head are broadcast directly to all group members using a secret key. All the communication described above must be done using the appropriate keys, as described in the next section.
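A minimal sketch of the three-step election follows, assuming each member already holds the trust values of its neighbors; the member objects and the trust mapping are illustrative names.

from collections import Counter

def elect_group_head(members, trust):
    # Steps 1-3: each node votes for its most trusted neighbor; the two
    # top vote-getters become group head and vice group head.
    votes = Counter(
        max(node.neighbors, key=lambda j: trust[node.id][j])
        for node in members
    )
    ranked = [node_id for node_id, _ in votes.most_common(2)]
    head = ranked[0]
    vice = ranked[1] if len(ranked) > 1 else None
    return head, vice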

B. Key Management Architecture
The scheme proposed by Hamed et al. [14] provides a strong defense against node-compromise attacks while being very simple to implement. Its major drawback is that it provides no mechanism for changing keys periodically, because it derives all three types of keys from the key given before deployment. There is also no mechanism to isolate a compromised node from the network. We therefore propose a modified key management protocol that keeps all the advantages of the Hamed et al. scheme while removing some of the drawbacks of [14]. We assume that K is a key that all sensor nodes initially share. At the time of initial deployment, the following algorithm is executed:

Algorithm KeyManagement(K: master key)
// K is the master key every node holds from deployment time.
Begin
  Node i broadcasts its ID, encrypted with key K, to all its neighbors.
  Each neighbor j responds with its own ID, also encrypted with K.
  Node i then computes all of its keys:
    NBi  = F(i || GroupHeadID || K)
    PWi,j = F(min(i,j) || max(i,j) || K)
    BCi  = F(i || K)
  Here || is the concatenation operator.
  In the same way, node j and all other neighbors calculate the above three
  types of keys using the master key K.
  Once all nodes have calculated the required keys, they delete the master
  key, to be resilient against node-capture attacks.
End
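As a concrete illustration of the key derivation, the sketch below instantiates the abstract one-way function F with HMAC-SHA256 and the || operator with byte concatenation; both choices, and the helper names, are our assumptions, since the paper leaves F unspecified.

    import hmac, hashlib

    def F(master_key: bytes, *parts: bytes) -> bytes:
        # One-way key-derivation function; HMAC-SHA256 is an assumed instantiation.
        return hmac.new(master_key, b"||".join(parts), hashlib.sha256).digest()

    def derive_keys(i: bytes, j: bytes, group_head: bytes, K: bytes):
        nb = F(K, i, group_head)             # NBi  = F(i || GroupHeadID || K)
        pw = F(K, min(i, j), max(i, j))      # PWi,j: order-independent pairwise key
        bc = F(K, i)                         # BCi  = F(i || K)
        return nb, pw, bc

    nb, pw, bc = derive_keys(b"node-17", b"node-42", b"gh-03",
                             K=b"pre-deployment-master-key")
    # After every node has derived its keys, K must be erased from memory.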


In the key establishment protocol by Hamed et al., PWi,j and BCi are sent encrypted under NBj. To reduce communication overhead, we instead compute them at node j, because communication is roughly three orders of magnitude more expensive than computation. Now, if the same keys are used at all times in a network and a node becomes compromised, then although the effect is limited to its neighborhood, the node can send false information to the group head or launch a DoS (Denial of Service) attack to drain the energy of its neighbors. Compromised nodes should therefore be isolated as soon as possible. To isolate them, we propose the following hierarchical key revocation algorithm.

At the start of each session, the base station distributes a key to the corresponding cluster heads. Each cluster head generates a new key K1 = F(GroupID || BaseStationAddress || K) and multicasts K1, but only to those nodes having a high trust value. So, if a node became compromised in the previous session, its trust value will have decreased automatically and it will not get hold of the new key.
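A minimal sketch of this session-key refresh follows, again assuming HMAC-SHA256 for F; the trust threshold is a hypothetical value, since the paper does not prescribe one.

    import hmac, hashlib

    TRUST_THRESHOLD = 0.6  # assumed cut-off; the paper does not fix a value

    def F(key: bytes, *parts: bytes) -> bytes:
        return hmac.new(key, b"||".join(parts), hashlib.sha256).digest()

    def refresh_session_key(group_id: bytes, bs_addr: bytes, K: bytes,
                            member_trust: dict) -> dict:
        # Derive K1 = F(GroupID || BaseStationAddress || K) and hand it only
        # to members whose trust clears the threshold; low-trust (potentially
        # compromised) nodes are silently excluded and their keys age out.
        k1 = F(K, group_id, bs_addr)
        return {node: k1 for node, trust in member_trust.items()
                if trust >= TRUST_THRESHOLD}

    keys = refresh_session_key(b"group-7", b"bs-01", b"cluster-key",
                               {"n1": 0.91, "n2": 0.34, "n3": 0.77})
    # -> only n1 and n3 receive K1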

C. Adaptive Secure Communication Protocol
None of the link-layer security protocols proposed so far provides adaptive security. We have therefore proposed an adaptive security protocol that dynamically adjusts itself to a particular security level depending on the network state. The mechanism providing adaptive security is handled by the proposed ISA (Intelligent Security Agent) component, which also makes cross-layer interactions easier.

Packet Format

Figure 2. Comparative analysis of the packet structures of different link-layer protocols.

We have already described the limitations of current link-layer protocols in Section II. We now describe our protocol in terms of the desired security properties that such a protocol should possess.

TinySec [17] uses CBC (Cipher Block Chaining) mode to provide message authentication (CBC-MAC). It minimizes the cryptographic primitives, but TinySec-AE has to perform CBC-mode encryption and CBC-MAC authentication at the sender side and CBC-MAC authentication and CBC-mode decryption at the receiver side, which requires two symmetric-key operation cycles to compute the encryption and the MAC. This overhead can be greatly reduced by using an authenticated-encryption mode such as OCB or CBC-X. We use OCB to generate the ciphertext as well as the MAC in only one symmetric-key operation (as also used in MiniSec).

In TinySec [17] packets, the source and destination fields are 2 bytes each, so a network can support 2^16 = 65536 nodes. We have introduced a group field because groups are crucial for many sensor network applications: with an 8-bit group field a network can support 256 different groups, and with 8-bit source and destination IDs each group can support 256 different nodes.

In our scheme, the number of nodes a network can support is therefore 256 * 256 = 65536, the same as in TinySec. The packet overhead due to the source and destination IDs in TinySec is 2 + 2 = 4 bytes, whereas in our scheme it is 3 bytes (including the group field), because a node can be distinguished by {GroupID, SourceID} or {GroupID, DestinationID}. This is consistent with our earlier assumption that a node communicates only with its group members. The maximal payload length in a TinySec packet is 29 bytes, so the length field needs no more than 5 bits and the 3 most significant bits of the length byte are unused in each data packet. We utilize these 3 MSBs to provide adaptive security: the first 2 bits select the encryption level and the third bit enables authentication.
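One plausible encoding of the repurposed length byte is sketched below: 2 bits of encryption level, 1 authentication bit and a 5-bit payload length. The field ordering is our assumption; the paper fixes only which bits are reused.

    def pack_len_byte(enc_level: int, auth: bool, length: int) -> int:
        # Pack the security bits into the 3 MSBs of the length byte.
        # enc_level: 0-3 (see Table II below), auth: on/off, length: <= 29 bytes.
        assert 0 <= enc_level <= 3 and 0 <= length <= 29
        return (enc_level << 6) | (int(auth) << 5) | length

    def unpack_len_byte(b: int):
        return (b >> 6) & 0x3, bool((b >> 5) & 0x1), b & 0x1F

    hdr = pack_len_byte(enc_level=2, auth=True, length=24)   # RC5/80/8 + MAC
    assert unpack_len_byte(hdr) == (2, True, 24)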

Now, we will discuss how the communication protocol mentioned above preserves the required security properties.

TABLE II. DIFFERENT LEVELS OF ENCRYPTION

Bits  Level    Operation
00    Level 0  Simple XOR
01    Level 1  RC5/80/4
10    Level 2  RC5/80/8
11    Level 3  RC5/80/12

Here RC5/80/4 denotes the RC5 encryption algorithm with an 80-bit key and 4 encryption rounds.

Data Secrecy and Authentication: We use RC5 as the block cipher for encryption, coupled with OCB, so that both encryption and authentication are achieved in a single pass, saving processing time and energy at the sensor node. A sensor node uses encryption to provide data secrecy, but sometimes there is very little difference between consecutive readings of a sensor; we therefore use a nonce as a counter in the encryption, ensuring that a different ciphertext is generated each time. We use four levels of encryption to provide adaptive security; the encryption level is selected by the ISA.

Replay Protection and Freshness Check: Replay protection is provided by using a monotonically increasing counter value at both ends or a timestamp in the message. We carry the counter value in the packet header to defend against replay attacks. A monotonically increasing 32-bit counter is maintained at both ends, but only its last 8 bits are sent in the packet, saving transmission and reception energy. The whole operation is as follows:

• Assume that both the receiver and the sender hold the same 32-bit counter value.

• The RC5-OCB operation is applied using the 32-bit counter value, and the MAC is obtained.

• While sending, the full MAC (32 bits) is sent, but only the 8 LSBs of the counter value, saving the transmission of 24 bits.

• The receiver calculates an expected counter value (Cs) based on the last connection/synchronization.

• After receiving a packet, the receiver concatenates the upper 24 bits of Cs with the received 8-bit counter value and applies the RC5-OCB operation to obtain the MAC and ciphertext. If the MAC matches, the packet is accepted; otherwise the receiver increments Cs and retries the MAC calculation, assuming some packet loss.

The increment-and-check approach in the last step must be bounded, which can be done by setting a threshold that depends on the network's packet-loss rate; its cost can also be reduced by applying a Bloom filter [16]. Chin et al. [28] propose the LOFT protocol, which recommends sending only 3 bits of the counter in each packet; this does not suit broadcast communication, so we use an 8-bit counter value to make our scheme suitable for both unicast and broadcast communication. To make the protocol more resilient to replay attacks, the base station can periodically broadcast the full counter value.
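The receiver-side reconstruction can be sketched as follows; verify_mac stands in for the RC5-OCB check, and the retry bound is the assumed loss-rate-dependent threshold.

    from typing import Callable, Optional

    def recover_counter(cs: int, recv_lsb: int, packet: bytes,
                        verify_mac: Callable[[bytes, int], bool],
                        max_tries: int = 8) -> Optional[int]:
        # Rebuild the 32-bit counter from the received 8 LSBs. A candidate is
        # the upper 24 bits of the expected counter Cs joined with the received
        # low byte; on MAC failure, bump the upper bits (assuming packet loss
        # wrapped the low byte), up to a loss-rate-dependent threshold.
        for k in range(max_tries):
            candidate = ((((cs >> 8) + k) << 8) | recv_lsb) & 0xFFFFFFFF
            if verify_mac(packet, candidate):
                return candidate     # accept and resynchronize to this value
        return None                  # reject: replayed or badly out of sync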

V. SIMULATIONS AND ANALYSIS
To implement our proposed framework, we extended the Castalia simulator, based on OMNeT++, by adding the ISA (Intelligent Security Agent). The approach proposed in this paper stresses group communication and the diversity of application scenarios, so we tested it on three different applications: 'Military Surveillance System', 'Habitat Monitoring' and 'Agricultural Farming'. These correspond to high, medium and low security levels respectively; Table 1 lists the different security requirements of these scenarios. A comparison between a fixed security level and a variable security level is shown in Figures 3-5.

The Military Surveillance System uses a high security level (i.e., the highest encryption level) most of the time, whereas Agricultural Farming uses only data authentication with a low encryption level.

Energy savings are achieved through the variable encryption level and the flexibility in authentication and counter transmission. The ISA, depending on the current percept, determines an adaptive reaction for the security level that can incorporate many policies; recommendations can also be given at deployment time or afterwards. The percept information is collected from the various layers through cross-layer interactions and from the Resource Manager.

Figures 3, 4, 5. Energy consumed (Joules, Y-axis) versus node IDs (X-axis) for the Military Surveillance System (very low saving), Habitat Monitoring (moderate saving) and Agricultural Farming (high saving).

The types of percept information considered for simulation are: the currently available memory, the available energy, the trust levels of neighboring nodes, and predefined policies and recommendations.

The final conclusion that can be drawn from the results is that when the required security level is known in advance and is uniformly high, as in the military application, the ISA is largely immaterial and yields very low energy savings. In Agricultural Farming, by contrast, the ISA plays an important role and the amount of energy saved is very high, while for Habitat Monitoring the energy saved is moderate.

VI. CONCLUSION AND FUTURE WORK
Improved security is especially important for the success of wireless sensor networks (WSNs), because the data collected are often sensitive and the network is particularly vulnerable. A number of approaches have been proposed to provide security against various threats to WSNs, most of them based on a layered design; we have pointed out that these layered approaches are often inadequate and inefficient. To design our security scheme, we added one extra component, the ISA (Intelligent Security Agent), which interacts with all the layers, much like a resource manager, and provides an extensive set of information.

A cross-layer approach is energy-efficient and robust, as shown by several current research works, and the ISA helps determine an adaptive reaction for the security level. To our knowledge, this is one of the first security frameworks that provides security services to every layer and service of a sensor network. Through the simulation results, we have shown that energy-efficient security can be achieved by using a variable security level for each application scenario. We have simulated the framework to test its feasibility, but the real measure will come from a realistic implementation of this approach on sensor motes; we are therefore developing the framework as a security package on TinyOS. We have implemented only a very raw form of the ISA; its functions can be generalized and enhanced by employing an efficient learning algorithm.

REFERENCES
[1] Yee Wei Law and Paul J. M. Havinga, "How to Secure a Wireless Sensor Network", ISSNIP 2005, IEEE, 2005, pp. 89-95.
[2] Al-Sakib Khan Pathan et al., "Security in Wireless Sensor Networks: Issues and Challenges", ICACT 2006, Feb. 20-22, 2006, ISBN 89-5519-129-4, pp. 1043-1048.
[3] Mingbo Xiao, Xudong Wang and Guangsong Yang, "Cross-Layer Design for the Security of Wireless Sensor Networks", Proceedings of the 6th World Congress on Intelligent Control and Automation, June 21-23, 2006, Dalian, China, pp. 104-108.
[4] M. Healy, T. Newe and E. Lewis, "Resources Implications for Data Security in Wireless Sensor Network Nodes", SensorComm 2007, pp. 170-175.
[5] Sanjay Burman, "Cryptography & Security - Future Challenges and Issues", invited talk, Proceedings of ADCOM 2007.
[6] Sami S. Al-Wakeel and Saad A. Al-Swailem, "PRSA: A Path Redundancy Based Security Algorithm for Wireless Sensor Networks", WCNC 2007 Proceedings, pp. 4159-4163.
[7] Wei Ding et al., "Energy Equivalence Routing in Wireless Sensor Networks", Journal of Microcomputers and Applications, Spring 2004.
[8] Chris Karlof and David Wagner, "Secure Routing in Wireless Sensor Networks: Attacks and Countermeasures".
[9] Agah, Das and Basu, "A Game Theory Based Approach for Security in Wireless Sensor Networks", IEEE, 2004, pp. 259-263.
[10] Andras Varga, Manual of OMNeT++: A Discrete Event Simulation System.
[11] Phillip Rogaway et al., "OCB: A Block-Cipher Mode of Operation for Efficient Authenticated Encryption", ACM, 2003.
[12] Stefan Ransom, Dennis Pfisterer and Stefan Fischer, "Comprehensible Security Synthesis for Wireless Sensor Networks", Proceedings of MidSens'08, December 1-5, 2008, Leuven, Belgium.
[13] Eric Sabbah, Adnan Majeed, Kyoung-Don Kang, Ke Liu and Nael Abu-Ghazaleh, "An Application-Driven Perspective on Wireless Sensor Network Security", Proceedings of Q2SWinet'06, October 2, 2006, Torremolinos, Malaga, Spain.
[14] Hamed Soroush, Mastooreh Salajegheh and Tassos Dimitriou, "Providing Transparent Security Services to Sensor Networks", Proceedings of the IEEE International Conference on Communications (ICC'07), 24-28 June 2007.
[15] Madhukar Anand, Eric Cronin, Micah Sherr, Matthew A. Blaze, Zachary G. Ives and Insup Lee, "Sensor Network Security: More Interesting Than You Think", position paper, University of Pennsylvania, 2006.
[16] Mark Luk, Ghita Mezzour, Adrian Perrig and Virgil Gligor, "MiniSec: A Secure Sensor Network Communication Architecture", Proceedings of the Sixth International Conference on Information Processing in Sensor Networks (IPSN 2007), April 2007.
[17] Chris Karlof, Naveen Sastry and David Wagner, "TinySec: A Link Layer Security Architecture for Wireless Sensor Networks", Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems, Baltimore, MD, USA.
[18] H. Chan, A. Perrig and D. Song, "Random Key Predistribution Schemes for Sensor Networks", IEEE Security and Privacy Symposium, pp. 197-213, 2003.
[19] A. Perrig, R. Szewczyk, V. Wen, D. Culler and Tygar, "SPINS: Security Protocols for Sensor Networks", Proceedings of the Seventh Annual International Conference on Mobile Computing and Networks, July 2001.
[20] Haowen Chan and Adrian Perrig, "PIKE: Peer Intermediaries for Key Establishment in Sensor Networks", Proceedings of IEEE Infocom, Miami, Florida, March 2005.
[21] Sami S. Al-Wakeel and Saad A. Al-Swailem, "PRSA: A Path Redundancy Based Security Algorithm for Wireless Sensor Networks", WCNC 2007 Proceedings, pp. 4159-4163.
[22] Bryan Parno et al., "Secure Sensor Network Routing: A Clean-Slate Approach", CoNEXT 2006, Lisboa, Portugal.
[23] Garth V. Crosby, Niki Pissinou and James Gadze, "A Framework for Trust-based Cluster Head Election in Wireless Sensor Networks", Proceedings of the Second IEEE Workshop on Dependability and Security in Sensor Networks and Systems (DSSNS'06).
[24] Tanveer Zia and Albert Zomaya, "A Security Framework for Wireless Sensor Networks", SAS 2006 - IEEE Sensors Applications Symposium, USA.
[25] Sanjeev Setia et al., "LEAP: Efficient Security Mechanisms for Large-Scale Wireless Sensor Networks", Proceedings of CCS'03, 2003.
[26] Bhaskaran Raman et al., "Censor Networks: A Critique of 'Sensor Networks' from a Systems Perspective", ACM SIGCOMM Computer Communication Review, Vol. 38, No. 3, July 2008.
[27] Tae Kyung Kim and Hee Suk Seo, "A Trust Model Using Fuzzy Logic in Wireless Sensor Network", Proceedings of World Academy of Science, Engineering and Technology, Vol. 32, August 2008.
[28] Chin-Tser Huang, "LOFT: Low-Overhead Freshness Transmission in Sensor Networks", Proceedings of the 2008 IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing, pp. 241-248, 2008.

AUTHORS PROFILE

1. Kalpana Sharma: She is working as a Reader in the Department of Computer Science & Engineering at Sikkim Manipal Institute of Technology. She did her M.Tech at IIT Kharagpur.

2. M.K. Ghose: Presently serving as the Head of the CSE Department, Sikkim Manipal Institute of Technology, Sikkim, India. He holds a Ph.D and has a number of publications in the fields of Remote Sensing and GIS, Bioinformatics, etc.

3. Kuldeep: He completed his B.Tech at Sikkim Manipal Institute of Technology in June 2009.


DYNAMIC BANDWIDTH MANAGEMENT IN DISTRIBUTED VoD BASED ON THE USER CLASS USING AGENTS

H S Guruprasad Research Scholar, Dr MGR University

Asst Prof & HOD, Dept of ISE BMS College of Engg, Bangalore

[email protected]

Dr. H D Maheshappa Director

East Point Group of Institutions Bangalore

[email protected]

Abstract - This paper proposes a dynamic bandwidth management algorithm, built on agent technology, in which more bandwidth is allocated to higher-class users and, within a class, higher priority is given to videos with higher popularity. The popularity and weight profiles of the videos, which are used to allocate bandwidth efficiently, are periodically updated by a mobile agent. The proposed approach allocates more bandwidth to higher-class users and gives higher priority to higher-weight (popular) videos so that they can be served with high QoS; it reduces the load on the central multimedia server, maximizes the channel utilization between the neighboring proxy servers and the central multimedia server, and lowers the video rejection ratio. The simulation results demonstrate the reduction of the load on the central multimedia server through load sharing among the neighboring proxy servers, near-maximal bandwidth utilization, and greater bandwidth allocation for higher-class users.

Keywords: Bandwidth management, user class, mobile agent, Distributed VoD

I. Introduction

Agents are autonomous programs that can perceive an environment, take actions depending on its current status using their knowledge base, and learn so as to act better in the future. Autonomy, reactivity, proactivity and temporal continuity are mandatory properties of an agent; other important properties are being communicative, mobile, learning-capable and dependable. These properties distinguish agents from other programs. Agents can move around a heterogeneous network to accomplish their assigned tasks, so the mobile code should be platform-independent in order to execute on any remote host in a heterogeneous network [1, 2, 8, 10].

A video-on-demand system can be designed using any of three major network configurations: centralized, networked and distributed. In a centralized configuration, all clients are connected to one central server that stores all the videos and satisfies all client requests. In a networked configuration, many video servers exist within the network; each video server is connected to a small set of clients and manages a subset of the videos. In a distributed configuration, a central server stores all the videos and smaller servers are located near the network edges. When a client requests a particular video, the video server responsible for the request ensures continuous playback of the video [3].

In [5], Tay and Pang proposed an algorithm called GWQ (Global Waiting Queue) that shares the load in a distributed VoD system and hence reduces the waiting time for client requests. This load-sharing algorithm balances the load between heavily loaded and lightly loaded proxy servers in a distributed VoD. They assumed that videos are replicated in all the servers and requested evenly, which requires very large storage capacity in the individual servers. In [6], González, Navarro and Zapata proposed a more realistic algorithm for load sharing in a distributed VoD system; it maintains small waiting times with lower-capacity servers by allowing partial replication of videos, where the percentage of replication is determined by video popularity. Proxy servers are widely used in multimedia networks to reduce the load on the central server and to serve client requests faster.

In [2], we considered an architecture without a PSG and compared it with an architecture without neighbouring proxy servers. In this paper, we propose an efficient bandwidth allocation algorithm and a VoD architecture for a distributed VoD system that allocates higher bandwidth to the videos with higher weights. The architecture consists of a Central Multimedia Server [CMS], and a set of local proxy servers connected together in the form of a ring to form a Local Proxy Server Group [PSG]. All the PSGs are connected to the CMS, and all connections are made through fiber-optic cables. The rest of the paper is organized as follows: Section II presents the proposed architecture, Section III the proposed algorithm, Section IV the simulation model, and Section V the simulation results and discussion; Section VI concludes the paper and outlines further work.

II. Proposed Architecture

In the proposed architecture, shown in Fig 1, a Central Multimedia Server [CMS] is connected to a group of proxy servers. These proxy servers are connected through fiber-optic cables in the form of a ring, and each proxy server is connected to a set of clients (users). Each proxy server stores the video content currently requested by its clients.

Fig 1. Proposed architecture.

Consider n videos v1, v2, ..., vn with mean arrival rates λ1, λ2, ..., λn respectively, and m server channels. The total arrival rate over all videos is λ = λ1 + λ2 + ... + λn. The probability of receiving a user request for a video vi is Pi = λi / λ for i = 1, 2, ..., n.

There are three classes of customers, c1, c2 and c3, with associated profits p1, p2 and p3 respectively. Let k1, k2, ..., kn be the numbers of requests for the n videos v1, v2, ..., vn, with ki = ki1 + ki2 + ki3, where kij is the number of class-j requests for video vi. The weight associated with video vi in class j is then wij = kij * pj.
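As a small illustration of this bookkeeping, the sketch below computes the per-class weights of one video; the profit values and request counts are hypothetical.

    # Per-class profits p1 > p2 > p3 (illustrative values, not from the paper).
    profits = {1: 3.0, 2: 2.0, 3: 1.0}

    def video_weight(requests_by_class: dict, j: int) -> float:
        # Weight of a video within class j: w_ij = k_ij * p_j.
        return requests_by_class.get(j, 0) * profits[j]

    # Video v has 5 class-1, 12 class-2 and 30 class-3 pending requests.
    v = {1: 5, 2: 12, 3: 30}
    weights = {j: video_weight(v, j) for j in profits}  # {1: 15.0, 2: 24.0, 3: 30.0}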

The CMS contains all N videos, categorized into most popular, secondary popular and least popular. Initially, a few videos of each popularity category are loaded to the proxy servers, and their weights are appropriately assigned.

A single mobile agent is invoked periodically by the CMS; it travels across the proxy servers and updates the video popularity and weight profiles at the proxy servers and at the CMS.

When a request for a video arrives at a proxy server PS, one of the following five cases occurs:

- The requested video is already present in the proxy server PS

- The requested video is not present in PS or the right neighbor proxy server [RPS], but is present in the left neighbor proxy server [LPS] only

- The requested video is not present in PS or the left neighbor proxy server [LPS], but is present in the right neighbor proxy server [RPS] only

- The requested video is present in both LPS and RPS, but not in PS

- The requested video is not present in PS, LPS or RPS

If the requested video is present in the proxy server, real-time transmission of the video starts immediately from the proxy server to the client. If the requested video is not present in the proxy server, the weight of the video is computed as explained above.

If the requested video is not present in PS or RPS, but is present in LPS, then bandwidth for the requested video between PS and LPS is allocated as follows. If the maximum bandwidth required for the requested video in that class is available between PS and LPS, the maximum bandwidth is allocated. If not, the minimum bandwidth for the video in that class is allocated if available. If even the minimum bandwidth is not available between PS and LPS, we check whether it can be accumulated by deallocating bandwidth allocated in excess of the minimum, starting from the bottom (i.e., the least-weighted video); in this way excess bandwidth is reclaimed starting from the lower-weight videos in that class. If the minimum bandwidth for the video in that class can be accumulated, it is allocated to the requested video.

If bandwidth could not be allocated between PS and LPS, then bandwidth allocation is done between PS and CMS as explained above. If bandwidth could not be allocated between PS and CMS also, then the requested video is rejected.

If the requested video is not present in PS and LPS, but is present in RPS, then the bandwidth for the requested video between PS and RPS is allocated as explained above. If bandwidth could not be allocated between PS and RPS, then bandwidth allocation is done between PS and CMS as explained above. If bandwidth could not be allocated between PS and CMS also, then the requested video is rejected.

If the requested video is not present in PS, but is present in both LPS and RPS, then we check for the free bandwidth available between PS-LPS and PS-RPS. If free bandwidth available between PS and LPS is more than the free bandwidth available between PS and RPS, then bandwidth allocation is done between PS-LPS, otherwise bandwidth allocation is done between PS-RPS. If bandwidth could not be allocated between PS-RPS and PS-LPS, then bandwidth allocation is done between PS and CMS as explained above. If bandwidth could not be allocated between PS and CMS also, then the requested video is rejected.

III. Proposed Algorithm

Nomenclature:
  PS: proxy server
  LPS: left neighbor proxy server
  RPS: right neighbor proxy server
  BW: bandwidth
  BWAvail(x, y): bandwidth available between x and y
  MaxBWi: maximum bandwidth for a video in class i
  MinBWi: minimum bandwidth for a video in class i

When a request for a video arrives at a particular time t from a user of class i, do the following:
  if the requested video is present in PS
      start streaming the video from PS
  else
      perform dynamic bandwidth allocation according to algorithm DynamicBand
      if bandwidth is allocated
          the video is downloaded, stored at PS and streamed to the requesting client
      else
          the request is rejected

Algorithm DynamicBand
begin
  if the requested video is present in LPS only then
      call BA(LPS, PS, i)
      if bandwidth is not allocated, call BA(CMS, PS, i)
  if the requested video is present in RPS only then
      call BA(RPS, PS, i)
      if bandwidth is not allocated, call BA(CMS, PS, i)
  if the requested video is present in both LPS and RPS then
      if BWAvail(LPS, PS) > BWAvail(RPS, PS) then
          call BA(LPS, PS, i)
          if bandwidth is not allocated, call BA(CMS, PS, i)
      else
          call BA(RPS, PS, i)
          if bandwidth is not allocated, call BA(CMS, PS, i)
  if the requested video is not present in LPS or RPS then
      call BA(CMS, PS, i)
end

Algorithm BA(X, PS, i)
begin
  if MaxBWi required for the video is available between X and PS then
      allocate MaxBWi for the video
  else if MinBWi required for the video is available between X and PS then
      allocate MinBWi for the video
  else
      check whether MinBWi can be accumulated by deallocating bandwidth in
      excess of the minimum, starting from the bottom (i.e., the least-weighted
      video in class i)
      if MinBWi can be accumulated then
          allocate MinBWi to the requested video
      else
          BW is not allocated for the requested video
end
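For concreteness, the following Python rendering of algorithm BA makes two simplifying assumptions: each link keeps a free-capacity counter, and the current same-class allocations are held in a dictionary ordered from the least-weighted to the most-weighted video. All names are ours, not part of the paper's notation.

    from dataclasses import dataclass

    @dataclass
    class Link:
        free: float  # free bandwidth on this PS-LPS / PS-RPS / PS-CMS link

    def BA(link: Link, max_bw: float, min_bw: float, allocations: dict) -> float:
        # allocations: video -> (granted, minimum), ordered by ascending weight.
        if link.free >= max_bw:          # best case: grant MaxBWi
            link.free -= max_bw
            return max_bw
        if link.free >= min_bw:          # fall back to MinBWi
            link.free -= min_bw
            return min_bw
        # Try to reclaim bandwidth above the minimum, least-weighted video first.
        reclaimable, plan = link.free, []
        for vid, (granted, vmin) in allocations.items():
            excess = granted - vmin
            if excess > 0:
                plan.append((vid, excess))
                reclaimable += excess
                if reclaimable >= min_bw:
                    break
        if reclaimable < min_bw:
            return 0.0                   # rejection: not even MinBWi available
        for vid, excess in plan:         # shrink donors down to their minimum
            granted, vmin = allocations[vid]
            allocations[vid] = (granted - excess, vmin)
        link.free = reclaimable - min_bw # any surplus stays free on the link
        return min_bw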

IV. Simulation Model

The simulation model consists of a single central multimedia server and a few proxy server groups; each PSG consists of a few proxy servers. The parameters considered for simulation are as follows:

Parameter                                               Value
Number of proxy servers                                 6
Number of videos [NOV]                                  480
Bandwidth between PS-LPS, PS-RPS and PS-CMS             300 MB
Max bandwidth for videos in class 1                     24 MB to 29 MB
Min bandwidth for videos in class 1                     8 MB to 11 MB
Max bandwidth for videos in class 2                     18 MB to 23 MB
Min bandwidth for videos in class 2                     6 MB to 8 MB
Max bandwidth for videos in class 3                     12 MB to 17 MB
Min bandwidth for videos in class 3                     4 MB to 6 MB
No. of most popular videos                              NOV/4 = 120
No. of secondary popular videos                         NOV/4 = 120
No. of least popular videos                             NOV/2 = 240
No. of most popular videos initially loaded to PS       40
No. of secondary popular videos initially loaded to PS  40
No. of least popular videos initially loaded to PS      80

The performance parameters are the load sharing among the proxy servers, the allocation of more bandwidth to higher-weight videos, the video rejection ratio, and the bandwidth utilization between PS-LPS, PS-RPS and PS-CMS.

V. Results and discussion
The results presented are an average over several simulations conducted on the model. Each simulation is carried out for 10000 seconds.

It is assumed that the requests for the most popular, secondary popular and least popular videos are 50%, 35% and 15% of the total respectively, and that 20% of the requests come from class 1 users, 30% from class 2 users and 50% from class 3 users. The sizes of the videos are assumed to be quite large.

[Fig 2 - Fig 4. Plots: average maximum, minimum and allocated bandwidth for videos downloaded from the LPS, classes 1-3.]


[Fig 5 - Fig 10. Plots: average maximum, minimum and allocated bandwidth for videos downloaded from the RPS (Figs 5-7) and from the CMS (Figs 8-10), classes 1-3.]


[Fig 11 - Fig 14. Plots: bandwidth utilisation between PS-LPS, PS-RPS and PS-CMS (Figs 11-13); videos requested and rejected (Fig 14).]

Fig 2, Fig 3 and Fig 4 show the average maximum bandwidth, average minimum bandwidth and average allocated bandwidth for all videos downloaded from the LPS, for classes 1, 2 and 3 respectively. Initially, maximum bandwidth is allocated to the videos downloaded from the LPS in all three classes. Later, as the number of videos being downloaded from the LPS increases, the excess bandwidth of the lower-weight videos in the class is reclaimed to serve new videos; thus more bandwidth is assigned to higher-weight videos than to lower-weight ones. As the number of videos increases further, the average allocated bandwidth decreases further.

Fig 5, Fig 6 and Fig 7 show the corresponding averages for videos downloaded from the RPS for classes 1, 2 and 3. The behavior is the same: maximum bandwidth is allocated initially; as the load grows, excess bandwidth is reclaimed from the lower-weight videos in the class so that higher-weight videos receive more; and the average allocated bandwidth keeps decreasing as the number of videos keeps increasing.

Fig 8, Fig 9 and Fig 10 show the same averages for videos downloaded from the CMS for classes 1, 2 and 3. Again, maximum bandwidth is allocated initially; as the number of videos downloaded from the CMS increases, the excess bandwidth of the lower-weight videos in the class is reclaimed for new videos, so more bandwidth is assigned to the higher-weight videos, and the average allocated bandwidth decreases as the number of videos grows further.

Fig 11, Fig 12 and Fig 13 show the bandwidth utilisation between PS-LPS, PS-RPS and PS-CMS. As the figures show, the utilisation is close to the maximum, so the bandwidth between PS-LPS, PS-RPS and PS-CMS is used efficiently.

Fig 14 shows the number of videos requested, the number rejected without a PSG, and the number rejected with our scheme. The number of rejections is quite low; rejections mainly occur when the video is not found in the LPS or RPS and there is no free bandwidth between PS and CMS.

VI. Conclusion
In this paper, we have concentrated on dynamic bandwidth management among proxy servers, considering the user class and the popularity of the videos, using agents. The simulations show promising results. The algorithm always makes maximal use of the bandwidth between the neighboring proxy servers and the central multimedia server, allocating more bandwidth to higher-class users and to the popular videos within each class so that they are served with high QoS. Further work is being carried out to investigate dynamic bandwidth management considering a local proxy server group.

REFERENCES

[1] R. Ashok Kumar, H. S. Guru Prasad, H. D. Maheshappa and Ganesan, "Mobile Agent Based Efficient Channel Allocation for VoD", IASTED International Conference on Communication Systems & Applications (CSA 2006), 3-5 July 2006, Banff, Canada.
[2] H. S. Guruprasad and H. D. Maheshappa, "Dynamic Load Sharing Policy in Distributed VoD using Agents", International Journal of Computer Science and Network Security, Vol. 8, No. 10, 2008, pp. 270-275.
[3] Santosh Kulkarni, "Bandwidth Efficient Video on Demand Algorithm (BEVA)", 10th International Conference on Telecommunications, Vol. 2, pp. 1335-1342, 2003.
[4] Hongliang Yu, Dongdong Zheng, Ben Y. Zhao and Weimin Zheng, "Understanding User Behavior in Large-Scale Video-on-Demand Systems", Proceedings of the 2006 EuroSys Conference, Vol. 40, Issue 4, October 2006, pp. 333-344.
[5] Y. C. Tay and HweeHwa Pang, "Load Sharing in Distributed Multimedia-On-Demand Systems", IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 3, May/June 2000.
[6] S. González, A. Navarro, J. López and E. L. Zapata, "Load Sharing in Distributed VoD (Video on Demand) Systems", Int'l Conf. on Advances in Infrastructure for e-Business, e-Education, e-Science, and e-Medicine on the Internet (SSGRR 2002w), L'Aquila, Italy, January 21-27, 2002.
[7] Meng Guo, Mostafa H. Ammar and Ellen W. Zegura, "Selecting among Replicated Batching Video-on-Demand Servers", Proceedings of the 12th International Workshop on Network and Operating System Support for Digital Audio and Video, pp. 155-163, 2002.
[8] Mohammed A. M. Ibrahim, "Distributed Network Management with Secured Mobile Agent Support", International Conference on Hybrid Information Technology (ICHIT'06), 2006.
[9] Frederic Thouin, Mark Coates and Dominic Goodwill, "Video-on-Demand Equipment Allocation", Fifth IEEE International Symposium on Network Computing and Applications (NCA'06), pp. 103-110, 2006.
[10] S. S. Manvi and P. Venkataram, "Mobile Agent based Online Bandwidth Allocation Scheme in Multimedia Communications", IEEE GLOBECOM 2001, USA.
[11] A. Dan, D. Sitaram and P. Shahabuddin, "Dynamic Batching Policies for an On-Demand Video Server", Multimedia Systems, pp. 51-58, 1996.
[12] H. S. Guruprasad and H. D. Maheshappa, "Dynamic Load Balancing Architecture for Distributed VoD using Agent Technology", International Journal of Computer Science and Security, Vol. 1, Issue 5, Nov 10, 2008, pp. 13-22.

AUTHORS PROFILE

H S Guruprasad is an Assistant Professor and Head of the Department of Information Science & Engineering, BMS College of Engineering, Bangalore, India. He is an Engineering Graduate from Mysore University and did his Post Graduation at BITS, Pilani, India in 1995. He has vast experience in teaching and has guided many post graduate students. His research interests include multimedia Communications, distributed systems, computer networks and agent technology.


Dr. H. D. Maheshappa graduated in Electronics & Communication Engineering from the University of Mysore, India, in 1983, and completed his post-graduation in Industrial Electronics at the University of Mysore in 1987. He holds a doctoral degree in engineering from the Indian Institute of Science, Bangalore, India (2001). He specializes in electrical contacts, micro contacts, signal-integrity interconnects, etc., and his special research interest is bandwidth utilization in computer networks. He has been teaching UG and PG engineering for the last 25 years, has served various engineering colleges as a teacher, and is at present working as Director, East Point Group of Institutions, Bangalore, India. He has more than 35 research papers in various national and international journals and conferences. He is a member of IEEE, ISTE, CSI and ISOI, and a member of the Doctoral Committee of Coventry University, UK. He has been a reviewer of many textbooks for McGraw-Hill Education (India) Pvt. Ltd., has chaired technical sessions at national conferences, and has served on the advisory and technical committees of national conferences.


Modeling reaction-diffusion of molecules on surface and in volume spaces with the E-Cell System

Satya Nanda Vel Arjunan and Masaru Tomita
Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, Japan
Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan
Department of Environment and Information, Keio University, Fujisawa, Japan
[email protected]

Abstract—The E-Cell System is an advanced open-source simulation platform to model and analyze biochemical reaction networks. The present algorithm modules of the system assume that the reacting molecules are all homogeneously distributed in the reaction compartments, which is not the case in some cellular processes. The MinCDE system in Escherichia coli, for example, relies on intricately controlled reaction, diffusion and localization of Min proteins on the membrane and in the cytoplasm compartments to inhibit cell division at the poles of the rod-shaped cell. To model such processes, we have extended the E-Cell System to support reaction-diffusion and dynamic localization of molecules in volume and surface compartments. We evaluated our method by modeling the in vivo dynamics of MinD and MinE and comparing their simulated localization patterns to the observations in experiments and previous computational work. In both cases, our simulation results are in good agreement.

Keywords—lattice, hexagonal close-packed, systems biology, Monte Carlo, simulation, FtsZ, MinC

I. INTRODUCTION

THE E-Cell System is one of the well-known and advanced open-source simulation platforms to model and analyze both small- and large-scale biochemical reaction networks in living cells [1]. The driver algorithm of the E-Cell System (version 3) supports concurrent execution of multiple simulation algorithms, whose time steps are independently advanced in a continuous-time, discrete-time or discrete-event manner at varying timescales [2]. Multiple sessions of simulations, usually required for the estimation of reaction parameters and to obtain averaged results from stochastic reactions, can be executed simultaneously with its distributed computing utility [3], [4]. Simulation runs can be automated and modified ex tempore with Python scripting, while new simulation algorithms can be developed in C++ and incorporated into the system as plug-in modules.

Recent advances in molecular biology suggest that modeling reaction networks alone is not sufficient to accurately reproduce certain important cellular processes such as cell division [5] and gene expression [6]. The dynamic location, crowding and diffusion of molecules in cellular compartments play crucial roles in such processes [7]. Molecular crowding can cause some subspace within a compartment to be inaccessible to reacting molecules because it is occupied by other macromolecules. This volume exclusion effect can reduce molecular diffusion and alter reaction kinetics [8].

This work is supported by the Monbukagakusho Scholarship from the Ministry of Education, Culture, Sports, Science and Technology of Japan, the Core Research for Evolutional Science and Technology (CREST) program of the Japan Science and Technology Agency (JST), and by research funds from the Yamagata prefectural government and Tsuruoka City, Japan.

In the rod-shaped bacterial cell Escherichia coli, the divisionsite is restricted to the midcell by nucleoid occlusion andthe pole-to-pole oscillation of the proteins MinC, MinD andMinE, collectively called the MinCDE system (reviewed in[5]). The periodic oscillation is established because of intri-cately controlled reaction and diffusion of the proteins in thecytoplasm and the inner membrane. A simplified model of thesystem is illustrated in Figure I, which includes only MinDand MinE since MinC is not necessary for the oscillation.Division at the midpoint of the cell is important to ensureequal distribution of cell contents to the two daughter cells.The MinCDE molecules, found in low copies in the cytoplasmand on the inner cell membrane compartments, are not evenlydistributed temporally. As a result, the rate of reactions, whichis determined by the frequency of collision between reactingmolecules, is influenced by diffusion and physical localiza-tion within the compartments. The molecules undergo sur-face (two-dimensional space) and volume (three-dimensionalspace) reaction-diffusion (RD) on the cell membrane and inthe cytoplasm respectively.

Current algorithm modules of the E-Cell System assume that reactions take place between molecules that are uniformly distributed within the reaction compartment; it is also not possible to specify the physical location of each molecule. In this paper, we describe the extension of the E-Cell System to model the spatial localization and RD of molecules. Out of the several available methods (see Table I), ours is the only one that can account for the important implications of volume exclusion by molecules together with RD in both volume and surface spaces. To evaluate the new approach, we compare the simulation results of the MinDE system with the results obtained from experiments and previous computational work.

II. METHODOLOGY

In this section, we describe the proposed RD scheme before providing the details of the implementation with the E-Cell System.


Figure 1. The simplified oscillation model of the MinDE system in Escherichia coli, adapted from [9]. MinC is not represented because it is not essential for the oscillation. Arrows depict the five basic reactions in the model. In the first reaction, MinD_ADP exchanges nucleotide to become MinD_ATP with the rate k1. In the form of MinD_ATP, the molecule can bind to the membrane either autonomously with the rate k2 or cooperatively with another membrane-bound MinDm_ATP at the rate k3. Cytosolic MinE is also recruited to the membrane by MinDm_ATP with the rate k4 to form MinE.MinDm_ATP. The ATPase function of MinD is activated by MinE in the MinE.MinDm_ATP complex and consequently, MinD_ATP is converted to MinD_ADP, which cannot stay bound to the membrane. This is represented by the fifth reaction, in which the MinE.MinDm_ATP complex dissociates from the membrane at the rate k5 and forms the cytosolic monomers MinE and MinD_ADP.

Table I. COMPARISON OF REACTION-DIFFUSION MODELING METHODS

Name                 Volume Exclusion   Volume RD   Surface RD
MCell [10]           -                  +           +
MesoRD [11], [12]    -                  +           +
Smoldyn [13]         -                  +           +
GMP [14]             -                  +           +
GFRD [15]            +                  +           -
CyberCell [16]       +                  +           -
GridCell [17]        +                  +           -
E-Cell (this work)   +                  +           +

A. The Reaction-Diffusion Scheme

Molecules diffuse freely in space by making random walks [18]. According to the Collins and Kimball theory [19], when two reactive molecules come into contact (i.e., collide), they react with a certain probability p, which is related to the reaction rate constant k. To avoid molecule search when checking for collisions, we have discretized the space into a hexagonal close-packed lattice [20]. Each sphere voxel in the lattice has 12 adjoining neighbor voxels. To account for volume exclusion and molecular crowding, each voxel can be occupied by a single molecule. The radius of the voxels is set according to the size of the diffusing molecules. A molecule can walk to a randomly selected neighbor voxel in an interval τd following the Einstein-Smoluchowski expression for diffusion

    τd = ⟨r²⟩ / (2lD),    (1)

where ⟨r²⟩ and D are the mean squared displacement and the diffusion coefficient of the molecule respectively, and l = 2 for surface diffusion while l = 3 for volume diffusion. Since in the interval τd the molecule walks to a neighbor voxel, the mean squared displacement is given by the lattice spacing. We have derived the spacing for surface and volume diffusion, and the connection between p and k, in [21]. At the destination voxel, the walking molecule may collide with another molecule that is a reactant pair and react if an independent random number drawn from a unit uniform distribution is less than p.
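As a quick illustration of Eq. (1), the sketch below computes the walk intervals for the cytosolic and membrane diffusion coefficients of the MinDE model (Table II), assuming a lattice spacing of twice the 8 nm voxel radius (touching spheres); the helper name and the spacing assumption are ours, not the derivation of [21].

    def walk_interval(spacing_m: float, D_m2s: float, surface: bool) -> float:
        # Eq. (1): tau_d = <r^2> / (2 l D), with <r^2> = lattice spacing squared.
        l = 2 if surface else 3
        return spacing_m ** 2 / (2 * l * D_m2s)

    # Assumed spacing of ~16 nm (twice the 8 nm voxel radius of Table II):
    tau_volume = walk_interval(16e-9, 2.5e-12, surface=False)   # D = 2.5 um^2/s
    tau_surface = walk_interval(16e-9, 0.01e-12, surface=True)  # D = 0.01 um^2/s
    print(f"volume step: {tau_volume:.3e} s, surface step: {tau_surface:.3e} s")
    # -> roughly 1.7e-05 s in the cytoplasm and 6.4e-03 s on the membrane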

B. E-Cell System Data Structure and Driver Algorithm

The specific details of the E-Cell System data structure, driver and integration algorithms have been described previously [2]. We briefly outline the data structure and the driver algorithm here to characterize the implementation of our algorithm modules. We adopt the notations in [2] and capitalize the class names.

A reaction network system is represented in E-Cell as a Model, specified by the user in the E-Cell Model description language (EML), a subset of XML. Figure 2 shows the data structure of the Model class, which contains a list of state Variables and Steppers. The Stepper class is the main algorithm module of the Model. It operates with a set of Processes, the current local time τ and the step interval ∆τ, a step method that advances the Stepper in time in either a continuous-time, discrete-time or discrete-event fashion, and an interruption method that allows other Steppers to notify the Stepper when they modify one of its read Variables. The Process is a lower-level algorithm module that directly reads or modifies the state Variables according to the algorithm, using a transition function. Here, the instances of the state Variables are accessed by dereferencing the VariableReferences.

Central to the E-Cell driver algorithm is a priority queue that arranges the Steppers according to their scheduled time of execution, given as τ + ∆τ. At initialization, the global time t and the local times τ of the Steppers are reset. Next, the step method of the Stepper Si with the minimum scheduled time is called and the global time is updated to t = τi + ∆τi. The step method also sets the local time to τi = τi + ∆τi and calls the transition functions of its Processes to update the state Variables. The method may also update the next step size ∆τi and call the interruption method of other Steppers whose read Variables have been modified. The Stepper Si is then rescheduled in the priority queue according to its new scheduled time. The same procedure is repeated for the Stepper with the next earliest scheduled time in the priority queue until the simulation ends.
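The driver loop can be sketched as follows, with a binary heap standing in for the priority queue and the Stepper interface reduced to the step method described above; this is a schematic rendition in Python, not the E-Cell C++ implementation.

    import heapq

    class Stepper:
        # Minimal stand-in for an E-Cell Stepper: local time tau, interval dt,
        # and a step() that would fire its Processes and return the next interval.
        def __init__(self, name, dt):
            self.name, self.tau, self.dt = name, 0.0, dt
        def step(self):
            # transition functions of the Processes would update Variables here
            return self.dt

    def run(steppers, t_end):
        # Queue entries are ordered by scheduled time tau + dt; the index i
        # breaks ties without comparing Stepper objects.
        queue = [(s.tau + s.dt, i, s) for i, s in enumerate(steppers)]
        heapq.heapify(queue)
        t = 0.0
        while queue and t < t_end:
            sched, i, s = heapq.heappop(queue)  # earliest-scheduled Stepper
            t = sched                           # advance global time
            s.tau = sched                       # advance the Stepper's local time
            s.dt = s.step()                     # may also interrupt other Steppers
            heapq.heappush(queue, (s.tau + s.dt, i, s))

    run([Stepper("diffusion", 1e-5), Stepper("reaction", 1e-3)], t_end=0.01)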

Figure 2. The data structure of the E-Cell Model.

C. Implementation of Reaction-Diffusion with E-Cell
We have implemented the proposed reaction-diffusion scheme in the E-Cell System by creating two basic algorithm modules: a Diffusion Process and a Reaction Process. The molecules are represented as the state Variables of the Model. For each diffusion coefficient in the simulation, a Diffusion Process object is created to walk the molecules. Likewise, a Reaction Process object is instantiated for every reaction involving a diffusing molecule. A discrete-event Stepper advances the Diffusion Process in time and handles the lattice structure and the physical locations of the molecules. The transition function of the Diffusion Process walks each molecule to a randomly selected adjoining voxel in each diffusion step interval. When a molecule collides with a reactant pair, the transition function of the corresponding Reaction Process is called. If the reactive collision probability is met, the Process removes the collided molecules and replaces them with one or two product molecules, as specified by the reaction.

III. APPLICATION RESULTS

We modeled the periodic oscillatory behavior of molecules in the MinDE system to evaluate our approach because it involves both surface and volume RD as well as spatio-temporal localization of molecules. We describe the computational model of the system before presenting the simulation results.

A. The MinDE Model

In Escherichia coli, the FtsZ membrane protein physically initiates cell division by polymerizing and constricting annularly at the middle of the long axis of the rod-shaped cell [5], [22]. Although the protein can diffuse over the entire membrane, nucleoid occlusion prevents the polymerization from taking place over the nucleoid mass, leaving only the midcell and the two cell poles as viable locations for polymerization [23], [24], [25]. Nonetheless, because of the inhibition by MinC proteins at the poles [26], [27], the polymerization can only take place at the midcell. During the cell cycle, MinD, with the help of MinE, forms polar zones that oscillate from one pole to the other. Since MinC piggybacks on MinD and the oscillation occurs along the rod-shaped cell with some dwelling time at the poles, the time-averaged concentration of MinC at the middle of the cell is kept low, permitting FtsZ to polymerize.

According to the model by Huang et al. [9] (illustrated in Figure 1), cytosolic MinD in the ATP-bound form (MinD_ATP) binds to the membrane either cooperatively with another membrane-bound MinD (MinDm_ATP) or independently. MinE from the cytoplasm inhibits MinDm_ATP by first associating with it and setting off the ATPase function of MinD, which hydrolyzes the bound ATP to ADP. The membrane-bound MinE and the ADP-bound form of MinD (MinD_ADP) then dissociate to the cytoplasm. In the cytoplasm, MinD_ADP is phosphorylated and takes the form of MinD_ATP again. MinC is not explicitly represented in the model because it is usually attached to MinD and experimental data indicate that the oscillation can take place without it [27], [28]. The series of reactions in the model is as follows:

MinD_ADP --k1--> MinD_ATP    (2)

MinD_ATP --k2--> MinDm_ATP    (3)

MinD_ATP + MinDm_ATP --k3--> 2 MinDm_ATP    (4)

MinE + MinDm_ATP --k4--> MinE.MinDm_ATP    (5)

MinE.MinDm_ATP --k5--> MinE + MinD_ADP    (6)

B. Simulation Results and Discussion

To simulate the MinDE oscillation model using the proposed approach, we applied the parameters from [29] (listed in Table II). However, we reduced the radius of our voxels to more closely reflect the size of the diffusing molecules. Another difference between our method and the MesoRD method in [29] is that our molecules can exhibit excluded volume.

Figure 3 shows the random distribution of cytosolic MinD_ADP, MinD_ATP and MinE molecules at initialization. The pole-to-pole oscillation of membrane-bound MinDm_ATP, as shown in Figure 4, is spontaneously triggered after about 1 minute of simulated time although all molecules were initially randomly distributed in the cytoplasm. For the 1 minute of simulated time, it takes about 14 minutes of simulation on an Intel Core 2 Extreme 3.2 GHz system with 8 GB of RAM. The oscillation has an average period of 36 seconds, which corresponds to what has been observed experimentally [30], [28]. The period is also in close agreement with the value from the previously reported computational model [29]. Consistent with the observations by Huang et al. [9] and in [29], the period increased proportionally with the number of MinD, while it decreased proportionally with the amount of MinE in the model. In addition, as observed in the MinD-MinE localization studies in Escherichia coli [31], the membrane-bound MinE.MinDm_ATP dimers appear to lag behind the MinDm_ATP molecules when they migrate from one pole to the other. Taken together, our simulations closely reproduce the results from both experimental and previous computational studies.

Table II. PARAMETERS OF THE MINDE MODEL

Variable                     Value    Units
k1                           0.5      s^-1
k2                           0.0125   µm s^-1
k3                           0.0149   µm^3 s^-1
k4                           0.0923   µm^3 s^-1
k5                           0.7      s^-1
D_cytoplasm                  2.5      µm^2 s^-1
D_membrane                   0.01     µm^2 s^-1
Cell volume                  3.27     µm^3
Cell radius                  0.5      µm
Voxel radius                 8        nm
Initial MinD_ATP molecules   2001
Initial MinD_ADP molecules   2001
Initial MinE molecules       1040
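In code form, the five reactions of Figure 1 with the Table II parameters reduce to a compact specification such as the sketch below; the data layout is our own shorthand, not E-Cell's EML syntax.

    # MinDE reaction network, (reactants, products, rate); rates from Table II.
    reactions = [
        (("MinD_ADP",),               ("MinD_ATP",),               0.5),     # k1
        (("MinD_ATP",),               ("MinDm_ATP",),              0.0125),  # k2
        (("MinD_ATP", "MinDm_ATP"),   ("MinDm_ATP", "MinDm_ATP"),  0.0149),  # k3
        (("MinE", "MinDm_ATP"),       ("MinE.MinDm_ATP",),         0.0923),  # k4
        (("MinE.MinDm_ATP",),         ("MinE", "MinD_ADP"),        0.7),     # k5
    ]

    diffusion = {"cytoplasm": 2.5, "membrane": 0.01}   # um^2/s
    initial = {"MinD_ATP": 2001, "MinD_ADP": 2001, "MinE": 1040}
    geometry = {"volume_um3": 3.27, "radius_um": 0.5, "voxel_radius_nm": 8}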

Figure 3. Simulated cytosolic molecules of the MinDE system. MinD_ATP (purple), MinD_ADP (white) and MinE (red) are randomly distributed in the cytoplasm of Escherichia coli at initialization.

Here, we describe the steps that trigger the spontaneous oscillation. Initially, all molecules are evenly distributed in the cytoplasm. A very small number of MinD_ATP molecules begin to independently associate at random locations on the membrane and rigorously recruit other MinD_ATP molecules cooperatively. As shown in Figure 5, the recruitment gives the appearance of growing patches on the membrane. Cytosolic MinE molecules are attracted to these patches because of their high affinity to MinDm_ATP and form MinE.MinDm_ATP dimers. Soon the patches loosely cover the entire membrane because the rate of MinD recruitment is faster than the rate of dissociation, even after almost all MinE are found in the MinE.MinDm_ATP dimer form on the membrane. At random locations on the membrane, some MinDm_ATP patches are free from MinE inhibition because of the limited cytosolic MinE molecules. In addition, these patches also become more persistent at locations farther from the dissociating patches, where they are less inhibited by the newly released MinE molecules and where MinD_ATP can escape cooperative recruitment by MinDm_ATP in the dissociating patches. Finally, patches (or polar zones) form alternately (i.e., oscillate) at the two poles of the cell because the poles are sufficiently far from each other to avoid both the rapid inhibition by the released MinE and the cooperative recruitment by MinDm_ATP. During the oscillation cycle, the MinE.MinDm_ATP dimers appear to lag behind MinDm_ATP because the MinE molecules released from the opposite pole find MinDm_ATP at the rim of the polar zone first.

tures of the MinDE system that support the periodic os-cillations. When we increased the nucleotide exhange rateof MinDADP to k1 = 1 s−1, the population of MinDATP

increased, resulting in more MinDmATP and longer polar zones.

Since there are more MinDmATP to be activated by MinE, the

oscillation period also increased to about 63 s. Conversely, theduration of an oscillation cycle is reduced to approximately 20s when the rate is decreased to k1 = 0.3 s−1 because of thelimited number of MinDATP copies available.

The reactive collision probability p for MinD-ATP to associate independently with the lipid molecules on the membrane (k2 = 0.0125 µm s−1, p = 0.274 × 10−4) is about four orders of magnitude lower than that to associate cooperatively with another membrane-bound MinDm-ATP (k3 = 0.0149 µm3 s−1, p = 0.16), even though the number of lipid molecules (~72000) is only about an order of magnitude larger than that of MinDm-ATP (~1400). This ensures that MinD-ATP binds to the membrane independently only to nucleate the binding of other MinD-ATP. When we increased k2 to 0.05 µm s−1, MinDm-ATP were found loosely distributed in a polar zone that extended beyond the midcell, because the enhanced nucleation rate allows MinD-ATP to successfully bind almost anywhere on the membrane. Occasionally, there was no clear definition of the polar zones, with MinDm-ATP and MinE.MinDm-ATP covering the entire membrane. The oscillation period increased moderately, to about 44 s. Reducing k2 to 0.0009 µm s−1 shortened the period marginally to 33 s and displayed erratic nucleation patterns, with the polar zones frequently appearing and growing near the midcell. In addition, because of the decreased nucleation rate, cooperative membrane recruitment of MinD-ATP dominated further and vigorous membrane associations occurred at the nucleation sites.

When the rate of cooperative recruitment was increased to k3 = 0.034 µm3 s−1, the oscillation was not triggered because MinE cannot rapidly activate the vigorously recruited MinDm-ATP. Therefore, both MinDm-ATP and MinE.MinDm-ATP were uniformly distributed on the membrane. Setting k3 = 0.024 µm3 s−1 stimulated the oscillation, but the polar zones were not clearly defined and extended well over the midcell because of the larger population of MinDm-ATP. The oscillation period also increased, to approximately 50 s.


Reducing k3 to 0.004 µm3 s−1 also prevented the oscillation because MinD-ATP cannot successfully bind to the membrane: MinE rapidly dissociates them since their cooperative recruitment activity has been weakened. However, setting k3 = 0.007 µm3 s−1 started the oscillation with a reduced period of about 30 s, while the polar zones were occasionally nucleated near the midcell.

Reducing or increasing the MinE membrane recruitment rate k4 generally has the opposite effect of k3. This is because the membrane-associated MinE activates the MinD ATPase function that dissociates MinD into the cytoplasm. Increasing the rate to k4 = 0.4 µm3 s−1 generated the oscillation with a shortened period of about 30 s, since more MinE.MinDm-ATP are available to activate MinD. The polar zones were nucleated both near the midcell and at the poles, resulting in their rapid growth in each cycle. Conversely, setting k4 = 0.05 µm3 s−1 produced polar zones that are not clearly defined, with an oscillation period of approximately 41 s. Both polar zones frequently appeared simultaneously, with one usually covering more than one half of the cell long axis.

The MinE.MinDm-ATP dissociation rate k5 has almost the same properties as k4 but with higher efficacy. When the rate was increased to k5 = 1.2 s−1, the polar zones were shorter and cycled between the poles with a period lasting approximately 12 s. The higher rate increases the population of free MinE that can associate with and activate other MinDm-ATP, thus reducing the size of the polar zones and increasing the oscillation speed. On the other hand, when the rate was reduced to k5 = 0.5 s−1, the period increased to about 54 s, while the polar zones were nucleated at the cell poles and extended beyond the midcell.

The significantly slower diffusion of MinDm-ATP and MinE.MinDm-ATP on the membrane prevents the molecules from rapidly achieving a uniform concentration on the membrane. When we increased the diffusion coefficient twofold to Dmembrane = 0.02 µm2 s−1, the polar zones extended beyond the midcell and were not clearly defined. The oscillation period was about 60 s. Increasing the coefficient further to Dmembrane = 0.05 µm2 s−1 did not produce the oscillation.

IV. CONCLUSIONS

We have successfully extended the E-Cell System to model RD on surface and in volume spaces with dynamic localization of molecules. A unique feature of our method is that it can account for the important implications of volume exclusion by molecules while performing RD in both volume and surface spaces [21]. The correctness of our method and implementation is demonstrated by the accurate reproduction of the MinD and MinE oscillation behaviors in Escherichia coli, as observed in both experimental and previous computational studies. We modeled the MinDE system because the proteins display unique properties such as spatio-temporal localization patterns on the membrane and inter-compartmental reactions. We have shown the impact of changing the various reaction and diffusion parameters on the dynamic localization patterns of the proteins. Recent experimental studies have shown that MinD forms helical polymers on the membrane [32].

Figure 4. MinD and MinE oscillation on the membrane of Escherichia coli, shown as snapshots at 4 s intervals from 0 s to 36 s. MinDm-ATP monomers and MinE.MinDm-ATP dimers are shown in cyan and green respectively. The MinDm-ATP monomers appear to lead the dimers in the oscillation cycle, which has an average period of 36 seconds.

Our work can be further extended to model such membrane-bound polymerization dynamics of molecules. The software implementation of the method and the model presented in this paper are provided upon request. A detailed guide to RD modeling using the method is also available [33].

ACKNOWLEDGMENT

We thank Martin Robert, Koichi Takahashi, Takeshi Sakurada, Mohamed Helmy and Moriyoshi Koizumi for useful discussions.

REFERENCES

[1] M. Tomita, K. Hashimoto, K. Takahashi, T. S. Shimizu, Y. Matsuzaki, F. Miyoshi, K. Saito, S. Tanida, K. Yugi, J. C. Venter, and C. A. Hutchison, “E-CELL: software environment for whole-cell simulation,” Bioinformatics, vol. 15, no. 1, pp. 72–84, 1999.


Figure 5. Initial patches of MinD (cyan) and MinE.MinDm-ATP (green) forming on the membrane (gray).

[2] K. Takahashi, K. Kaizu, B. Hu, and M. Tomita, “A multi-algorithm, multi-timescale method for cell simulation,” Bioinformatics, vol. 20, no. 4, pp. 538–546, Mar. 2004.

[3] M. Sugimoto, K. Takahashi, T. Kitayama, D. Ito, and M. Tomita, “Distributed cell biology simulations with E-Cell system,” in Grid Computing in Life Science, ser. Lecture Notes in Computer Science, 2005, vol. 3370/2005, pp. 20–31.

[4] M. Sugimoto, “Distributed cell biology simulations with the E-Cell system,” in E-Cell System: Basic Concepts and Applications, ser. Intelligence Unit, S. N. V. Arjunan, P. K. Dhar, and M. Tomita, Eds. Georgetown, Texas: Landes Bioscience and Springer Science+Business Media, Aug. 2009.

[5] J. Lutkenhaus, “Assembly dynamics of the bacterial MinCDE system and spatial regulation of the Z ring,” Annu. Rev. Biochem., vol. 76, pp. 539–562, 2007.

[6] P. B. Talbert and S. Henikoff, “Spreading of silent chromatin: inaction at a distance,” Nat. Rev. Genet., vol. 7, no. 10, pp. 793–803, Oct. 2006.

[7] K. Takahashi, S. N. V. Arjunan, and M. Tomita, “Space in systems biology of signaling pathways–towards intracellular molecular crowding in silico,” FEBS Lett., vol. 579, no. 8, pp. 1783–1788, Mar. 2005.

[8] R. J. Ellis, “Macromolecular crowding: obvious but underappreciated,” Trends Biochem. Sci., vol. 26, no. 10, pp. 597–604, Oct. 2001.

[9] K. C. Huang, Y. Meir, and N. S. Wingreen, “Dynamic structures in Escherichia coli: spontaneous formation of MinE rings and MinD polar zones,” Proc. Natl. Acad. Sci. USA, vol. 100, no. 22, pp. 12724–12728, Oct. 2003.

[10] J. R. Stiles and T. M. Bartol, “Monte Carlo methods for simulating realistic synaptic microphysiology using MCell,” in Computational Neuroscience: Realistic Modeling for Experimentalists, E. D. Schutter, Ed. Boca Raton, FL: CRC Press, Nov. 2001, pp. 87–127.

[11] J. Elf and M. Ehrenberg, “Spontaneous separation of bi-stable biochemical systems into spatial domains of opposite phases,” Syst. Biol., vol. 1, no. 2, pp. 230–236, Dec. 2004.

[12] J. Hattne, D. Fange, and J. Elf, “Stochastic reaction-diffusion simulation with MesoRD,” Bioinformatics, vol. 21, no. 12, pp. 2923–2924, Jun. 2005.

[13] S. S. Andrews and D. Bray, “Stochastic simulation of chemical reactions with spatial resolution and single molecule detail,” Phys. Biol., vol. 1, no. 3-4, pp. 137–151, Dec. 2004.

[14] J. V. Rodríguez, J. A. Kaandorp, M. Dobrzynski, and J. G. Blom, “Spatial stochastic modelling of the phosphoenolpyruvate-dependent phosphotransferase (PTS) pathway in Escherichia coli,” Bioinformatics, vol. 22, no. 15, pp. 1895–1901, Aug. 2006.

[15] J. S. van Zon and P. R. ten Wolde, “Simulating biochemical networks at the particle level and in time and space: Green’s function reaction dynamics,” Phys. Rev. Lett., vol. 94, no. 12, p. 128103, Apr. 2005.

[16] D. Ridgway, G. Broderick, A. Lopez-Campistrous, M. Ru’aini, P. Winter, M. Hamilton, P. Boulanger, A. Kovalenko, and M. J. Ellison, “Coarse-grained molecular simulation of diffusion and reaction kinetics in a crowded virtual cytoplasm,” Biophys. J., vol. 94, no. 10, pp. 3748–3759, May 2008.

[17] L. Boulianne, S. A. Assaad, M. Dumontier, and W. Gross, “GridCell: a stochastic particle-based biological system simulator,” BMC Syst. Biol., vol. 2, no. 1, p. 66, Jul. 2008.

[18] H. C. Berg, Random Walks in Biology. Princeton University Press, 1993.

[19] F. C. Collins and G. E. Kimball, “Diffusion-controlled reaction rates,” J. Colloid Sci., vol. 4, no. 4, pp. 425–437, Aug. 1949.

[20] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed. Springer, Dec. 1998.

[21] S. N. V. Arjunan and M. Tomita, “A lattice-based method for volume and surface reaction-diffusion reproduces the MinE ring by cooperative activation of MinD,” submitted.

[22] J. Löwe and L. A. Amos, “Evolution of cytomotive filaments: the cytoskeleton from prokaryotes to eukaryotes,” Int. J. Biochem. Cell Biol., vol. 41, no. 2, pp. 323–329, Feb. 2009.

[23] C. L. Woldringh, E. Mulder, J. A. Valkenburg, F. B. Wientjes, A. Zaritsky, and N. Nanninga, “Role of the nucleoid in the toporegulation of division,” Res. Microbiol., vol. 141, no. 1, pp. 39–49, 1990.

[24] C. L. Woldringh, E. Mulder, P. G. Huls, and N. Vischer, “Toporegulation of bacterial division according to the nucleoid occlusion model,” Res. Microbiol., vol. 142, no. 2-3, pp. 309–320, Apr. 1991.

[25] T. G. Bernhardt and P. A. J. de Boer, “SlmA, a nucleoid-associated, FtsZ binding protein required for blocking septal ring assembly over chromosomes in E. coli,” Mol. Cell, vol. 18, no. 5, pp. 555–564, May 2005.

[26] Z. Hu, A. Mukherjee, S. Pichoff, and J. Lutkenhaus, “The MinC component of the division site selection system in Escherichia coli interacts with FtsZ to prevent polymerization,” Proc. Natl. Acad. Sci. USA, vol. 96, no. 26, pp. 14819–14824, Dec. 1999.

[27] Z. Hu and J. Lutkenhaus, “Topological regulation of cell division in Escherichia coli involves rapid pole to pole oscillation of the division inhibitor MinC under the control of MinD and MinE,” Mol. Microbiol., vol. 34, no. 1, pp. 82–90, Oct. 1999.

[28] D. M. Raskin and P. A. de Boer, “Rapid pole-to-pole oscillation of a protein required for directing division to the middle of Escherichia coli,” Proc. Natl. Acad. Sci. USA, vol. 96, no. 9, pp. 4971–4976, Apr. 1999.

[29] D. Fange and J. Elf, “Noise-induced Min phenotypes in E. coli,” PLoS Comput. Biol., vol. 2, no. 6, p. e80, Jun. 2006.

[30] C. A. Hale, H. Meinhardt, and P. A. de Boer, “Dynamic localization cycle of the cell division regulator MinE in Escherichia coli,” EMBO J., vol. 20, no. 7, pp. 1563–1572, Apr. 2001.

[31] Y. Shih, X. Fu, G. F. King, T. Le, and L. Rothfield, “Division site placement in E. coli: mutations that prevent formation of the MinE ring lead to loss of the normal midcell arrest of growth of polar MinD membrane domains,” EMBO J., vol. 21, no. 13, pp. 3347–3357, Jul. 2002.

[32] Y. Shih, T. Le, and L. Rothfield, “Division site selection in Escherichia coli involves dynamic redistribution of Min proteins within coiled structures that extend between the two cell poles,” Proc. Natl. Acad. Sci. USA, vol. 100, no. 13, pp. 7865–7870, Jun. 2003.

[33] S. N. V. Arjunan, “A guide to modeling reaction-diffusion of molecules with the E-Cell System,” in E-Cell System: Basic Concepts and Applications, ser. Intelligence Unit, S. N. V. Arjunan, P. K. Dhar, and M. Tomita, Eds. Georgetown, Texas: Landes Bioscience and Springer Science+Business Media, Aug. 2009.

Satya Nanda Vel Arjunan is a Ph.D. candidate in the Systems Biology Program at the Graduate School of Media and Governance, Keio University. He received his B.Eng. (2000) in Electronics Engineering and M.Sc. (2003) in Computer Science from Universiti Teknologi Malaysia. He is the recipient of the Texas Instruments Malaysia Scholarship (1997) and the Monbukagakusho Scholarship (2003).

Masaru Tomita is a Professor and the Director of the Institute for Advanced Biosciences, Keio University. He received his B.S. (1981) in Mathematics from Keio University, M.S. (1983) and Ph.D. (1985) in Computer Science from Carnegie Mellon University, and another Ph.D. (1994) in Molecular Biology from Keio University. Among other awards and prizes, Dr. Tomita is a recipient of the Presidential Young Investigators Award from the National Science Foundation of the USA (1988) and the IBM Japan Science Prize (2002).


Modeling of IEEE 802.11e EDCA: presentation and application in an admission control algorithm

Mohamad El Masri 1,2, Guy Juanole 1,2, Slim Abdellatif 1,2

1 CNRS; LAAS; 7 avenue du colonel Roche, F-31077 Toulouse, France.
2 Université de Toulouse; UPS, INSA, INP, ISAE; LAAS; F-31077 Toulouse, France

{masri,juanole,slim}@laas.fr

Abstract—We propose in this paper a Markov chain model that describes the behavior of an EDCA (Enhanced Distributed Channel Access) access category under saturation. Compared to previous work [1], [2], [3], the model explicitly integrates the behavior of an access category when submitted to a virtual collision. We give in this paper two views of the model: a general one explicitly representing the different states an access category goes through, and an abstract one reduced to only three useful states. The model has been used in several applications, spanning from performance evaluation to its use within an admission control algorithm. The latter application is presented in this paper.

I. INTRODUCTION

The IEEE 802.11e working group [4] introduced QoS (Quality of Service) mechanisms into the MAC (Medium Access Control) layer of the legacy IEEE 802.11 standard. This mainly consisted in the definition of a new access function, HCF (Hybrid Coordination Function), which combines two access modes; one of these is EDCA (Enhanced Distributed Channel Access), an enhancement of DCF (Distributed Coordination Function), which is based on a CSMA/CA scheme. EDCA is the area of interest of our work. With respect to DCF, EDCA introduces the traffic differentiation concept, defining four access categories (AC), each corresponding to a different queue within the station. A CSMA/CA scheme is implemented by each AC. This scheme is based on the arbitration (characterized by the AIFS (Arbitration Inter Frame Space) parameter) and on the backoff procedure (characterized by the contention window (CW) and the range [CWmin, CWmax]). AIFS and CW play the same roles as DIFS and CW in DCF. The choice of AIFS and CW allows prioritizing the AC traffic (the smaller the AIFS and CW, the higher the access probability). Due to the presence of several queues within a station, EDCA introduces, in addition to real collisions (physical collisions on the channel involving queues from different stations), a new kind of collision, named virtual collisions. The latter take place when at least two queues from the same station try to access the medium at the same time after their backoff period. This results in granting the access to the highest priority queue and penalizing the others (by widening the CW in the same manner as after a real collision). Some models have already been proposed in the literature [3], [2], models which assume a saturation regime (thus frequent collisions) and which are mainly inspired by the Bianchi model of DCF [1]. Zhu and Chlamtac [3] proposed a two-dimensional model of an EDCA

AC which considers neither the virtual collision aspect nor the time elapsing during a transmission on the medium. The model of Kong et al. [2] captures this latter aspect (at the cost of a new dimension in the model: the overall model of EDCA is a three-dimensional discrete Markov chain). However, it neither describes the virtual collision explicitly nor represents all the mechanisms described in the standard [4]. In this paper, we address these issues and propose a new Markov chain model describing precisely (with respect to the 802.11e amendment) the behavior of an EDCA access category under saturation. Capturing the virtual collision management of EDCA is fundamental for the modeling and analysis of many real-life usage scenarios. Internet traffic is often asymmetric, with much more traffic flowing from the access point to the end stations than in the reverse direction. In this situation virtual collisions cannot be neglected.
An abstract form is derived from the general model using common abstraction rules [5]. The abstract form is statistically equivalent to the general model and is reduced to the three useful states from the user's point of view. We derived from the abstract form closed-form expressions for the most important metrics that describe the performance perceived by the AC, namely mean access delay, throughput and packet drop probability. We used the model within a hybrid admission control for EDCA that we present in this paper: the admission control decision process uses the metrics derived from the model to establish a vision of the state of the network and thus be able to issue admission decisions.
This paper is organized as follows: section 2 describes the general form of our Markov chain model of EDCA. Section 3 presents the Beizer rules, used to reduce the initial model, along with the abstraction process and the abstract form of the model. In section 4, the hybrid admission control algorithm for EDCA is given. An analysis of the algorithm is performed using the Network Simulator (ns-2). We conclude this paper by giving perspectives on the evolution and usage of the model.

II. THE GENERAL MODEL

A. ACi behavioral view

We give an abstract view of an ACi's behavior in saturation regime (i ∈ {0, 1, 2, 3}, in descending priority order) in figure 1. The transmission of a packet is implemented through a series of access attempts.


Fig. 1. Behavior of an ACi: successive access attempts (AIFS + backoff, then actual transmission attempt), each ending in a successful transmission, a collision followed by a retransmission, or, after retransmission # (m+h), a packet drop.

Each attempt is based, at first, on the sequence of two processes (AIFS and backoff), which define the medium idleness test before the actual transmission attempt, and then on the actual transmission attempt itself (i.e., the decision to make a transmission). The result of each actual transmission is either a successful transmission (following which the sending of a new packet is considered) or a collision (following which the packet's retransmission is considered). Note that on the first attempt we have CW[ACi] = CWmin[ACi]. After a collision, the new value of the contention window is computed as CWnew[ACi] = min(2 × CW[ACi] + 1, CWmax[ACi]) in order to further reduce the collision probability. After m attempts (one unsuccessful first transmission and m − 1 retransmissions), the value of CW[ACi] is CWmax[ACi]; this value is used for the mth retransmission and h additional retransmissions before reaching the retransmission threshold (R_T[ACi] = m + h). If the retransmission threshold is reached and the last retransmission was unsuccessful, the packet is dropped and the transmission of a new packet is considered. This abstract view highlights the basic patterns for the modeling: the AIFS procedure, the backoff procedure, the actual transmission attempt procedure and their results.
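For illustration, the contention window update rule can be written in a few lines of Python. CWmin = 15 and CWmax = 1023 below are hypothetical values chosen only to show the doubling and its saturation at CWmax, not parameters taken from the standard's per-AC tables.

def next_cw(cw, cw_max):
    # CWnew[ACi] = min(2 * CW[ACi] + 1, CWmax[ACi])
    return min(2 * cw + 1, cw_max)

cw = 15                      # CWmin for this example
for _ in range(8):           # successive collisions
    cw = next_cw(cw, 1023)   # 31, 63, 127, 255, 511, 1023, 1023, ...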

B. The basic patterns

1) AIFS procedure: Any transmission attempt starts with the random choice of the value of the Backoff Counter (B_C[ACi]) within the current contention window range [0, CW[ACi]] (this value defines the backoff time which will be used at the end of the AIFS period). The AIFS procedure consists in the necessity to observe medium idleness during the AIFS period. If, during the AIFS period (we call A its duration in time slots), the medium becomes busy, the AIFS countdown freezes during the medium occupation time (we call N the mean value of this duration in time slots), after which the AIFS countdown is reset. At the end of the last slot of AIFS, if the medium is still idle, two outputs are possible: if B_C[ACi] = 0, the AC will directly attempt a transmission; if B_C[ACi] > 0, the value of B_C[ACi] is decremented by one, thus initiating the backoff procedure.

2) Backoff procedure: The backoff procedure mainly consists in decrementing the value of B_C[ACi] while the medium is idle. The value of B_C[ACi] is decremented until it reaches 0, one slot after which a transmission is directly attempted if the medium is still idle. If, during the backoff counter decrementing, the medium becomes busy, the decrementing procedure is stopped and frozen for a time which is the sum of the medium occupation time and an AIFS period (this time has the value N + A); if during the AIFS period the medium becomes busy again, the process is repeated. At the end of the last slot of AIFS, if the medium is still idle, two outputs are possible: if B_C[ACi] = 0, the AC will directly attempt a transmission; if B_C[ACi] > 0, the value of B_C[ACi] is decremented, thus resuming the backoff procedure.
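As a rough illustration of this freeze-and-resume behavior, the sketch below simulates the countdown slot by slot, assuming a fixed per-slot busy probability pb and charging a single N + A freeze per busy event (the repeated-busy case during the AIFS period is deliberately left out to keep the sketch short).

import random

def backoff_duration(bc, pb, N, A):
    """Slots elapsed until B_C reaches 0 and a transmission is attempted."""
    slots = 0
    while bc > 0:
        if random.random() < pb:   # medium became busy during the countdown
            slots += N + A         # freeze: occupancy time plus one AIFS
        else:                      # medium stayed idle for one slot
            slots += 1
            bc -= 1                # decrement the backoff counter
    return slots + 1               # the slot in which transmission is attempted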

3) Actual transmission attempt: When an ACi decides to initiate a transmission attempt, either it is the only one within the station wanting to transmit, in which case it will directly access the medium, or there is at least one other AC within the station also wishing to transmit, in which case both ACs enter a virtual collision. Within the virtual collision handler, the AC that wins the virtual collision (thus accessing the medium) is the higher priority AC. If ACi loses the virtual collision, the medium will be accessed by an AC virtually colliding with ACi and having a higher priority. An actual transmission attempt has three possible outcomes:

1) The transmission was successful, in which case ACi occupied the medium for a duration ⌈Ts⌉ (the smallest integer number of time slots higher than Ts, the duration of a successful transmission), and a new packet transmission is then taken into consideration.
2) ACi suffered a real collision, in which case ACi occupied the medium for a collided transmission time ⌈Tc⌉, and the packet may be retransmitted within the retry threshold limit.
3) ACi lost a virtual collision, in which case ACi does not occupy the medium; a higher priority AC within the station transmits instead (either suffering a collision, thus occupying the medium for ⌈Tc⌉, or transmitting successfully, thus occupying the medium for ⌈Ts⌉). ACi's packet may be retransmitted within the retry threshold limit.

Situations 2 and 3 above globally define what we call the collision situation for ACi.


C. Basics for the modeling

1) ACi behavior: We represent it by a discrete-time Markov chain where all the states have a duration of one time slot (we will thus only represent the transition probabilities in figures 2, 3 and 4). A state of the discrete Markov chain must specify the packet access attempts (we have to distinguish, on one hand, the successive attempts and their corresponding collisions and, on the other hand, a successful transmission), the backoff counter (we have to distinguish, on one hand, the backoff procedure where the backoff counter is meaningful and, on the other hand, the situations where the backoff counter is meaningless) and the remaining time to the end of the different timed actions (AIFS, medium occupancy, collision, successful transmission). Therefore a state of the discrete Markov chain is represented by a triplet (j, k, d), with j representing the state of the packet attempt, k the backoff counter and d the remaining time. We consider the following values for each of the components:
• j: 0 ≤ j ≤ m + h for the successive attempts (j = 0 for the first attempt and 1 ≤ j ≤ m + h for the following retransmission attempts); each value of j is associated with all the states of the AIFS period before the backoff, the stage of the backoff procedure where the value of the contention window CW[ACi] is noted Wj, and the collision situation; the successful transmission is represented by j = −1.
• k: 0 ≤ k ≤ Wj for stage j of the backoff procedure; in the other cases, where k is meaningless, we take a negative value for k (different negative values should be taken, for triplet uniqueness reasons, depending on the situation, as we explain after the specification of the values of d).
• d: 1 ≤ d ≤ ⌈Ts⌉ for the duration of a successful transmission of ACi or after a virtual collision of ACi (where ACk, winner of the virtual collision, successfully transmits); 1 ≤ d ≤ ⌈Tc⌉ for the duration of a collision (of either ACi or ACk, winner of the virtual collision); 1 ≤ d ≤ A for the AIFS duration; A + 1 ≤ d ≤ N + A for the medium occupancy duration occurring during an AIFS period or during backoff counter decrementing. We consider ⌈Tc⌉ < ⌈Ts⌉.
As the AIFS period before the backoff and the collision situation of each attempt j (in both situations the backoff counter is meaningless) can have identical remaining time values, it is necessary, in order to avoid state ambiguity, to distinguish these states by a different negative value of k; we choose k = −1 for the collision situation and k = −2 for the AIFS period. The value of k for the successful transmission period is not problematic because of the different value of j; we thus choose k = −1.
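The triplet convention can be made concrete with a small helper; this is purely illustrative and not part of any published implementation of the model.

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    j: int  # attempt index 0..m+h; -1 for the successful transmission
    k: int  # 0..Wj during backoff; -1 collision/success; -2 AIFS period
    d: int  # remaining time of the current timed action, in slots

# Examples following the conventions above:
aifs_first_attempt = State(j=0, k=-2, d=3)   # in the AIFS period, 3 slots left
collision_stage_2  = State(j=2, k=-1, d=5)   # collision situation of attempt 2
success            = State(j=-1, k=-1, d=4)  # successful transmission ongoing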

2) Transition probabilities: Before defining the different pattern models forming the whole model, we must define the probabilities that will be associated with the transitions. At first we define the following basic probabilities:
• The probability of the medium becoming busy (pb) or staying idle (1 − pb).
• The probabilities related to the access attempt of ACi, whether competing or not with the other access categories within the station (leading in the first case to a virtual collision situation): pv_i is the probability for ACi not to go into a virtual collision when attempting to access, pwv_i is the probability for ACi to go into a virtual collision and win it, and plv_i is the probability for ACi to go into a virtual collision and lose it. Note that pv_i + pwv_i + plv_i = 1.
• The probability for ACi to suffer a real collision during its actual access to the medium (i.e., either ACi went into a virtual collision and won it or did not go into a virtual collision at all): pr_i. We have pr_i + p̄r_i = 1.
• The probability (after the loss of a virtual collision by ACi) for the AC winning the virtual collision (let ACk be it) to suffer a real collision: pr_k. We have pr_k + p̄r_k = 1.
• The probability of the random choice of the Backoff Counter (B_C[ACi]) within the contention window for the jth retransmission: 1/(Wj + 1).

Based on those basic probabilities, we define the probabilities characterizing the collision situation (see the sketch after this list):
• p(2)_i is the probability of an unsuccessful transmission attempt resulting in a ⌈Tc⌉-slot occupation of the medium, i.e., either ACi suffered a real collision, or ACi lost a virtual collision and ACk, winner of this virtual collision, suffers a real collision: p(2)_i = (pv_i + pwv_i) pr_i + plv_i pr_k.
• p(3)_i is the probability of an unsuccessful transmission attempt resulting in a ⌈Ts⌉-slot occupation of the medium, i.e., ACi loses a virtual collision and ACk, winner of the virtual collision, successfully transmits: p(3)_i = plv_i p̄r_k.
• p_i is the probability of a collision of ACi (a real collision or a lost virtual collision): p_i = p(2)_i + p(3)_i.
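These definitions translate directly into a minimal Python sketch (the variable names are ours, not from any library, and p̄r_k is written as 1 − pr_k):

def collision_probabilities(p_v, p_wv, p_lv, pr_i, pr_k):
    """p_v + p_wv + p_lv = 1; pr_i, pr_k are real-collision probabilities."""
    p2 = (p_v + p_wv) * pr_i + p_lv * pr_k  # unsuccessful attempt, Tc occupancy
    p3 = p_lv * (1.0 - pr_k)                # lost virtual collision, Ts occupancy
    p_i = p2 + p3                           # overall collision probability of ACi
    return p2, p3, p_i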

D. Models of the basic patterns

We first present the graphs of each model; we then indicate how to get the global model from these graphs. In each of the following models we represent the input and output states in bold line type and the internal states in normal line type. The states that do not belong to the presented pattern (which either lead to an input state of the pattern or are reached from an output state) are represented in dotted line type (note that those external states are necessarily output/input states of other patterns). All the transitions are labelled with the transition probabilities presented in section II-C2.

1) Pattern: AIFS procedure and outputs: The model is given in figure 2. The different states of the pattern are self-explanatory. We added to each of the transitions from the output state (j, −2, 1) a Predicate/Transition type label. The predicate is the value of the Backoff Counter (B_C[ACi]) that has been randomly chosen at the beginning of the AIFS procedure (see section II-B1). If B_C[ACi] = 0, there will be a transmission attempt at the end of the last slot of AIFS if the medium is still idle; the transmission attempt will either lead to a successful transmission (state (−1, −1, ⌈Ts⌉)) or to a collision (state (j, −1, ⌈Ts⌉) in case ACi loses a virtual collision and ACk, winner of the virtual collision, transmits


successfully, or state (j, −1, ⌈Tc⌉) in case ACi collides or in case it loses a virtual collision and ACk collides). If B_C[ACi] > 0, the chain transits into one of the states [(j, 0, 0), (j, 1, 0) . . . (j, Wj − 1, 0)], representing the beginning of the backoff procedure.

Fig. 2. AIFS procedure pattern: 0 ≤ j ≤ m + h. (Input state: start of the AIFS period; output state: last slot of AIFS; intermediate states model the medium occupancy period.)

2) Pattern: Backoff procedure and outputs: The model is given in figure 3. The input states of the model are [(j, 0, 0), (j, 1, 0) . . . (j, Wj − 1, 0)]. The transitions between these states represent the decrementing of the backoff counter while the medium is idle (probability 1 − pb). If the medium goes busy (probability pb), the decrementing is frozen during the medium occupancy and an AIFS period (represented by the subset of states above each counter decrementing state). From the output states ((j, 0, 0) or (j, 0, 1)), a transmission is attempted if the medium is idle. The transmission attempt leads into one of the states (−1, −1, ⌈Ts⌉), (j, −1, ⌈Ts⌉), (j, −1, ⌈Tc⌉) (as in section II-D1, case where B_C[ACi] = 0).

3) Pattern: Actual transmission attempt: The model is given in figure 4. The states (j, −2, 1), (j, 0, 1) and (j, 0, 0) are respectively the output states of the "AIFS procedure" model for the first one and of the "Backoff procedure" model for the two others. Those are the states leading to a transmission attempt and resulting in either a successful transmission (right part of the figure) or a collision (left part of the figure). In case of a collision, two different entry states are possible (both leading to state (j, −1, 1)), meaning two different medium occupancy times:
• states (j, −1, ⌈Ts⌉) for a ⌈Ts⌉ occupancy time, in case ACi lost a virtual collision and ACk, winner of the virtual collision, successfully transmits;
• states (j, −1, ⌈Tc⌉) for a ⌈Tc⌉ occupancy time, either in case ACi accesses the medium and collides or in case ACi loses a virtual collision and ACk, winner of the virtual collision, collides.

Fig. 4. Outcomes of an actual transmission attempt: 0 ≤ j ≤ m + h. (a) Collision; (b) successful transmission. After a success the chain moves to the first access attempt of a new packet (state (0, −2, A)); after a collision it moves to state (X, −2, A), with X = j + 1 if j < m + h and X = 0 if j = m + h.

Once the process is finished, it leads:
• in case of a successful transmission, to the consideration of a new packet: we thus go to its first access attempt (state (0, −2, A));
• in case of a collision, if the retry threshold has not been reached, to a new transmission attempt of the packet (state (j + 1, −2, A)); if the retry threshold has been reached, the packet is dropped and a new packet is taken into consideration (state (0, −2, A)).

E. Global model

The global model is obtained by connecting the models of the different access attempts following the guide of figure 1 (with j = 0, 1, 2 . . . m, . . . m + h).

III. AN ABSTRACT FORM OF THE MODEL

A. Need for the abstraction

The global model presented in the previous section details the behavior of an IEEE 802.11e EDCA AC. This model may be used in several different contexts: performance analysis of the access technique, or embedded behavior representation of the access technique (in the case of admission control). The use of the model as it is implies a complex calculation (solving a system of several equations with several unknowns). This makes the model unusable in an embedded case, where calculations and decisions need to be fast and with low computation time. There is a need for an abstraction of the model from a user's point of view, i.e., focusing on three important states (start of the first access attempt, successful transmission and packet drop) and on the transition probabilities and durations of these states. Such an abstraction can be obtained using the Beizer rules [5] on probabilistic and timed state graphs (a link between two states a and b is labelled with the transition probability Pab and the conditional sojourn time Tab). In the following sections we present the rules used in order to achieve the abstraction of the initial model; we then describe the process of abstraction and the final result.


Fig. 3. Backoff procedure pattern: 0 ≤ j ≤ m + h. (Input states from state (j, −2, 1); decrementing of B_C while the medium is idle, with medium occupancy and AIFS freeze states above each decrementing state; output states lead to (−1, −1, ⌈Ts⌉), (j, −1, ⌈Ts⌉) or (j, −1, ⌈Tc⌉).)

B. The Beizer Rules

In [5], [6], Beizer detailed several rules used to replace nodes in a probabilistic and timed state graph with links that are statistically equivalent to them. The rules correspond to the three situations which can occur: series links, parallel links and loops. The procedure used is iterative: it consists in choosing a node to replace, replacing it with the equivalent links (using the series link replacement rule), then combining the parallel links and finally removing loops.

1) The "series" rule: It consists in replacing a linear chain of links by one statistically identical link. In figure 5,

pij = pik × pkj and tij = tik + tkj

Fig. 5. Series reduction rule

2) The "parallel" rule: It consists in replacing several links between two nodes by one statistically identical link. In figure 6,

pij = Σ_{k=1..N} pk and tij = (Σ_{k=1..N} pk × tk) / (Σ_{k=1..N} pk)

Fig. 6. Parallel reduction rule

3) The "loops" rule: It consists in folding the self-loop of a node into the links exiting the looping node. In figure 7,

Pij = pij / (1 − pii) and Tij = tij + (tii × pii) / (1 − pii)

Fig. 7. Loop reduction rule
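The three rules translate directly into code. Below is a minimal Python sketch operating on (probability, sojourn time) pairs; it is our illustration of the rules, not Beizer's own formulation.

def series(p_ik, t_ik, p_kj, t_kj):
    """Series rule: replace the chain i -> k -> j by a single link i -> j."""
    return p_ik * p_kj, t_ik + t_kj

def parallel(links):
    """Parallel rule: merge several (p_k, t_k) links between i and j."""
    p = sum(pk for pk, tk in links)
    t = sum(pk * tk for pk, tk in links) / p
    return p, t

def unloop(p_ii, t_ii, p_ij, t_ij):
    """Loop rule: fold the self-loop at i into the outgoing link i -> j."""
    return p_ij / (1 - p_ii), t_ij + t_ii * p_ii / (1 - p_ii)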

4) Abstracting the AIFS pattern: Abstracting the AIFS pattern is done by applying the Beizer rules presented above at three different levels:
1) The first step consists of several sub-steps: applying the series rule between states (j, −2, N + A) and (j, −2, A),


leading to a single transition between both states with a duration N and a probability of 1; then applying the series rule between states (j, −2, A) and (j, −2, 1), giving a single transition with a duration (A − 1) and a transition probability (1 − pb)^(A−1); followed by alternating the series and parallel rules between states (j, −2, A) and (j, −2, N + A), giving the graph represented in figure 8 with the transition probability

P = pb + (1 − pb)pb + · · · + (1 − pb)^(A−2) pb = 1 − (1 − pb)^(A−1)

and the duration

T = [pb + 2(1 − pb)pb + · · · + (A − 1)(1 − pb)^(A−2) pb] / P = pb Σ_{l=1..A−1} l(1 − pb)^(l−1) / P

Fig. 8. First abstraction step of the AIFS pattern

2) The second step aims at abstracting state (j, −2, N + A); it consists in applying the series rule twice: once between states (j, −2, 1) and (j, −2, A) (through (j, −2, N + A)), and a second time revolving around (j, −2, A) (again through (j, −2, N + A)). The resulting graph is represented in figure 9.

Fig. 9. Second step of abstraction

3) The third step aims at reducing the loop around state (j, −2, A), giving the graph in figure 10 with

TAIFS = (A − 1) + (T + N)P / (1 − P).

Fig. 10. Abstract AIFS pattern
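Numerically, the three steps reduce to a handful of lines. The sketch below computes P, T and TAIFS for given pb, A and N; the values in the example call are arbitrary, and A > 1, 0 < pb < 1 is assumed so the sums and ratios are well defined.

def abstract_aifs(pb, A, N):
    P = 1 - (1 - pb) ** (A - 1)                                 # loop probability
    T = pb * sum(l * (1 - pb) ** (l - 1) for l in range(1, A)) / P
    T_aifs = (A - 1) + (T + N) * P / (1 - P)                    # total sojourn time
    return P, T, T_aifs

P, T, T_aifs = abstract_aifs(pb=0.1, A=3, N=10)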

5) Abstracting the other patterns: The process we detailed for the AIFS pattern can also be applied to the other patterns. We will not detail the abstraction procedure for each of them; we only give, hereafter, the result of the abstraction of each pattern.

Fig. 11. Abstract Backoff pattern

a) The backoff pattern: Figure 11 shows the backoff pattern after a first abstraction phase.

b) A transmission attempt: We then combine both abstract patterns with the actual transmission attempt pattern and apply the same rules, leading to the graph in figure 12. The different sojourn times and transition probabilities used in the graph are:
• TAIFS, given earlier, represents the sojourn time of the first AIFS period of each transmission attempt; it also takes into account the possible medium occupancy times during the AIFS procedure.
• TTj represents the mean duration of a backoff period preceding a successful transmission. Its value depends on the contention window size; we have:


Fig. 12. Transmission attempt

TTj = 1/(Wj + 1)
      + [ [1 + ((Wj − 1)/2)(1 + pb·X)] + 1 ] · Wj(1 − pb)/(Wj + 1)
      + [ [1 + ((Wj − 1)/2)(1 + pb·X)]
          + [(N + 1) + (A − 1) + (T + N)P/(1 − P)]
          + [(N + 1) + (A − 1) + (T + N)P/(1 − P)] · pb/(1 − pb)
          + 1 ] · Wj·pb/(Wj + 1)

with:

X = (N + A) + (T′ + N)P′/(1 − P′)
T′ = pb Σ_{l=1..A} l(1 − pb)^(l−1) / P′
P′ = 1 − (1 − pb)^A

• Tcj represents the mean duration of a backoff period preceding a collision, together with the duration of the collision itself. Its value depends on the contention window size. We have:

Tcj = [(TTj + ⌈Tc⌉)Pc1 + (TTj + ⌈Ts⌉)Pc2] / Pc

with Pc = Pc1 + Pc2, Pc1 = (1 − pb)p(2)_i and Pc2 = (1 − pb)p(3)_i.

• PTj represents the probability that a transmission attempt concludes with a successful transmission. We have:

PTj = (1 − pb)(1 − pi)

• Pcj represents the probability that a transmission attempt concludes with a collision. We have:

Pcj = (1 − pb)pi

Fig. 13. Intermediary model

6) The intermediary model: An intermediate abstract model is then built from the combination of several transmission attempt patterns, one for each backoff stage. This model is represented in figure 13. It represents the binary exponential backoff behavior. This model is given for one main reason: it provides a general pattern that corresponds to the behavioral scheme given earlier in figure 1, which allows the reader to have a better view of the general form of the model and of the place of each abstracted pattern in the general view.

In figure 13, states (j, −2, A) (with 1 ≤ j ≤ m + h) represent the states where an AIFS period is required following a collision. Those are followed by a transmission attempt. State (0, −2, A) represents the beginning of the first transmission attempt of a packet. States (j, −2, 1) (with 0 ≤ j ≤ m + h) represent the different backoff steps. Those states are usually followed either by a collision (transition to state (j + 1, −2, A)) or by a successful transmission (state (−1, −1, ⌈Ts⌉)). If backoff state (m + h, −2, 1) is followed by a collision, the packet is dropped. We introduced a virtual state, (m + h, −1, 0), representing the drop situation. The transitions from states (−1, −1, ⌈Ts⌉) and (m + h, −1, 0) to state (0, −2, A)


represent the decision of the access function to start the transmission procedure of a new packet from the queue, following respectively a successful transmission or a drop.

Fig. 14. Abstract model of EDCA

Figure 14 represents the final abstract model, obtained from the intermediary model in figure 13. The behavior of EDCA is reduced to the three states relevant from the point of view of a user wishing to know the outcome of an attempt to transmit a packet: the access attempt state {0, −2, A}, called state 1; the successful transmission state {−1, −1, ⌈Ts⌉}, called state 2; and the packet drop state {m + h, −1, 0}, called state 3. The probabilities PD and PT respectively represent the probability for a packet to be dropped or successfully transmitted:

PD = pi^(m+h+1)

A packet is dropped if it suffers a collision at each of the (m + h + 1) transmission attempts.

PT = 1 − pi^(m+h+1)

which is obviously the complement of PD. TD and TT respectively represent the duration from the beginning of the access attempt of a packet until it is either dropped or successfully transmitted.

TD = Σ_{j=0..m+h} (TAIFS + (TAIFS + N + 1)pb/(1 − pb) + Tcj)

TD is the sum of all the collision durations (m + h + 1 collisions in total), to which are added all the AIFS periods and the possible busy-medium periods.

TT = [(1 − pi)/(1 − pi^(m+h+1))] Σ_{j=0..m+h} pi^j [ (TAIFS + (TAIFS + N + 1)pb/(1 − pb) + TTj) + Σ_{l=0..j−1} (TAIFS + (TAIFS + N + 1)pb/(1 − pb) + Tcl) ]

TT integrates, for each value of j (0 ≤ j ≤ m + h), two terms: the duration of a successful transmission at the jth attempt and the duration of the collisions that preceded it.
From the transition probability matrix of the graph of figure 14, we get the equilibrium state probabilities of states 1, 2 and 3: Π1 = 1/2; Π2 = Π1·PT; Π3 = Π1·PD.

7) Derived performances: From this graph, we can obtain the following performance metrics, essential from the user's point of view:

a) The throughput:

Throughput_i = PT·⌈Ts⌉ / (PT·TT + PD·TD + PT·⌈Ts⌉)

b) The mean access delay: The mean access delay of a packet is the mean time between the moment it first comes into consideration and its successful transmission, which is equivalent to TT.

c) The packet drop probability: Similarly, the packet drop probability is PD.
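Putting the pieces together, these user-level metrics follow mechanically from pi, pb and the per-stage durations. The Python sketch below is our reading of the formulas above; in particular the TT expression was reconstructed from the abstract model and should be checked against the original derivation before reuse.

def performance(pi, pb, T_aifs, N, TT_stage, Tc_stage, Ts):
    """TT_stage, Tc_stage: per-stage mean durations, lists of length m + h + 1."""
    R = len(TT_stage)                              # m + h + 1 attempts in total
    PD = pi ** R                                   # drop: collision at every attempt
    PT = 1 - PD
    w = T_aifs + (T_aifs + N + 1) * pb / (1 - pb)  # AIFS plus busy-medium waits
    TD = sum(w + Tc_stage[j] for j in range(R))
    TT = (1 - pi) / (1 - pi ** R) * sum(
        pi ** j * (w + TT_stage[j] + sum(w + Tc_stage[l] for l in range(j)))
        for j in range(R))
    throughput = PT * Ts / (PT * TT + PD * TD + PT * Ts)
    return PD, PT, TD, TT, throughput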

IV. AN ADMISSION CONTROL ALGORITHM FOR IEEE 802.11E

We give an overview of the functioning of a hybrid admission control algorithm we designed for IEEE 802.11e. The algorithm is hybrid because it uses both model-based metrics and measurements made on the medium to establish a vision of the state of the medium, and thus an admission decision. Our algorithm uses the throughput metric derived from the abstract model along with analytical formulas derived from the general model. The algorithm is called when a new flow request arrives at the Access Point's admission controller. Two conditions must hold in order for the algorithm to accept the arriving flow: first, the arriving flow must be able to achieve its request in terms of throughput; second, the admission of the new flow must not degrade the quality of service of already admitted flows. The algorithm bases its admission decision on two parameters:
• estimations of what each flow's collision rate and the medium busy rate would be if the newly arriving flow were admitted (these estimations are made based on actual measurements, as explained in the next section);
• the maximum achievable throughput of each flow, under the previously estimated collision and medium-busy conditions, calculated using the Markov chain model of an access category presented earlier.

Algorithm 1 Admission control using the synthetic model

for each Update_Period do
    Update_Busy_Probability
    Update_Collision_Probabilities
    if New_Flows ≠ ∅ then
        Fi = Get_New_Flow
        Calculate_Achievable_Throughput(Admitted_Flows ∪ Fi)
        if Check_Throughput(Admitted_Flows ∪ Fi) then
            Admit(Fi)
            Admitted_Flows = Admitted_Flows ∪ Fi
        else
            Refuse(Fi)
        end if
    end if
end for

We define Fi as the flow requesting admission; New_Flows is the set of all newly arriving flows; Admitted_Flows is the set of active flows. Update_Busy_Probability and Update_Collision_Probabilities are the procedures giving the admission controller the information it needs on both probabilities (by direct measurement for the busy probability, and by piggybacking from the different flows for their collision probabilities). The procedure Calculate_Achievable_Throughput(SetOfFlows) calculates, for each flow (all of the active flows and one newly arriving flow), its maximum achievable throughput in the estimated network conditions (i.e., for a given flow, its throughput if saturated, given the estimated busy probability and the estimated collision probability). Procedure Check_Throughput(SetOfFlows) returns true if, for each of the flows in the set, its achievable throughput is greater than its request: Calculated_Achievable_Throughput(F) > Requested_Bandwidth(F). The algorithm is detailed in Algorithm 1.
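In code form, one update period of Algorithm 1 could look like the following Python sketch. The two callables stand in for the model-based throughput evaluation and the per-flow requests; all names are placeholders of our own, not a real API.

def admission_step(new_flows, admitted, achievable_throughput, requested_bw):
    """achievable_throughput(flows) -> dict mapping each flow to the
    saturation throughput predicted by the Markov chain model under the
    estimated busy and collision probabilities; requested_bw is a dict."""
    for flow in new_flows:
        candidate = admitted + [flow]
        th = achievable_throughput(candidate)
        # Admit only if every flow, old and new, still meets its request.
        if all(th[f] > requested_bw[f] for f in candidate):
            admitted.append(flow)     # Admit(Fi)
        # else: Refuse(Fi); the admitted set is left unchanged
    return admitted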

A. Estimating the probabilities

In the decision-making process, the values of the busy probability pb and of each AC's collision probability are needed. The busy probability can be directly measured by the Access Point. The collision probabilities are calculated by the stations and communicated periodically to the access point by means of piggybacking or management packets (in fact each station communicates, for each AC, a count of access attempts and of collisions). Since the measurements are made in the actual context of the medium (i.e., with only the already admitted flows active, and not those requesting admission), the achievable throughput calculation would not be correct as is. Thus, an additional estimation process was added which, based on the actual measurements and on the specification of the flow requesting to access the medium, estimates the values the collision probability and busy probability would take were the requesting flow admitted.

Let Fi be the flow whose admission is being examined; Fi will be using access category ACi in station s. We define τi as the probability for ACi to access the medium in a free slot, and Γs as the probability for station s to access the medium. Among the access categories of a station, only one can access the medium in a given time slot (the others are either inactive, in the backoff procedure, or have lost a virtual collision); we can therefore write Γs = Σ_{i=0..3} τi.
We define pb as the probability of the medium becoming busy. Neglecting the causes of medium busyness other than station accesses, we can write

pb = 1 − (1 − Γ1)(1 − Γ2) . . . (1 − ΓM) = 1 − Π_{j=1..M} (1 − Γj)

M being the number of stations sharing the medium.
pir is the probability for ACi to suffer a real collision when accessing the medium; we can write pir as follows:

pir = τi (1 − (1 − Γ1) . . . (1 − Γs−1)(1 − Γs+1) . . . (1 − ΓM)) = τi (1 − Π_{j≠s} (1 − Γj))
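These two expressions are straightforward to compute; a minimal sketch follows (Python 3.8+ for math.prod; gammas is the list of the Γj, and s indexes Fi's station).

from math import prod

def busy_probability(gammas):
    """pb = 1 - prod over all stations j of (1 - Gamma_j)."""
    return 1 - prod(1 - g for g in gammas)

def real_collision_probability(tau_i, gammas, s):
    """pir = tau_i * (1 - prod over j != s of (1 - Gamma_j))."""
    return tau_i * (1 - prod(1 - g for j, g in enumerate(gammas) if j != s))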

In order to better understand the following, note that all values indexed old are measured values (either measured directly by the access point, or measured by the stations and communicated to the access point). The values indexed new are estimated values (estimates of what the value would be if the requesting flow were active).
In the case of the collision probability, we estimate the effect of introducing Fi on the real collisions occurring on the medium. Since we consider the admission of one flow at a time, we suppose that the access activity of Fi's station would be the only one to change. Let pir_new and τi_new be the estimated real collision probability of ACi and its estimated access probability if Fi were to be accepted, and pir_old and τi_old the actual real collision and access probabilities. We have:

pir_new − pir_old = (τi_new − τi_old)(1 − Π_{j≠s} (1 − Γj_old))

Let ∆τ be the difference introduced by Fi to the access category's access probability, should Fi be accepted. We have:

pir_new = ∆τ (1 − Π_{j≠s} (1 − Γj_old)) + pir_old

This estimate will be considered as the estimation of what ACi's real collision probability would be if Fi were to be admitted. In the equation above, the access activities of the stations are communicated to the HC along with the information on the different active flows.
In the same fashion as above, we define pb_new as the estimated busy probability if Fi were to be accepted and pb_old the actual busy probability. Since we consider the admission of one flow at a time, we suppose that the access activity of ACi would be the only one to change. Hence:

(1 − pb_new)/(1 − pb_old) = [(1 − Γ1_old) . . . (1 − Γi_new) . . . (1 − ΓM_old)] / [(1 − Γ1_old) . . . (1 − Γi_old) . . . (1 − ΓM_old)] = (1 − Γi_new)/(1 − Γi_old)

Following the same reasoning as for the estimation of the real collision probability, we have:

(1 − pb_new)/(1 − pb_old) = (1 − Γi_new)/(1 − Γi_old) = (1 − Γi_old − ∆τ)/(1 − Γi_old)

1 − pb_new = (1 − pb_old)(1 − Γi_old − ∆τ)/(1 − Γi_old)

pb_new = 1 − (1 − pb_old)(1 − ∆τ/(1 − Γi_old))

The only unknown in both estimations is ∆τ. ∆τ represents the additional accesses introduced by the new flow, i.e., the additional transmissions and possible retransmissions introduced by the flow. Considering only one possible collision per transmitted packet, we use the following to estimate ∆τ: ∆τ = (1 + pir)δ, δ being the number of accesses introduced by the flow (i.e., the number of packets to be sent during the update period). Both estimations will be used in the calculation of the achievable throughput during the admission decision process.
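The estimation step then reduces to a few arithmetic operations. The sketch below mirrors the equations above; others_term is the measured quantity 1 − Π_{j≠s}(1 − Γj_old), and delta is δ, the packet count of the new flow per update period (variable names are ours).

def estimate_if_admitted(pir_old, pb_old, gamma_i_old, others_term, delta):
    d_tau = (1 + pir_old) * delta            # one possible collision per packet
    pir_new = pir_old + d_tau * others_term  # estimated real collision probability
    pb_new = 1 - (1 - pb_old) * (1 - d_tau / (1 - gamma_i_old))
    return pir_new, pb_new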


TABLE I
SPECIFYING THE PRESENTED SCENARIOS

Scenario     Packet Size (Bytes)   Interarrival (s)   Bandwidth (Mbps)
Scenario 1   600                   0.002              2.4
Scenario 2   800                   0.004              1.6
Scenario 3   600                   0.004              1.2

B. Enhancing the algorithm

Simulations have shown that the estimates we make of the collision probabilities and of the busy probability, although going in the correct direction, are not exact. This is mainly due to the fact that we assume in our estimations that the new flow will only affect the collision rate of its own access queue; however, it is clear that all other access queues will be affected by the new flow. As a consequence, in periods of high medium occupancy, the admission control algorithm takes, in some cases, wrong admission decisions. We thus propose to enhance the decision process by correcting the estimates made of the different probabilities with the help of a feedback correction system. We introduce a simple history-less feedback correction of the busy probability estimate, where we add to each estimate the error made on the previous one. Let pbe_k be the kth estimated value of pb (using the original estimation process), pbm_k the kth measured value of pb, and pb_new_k the new corrected estimate of pb. The correction works as follows: pb_new_k = pbe_k + (pbm_k−1 − pbe_k−1).
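In code, the correction is a one-liner applied at each update period (a sketch with our own variable names):

def corrected_busy_estimate(pbe_k, pbm_prev, pbe_prev):
    # Add to the current raw estimate the error made on the previous one.
    return pbe_k + (pbm_prev - pbe_prev)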

V. ANALYZING THE ALGORITHM

We present in this section the analysis we made of the admission control algorithm along with its enhancement. The analysis is made by means of simulation using the network simulator (ns-2) [7] and is done in comparison to the main hybrid admission control algorithm for EDCA in the literature [8]. We use the EDCA module contributed by the Telecommunication Networks Group of the Technical University of Berlin [9]. The EDCA module was modified in order to integrate the admission control we propose along with the enhancement. For each scenario we present, 10 simulations with different random number generator seeds were executed. The results we present in this section are sample means. In each simulation, a number of flows are periodically activated, thus seeking admission to the network through the admission control algorithm (or through the enhanced admission control algorithm). The metrics used for the analysis are the following:

• The total throughput of all the flows in a specific scenario using the algorithm with or without the enhancement, or using Pong et al.'s algorithm [8].

• The mean throughput of a flow in a specific scenario using the algorithm with or without the enhancement, or using Pong et al.'s algorithm.

• The cumulative distribution function of the delays of all data packets.

Different execution scenarios were tested; we present in the following the results of several representative scenarios. In scenarios 1, 2, and 3 the channel is considered error free,

[Figure: throughput (bps) vs. time (s), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 15. Mean throughput of flows, scenario 1

[Figure: throughput (bps) vs. time (s), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 16. Total throughput of flows, scenario 1

no hidden terminals are present, and the stations operate at 11 Mbps. One station operates as the Access Point and executes the admission control algorithm. Within the other stations, CBR flows with the traffic specifications described in Table I are periodically activated, thus requesting access from the admission controller. The results of those simulations are presented in figures 15-23.

A. Analysis

Scenarios 1, 2 and 3 presented here are representative of the different behaviors encountered for the different simulation scenarios tested. Note that the bad admission decisions of the bare admission control algorithm are not generalized. The algorithm works well (as can be seen when compared to the performance of Pong et al.'s algorithm, which is each time too pessimistic) but makes in some cases bad admission decisions, hence the introduction of the proposed modification. It can be clearly seen in the following that the proposed modification corrects the problems of the bare algorithm.

d) Scenario 1: The results are presented in figures 15, 16, 21. Scenario 1 is a case where no bad admission decisions


[Figure: throughput (bps) vs. time (s), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 17. Mean throughput of flows, scenario 2

[Figure: throughput (bps) vs. time (s), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 18. Total throughput of flows, scenario 2

are made by the admission control algorithm. Pong et al.'s algorithm refuses too many flows. It can also be seen that the proposed enhancement did not degrade the service offered by the admission control algorithm. The enhancement admitted the ideal number of flows: maximizing the utilization of the medium without degrading the service offered to the active flows (fig. 15-16). As we said earlier, the main aim of the enhancement is to make admission decisions stricter in order to avoid bad admission decisions. Here, no bad decisions were taken, neither by the bare admission control algorithm nor by the enhanced algorithm: the mean throughput per flow respects each flow's request and the delays are minimal (fig. 21).

e) Scenario 2: The results are presented in figures 17, 18, 22. In scenario 2, the original algorithm admits one flow too many. Pong et al.'s algorithm is once again pessimistic. The enhancement corrects our algorithm's flaw. This results in a better mean throughput per flow (fig. 17) (better in the sense that it respects the admitted flows' requests) and a better distribution of delays (fig. 22) (with the enhancement, about 90% of the

[Figure: throughput (bps) vs. time (s), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 19. Mean throughput of flows, scenario 3

[Figure: throughput (bps) vs. time (s), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 20. Total throughput of flows, scenario 3

packets have delays of less than 10 ms, whereas this is the case for only 37% of the packets in the scenario without the enhancement). The flow that was admitted in excess by the bare algorithm causes unexpected additional collisions, which in turn degrade the service provided.

f) Scenario 3: The results are presented in figures 19, 20, 23. The same analysis can be made here, with a smaller requested bandwidth per flow being studied. The estimation correction gives the admission control a better view of the medium's state and improves the decision process.

VI. CONCLUSIONS

In this paper we presented a Markov chain model of EDCA [10]. An abstract model [11] was derived from the three-dimensional Markov chain model of EDCA. The abstraction was done based on the Beizer rules of reduction [5]. The obtained model is simple and easy to use. This abstract model was used in order to establish a performance analysis of EDCA in different network conditions and different configurations. The abstract model (and the intermediary abstraction state we presented) were used [12] to study different modifications


[Figure: CDF vs. delay (ms), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 21. CDF of delays, scenario 1

[Figure: CDF vs. delay (ms), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 22. CDF of delays, scenario 2

[Figure: CDF vs. delay (ms), comparing 1) admission control with no enhancements, 2) estimation correction, 3) Pong et al.'s algorithm]

Fig. 23. CDF of delays, scenario 3

of the post-collision behavior of EDCA already presented in [13]. We also present in this paper a hybrid admission control algorithm [14], [15] for EDCA. The algorithm was thoroughly studied and adjusted in order to achieve the best possible throughput for flows using an EDCA-based network. A future evolution of the model is to consider the non-saturation regime; this will allow us to use it in other online applications. The admission control algorithm was implemented on an experimental platform and will be studied using it.

REFERENCES

[1] G. Bianchi, "Performance analysis of IEEE 802.11 distributed coordination function," IEEE Journal on Selected Areas in Communications, vol. 18, no. 3, pp. 535–547, March 2000.

[2] Z. Kong, D. H. K. Tsang, B. Bensaou, and D. Gao, "Performance analysis of IEEE 802.11e contention-based channel access," IEEE Journal on Selected Areas in Communications, vol. 22, no. 10, pp. 2095–2106, December 2004.

[3] H. Zhu and I. Chlamtac, "Performance analysis for IEEE 802.11e EDCF service differentiation," IEEE Transactions on Wireless Communications, vol. 4, no. 4, pp. 1779–1788, July 2005.

[4] 802.11e, IEEE Standard for Telecommunications and Information Exchange between Systems – LAN/MAN Specific Requirements – Part 11: Wireless LAN MAC and PHY Specifications – Amendment 8: Medium Access Control QoS Enhancements, November 2005.

[5] B. Beizer, The Architecture and Engineering of Digital Computer Complexes. New York: Plenum Press, 1971.

[6] Y. Atamna and G. Juanole, "Methodology for obtaining abstract views of state graphs labeled with probabilities and times. An example of application to a communication protocol," Proceedings of the Third International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS '95, pp. 299–306, March 1995.

[7] ns2, "The network simulator - ns-2," 2005. [Online]. Available: http://www.isi.edu/nsnam/ns/

[8] D. Pong and T. Moors, "Call admission control for IEEE 802.11 contention access mechanism," IEEE Global Telecommunications Conference, GLOBECOM '03, vol. 1, pp. 174–178, December 2003.

[9] TKN-TUB, "An IEEE 802.11e EDCA and CFB simulation model for ns-2," 2007. [Online]. Available: http://www.tkn.tu-berlin.de/research/802.11e_ns2/

[10] M. E. Masri, G. Juanole, and S. Abdellatif, "Revisiting the Markov chain model of IEEE 802.11e EDCA and introducing the virtual collision phenomenon," to appear in the International Conference on Wireless Information Networks and Systems, WINSYS '07, July 2007.

[11] ——, "A synthetic model of IEEE 802.11e EDCA," International Conference on Latest Advances in Networks, ICLAN '07, December 2007.

[12] M. El Masri and S. Abdellatif, "Managing the virtual collision in IEEE 802.11e EDCA," 8th IFAC International Conference on Fieldbuses and Networks in Industrial and Embedded Systems, FET 2009, May 2009.

[13] M. E. Masri, "IEEE 802.11e: the problem of the virtual collision management within EDCA," Proceedings of IEEE Infocom Student Workshop '06, April 2006.

[14] M. E. Masri, G. Juanole, and S. Abdellatif, "Hybrid admission control algorithm for IEEE 802.11e EDCA: analysis," International Conference on Networks, ICN '08, April 2008.

[15] ——, "On enhancing a hybrid admission control algorithm for IEEE 802.11e EDCA," IFIP International Conference on Personal Wireless Communications, IFIP PWC '08, October 2008.


Training Process Reduction Based on Potential Weights Linear Analysis to Accelerate Back Propagation Network

Roya Asadi¹, Norwati Mustapha², Nasir Sulaiman³

¹,²,³Faculty of Computer Science and Information Technology, University Putra Malaysia, 43400 Serdang, Selangor, Malaysia.
¹[email protected], ²,³{norwati, nasir}@fsktm.upm.edu.my

Abstract—Learning is the important property of the Back Propagation Network (BPN): finding suitable weights and thresholds during training in order to improve training time as well as achieve high accuracy. Currently, data pre-processing such as dimension reduction of input values and pre-training are the contributing factors in developing efficient techniques for reducing training time with high accuracy, while initialization of the weights is an important issue that is done randomly, creates a paradox, and leads to low accuracy with high training time. One good data preprocessing technique for accelerating BPN classification is the dimension reduction technique, but it has the problem of missing data. In this paper, we study current pre-training techniques and a new preprocessing technique called Potential Weights Linear Analysis (PWLA), which combines normalization, dimension reduction of input values, and pre-training. In PWLA, data preprocessing is first performed to generate normalized input values, which are then used by the pre-training technique in order to obtain the potential weights. After these phases, the dimension of the input values matrix is reduced by using the real potential weights. For the experimental results, the XOR problem and three datasets, namely SPECT Heart, SPECTF Heart, and Liver Disorders (BUPA), are evaluated. Our results show that the new PWLA technique changes BPN into a new Supervised Multi Layer Feed Forward Neural Network (SMFFNN) model with high accuracy in one epoch, without a training cycle. PWLA will also gain the power of nonlinear supervised and unsupervised dimension reduction for application to other supervised multilayer feed-forward neural network models in future work.

Keywords-Preprocessing; Dimension reduction; Pre-training; Supervised Multi-layer Feed Forward Neural Network (SMFFNN); Training; Epoch

I. INTRODUCTION

The back propagation network [35] uses a nonlinear supervised learning algorithm, which uses data to adjust the network's weights and thresholds so as to minimize the error in its predictions on the training set. Training of BPN is considerably slow because biases and weights have to be updated in the hidden layers in each epoch of learning by weighted functions [26]. In the Supervised Multi Layer Neural Network (SMNN) model, suitable data pre-processing techniques are necessary to find input values, while pre-training techniques find desirable weights that in turn reduce the training process. Without preprocessing, the classification process will be very slow and may not even complete [12]. Currently, data preprocessing, especially dimension reduction, and pre-training are the main ideas in developing efficient techniques for fast BPN, high accuracy, and a reduced training process [34, 14], but the problems of finding suitable input values, weights, and thresholds without using any random values still remain [5, 1, 18, 12]. The current pre-training techniques generate suitable weights for reducing the training process but apply random values for the initial weights [34, 4], and this creates a paradox [39, 7].

This paper is organized as follows. We first discuss and survey related work on preprocessing techniques used in BPN and compare it with the new PWLA technique. The new PWLA preprocessing technique, a combination of normalization, dimension reduction, and pre-training in BPN, results in worthy input values, a desirable process, and higher performance in both speed and accuracy. Finally, the experimental results and the conclusion with future work are reported respectively.

II. EXISTING TECHNIQUES FOR PREPROCESSING

Pre-processing in real-world environments focuses on data transformation, data reduction, and pre-training. Data transformation and normalization are two important aspects of pre-processing. Transformation codifies the values of each row of the input values matrix and changes them into a single datum, often using algebraic or statistical formulas. Data normalization transforms an input value into another suitable value by distributing and scaling. Data reduction techniques, such as dimension reduction and data compression, are applied while minimizing the loss of information content. SMNN models such as the Back Propagation Network (BPN) are able to identify the information of each input based on its weight, hence increasing the processing speed [34]. Pre-training techniques reduce the training process through the preparation of suitable weights. Finding an efficient data pre-processing technique for a fast back-propagation network with high accuracy is an active area of research [5, 1, 18, 12]. Currently, data pre-processing and pre-training are the contributing factors in developing efficient techniques for fast SMNN processing at high accuracy and reduced training time [34, 14].

A. Data Preprocessing

Several powerful data pre-processing functions, ordered by how they improve efficiency, will be discussed as the latest methods in this study. These are mathematical and statistical functions to scale, filter, and pre-process the data. Changing the input data or the initial conditions can immediately affect the classification accuracy of a back-propagation network [5].

1) MinMax as preprocessing:

Neal et al. reported on predicting the gold market, including an experiment on scaling the input data [28]. The MinMax technique is used in BPN to transform and scale the input values to between 0 and 1 if the activation function is the standard sigmoid, and to between -1 and 1 for accelerating the process [12, 31]. The technique of using Log(input value) is similar to MinMax for the range [0..1). Another similar method is Sin(Radian(input value)) between -π and π, where Radian(input value) lies between 0 and π [5]. The disadvantage of the MinMax technique is the lack of one special and unique class for each datum [23].
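As a minimal illustration (our own Python sketch, not taken from the cited works), MinMax scaling of a column of input values to [0, 1] or [-1, 1] can be written as:

def minmax_scale(values, lo=0.0, hi=1.0):
    """Scale a list of numbers linearly into [lo, hi].
    Use lo=-1.0, hi=1.0 for the symmetric variant mentioned above."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                      # constant column: map to the middle
        return [(lo + hi) / 2.0] * len(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]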

2) Dimension data reduction:

Dimension reduction methods project a high-dimensional data matrix onto a lower-dimensional sub-matrix for effective data preprocessing. There are two types of reduction: supervised and unsupervised dimension reduction. The type of reduction is based on the relationship of the dimension reduction to the dataset itself or to an integrated known feature. In supervised dimension reduction, a suitable sub-matrix is selected based on scores, prediction accuracy, selection of the number of necessary attributes, and computation of the weights with a supervised classification model. Unsupervised dimension reduction maps the high-dimension matrix to a lower dimension and creates a new low-dimension matrix considering just the data points. Dimension reduction techniques are also divided into linear and nonlinear methods, based on the relations considered between parameters. In the real world, data is nonlinear; hence only nonlinear techniques are able to handle it. Linear techniques consider a linear subset of the high-dimensional space, while nonlinear techniques assume a more complex subset of the high-dimensional space [34, 14]. We consider Principal Component Analysis (PCA) [17] because of its properties, which will be explained in the pre-training section, and because it is known as the best dimension reduction technique to date [34, 32, 3, 24]. PCA is a classical multivariate data analysis method that is useful in linear feature extraction and data compression [34, 32, 3, 24]. If the dimension of the input vectors is large, the components of the vectors are highly correlated (redundant); in this situation it is useful to reduce the dimension of the input vectors. The assumption is that most of the information for classifying a high-dimensional matrix lies in the directions of large variation, so PCA is often computed by maximizing the variance of a standardized linear projection. The disadvantage of PCA is that it is not able to find nonlinear relationships within the input values; therefore such data will be lost. Linear Discriminant Analysis (LDA) is a dimension reduction technique based on the solution of the eigenvalue problem on the product of scatter matrices [9]. LDA maximizes the ratio of between-class distance to within-class distance and sometimes suffers from singularity. LDA has three major variants addressing this singularity problem: regularized LDA, PCA+LDA [2], and LDA/GSVD [36, 16]. These techniques use Singular Value Decomposition (SVD) [19] or Generalized Singular Value Decomposition (GSVD). QR is a dimension reduction technique for solving standard eigenproblems on a general matrix [11]. QR reduces the matrix to quasi-triangular form by unitary similarity transformation. The time complexity of QR is much smaller than that of the SVD method [33]. Another technique is LDA/QR, which maximizes the separability between different classes by using QR decomposition [37]. The main disadvantage of PCA and the other dimension reduction techniques is missing input values.
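For reference, a minimal PCA projection looks as follows (our own NumPy sketch; the number of retained components k is an assumption of the caller):

import numpy as np

def pca_reduce(X, k):
    """Project the n-by-m data matrix X onto its first k principal components."""
    Xc = X - X.mean(axis=0)                 # center each attribute (column)
    cov = np.cov(Xc, rowvar=False)          # m-by-m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # k strongest directions
    return Xc @ top                          # n-by-k reduced matrix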

B. Pre-training

Initialization of weights is the first critical step in training back-propagation networks (BPN). Training of BPN can be accelerated through a good initialization of the weights during pre-training. To date, random numbers are used to initialize the weights [34, 4]. The number of epochs in the training process depends on the initial weights. In BPN, correct weights result in successful training; otherwise, BPN may not obtain results in an acceptable range and may halt during training. Training of BPN includes an activation function in the hidden layers for computing weights. Usually, the initialization of the weights in the pre-training of BPN is random. Most papers do not report an evaluation of speed and accuracy, only some comments about the initialization of weights and the network topology, such as the number of layers and an unknown number of practical nodes, if any. In turn, processing time depends on the initial values of the weights and biases, the learning rate, as well as the network topology [39, 7]. In the following sections, the latest pre-training methods for BPN are discussed.

1) MinMax:

There are several initial weight methods that are still current, such as MinMax [39, 7]. In the MinMax methods, the initial weights are taken from a domain (-a, +a) which is computed experimentally. SBPN is initialized with random weights in the domain [-0.05, 0.05]. There is an idea that initializing with large weights is important [15]: the input values are classified into three groups, where the weights of the most important inputs are initialized in [0.5, 1], the least important in [0, 0.5], and the rest in [0, 1]. The first two groups each contain about one quarter of the total number of input values and the other group about one half. Another good idea was to initialize the weight range in the domain [−0.77, 0.77] with a fixed variance of 0.2, which obtained the best mean performance for multilayer perceptrons with one hidden layer [20]. The disadvantage of the MinMax method is its use of random numbers for initialization, which creates critical problems in training.
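A minimal sketch of such range-based random initialization (our own Python illustration; the default range follows the [−0.77, 0.77] domain of [20]):

import random

def minmax_init(n_in, n_out, a=0.77):
    """Initialize an n_in-by-n_out weight matrix uniformly in (-a, +a)."""
    return [[random.uniform(-a, a) for _ in range(n_out)] for _ in range(n_in)]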

2) SCAWI:

The method called Statistically Controlled Activation Weight Initialization (SCAWI) was introduced in [6]. The authors used the notion of paralyzed neuron percentage (PNP), based on testing how many times a neuron is in a saturated situation with acceptable error. The weights are initialized by the formula

W_ij^input = 1.3 / (1 + N_input · V²)^(1/2) · r_ij,

where V is the mean squared value of the inputs and r_ij is a random number uniformly distributed in the range [-1, +1]. This method was improved to

W_ij^hidden = 1.3 / (1 + 0.3 · N_hidden)^(1/2) · r_ij

for obtaining better results [8, 10]. The disadvantage of the SCAWI method is its use of random numbers in the formula; in this respect it is similar to the MinMax method, which has critical problems in training.
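A sketch of these two initialization rules as we read them (our own Python transcription; variable names are ours):

import math
import random

def scawi_input_weight(n_inputs, v):
    # W_ij(input) = 1.3 / sqrt(1 + N_input * V^2) * r_ij, with r_ij ~ U[-1, 1]
    return 1.3 / math.sqrt(1.0 + n_inputs * v ** 2) * random.uniform(-1.0, 1.0)

def scawi_hidden_weight(n_hidden):
    # Improved rule: W_ij(hidden) = 1.3 / sqrt(1 + 0.3 * N_hidden) * r_ij
    return 1.3 / math.sqrt(1.0 + 0.3 * n_hidden) * random.uniform(-1.0, 1.0)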

3) Multilayer auto-encoder networks as pre-training:

Codification is one of the five task types in neural network applications [13]. Multilayer encoders are feed-forward neural networks with an odd number of hidden layers [34, 4]. The feed-forward neural network is trained to minimize the mean squared error between the input and the output by using the sigmoid function. A high-dimensional matrix can be reduced to a low-dimensional one by extracting the node values in the middle hidden layer. In addition, auto-encoder/auto-associative neural networks are neural networks trained to recall their inputs. When the neural network uses linear activation functions, the auto-encoder behaves very similarly to PCA [22]. Sigmoid activation allows the auto-encoder network to learn a nonlinear mapping between the high-dimensional and low-dimensional data matrices. After the pre-training phase, the model is "unfolded" into encoding and decoding parts that initially use the same weights. BPN can be used in a global fine-tuning phase through the whole auto-encoder to fine-tune the weights for optimization. With a high number of multilayer auto-encoder connections, BPN approaches are quite slow. The auto-encoder network is unrolled and fine-tuned by a supervised BPN model in the standard way. It has been evident since the 1980s that BPN works for deep auto-encoders provided the initial weights are close enough to a good solution. The main disadvantage of this method is that, due to the high number of multilayer auto-encoder connections in the BPN training process, performance is slow.
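To make the idea concrete, here is a bare-bones one-hidden-layer auto-encoder in NumPy (our own sketch, not the deep variant discussed above; it assumes X is already scaled to [0, 1]). The middle-layer activations serve as the reduced representation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer auto-encoder on X (n samples x m features)
    by gradient descent on the mean squared reconstruction error,
    and return the low-dimensional codes from the middle layer."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W1 = rng.uniform(-0.5, 0.5, (m, n_hidden))   # encoder weights
    W2 = rng.uniform(-0.5, 0.5, (n_hidden, m))   # decoder weights
    for _ in range(epochs):
        H = sigmoid(X @ W1)            # codes in the middle layer
        Y = sigmoid(H @ W2)            # reconstruction of the input
        E = Y - X                      # reconstruction error
        dY = E * Y * (1 - Y)           # back-prop through the output sigmoid
        dH = (dY @ W2.T) * H * (1 - H) # back-prop through the hidden sigmoid
        W2 -= lr * (H.T @ dY) / n
        W1 -= lr * (X.T @ dH) / n
    return sigmoid(X @ W1)             # dimension-reduced representation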

III. POTENTIAL WEIGHTS LINEAR ANALYSIS (PWLA)

In this study, we compare current BPN preprocessing techniques with the new technique of Potential Weights Linear Analysis (PWLA). PWLA reduces the training process of supervised multilayer neural network models. SMNN models such as BPN change to new SMFFNN models by using real weights [30]. PWLA recognizes high deviations of the input values matrix from the global mean, similarly to PCA, and uses the meaning of the vector torque formula for the new SMFFNN. These deviations give higher scores to their values. For data analysis, PWLA first normalizes the input values as data preprocessing, then uses the normalized values for pre-training, and finally reduces the dimension of the normalized input values by using their potential weights. Figure 1 illustrates the structure of the PWLA preprocessing.

Figure 1. Structure of Potential Weights Linear Analysis preprocessing

Each normalized value vector creates one vector torque relative to the global mean of the matrix. The torques are evaluated together and will reach an equilibrium. Figure 2 shows an example of this action for four vector torques, though all vectors create their own vector torques:

Figure 2. The action for four vector torques

Ca, Cb, Cc, Cd are the vectors of values, while Da, Db, Dc, Dd are the arms of the vector torques of the values. These arms are based on their distances from the global mean point of the matrix. The vector torque of Ca is Ca × Da, the vector torque of Cb is Cb × Db, the vector torque of Cc is Cc × Dc, and the vector torque of Cd is Cd × Dd. In this study, the physical and mathematical meaning of vector torque is used in the classification of instances. If two input vectors have a strong correlation, they will create noise. When the correlation is 1, the two attributes are indeed one attribute and duplication exists in the input values matrix. But the global mean moves to a new


location; hence, after equilibrium, the global mean takes its place as a constant at a special point on the axis of vector torques, and the weight is distributed between the two vectors. This situation can hold between some or all input vectors of the matrix. Therefore PWLA can solve the noise problem. After identifying weak and strong weights (arms), PWLA can omit the weak weights and enter only the strong weights into the training process of the SMFFNN model. This phase is called dimension reduction of normalized values in PWLA. PWLA is made more effective by having this dimension reduction technique.

A. Phases of Weights Linear Analysis

The input values can be of any numeric type, range, and measurement unit. Table I shows the input values matrix:

TABLE I. INPUT VALUES MATRIX

            Attribute 1   Attribute 2   …   Attribute m
Instance 1  X11           X12           …   X1m
Instance 2  X21           X22           …   X2m
…           …             …             …   …
Instance n  Xn1           Xn2           …   Xnm

If the dataset is large, there is a high chance that the vector components are highly correlated (redundant). Even when the input values are correlated, and hence the input is noisy, the PWLA method is able to solve this problem. The covariance function returns the average of the products of deviations for each pair of data values of two attributes. There are three phases in implementing PWLA:

1) Normalization:

In this phase, normalization of values is performed as data pre-processing. The MinMax technique is used: each value is divided by the average of its column. Table II shows the resulting rational distribution of information in each row; the table entries are used as the normalized input values, or input vectors.

TABLE II. NORMALIZING INPUT VALUES PHASE

            Attribute 1        Attribute 2        …   Attribute m
Instance 1  C11 = X11 / Ave1   C12 = X12 / Ave2   …   C1m = X1m / Avem
Instance 2  C21 = X21 / Ave1   C22 = X22 / Ave2   …   C2m = X2m / Avem
…           …                  …                  …   …
Instance n  Cn1 = Xn1 / Ave1   Cn2 = Xn2 / Ave2   …   Cnm = Xnm / Avem
Average     Ave1               Ave2               …   Avem

2) Pre-training:

To improve pre-training performance, potential weights are initialized. First, the distribution of standard normalized values is computed. µn is the mean of the value vector of each row, and σn is the standard deviation of the value vector of each row. Znm is a standard normalized value and is computed by the formula below:

Znm = (Cnm − µn) / σn

Therefore, PWLA does not need any random numbers for the initialization of the potential real weights, and the input of the pre-training phase is the normalized input values. Table III shows the horizontal evaluation of PWLA, which constitutes the pre-training.

TABLE III. HORIZONTAL EVALUATION OF PWLA

            Attribute 1   Attribute 2   …   Attribute m   µ    σ
Instance 1  Z11           Z12           …   Z1m           µ1   σ1
Instance 2  Z21           Z22           …   Z2m           µ2   σ2
…           …             …             …   …             …    …
Instance n  Zn1           Zn2           …   Znm           µn   σn

As explained, the arms of the value vectors are computed based on the definition of deviation and the distribution of standard normalization. Znm shows the distance of Cnm from the mean of its row. The global mean is the center of the vector torques, and the weights are the arms of the vector torques. This definition of weight is based on the statistical and mathematical definitions of normal distribution and vector torque. |Znm| is the absolute value of the normal value Znm. Hence, weight selection involves no randomization. The weights may have thresholds but must be managed in the hidden layer of the new SMFFNN using the following equation:

Wm = ( |Z1m| + |Z2m| + … + |Znm| ) / n

3) Dimension reduction:

In this phase, we have normalized values and potential real weights. The weights show the deviations of the input values matrix from the global mean, similarly to PCA. With the potential real weights, PWLA can process a high-dimensional data matrix for effective data preprocessing: a suitable sub-matrix of the necessary attributes can be selected based on their potential weights, mapping the high-dimension matrix to a lower-dimension matrix. A strong weight indicates high variance. If the dimension of the input vectors is large, the components of the vectors are highly correlated (redundant). PWLA can solve this problem in two ways. First, after equilibrium, the global mean takes its place as a constant at a special point on the axis of vector torques, and the weights are distributed between the vectors. Second, it can remove redundancy by dimension reduction. This phase of PWLA can be performed in the hidden layer during pruning.

B) PWLA Algorithm:

The algorithm of PWLA is shown in Figure 3.

PWLA (D; L, W)
Input: Database D, database of input values;
Output: Matrix L, normalized database of D; W, potential weights;
Begin
// Vertical evaluation: in this phase, the input values are normalized.
Let row number: n;
Let column number: m;
Let copy of database D in Matrix n×m of L;
Forall columns m of Matrix L do
    Forall rows n of Matrix L do
        L(n,m) = L(n,m) / Average(column m);
Matrix LTemp = copy of Matrix L;
// Horizontal evaluation: in the second phase, the weight of each input value is computed.
// Compute standard normalized values in each row:
Let µn = mean of the value vector of row n;
Let σn = standard deviation of the value vector of row n;
Forall rows n of Matrix L do
    Forall columns m of Matrix L do
        LTemp(n,m) = (LTemp(n,m) − µn) / σn;
// Compute arms of the value vectors (weights):
Forall columns m of Matrix L do
    Wm = Average of Absolute(LTemp(column m));
Apply dimension reduction to Matrix L;
Return Matrix L, potential weights W;
End

Figure 3. Algorithm of PWLA
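For readers who prefer a runnable form, the following is our own NumPy transcription of Figure 3 (the keep parameter, selecting how many strong-weight attributes survive dimension reduction, is our assumption; in the experiments below the authors retain, e.g., 11 or 14 attributes):

import numpy as np

def pwla(X, keep=None):
    """Potential Weights Linear Analysis, per Figure 3.
    X: n-by-m input values matrix. keep: number of strong-weight
    attributes to retain (None = no dimension reduction).
    Returns the (possibly reduced) normalized matrix L and weights W."""
    # Phase 1 - normalization: divide each value by its column average.
    L = X / X.mean(axis=0)
    # Phase 2 - pre-training: standardize each row, then average the
    # absolute standardized values per column as the potential weights.
    mu = L.mean(axis=1, keepdims=True)
    sigma = L.std(axis=1, keepdims=True)   # rows with zero deviation would need special handling
    Z = (L - mu) / sigma
    W = np.abs(Z).mean(axis=0)
    # Phase 3 - dimension reduction: keep only the strongest weights.
    if keep is not None:
        strongest = np.argsort(W)[::-1][:keep]
        L, W = L[:, strongest], W[strongest]
    return L, W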

The time complexity of the PWLA technique depends on the number of attributes p and the number of instances n. The PWLA technique is linear and its time complexity is O(pn). PWLA outputs the dimension-reduced normalized input values and the potential weights. The new SMFFNN processes data based on the algebraic sum of the vector torques. The vector torques Tnm = Cnm × Wm follow the physical meaning of torque. Each torque T shows the real worth of a value among all the values in the matrix. The algebraic sum of the vector torques, Šn, is equal to Tn1 + Tn2 + … + Tnm; Šn is computed for each row, and the output is classified based on Šn. Recall that BPN uses the sigmoid activation function to transform the actual output into the domain [0, 1] and computes the error by using the derivative of the logistic function in order to compare the actual output with the true output. The true output forms the basis of the class labels in a given training dataset. Here, PWLA computes the potential weights, and the new SMFFNN computes the desired output by using a binary step function instead of the sigmoid function as the activation function. There is also no need to compute the error and the derivative of the logistic function for the purpose of comparing the actual output with the true output. The outputs Šn are sorted and two stacks are created based on the true output: Stack0 is the stack of instances Ij with class label 0 and Stack1 is the stack of instances Ij with class label 1. Thresholds are defined from the creation of these stacks, and the binary step function is applied to both stacks, serving as the threshold and generating the desired output 0 or 1. The new SMFFNN applies PWLA similarly to a simple neural network. The number of layers, nodes, weights, and thresholds in the new SMFFNN using PWLA preprocessing is logically clear, without the presence of any random elements. The new SMFFNN classifies the input data by using the output of PWLA, whereby there is one input layer with several input nodes, one hidden layer, and one output layer with one node. The hidden layer contains the weighted function ∑i Wij Ii, where the Wij are the potential weights from pre-training. The hidden layer is necessary for pruning or for considering management opinions for weight optimization. In pruning, input values with weak weights can be omitted because they have a weak effect on the desired output. In management strategies, input values with weights larger than the middle weights act effectively on the desired output and can therefore be optimized. The output node in the output layer contains ∑j Wjo Ij with Wjo = 1 for computing the desired output. Here, training exists in only one epoch, without the need to compute biases and errors in the hidden layer. In evaluating the test set and predicting the class label, the weights and thresholds are known, and the class label of each instance can be predicted by the binary step function.
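The classification step can be sketched as follows (our own reading of the description above; in particular, placing the decision threshold at the midpoint between the two stacks, and assuming class-1 scores lie above class-0 scores, are our assumptions, not stated by the authors):

import numpy as np

def torque_scores(L, W):
    """Algebraic sum of vector torques per instance: S_n = sum_m C_nm * W_m."""
    return L @ W

def fit_threshold(scores, labels):
    """Split the scores into the class-0 and class-1 stacks and place the
    decision threshold between them (assumed: midpoint rule)."""
    stack0 = scores[labels == 0]
    stack1 = scores[labels == 1]
    return (stack0.max() + stack1.min()) / 2.0

def binary_step(scores, threshold):
    """Binary step activation: desired output 0 or 1."""
    return (scores >= threshold).astype(int)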

IV. EXPERIMENTAL RESULTS AND DISCUSSION

All techniques were implemented in Visual Basic version 6 and run on a 1.662 GHz Pentium PC with 1.536 GB of memory. The back-propagation network was configured with 10 hidden units in one hidden layer and 1 output unit. We use initial random weights in the range [-0.77, 0.77] for the standard BPN in the experimental results. We use the F-measure, or balanced F-score, to compute the average of the accuracy tests across 10 folds [12]. The variables of the F-measure are as follows and are based on the weighting of recall and precision. Recall is the probability that a randomly selected relevant instance is recovered in a search; precision is the probability that a randomly selected recovered instance is relevant.

tp = true positive; fp = false positive
tn = true negative; fn = false negative
Recall = tp / (tp + fn); Precision = tp / (tp + fp)
F = 2 · (Precision · Recall) / (Precision + Recall)
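For completeness, a direct Python transcription of these formulas (our own):

def f_measure(tp, fp, fn):
    """Balanced F-score from true positives, false positives, false negatives."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return 2 * precision * recall / (precision + recall)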

A) XOR problem:

To illustrate the semantics and logic of the proposed technique and the new SMFFNN model, the Exclusive-OR (XOR) problem is considered:

XOR(X1, X2)

There are two logical attributes and four instances. The attribute characteristic is binary (0, 1). XOR is usually handled by multilayer artificial neural networks. The analysis of the XOR problem is illustrated in Table IV, together with its features and class label.

TABLE IV. THE FEATURES OF XOR

            Attribute 1   Attribute 2   Class label of XOR
Instance 1  0             0             0
Instance 2  0             1             1
Instance 3  1             0             1
Instance 4  1             1             0

Learning of the new SMFFNN using PWLA takes one epoch, without computing the sigmoid function, a training cycle, the mean squared error, or weight updates. The potential weights are obtained through PWLA, and the new SMFFNN model applies them to compute the thresholds and the binary step function for generating the desired output as well as predicting the class label for XOR. The potential weights of attribute 1 and attribute 2 are the same (0.5) under PWLA because the correlation between the values of attribute 1 and attribute 2 is zero (ρ = 0). In this case, the output of the new model shows that the error is 0 and the outputs match the class labels. Figure 4 illustrates the result of the new SMFFNN using the PWLA implementation.

Figure 4. Implementation of XOR problem by new SMFFNN using PWLA

The XOR problem was implemented using PWLA and the new SMFFNN. The experimental results are compared with results from the standard BPN and an improved BPN [25, 38]. Table V shows the comparison of speed and accuracy on the XOR problem.

TABLE V. CLASSIFICATION OF THE XOR PROBLEM

Classification model    Number of epochs   Error
New SMFFNN with PWLA    1                  -
Improved BPN            3167               0.0001
Standard BPN            7678               0.0001
PCA+BPN                 200                0.0002

The result of the new SMFFNN with PWLA is better than the others.

B) SPECT Heart:

SPECT Heart is selected from the UCI Irvine Machine Learning Database Repository [27] because the implementations of neural network models on this dataset are remarkable, since most conventional methods do not process these datasets well [21]. The SPECT Heart dataset contains diagnoses from cardiac Single Proton Emission Computed Tomography images. There are two classes: normal and abnormal. The database contains 267 SPECT image sets of patient features with 22 continuous feature patterns. The implementations of the BPN and new SMFFNN models using different preprocessing techniques are compared. The learning process of the new SMFFNN using PWLA is performed in one epoch, and the potential weights generated by PWLA on the SPECT Heart training dataset are shown in Table VI.

TABLE VI. POTENTIAL WEIGHTS OF SPECT HEART (TRAINING DATASET)

The new SMFFNN considered the potential weights, obtained the thresholds, and then created two stacks for the 0 and 1 labels using the binary step function. The potential weights of SPECT Heart have a suitable distribution and are close together. Table VII shows the stacks of thresholds for SPECT Heart obtained by the new SMFFNN using PWLA.

TABLE VII. CREATED THRESHOLDS BY NEW SMFFNN USING PWLA ON SPECT HEART (TRAINING SET)

The learning of BPN is performed in the standard situation (SBPN), and by using PCA with 10 dimensions as the preprocessing technique. Table VIII shows the speed and accuracy of the classification methods on the SPECT Heart dataset.

TABLE VIII. COMPARISON OF CLASSIFICATION SPEED AND ACCURACY ON SPECT HEART DATASET

On SPECT Heart, the accuracy of BPN using PCA is 73.3% and that of SBPN is 87%. The new SMFFNN using PWLA has a higher accuracy than the others, namely 92%, and using PWLA with dimension reduction it is 87%. We retained the 11 attributes with the highest weights for the dimension reduction of the input values matrix. Figure 5 shows the chart comparing the accuracy of the classification methods on the SPECT Heart dataset.


[Bar chart: classification accuracy — New SMFFNN using PWLA: 92.00%; New SMFFNN using PWLA with dimension reduction: 87%; SBPN: 87.00%; BPN using PCA: 73.30%]

Figure 5. Comparison of classification accuracy on SPECT Heart dataset

The accuracy of the new SMFFNN using PWLA is better than that of the other methods because it uses real potential weights and thresholds and does not rely on random initialization. The SBPN method processes the SPECT Heart training dataset in 25 epochs with 2.92 seconds of CPU time. BPN using PCA processes the SPECT Heart training dataset in 14 epochs with 1.08 seconds of CPU time. The new SMFFNN using PWLA processes the SPECT Heart training dataset in one epoch taking 0.036 seconds. The new SMFFNN using PWLA with dimension reduction processes the SPECT Heart training dataset in one epoch taking 0.019 seconds, which is faster than the other methods. The comparison of the training speed of the methods on the SPECT Heart dataset is shown in Figure 6:

[Bar chart: epochs and CPU time (seconds) — New SMFFNN using PWLA: 1 epoch, 0.036 s; New SMFFNN using PWLA with dimension reduction: 1 epoch, 0.019 s; SBPN: 25 epochs, 2.92 s; BPN using PCA: 14 epochs, 1.08 s]

Figure 6. Comparison of classification speed of BPN and new SMFFNN by using preprocessing techniques on SPECT Heart dataset

C) SPECTF Heart:

SPECTF Heart is selected from the UCI Irvine Machine Learning Database Repository [27] for the same reason: the implementations of neural network models on these datasets are remarkable, since most conventional methods do not process them well [21]. The SPECTF Heart dataset contains diagnoses from cardiac Single Proton Emission Computed Tomography (SPECT) images. There are two classes: normal and abnormal. The database contains 267 SPECT image sets of patient features with 44 continuous feature patterns.

The implementations of the BPN and new SMFFNN models using different preprocessing techniques are compared. Learning for the new SMFFNN using PWLA is performed in only one epoch. The potential weights generated by PWLA on the SPECTF Heart training dataset are shown in Table IX.

TABLE IX. POTENTIAL WEIGHTS OF SPECTF HEART (TRAINING DATASET)

The potential weights do not have a suitable distribution; therefore we can apply dimension reduction of the input values based on the weak weights. Here, we retained the 14 attributes with the highest weights for the dimension reduction technique. The new SMFFNN considered these potential weights and obtained the thresholds before creating two stacks for the 0 and 1 labels using the binary step function. Table X shows the stacks of thresholds.

TABLE X. CREATED THRESHOLDS BY NEW SMFFNN USING PWLA ON SPECTF HEART (TRAINING SET)

The learning of BPN is performed in the standard situation (SBPN), and by using PCA with 10 dimensions as the preprocessing technique. Table XI shows the speed and accuracy of the classification methods on the SPECTF Heart dataset.


TABLE XI. COMPARISON OF CLASSIFICATION SPEED AND ACCURACY ON SPECTF HEART DATASET

On SPECTF Heart, the accuracy of BPN using PCA is 75.1% and that of SBPN is 79%; the new SMFFNN using PWLA has a higher accuracy than the others, namely 94%, and using PWLA with dimension reduction it is 85%. We retained the 14 attributes with the highest weights for the dimension reduction of the input values matrix. Figure 7 shows the chart comparing the accuracy of the classification methods on the SPECTF Heart dataset.

[Bar chart: classification accuracy — New SMFFNN using PWLA: 94.00%; New SMFFNN using PWLA with dimension reduction: 85%; SBPN: 79.00%; BPN using PCA: 75.10%]

Figure 7. Comparison of classification accuracy on SPECTF Heart dataset

The accuracy of the new SMFFNN using PWLA is better than that of the other methods because it uses real potential weights and thresholds and does not rely on random initialization. The SBPN method processes the SPECTF Heart training dataset in 25 epochs with 4.98 seconds of CPU time. BPN using PCA processes the SPECTF Heart training dataset in 14 epochs with 1.6 seconds of CPU time. The new SMFFNN using PWLA processes the SPECTF Heart training dataset in one epoch taking 0.061 seconds. The new SMFFNN using PWLA with dimension reduction processes the SPECTF Heart training dataset in one epoch taking 0.022 seconds, which is faster than the other methods. The speed comparison of the models and techniques on the SPECTF Heart dataset is shown in Figure 8.

[Bar chart: epochs and CPU time (seconds) — New SMFFNN using PWLA: 1 epoch, 0.061 s; New SMFFNN using PWLA with dimension reduction: 1 epoch, 0.022 s; SBPN: 25 epochs, 4.98 s; BPN using PCA: 14 epochs, 1.6 s]

Figure 8. Comparison of classification speed of BPN and new SMFFNN by using preprocessing techniques on SPECTF Heart dataset

D) Liver disorders dataset (BUPA):

The liver disorders dataset, from the BUPA Medical Research Ltd. database donated by Richard S. Forsyth, is selected from the UCI Irvine Machine Learning Database Repository [27]. The first 5 variables are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. Each line in the BUPA data file constitutes the record of a single male individual. A selector field is used to split the data into two sets. The database contains 345 instances and 7 attributes. Learning for the new SMFFNN using PWLA is performed in only one epoch, and the potential weights generated by PWLA on the BUPA dataset are shown in Table XII.

TABLE XII. POTENTIAL WEIGHTS OF BUPA (TRAINING DATASET)

The new SMFFNN considered these potential weights and obtained the thresholds before creating two stacks for the 1 and 2 labels using the binary step function. Table XIII shows the stacks of thresholds.

TABLE XIII. CREATED THRESHOLDS BY NEW SMFFNN USING PWLA ON BUPA (TRAINING SET)

Table XIV shows the average accuracy of the classification methods on the BUPA dataset.

TABLE XIV. COMPARISON OF CLASSIFICATION SPEED AND ACCURACY ON BUPA DATASET

On BUPA, the accuracy of BPN using SCAWI is 60.90% [7], BPN using PCA is 63.40% [29], and SBPN is 59.40%, while the new SMFFNN using PWLA has a higher accuracy than the others, namely 100%. Figure 9 shows the chart comparing the accuracy of the classification methods on the BUPA dataset.

[Bar chart: classification accuracy — New SMFFNN using PWLA: 100.00%; SBPN: 59.40%; BPN using PCA: 63.40%; BPN using SCAWI: 60.90%]

Figure 9. Comparison of classification accuracy on BUPA dataset

The accuracy of the new SMFFNN using PWLA is better than that of the other methods, reaching 100%, because it uses real potential weights and thresholds and does not rely on random initialization. The BUPA classification result shows that the selected attributes of this dataset are complete and that the SMFFNN model using PWLA has the highest accuracy. The SBPN method processes the BUPA dataset in 1300 epochs. BPN using PCA or SCAWI processes the BUPA dataset in 200 epochs. The new SMFFNN using PWLA processes the BUPA dataset in one epoch, which is faster than the other methods. The speed comparison of the models and techniques on the BUPA dataset is shown in Figure 10.

[Bar chart: training epochs — New SMFFNN using PWLA: 1; SBPN: 1300; BPN using PCA: 200; BPN using SCAWI: 200]

Figure 10. Comparison of classification speed of BPN and new SMFFNN by using preprocessing techniques on BUPA dataset

V. CONCLUSION

Currently, data pre-processing and pre-training techniques for BPN focus on reducing the training process and increasing classification accuracy. The main contribution of this paper is the combination of normalization and dimension reduction as pre-processing with a new pre-training method for reducing the training process, leading to the new SMFFNN model. BPN can change to the new SMFFNN model and obtain the best results in speed and accuracy by using the new preprocessing technique, in one epoch, without the gradient of the mean squared error function and without updating weights. Therefore, the proposed technique can solve the main problem of finding suitable weights. The Exclusive-OR (XOR) problem was considered and solved for the purpose of validating the new model. During the experiments, the new model was implemented and analyzed using Potential Weights Linear Analysis (PWLA). The combination of normalization, dimension reduction, and the new pre-training technique shows that PWLA generates suitable input values and potential weights; PWLA uses the global mean and the vector torque formula to solve the problem. Three datasets, SPECT Heart, SPECTF Heart, and Liver Disorders (BUPA), from the UCI Repository of Machine Learning were chosen to illustrate the strength of the PWLA technique. The results of BPN using preprocessing techniques and of the new SMFFNN applying PWLA showed significant improvement in speed and accuracy, demonstrating the robustness and flexibility of the new preprocessing technique for classification. For future work, we will consider an improved PWLA with nonlinear supervised and unsupervised dimension reduction for application in other supervised multilayer feed-forward neural network models.

REFERENCES

[1] Andonie, R. and Kovalerchuk, B. 2004. Neural Networks for Data Mining: Constraints and Open Problems. Computer Science Department, Central Washington University, Ellensburg, USA.

[2] Belhumeur, P.N., Hespanha, J.P. and Kriegman, D.J. 1997. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Analysis and Machine Intelligence, 19(7):711–720.

[3] Daffertshofer, A., Lamoth, C.J., Meijer, O.G. and Beek, P.J. 2004. PCA in studying coordination and variability: a tutorial. Clin Biomech (Bristol, Avon), 19:415–428.

[4] DeMers, D. and Cottrell, G. 1993. Non-linear dimensionality reduction. In Advances in Neural Information Processing Systems, volume 5, pages 580–587, San Mateo, CA, USA. Morgan Kaufmann.

[5] Demuth, H., Beale, M. and Hagan, M. 2007. Neural Network Toolbox User's Guide. Matlab, The MathWorks.

[6] Drago, G.P. and Ridella, S. 1992. Statistically Controlled Activation Weight Initialization (SCAWI). IEEE Transactions on Neural Networks, vol. 3, no. 4, pp. 627–631.

[7] Fernández-Redondo, M. and Hernández-Espinosa, C. 2001. Weight Initialization Methods for Multilayer Feedforward. ESANN'2001 proceedings – European Symposium on Artificial Neural Networks, Bruges (Belgium), 25–27 April 2001, D-Facto, ISBN 2-930307-01-3, pp. 119–124.

[8] Fernández-Redondo, M. and Hernández-Espinosa, C. 2000. A comparison among weight initialization methods for multilayer feedforward networks. Proceedings of the IEEE–INNS–ENNS International Joint Conference on Neural Networks, Vol. 4, Como, Italy, pp. 543–548.

[9] Friedman, J.H. 1989. Regularized discriminant analysis. Journal of the American Statistical Association, 84(405):165–175.

[10] Funahashi, K. 1989. On the approximate realization of continuous mappings by neural networks. Neural Networks 2, 183–192.

[11] Gentleman, W.M. and Kung, H.T. 1981. Matrix triangularization by systolic arrays. Proceedings of the SPIE Symposium, Real-time signal processing IV, 298(IV).

[12] Han, J. and Kamber, M. 2001. Data Mining: Concepts and Techniques. Simon Fraser University, Academic Press.

[13] Hegland, M. 2003. Data Mining – Challenges, Models, Methods and Algorithms. May 8.

[14] Hinton, G.E. and Salakhutdinov, R.R. 2006. Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786):504–507. www.sciencemag.org/cgi/content/full/313/5786/504/DC1

[15] Ho-Sub, Y., Chang-Seok, B. and Byung-Woo, M. 1995. Neural networks using modified initial connection strengths by the importance of feature elements. In Proceedings of the International Joint Conf. on Systems, Man and Cybernetics, 1:458–461.

[16] Howland, P., Jeon, M. and Park, H. 2003. Structure preserving dimension reduction for clustered text data based on the generalized singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 25(1):165–179.

[17] Jolliffe, I.T. 1986. Principal Component Analysis. Springer-Verlag, New York.

[18] Jolliffe, I.T. 2002. Principal Component Analysis. Springer, 2nd edition.

[19] Kalman, B.L., Kwasny, S.C. and Abella, A. 1993. Decomposing input patterns to facilitate training. Proc. World Congress on Neural Networks, V.III, 503–506.

[20] Keeni, K., Nakayama, K. and Shimodaira, H. 1999. A training scheme for pattern classification using multi-layer feed-forward neural networks. In Proceedings of the 3rd International Conference on Computational Intelligence and Multimedia Applications, New Delhi, India, pp. 307–311.

[21] Kim, J.K. and Zhang, B.T. 2007. Evolving Hypernetworks for Pattern Classification. In Proceedings of the IEEE Congress on Evolutionary Computation, pp. 1856–1862. Digital Object Identifier 10.1109/CEC.

[22] Lanckriet, G.R.G., Bartlett, P., Cristianini, N., El Ghaoui, L. and Jordan, M.I. 2004. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research, 5:27–72.

[23] LeCun, Y., Bottou, L., Orr, G. and Muller, K. 1998. Efficient BackProp. In Orr, G.B. and Muller, K.-R. (Eds.), Neural Networks: Tricks of the Trade.

[24] Lindsay, S. 2002. A tutorial on Principal Components Analysis. February 26. Unpublished.

[25] Mariyam, S. 2000. Higher Order Centralized Scale-Invariants for Unconstrained Isolated Handwritten Digits. PhD Thesis, University Putra Malaysia.

[26] Mark, W.C. and Jude, W.S. 1999. Using Neural Networks for Data Mining. Computer Science Department, Carnegie Mellon University, University of Wisconsin-Madison.

[27] Murphy, P.M. 1997. Repository of Machine Learning and Domain Theories. [http://archive.ics.uci.edu/ml/datasets/SPECTF+Heart]; [http://archive.ics.uci.edu/ml/datasets/SPECT+Heart]; [http://archive.ics.uci.edu/ml/datasets/Liver+Disorders]

[28] Neal, M.J., Goodacre, R. and Kell, D.B. 1994. The analysis of pyrolysis mass spectra using artificial neural networks. Individual input scaling leads to rapid learning. In Proceedings of the World Congress on Neural Networks, International Neural Network Society, San Diego.

[29] Perantonis, S.J. and Virvilis, V. 1999. Dimensionality reduction using a novel network based feature extraction method. In International Joint Conference on Neural Networks (IJCNN), volume 2, pages 1195–1198.

[30] Roya, A., Norwati, M., Nasir, S. and Nematollah, S. 2009. New Supervised Multi Layer Feed Forward Neural Network model to accelerate classification with high accuracy. Accepted by European Journal of Scientific Research (EJSR), Vol. 33, Issue 1.

[31] Russell, I.F. 2007. Neural Networks. Department of Computer Science, University of Hartford, West Hartford, CT 06117. Journal of Undergraduate Mathematics and its Applications, Vol. 14, No. 1.

[32] Shlens, J. 2005. A Tutorial on Principal Component Analysis. Unpublished, Version 2.

[33] Stewart, G.W. 1998. Matrix Algorithms, Basic Decompositions, vol. 1, SIAM, Philadelphia, PA.

[34] Van der Maaten, L.J.P., Postma, E. and van den Herik, H. 2008. Dimensionality Reduction: A Comparative Review. MICC, Maastricht University, P.O. Box 616, 6200 MD Maastricht, The Netherlands. Preprint submitted to Elsevier, 11 January.

[35] Werbos, P.J. 1974. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Committee on Applied Mathematics, Harvard University, Cambridge, MA, November 1974.

[36] Ye, J., Janardan, R., Park, C.H. and Park, H. 2004. An optimization criterion for generalized discriminant analysis on undersampled problems. IEEE Trans. Pattern Analysis and Machine Intelligence, 26(8):982–994.

[37] Ye, J., Li, Q., Xiong, H., Park, H., Janardan, R. and Kumar, V. 2004. An Incremental Dimension Reduction Algorithm via QR Decomposition. The Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, pp. 364–373.

[38] Yim, T.K. 2005. An Improvement on Extended Kalman Filter for Neural Network Training. Master Thesis, University Putra Malaysia.

[39] Zhang, X.M., Chen, Y.Q., Ansari, N. and Shi, Y.Q. 2003. Mini-max initialization for function approximation. Department of Electrical & Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102, USA. Accepted 31 October 2003.

AUTHORS PROFILE

Dr. Prof. Md. Nasir bin Sulaiman is a lecturer in Computer Science in the Faculty of Computer Science and Information Technology, UPM, and an Associate Professor since 2002. He obtained his Ph.D. in Neural Network Simulation from Loughborough University, U.K. in 1994. His research interests include intelligent computing, software agents, and data mining.

Dr. Norwati Mustapha is a lecturer in Computer Science in the Faculty of Computer Science and Information Technology, UPM, and head of the Department of Computer Science since 2005. She obtained her Ph.D. in Artificial Intelligence from UPM, Malaysia in 2005. Her research interests include intelligent computing and data mining.

Roya Asadi received her Bachelor degree in Computer Software Engineering from the Electronical and Computer Engineering Faculty, Shahid Beheshti University, and the Computer Faculty of Data Processing Iran Co. (IBM), Tehran, Iran. She is a research student of Master of Computer Science in database systems at UPM, Malaysia. Her professional working experience includes 12 years of service as a Senior Planning Expert 1. Her interests are in intelligent systems and neural network modeling.
