Enhancement of Web Proxy Caching Using Random Forest Machine Learning Technique

Julian Benadit.P 1, Sagayaraj Francis.F 2, Nadhiya.M 3

1 Research Scholar, Department of Computer Science & Engineering, Pondicherry Engineering College, Puducherry – 605 014, India
2 Professor, Department of Computer Science & Engineering, Pondicherry Engineering College, Puducherry – 605 014, India
3 PG Scholar, Department of Computer Science & Engineering, Dr. S.J.S Paul Memorial College of Engineering and Technology, Puducherry – 605 502, India

Abstract
The Random Forest Tree is an ensemble learning method for Web data classification. In this study, we attempt to improve the performance of traditional Web proxy cache replacement policies such as LRU and GDSF by integrating a machine learning technique. Web proxy caches are used to improve the performance of the Web: they reduce both network traffic and response time. In the first part of this paper, a supervised learning method, the Random Forest Tree classifier (RFT), is used to learn from proxy log data and predict whether objects will be revisited or not. In the second part, the RFT classifier is incorporated into traditional Web proxy caching policies to form novel caching approaches known as RFT-LRU and RFT-GDSF. The proposed RFT-LRU and RFT-GDSF significantly improve the performance of LRU and GDSF, respectively.

Keywords: Web caching, Proxy server, Cache replacement, Classification, Random Forest Tree classifier.

1. Introduction
In the past few years, much research has been carried out on Web proxy caching and on the integration of supervised learning techniques into Web cache replacement. This paper falls into the same category. Web proxy caching plays a significant part in improving Web performance by keeping Web objects that are likely to be visited again in a proxy server close to the user.
Web proxy caching helps to decrease user-perceived latency, i.e. the delay from the time a request is issued until the response is received, and to reduce network bandwidth usage [4, 15]. Because cache space is limited, it must be used efficiently. A cache replacement policy is required to manage the cache content [11, 4]. If the cache is full when an object needs to be stored, the replacement policy determines which objects are evicted to make room for the new object.

The most common Web caching policies (Table 1) are not efficient enough: they ignore other factors, so the cache can fill with objects that are not frequently visited. This decreases the effective cache size and negatively affects the performance of Web proxy caching. Therefore, a supervised mechanism is needed to manage Web cache content efficiently. Earlier papers have exploited supervised learning methods to address this problem [1,6,7,9,10,12,15]. Most of these studies use an Adaptive Neuro-Fuzzy Inference System (ANFIS) in Web caching. However, ANFIS training may consume a considerable amount of time and incur additional processing overhead. In this paper, we attempt to increase the performance of Web cache replacement strategies by integrating a supervised learning method, the Random Forest Tree classifier (RFT). In conclusion, we achieved a large-scale

IJCSI International Journal of Computer Science Issues, Vol. 11, Issue 3, No 1, May 2014. ISSN (Print): 1694-0814 | ISSN (Online): 1694-0784. www.IJCSI.org
Copyright (c) 2014 International Journal of Computer Science Issues. All Rights Reserved.

Table 1: Cache replacement policies
Policy | Brief description
LRU | The least recently used objects are removed first.
LFU | The least frequently used objects are removed first.
SIZE | Big objects are removed first.
GDS | It assigns a key value to each object in the cache. The object with the lowest key value is evicted.
GDSF | It extends the GDS algorithm by integrating a frequency component into the key value.
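As an illustration of how two of the policies in Table 1 behave, the following is a minimal sketch rather than the authors' implementation. It replays a small made-up request trace through an LRU cache and a GDSF cache and reports the hit ratio (HR) and byte hit ratio (BHR). The function names, the trace, and the unit retrieval cost are assumptions introduced here; the GDSF key is taken in its commonly cited form K = L + F x C / S (inflation value L, frequency F, cost C, size S), which the text above does not spell out.

```python
from collections import OrderedDict


def simulate_lru(requests, capacity):
    """Replay (url, size) requests through an LRU cache; return (HR, BHR)."""
    cache = OrderedDict()  # url -> size, least recently used first
    used = hits = hit_bytes = total_bytes = 0
    for url, size in requests:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity:
            continue  # object can never fit, so it is not cached
        while used + size > capacity:
            _, evicted_size = cache.popitem(last=False)  # evict LRU object
            used -= evicted_size
        cache[url] = size
        used += size
    return hits / len(requests), hit_bytes / total_bytes


def simulate_gdsf(requests, capacity, cost=1.0):
    """Replay (url, size) requests through a GDSF cache; return (HR, BHR).

    Assumed key: K = L + F * cost / size, with cost fixed to 1 by default.
    """
    cache = {}  # url -> [key, frequency, size]
    inflation = 0.0  # L: key of the most recently evicted object
    used = hits = hit_bytes = total_bytes = 0
    for url, size in requests:
        total_bytes += size
        if url in cache:
            entry = cache[url]
            hits += 1
            hit_bytes += size
            entry[1] += 1  # one more reference
            entry[0] = inflation + entry[1] * cost / size
            continue
        if size > capacity:
            continue
        while used + size > capacity:
            victim = min(cache, key=lambda u: cache[u][0])  # lowest key
            inflation = cache[victim][0]
            used -= cache[victim][2]
            del cache[victim]
        cache[url] = [inflation + cost / size, 1, size]
        used += size
    return hits / len(requests), hit_bytes / total_bytes


# Hypothetical trace: one large object and three small ones, sizes in MB.
trace = [("big", 8), ("s1", 1), ("s2", 1), ("big", 8),
         ("s1", 1), ("s3", 2), ("s2", 1), ("big", 8)]

lru_hr, lru_bhr = simulate_lru(trace, capacity=10)
gdsf_hr, gdsf_bhr = simulate_gdsf(trace, capacity=10)
print(f"LRU : HR={lru_hr:.3f}  BHR={lru_bhr:.3f}")
print(f"GDSF: HR={gdsf_hr:.3f}  BHR={gdsf_bhr:.3f}")
```

On this toy trace GDSF keeps the small, repeatedly requested objects and evicts the large one, which raises its hit ratio relative to LRU. The trace is far too small to say anything about real proxy logs; it only shows the mechanics of the two eviction rules and of the two metrics.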
[17] P. Lorenzetti and L. Rizzo, “Replacement Policies for a Proxy Cache”, Technical Report, University of Pisa, December 1996.
[18] E. O’Neil, P. O’Neil and G. Weikum, “The LRU-K Page Replacement Algorithm for Database Disk Buffering”, Proceedings of SIGMOD ’93, Washington, DC, May 1993.
[19] Ian H. Witten, Eibe Frank and Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2011.
[20] WEKA tool: Available at http://www.cs.waikato.ac.nz/ml/weka/.
[21] NLANR, National Laboratory for Applied Network Research (NLANR), Sanitized Access Logs: Available at http://www.ircache.net/2010.
Cache size (MB) | RFT-GDSF over GDSF: HR | RFT-GDSF over GDSF: BHR | RFT-LRU over LRU: HR | RFT-LRU over LRU: BHR
1 | 20.90 | 33.01 | 26.64 | 24.17
2 | 18.19 | 97.46 | 31.87 | 27.68
4 | 14.27 | 34.69 | 15.04 | 32.34
8 | 12.33 | 47.69 | 26.70 | 27.95
16 | 10.94 | 95.46 | 30.77 | 18.53
32 | 9.61 | 86.66 | 8.86 | 15.06
64 | 9.27 | 58.77 | 61.08 | 16.38
128 | 6.66 | 115.56 | 4.77 | 8.41
256 | 4.73 | 82.80 | 3.47 | 4.87
512 | 2.64 | 68.49 | 5.18 | 3.44
1024 | 1.93 | 52.12 | 6.14 | 1.92
2048 | 0.55 | 47.75 | 4.14 | 0.57
4096 | 0.23 | 26.53 | 1.64 | 0.56
8192 | 0.13 | 8.29 | 1.31 | 0.19
16,384 | 0.04 | 1.43 | 0.50 | 0.14
32,768 | 0 | 0.26 | 0.12 | 0.09

Julian Benadit.P received the B.Tech degree in computer science engineering from Pondicherry University, Puducherry, India and the M.E. degree in computer science engineering from Anna University, Chennai, India. He is currently pursuing the Ph.D degree in computer science engineering at Pondicherry Engineering College, Pondicherry University, Puducherry, India. Since 2006, he has been an Assistant Professor with the Computer Science Engineering Department in the consortium engineering college affiliated to Pondicherry University. His research interests include web caching, web prefetching, machine learning, and content distribution networks. He was an associate member of the professional societies Institution of Electronics and Telecommunication Engineers (IETE), Computer Society of India (CSI), Institution of Engineers (IE), and Indian Society for Technical Education (ISTE).
Sagayaraj Francis.F received the B.Sc degree in computer science from Madras University, the M.Sc degree in computer science from St. Joseph College (Autonomous), Trichy, and the M.Tech degree in computer science engineering from Pondicherry University, Puducherry, India. He obtained his Ph.D degree in computer science engineering from Pondicherry University, Puducherry, in 2008. Currently, he is working as a professor in the Department of Computer Science and Engineering, Pondicherry Engineering College, Puducherry, India. His research interests include database management systems, knowledge and intelligent systems, data analysis, and data modeling.
Nadhiya.M received the Master of Computer Application (MCA) degree from Anna University, Tamilnadu, India. She is currently pursuing the M.Tech degree in computer science engineering at Dr. S.J.S Paul Memorial College of Engineering and Technology, Pondicherry University, Puducherry, India. She currently works in the domains of Web caching and machine learning.