Proxy Server

INTRODUCTION:-

In computer networks, a proxy server is a server (a computer system or an application program) that services the requests of its clients by forwarding them to other servers. A client connects to the proxy server and requests some service, such as a file, connection, web page, or other resource, available from a different server [1]. The proxy server provides the resource by connecting to the specified server and requesting the service on behalf of the client. A proxy server may optionally alter the client's request or the server's response, and sometimes it may serve the request without contacting the specified server. In this case, it has cached the response to an earlier request to the remote server, so that it can answer later requests from the saved copy and respond as fast as possible.

A proxy server that passes all requests and replies unmodified is usually called a gateway or sometimes a tunneling proxy. A proxy server can be placed on the user's local computer or at various points between the user and the destination servers or the Internet [2].

A proxy server speaks the client side of a protocol to another server. It is a system or device that operates between a computer application, such as a Web browser, and a server. When users wish to read information from the Internet, rather than requesting data directly from the object, they communicate with the proxy server, which fills the request either from its cache or from the object itself. No direct communication is established between the system requesting the data and the Internet. The main benefit of a correctly configured proxy server is the security that this isolation provides [3].

HTTP (Hypertext Transfer Protocol) is the protocol that web browsers and servers use to transfer hypertext pages and images. Here's how it works: when a client requests a file from an HTTP server, it sends the name of the file in a special format to a predefined port and reads back the contents of the file. The server also responds with a status code that tells the client whether the request could be fulfilled and, if not, why.

A caching proxy server accelerates service requests by retrieving content saved from a previous request made by the same client or even other clients. Caching proxies keep local copies of frequently requested resources, allowing large organizations to significantly reduce their upstream bandwidth usage and cost, while significantly increasing performance. Most ISPs and large businesses have a caching proxy. These machines are built to deliver superb file system performance (often with RAID and journaling) and also contain hot-rodded versions of TCP [4].

Caching proxies were the first kind of proxy server. HTTP 1.0 and later versions define many headers for declaring static (cacheable) content and verifying content freshness with an origin server, e.g. ETag (validation tags), If-Modified-Since (date-based validation), and Expires (timeout-based invalidation). Other protocols such as DNS support expiry only and contain no support for validation. Some poorly implemented caching proxies have had downsides (e.g., an inability to use user authentication). Some problems are described in RFC 3143 (Known HTTP Proxy/Caching Problems) [5].
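The freshness and validation headers mentioned above can be illustrated with a minimal Python sketch. The function names (`is_fresh`, `revalidation_headers`, `can_reuse_after_revalidation`) and the dictionary representation of headers are assumptions made for illustration; they are not taken from any particular proxy implementation.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_fresh(cached_headers, now):
    """Return True if the cached response's Expires time is still in the future."""
    expires = cached_headers.get("Expires")
    if expires is None:
        return False  # no expiry information: treat as stale and revalidate
    try:
        return parsedate_to_datetime(expires) > now
    except (TypeError, ValueError):
        return False  # unparseable or timezone-less date: treat as stale


def revalidation_headers(cached_headers):
    """Build conditional request headers from the validators stored with the copy."""
    headers = {}
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return headers


def can_reuse_after_revalidation(status_code):
    """A 304 Not Modified reply means the cached copy may be served again."""
    return status_code == 304
```

A proxy built along these lines would serve a fresh copy directly and, for a stale copy, send the conditional headers to the origin server and reuse the copy only on a 304 reply.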

CASE SELECTION:-

ABOUT PROXY CACHING

Proxy caching can provide several benefits to sites that have a large load of web processing requests.

A forward proxy acts as a gateway for a client's browser, sending HTTP requests to the Internet on behalf of the client. When a request is processed, the IP address of the proxy is used, rather than the client's actual address. This hides the IP addresses of the internal network from the outside world.

A reverse proxy server handles requests on behalf of the backend HTTP server, not on behalf of the client. Because of this, no client configuration is needed. Clients access the server as if it were a regular web site. A reverse proxy server acts as a gateway to an HTTP server, and is the final IP address for requests from the outside.

In addition to this protective feature, a proxy cache stores documents close to the user, thus reducing wait time for retrieval. The client web browser is configured to send all requests to the proxy server. The proxy server, located close to the client, caches the web content, thus providing faster access to common sites and pages. Caching proxy servers reduce traffic from the local site to the Internet, saving valuable bandwidth and reducing connection costs.

A reverse proxy server can also cache requests that it serves from the backend server. When a request is received for a page, the reverse proxy server forwards the request to the backend server and caches the page in addition to returning the page to the client. Subsequent requests for that page can be served from the cache, as long as the cached copy has not expired.
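The reverse-proxy caching behaviour described above (forward the first request to the backend, keep the page, and serve later requests from the cache until it expires) can be sketched as follows. The class name `ReverseProxyCache`, the `fetch_from_backend` callable, and the fixed time-to-live are all hypothetical choices for illustration, not part of any product described in the text.

```python
import time


class ReverseProxyCache:
    """Hypothetical in-memory page cache placed in front of a backend fetch function."""

    def __init__(self, fetch_from_backend, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_backend   # callable: path -> page body
        self._ttl = ttl_seconds            # assumed fixed lifetime per cached page
        self._clock = clock                # injectable clock, useful for testing
        self._entries = {}                 # path -> (expiry_time, body)

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]                # cache hit: backend is not contacted
        body = self._fetch(path)           # miss or expired: ask the backend
        self._entries[path] = (now + self._ttl, body)
        return body
```

Passing the clock in as a parameter is only a convenience so that expiry can be exercised without waiting for real time to pass.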

Illustration: The following shows how a reverse proxy server 'hides' the identity of the main server from the systems that are making requests.

The firewall is configured to allow a specific server on a specific port to have access to the secure server. When a client makes a request, the request goes through the proxy server, which passes the request through the firewall to the secure server. A result is passed back through the firewall to the proxy.

If an error is returned, the proxy server intercepts the message and changes the URLs listed in the headers before sending the message on to the client. In this way, external clients are not redirected to URLs on the secure server.

Multiple proxy servers can be used for load balancing by taking advantage of the caching features of the proxy server. If you have a web server that has active traffic, proxy servers can take some of the load from the web server, making network access more efficient. After an initial starting period (in which the proxy servers retrieve documents for the first time), the number of requests to the actual web server will drop as the proxy server cache is used instead.
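The header rewriting step described above can be sketched in Python. This is an illustrative assumption about how such a rewrite might look: the function name `rewrite_location`, the choice of the `Location` header, and the host names used in the usage note are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_location(headers, internal_host, public_host):
    """Return a copy of headers with the internal host in Location replaced."""
    rewritten = dict(headers)
    location = rewritten.get("Location")
    if location is None:
        return rewritten                       # nothing to rewrite
    parts = urlsplit(location)
    if parts.netloc == internal_host:
        # Swap only the host part; scheme, path and query are preserved.
        rewritten["Location"] = urlunsplit(parts._replace(netloc=public_host))
    return rewritten
```

For example, a redirect to `http://secure.internal:8080/login?x=1` would be passed on to the client as `http://www.example.com/login?x=1`, so the secure server's address never leaves the firewall.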

III. PROPOSED PROXY SERVER APPROACH

The proposed approach to a caching HTTP proxy server is based on two major factors that reduce the performance of current proxies. The architectural diagram of a basic working proxy server is shown in figure 1.

Fig. 1 Architectural Diagram of working HTTP Proxy Server.

The above diagram shows the basic functionality of a current proxy server. This proxy does not scan the HTTP requests into the cache; only the current page is placed in the cache. Because of this design limitation, current proxies do not function as efficiently as needed [8]. The proposed architectural diagram of the proxy is given in figure 2.

Fig. 2 Architectural Diagram of proposed HTTP Proxy Server.

The working and internal configuration of the new efficient proxy server is shown in figure 3.

Fig. 3 Working and internal configuration of new efficient proxy server.

The above flow mechanism of the proposed proxy server is based on the request and the grant: if a request by a client is granted, caching of the information on the local system starts automatically [9]. The diagram of the internal caching mechanism design is still to be added.

Without user authentication, restricting unauthorised use is not possible [10]. This proxy server uses a new authentication scheme that is very fast. The flow mechanism of this fast authentication design is presented in figure 5.

Fig. 4 Mechanism of fast authentication.

The authentication is fast because it is performed by searching the cache. All of the above designs are part of our proposed proxy server.

ABOUT SELECTED CASE:-

Over the past decade, online identity fraud has transformed from a small-scale criminal activity of computer geeks into a widespread phenomenon costing billions of dollars in damages each year.

Cyber criminals predominantly use phishing as a technique for obtaining identity-related information, such as social security numbers or bank account numbers. In a typical phishing scenario, a cyber criminal sets up a fake web site that looks similar to the login page of a target financial institution and sends out a massive amount of email to trick people into logging into the fake web site and entering personal information. The cost incurred by criminals is very low, and within a short period of time they can successfully complete an attack cycle and hide their tracks. These facts have fueled the phenomenal growth of phishing attacks [23].

A vast majority of existing anti-phishing products follow a signature-based approach. Users, ISPs, and security staff of financial companies provide suspected URLs as input to a centralized blacklist service, which disseminates vetted blacklists to end clients (mostly browser extensions) for enforcement. This approach ties the effectiveness of anti-phishing products to the accuracy and timeliness of signature updates. It is impractical to assume that all active phishing sites will be reported to the centralized service on time. Vetting of the reported sites adds some amount of time delay. Therefore, attackers always have a window of opportunity during which a significant fraction of end clients operate without the protection of updated signatures.

Another common aspect of existing products is the insertion point of the anti-phishing technology. Most existing technologies are inserted directly into web browsers via plugin frameworks (such as Browser Helper Objects [26]). Although plugins are good for providing transparent insertion, they are susceptible to buffer overflow attacks on end systems and browsers. Malware frequently gets unintentionally downloaded by users and executed on their computers. Due to poor support in current OSes for strong process isolation, such malware can corrupt the user's web browser and simply disable anti-phishing plugins, install key loggers to record and exfiltrate information, or install root kits to evade host-based intrusion detection systems. Another negative aspect of browser plugins is that custom browser-specific code is required for different browser platforms, which limits support to Microsoft Internet Explorer and Firefox in most cases. Furthermore, the plugins consume additional CPU, disk space, and memory resources on the host running the browser, limiting deployment on small embedded devices such as internet-ready cell phones.

We have developed a different anti-phishing approach in PhishBouncer, which uses attribute-based checks to implement both reactive and proactive anti-phishing defenses. These checks do not require signature creation and are strategically placed into the client-server communication pathway via an HTTPS proxy.

We claim that by focusing on common attributes of phishing attacks (rather than specific signatures) and user behavior over time, the PhishBouncer approach doesn't require timely updates and therefore doesn't suffer the problems associated with them. Since attribute checks operate on a wide variety of phishing attack instances by looking at generic features and gathering data autonomously if needed, we are able to significantly reduce the amount of content pushed to the end systems and thereby reduce a large part of the operational cost of anti-phishing services. Furthermore, PhishBouncer provides enhanced protection against previously unknown phishing attacks, as in most cases some common attributes stay invariant.

Placement of the anti-phishing checks in an HTTPS proxy provides stronger isolation guarantees and increased flexibility compared to browser plugins. Since the proxy is a dedicated process, it can be protected against direct attacks via technologies that implement process protection domains, such as SELinux [19] or Cisco Security Agent [3]. Even in the absence of supplemental enforcement, the separation of the proxy process from the browser provides a stronger defense against memory corruption attacks than browser plugins such as toolbars. The proxy process can also be deployed on users' computers, embedded wireless routers, or ISP servers without any changes to the code.

Figure 1: Functional architecture of the PhishBouncer proxy.

In summary, PhishBouncer's superior anti-phishing capabilities stem from the following key features:

• Implemented in Java, therefore less vulnerable to traditional exploits (e.g., buffer overflow attacks)
• Architectural solution with stronger guarantees than browser plugins (can catch phish even if the browser is closed or not part of the communication)
• Browser independent - supports all web browsers
• Operating system independent - supports all operating systems that can run Java
• Highly customizable deployment options - runs on user hosts, wireless routers, or network servers
• Open framework and plugin architecture - allows easy addition of new checks
• Attribute-based detection - provides protection against unknown phishing attacks
• Supports reactive and proactive anti-phishing checks
• Supports HTTP and HTTPS

The rest of the paper is organized as follows: Section 2 describes the PhishBouncer architecture together with implementation details for HTTPS proxying, anti-phishing checks, and the plugin framework. Section 3 summarizes experimental results, section 4 describes related work, and section 5 concludes the paper with a brief description of future work.

PHISHBOUNCER ARCHITECTURE

A key aspect of the PhishBouncer approach is its proxy-based architecture. Figure 1 illustrates the design of this proxy, which consists of 4 main modules that are implemented on top of Jetty [18], a popular open-source web server written in Java.

The Plugin Framework module provides a means for integration of custom reactive and proactive behavior. All anti-phishing logic is in fact implemented as a set of plugins. We divide plugins into three broad classes based on their role in the overall control flow and threading logic.

The InterceptHandler calls all Dataplugins on every HTTP request and associated response to analyze header and payload. Dataplugins are quite general in nature and not necessarily tied to phishing prevention. For example, a proxy with all checks and probes disabled but the HTTP payload and header dataplugins enabled can be used to record all traffic. We used traffic recorded this way to create a baseline for validating anti-phishing solutions. After performing some data analysis or extraction (for instance, computing image hashes on responses), dataplugins may write their result into an in-memory Database. Checks frequently query this database for new information, and the database is persisted to disk in encrypted form at a regular configurable interval.

Checks execute sequentially on HTTP requests initiated by the end system's web browser and decide whether to a) accept the request without further checks (i.e., the URL is whitelisted), b) reject the request without further checks (i.e., the URL is blacklisted), or c) set a numeric value 0 < w < 100 together with a typed choice (acceptPref, rejectPref) to indicate the confidence and choice selection. While the InterceptHandler exits for cases a) and b), it continues to loop through all available checks in case c) before sending all check preferences into an aggregation plugin.

In contrast to Checks and Dataplugins, which only execute reactively when triggered by web browser requests, Probes allow us to embed proactive behavior into the PhishBouncer proxy (also referred to as Pb proxy in this paper). Probes contain dedicated threads that trigger monitoring functions at regular configurable intervals. Since phishing relies on reactive human behavior (such as clicking on URLs embedded in phishing emails), probes, being proactive, are not susceptible to those attacks. By registering sensitive information about frequently visited sites (registered sites), PhishBouncer's probes can monitor valid changes in web sites over time with a high degree of fidelity, including changes in IP addresses and image hashes.
The resulting data is later used by checks in deciding whether to block user requests or not.

The lower part of Figure 1 displays the three access paths into the Pb proxy. The HTTP Proxy listens on a configurable network port (e.g., 8080) for incoming HTTP requests and dispatches the requests to a main handler (InterceptHandler), which in turn makes strategic use of the plugins. This flow is similar in the case of the HTTPS Proxy, except that it listens on a different network port (8443) and uses a custom extension of the InterceptHandler (called SslProxyHandler) that contains logic to enable HTTPS interception during an HTTPS CONNECT request. The third access path is for management of the proxy through an administration console. Management functions include changing the order of checks and their respective importance weights as well as customization via user-specific data.

Figure 2: The PhishBouncer proxy can be deployed in various configurations.

The Pb proxy supports deployment on a wide variety of platforms. Depending on available resources and the desired service model, the proxy can be hosted at various locations between the end client and the target web site. Figure 2 displays three different deployment scenarios.

In case A), the proxy is co-hosted with the web browser on the end user's computer. Although this negatively impacts CPU, memory, and disk resources on the end system, this scenario has the benefit of putting the proxy under direct control of the end users. We found that end users feel uncomfortable with disclosing personal sensitive data to external parties, but are more amenable to providing this information to local components as long as it doesn't leave their machine.

Since many end users own either a wireless or DSL router and these devices already ship with web server capabilities, we investigated deploying the Pb proxy on a Linksys WRT54G wireless router [7] running OpenWrt [9]. Option B) in Figure 2 shows the Pb proxy (represented by the cross symbol) running on a home router. The benefits of this deployment option are increased security through stronger isolation from a potentially virus-infected desktop and new marketing opportunities for wireless router manufacturers, who could include the Pb proxy as a value-added offering. On the downside, the very limited CPU and memory resources of the wireless router significantly lower the performance of the proxy and would necessitate re-implementation in C++ for it to be successfully offered as a product.

Deployment option C) places the PhishBouncer proxy onto a server platform that resides, for instance, at an ISP location and is capable of handling hundreds of user interactions concurrently. Such a server-side deployment has the clear benefit of supporting anti-phishing checks for end systems that can only host minimal software (such as cell phones), and it would allow ISPs to offer anti-phishing as a value-added service. The major technical challenge of this deployment is to increase PhishBouncer's scalability to handle the increased request load.

As of the writing of this paper, we have mostly used and tested PhishBouncer as deployed on various Linux and Windows end systems (case A)), but have studied the impact of changing over to cases B) and C).

The remainder of this section describes in more detail PhishBouncer's unique attribute-based anti-phishing functionality, HTTPS proxying, and extensible plugin framework.

Figure 3: Use case diagram showing all currently implemented checks, probes, and dataplugins.
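The control flow of checks described in this section (cases a), b), and c) followed by aggregation) can be sketched in Python. This is not PhishBouncer's code, which the text says is written in Java. The names `Verdict`, `run_checks`, and `aggregate` are hypothetical, and the aggregation rule (compare the summed weights and accept on a tie) is purely an assumption, since the text does not say how the aggregation plugin combines preferences.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"            # case a): whitelisted, stop checking
    REJECT = "reject"            # case b): blacklisted, stop checking
    ACCEPT_PREF = "acceptPref"   # case c): weighted preference to accept
    REJECT_PREF = "rejectPref"   # case c): weighted preference to reject


def run_checks(url, checks):
    """Run checks in order; each check is a callable returning (Verdict, weight)."""
    preferences = []
    for check in checks:
        verdict, weight = check(url)
        if verdict in (Verdict.ACCEPT, Verdict.REJECT):
            return verdict       # cases a) and b) end the loop immediately
        if not 0 < weight < 100:
            raise ValueError("preference weight must satisfy 0 < w < 100")
        preferences.append((verdict, weight))
    return aggregate(preferences)


def aggregate(preferences):
    """ASSUMED aggregation rule: compare summed weights; ties are accepted."""
    reject = sum(w for v, w in preferences if v is Verdict.REJECT_PREF)
    accept = sum(w for v, w in preferences if v is Verdict.ACCEPT_PREF)
    return Verdict.REJECT if reject > accept else Verdict.ACCEPT
```

Because checks run sequentially and a definite verdict short-circuits the loop, the order of checks matters, which is consistent with the administration console allowing the order and weights to be changed.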

HTTPS Proxying

Insertion of the proxy into the non-encrypted HTTP client-server path is straightforward and amounts to changing the proxy settings of the client's web browser during PhishBouncer installation. To prevent an attacker from replacing the proxy setting with a proxy of his own, firewall rules should be set to only allow outgoing web traffic through the Pb proxy.

Insertion becomes significantly more difficult for encrypted traffic. For intercepting encrypted HTTPS requests, the browser's proxy settings are changed accordingly to redirect requests to PhishBouncer's HTTPS proxy port. That said, this only allows encrypted requests to flow into the Pb proxy, as the underlying SSL protocol is specifically designed to prevent man-in-the-middle use cases, whether the man is benign or not. So how were we able to intercept HTTPS requests? The answer lies in the way trust relationships are established.

Figure 4: PhishBouncer Proxy Acting as Trusted Man-In-The-Middle

Figure 4 displays the architectural layout for proxying of encrypted HTTPS traffic.

In a regular use case without any HTTPS proxy, SSL relies on a public key infrastructure (PKI) for connection establishment [25]. Following a general description of the SSL protocol, the client issues a connection request to the server, which the server acknowledges with a response containing a certificate signed by a CA. The client then performs a set of checks on the server certificate, the main one of which is to verify that the CA's signature is valid. The SSL transactions essentially establish a unidirectional trust relationship between the browser and the target web server via a commonly trusted CA (small black line).

With the Pb proxy in the mix, the protocol becomes a little more complex. The Pb proxy takes on the role of a server when communicating with the browser, and the role of a browser when communicating with the target web server. This requires the Pb proxy to dynamically generate X.509 certificates for each web site it is proxying. Since the certificate generation process via CAs typically requires offline identity verification, the Pb proxy hosts a second CA (called Pb CA in Figure 4). During installation, the web browser's settings are configured to trust signatures from the Pb CA. As a result, the overall trust relationship between browser and target web server can now be decomposed into two daisy-chained relationships, one between the browser and the Pb proxy, and a second one between the Pb proxy and the target web server.

Does the Pb proxy introduce additional security vulnerabilities by breaking the end-to-end encryption between browser and web server? The answer to this question depends on the relative trustworthiness of the Pb proxy compared to the browser and target web server, and on where it is deployed. Consider the extreme case of a highly secure desktop which hosts the browser and a highly secure web server. Deploying the Pb proxy on an ISP server which may co-host other applications and not have the latest security patches installed would significantly lower the overall security of web transactions flowing through it. On the other hand, in a scenario where the Pb proxy is co-located with the web browser on the same desktop, we'd expect it would be more difficult for attackers to subvert the Java-based stand-alone Pb proxy process (which only listens on localhost) compared to a C++ web browser running JavaScript. In both cases, data is never sent unencrypted over the network, so the guarantees provided by SSL are not affected.

Figure 5: Handling of HTTPS Connect requests.

The remaining part of this section describes in more detail the control and data flows for HTTPS CONNECT and data requests within the Jetty web server. Figure 5 shows the call sequence for establishing the instances that implement the SSL man-in-the-middle interception. Upon receiving an HTTPS CONNECT request through its SslProxyPort (1) for a specific URL (url1), the SocketListener creates an associated HttpConnection (or looks up a previously established one) in step (2) and forwards the request to the SslProxyHandler (3). The SslProxyHandler checks whether a tunnel has already been created for the target web site, and if not, starts the process of establishing one by creating a new SSL Server Socket and associated MitmSslListener (4) for url1. The function of the MitmSslListener is to project the SSL identity of the target web site's SSL listener back to the client's web browser. For this reason, it generates a new site certificate via the KeyStoreHelper (5) for url1, signed by the PhishBouncer CA, and binds it to the SSL Server Socket listening on an ephemeral port (MitmPort). After establishing a local SSL endpoint for url1, the SslProxyHandler creates a new client socket (TunnelClientPort) and creates a tunnel (SslProxyPort2MitmTunnel) which simply forwards all data sent into TunnelClientPort to MitmPort (6). Finally, the SslProxyHandler tells the HttpConnection to use the newly created tunnel for all further communication over this connection (7).

Figure 6 displays the data flow over an existing tunnel.
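The look-up-or-create logic in the CONNECT handling sequence above can be sketched in Python. This models only the bookkeeping (has a tunnel already been created for this target?), not sockets, certificates, or Jetty itself. `parse_connect`, `TunnelRegistry`, and the `create_tunnel` callable are hypothetical names introduced for illustration, and keying the tunnels by host and port is an assumption.

```python
def parse_connect(request_line):
    """Parse a request line such as 'CONNECT host:port HTTP/1.1' into (host, port)."""
    method, target, _version = request_line.strip().split(" ")
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError("CONNECT target must be host:port")
    return host, int(port)


class TunnelRegistry:
    """Hypothetical stand-in for the tunnel bookkeeping attributed to SslProxyHandler."""

    def __init__(self, create_tunnel):
        # create_tunnel(host, port) stands for steps (4)-(6): listener,
        # site certificate, and tunnel creation. It is supplied by the caller.
        self._create_tunnel = create_tunnel
        self._tunnels = {}

    def tunnel_for(self, request_line):
        key = parse_connect(request_line)
        if key not in self._tunnels:
            # First CONNECT for this target: establish a new tunnel.
            self._tunnels[key] = self._create_tunnel(*key)
        # Later CONNECTs for the same target reuse the existing tunnel.
        return self._tunnels[key]
```

Reusing the tunnel avoids generating a fresh certificate and listener for every request to the same site, which matches the check described in step (3).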

Incoming encrypted data requests (such as HTTP GET) are dispatched to the HttpConnection (via 1, 2), which sends them via the MitmTunnel (3, 4) to the MitmListener. The MitmListener decrypts the HTTP request and then forwards the request to the InterceptHandler (5, 6), which in turn forwards the request to the various plugins. In summary, the MitmListener's main job is to perform the decryption and encryption operations, while the responsibility of the other classes lies in dispatching the data to the right places in a thread-safe manner.

ADVANTAGES

The proposed proxy has the following advantages over existing proxy servers.

1) Handling HTTP requests: The proxy server handles multiple HTTP requests from the clients. Concurrency issues are also handled in this process.

2) Caching: Caching is one of the few mechanisms that prevent the Internet from overloading. By caching frequently accessed sites, information, and errors, the proxy server significantly reduces the total required bandwidth, which gives the appearance of a faster response time and saves employees' time and connectivity expenses.

Caching is not merely a copy of everything requested from the Internet. In order for an Internet resource to be cached, it must abide by the following guidelines:

• Access to the Internet resource must be established via FTP or HTTP.
• Access to the Internet resource must be via the GET request.
• The URL line cannot contain any "?" keywords, as in Internet searches.
• The Expires HTTP header field must contain a later date than the date in the Date header field. (It would be ineffective to cache old information.)
• The HTTP result code must be 200 (success), 403 (forbidden request), or 404 (URL not found).

3) Cookies: A cookie is a commonly used method for either delivering information from a custom web page or authorizing or tracking a connection in a way that is insecure.
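The caching guidelines listed under item 2 can be expressed as a single predicate. The following Python sketch is an illustration only: the function name `is_cacheable`, its parameters, and the decision not to cache when either date header is missing or unparseable are assumptions, not details given in the text.

```python
from email.utils import parsedate_to_datetime
from urllib.parse import urlsplit

CACHEABLE_STATUS = {200, 403, 404}  # result codes named in the guidelines


def is_cacheable(method, url, status, headers):
    """Apply the listed guidelines to one request/response pair."""
    if urlsplit(url).scheme.lower() not in ("http", "ftp"):
        return False                 # access must be via FTP or HTTP
    if method.upper() != "GET":
        return False                 # only GET requests are cached
    if "?" in url:
        return False                 # no "?" keywords, as in searches
    if status not in CACHEABLE_STATUS:
        return False
    try:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return expires > date        # Expires must be later than Date
    except (KeyError, TypeError, ValueError):
        return False                 # assumed: missing or bad dates mean "do not cache"
```

A proxy could call such a predicate once per response, before deciding whether to store the body.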

V. SECURITY RELATED FEATURES

Security is a very important aspect of a shared working environment. In this proxy server, security is addressed as follows.

1) Access Authority: The proxy server allows controlling access for inbound and outbound connections. Access authority may be used to limit a user's ability to access certain Internet sites.

Outbound connection rules control a user's ability to use certain functions on the Internet. For instance, if a user wants to run FTP, the proxy server must grant them access to the protocol. If no access is granted, the user will not be able to transfer files with FTP.

Inbound connections are limited based on the configuration of the proxy server. For instance, if a company does not offer any Web-based services or pages, there is no inbound traffic.
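The access authority rules above can be sketched as a small policy object. The class `AccessPolicy` and its methods are hypothetical, and the default of denying everything that has not been explicitly granted is an assumption made for the illustration.

```python
class AccessPolicy:
    """Hypothetical per-user outbound protocol permissions with an inbound switch."""

    def __init__(self, inbound_enabled=False):
        self._outbound = {}                  # user -> set of allowed protocols
        self._inbound_enabled = inbound_enabled

    def grant(self, user, protocol):
        """Allow the user to use the given protocol for outbound connections."""
        self._outbound.setdefault(user, set()).add(protocol.lower())

    def allows_outbound(self, user, protocol):
        """Outbound use is denied unless it was explicitly granted (assumed default)."""
        return protocol.lower() in self._outbound.get(user, set())

    def allows_inbound(self):
        """Inbound traffic is accepted only if the site is configured to offer services."""
        return self._inbound_enabled
```

With this sketch, a user who has not been granted FTP cannot transfer files with it, and a site that offers no Web-based services simply leaves inbound access disabled.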

CONCLUSION

The primary experimentation result for zero-day attacks (65, using only 4 types of checks) is a promising start. This implies that embedding attribute-based anti-phishing checks into an HTTPS proxy could be a viable defense against phishing, complementing signature-based and block-list-based defenses. End users, security practitioners, and anti-phishing researchers can all benefit from using the Pb proxy. Toward that end, we intend to make the Pb proxy and the automated testing framework available as open-source software.

One direction that we have not explored so far but intend to pursue after the open-source release is to transform the Pb proxy into an expert assistant for technicians responsible for vetting reported phish sites. Similarly, the automated crawling and recording framework could also be extended for evaluating diverse anti-phishing technologies against common benchmark tests.

Our extensive testing of the prototype proxy shows that HTTPS interception is feasible in practice. We have already started taking advantage of the proxy's built-in flexibility to extend its use beyond phishing attack prevention and into developing adaptive web-based distributed systems like the example presented in section 2.3. More work remains to be done in this area.

Another direction of future work we would like to pursue involves addressing the issues that arise in server-scale deployments of the Pb proxy. In this situation, a single proxy instance needs to support hundreds of parallel sessions, and the past behavior of the users will not be readily available or visible. One interesting question is whether it is possible to boost the aggregate efficiency over all sessions through strategic ordering of checks. How to collect historical behavior patterns of users and build the attribute profile of the sites they visit for the purposes of anti-phishing algorithms without violating the users' privacy rights will be another challenge.

REFERENCES