
Efficient Algorithms for Locating Web Proxies. Li-Chuan Chen, The MITRE Corporation.

Dec 24, 2015


Transcript
  • Slide 1
  • Efficient Algorithms for Locating Web Proxies. Li-Chuan Chen, The MITRE Corporation. Co-author: Hyeong-Ah Choi, George Washington University. 2001 Conference on Information Sciences and Systems (CISS01).
  • Slide 2
  • MITRE: Li-Chuan Chen. Outline: research motivation; background; research goals; literature review, problem formulation, and results; summary.
  • Slide 3
  • Research Motivation: With the increased popularity of the World Wide Web (WWW or Web), a number of problems have emerged: overloaded servers, Internet backbone congestion, and slow access to Web services.
  • Slide 4
  • Background: Approaches to reducing server load. Mirror Web sites: replicate Web server contents throughout the network (the user must select a server). Distributed Web server: a cluster of distributed servers acting as a single server. Web caching: store frequently requested Web documents in proxy servers or on users' machines.
  • Slide 5
  • Research Goals: To reduce Web server load and to increase the efficiency and reliability of Web system performance by caching frequently accessed documents at strategically chosen Web proxy locations throughout the network. We consider the design of optimization algorithms for achieving these objectives. Note that most formulations of these problems are NP-hard. We consider special cases and approximation algorithms for proxy location.
  • Slide 6
  • Proxy Location Problem: Popular Web sites have to cope with an enormous number of requests. A Web proxy (cache) sits between users and servers. The proxy returns the requested document to the user if it is in the cache; otherwise it requests the document from the server and stores it before returning it to the user.
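
The hit/miss behaviour described on this slide can be sketched in a few lines of Python. This is only an illustration: the class name `ProxyCache`, the `fetch_from_server` callback, and the `misses` counter are hypothetical names introduced here, and the sketch ignores cache capacity and consistency.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy Web proxy: serve from the cache on a hit, fetch and store on a miss."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]) -> None:
        # fetch_from_server is a hypothetical stand-in for a request to the origin server.
        self._fetch = fetch_from_server
        self._store: Dict[str, bytes] = {}
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:           # cache hit: answer without contacting the server
            return self._store[url]
        self.misses += 1                 # cache miss: ask the server
        document = self._fetch(url)
        self._store[url] = document      # store before returning to the user
        return document
```

Every request after the first for the same URL is answered by the proxy alone, which is where the reduction in server load comes from.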
  • Slide 7
  • Proxy Location Problem: A popular Web site places its documents closer to users by replicating them on Web proxies throughout the network. Goal: locate k proxy servers in a network of n nodes so as to minimize the overall cost of accessing Web documents.
  • Slide 8
  • Proxy Location: Literature. Replacement algorithms: when the cache is full, how do you replace existing Web documents with new ones? [CK99, Ira97, AY97]. Cache consistency: keeping cached Web documents consistent with the original copy [LC98, Din96]. Proxy placement: where to place proxies so that Web documents are closer to the user? [KRS00, LGIDS99, LDGS98, HT91].
  • Slide 9
  • Proxy Location: Problem Formulation. Given a network $G=(V,E)$ with $n$ nodes and an integer $k$. Each node $v_i$ is associated with a number of document requests $w(v_i)$. Let $D(v,u_i)$ denote the communication cost from node $v$ to proxy $u_i$. Objective: place $k$ proxies $U=\{u_1,u_2,\ldots,u_k\}$ and assign each node $v$ to its nearest proxy $u_i$ so as to minimize the sum of $w(v)\,D(v,u_i)$ over all proxies and all nodes assigned to them. (Figure: linear topology and ring topology.)
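
As a rough illustration of this objective, the cost of a given placement can be evaluated as below. The function name `placement_cost` and the representation of $D$ as a full distance matrix `dist` are assumptions made for this sketch, not part of the slides.

```python
from typing import Sequence


def placement_cost(dist: Sequence[Sequence[float]],
                   w: Sequence[float],
                   proxies: Sequence[int]) -> float:
    """Sum over nodes v of w[v] times the distance from v to its nearest proxy.

    dist[v][u] plays the role of D(v, u); w[v] is the request count of node v.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    return sum(w[v] * min(dist[v][u] for u in proxies) for v in range(len(w)))


# Usage: four nodes on a line with unit spacing.
line = [[abs(i - j) for j in range(4)] for i in range(4)]
print(placement_cost(line, [1, 1, 1, 1], [1]))      # one proxy at node 1
print(placement_cost(line, [1, 1, 1, 1], [0, 3]))   # proxies at both ends
```

Assigning each node to its nearest proxy is what the inner `min` does, so the optimization is only over the choice of the proxy set.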
  • Slide 10
  • Proxy Location: History. Li et al. [LDGS98] presented an $O(kn^2)$ algorithm for the linear unidirectional case. We improved this to $O((\log k)\,n^2)$ and generalized it to the bidirectional case with the same running time. Krishnan et al. [KRS00] recently presented an $O(kn^3)$-time algorithm for the unidirectional case. Later we discovered an $O(kn)$-time algorithm in the operations research (OR) literature by Hassin and Tamir [HT91].
  • Slide 11
  • Proxy Allocation: Results. Unidirectional and bidirectional ring topologies: compute the optimal proxy placement in $O(n^2)$ time. (Improves on the $O(kn^4)$ algorithm of Krishnan et al. [KRS00].)
  • Slide 12
  • Proxy Location: Linear Topology. Dynamic programming formulation: break the interval $[1,j]$ into subintervals $[1,j']$ and $[j'+1,j]$. Place one proxy in $[j'+1,j]$ and $k-1$ proxies in $[1,j']$. Find the best split point $j'$. (Figure: nodes $1,\ldots,j',\,j'+1,\ldots,j$ on a line.)
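
Written out, the decomposition on this slide corresponds to a recurrence of the following shape. The symbols $C$ and $c$ are notation assumed here for illustration, and $c$ is stated for the bidirectional case, where the single proxy may sit anywhere inside its interval.

```latex
\begin{aligned}
c(a,b)  &= \min_{a \le u \le b} \; \sum_{i=a}^{b} w(v_i)\, D(v_i, v_u)
           && \text{(one proxy serving nodes } a,\ldots,b\text{)} \\
C(0,0)  &= 0, \qquad C(j,0) = \infty \ \text{ for } j > 0 \\
C(j,m)  &= \min_{m-1 \,\le\, j' \,<\, j} \bigl[\, C(j',\, m-1) + c(j'+1,\, j) \,\bigr]
           && \text{for } 1 \le m \le j
\end{aligned}
```

The optimal cost is $C(n,k)$. Evaluating the table directly touches $kn$ entries with up to $n$ candidate split points each, which is the source of the $n^2$-type bounds before the faster methods are applied.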
  • Slide 13
  • Proxy Location: Ring Topology. Break the ring at any point to reduce it to the linear case, and solve the linear problem in $O(kn)$ time [HT91]. To obtain the optimal solution, the ring must be broken at an optimal break point. A brute-force approach that tries all $n$ break points results in an $O(kn^2)$-time algorithm.
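
The brute-force reduction can be sketched as follows. This is a minimal, unoptimized illustration for the bidirectional case: `linear_cost` is a plain dynamic program, not the $O(kn)$ algorithm of [HT91], and the names `linear_cost`, `ring_cost_brute_force`, and `edge_len` (where `edge_len[i]` is the length of the edge between node `i` and node `(i + 1) % n`) are assumptions introduced here.

```python
def linear_cost(w, pos, k):
    """Minimum cost of placing k proxies on a line (requires 1 <= k <= len(w)).

    pos must be sorted; every node is served by its nearest proxy.
    """
    n = len(w)
    INF = float("inf")
    # seg[a][b]: cheapest way to serve nodes a..b with one proxy inside a..b
    seg = [[INF] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            seg[a][b] = min(
                sum(w[i] * abs(pos[i] - pos[p]) for i in range(a, b + 1))
                for p in range(a, b + 1)
            )
    # best[m][j]: minimum cost of serving the first j nodes with m proxies
    best = [[INF] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            best[m][j] = min(best[m - 1][s] + seg[s][j - 1] for s in range(m - 1, j))
    return best[k][n]


def ring_cost_brute_force(w, edge_len, k):
    """Cut the ring at each of its n edges and keep the cheapest linear solution."""
    n = len(w)
    best = float("inf")
    for cut in range(n):                          # remove the edge (cut, cut + 1)
        order = [(cut + 1 + t) % n for t in range(n)]
        pos = [0]
        for t in range(n - 1):                    # the removed edge is never used
            pos.append(pos[-1] + edge_len[order[t]])
        best = min(best, linear_cost([w[v] for v in order], pos, k))
    return best
```

Each cut yields a feasible ring solution, and some cut leaves every node's route to its proxy intact, so the minimum over all cuts is the ring optimum.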
  • Slide 14
  • Ring Topology: Our Method. Rather than trying all $n$ possible choices for the optimal break point, we show that the optimal break point can be selected from a set of only $n/k$ candidate break points. Interleaving property: let $x_1,x_2,\ldots,x_k$ denote the optimal break point sequence for the ring, and let $y_1,y_2,\ldots,y_k$ be the optimal linear break points resulting from an arbitrary cut of the ring; then the two sequences interleave. (Figure: break points $x_1,\ldots,x_4$ and $y_1,\ldots,y_4$ on the ring.)
  • Slide 15
  • Ring Topology: Our Method. Using interleaving, we can find a set of $n/k$ candidate break points as follows. Break the ring at an arbitrary point and compute the optimal linear break points. Choose the interval between consecutive break points that has the fewest nodes (at most $n/k$). Break the ring at each position in this interval and solve the linear problem for each. By interleaving, one of these will be optimal; select the one with the lowest cost.
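
The running time of this procedure follows from a short count, sketched here under the assumption that each linear instance is solved in $O(kn)$ time as in [HT91]. The $k$ linear break points split the $n$ nodes into $k$ intervals, so the smallest interval holds at most $n/k$ nodes.

```latex
\underbrace{O(kn)}_{\text{initial arbitrary cut}}
\;+\;
\underbrace{\frac{n}{k}}_{\text{candidate cuts}}
\times
\underbrace{O(kn)}_{\text{one linear solve}}
\;=\; O(n^2),
\qquad\text{compared with}\qquad
n \times O(kn) = O(kn^2) \ \text{ for brute force.}
```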
  • Slide 16
  • Heuristics and Performance Analysis: Many of our existing results are approximations or apply to special cases, because the underlying optimization problems are NP-hard. We implemented heuristics for the proxy location problem on general Internet topologies, given a fixed number of servers k.
  • Slide 17
  • Internet Topology Input Graph: We used the Tiers model of Calvert, Doar, and Zegura of Georgia Tech [CDZ97] for Internet topology generation. Tiers is based on a three-level hierarchical network (WAN, MAN, LAN). Twenty random Internet graphs were generated for each of n = 63, 119, 267, 575, and 1144 nodes.
  • Slide 18
  • Internet Topology Input Graph: example with n = 575 nodes.
  • Slide 19
  • Heuristics for Proxy Location, Given the Number of Servers k. Random: randomly select k servers and output the cost. n-Random-Pairs and (n log n)-Random-Pairs: start with Random. Repeatedly select a node i at random and swap it with a random existing server; if the swap is profitable, keep it. The process is repeated n (or n log n) times. (n log n)-Random-Clients: similar to (n log n)-Random-Pairs, except that after randomly selecting node i, we swap it with the server that gives the best cost. We assume all nodes have equal demand, $w(v_i) = 1$.
  • Slide 20
  • Heuristics for Proxy Location, Given the Number of Servers k (continued). Swap-to-Limit: start with Random. For each existing server j, try swapping j with each client i, and select the swap that gives the best cost. Repeat until no swap improves the cost.
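
Two of the heuristics above, Random-Pairs and Swap-to-Limit, might look roughly like this in Python. This is a sketch under stated assumptions rather than the authors' implementation: distances are given as a matrix `dist`, all nodes have unit demand as on the slide, Swap-to-Limit picks the single best swap over all server/client pairs in each round, and the function names and the `trials` and `rng` parameters are hypothetical.

```python
import random


def cost(dist, servers):
    """Sum of distances from every node to its nearest server (unit demand)."""
    return sum(min(row[s] for s in servers) for row in dist)


def random_pairs(dist, k, trials, rng):
    """Random-Pairs: start from a random k-set and keep only profitable random swaps."""
    n = len(dist)
    servers = rng.sample(range(n), k)
    best = cost(dist, servers)
    for _ in range(trials):                  # trials is n or about n log n on the slide
        i = rng.randrange(n)                 # random node
        if i in servers:
            continue
        slot = rng.randrange(k)              # random existing server
        old = servers[slot]
        servers[slot] = i
        new = cost(dist, servers)
        if new < best:
            best = new                       # profitable: keep the swap
        else:
            servers[slot] = old              # not profitable: undo it
    return sorted(servers), best


def swap_to_limit(dist, servers):
    """Swap-to-Limit: apply the best server/client swap until no swap improves the cost."""
    n = len(dist)
    servers = list(servers)
    best = cost(dist, servers)
    while True:
        best_swap = None
        for slot in range(len(servers)):
            for i in range(n):
                if i in servers:
                    continue
                trial = servers[:slot] + [i] + servers[slot + 1:]
                c = cost(dist, trial)
                if c < best:
                    best, best_swap = c, trial
        if best_swap is None:
            return sorted(servers), best
        servers = best_swap
```

Swap-to-Limit always terminates because the cost strictly decreases with every accepted swap, but it can stop at a local optimum, which is why the slides compare it against brute force.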
  • Slide 21
  • Simulation Results. Brute-force search: computes the optimal solution by generating all k-node subsets of $\{1,2,\ldots,n\}$ and computing the cost of each subset. This requires $O(n^k)$ time, and so is not practical for large values of k and n. For small values of k = 2, 3, 4, 5, 6 servers, we ran the heuristics and compared them with the brute-force algorithm.
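
A brute-force baseline of this kind is easy to sketch with `itertools.combinations`. The function name `brute_force` and the distance-matrix input `dist` are assumptions made for illustration.

```python
from itertools import combinations


def brute_force(dist, w, k):
    """Try every k-node subset as the proxy set and return the cheapest one."""
    n = len(w)
    best_cost, best_set = float("inf"), None
    for subset in combinations(range(n), k):
        # each node is charged its demand times the distance to its nearest proxy
        c = sum(w[v] * min(dist[v][u] for u in subset) for v in range(n))
        if c < best_cost:
            best_cost, best_set = c, subset
    return best_set, best_cost
```

The loop visits all $\binom{n}{k}$ subsets, so it is usable only for the small values of k mentioned on the slide.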
  • Slide 22
  • Simulation Results: brute force versus heuristics for k = 3 (cost).
  • Slide 23
  • Simulation Results: brute force versus heuristics for k = 3 (CPU time).
  • Slide 24
  • Simulation Results: For larger values of k = 2, 4, 8, 16, 32 servers, we ran and compared the heuristics for proxy location given a fixed number of servers k. We also collected statistics on the intermediate costs for n = 63, 119, 267, 575, 1144 and k = 2, 4, 8, 16, 32.
  • Slide 25
  • Simulation Results: heuristics given the number of servers, k = 32 (cost).
  • Slide 26
  • Simulation Results: heuristics given the number of servers, k = 32 (CPU time).
  • Slide 27
  • Simulation Results: intermediate cost for n = 1144, k = 32.
  • Slide 28
  • Summary: We have introduced the problem of improving the efficiency of access to Web services through proxy location. Most problem formulations are NP-hard. We have presented algorithms for the ring topology. We have implemented heuristics for the general case and presented simulations for performance evaluation.
  • Slide 29
  • Thank you!
  • Slide 30
  • Proxy Location: Ring. Submodular:
  • Slide 31
  • Proxy Location: Ring. Interleaving: X and Y interleave, but not X_opt and Y_opt.
  • Slide 32
  • Heuristics for Proxy Location, Given the Cost of Opening Each Server. We assume a fixed cost for opening each server. Random: open a random server and compute the cost; repeat as long as the cost decreases. Greedy: similar to Random, but repeatedly select the server that gives the maximum cost reduction (never delete a server); repeat until no server can be added without increasing the cost.
  • Slide 33
  • Heuristics for Proxy Location. Run-(n log n) (Charikar and Guha [CG99]): start with Random. Repeat n log n times: select a node i at random as a new server location. For each existing server i', consider closing i' and reassigning its clients to i; if this is profitable, do it. If the overall cost is lower, open i and apply these changes; otherwise ignore i. Run-to-Limit: same as Run-(n log n), except that the algorithm terminates only when no more improvement can be made.
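
The Greedy heuristic for the fixed-opening-cost setting can be sketched as follows. The sketch assumes unit demand at every node, a distance matrix `dist`, and a single `open_cost` shared by all servers; the function name `greedy_open` is hypothetical, and the first server is chosen as the one with the lowest total cost since no placement exists yet.

```python
def greedy_open(dist, open_cost):
    """Greedy: keep adding the server with the largest cost reduction; never close one."""
    n = len(dist)
    servers = []
    nearest = [float("inf")] * n             # distance from each node to its nearest open server
    total = float("inf")                     # opening cost plus service cost so far
    while True:
        best_total, best_s = total, None
        for s in range(n):
            if s in servers:
                continue
            service = sum(min(nearest[v], dist[v][s]) for v in range(n))
            candidate = open_cost * (len(servers) + 1) + service
            if candidate < best_total:
                best_total, best_s = candidate, s
        if best_s is None:                   # no server can be added without increasing the cost
            return sorted(servers), total
        servers.append(best_s)
        nearest = [min(nearest[v], dist[v][best_s]) for v in range(n)]
        total = best_total
```

Unlike the fixed-k heuristics, the number of servers here is an outcome of the trade-off: a higher `open_cost` makes the loop stop earlier.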
  • Slide 34
  • Simulation Results: heuristics given a cost of $20K for opening each server.
  • Slide 35
  • Simulation Results: heuristics given a cost of $20K for opening each server (CPU time).
  • Slide 36
  • Fault Tolerance. Possible faults: network failures, server failures, document demand changes, network transfer rate changes. Constraint: after any single failure, a constant number of proxies may be relocated. Goal: design an algorithm that achieves an approximately optimal solution for restoring Web services when a server fails.
  • Slide 37
  • Fault Tolerance. (Figure: optimal placement for 5 proxies; a proxy fails; optimal placement for 4 proxies: very costly.)
  • Slide 38
  • Fault Tolerance: On-line Approach. On-line placement for 5 proxies: not optimal but good. When a proxy fails, move the last proxy to replace the failed server; the result is the same as the 4-proxy on-line placement. (Figure: proxies labeled 1 to 5 before the failure; 1, 4, 3, and (2) after.)
  • Slide 39
  • Fault Tolerance: Server Failures. On-line algorithm: makes decisions in response to a series of requests without knowledge of the entire input sequence. Approximate optimality: for any m, our on-line algorithm with 2m proxies has cost less than that of the optimal algorithm with m proxies. Strategy: build an initial set of m proxies using the on-line algorithm. When a server x fails: if x is the last proxy added, no action is needed; otherwise, let y be the last proxy, move y's documents to the node nearest to x, and remove y. We now have m-1 remaining servers and approximate optimality.
  • Slide 40
  • Fault Tolerance: Other Problems. Network failures: how to reroute network traffic to make use of the existing set of proxies? How to determine the best way to place proxies in the updated topology? (Cannot be tolerated in the linear topology; applies only to more general topologies.) Link transfer rate changes: how to move proxies when such changes are present? (This can be modeled as a change in the distance function.) Temporal variations: demand rate and network transfer rate vary (e.g., lunch time, events). Determine solutions that are approximately optimal for each possible demand scenario and apply them accordingly.
  • Slide 41
  • Major Contributions: Efficient algorithms for optimal proxy location on ring topologies. Use of submodularity to produce more efficient DP solutions.
  • Slide 42
  • Future Directions: Consider ways of strengthening our existing results, either by improving the efficiency of the algorithms or by eliminating some of the assumptions made. Tree topology: generalize the proxy location results from linear to tree topologies. Non-homogeneous proxies and documents: we have assumed that all proxies hold the same documents; an important generalization would be to determine both the placement of proxies and how documents are assigned to proxies. Fault tolerance: how to deal with proxy failures and fluctuations in demand.