Faculty of Engineering and Computer Science, Concordia Institute for Information Systems Engineering
Cloud Traffic Security, Wen Ming Liu, INSE 6620, July 23, 2014

Transcript
Page 1: 6620handout5o

Faculty of Engineering and Computer Science

Concordia Institute for Information Systems Engineering

Cloud Traffic Security

Wen Ming Liu

INSE 6620 July 23, 2014

Page 2: 6620handout5o

Agenda

2

• Cloud Applications

• Side-Channel Attacks

• Challenges and Solutions

• Ceiling Padding

Page 3: 6620handout5o

Cloud Computing Architecture

3

Page 4: 6620handout5o

Cloud Computing Architecture

4

Page 5: 6620handout5o

Agenda

5

• Cloud Applications

• Side-Channel Attacks

• Challenges and Solutions

• Ceiling Padding

Page 6: 6620handout5o

6

Web-Based Applications

Client ←(untrusted Internet)→ Server, with encryption

“Cryptography solves all security problems!” Really?

Page 7: 6620handout5o

7

Side-Channel Attack on Encrypted Traffic

Client ←(Internet)→ Server: encrypted traffic

User Input Observed Directional Packet Sizes

a: 801→, ←54, ←509, 60→

00: 812→, ←54, ←505, 60→,

813→, ←54, ←507, 60→

(b byte: request size; s byte: response size. The s value varies with the input.)

• Network packet sizes and directions between a user and a popular search engine.
• Collected by acting as a normal user and eavesdropping the traffic with Sniffer Pro 4.7.5.
• Collected in May 2012.

The s value is an indicator of the input itself; the fixed pattern identifies the input string.
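The attack on this slide amounts to a dictionary lookup: the eavesdropper pre-collects one trace of directional packet sizes per candidate input (by typing each input himself), then matches the observed trace. A minimal sketch; the “a” trace is taken from the slide, the “b” entry and the function names are illustrative assumptions:

```python
# Sketch of fingerprint matching on encrypted traffic.
# ">" = client-to-server, "<" = server-to-client.
fingerprints = {
    "a": [(801, ">"), (54, "<"), (509, "<"), (60, ">")],   # from the slide
    "b": [(802, ">"), (54, "<"), (504, "<"), (60, ">")],   # hypothetical entry
}

def identify(observed):
    """Return every candidate input whose fingerprint matches the observed trace."""
    return [ch for ch, fp in fingerprints.items() if fp == observed]

print(identify([(801, ">"), (54, "<"), (509, "<"), (60, ">")]))  # -> ['a']
```

Encryption hides the payload, but the (size, direction) sequence survives intact and serves as the key of this lookup.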

Page 8: 6620handout5o

8

Updated Patterns (Dec 2013)

Client ←(Internet)→ Server: encrypted traffic

User Input Observed Directional Packet Sizes

a: 590→, 67→, ←60, ←60, ←728, 60→

00: 590→, 67→, ←60, ←60, ←698, 60→,

590→, 68→, ←60, ←60, ←717, 60→


• Patterns may change over time, but attacks still work in similar ways.
• Patterns may differ across Web applications, but they are always there.

The s value is an indicator of the input itself; the fixed pattern identifies the input string.

Page 9: 6620handout5o

9

To Make Things Worse

• The “autocomplete” feature allows adversaries to combine the packets corresponding to multiple keystrokes.
• Web applications are highly interactive.

User Input Observed Directional Packet Sizes

a: 590→, 67→, ←60, ←60, ←728, 60→

00: 590→, 67→, ←60, ←60, ←698, 60→,

590→, 68→, ←60, ←60, ←717, 60→


Page 10: 6620handout5o

10

The Longer the Input, the More Unique the Pattern

• s value for each character entered:

a b c d e f g

509 504 502 516 499 504 502

h i j k l m n

509 492 517 499 501 503 488

o p q r s t

509 525 494 498 488 494

u v w x y z

503 522 516 491 502 501

• First and second keystroke s values (inputs a–d):

  First       Second keystroke
  keystroke     a    b    c    d
  a (509)     487  493  501  497
  b (504)     516  488  482  481
  c (502)     501  488  473  477
  d (516)     543  478  509  499

Unique s values: 12 out of 16 for the second keystroke alone; 16 out of 16 for both keystrokes combined.

The unique patterns leak users’ private information: the input string. In reality, it may take more than two keystrokes to uniquely identify an input string.
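The ambiguity reduction from combining keystrokes can be checked directly against the slide’s tables. A small sketch (the s values are transcribed from the tables above; the function name is illustrative):

```python
# Combining per-keystroke s values shrinks the candidate set of input strings.
first = {"a": 509, "b": 504, "c": 502, "d": 516}      # first-keystroke s values
second = {  # second[p][c] = s value of typing c after prefix p
    "a": {"a": 487, "b": 493, "c": 501, "d": 497},
    "b": {"a": 516, "b": 488, "c": 482, "d": 481},
    "c": {"a": 501, "b": 488, "c": 473, "d": 477},
    "d": {"a": 543, "b": 478, "c": 509, "d": 499},
}

def candidates(s1, s2):
    """All two-character strings consistent with the two observed s values."""
    return [p + c for p in first if first[p] == s1
                  for c in second[p] if second[p][c] == s2]

print(candidates(516, 499))  # -> ['dd']: two observations pin down the string
```

With inputs restricted to a–d, every (first, second) pair is unique, matching the slide’s “16 out of 16”.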

Page 11: 6620handout5o

Side-Channel Leaks

• To protect the information in critical applications against network sniffing, a common practice is to encrypt the network traffic. However, as this research discovered, serious information leaks are still a reality.

• Even though the communications generated during these state transitions are protected by HTTPS, their observable attributes can still give away information about the user’s selection.

• The eavesdropper cannot see the contents, but can observe the number of packets and the timing and size of each packet.

11

Slides 11-27 are partially based on: S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in web applications: A reality today, a challenge tomorrow. In IEEE Symposium on Security and Privacy’10, pages 191–206, 2010.

Page 12: 6620handout5o

Main Findings

• Analysis of the side-channel weakness in web applications.

• Several high-profile, very popular web applications disclose surprisingly detailed sensitive user information:
  • personal health data, family income, investment details, search queries.

• The root causes of the side-channel information leaks are fundamental characteristics of today’s web applications:
  • stateful communication, low-entropy input, and significant traffic distinctions.

• In-depth study of the challenges in mitigating the threat.

• Evaluation of the effectiveness and overhead of common mitigation techniques such as packet padding.

• Effective solutions to the side-channel problem have to be application-specific, relying on an in-depth understanding of the application being protected.

• This suggests the need for a significant improvement in the current practice of developing web applications.

12

Page 13: 6620handout5o

Fundamental Characteristics of Web Applications

The root causes are fundamental characteristics of today’s web applications:

• Low-entropy inputs for better interaction.
  • Small input space
  • Autosuggestion, autocomplete

• Stateful communications.
  • Transitions to the next state depend on both the current state and its input.
  • Although the information in each transition may be insignificant, their combination can be very powerful.

• Significant traffic distinctions.
  • The chance of two different user actions having the same traffic pattern is very small.
  • Such distinctions often come from the objects updated by client-server data exchanges.

13

Page 14: 6620handout5o

Significant traffic distinctions

14

Page 15: 6620handout5o

Basics of Wi-Fi Encryption Schemes

• WEP: susceptible to key-recovery attacks
• WPA: TKIP (RC4)
• WPA2: CCMP (128-bit AES block cipher in counter mode)

The ciphertext fully preserves the size of its plaintext!

15

Page 16: 6620handout5o

Scenario: search over encrypted Wi-Fi (WPA/WPA2). Example: a user types “list” on a WPA2 laptop.

Consequence: anybody on the street learns our search queries.

Attacker’s effort: linear, not exponential.

Observed per-keystroke sizes (request → response):
821 → 910, 822 → 931, 823 → 995, 824 → 1007
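The linear-vs-exponential point is simple arithmetic: because each keystroke of “list” triggers its own observable response, the attacker tests candidates position by position instead of guessing whole strings.

```python
# Per-keystroke leakage makes the search additive, not multiplicative.
n = 4             # length of "list"
alphabet = 26     # candidates per keystroke

linear = alphabet * n        # test 26 candidates at each of the 4 positions
exponential = alphabet ** n  # test every 4-letter string if nothing leaked per keystroke
print(linear, exponential)   # -> 104 456976
```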

16

Page 17: 6620handout5o

OnlineHealthA (“A” denotes a pseudonym)

• A web application by one of the most reputable online-service companies.

• Illness/medication/surgery information is leaked, as well as the type of doctor being queried.

• Vulnerable designs:
  • Entering health records
    • By typing: auto-suggestion
    • By mouse selection: a tree-structured organization of elements
  • Finding a doctor
    • Using a dropdown list item as the search input

17

Page 18: 6620handout5o

Attacker’s power:

• Entering health records: whether by keyboard typing or mouse selection, the attacker gains a 2000× ambiguity-reduction power.

• Find-A-Doctor: the attacker can uniquely identify the specialty.

Page 19: 6620handout5o

OnlineTaxA

• The online version of one of the most widely used U.S. tax-preparation applications.

• Design: a tax-preparation wizard
  • Tailors the conversation based on the user’s previous input.

• The forms you work on tell a lot about your family:
  • Filing status
  • Number of children
  • Whether you paid a big medical bill
  • Adjusted gross income (AGI)

19

Page 20: 6620handout5o

Entry page of Deductions & Credits → (Full credit | Not eligible | Partial credit) → Summary of Deductions & Credits

All transitions have unique traffic patterns.

Consult the IRS instructions: $1,000 for each child; phase-out starts at $110,000, and for every $1,000 of income you lose $50 of credit.

Child credit state machine (two-children scenario):

  $0 ── Full credit ── $110,000 ── Partial credit ── $150,000 ── Not eligible

20

Page 21: 6620handout5o

(Same transition diagram: entry page of Deductions & Credits → Full credit / Not eligible / Partial credit → Summary of Deductions & Credits.)

Even worse, most decision procedures for credits/deductions have asymmetric paths:
• Eligible: more questions (e.g., “Enter your paid interest”)
• Not eligible: no more questions

Student-loan-interest credit:

  $0 ── Full credit ── $115,000 ── Partial credit ── $145,000 ── Not eligible

21

Page 22: 6620handout5o

A subset of identifiable AGI thresholds:

• Disabled Credit: $24,999
• Earned Income Credit: $41,646
• Retirement Savings: $53,000
• IRA Contribution: $85,000 – $105,000
• Child credit*: $110,000
• Student Loan Interest: $115,000 – $145,000
• College Expense: $116,000 – $130,000 or $150,000 or $170,000 …
• First-time Homebuyer credit: $150,000 – $170,000
• Adoption expense: $174,730 – $214,780

• We are not tax experts.
• OnlineTaxA can find more than 350 credits/deductions.

22

Page 23: 6620handout5o

OnlineInvestA: a major financial institution in the U.S.

Which funds do you invest in? No secret:
• Each price-history curve is a GIF image from MarketWatch.
• Everybody in the world can obtain the images from MarketWatch.
• Just compare the image sizes!

Your investment allocation:
• Given only the size of the pie chart, can we recover it?
• Challenge: hundreds of pie charts collide on the same size.

23

Page 24: 6620handout5o

Inference based on the evolution of the pie-chart size over 4–5 days

• The financial institution updates the pie chart every day after the market closes.

• The mutual-fund prices are public knowledge.

(Figure: starting from ≈80,000 candidate charts, the size of day 1 leaves ≈800 charts; adding day 2’s size and the day’s prices leaves ≈80; day 3 leaves ≈8; day 4 leaves a single chart.)

24

Page 25: 6620handout5o

Challenging to Mitigate the Vulnerabilities

25

Page 26: 6620handout5o

• Traffic differences are everywhere. Which ones result in serious data leaks?
  • Need to analyze the application semantics, the availability of domain knowledge, etc.
  • Hard.

• Is there a vulnerability-agnostic defense that fixes the vulnerabilities without finding them?
  • Obviously, padding is a must-do strategy.
  • We found that even for the discussed apps, the defense policies have to be case-by-case.

Why challenging?

26

Page 27: 6620handout5o

• See whether the problem can be solved without analyzing individual applications.

• Application-agnostic manner: padding
  • Rounding:
  • Random padding:
  • Average overhead:

• Given the padding parameter, the reduction power is calculated after padding.

Universal Mitigation Policies

27

Any problem?

Page 28: 6620handout5o

Agenda

28

� Cloud Applications

� Side-Channel Attacks

� Challenges and Solutions

� Ceiling Padding

Page 29: 6620handout5o

29

Trivial problem?

Page 30: 6620handout5o

30

Don’t Forget the Cost

(Prefix) char   s Value   Rounding (Δ = 64 / 160 / 256)
(c) c           473       512    480    512
(c) d           477       512    480    512
(d) b           478       512    480    512
(d) d           499       512    640    512
(a) c           501       512    640    512
(b) a           516       576    640    768
Padding overhead (%):     6.5%   14.1%  13.0%

1 S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in web applications: A reality today, a challenge tomorrow. In IEEE Symposium on Security and Privacy’10, pages 191–206, 2010.

• No guarantee of better privacy at a higher cost:
  • Δ↑ ⇏ privacy↑
  • Δ↑ ⇏ overhead↑

• Making all inputs indistinguishable by rounding would incur a 21074% overhead for a well-known online tax system¹.
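The non-monotonicity can be reproduced with the slide’s six s values. A sketch (`round_up` is an illustrative helper, not part of the paper):

```python
import math

# Rounding pads each size up to the next multiple of delta. With these six
# s-values, a larger delta does NOT monotonically increase privacy or overhead.
sizes = [473, 477, 478, 499, 501, 516]

def round_up(s, delta):
    return math.ceil(s / delta) * delta

for delta in (64, 160, 256):
    padded = [round_up(s, delta) for s in sizes]
    overhead = (sum(padded) - sum(sizes)) / sum(sizes)
    print(delta, padded, f"{overhead:.1%}")
# delta=64:  overhead 6.5%, but 516 pads alone to 576 (a singleton, so it leaks)
# delta=160: overhead 14.1%, two groups of three (better privacy, higher cost)
# delta=256: overhead 13.0%, cheaper than delta=160 despite the larger delta
```

This matches the table: Δ = 160 both costs more than Δ = 256 and groups the inputs differently, so neither privacy nor overhead grows monotonically with Δ.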

Page 31: 6620handout5o

Two Conflicting Goals

31

• To prevent side-channel attacks, we face two seemingly conflicting goals:

  • Privacy protection: reduce the differences in packet sizes.

  • Cost: minimize the overhead (communication, processing, …).

• The similar goals may allow us to borrow existing expertise from privacy-preserving data publishing (PPDP).

How? Grouping and breaking.

Slides 31-45 are partially based on: W. M. Liu, L. Wang, K. Ren, P. Cheng, M. Debbabi, S. Zhu, “PPTP: Privacy-Preserving Traffic Padding in Web-Based Applications,” IEEE Trans. on Dependable and Secure Computing (TDSC).

Page 32: 6620handout5o

Solution: Ceiling Padding

32

(Prefix) char   S Value   Padding Option 1   Padding Option 2
(c) c           473       477                478
(c) d           477       477                478
(d) b           478       499                478
(d) d           499       499                516
(a) c           501       516                516
(b) a           516       516                516

PPDP analogy: S Value ↔ quasi-identifier; padding options ↔ generalization functions 1 and 2; (Prefix) char ↔ sensitive attribute; PPTP padding group ↔ PPDP anonymized group.

• PPTP goals: privacy, cost

• PPDP goals: privacy, data utility

• Ceiling padding: pad every packet to the maximum size in its group.

So we can apply existing techniques from data publication to achieve ceiling padding.

However, there are a few differences, and hence challenges...
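Ceiling padding as just defined is a few lines of code. A minimal sketch using the slide’s Option 1 grouping (the function name is illustrative):

```python
# Ceiling padding: every flow size in a group is padded to the group's
# maximum ("dominant") size, so members of a group become indistinguishable.
def ceiling_pad(groups):
    """Map each original size to the maximum size of its group."""
    return {s: max(g) for g in groups for s in g}

# Option 1 from the slide: three groups of two (2-indistinguishability).
option1 = [[473, 477], [478, 499], [501, 516]]
padded = ceiling_pad(option1)
print(padded[473], padded[478], padded[501])  # -> 477 499 516
```

With Option 2’s grouping ([[473, 477, 478], [499, 501, 516]]) the same function yields 3-indistinguishability at a different cost, which is exactly the trade-off the next slides explore.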

Page 33: 6620handout5o

33

PPTP Components

Internet

� Interaction:

� action a:� Atomic user input that triggers traffic� A keystroke, a mouse click …

� action-sequence �� :� A sequence of actions with complete input info� Consecutive keystrokes…

� action-set Ai:� Collection of all ith actions in a set of action-seq

� Observation:

� flow-vector v:� A sequence of flows (directional packet sizes)� Triggered an action

� vector-sequence ��:� A sequence of flow-vectors� Triggered by an equal-length action-sequence

� vector-set Vi:� Collection of all ith vectors in a set of vector-seq

� Vector-Action Set VAi:

� Pairs of ith actions and corresponding ith flow-vectors

User Input Observed Directional Packet Sizes

a: 801→, ←54, ←509, 60→

00: 812→, ←54, ←505, 60→,

813→, ←54, ←507, 60→

Page 34: 6620handout5o

34

Privacy and Cost

• k-indistinguishability: given a vector-action set VA,

  • Padding group: any S ⊆ VA such that all pairs in S have identical flow-vectors and no S′ ⊃ S satisfies this property.

  • VA satisfies k-indistinguishability (k an integer) if the cardinality of every padding group is no less than k.

• Goal of privacy protection:

  • Upon observing any flow-vector in the traffic, the eavesdropper cannot determine which action in the vector-action set triggered it.

• l-diversity:

  • Addresses the cases where not all inputs should be treated equally in padding (for example, some statistical information about the likelihood of different inputs may be publicly known).

Page 35: 6620handout5o

35

Privacy and Cost

• Vector-distance:

  • Given two equal-length flow-vectors v1 and v2, the vector-distance is the total number of bytes by which their flows differ: dist(v1, v2) = Σᵢ |v1[i] − v2[i]|.

• Padding cost:

  • Given a vector-set V, the padding cost is the sum of the vector-distances between each flow-vector in V and its counterpart after padding.

• Processing cost:

  • Given a vector-set V, the processing cost is the number of flows in V whose corresponding packets must be padded.
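Both cost metrics are easy to make concrete on the earlier six-value example (function names are illustrative):

```python
# Padding cost: total bytes added. Processing cost: how many flows grew.
def padding_cost(original, padded):
    return sum(p - o for o, p in zip(original, padded))

def processing_cost(original, padded):
    return sum(1 for o, p in zip(original, padded) if p > o)

orig = [473, 477, 478, 499, 501, 516]
pad  = [477, 477, 499, 499, 516, 516]   # Option 1 grouping from the earlier slide
print(padding_cost(orig, pad), processing_cost(orig, pad))  # -> 40 3
```

Option 1 adds 40 bytes in total and touches 3 of the 6 flows; the algorithms later in the deck try to minimize exactly these two quantities.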

Page 36: 6620handout5o

Agenda

36

� Cloud Applications

� Side-Channel Attacks

� Challenges and Solutions

� Ceiling Padding

Page 37: 6620handout5o

Challenge 1

37

(Prefix) char   S Value   Padding Option 1   Padding Option 2
(c) c           473       477                478
(c) d           477       477                478
(d) b           478       499                478
(d) d           499       499                516
(a) c           501       516                516
(b) a           516       516                516

Differences and challenges:

• Data-utility measures vs. padding cost:

  • Traffic padding: the cost of Option 1 is worse than that of Option 2.

  • Data publication: the utility of Function 1 is better than that of Function 2.

Page 38: 6620handout5o

38

Challenge 2

• Recall that adversaries may combine multiple keystrokes.

• Example: second-keystroke s values (inputs a–d):

  First \ Second    a    b    c    d
  a                 487  493  501  497
  b                 516  488  482  481
  c                 501  488  473  477
  d                 543  478  509  499

• One obvious, but invalid, solution: pad every keystroke (separately).

• Another obvious, but invalid, solution: pad on the whole string!

The slide illustrates both with padded versions of the table above (elided entries shown as “…”), e.g. first-keystroke s values padded a: 509 → 516 and c: 502 → 504, and with per-string observations:

  Strings   1st keystroke   2nd keystroke
  ac        509             501
  ca        502             501
  ad        509             497
  dd        516             499
  …         …               …

Page 39: 6620handout5o

39

PPTP - Overview of Algorithms

• Intention:
  • To demonstrate the existence of abundant possibilities for approaching the PPTP issue, not to design an exhaustive list of solutions.

• Three algorithms for partitioning inputs into padding groups.
  • Main difference: the algorithms handle increasingly complicated cases.
  • Computational complexity:
    • svsdSimple: O(n log n)
    • svmdGreedy: O(m × n²) (worst case), O(m × n log n) (average case)
    • mvmdGreedy: O(m × n²) (worst case), O(m × n log n) (average case)

Page 40: 6620handout5o

40

Experiment Settings

• Collect data from two real-world web applications:

  • A popular search engine (users’ search keywords need to be protected):
    collect flow-vectors of the query-suggestion widget for all possible combinations of four letters, by crafting requests that simulate the normal AJAX connection requests.

  • An authoritative drug information system from a national institute (users’ possible health information needs to be protected):
    collect the vector-action set for all the drug information by mouse selection, following the application’s three-level tree-hierarchical navigation.

Page 41: 6620handout5o

41

Overhead - Padding Cost

• The padding cost against k:

  • To compare with rounding: Δ = 512 (engineB) and Δ = 5120 (drugB), which achieve only 5-indistinguishability.
  • Our algorithms have less padding cost in both cases.
  • Our algorithms are especially superior when the number of flow-vectors is larger.

Page 42: 6620handout5o

42

Overhead – Execution Time

• Generate n-size flow data by synthesizing n/|VA| copies of engineB and drugB.

• The computation time of mvmdGreedy increases slowly with n:
  • practically efficient (1.2 s for 2.7 M flow-vectors),
  • requires slightly more overhead than rounding applied with a single Δ value.

• The computation time of mvmdGreedy against the privacy property k:
  • A tighter upper bound: O(m × n × 2kλ) (worst case), O(m × n × log(2kλ)) (average case).
  • The computation time increases slowly with k for engineB, and decreases slowly for drugB.

Page 43: 6620handout5o

43

Overhead – Processing Cost

• An application can choose to incorporate the padding at different stages of processing a request; however, we must minimize the number of packets to be padded.
  • Pad the flow-vectors on the fly, or
  • modify the original data beforehand.

• The processing cost against k:
  • Rounding must pad every flow-vector regardless of k and the application, while our algorithms have much less cost for engineB and slightly less for drugB.

Page 44: 6620handout5o

Extension

44

• Adapt l-diversity to address cases where:
  • not all inputs should be treated equally in padding.

• Model:
  • captures the information about the inequality.

• Algorithms:
  • need additional constraints on the partition.

Page 45: 6620handout5o

45

Experiments

• Collect data from two real-world web applications:

  • Another popular search engine (users’ search keywords need to be protected).

  • An authoritative patent information system from a national institute (companies’ patent interests need to be protected).

Page 46: 6620handout5o

46

Challenge 3: Ceiling Padding Defeated

Condition     s Value   Rounding (Δ = 112 / 144 / 176)   Ceiling Padding
Cancer        360       448    432    528                360
Cervicitis    290       336    432    352                360
Cold          290       336    432    352                290
Cough         290       336    432    352                290
Padding overhead (%):   18.4%  40.5%  28.8%              5.7%

(Ceiling padding shown with 2-indistinguishability.)

• Observation: a patient received a 360-byte packet after login.
  • Cancer? Cervicitis? ⇒ 50%, 50%
  • Extra knowledge: this patient is male.
  • Cancer? Cervicitis? ⇒ 100%, 0%

• Facts (ceiling padding):

Slides 46-58 are partially based on: W. M. Liu, L. Wang, K. Ren, M. Debbabi, “Background Knowledge-Resistant Traffic Padding for Preserving User Privacy in Web-Based Applications,” Proc. of the 5th IEEE International Conference on Cloud Computing Technology and Science (IEEE CloudCom 2013).

Page 47: 6620handout5o

47

Solution: Add Randomness

Condition     s Value
Cancer        360
Cervicitis    290
Cold          290
Cough         290

• Random ceiling padding: instead of deterministically forming padding groups, the server randomly (uniformly, in this example) selects one of the other three conditions, together with the real condition, to form a padding group for ceiling padding.

• A cancerous person always receives a 360-byte packet.

• A cervicitis patient receives a 290-byte packet 66.7% of the time and a 360-byte packet 33.3% of the time.
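A quick simulation reproduces the 1/3 vs 2/3 split described above (helper names are illustrative; the uniform choice and k = 2 follow the slide’s example):

```python
import random

# Random ceiling padding: per request, draw k-1 other conditions uniformly
# and pad the response to the transient group's maximum size.
SIZES = {"Cancer": 360, "Cervicitis": 290, "Cold": 290, "Cough": 290}

def respond(condition, k=2, rng=random):
    others = rng.sample([c for c in SIZES if c != condition], k - 1)
    return max(SIZES[c] for c in [condition] + others)

# A Cancer patient always receives 360 bytes; a Cervicitis patient receives
# 360 only when Cancer is drawn into the transient group (probability 1/3).
rng = random.Random(0)
obs = [respond("Cervicitis", rng=rng) for _ in range(30000)]
print(round(obs.count(360) / len(obs), 2))  # close to 0.33
```

The observed size is now a random variable, which is what blunts background knowledge in the next slide.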

Page 48: 6620handout5o

48

Better Privacy Protection

Diseases      s Value
Cancer        360
Cervicitis    290
Cold          290
Cough         290

• Can tolerate adversaries’ extra knowledge.

• Suppose an adversary knows a patient is male and observed s = 360. The (patient has, server selects) pairs consistent with that observation are:

  Patient has   Server selects
  Cancer        Cervicitis
  Cancer        Cold
  Cancer        Cough
  Cold          Cancer
  Cough         Cancer

  The adversary can now only be 60%, instead of 100%, sure that the patient has Cancer.

• The cost is not necessarily worse:

  • In this example, the two methods lead to exactly the same expected padding and processing costs.
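The 60% figure follows from Bayes’ rule over the table above, assuming a uniform prior over the three conditions a male patient can have and a uniform server choice (both assumptions consistent with the slide’s example):

```python
from fractions import Fraction

# Posterior Pr(Cancer | male, s = 360) under uniform prior and uniform
# server selection: Cancer always yields 360; Cold and Cough yield 360
# only when Cancer is the randomly selected group member (prob. 1/3).
prior = Fraction(1, 3)
likelihood = {"Cancer": Fraction(1), "Cold": Fraction(1, 3), "Cough": Fraction(1, 3)}

joint = {c: prior * l for c, l in likelihood.items()}
posterior = joint["Cancer"] / sum(joint.values())
print(posterior)  # -> 3/5
```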

Page 49: 6620handout5o

Analysis

49

• Scenario:
  • Algorithm: randomness drawn from a uniform distribution.
  • Data: action-sequences and flow-vectors of length one.

• Analysis of privacy preservation:
  • Lemma 6.1

• Analysis of costs:
  • Lemma 6.2, Lemma 6.3

Page 50: 6620handout5o

Model, Scheme and Experiment

50

• Model:
  • component, privacy, method, cost

• Scheme:
  • Main idea:
    • The server randomly selects members to form the group.
    • Different choices of random distribution lead to different algorithms.
  • Two instantiations of the scheme:
    • bounded uniform distribution; normal distribution
    • computational complexity: O(k)

• Experiment:
  • Data from real-world web applications
  • Low overheads and high uncertainty

Page 51: 6620handout5o

51

Privacy Properties

Model the privacy requirement of traffic padding from two perspectives:

• k-indistinguishability:

  • For any flow-vector, at least k different actions can trigger it.

  • Given a vector-action set VA and a padding algorithm M with range Range(M, VA):

    ∀ v ∈ Range(M, VA): |{ a : Pr(M(a) = v) > 0 ∧ (a, ·) ∈ VA }| ≥ k

• Uncertainty:

  • Apply the concept of entropy from information theory to quantify an adversary’s uncertainty about the real action performed by a user.

  • Given a vector-action sequence VA and padding algorithm M:

    H(v, VA, M) = − Σ_{a∈A} Pr(M⁻¹(v) = a) · log₂ Pr(M⁻¹(v) = a)

    H(VA, M) = Σ_{v∈Range(M,VA)} H(v, VA, M) · Pr(M(VA) = v)

    Φ(VA, M) = min_{VA∈VA} H(VA, M)

Page 52: 6620handout5o

52

Privacy Properties (cont.)

• δ-uncertain k-indistinguishability:

An algorithm M gives δ-uncertain k-indistinguishability for a vector-action sequence VA if

• M satisfies k-indistinguishability w.r.t. every VA ∈ VA, and

• the uncertainty of M w.r.t. VA is not less than δ.

Page 53: 6620handout5o

53

Padding Method

• Ceiling padding [15][16]:

  • Inspired by PPDP: grouping and breaking.

  • Dominant vector of a padding group: the size of each group is no less than k, and every flow-vector in a group is padded to the dominant vector of that group.

  • Achieves k-indistinguishability, but is not sufficient if the adversary possesses prior knowledge.

• Random ceiling padding method:

  • A mechanism M: when responding to an action a (per user request),

    • it randomly selects k−1 other actions, and

    • pads the flow-vector of action a to the dominant vector of the transient group (those k actions).

  • Randomness:

    • Randomly select the members of the transient group from certain candidates based on certain distributions.

    • To reduce cost, change the probability of an action being selected as a member of the transient group.

Page 54: 6620handout5o

54

Cost Metrics

• Expected processing cost:

  • How many flow-vectors need to be padded:

    pcost(VA, M) = ( Σ_{VA∈VA} Σ_{(a,v)∈VA} Pr(M(a) ≠ v) ) / ( Σ_{VA∈VA} |VA| )

• Expected padding cost:

  • The proportional increase in packet sizes compared to the original flow-vectors.

  • Given a vector-action sequence VA and padding algorithm M, for a pair (a, v):

    bcost(a, VA, M) = Σ_{v′∈Range(M,VA)} Pr(M(a) = v′) × (|v′| − |v|) / |v|

  • Summed over a vector-action set and over the sequence:

    bcost(VA, M) = Σ_{(a,v)∈VA} bcost(a, VA, M),   bcost(VA, M) = Σ_{VA∈VA} bcost(VA, M)

Page 55: 6620handout5o

55

Overview of Scheme

• Main idea:

  • To respond to a user input, the server randomly selects members to form the group.
  • Different choices of random distribution lead to different algorithms.

• Goal:

  • The privacy properties need to be ensured.
  • The costs of achieving such privacy protection should be minimized.

• Two-stage scheme:

  • Stage 1: derive the randomness parameters (one-time, an optimization problem);
  • Stage 2: form the transient group (real-time).

Page 56: 6620handout5o

56

Scheme (cont.)

• Computational complexity: O(k)

  • Stage 1: pre-calculated only once.
  • Stage 2: select k−1 distinct random actions, O(k).

• Discussion on privacy:

  • The adversary cannot collect the vector-action set even by acting as a normal user:

    |VA| = 100, k = 20, uniform ⟹ number of possible transient groups = C(99, 19) ≈ 2^66

  • Approximating the distribution is hard: all users share one random process.

• Discussion on costs:

  • Deterministically incomparable with those of ceiling padding.

Page 57: 6620handout5o

57

Instantiations of Scheme

• Bounded uniform distribution:

  • ct: cardinality of the candidate set
  • cl: number of larger candidates

• The scheme can be realized in many different ways:

  • choose group members from different subsets of candidates, based on different distributions, in order to reduce costs.

• Normal distribution:

  • µ: mean
  • σ: standard deviation

(Figure: selection probability over actions indexed from largest to smallest flow size; the bounded uniform case uses a window of ct candidates containing cl larger ones around action i, the normal case a bell curve centered near action i.)

Candidate index range for action i:

[max(0, min(i − cl, |VA| − ct)), min(max(0, min(i − cl, |VA| − ct)) + ct, |VA|)]
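The window formula can be sketched directly. A minimal illustration, assuming actions are indexed 0…n−1 from largest flow size (matching the figure’s “largest → smallest” axis); the function name is illustrative:

```python
# Bounded-uniform candidate window: for the i-th action (sorted by size),
# the transient-group candidates are ct consecutive positions containing at
# most cl larger ones, clipped to the table boundaries.
def candidate_window(i, n, ct, cl):
    lo = max(0, min(i - cl, n - ct))   # window start, clipped to [0, n - ct]
    hi = min(lo + ct, n)               # window covers ct positions at most
    return range(lo, hi)

# n = 10 actions, window of ct = 4 with cl = 1 larger candidate:
print(list(candidate_window(5, 10, 4, 1)))  # -> [4, 5, 6, 7]
```

Picking group members from a window around the real action (rather than from all of VA) is what keeps the expected padding cost low while preserving randomness.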

Page 58: 6620handout5o

58

Uncertainty and Costs vs. k

• The padding and processing costs of all algorithms increase with k, while TUNI and NORM incur lower costs than SVMD.

• Our algorithms yield much larger uncertainty for Drug and slightly larger for Engine.