An Analysis of P3P Deployment Hyun Jin Kim Sensitive Information in a Wired World November 11, 2003
Page 1: Kim's Presentation

An Analysis of P3P Deployment

Hyun Jin Kim

Sensitive Information in a Wired World

November 11, 2003

Page 2: Kim's Presentation

Introduction

Privacy Policies
  US self-regulatory approach to online privacy protection
  Description of a company’s data practices
    What information they collect from individuals and what they do with it

Page 3: Kim's Presentation

P3P Specifications

Developed by the World Wide Web Consortium (W3C) over 5 years of work

Became an official W3C “Recommendation” just over a year ago on April 16, 2002

Page 4: Kim's Presentation

P3P Specifications

Page 5: Kim's Presentation

P3P Evaluation System Design

Automated process to measure P3P adoption and gather data from P3P-enabled web sites

By Lorrie Faith Cranor, Simon Byers, and David Kormann (AT&T Labs-Research)

Five major components
  URL Collection Mechanism
  P3P Policy Retriever
  Scripted Interface to the W3C P3P Validator
  P3P Policy Evaluator
  Generic Data Analysis Tools

Page 6: Kim's Presentation

URL Collector

To identify sets of sites of interest
  Existing lists of URLs
  Newly constructed lists that focus on particular web sites

Web spidering technique
  Gather information from web directories and other sources
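As a rough illustration of the spidering idea, the sketch below pulls the distinct host names out of the links on a directory page. It is written in Python for illustration and is not the collector used in the study; the class name `LinkCollector` and the function `collect_hosts` are hypothetical, and the page content is passed in as a string so no network access is assumed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the host names of absolute links found in <a href=...> tags."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links have no hostname and are skipped.
                host = urlparse(value).hostname
                if host:
                    self.hosts.add(host.lower())


def collect_hosts(html: str) -> list:
    """Return the sorted, de-duplicated host names linked from one page."""
    parser = LinkCollector()
    parser.feed(html)
    return sorted(parser.hosts)
```

A real collector would also need to fetch the pages, follow pagination, and filter out the directory's own host; those steps are left out here.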

Page 7: Kim's Presentation

P3P Policy Retriever

Perl script to retrieve P3P information
  All policies, policy reference files, compact header policies
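To make the three kinds of P3P information concrete, here is a minimal sketch of where a retriever looks. P3P 1.0 defines a well-known location for the policy reference file (`/w3c/p3p.xml`) and a `P3P` HTTP response header that can carry a `policyref` URI and a compact policy (`CP`). The sketch is in Python rather than the Perl used in the study, the function names are hypothetical, and it only parses strings; fetching over HTTP is deliberately left out.

```python
import re
from urllib.parse import urljoin

# Well-known location of the policy reference file in P3P 1.0.
WELL_KNOWN_PATH = "/w3c/p3p.xml"

# Matches policyref="..." and CP="..." fields in a P3P header value.
_FIELD = re.compile(r'(policyref|CP)\s*=\s*"([^"]*)"', re.IGNORECASE)


def well_known_policy_ref(site_url: str) -> str:
    """Build the well-known policy reference file URL for a site."""
    return urljoin(site_url, WELL_KNOWN_PATH)


def parse_p3p_header(value: str) -> dict:
    """Split a P3P header value into its policyref URI and compact policy tokens."""
    result = {"policyref": None, "compact_policy": []}
    for name, content in _FIELD.findall(value):
        if name.lower() == "policyref":
            result["policyref"] = content
        else:
            result["compact_policy"] = content.split()
    return result
```

For example, `parse_p3p_header('policyref="/w3c/p3p.xml", CP="NOI DSP COR"')` yields the reference URI and the three compact policy tokens as a list.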

Page 8: Kim's Presentation

P3P Validator

W3C P3P Validator
  Fetches P3P policy reference files, policy files and compact policies
  Checks them for compliance with the P3P 1.0 Specification
  Stops validation upon encountering an error

Scripted interface to the W3C P3P Validator
  Retrieve P3P policies from sites with errors in their policy reference files

Page 9: Kim's Presentation

P3P Policy Evaluator

Compares a web site’s policy with a user’s privacy preferences

Finds a mismatch between the P3P policy and the privacy preferences

Page 10: Kim's Presentation

Data Analysis

Outputs of policy evaluations gathered in a rectangular matrix
  Row: policy from a web site
  Column: APPEL rule set file

Run a Perl script over the matrix

Produce various tabulations
  e.g., number of sites that returned a mismatch between privacy preferences and P3P policies
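The tabulation step can be sketched as follows. This is an illustrative Python version, not the Perl script from the study; the site names, rule set names, and results below are made-up placeholders, and `count_mismatches` is a hypothetical helper.

```python
# Hypothetical evaluation output: one row per site policy,
# one column per APPEL rule set. True means a mismatch was reported.
RULE_SETS = ["ruleset_1", "ruleset_2", "ruleset_3"]

results = {
    "site-a.example": [False, False, True],
    "site-b.example": [False, True, True],
    "site-c.example": [True, True, True],
}


def count_mismatches(matrix: dict, columns: list) -> dict:
    """Count, for each rule set (column), how many sites (rows) mismatched."""
    counts = {column: 0 for column in columns}
    for row in matrix.values():
        for column, mismatch in zip(columns, row):
            if mismatch:
                counts[column] += 1
    return counts
```

Keeping the results in a rectangular layout means any per-column or per-row summary is a single pass over the matrix.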

Page 11: Kim's Presentation

Web Site Selection

Focus on the sites frequently visited by users

PFF Most Popular
  85 of the 100 busiest sites determined by the October 2001 Nielsen/NetRatings ranking of sites with the most unique visitors per month
  Excludes adult sites, children’s sites, business-to-business sites, and sites not in the .com top level domain

PFF Random
  Random sample of 302 of the 7,821 domains with at least 39,000 unique monthly visitors in October 2001 by Nielsen/NetRatings

PFF Refined Random
  209 domains from the PFF Random list that were in the top 5,625 domains in October 2001 by Nielsen/NetRatings
  Excludes adult sites, children’s sites, business-to-business sites, and non-dot-coms

Netscore Top 500
  500 domains with the most unique visitors during July 2002 by comScore Media Metrix netScore Standard Traffic Measurement report

Key Measures
  Top 500 domains with the most unique visitors during July 2002 by comScore Media Metrix Key Measures report
  Includes “third-party” sites

Page 12: Kim's Presentation

Web Site Selection (Cont.)

Alexa
  Top 500 domains by Alexa Traffic Ranking on Feb. 4, 2003
  Includes non-US domains and adult sites

Froogle
  1,017 sites obtained by crawling the www.froogle.com web site in April 2003
  Sites offer products for sale

Yahooligans
  900 sites obtained by crawling www.yahooligans.com in April 2003
  Sites for children ages 7-12

Firstgov
  344 government sites indexed at www.firstgov.gov in April 2003
  Includes US federal and state government sites and sites for some quasi-government organizations

News
  2,429 sites listed by news.google.com in April 2003
  Includes a variety of news-reporting organizations from the US and other countries

Page 13: Kim's Presentation

P3P Adoption as of May 2003

Page 14: Kim's Presentation

P3P Adoption (Cont.)

P3P adoption increasing over time

Highest for the most popular web sites

Key Measures site list higher than Netscore
  Presence of “third-party” sites
  To avoid having their cookies blocked by IE6

Alexa top 500 list lowest
  International nature
  Large number of adult sites

One third of the P3P-enabled sites had errors flagged by the W3C P3P Validator
  7% had errors that prevented their evaluation by the Privacy Bird evaluation engine
    Omitting required components of a P3P policy
    Improperly referencing data elements

Page 15: Kim's Presentation

Privacy Bird Evaluation

Definition of not sharing data
  Sites share data only with agents that use it only to complete the transaction for which it was provided or with delivery companies
  Data sharing occurs only under an opt-in policy

3 standard settings

Low
  Triggers a red bird (policy does not match the preferences) when a site
    Collects health/medical info and
      Shares it with other companies
      Uses it for analysis, marketing, or to make decisions about what content or ads the user sees
    Engages in marketing but does not provide a way to opt out

Page 16: Kim's Presentation

Privacy Bird Evaluation (Cont.)

Medium
  Same as low, plus
  Sites sharing PII (physical contact info, online contact info, government-issued identifier), financial info, or purchase info with other companies
  Sites collecting PII but providing no access provisions

High
  Same as medium, plus
  Sites sharing any personal info (including non-identified info) with other companies
  Sites using it to determine the user’s habits, interests, or other characteristics
  Sites contacting users for marketing
  Sites using financial or purchase info for analysis, marketing, or to make decisions that may affect what content or ads the user sees
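The cumulative structure of the three settings (each level keeps the triggers of the level below and adds its own) can be sketched as set unions. This is a deliberately simplified Python model: the real evaluation matches rule sets against P3P policy files, whereas here a policy is reduced to a set of practice labels. All label names and the `evaluate` function are hypothetical.

```python
# Practices that trigger a red bird at each setting (hypothetical labels
# summarizing the slide text; not the actual rule sets).
LOW = {
    "health_info_shared",
    "health_info_used_for_marketing_or_analysis",
    "marketing_without_opt_out",
}
MEDIUM = LOW | {
    "pii_or_financial_or_purchase_info_shared",
    "pii_collected_without_access",
}
HIGH = MEDIUM | {
    "any_personal_info_shared",
    "profiling_of_habits_or_interests",
    "contacts_users_for_marketing",
    "financial_or_purchase_info_used_for_marketing_or_analysis",
}

SETTINGS = {"low": LOW, "medium": MEDIUM, "high": HIGH}


def evaluate(practices: set, setting: str) -> str:
    """Return "red" if any declared practice conflicts with the setting, else "green"."""
    conflicts = practices & SETTINGS[setting]
    return "red" if conflicts else "green"
```

Under this model a site that only contacts users for marketing gets a green bird on the low and medium settings and a red bird on the high setting, which mirrors how one policy can pass one setting and fail a stricter one.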

Page 17: Kim's Presentation

Privacy Bird Evaluation (Cont.)

Page 18: Kim's Presentation

Privacy Bird Evaluation (Cont.)

Red bird on 24% of the evaluated sites
  No opt-out of marketing and/or telemarketing offered

Most popular sites receive both a green bird on the low setting and a red bird on the high setting
  Green bird: greater awareness of the importance of the “choice” principle
  Red bird: most offer rich e-commerce environments that rely heavily on targeted marketing and profiling visitors

Red birds on Froogle and Yahooligans most likely
  Collect health and medical info

Page 19: Kim's Presentation

Types of Data Collected

Page 20: Kim's Presentation

Types of Data Collected (Cont.)

Most collected data
  Computer info and click stream info
    HTTP protocol used for retrieving content from website
  Demographic data
    Less by Froogle and gov’t web sites
  Online contact info, physical contact info, interactive data, unique ids
    Mostly by news web sites
  Preference info, purchase info, and state management info (cookies)

Fewer collected financial info (excludes purchase process)

Least collected data
  Content (email msgs, bulletin board postings, etc.)
  Government-issued identifiers
  Health information
  Political information
  Location information (e.g., GPS positioning data)
  Information not falling into any other pre-defined categories

No government websites collect government-issued identifiers

Page 21: Kim's Presentation

Data Usage

Page 22: Kim's Presentation

Data Usage (Cont.)

Almost all websites used data for
  Completion and support of the activity for which data was provided
  Web site and system administration
  Research and development

Majority of sites used data for
  Email and postal mail marketing
  One-time tailoring of the site content
  Two forms of pseudonymous profiling

Fewer sites used data for
  Telemarketing
  Profiling in which individuals are identified by name or other PII

Very few sites used data for
  Historical preservation (not by government sites)
  Other purposes that do not fall into these categories

News web sites use data for almost every purpose.

Page 23: Kim's Presentation

Data Recipients and Sharing

Page 24: Kim's Presentation

Data Recipients and Sharing (Cont.)

Half the websites share PII with parties other than agents who use data for the purpose for which it was provided

Most likely by
  News web sites
  Froogle list sites, with delivery companies

Least likely by
  Government web sites

Page 25: Kim's Presentation

Choice Options

Page 26: Kim's Presentation

Choice Options (Cont.)

Top sites more likely to engage in marketing than less popular sites

Top sites most likely to offer choices (opt-in/out)

Internal choices (telemarketing and other marketing) offered more opt-out than opt-in

Third-party choices offered more opt-in than opt-out

Page 27: Kim's Presentation

Access Provisions

Page 28: Kim's Presentation

Access Provisions (Cont.)

92% of sites collecting identified data provide some access provisions

Most provide access to both contact info and other data

Smaller number provide access to only contact info or to all identified data

Very few provide no access

None provide access only to non-contact info

Page 29: Kim's Presentation

Dispute Resolution Options and Remedies

Page 30: Kim's Presentation

Dispute Resolution Options and Remedies

Individuals can contact customer service to resolve their disputes on most sites

About one-third of the most popular sites offered resolution via an independent organization (e.g., a privacy seal provider)

Very few indicated resolution of dispute under an applicable law

Almost none indicated resolution in court

Page 31: Kim's Presentation

Data Retention Policies

Page 32: Kim's Presentation

Data Retention Policies (Cont.)

Majority did not have a data retention policy for all of the data they collected

Government web sites more likely to have a policy of not retaining info or to have a retention policy based on a legal requirement

Page 33: Kim's Presentation

Conclusion

P3P adoption is increasing over time, especially for the most popular web sites

Yahooligans (sites for children) most likely to offer opt-in policies

Large number of websites with technical errors in their P3P policies

Debates continue about the need for further privacy legislation and the effectiveness of industry self-regulation in the privacy area.
  Essential to have good statistics and privacy policies

US government web sites began posting P3P policies to comply with the privacy requirements of section 208 of the E-Government Act of 2002
  Continue web sweeps of gov’t web sites to monitor compliance with these requirements