Web Security and Privacy
CS 136 Computer Security, Lecture 19, Spring 2009
Peter Reiher
June 4, 2009


Outline

• Web security
• Privacy issues in modern computer systems


Web Security

• Lots of Internet traffic is related to the web
• Much of it is financial in nature
• Also lots of private information flow around web applications
• An obvious target for attackers


The Web Security Problem

• Many users interact with many servers
• Most parties have little other relationship
• Increasingly complex things are moved via the web
• No central authority
• Many developers with little security experience
• Many critical elements originally designed with no thought to security
• Sort of a microcosm of the overall security problem


Aspects of the Web Problem


Who Are We Protecting?

• The server, from the client
• The client, from the server
• The clients, from each other


What Are We Protecting?

• The client’s private data
• The server’s private data
• The integrity (maybe secrecy) of their transactions
• The client and server’s machines
• Possibly server availability
– For particular clients?


Some Real Threats

• Buffer overflows and other compromises
– Client attacks server
• SQL injection
– Client attacks server
• Malicious downloaded code
– Server attacks client


More Threats

• Cross-site scripting
– Clients attack each other
• Threats based on non-transactional nature of communication
– Client attacks server
• Denial of service attacks
– Threats on server availability (usually)


Compromise Threats

• Much the same as for any other network application
• Web server might have buffer overflow
– Or other remotely usable flaw
• Not different in character from any other application’s problem


What Makes It Worse

• Web servers are complex
• They often also run supporting code
– Which is often user-visible
• Large, complex code base is likely to contain such flaws
• Nature of application demands allowing remote use


Solution Approaches

• Patching
• Use good code base
• Minimize code that the server executes
• Maybe restrict server access
– When that makes sense
• Lots of testing and evaluation


SQL Injection Attacks

• Many web servers have backing databases
– Much of their information stored in database
• Web pages are built (in part) based on queries to database
– Possibly using some client input . . .


SQL Injection Mechanics

• Server plans to build a SQL query
• Needs some data from client to build it
– E.g., client’s user name
• Server asks client for data
• Client, instead, provides a SQL fragment
• Server inserts it into planned query
– Leading to a “somewhat different” query


An Example

"select * from mysql.user
 where username = '" . $uid . "' and
 password = password('" . $pwd . "');"

• Intent is that user fills in his ID and password
• What if he fills in something else?
' or 1=1; -- '


What Happens Then?

• $uid has the string substituted, yielding
"select * from mysql.user
 where username = '' or 1=1; -- '' and
 password = password('" . $pwd . "');"
• This evaluates to true
– Since 1 does indeed equal 1
– And -- comments out rest of line
• If script uses truth of statement to determine valid login, attacker has logged in
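
The same failure can be reproduced in a few lines. This is a minimal sketch in Python rather than the scripting language of the slide, assuming a hypothetical in-memory SQLite table named `users` with a plaintext `password` column (far simpler than `mysql.user`). The injected string omits the semicolon for simplicity.

```python
import sqlite3

def make_db():
    # Hypothetical stand-in for the server's user table.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (username TEXT, password TEXT)")
    db.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return db

def vulnerable_login(db, uid, pwd):
    # Pastes client input straight into the SQL text, as in the example.
    query = ("SELECT * FROM users WHERE username = '" + uid +
             "' AND password = '" + pwd + "'")
    # Login "succeeds" if the query returns any row at all.
    return db.execute(query).fetchone() is not None
```

Calling `vulnerable_login(db, "' OR 1=1 -- ", "anything")` logs in without knowing any password, because the comment marker removes the password check and `1=1` matches every row.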


Basis of SQL Injection Problem

• Unvalidated input
• Server expected plain data
• Got back SQL commands
• Didn’t recognize the difference and went ahead
• Resulting in arbitrary SQL query being sent to its database
– With its privileges


Solution Approaches

• Carefully examine all input
– To filter out injected SQL
• Use database access controls
– Of limited value
• Randomization of SQL keywords
– Making injected SQL meaningless
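
As a rough illustration of the first approach, the sketch below examines the input against an allowlist before using it. It also passes the values through query placeholders, a widely used technique that is not named on the slide. The `users` table, the username pattern, and the function names are assumptions made for this example.

```python
import re
import sqlite3

# Assumed allowlist: short names made of letters, digits, underscore.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")

def make_db():
    # Hypothetical stand-in for the server's user table.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (username TEXT, password TEXT)")
    db.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return db

def safe_login(db, uid, pwd):
    # Examine the input first: reject anything outside the allowlist.
    if not USERNAME_RE.fullmatch(uid):
        return False
    # Placeholders keep client data separate from the SQL text,
    # so quotes and comment markers in pwd are treated as plain data.
    row = db.execute(
        "SELECT 1 FROM users WHERE username = ? AND password = ?",
        (uid, pwd),
    ).fetchone()
    return row is not None
```

With this version, the string `' OR 1=1 -- ` fails the allowlist check, and a quote-laden password is simply compared as data.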


Malicious Downloaded Code

• Modern web relies heavily on downloaded code
– Full language and scripting language
• Instructions downloaded from server to client
– Run by client on his machine
– Using his privileges
• Without defense, script could do anything


Types of Downloaded Code

• Java
– Full programming language
• Scripting languages
– JavaScript
– VBScript
– ECMAScript
– XSLT


Solution Approaches

• Disable scripts
– Not very popular
• Use secure scripting languages
– Also not popular
– Particularly with code writers
• Isolation mechanisms
– VM or application-based
• Vista mandatory access control


Cross-Site Scripting

• XSS
• Many sites allow users to upload information
– Blogs, photo sharing, Facebook, etc.
– Which gets permanently stored
– And displayed
• Attack based on uploading a script
• Other users inadvertently download it
– And run it . . .


The Effect of XSS

• Arbitrary malicious script executes on user’s machine
• In context of his web browser
– At best, runs with privileges of the site storing the script
– Often likely to run at full user privileges


Why Is XSS Common?

• Use of scripting languages widespread
– For legitimate purposes
• Most users leave them enabled in browser
• Only a question of getting user to run your script
– Often only requires fetching URL


Typical Effects of XSS Attack

• Most commonly used to steal personal information
– That is available to legit web site
– User IDs, passwords, credit card numbers, etc.
• Such information often stored in cookies at client side


Solution Approaches

• Don’t allow uploading of scripts
– Usually by carefully analyzing uploaded data
• Provide some form of protection in browser
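
One simple server-side measure in the spirit of the first bullet is to neutralize markup in uploaded text before it is displayed. The sketch below uses output escaping, which is a related technique rather than the analysis the slide describes. The function name `render_comment` is hypothetical.

```python
import html

def render_comment(user_text):
    # Escape <, >, &, and quotes so the browser shows the text
    # instead of interpreting it as markup or script.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"
```

An uploaded `<script>` tag then reaches other users as visible text rather than as code their browsers run. Escaping alone does not cover every context in which user data can appear on a page.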


Exploiting Statelessness

• HTTP is designed to be stateless
• But many useful web interactions are stateful
• Various tricks used to achieve statefulness
– Usually requiring programmers to provide the state
– Often trying to minimize work for the server


A Simple Example

• Web sites are set up as graphs of links
• You start at some predefined point
– A top level page, e.g.
• And you traverse links to get to other pages
• But HTTP doesn’t “keep track” of where you’ve been
– Each request is simply the name of a link


Why Is That a Problem?

• What if there are unlinked pages on the server?

• Should a user be able to reach those merely by naming them?

• Is that what the site designers intended?


A Concrete Example

• The ApplyYourself system
• Used by colleges to handle student applications
• For example, by Harvard Business School in 2005
• Once all admissions decisions made, results available to students


What Went Wrong?

• Pages representing results were created as decisions were made
• Stored on the web server
– But not linked to anything, since results not yet released
• Some applicants figured out how to craft URLs to access their pages
– Finding out early if they were admitted


The Core Problem

• No protocol memory of what came before
• So no protocol way to determine that response matches request
• Could be built into the application that handles requests
• But frequently isn’t
– Or is wrong


Solution Approaches

• Get better programmers
– Or better programming tools
• Back end system that maintains and compares state
• Front end program that observes requests and responses
– Producing state as a result
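
To make the idea of application-maintained state concrete, here is a toy sketch, assuming a hypothetical `ResultServer` class that records which pages have been released and refuses requests for anything else, even when the requester names the page correctly. It is an illustration of the general approach, not a description of any real admissions system.

```python
class ResultServer:
    """Toy application layer that remembers what HTTP does not."""

    def __init__(self):
        self.pages = {}        # path -> page body
        self.released = set()  # paths the site has chosen to publish

    def store(self, path, body):
        # Create a page without making it reachable yet.
        self.pages[path] = body

    def release(self, path):
        # Record that the page may now be served.
        self.released.add(path)

    def handle(self, path):
        # Serve only pages that exist AND have been released.
        if path in self.pages and path in self.released:
            return 200, self.pages[path]
        return 404, ""
```

A crafted URL for a stored but unreleased page gets the same answer as a URL for a page that does not exist.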


Conclusion

• Web security problems not inherently different from general software security problems

• But generality, power, ubiquity of the web make them especially important

• Like many other security problems, constrained by legacy issues


Privacy

• Data privacy issues
• Network privacy issues
• Some privacy solutions


What Is Privacy?

• The ability to keep certain information secret
• Usually one’s own information
• But also information that is “in your custody”
• Includes ongoing information about what you’re doing


Privacy and Computers

• Much sensitive information currently kept on computers
– Which are increasingly networked
• Often stored in large databases
– Huge repositories of privacy time bombs
• We don’t know where our information is


Privacy and Our Network Operations

• Lots of stuff goes on over the Internet
– Banking and other commerce
– Health care
– Romance and sex
– Family issues
– Personal identity information
• We used to regard this stuff as private
– Is it private any more?


Threat to Computer Privacy

• Cleartext transmission of data
• Poor security allows remote users to access our data
• Sites we visit can save information on us
– Multiple sites can combine information
• Governmental snooping
• Location privacy
• Insider threats in various places


Some Specific Privacy Problems

• Poorly secured databases that are remotely accessible
– Or are stored on hackable computers
• Data mining by companies we interact with
• Eavesdropping on network communications by governments
• Insiders improperly accessing information
• Cell phone/mobile computer-based location tracking


Data Privacy Issues

• My data is stored somewhere
– Can I control who can use it/see it?
• Can I even know who’s got it?
• How do I protect a set of private data?
– While still allowing some use?
• Will data mining divulge data “through the back door”?


Personal Data

• Who owns data about you?
• What if it’s really personal data?
– Social security number, DoB, your DNA record?
• What if it’s data someone gathered about you?
– Your Google history or shopping records
– Does it matter how they got it?


Protecting Data Sets

• If my company has (legitimately) a bunch of personal data,
• What can I/should I do to protect it?
– Given that I probably also need to use it?
• If I fail, how do I know that?
– And what remedies do I have?


Options for Protecting Data

• Careful system design
• Limited access to the database
– Networked or otherwise
• Full logging and careful auditing
• Using only encrypted data
– Must it be decrypted?
– If so, how to protect the data and the keys?


Data Mining and Privacy

• Data mining allows users to extract models from databases
– Based on aggregated information
• Often data mining allowed when direct extraction isn’t
• Unless handled carefully, attackers can use mining to deduce record values
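
A small example of how aggregates can leak a record: the sketch below assumes a hypothetical salary table and an aggregate query, `total_salary`, that lets the caller exclude one name. Neither query returns an individual record, yet their difference does.

```python
def total_salary(records, exclude=None):
    # An aggregate query: sum over everyone except an optional name.
    return sum(salary for name, salary in records if name != exclude)

# Made-up data for illustration only.
records = [("ann", 50000), ("bob", 62000), ("cy", 71000)]

# Two permitted aggregate queries...
everyone = total_salary(records)
everyone_but_bob = total_salary(records, exclude="bob")

# ...whose difference is exactly one person's sensitive value.
bob_salary = everyone - everyone_but_bob
```

Here `bob_salary` equals Bob's stored salary, even though no query asked for it directly.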


Insider Threats and Privacy

• Often insiders need access to private data
– Under some circumstances
• But they might abuse that access
• How can we determine when they misbehave?
• What can we do?


Network Privacy

• Mostly issues of preserving privacy of data flowing through network
• Start with encryption
– With good encryption, data values not readable
• So what’s the problem?


Traffic Analysis Problems

• Sometimes desirable to hide that you’re talking to someone else
• That can be deduced even if the data itself cannot
• How can you hide that?
– In the Internet of today?


Location Privacy

• Mobile devices often communicate while on the move
• Often providing information about their location
– Perhaps detailed information
– Maybe just hints
• This can be used to track our movements


Implications of Location Privacy Problems

• Anyone with access to location data can know where we go
• Allowing government surveillance
• Or a private detective following your moves
• Or a maniac stalker figuring out where to ambush you . . .


Some Privacy Solutions

• The Scott McNealy solution
– “Get over it.”
• Anonymizers
• Onion routing
• Privacy-preserving data mining
• Preserving location privacy
• Handling insider threats via optimistic security


Anonymizers

• Network sites that accept requests of various kinds from outsiders
• Then submit those requests
– Under their own or fake identity
• Responses returned to the original requestor
• A NAT box is a poor man’s anonymizer


The Problem With Anonymizers

• The entity running the anonymizer knows who’s who
• Either can use that information himself
• Or can be fooled/compelled/hacked to divulge it to others
• Generally not a reliable source of real anonymity


Onion Routing

• Meant to handle issue of people knowing who you’re talking to
• Basic idea is to conceal sources and destinations
• By sending lots of crypto-protected packets between lots of places
• Each packet goes through multiple hops


A Little More Detail

• A group of nodes agree to be onion routers
• Users obtain crypto keys for those nodes
• Plan is that many users send many packets through the onion routers
– Concealing who’s really talking


Sending an Onion-Routed Packet

• Encrypt the packet using the destination’s key
• Wrap that with another packet to another router
– Encrypted with that router’s key
• Iterate a bunch of times
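
The layering can be sketched as follows. This is a toy under loud assumptions: `toy_cipher` is an insecure XOR keystream built from SHA-256, used only so the example runs with the standard library; the keys are shared symmetric keys, where a real system would use proper cryptography; and the layer format (JSON with `next` and `body` fields) is invented for this illustration.

```python
import hashlib
import json

def toy_cipher(key: bytes, data: bytes) -> bytes:
    # XOR with a SHA-256 counter keystream. NOT secure; illustration only.
    # Applying it twice with the same key recovers the original data.
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

def wrap(message: bytes, dest: str, route: list, keys: dict) -> bytes:
    # Innermost layer: the message under the destination's key.
    onion = toy_cipher(keys[dest], message)
    next_hop = dest
    # Add one layer per router, last router on the route first.
    for router in reversed(route):
        layer = json.dumps({"next": next_hop, "body": onion.hex()}).encode()
        onion = toy_cipher(keys[router], layer)
        next_hop = router
    return onion

def peel(router: str, onion: bytes, keys: dict):
    # A router removes its own layer and learns only the next hop.
    layer = json.loads(toy_cipher(keys[router], onion))
    return layer["next"], bytes.fromhex(layer["body"])
```

Each router that calls `peel` sees where to forward the packet but not the message, and only the first router on the route is handed the packet by the source.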


In Diagram Form

[Diagram: source, destination, and onion routers]


What’s Really in the Packet


Delivering the Message


What’s Been Achieved?

• Nobody improper read the message
• Nobody knows who sent the message
– Except the receiver
• Nobody knows who received the message
– Except the sender
• Assuming you got it all right


Issues for Onion Routing

• Proper use of keys
• Traffic analysis
• Overheads
– Multiple hops
– Multiple encryptions


Privacy-Preserving Data Mining

• Allow users access to aggregate statistics

• But don’t allow them to deduce individual statistics

• How to stop that?


Approaches to Privacy for Data Mining

• Perturbation
– Add noise to sensitive value
• Blocking
– Don’t let aggregate query see sensitive value
• Sampling
– Randomly sample only part of data
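
Perturbation can be illustrated with a short sketch, assuming zero-mean Gaussian noise and a made-up list of values; the noise distribution and the `scale` parameter are choices made for this example, not something the slide specifies. Individual values become unreliable while an aggregate such as the mean stays close to the truth.

```python
import random

def perturb(values, scale, rng):
    # Add zero-mean Gaussian noise to each sensitive value.
    return [v + rng.gauss(0, scale) for v in values]

def mean(xs):
    return sum(xs) / len(xs)

# Made-up sensitive values for illustration only.
rng = random.Random(0)
values = [50000 + 10 * i for i in range(10000)]
noisy = perturb(values, scale=1000, rng=rng)
```

With many records, `mean(noisy)` lands near `mean(values)` because the noise averages out, but any single entry of `noisy` may be off by an amount on the order of `scale`.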


Preserving Location Privacy

• Can we prevent people from knowing where we are?

• Given that we carry mobile communications devices

• And that we might want location-specific services ourselves


Location-Tracking Services

• Services that get reports on our mobile device’s position
– Probably sent from that device
• Often useful
– But sometimes we don’t want them turned on
• So, turn them off then


But . . .

• What if we turn it off just before entering a “sensitive area”?
• And turn it back on right after we leave?
• Might someone deduce that we spent the time in that area?
• Very probably


Handling Location Inferencing

• Need to obscure that a user probably entered a particular area
• Can reduce update rate
– Reducing certainty of travel
• Or bundle together areas
– Increasing uncertainty of which was entered
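
Both ideas can be sketched in a few lines. The functions below are hypothetical: `thin` keeps only every n-th position report, and `bundle` maps a latitude/longitude pair to a coarse grid cell so that nearby places become indistinguishable. The 0.05-degree cell size is an arbitrary assumption.

```python
import math

def thin(reports, keep_every):
    # Reduce the update rate: keep one report out of every keep_every.
    return reports[::keep_every]

def bundle(lat, lon, cell_deg=0.05):
    # Bundle areas: report only the index of a coarse grid cell,
    # so every place inside the cell yields the same report.
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
```

Two positions a couple of kilometers apart, such as `(34.07, -118.44)` and `(34.09, -118.41)`, fall in the same cell, so an observer cannot tell which of them was visited.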


Conclusion

• Privacy is a difficult problem in computer systems
• Good tools are lacking
– Or are expensive/cumbersome
• Hard to get cooperation of others
• Probably an area where legal assistance is required
