Data leakage detection

Post on 08-Jun-2015


DATA LEAKAGE DETECTION

DEPARTMENT OF COMPUTER SCIENCE

BY Anshika Singh

Guided By: Mr V K Shukla

INTRODUCTION

Data leakage is defined as the accidental or unintentional distribution of private or sensitive data to an unauthorized entity.

Data leakage poses a serious issue for companies as the number of incidents and the cost to those experiencing them continue to increase.

Data leakage is made worse by the fact that transmitted data, including emails, instant messages, website forms, and file transfers, among others, are largely unregulated and unmonitored on their way to their destinations.

The main scope of this module is to provide complete information about the data/content that is accessed by the users within the website.

The Forms Authentication technique is used to secure the website in order to prevent leakage of the data.

Continuous observation is performed automatically, and the information is sent to the administrator so that he can identify whenever data is leaked.

Above all, the important aspect is providing proof against the guilty agents. The following techniques are used:

Fake Object Generation.

Watermarking.


Data leakage incidents

Sept. 2011, Science Applications International Corp: Backup tapes stolen from a car, containing 5,117,799 patients’ names, phone numbers, Social Security numbers, and medical information.

July 2008, Google: Data were stolen, not from Google offices, but from the headquarters of an HR outsourcing company, Colt Express.

July 2009, American Express: A DBA stole a laptop containing thousands of American Express card numbers. The DBA reported it stolen.

Aug. 2007, Nuclear Laboratory in Los Alamos: An employee of the U.S. nuclear laboratory in Los Alamos transmitted confidential information by email.

EXISTING SYSTEM

DATA LEAKAGE DETECTION IS HANDLED BY WATERMARKING

A watermark is a unique code embedded in each distributed copy. If that copy is later discovered in the hands of an unauthorized party, the leaker can be identified.

Watermarks can be very useful in some cases, but they involve some modification of the original data. Furthermore, watermarks can sometimes be destroyed if the data recipient is malicious.

Hence this technique proves to be inefficient.
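The watermarking idea above can be illustrated with a minimal Python sketch. This is not the system described in the slides: the key, the `[ref:...]` marker format, and all function names are hypothetical, chosen only to show a per-recipient code being embedded and later matched, and how easily a malicious recipient can strip it.

```python
import hashlib
import hmac
from typing import List, Optional

# Hypothetical secret known only to the distributor (assumption for this sketch).
SECRET_KEY = b"distributor-secret"


def watermark_code(recipient: str) -> str:
    """Derive a short code that is unique to one recipient."""
    digest = hmac.new(SECRET_KEY, recipient.encode(), hashlib.sha256).hexdigest()
    return digest[:12]


def make_copy(document: str, recipient: str) -> str:
    """Return the recipient's copy with the code embedded as a trailing marker."""
    return f"{document}\n[ref:{watermark_code(recipient)}]"


def identify_leaker(found_copy: str, recipients: List[str]) -> Optional[str]:
    """Return the recipient whose code appears in the found copy, or None."""
    for recipient in recipients:
        if f"[ref:{watermark_code(recipient)}]" in found_copy:
            return recipient
    return None
```

Note that embedding the marker modifies the distributed copy, and deleting the marker line is enough to make `identify_leaker` return `None`, which mirrors the two weaknesses listed above.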

Example: A company may have partnerships with other companies that require sharing customer data. Another enterprise may outsource its data processing, so data must be given to various other companies.

PROPOSED SYSTEM

ADDITION OF FAKE OBJECTS

The distributor may be able to add fake objects to the distributed data in order to improve his effectiveness in detecting guilty agents.

Fake objects are objects generated by the distributor that are not in the original set.

These objects are designed to appear realistic, and are distributed among the agents along with the original objects.

Different fake objects may be added to the data sets of different agents in order to increase the chances of detecting agents that leak data.
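A minimal Python sketch of this step follows, under stated assumptions: the record format, the name lists, and the functions `make_fake_record` and `allocate` are hypothetical and are not taken from the original project. It shows each agent receiving its requested records plus a few fake records that no other agent receives, while the distributor keeps a private registry of who got which fakes.

```python
import random
from typing import Dict, List, Set, Tuple


def make_fake_record(rng: random.Random, index: int) -> str:
    """Build one plausible-looking but fictitious record (hypothetical format)."""
    first = rng.choice(["Asha", "Ravi", "Meera", "Karan"])
    last = rng.choice(["Verma", "Gupta", "Nair", "Khan"])
    # The running index keeps every fake record distinct across all agents.
    return f"{first} {last} <user{index:04d}@example.com>"


def allocate(
    requests: Dict[str, List[str]], fakes_per_agent: int, seed: int = 0
) -> Tuple[Dict[str, List[str]], Dict[str, Set[str]]]:
    """Return (data handed to each agent, private registry of each agent's fakes)."""
    rng = random.Random(seed)
    allocation: Dict[str, List[str]] = {}
    fake_registry: Dict[str, Set[str]] = {}
    counter = 0
    for agent, real_records in requests.items():
        fakes = []
        for _ in range(fakes_per_agent):
            fakes.append(make_fake_record(rng, counter))
            counter += 1
        combined = list(real_records) + fakes
        rng.shuffle(combined)  # avoid grouping the fakes at the end of the set
        allocation[agent] = combined
        fake_registry[agent] = set(fakes)
    return allocation, fake_registry
```

Only `allocation` is sent out; `fake_registry` stays with the distributor and is what later links a leaked fake object back to one agent.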

SYSTEM DIAGRAM

Distributor: The distributor is the main owner of the data.

Agents: These are supposedly trusted third parties who can make requests for data to the distributor.

Guilty Agent: The agent who leaks the sensitive data of the distributor to an unauthorised party.

Target: The unauthorised party who receives the distributor’s sensitive data leaked by the guilty agent.

The distributor can send data to these agents by inserting different fake objects into the data sets of different agents. Now, suppose the distributor discovers his sensitive data at an unauthorised party.
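When that happens, the distributor can check which agents' fake objects appear in the discovered data. The Python sketch below assumes the distributor kept a registry mapping each agent to the fake objects it was given; the function name and the registry layout are hypothetical.

```python
from typing import Dict, Set


def agents_with_leaked_fakes(
    leaked: Set[str], fake_registry: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Map each agent to those of its fake objects that appear in the leaked set."""
    hits: Dict[str, Set[str]] = {}
    for agent, fakes in fake_registry.items():
        found = fakes & leaked
        if found:
            hits[agent] = found
    return hits
```

Because a fake object was given to exactly one agent and does not exist anywhere else, finding it at the target is strong evidence against that agent, whereas a leaked real object could have come from any agent that received it.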

MODULE DESCRIPTION

Database Maintenance: The sensitive data which is to be handed over to the agents is stored in the database.

Agent Maintenance: The registration details of the agents, as well as the data given to them by the distributor, are maintained.

Addition of Fake Objects: The distributor is able to add fake objects in order to improve the effectiveness in detecting the guilty agent.

Data Allocation: In this module, the original records fetched according to the agent’s request are combined with the fake records generated by the administrator.

Calculation Of Probability: In this module, the request of every agent is evaluated and the probability of each agent being guilty is calculated.
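The slides do not give the formula used by the Calculation Of Probability module. As an illustration only, the Python sketch below uses one simple guilt model from the literature on this problem: each leaked object was either guessed independently by the target with probability p_guess, or leaked by exactly one of the agents that hold it, each being equally likely. Under those assumptions, the probability that an agent is guilty is 1 minus the product, over the leaked objects it holds, of 1 - (1 - p_guess) / (number of agents holding that object). The function and parameter names are hypothetical.

```python
from typing import Dict, Set


def guilt_probabilities(
    leaked: Set[str], allocation: Dict[str, Set[str]], p_guess: float
) -> Dict[str, float]:
    """Estimate, for each agent, the probability that it leaked at least one object."""
    result: Dict[str, float] = {}
    for agent, received in allocation.items():
        prob_innocent = 1.0
        for obj in leaked & received:
            # Number of agents that were given this object.
            holders = sum(1 for objs in allocation.values() if obj in objs)
            # Probability that this agent did NOT leak this particular object.
            prob_innocent *= 1.0 - (1.0 - p_guess) / holders
        result[agent] = 1.0 - prob_innocent
    return result
```

For example, with agent a holding {t1, t2}, agent b holding {t1}, both objects leaked, and p_guess = 0.2, the model gives 0.88 for a and 0.4 for b: a is the only holder of t2, so the leak of t2 points almost entirely at a.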

ARCHITECTURAL VIEW OF THE SYSTEM

Hardware Required:

System: Pentium IV 2.4 GHz
Hard Disk: 40 GB
Floppy Drive: 1.44 MB
Monitor: 15 VGA colour
Mouse: Logitech
Keyboard: 110 keys enhanced
RAM: 256 MB

Software Required:

O/S: Windows XP
Language: ASP.NET, C#
Database: SQL Server 2005
IDE: Visual Studio 2008

 

Data Loss Prevention (DLP)

DLP: Security measures to protect confidential and private data in use, in motion, and at rest from both intentional and accidental loss.

• Data-In-Motion: email, network access, wireless

• Data-at-Rest: portable/removable media (USB), authorized abuse

• Data-In-Use: IM, file share, web uploads

• Compliance Regulations: customer credit card information, medical information, financial information

DLP SOLUTIONS: FOUR FOCUS AREAS

To protect against confidential data theft and loss, a multi-layered security foundation is needed

Control/limit access to the data: firewalls, remote access controls, network access controls, physical security controls.

Secure information from threats: protect perimeter and endpoints from malware, botnets, viruses, DoS, etc. with security technology.

Control use of sensitive data once access is granted: policy-based content inspection, acceptable use, encryption.
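Policy-based content inspection can be illustrated with a small Python sketch. It is not a description of any vendor's product: the pattern, the function names, and the policy (flag outbound text that contains a card-like number passing the Luhn checksum) are assumptions made for this example.

```python
import re
from typing import List

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return the card-like numbers in the text that pass the Luhn check."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found
```

A gateway applying such a policy could block or quarantine an outgoing message whenever `find_card_numbers` returns a non-empty list. The checksum step reduces false alarms from order numbers and other long digit strings.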

Cisco’s Solution for Data Loss Prevention

Build a secure foundation with a Self-Defending Network.

Integrate DLP controls into security devices to protect data and increase visibility while decreasing the complexity and total cost of ownership of DLP deployments.

DATA LOSS PREVENTION

An Integrated Approach to Data Loss Prevention through Security

CONCLUSION

In an ideal scenario there would be no need to hand over sensitive data to agents who may unknowingly or maliciously leak it.

However, in many cases, we must indeed work with agents that may not be 100 percent trusted, and we may not be certain if a leaked object came from an agent or from some other source.

In spite of these difficulties, it is possible to assess the likelihood that an agent is responsible for a leak, based on the overlap of his data with the leaked data.

The algorithms we have presented implement a variety of data distribution strategies that can improve the distributor’s chances of identifying a leaker.

THANK YOU ALL
