Page 1: privacy aware collaborative spam detection

ALPACAS: A Large-scale Privacy-Aware Collaborative Anti-spam System

Guided by M.Karthiga B.E

Presented by Devasenapathi.A, Radhesh.M

ABSTRACT

The ALPACAS framework makes two contributions. The first is a feature-preserving message transformation technique that is highly resilient against the latest kinds of spam attacks. The second is a privacy-preserving protocol that provides enhanced privacy guarantees to the participating entities.

INTRODUCTION

To protect email privacy, the digest approach has been proposed in collaborative anti-spam systems to both provide encryption for the email messages and obtain useful information from spam email.

The digest calculation has to be a one-way function, so that it is computationally hard to generate the corresponding email message from its digest.

System Requirements

Front End : Java.

Back End : MySQL

IDE : Eclipse or NetBeans 6.9.

Existing System

The DCC (Distributed Checksum Clearinghouse) system attempts to address the privacy issue by using hash functions.

Here, the participating servers do not share the actual emails they have received and classified.

Instead, they share the emails’ digests, which are computed by applying hash functions such as MD5 to the email body.

Drawbacks in DCC

1. Hashing schemes like MD5 generate completely different hash values even if the message is altered by a single byte.

2. The DCC scheme does not completely address the privacy issue.
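The first drawback can be seen directly with a minimal sketch. It assumes the digest is a plain MD5 over the UTF-8 body, as the DCC description suggests; the helper name `md5_digest` and the sample messages are illustrative, not part of DCC.

```python
import hashlib

def md5_digest(body):
    """Return the MD5 hex digest of an email body (illustrative helper)."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()

original = "Buy cheap watches now"
altered = "Buy cheap watches now!"  # the same message plus one byte

d1 = md5_digest(original)
d2 = md5_digest(altered)

print(d1)
print(d2)
print("digests equal:", d1 == d2)
```

Because the two digests do not match, a server that has only seen the first message cannot recognise the second one as the same spam.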

In designing the ALPACAS framework, this paper makes two unique contributions:

1) Feature-preserving transformation

2) Protection via privacy-preserving protocol

THE ALPACAS ANTI-SPAM FRAMEWORK

The ALPACAS framework addresses two design challenges of a collaborative anti-spam system:

1. To protect email privacy, the messages have to be encrypted.

2. The information revealed during the collaboration process has to be minimized.

The ALPACAS framework essentially consists of a set of collaborative anti-spam agents.

An email agent can either be an entity that participates in the ALPACAS framework on behalf of an individual end-user, or it may represent an email server having multiple end-users.

Each email agent of the ALPACAS framework maintains a spam knowledge base and a ham knowledge base, containing information about the known spam and ham emails.

Feature-Preserving Fingerprint

The fingerprint of an email is a set of digests that characterize the message content.

The set of digests is referred to as the transformed feature set (TFSet) of the email.

The individual digests are called the feature elements.

The transformed feature set of a message Ma is represented as TFSet(Ma).

Shingle-based Message Transformation

The feature-preserving fingerprint technique is based on the concept of shingles.

Shingles are essentially a set of numbers that act as a fingerprint of a document.

Shingles have the useful property that if two documents vary by a small amount, their shingle sets also differ by a small amount.
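The property above can be illustrated with a small sketch. It assumes word-level shingles of `k` consecutive tokens, each hashed with SHA-1 and truncated to 32 bits; the function name `tfset`, the value `k = 3`, and the hash choice are assumptions for illustration, not the exact ALPACAS transformation.

```python
import hashlib

def tfset(message, k=3):
    """Hash every run of k consecutive tokens into one feature element."""
    tokens = message.lower().split()
    shingles = (" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1))
    # Keep the first 8 hex digits (32 bits) of each SHA-1 digest as a number.
    return {int(hashlib.sha1(s.encode("utf-8")).hexdigest()[:8], 16)
            for s in shingles}

ma = "win a free prize today by clicking the link below"
mb = "win a free gift today by clicking the link below"  # one word changed

a, b = tfset(ma), tfset(mb)
print("elements in Ma:", len(a))
print("elements in Mb:", len(b))
print("shared elements:", len(a & b))
```

Each message yields eight three-word shingles. Changing one word alters only the three shingles that contain it, so five of the eight are still shared, whereas a single MD5 over the whole body would change completely.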

The similarity between two messages Ma and Mb can be calculated as
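The formula on this slide is not reproduced in the text. As an assumption, a standard measure for two digest sets is the Jaccard resemblance: the number of shared feature elements divided by the number of distinct feature elements in either set. The symbol sim is introduced here for illustration only.

```latex
\mathrm{sim}(M_a, M_b) = \frac{\lvert TFSet(M_a) \cap TFSet(M_b) \rvert}{\lvert TFSet(M_a) \cup TFSet(M_b) \rvert}
```

Under this assumed definition the value ranges from 0 (no shared digests) to 1 (identical digest sets).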

Term-level Privacy Preservation

One way to reduce the possibility of inferring a word or a group of words is to shuffle the tokens of the original email and compute the TFSet on the shuffled email.

To shuffle the email content in an acceptable manner, the feature-preserving fingerprint scheme adopts a controlled shuffling strategy wherein the tokens are shuffled in a predetermined format.

The position of a token after shuffling is always within a fixed range of its original position.
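A minimal sketch of controlled shuffling follows. The slides do not give the predetermined format, so this example assumes a hypothetical one: the token list is cut into blocks of `window` tokens and each block is reversed. The names `controlled_shuffle` and `window` are illustrative.

```python
def controlled_shuffle(tokens, window=4):
    """Reverse each consecutive block of `window` tokens.

    Block reversal is a hypothetical stand-in for the predetermined format.
    A token never leaves its block, so it moves at most window - 1 positions.
    """
    shuffled = []
    for start in range(0, len(tokens), window):
        block = tokens[start:start + window]
        shuffled.extend(reversed(block))
    return shuffled

tokens = "please confirm your account details by friday".split()
shuffled = controlled_shuffle(tokens)
print(shuffled)
# ['account', 'your', 'confirm', 'please', 'friday', 'by', 'details']
```

Because the permutation is fixed, two agents that shuffle the same email obtain the same token order and therefore the same TFSet, while the original word order is no longer directly visible.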

Privacy-preserving Collaboration Protocol

If the score is greater than a configurable threshold λ, Ma is classified as spam. Otherwise, it is classified as ham.
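The threshold rule can be sketched as follows. The slides do not define how the score is computed, so this example assumes a hypothetical score: the best similarity between TFSet(Ma) and any known spam fingerprint, minus the best similarity to any known ham fingerprint, with Jaccard resemblance as the assumed similarity. The names `similarity`, `classify`, `spam_kb`, `ham_kb`, and `lam` are illustrative, and the exchange of digests between agents is left out.

```python
def similarity(a, b):
    """Jaccard resemblance of two digest sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def classify(tfset_ma, spam_kb, ham_kb, lam=0.0):
    """Label a message from an assumed score and the threshold lam.

    Hypothetical score: best match in the spam knowledge base
    minus best match in the ham knowledge base.
    """
    best_spam = max((similarity(tfset_ma, s) for s in spam_kb), default=0.0)
    best_ham = max((similarity(tfset_ma, h) for h in ham_kb), default=0.0)
    score = best_spam - best_ham
    return ("spam" if score > lam else "ham"), score

# Toy knowledge bases: each entry is the digest set of one known email.
spam_kb = [{1, 2, 3, 4}, {10, 11, 12}]
ham_kb = [{20, 21, 22, 23}]

label, score = classify({1, 2, 3, 5}, spam_kb, ham_kb, lam=0.2)
print(label, score)
```

Here the incoming message shares three of five distinct digests with a known spam email and none with the ham email, so its score of 0.6 exceeds the threshold of 0.2 and it is labelled spam.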

Robustness Against Attacks

This section considers the robustness of the ALPACAS approach against two common kinds of camouflage attacks:

1. Good-word attack

2. Character replacement attack

Literature Review

Understanding the Network Level Behavior of Spammers

• Most spam is being sent from a few regions of IP address space.

• Spammers appear to be using transient bots that send only a few pieces of email over very short periods of time.

• Finally, a small, yet non-negligible, amount of spam is received from IP addresses that correspond to short-lived BGP routes, typically for hijacked prefixes.

Reference 2

SMTP Path Analysis

This paper presents a new learning algorithm for learning the reputation of email domains and IP addresses based on analyzing the paths used to transmit known spam and known good mail.

This algorithm achieves many of the benefits offered by domain-authentication systems, black-list services, and white-list services, without any infrastructure costs or rollout requirements.

Reference 3

On Attacking Statistical Spam Filters

Spammers have tried many things, from using HTML layout tricks and letter substitution to adding random data. While at times their attacks are clever, they have yet to work strongly against the statistical nature that drives many filtering systems.

The paper examines the general attack methods spammers use, along with the challenges faced by developers and spammers. It also demonstrates an attack that, while easy to implement, attempts to work more strongly against the statistical nature behind filters.

Conclusion

We plan to extend this idea to voice over IP (VoIP) spam detection for privacy and security purposes.
