A System to Filter Unwanted Messages from OSN User Walls Presented By: Gajanand Sharma M. E. Scholar UVCE Bangalore Guided By: Ms. Vandana Jha Ph. D. Scholar UVCE Bangalore
Jul 15, 2015
Introduction
Related Work
Model
Algorithm
Implementation
Performance
Conclusion
Bibliography
The underlying issue in today's Online Social Networks is to give users the ability to control the messages posted on their own timelines.
Online Social Networks provide little support for this necessity.
The proposed system gives users direct control over their timeline posts.
This is achieved using a flexible rule-based system that allows users to customize the filtering criteria.
Online Social Networks are among the most popular media for communicating, sharing and broadcasting information about human life.
Because of the huge and dynamic nature of this data, web content mining strategies are used to
automatically discover useful information from it.
In OSNs this strategy is used to filter and remove unwanted posts from user walls.
It must be implemented using ad-hoc classification strategies, because wall messages
contain short text for which traditional classification methods do not work well.
So the aim of the proposed system is to build and evaluate an automated system able to filter
unwanted messages from user walls.
Machine Learning text categorization techniques are used to automatically assign
each short text message a set of categories based on its content.
Using this technique, short messages are first categorized as neutral or non-neutral.
Non-neutral messages are then further classified into different categories.
Using Filtering Rules, users can state which contents should not be displayed on
their walls.
Filtering Rules exploit user profiles and user relationships, as well as the output of the
Machine Learning categorization process, to state the filtering criteria to be enforced.
The system also supports user-defined Black Lists, i.e., lists of users
who are temporarily prevented from posting any kind of message on a user wall.
N.J. Belkin and W.B. Croft introduced the information filtering system in “Information
Filtering and Information Retrieval: Two Sides of the Same Coin?”
P.J. Denning introduced a content-based filtering system in the paper entitled “Electronic Junk.”
P.W. Foltz and S.T. Dumais also discussed information filtering systems in the paper
“Personalized Information Delivery: An Analysis of Information Filtering Methods.”
M. Vanetti, E. Binaghi, B. Carminati, M. Carullo, and E. Ferrari presented the concept of content-based
filtering in the paper “Content-Based Filtering in On-Line Social Networks.”
The architecture in support of OSN services is a three-tier structure.
1. Social Network Manager (SNM)
- Aims to provide the basic OSN functionalities…
2. Social Network Applications (SNAs)
- Provides support for external Social Network Applications…
3. Graphical User Interfaces (GUIs)
- Allow users to set up and manage FRs/BLs…
OSN
Information Filtering
Policy-based Personalization
Short Text Classification
Information filtering can be used for a different, more sensitive purpose. This is
because in OSNs there is the possibility of posting messages or commenting on other
posts in particular public/private areas, generally called walls.
Information filtering can therefore be used to give users the ability to automatically
control the messages written on their own walls, by filtering out unwanted
messages.
Information filtering
The process first identifies Neutral sentences, then classifies Non-neutral
sentences…
The first-level task is the harder one, i.e., labeling message sentences as Neutral or Non-
neutral…
At the second level, non-neutral sentences are further classified into different classes…
The second-level soft classifier produces a gradual membership value for each non-neutral
sentence…
Short Text Classifier
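The two-level procedure above might be sketched as follows. This is a minimal illustrative stand-in, not the paper's method: the real system uses a trained RBFN soft classifier, whereas here hypothetical keyword lists and class names (Hate, Offensive, Sex, Violence) are used purely to show the two-level structure.

```python
# Hypothetical two-level short text classification sketch.
# Keyword lists stand in for the trained RBFN classifier of the paper.

NON_NEUTRAL_WORDS = {"hate": "Hate", "idiot": "Offensive",
                     "sex": "Sex", "violence": "Violence"}

def first_level(sentence):
    """Level 1: label a sentence Neutral or Non-neutral."""
    words = sentence.lower().split()
    return "Non-neutral" if any(w in NON_NEUTRAL_WORDS for w in words) else "Neutral"

def second_level(sentence):
    """Level 2: gradual (soft) membership of a non-neutral sentence in each class."""
    words = sentence.lower().split()
    hits = [NON_NEUTRAL_WORDS[w] for w in words if w in NON_NEUTRAL_WORDS]
    total = len(hits) or 1
    classes = ("Hate", "Offensive", "Sex", "Violence")
    return {c: hits.count(c) / total for c in classes}
```

Each non-neutral sentence thus receives a gradual membership in every class, rather than a single hard label.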
A classification method has been proposed to categorize short text messages, in
order to avoid overwhelming users of microblogging services with raw data.
The filtering policy language allows FRs to be set according to a variety of
criteria that consider not only the results of the classification process but also
the relationships of the wall owner with other OSN users, as well as information in
the user profile.
Policy-based Personalization
Vector Space Model
underlying model for text representation
This is the underlying model for text representation, according to which a text
document dj is represented as a vector of binary or real weights.
T is the set of terms that occur at least once in at least one document of the
collection Tr.
wkj ∈ [0,1] represents how much term tk contributes to the semantics of
document dj.
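As a sketch of this representation, assuming a plain tf-idf weighting (the paper's actual weighting and normalization may differ; raw tf-idf weights are typically length-normalized to keep them in [0,1]):

```python
import math

def vsm_weights(docs):
    """Represent each document d_j as a vector of tf-idf weights w_kj over
    the term set T (terms occurring in at least one document of Tr)."""
    terms = sorted({t for d in docs for t in d.split()})
    n = len(docs)
    # document frequency of each term
    df = {t: sum(t in d.split() for d in docs) for t in terms}
    vectors = []
    for d in docs:
        words = d.split()
        vec = []
        for t in terms:
            tf = words.count(t) / len(words)       # term frequency in d_j
            idf = math.log(n / df[t])              # inverse document frequency
            vec.append(tf * idf)
        vectors.append(vec)
    return terms, vectors
```

A term appearing in every document gets idf 0 and so contributes nothing to the semantics of any document.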
RBFN Model
RBFNs have a single hidden layer of processing units with a local, restricted
activation domain; a Gaussian function is commonly used.
The main advantages of RBFNs are that the classification function is nonlinear, the model
may produce confidence values, and it may be robust to outliers.
Drawbacks are the potential sensitivity to input parameters and potential
overtraining sensitivity.
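A minimal sketch of the Gaussian hidden unit and the resulting network output follows; centers, widths and weights are illustrative placeholders that a real RBFN would learn from training data:

```python
import math

def gaussian_rbf(x, center, width):
    """Local, restricted activation of one RBFN hidden unit:
    phi(x) = exp(-||x - c||^2 / (2 * sigma^2))."""
    dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist2 / (2 * width ** 2))

def rbfn_output(x, centers, widths, weights, bias=0.0):
    """Linear combination of hidden-unit activations.
    The output can be read as a confidence (soft membership) value."""
    return bias + sum(w * gaussian_rbf(x, c, s)
                      for w, c, s in zip(weights, centers, widths))
```

The activation is 1 at the center and decays quickly with distance, which is what makes the unit's domain local and restricted.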
A creator specification creatorSpec implicitly denotes a set of OSN users. It can
have the following forms:
- A set of attribute constraints of the form an OP av;
- A set of relationship constraints of the form (m, rt, minDepth, maxTrust).
A filtering rule FR is a tuple (author, creatorSpec, contentSpec, action)
Creator specification
Filtering Rule
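One possible encoding of the FR tuple and of rule matching is sketched below; the field representation, attribute names and matching semantics are assumptions made for illustration (only the attribute-constraint form of creatorSpec is shown):

```python
from dataclasses import dataclass

# Hypothetical encoding of FR = (author, creatorSpec, contentSpec, action).
@dataclass
class FilteringRule:
    author: str          # wall owner who states the rule
    creator_spec: dict   # attribute constraints, e.g. {"sex": "male"}
    content_spec: tuple  # (class_name, min_membership) from the ML classifier
    action: str          # e.g. "block"

def rule_applies(rule, creator_attrs, memberships):
    """Check a message against one FR: the creator must satisfy every attribute
    constraint, and the message's class membership must reach the threshold."""
    if any(creator_attrs.get(k) != v for k, v in rule.creator_spec.items()):
        return False
    cls, threshold = rule.content_spec
    return memberships.get(cls, 0.0) >= threshold
```

This shows how FRs combine creator information with the classifier's output rather than relying on either alone.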
A BL rule is a tuple (author, creatorSpec, creatorBehavior, T), where:
author is the OSN user who specifies the rule, i.e., the wall owner;
creatorSpec is a creator specification, specified according to Definition 1;
creatorBehavior consists of two components, RFBlocked and minBanned:
RFBlocked = (RF, mode, window)
minBanned = (min, mode, window)
T denotes the time period for which the users identified by creatorSpec and
creatorBehavior have to be banned from the author's wall.
Black Lists
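A BL rule might be encoded as follows. This is a sketch under stated assumptions: the field layout is hypothetical, and it is assumed here that exceeding either behaviour component triggers the ban, which may differ from the exact semantics in the paper:

```python
from dataclasses import dataclass

# Hypothetical encoding of a BL rule (author, creatorSpec, creatorBehavior, T).
@dataclass
class BLRule:
    author: str
    creator_spec: dict
    rf_blocked: tuple   # (RF, mode, window): min relative freq. of blocked messages
    min_banned: tuple   # (min, mode, window): min number of past bans
    T: float            # banning period, e.g. in days

def should_ban(rule, relative_freq_blocked, times_banned):
    """Ban a creator matching creatorSpec for T days when either
    behaviour component is exceeded (assumption for this sketch)."""
    rf, _, _ = rule.rf_blocked
    m, _, _ = rule.min_banned
    return relative_freq_blocked >= rf or times_banned >= m
```

The mode and window components (ignored here) would determine over which history the two statistics are computed.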
The short message goes to the user's filtering wall and is checked using the filtering rules
defined by the user.
According to the user-defined filtering rules, it is labeled with the class in which it resides.
Then the gradual membership value of the message is compared with the system-defined
threshold value.
If the message crosses the threshold value, it goes to the block list; otherwise it is posted
to the user's wall.
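The flow above can be sketched as a small function; the threshold value and the wall representation are illustrative, not taken from the paper:

```python
# Sketch of the message flow: take the classifier's gradual memberships,
# compare the top value with a system-defined threshold, then either
# block the message or post it. THRESHOLD is an assumed value.

THRESHOLD = 0.5

def filter_message(memberships, wall):
    """memberships: class -> gradual membership from the soft classifier.
    wall: dict with "blocked" and "posts" lists."""
    top_class = max(memberships, key=memberships.get)
    if memberships[top_class] >= THRESHOLD:
        wall["blocked"].append(top_class)   # crosses threshold -> block list
        return "blocked"
    wall["posts"].append(top_class)         # otherwise posted to the wall
    return "posted"
```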
Two different types of measures are used to evaluate the effectiveness of the first-level
and second-level classifications.
At the first level, the short text classification procedure is evaluated on the basis of the
contingency table approach.
At the second level, the measures are Precision (P), which accounts for the number of false
positives, Recall (R), which accounts for the number of false negatives, and the
overall F-measure (Fβ), defined as the weighted harmonic mean of the two
indexes.
Evaluation Metrics
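These standard metrics can be computed as follows (tp, fp and fn denote true positives, false positives and false negatives from the contingency table):

```python
def precision(tp, fp):
    """P = tp / (tp + fp): penalizes false positives."""
    return tp / (tp + fp)

def recall(tp, fn):
    """R = tp / (tp + fn): penalizes false negatives."""
    return tp / (tp + fn)

def f_measure(p, r, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R):
    weighted harmonic mean of precision and recall."""
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

With beta = 1 this reduces to the plain harmonic mean of P and R.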
The blacklist guarantees 100% filtering of messages
coming from suspicious sources.
The process of detecting and filtering spam is transparent,
regulated by standards, and fairly reliable.
The system offers flexibility and the possibility to fine-tune the settings.
It rarely makes mistakes in distinguishing spam from
legitimate messages.
Overall Performance
DicomFW is the GUI of this study, built as a prototype Facebook application.
The main focus throughout the implementation is on Filtering Rules.
This application permits users to:
1. View the list of their Filtering Walls;
2. View messages and post new ones on a Filtering Wall;
3. Define Filtering Rules using the OSA tool.
DicomFW
In this study, a system to filter undesired messages from Online Social
Network walls is presented.
The system exploits a Machine Learning soft classifier to enforce customizable
content-dependent Filtering Rules.
The flexibility of the system in terms of filtering options is enhanced through the
management of Black Lists.
A future aim of this work is to investigate a tool able to automatically recommend
trust values for those contacts a user does not personally know.
[1] M. Vanetti, E. Binaghi, B. Carminati, M. Carullo, and E. Ferrari, “Content-Based Filtering in On-
Line Social Networks,” Proc. ECML/PKDD Workshop Privacy and Security Issues in Data Mining and
Machine Learning (PSDML ’10), 2010.
[2] Y. Zhang and J. Callan, “Maximum Likelihood Estimation for Filtering Thresholds,” Proc. 24th Ann.
Int’l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 294-302, 2001.
[3] M. Carullo, E. Binaghi, and I. Gallo, “An Online Document Clustering Technique for Short Web
Contents,” Pattern Recognition Letters, vol. 30, pp. 870-876, July 2009.
[4] M. Carullo, E. Binaghi, I. Gallo, and N. Lamberti, “Clustering of Short Commercial Documents for
the Web,” Proc. 19th Int’l Conf. Pattern Recognition (ICPR ’08), 2008.
[5] C.D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. Cambridge
Univ. Press, 2008.
[6] J. Moody and C. Darken, “Fast Learning in Networks of Locally-Tuned Processing Units,” Neural
Computation, vol. 1, no. 2, pp. 281-294, 1989.
[7] M.J.D. Powell, “Radial Basis Functions for Multivariable Interpolation: A Review,” Algorithms for
Approximation, pp. 143-167, Clarendon Press, 1987.
[8] J. Park and I.W. Sandberg, “Approximation and Radial-Basis-Function Networks,” Neural
Computation, vol. 5, pp. 305-316, 1993.
[9] C. Cleverdon, “Optimizing Convenient Online Access to Bibliographic Databases,” Information
Services and Use, vol. 4, no. 1, pp. 37-47, 1984.
[10] J.A. Golbeck, “Computing and Applying Trust in Web-Based Social Networks,” PhD dissertation,
Graduate School of the Univ. of Maryland, College Park, 2005.