Page 1: B.E. Information Science Project report of

QA-Zen

ANONYMOUS QUESTION AND ANSWER PLATFORM

A Report submitted to

M S RAMAIAH INSTITUTE OF TECHNOLOGY

Bengaluru

For fulfilling the requirements of

Bachelor of Engineering in Information Science and Engineering

By

KRUTHIKA VISHWANATH (1MS11IS044)

POOJA.J (1MS11IS075)

PRIYANKA.U.PANDIT.T (1MS11IS081)

under the guidance of

Dr. VIJAYA KUMAR B P

Head of Department, Information Science and Engineering

Department of Information Science and Engineering

M S Ramaiah Institute of Technology

Bengaluru - 560054

MAY 2015

Department of Information Science and Engineering

M S Ramaiah Institute of Technology

Bengaluru - 560054

CERTIFICATE

This is to certify that KRUTHIKA VISHWANATH (1MS11IS044), POOJA.J (1MS11IS075) and PRIYANKA.U.PANDIT.T (1MS11IS081), who were working for their B.E. project under my guidance, have completed the work to my satisfaction on the topic QA-Zen, ANONYMOUS QUESTION AND ANSWER PLATFORM. To the best of my understanding, the work to be submitted in this dissertation does not contain any work which has been previously carried out by others and submitted by the candidates for themselves for the award of any degree anywhere.

(Guide)
Dr. Vijaya Kumar B P
Professor & Head, Dept. of ISE

(Head of the Department)
Dr. Vijaya Kumar B P
Professor & Head, Dept. of ISE

(Examiner 1) (Examiner 2)

Name

Signature

Department of Information Science and Engineering

M S Ramaiah Institute of Technology

Bengaluru - 560054

DECLARATION

We hereby declare that the entire work embodied in this B.E. Project report has been carried out by us at M S Ramaiah Institute of Technology under the supervision of Dr. Vijaya Kumar B P, Head, Dept. of ISE. This Project report has not been submitted in part or in full for the award of any diploma or degree of this or any other University.

KRUTHIKA VISHWANATH (1MS11IS044)

POOJA.J (1MS11IS075)

PRIYANKA.U.PANDIT.T (1MS11IS081)

Acknowledgement

This project is the result of the hard work and dedication of many individuals, and we would like to extend our sincere gratitude to all of them. We wish to express our sincere appreciation to the Department of Information Science and Engineering for its long-term support and for giving us this opportunity. We would like to extend our hearty gratitude to our Principal, Dr. S Y Kulkarni, our internal guide, Dr. Vijaya Kumar B P, and our external guide, Mr. Vijay Nadadur, Founder & CEO of Tationem, for their vast reserve of patience, knowledge and constant feedback. We are also grateful to our project coordinators, Mr. Krishna Raj P.M, Mrs. Deepthi K and Mr. Jagadeesh Sai D, for providing the necessary guidance concerning project documentation.

We also express our gratitude to our families for their kind co-operation and encouragement, which helped us complete this project.

Abstract

With today's fast-growing technology and the many social networking sites that help people connect with others, grow their businesses or even find a life partner, social networking has become a very important part of our lives. But social networking websites such as Facebook and LinkedIn have divergent aims: Facebook aims at getting people online, and LinkedIn at connecting them professionally. This rapid growth seems to betray the aim with which the Internet was created, i.e. to “create a fountain of knowledge”. Instead, we have contributed to creating a drain of information.

Today's users need a dedicated place to engage in more intellectual activity, as it will help them in the long run. This matters most for students, who are the future: they look for interesting opportunities and, given their interest and will to work, can take on any work.

Based on our survey and the above challenges in existing systems, we have built an application that helps students express their intellectual nature to fellow students, faculty, companies and anyone else who prefers a single intellectual platform for sharing knowledge, getting interesting answers and posting intriguing questions, so as to enhance their intellectual approach to certain issues or work.

We have used cutting-edge technologies, such as elements of Data Mining, Machine Learning and Distributed Databases, in implementing this project and in getting the application up on the web. Once the application is deployed on the web, students will have an anonymous identity on the site, through which they can share their knowledge with others. The aim is to connect people of similar intellectual inclination so that they can share ideas with like-minded individuals who have the same interests. By doing so, one can keep track of certain issues or topics and get enlightened.

Contents

1 Introduction
    1.1 General Introduction
    1.2 Motivation
    1.3 Problem Statement
    1.4 Objectives
    1.5 Proposed Model
    1.6 Current And Future Scope

2 Literature Survey
    2.1 Question and Answering
    2.2 Brief History On Q&A Website
    2.3 Question Answering Methods
    2.4 Challenges
    2.5 Progress

3 Software Requirements Specifications
    3.1 General Description
        3.1.1 Project Perspective
        3.1.2 Product Overview
        3.1.3 End User Expectations
        3.1.4 General Constraints of the software
    3.2 Specific Requirements
        3.2.1 Functional Requirements
        3.2.2 Software Requirements
    3.3 Interface Requirements
        3.3.1 User Interface
        3.3.2 Software System Attributes

4 Software Design Documents
    4.1 System Architecture: Overview
    4.2 System Model
    4.3 Subsystem Model and Design

5 System Implementation

6 Software Test Document
    6.1 Introduction
        6.1.1 System Overview
        6.1.2 Test Approach
    6.2 Test Plan
        6.2.1 Features to be Tested
        6.2.2 Features not to be Tested
        6.2.3 Testing Tools and Environment
    6.3 Test Case
        6.3.1 to 6.3.37 Test Case 1 to Test Case 37

7 Conclusion And Future Work

A Q&A Platform
B Django
C NLTK
D Numpy
E PyEnchant
F tf–idf
G Profanity
H SRS
I MVC
J Python

List of Figures

1 Design Of The System
2 Django Python WebFramework
3 Implementation Of The System
4 Implementation Of Profanity Filtering
5 Implementation Of Read At Leisure
6 Implementation Of The Scoring System
7 Implementation Of BookMark
8 Implementation Of Editing Question And Answers
9 Implementation Of Suggesting Editions
10 Implementation Of Follow Questions
11 Implementation Of Text Summarizer
12 Implementation Of Burning Topics
13 Implementation Of Spell Checker

1 Introduction

This chapter introduces the project. It begins with a general introduction, followed by the motivation behind the project, the problem statement with the existing issues, the objectives and the proposed model, and concludes with the current and future scope of the project.

1.1 General Introduction

With today's fast-growing technology and the many social networking sites on the web that help people connect with others, grow their businesses or even find a life partner, social networking has become a very important part of our lives. But social networking websites such as Facebook and LinkedIn have divergent aims: Facebook aims at getting people online, and LinkedIn at connecting them professionally. This rapid growth seems to betray the aim with which the Internet was created, i.e. to “create a fountain of knowledge”. Instead, we have contributed to creating a drain of information. Users need a one-stop place to engage in more intellectual work, as it will help them in the long run: intellectuals are the ones who look for interesting opportunities and, given their interest and will to work, can take on any work.

1.2 Motivation

The survey involved nearly 300 social networking users from diverse backgrounds. Based on the survey and a supporting analysis of popular sites such as Facebook and Quora, we found that Facebook, which aims to connect people, has become merely a place to spend time without a purpose. Quora, a question-and-answer site, has led to very useful information being shared among all types of people, but has failed to make the exchange more intellectual. These platforms also seem to focus far too much on a person's identity.

A student-centric web platform would not just enable knowledge sharing, but would inherently empower people aged 13 and above. This platform helps students improve their intellectual approach by providing each of them with an anonymous identity. This also avoids biased assessment based on an individual's identity.

1.3 Problem Statement

“Man is least himself when he talks in his own person. Give him a mask, and he will tell you the truth.”

The team at Tationem surveyed users about the problems they faced on social networking sites. Based on our analysis of the survey, these sites hold an ocean of information in which people fail to find the information that interests them. Because of biased assessment based on identity and opinions, intellectual ability is not given much recognition. There is also no way to filter out offensive content, which makes it harder for users to find the content that interests them.

1.4 Objectives

• Introducing anonymity

• Uphold user’s intellectuality

• Clustering and classification based on interests and on temporal and spatial factors

• Trending of topics dynamically

• Better Profiling

• Make a better recommendation system

• Automatic subject identification

• Auto tagging

• Corpus based neural network method

• Unraveling the pattern by analyzing huge data sets

1.5 Proposed Model

We have come up with an application that aims to help students express their intellectual nature to fellow students, faculty, companies and anyone else who prefers a one-stop place for sharing knowledge, getting interesting answers and posting intriguing questions, so as to enhance their intellectual approach to certain issues or work.

1.6 Current And Future Scope

Once this application is deployed on the web, students will have an anonymous identity on the site, through which they can share their knowledge with others. The aim is to connect people of similar intellectual inclination so that they can share ideas with like-minded individuals who have the same interests. By doing so, one can keep track of certain issues or topics and get enlightened by them, without discrimination based on one's past, present or future.

As the site is aimed primarily at the student community, it could be an add-on to their professional identity. This could be achieved by linking this site to their social networking site.

The organization of the report:

Chapter 2, Literature Survey, describes the existing Question and Answer platforms, their architecture, their issues and other related work.

Chapter 3, Software Requirements Specification, explains the purpose and environment of our software. It describes what the software will do and how it is expected to perform, that is, the requirements of the software.

Chapter 4, Software Design Documents, explains the architectural design of the system and the strategies involved. It covers the subsystem architecture, the libraries required and the general constraints involved.

Chapter 5, Implementation, explains the overall functionality of the system and the implementation of each module.

Chapter 6, Software Test Document, explains the test plan and testing approach, followed by the features to be tested and not to be tested and the testing tools and environment used, and concludes with the test cases.

Chapter 7, Conclusion and Future Work, concludes that there is a need for an anonymous platform that brings like-minded people together to share their knowledge. It also explains the future scope of this project.

2 Literature Survey

This chapter describes the survey carried out on existing Question and Answer platforms, their architecture and their issues. It also covers work related to these platforms and the progress made in combating the issues present in them.

2.1 Question and Answering

In today's world, humans are almost wholly dependent on the Internet for information. Nearly all of the roughly 50 petabytes of data available on the Internet were first captured and created by human beings by typing, pressing a record button, taking a digital picture, or scanning a bar code. The problem is that people have limited time, attention and accuracy, all of which means they are not very good at capturing data about things in the real world. Users are generating data but are not able to find relevant data in limited time. Yet today's information technology is so dependent on data originated by people that our computers know more about ideas than things. If we had computers that knew everything there was to know about things, using data they gathered without any help from us, we would be able to track and count everything, and greatly reduce waste, loss and cost. We would know when things needed replacing, repairing or recalling, and whether they were fresh or past their best.

Question Answering (Q&A) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language[1]. Q&A has received tremendous interest recently with the emergence of many commercial products and of research papers in various communities[2]. A Q&A implementation, usually a computer program, may construct its answers by querying a structured database of knowledge or information. Q&A systems can also pull answers from an unstructured collection of natural language documents.

The goal is to address the questions[3] and explore their implications for youth identities. While systems come and go, how youth engage through social network sites today provides long-lasting insights into identity formation, status negotiation, and peer-to-peer sociality.[4]

2.2 Brief History On Q&A Website

Question-and-answer websites are sites where questions are created, answered, edited and organized by a community of users. Such sites lack accurate filtering of content, and as a result users are lost in an ocean of information. Any user could post troll questions and answers, and there was no way to prevent this. This seems to betray the aim with which the Internet was created, i.e. to “create a fountain of knowledge”. Instead, we have contributed to creating a drain of information. Sites that aim to connect people and increase their intellectual ability have become merely places to spend time without a purpose. Performance issues and error analysis in an open-domain question answering system must also be considered.[5]

Current Q&A systems typically include a question classifier module that determines the type of question and the type of answer. After the question is analysed[6], the system typically uses several modules that apply increasingly complex NLP[7] techniques to a gradually reduced amount of text. First, a document retrieval module uses search engines to identify the documents or paragraphs in the document set that are likely to contain the answer. Subsequently, a filter preselects small text fragments that contain strings of the same type as the expected answer. For example, if the question is “Who invented Penicillin”, the filter returns text that contains names of people. Finally, an answer extraction module looks for further clues in the text to determine whether the answer candidate can indeed answer the question.
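The four stages described above (question classification, document retrieval, filtering by expected answer type, answer extraction) can be illustrated with a toy Python sketch. It is not the implementation of any system cited here: the function names, the question-word rules, the stopword list and the two sample documents are assumptions made for the example, and the "person name" filter is a crude capitalised-word heuristic standing in for real NLP.

```python
import re

# Hypothetical rules mapping a question word to an expected answer type.
ANSWER_TYPES = {"who": "PERSON", "when": "DATE", "where": "PLACE"}
STOPWORDS = {"who", "when", "where", "what", "the", "a", "an",
             "of", "in", "was", "is", "by"}


def classify_question(question):
    """Question classifier: guess the expected answer type from the first word."""
    first_word = question.lower().split()[0]
    return ANSWER_TYPES.get(first_word, "OTHER")


def keywords(text):
    """Lower-cased words of a text with stopwords removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def retrieve(question, documents):
    """Document retrieval: keep documents sharing a keyword with the question."""
    q = keywords(question)
    return [d for d in documents if q & keywords(d)]


def candidate_answers(answer_type, document):
    """Filter: pre-select fragments whose form matches the expected answer type."""
    if answer_type == "PERSON":
        # Crude stand-in for name recognition: two capitalised words in a row.
        return re.findall(r"[A-Z][a-z]+ [A-Z][a-z]+", document)
    if answer_type == "DATE":
        return re.findall(r"\b\d{4}\b", document)
    return []


def answer(question, documents):
    """Answer extraction: first candidate from the best-matching document."""
    answer_type = classify_question(question)
    q = keywords(question)
    ranked = sorted(retrieve(question, documents),
                    key=lambda d: len(q & keywords(d)), reverse=True)
    for doc in ranked:
        found = candidate_answers(answer_type, doc)
        if found:
            return found[0]
    return None


DOCS = [
    "Penicillin was discovered by Alexander Fleming in 1928.",
    "Bengaluru is a city in southern India.",
]
print(answer("Who invented Penicillin", DOCS))  # Alexander Fleming
```

Each stage narrows the text handed to the next one, which is the point of the pipeline: the expensive analysis only runs on the few fragments that survive retrieval and filtering.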

2.3 Question Answering Methods

Q&A is very dependent on a good search corpus, for without documents containing the answer there is little any Q&A system can do. It thus makes sense that larger collection sizes generally improve performance, unless the question domain is orthogonal to the collection. A growing Q&A site continues to connect visitors and experts to the right questions[8]. The notion of data redundancy in massive collections, such as the web, means that nuggets of information are likely to be phrased in many different ways in differing contexts and documents, leading to two benefits:

• By having the right information appear in many forms, the burden on the Q&A system to perform complex NLP techniques[9] to understand the text is lessened.

• Correct answers can be filtered from false positives by relying on the correct answer to appear more times in the documents than instances of incorrect ones.
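The second benefit amounts to a majority vote over candidate answers. A minimal sketch, assuming the candidates have already been extracted from several documents (the `vote` function and the sample list are hypothetical, not taken from the report):

```python
from collections import Counter


def vote(candidates):
    """Return the most frequent candidate answer and its share of all votes."""
    if not candidates:
        return None, 0.0
    # Normalise case and whitespace so surface variants count as one answer.
    normalised = [" ".join(c.lower().split()) for c in candidates]
    best, count = Counter(normalised).most_common(1)[0]
    return best, count / len(normalised)


# Hypothetical candidates extracted from five different documents.
extracted = ["Alexander Fleming", "alexander fleming", "Louis Pasteur",
             "Alexander  Fleming", "Howard Florey"]
best, share = vote(extracted)
print(best, share)  # alexander fleming 0.6
```

The share of votes can serve as a rough confidence value: an answer that wins with a small share is a sign that the corpus does not contain enough redundant evidence.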

SOCIAL NETWORK ANALYSIS

Social network analysis (SNA) provides a rich and systematic means of assessing informal networks by mapping and analyzing relationships among people, teams, departments or even entire organizations.[10]

2.4 Challenges

The following are some of the issues faced in Q&A platforms:

I. Question classes
Different types of questions require different strategies to find the answer. Question classes are arranged hierarchically in taxonomies.

II. Question processing
The same information request can be expressed in various ways, some interrogative and some assertive. A semantic model of question understanding and processing would recognize equivalent questions, regardless of how they are presented. This model would enable the translation of a complex question into a series of simpler questions, and would identify ambiguities and treat them in context or by interactive clarification.

III. Context and Q&A
Questions are usually asked within a context and answers are provided within that specific context. The context can be used to clarify a question, resolve ambiguities or keep track of an investigation performed through a series of questions.

IV. Data sources for Q&A
Before a question can be answered, it must be known what knowledge sources are available and relevant. If the answer to a question is not present in the data sources, a correct result will not be obtained, no matter how well the question processing, information retrieval and answer extraction are performed.

V. Answer formulation
The result of a Q&A system should be presented in as natural a way as possible. In some cases, simple extraction is sufficient. For example, when the question classification indicates that the answer type is a name (of a person, organization, shop or disease, etc.), a quantity (monetary value, length, size, distance, etc.) or a date, the extraction of a single datum is sufficient. In other cases, the presentation of the answer may require fusion techniques that combine partial answers from multiple documents.

VI. Real time question answering
There is a need for Q&A systems that are capable of extracting answers from large data sets in several seconds, regardless of the complexity of the question, the size and multitude of the data sources or the ambiguity of the question.

VII. Interactive Q&A
It is often the case that the information need is not well captured by a Q&A system, as the question processing part may fail to classify the question properly, or the information needed for extracting and generating the answer is not easily retrieved. In such cases, the questioner might want not only to reformulate the question, but to have a dialogue with the system. In addition, the system may also use previously answered questions. (For example, the system might ask for a clarification of the sense in which a word is being used, or of what type of information is being asked for.)

VIII. Advanced reasoning for Q&A
More sophisticated questioners expect answers that are outside the scope of written texts or structured databases. It is necessary to integrate reasoning components operating on a variety of knowledge bases as well as knowledge specific to a variety of domains.

IX. Information clustering for Q&A
Information clustering for question answering systems[15] is a new trend that originated to increase the accuracy of question answering systems through search space reduction. In recent years this has been widely researched through the development of question answering systems that support information clustering in their basic flow of process.

X. User profiling for Q&A
The user profile captures data about the questioner, comprising context data, domain of interest, reasoning schemes frequently used by the questioner, common ground established within different dialogues between the system and the user, and so forth. The profile may be represented as a predefined template, where each template slot represents a different profile feature. Profile templates may be nested one within another.
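One way to picture a profile template with nested slots is a pair of Python dataclasses. This is only an illustrative sketch: the class names, field names and sample values below are assumptions chosen to mirror the features listed in item X, not a schema from the report.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DialogueContext:
    """Nested template: common ground established in one dialogue."""
    topic: str
    clarified_terms: Dict[str, str] = field(default_factory=dict)


@dataclass
class UserProfile:
    """Top-level template; every field is one profile feature (slot)."""
    user_id: str
    domains_of_interest: List[str] = field(default_factory=list)
    reasoning_schemes: List[str] = field(default_factory=list)
    dialogues: List[DialogueContext] = field(default_factory=list)


# Hypothetical questioner with one recorded dialogue.
profile = UserProfile(user_id="123456789", domains_of_interest=["databases"])
profile.dialogues.append(
    DialogueContext(topic="indexing", clarified_terms={"key": "primary key"})
)
print(profile.dialogues[0].clarified_terms["key"])  # primary key
```

Because each slot has a default, a profile can start almost empty and be filled in as the system learns more about the questioner.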

2.5 Progress

Q&A systems have been extended in recent years to encompass additional domains of knowledge. For example, systems have been developed to automatically answer temporal and geospatial questions, questions of definition and terminology, biographical questions, multilingual questions, and questions about the content of audio, images, and video. Current Q&A research topics include:

• interactivity—clarification of questions or answers

• answer reuse or caching

• knowledge representation and reasoning

• social media analysis with Q&A systems

• sentiment analysis

• utilizing thematic roles

Studies have shown that allowing people to answer questionnaires completely anonymously yields more reports of socially inappropriate attitudes, beliefs, and behaviors, and researchers have often assumed that this is evidence of increased honesty.[11] The above challenges can be addressed by introducing anonymity among users[12] and by using strong data mining algorithms along with machine learning and pattern matching algorithms.

3 Software Requirements Specifications

In this chapter we explain the purpose and environment of our software: what the software will do and how it is expected to perform.

We explain how the application will interact with system hardware, other programs and human users in a wide variety of real-world situations. Parameters such as operating speed, response time, availability, portability, maintainability, footprint, security and speed of recovery from adverse events are evaluated. The specification also establishes the basis for an agreement between customers and contractors or suppliers (in market-driven projects, these roles may be played by the marketing and development divisions) on what the software product is to do as well as what it is not expected to do.

3.1 General Description

Based on the survey done previously, the problems with existing Q&A platforms and the changes they need have been noted. The basic aim with which Q&A platforms started, namely to uphold intellectual integrity, has been lost; instead, assessment is biased by identity[18]. To solve the problems of this existing model, we have come up with the idea of masking the identity of users and hence keeping the platform completely anonymous. This avoids biased assessment and also upholds the intellect of the individual.

3.1.1 Project Perspective

The Question and Answer platform we have built is similar to other existing Q&A platforms, with some additional functionality. Analysing the results of the survey led us to the basic requirement of keeping the platform completely anonymous by assigning each user a randomly generated 9-digit ID. The “me too” trend has to be avoided in question and answer platforms, which removes identity-based bias to some extent. For example, on a social networking site like Facebook, if a post has 5 thousand likes, other people who read it (or may not even read it) tend to like it as well. This “me too” trend leads to a loss of new and creative ideas. To avoid it, we have come up with a basic requirement: the rating of a question or answer is hidden from a user until that user has rated it. In this way we try to avoid biased rating of a question or answer based on previous ratings. This makes each person read the content and arrive at a decision independently, without any influence.
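The two requirements in this paragraph, a random 9-digit ID per user and a rating that stays hidden until the viewer has rated, can be sketched as follows. The function and class names, the no-leading-zero ID range and the numeric rating scale are assumptions made for illustration; the report does not specify them at this point.

```python
import secrets


def new_anonymous_id(taken):
    """Return a random 9-digit ID (100000000-999999999) not already in `taken`."""
    while True:
        candidate = str(100_000_000 + secrets.randbelow(900_000_000))
        if candidate not in taken:
            taken.add(candidate)
            return candidate


class Post:
    """A question or answer whose rating is hidden until the viewer has rated it."""

    def __init__(self, text):
        self.text = text
        self.ratings = {}  # anonymous user ID -> rating given by that user

    def rate(self, user_id, value):
        self.ratings[user_id] = value

    def visible_rating(self, user_id):
        """Average rating, or None if this user has not rated the post yet."""
        if user_id not in self.ratings:
            return None
        return sum(self.ratings.values()) / len(self.ratings)


taken = set()
user_a, user_b = new_anonymous_id(taken), new_anonymous_id(taken)
post = Post("What is tf-idf?")
post.rate(user_a, 4)
print(post.visible_rating(user_b))  # None: user_b has not rated yet
post.rate(user_b, 2)
print(post.visible_rating(user_b))  # 3.0
```

Keeping the check inside `visible_rating` means the display code never has to decide whether a rating may be shown, so the "rate first, see later" rule cannot be bypassed by accident.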

As the platform is completely anonymous, there is always the fear of irrelevant and abusive content in the system. To avoid this and let quality content prevail, users are allowed to rate content democratically for its quality. Each question and answer is displayed according to its overall rating: highly rated, high-quality content is placed at the top of the newsfeed, while the remaining less useful content is placed at the bottom and may eventually be eliminated from the system.
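The ordering rule described here can be sketched as a sort on average rating with a cut-off for consistently poor content. The function name and both thresholds (`min_ratings`, `drop_below`) are assumed values for the example, not figures from the report.

```python
def build_newsfeed(posts, min_ratings=3, drop_below=2.0):
    """Order posts by average rating, best first, and drop consistently poor ones.

    `posts` maps a post title to the list of ratings it has received.
    `min_ratings` and `drop_below` are assumed thresholds, not values from the report.
    """
    def average(ratings):
        return sum(ratings) / len(ratings) if ratings else 0.0

    kept = {
        title: ratings for title, ratings in posts.items()
        # Only drop a post once enough users have rated it poorly.
        if not (len(ratings) >= min_ratings and average(ratings) < drop_below)
    }
    return sorted(kept, key=lambda title: average(kept[title]), reverse=True)


posts = {
    "How does tf-idf work?": [5, 4, 5],
    "Best laptop for coding?": [3, 3],
    "asdf": [1, 1, 2],
    "New, unrated question": [],
}
print(build_newsfeed(posts))
```

Requiring a minimum number of ratings before a post is dropped keeps one hostile rating from removing new content, which matters on an anonymous platform.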

3.1.2 Product Overview

The overall product has 13 features that distinguish it from other Q&A platforms. The features and their descriptions are as follows:

• Anonymity
The platform helps students improve their intellectual approach by providing each of them with an anonymous identity. This avoids biased assessment based on an individual's identity. Anonymity has a lot of significance: it gives no importance to recognition, power, caste, creed or gender, which have caused difficulties in some societies. It also has its own drawbacks: it is not reliable and can be abused, leading to the ruin of innocent people.

• Spell Checker
This feature is important because it increases readability for users and gives them access to good-quality content. We believe in quality content, which cannot be achieved if the system is filled with a drain of information.

• Profanity
Anonymity is not reliable and can be abused, leading to the ruin of innocent people. This feature helps filter out offensive content and makes the platform user-friendly.

• Similar Questions Detection
This feature helps reduce duplicate content on the platform. When the question a user wants answered is already available on the platform, we can suggest those existing questions to the user rather than adding repeated content.

• RAL (Read At Leisure)
Users can download questions and answers and read them at their leisure. Since the Internet may not be available all the time, content can be downloaded and read offline when time permits.

• BookMark
Save the link to a question or answer for quick access in future. Questions and answers can be stored under customized labels. This acts as a single place to keep all your favorite questions and answers.

• Edit Question/Answer
If users find a mistake in content they have posted, they can always edit and correct it.

• Suggestion For Edits
If a user finds a grammar or spelling mistake in content posted by someone else, he/she can suggest an edit to the owner of the question or answer; it is the owner's choice to accept or reject the suggestion. Our primary focus is quality content.

• Follow Question
If a user finds a question very interesting, he/she can follow it so as to get notified of any activity related to it. This makes it easier to keep track of the question than searching for it in the news feed.

• Text Summarization
This feature saves the user's time by condensing long answers into a paragraph or two while retaining the key points. Few readers have time for a 10-page answer, so the platform provides a text summarizer.

• Auto tagging
This feature helps in clustering content and in learning about user interests. Part-of-speech (POS) tagging is applied to the question to extract tags, and questions are classified by those tags.

• Scoring System
To make the platform more interactive and attractive to students, an element of gamification is introduced. For example, a user is awarded credits for contributing a quality question or answer.

• Burning Topics
This feature shows the most popular and talked-about topics currently on the platform.

3.1.3 End User Expectations

The end user ultimately wants a fair assessment of the quality of the content he/she has contributed to the system. Users need quality, useful content that brings like-minded users together and helps them achieve a common purpose. Clusters of users are formed dynamically and virtually on the basis of their intellect, interests and consistency. Users want to connect with like-minded people, contribute something useful and get something useful from the platform. The platform encourages users to post quality content by ensuring that good content is rewarded and recognised. This keeps users motivated to keep posting good-quality content, which develops their intellect and fulfils the basic aim of the platform.

The primary motivations for asking questions in Q&A are to fulfill cognitive needs. Users motivated by cognitive needs tend to expect that they will acquire relevant information, learn through acquiring information, and receive others' opinions and advice. It is not surprising that the motivation of acquiring information plays an important role in choosing to use Q&A services, since these services constitute a knowledge exchange community that facilitates a user-driven environment for information seeking and sharing. The study also found that another significant motivation for asking a question in Q&A is to have fun while asking a question that seeks information (Mean=3.56). One respondent explained this motivation by stating:

“Sometimes I get great answers and other times the replies are silly or really stupid. It’s an adventure when you get an answer and that’s part of the fun.”

Combining this observation with the significant, moderate levels of correlation found between different affective levels of motivation and what could arguably be considered subjective elements of users' expectations, it would be interesting for future research to analyze the interplay between users' emotional and affective conditions and subsequent expectations and motivations exercised within the unique context of Q&A environments.

3.1.4 General Constraints of the software

Security is a general constraint of any software. In a completely anonymous system, privacy and security are a particular concern.
Anonymous Internet users can easily infringe on an intellectual property owner's trademark by simply inserting the trademark into web content to divert Internet traffic. Because users can own domain names anonymously, trademark infringement can also be committed by an unidentified user, thereby necessitating discovery of the website owner's identity through other means.

3.2 Specific Requirements

3.2.1 Functional Requirements

As a question and answer platform that is also a social networking site, its functionality is largely similar to that of other social sites, with the addition of the platform's novel features.

FR1: The assessment of questions and answers must be based purely on their intellectual merit and not on caste, creed, religion, gender, social status, etc.

FR2: The rating of a question or answer is hidden from a user until that user has rated it. This avoids bias in rating content based on ratings already given.

FR3: Burning topics are displayed dynamically, based on time and importance.

FR4: Read at Leisure: users must be able to download a question or answer as a file, save it on the local system and read it at their leisure.

FR5: Text summarization is needed to reduce long content to a short, quickly readable form.

FR6: Since users are anonymous, there is a chance that profanity and abusive words will be used, so a profanity filtering mechanism is needed.

FR7: Since our motto is quality content, the platform must be kept free of grammar and spelling mistakes, so a spell checker is needed.

FR8: To make search results more accurate, tags must be associated with questions so that searching by tag returns accurate results. Auto tagging is used for this.

FR9: Users must be able to edit their questions and answers if the quality is poor.

FR10: Users must be able to apply customized tags to their favorite questions and answers.
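As an illustration of FR2, the following minimal Python sketch hides a post's average rating from a user until that user has rated it. The class name `Post`, the method names and the use of an average are assumptions made for this example; the report does not describe how ratings are stored or aggregated.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Post:
    """A question or answer together with the ratings it has received."""
    text: str
    ratings: Dict[str, int] = field(default_factory=dict)  # uid -> rating

    def rate(self, uid: str, value: int) -> None:
        """Record (or overwrite) the rating given by one user."""
        self.ratings[uid] = value

    def visible_rating(self, uid: str) -> Optional[float]:
        """Return the average rating only if this user has already rated."""
        if uid not in self.ratings:
            return None  # FR2: keep the score hidden until the viewer has rated
        return sum(self.ratings.values()) / len(self.ratings)
```

A viewer who has not yet rated receives `None`, so the page can simply omit the score for that viewer.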

3.2.2 Software Requirements

A platform on which users can post questions and answers and perform other related activities is required.

The requirements for the software are:
i) The software should provide a platform for users to post and access content.
ii) The software should be scalable enough to handle many users and their activities.
iii) The software should respect the privacy and security concerns of the users.
iv) As the software is completely anonymous, the filtering mechanism for abusive and irrelevant content has to be strong.
v) The software has to be usable, durable and available at all times.
vi) The software has to be flexible and reliable, and the user interface has to be easy for all kinds of users.

3.3 Interface Requirements

The platform is a social site that brings like-minded people together by learning their interests, intellect and consistency. It has to be interactive enough for users to learn each other's interests and share knowledge among themselves.
The interface has to be easy for users to interact with, which makes it easier for the platform to create virtual and dynamic groups based on similar interests.

3.3.1 User Interface

i) Simple and easy: The user interface has to be simple and easy to use.
ii) Clarity: The interface avoids ambiguity by making everything clear through language, flow, hierarchy and metaphors for visual elements.
iii) Concision: It is easy to make an interface clear by over-clarifying and labeling everything, but this clutters it; the interface should stay clear while remaining concise.
iv) Familiarity: Even if someone uses an interface for the first time, certain elements can still be familiar. Real-life metaphors can be used to communicate meaning.
v) Responsiveness: A good interface should not feel sluggish. This means that the interface should provide good feedback to the user about what is happening and whether the user's input is being successfully processed.
vi) Consistency: Keeping the interface consistent across the application is important because it allows users to recognize usage patterns.
vii) Aesthetics: While an interface does not need to be attractive to do its job, making it look good makes the time users spend in the application more enjoyable, and happier users can only be a good thing.
viii) Efficiency: Time is money, and a great interface should make the user more productive through shortcuts and good design.
ix) Forgiveness: A good interface should not punish users for their mistakes but should instead provide the means to remedy them.

3.3.2 Software System Attributes

The system needs to be efficient and meet the following criteria.

• Reliability: The platform should be trustworthy. It should protect the privacy of its users and not share the details provided during registration.

• Availability: The platform should stay active, with downtime kept to a minimum so that it is available at all times.

• Security: As the platform is anonymous, there is a greater chance of cyber bullying and of irrelevant and abusive content being posted. The platform therefore has to be highly secure and safeguard the privacy of its users.

• Maintainability: As content and users increase, keeping the platform consistently maintained is a continuing challenge.

• Portability: The platform has no hardware or OS constraints.

• Performance: To assess the performance of a system, the following must be clearly specified:
  Response time: The response time must be low.
  Workload: The software should be able to bear the workload, for example the number of users or the number of activities and events generated.
  Scalability: The software should be scalable and handle increases in the system's workload.
  Platform: A platform is defined as the underlying hardware and software (operating system and software utilities) that will house the system. The software should be as platform independent as possible.

4 Software Design Documents

This chapter explains the architectural design of the system and the strategies involved. It covers the subsystem architecture, the libraries required and the general constraints involved.

4.1 System Architecture: Overview

Figure 1: Design Of The System

The architectural diagram of the platform is given above. To access the content of the platform, a user must create an account by providing his/her details. Once the user registers, the backend assigns a unique id by which the user is identified, and this is how anonymity is achieved.
After registering, the user logs in with email id and password and enters the newsfeed landing page. The news feed page is filled with recently posted questions and answers. As the user spends more time on the system, it learns his/her interests and shows questions and answers matching those interests, mixed with some randomization. A question posted by the user passes through the profanity filter and the spell checker and is then stored in the database. My Activities keeps track of all the user's contributions to the system. Notification informs the user about any activity on the user's questions and answers. My Score keeps track of the user's score and badges of honour. The overall system is built on Django, a Python web framework.

4.2 System Model

The entire application is built on the Django platform. Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It is free and open source. It has all the features needed for building a website; we just need to import it and start using it in our application. The database system used here is SQLite, Django's default. Before we start with the implementation, let us first discuss the components of the Django web framework. Django is built on the MVC (Model-View-Controller) architecture. A request is sent to the system from the user's browser. The controller passes it on to the Views, which hold the functions to run, and the Models, which hold the definitions of the database fields. Together these process the data and return the output to the controller, which passes it on to the user. This is the basic functioning of the Django framework.

Figure 2: Django Python Web Framework

1) Controller
A controller is the heart of the system; it steers everything. For a web framework, this means handling requests and responses, setting up database connections and loading add-ons. For this, Django reads a settings file so that it knows what to load and set up. Django also reads a URL config file that tells it what to do with the incoming requests from browsers.

2) Model
The model layer in Django means the database plus the Python code that directly uses it. It models reality. You capture whatever your website needs in database tables. Django helps you write Python classes (called models) that tie 1:1 to the database tables.

3) View
The view layer is the user interface. Django splits this up into the actual HTML pages and the Python code (called views) that renders them. It also has an automatic web admin interface for editing the models.

4.3 Subsystem Model and Design

The Django framework can be combined with many Python libraries and packages. The ones used in this project are:

• NLTK

• NUMPY

• PYENCHANT

1) NLTK
The Natural Language Toolkit (NLTK) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning[17], and an active discussion forum.
Thanks to a hands-on guide introducing programming fundamentals alongside topics in computational linguistics, NLTK is suitable for linguists, engineers, students, educators, researchers, and industry users alike. NLTK is available for Windows, Mac OS X, and Linux. Best of all, NLTK is a free, open source, community-driven project.
Natural Language Processing with Python provides a practical introduction to programming for language processing. Written by the creators of NLTK, it guides the reader through the fundamentals of writing Python programs, working with corpora, categorizing text, analyzing linguistic structure, and more.[13]

2) NUMPY
NumPy[14] is the fundamental package for scientific computing with Python. It contains, among other things:
a) a powerful N-dimensional array object
b) sophisticated (broadcasting) functions
c) tools for integrating C/C++ and Fortran code
d) useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
NumPy is licensed under the BSD license, enabling reuse with few restrictions.

3) PYENCHANT
PyEnchant[? ] is a spellchecking library for Python, based on the Enchant library. Most other spellchecking solutions are tied to a single spellchecking platform, such as aspell or MySpell. By contrast, Enchant supports multiple spellchecking platforms. A good discussion of why this is an advantage can be found on the Enchant website under the heading “Enchant and Multiple Backends”. In summary:
a) Different backends can be used for different languages, depending on which does a better job.
b) It integrates with the user's “native” spellchecker, whatever that may be.
c) This flexibility is transparent to the application programmer.
PyEnchant is available under the GNU LGPL. This may mean it can be used in some projects where other libraries (such as GPL-licensed libraries) cannot.
The Enchant API is also generally simpler than that provided by other spellchecking solutions. This can be an advantage or a disadvantage depending on the needs of the program.

5 System Implementation

This chapter explains the overall functionality of the system and the step-by-step implementation of each module and feature.

The entire application is built on the Django platform, with SQLite as the database. The framework and its Controller, Model and View components are described in Section 4.2.

Figure 3: Implementation Of The System

Coming to the implementation, we first discuss the backend and then the front end. All database fields are defined in models.py. For example, the code below defines the database fields for the registration page:

    from django.db import models
    # ListField is not a core Django field; it is assumed to be provided by
    # the project or by a third-party package and imported here.

    class regis(models.Model):
        username = models.CharField(max_length=150)
        dob = models.CharField(max_length=100)
        email = models.EmailField()
        password1 = models.CharField(max_length=10)
        password2 = models.CharField(max_length=10)
        mobile = models.CharField(max_length=10)
        education = models.CharField(max_length=10)
        institute = models.CharField(max_length=10)
        # unique id that identifies the user anonymously
        uid = models.CharField(max_length=10, primary_key=True)
        timestamp = models.DateTimeField(auto_now_add=True)
        # qid of questions followed by user
        tracking_ques = ListField()
        book_markq = ListField()  # ids of bookmarked questions
        book_marka = ListField()  # ids of bookmarked answers

        def __unicode__(self):
            return u'%s %s %s %s %s %s' % (
                self.username, self.email,
                self.password1, self.password2,
                self.uid, self.tracking_ques)

The fields store the name, date of birth, email id, passwords, mobile number, educational qualification, institute, unique user id, registration timestamp, ids of the questions/answers followed by the user, ids of the questions bookmarked by the user and ids of the answers bookmarked by the user.

The other backend component, views, defines the operations to be performed on the data that the user sends.

Finally, the user interface is built using HTML. The HTML files are stored in templates, and the mapping between URL paths and view functions is stored in urls.py, which specifies which path invokes which function in views.
Once everything is set up, open a terminal and run the following commands:

    python manage.py syncdb
    python manage.py runserver

The first command creates the database tables for the models, and the second starts the development server so that the application can be opened in a browser. The implementation of each feature is discussed below, along with its flowchart:

• Anonymity

• Spell Checker

• Profanity

• Similar Questions Detection

• Read at Leisure

• Bookmark

• Edit Question/Answers

• Suggestion For Edits

• Follow Question

• Text Summarization

• Auto tagging

• Scoring System

• Burning Topics

• Anonymity
The platform helps students develop their thinking by giving each of them an anonymous identity. This avoids assessment that is biased by an individual's identity. Anonymity gives no importance to recognition, power, caste, creed or gender, which have caused difficulties in some societies. It also has drawbacks: it is not reliable and can be abused, leading to the ruin of innocent people.

• Profanity
Anonymity is not reliable and can be abused, leading to the ruin of innocent people. This feature filters out offensive content and keeps the platform user friendly.

Figure 4: Implementation Of Profanity Filtering
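The filtering step can be sketched in a few lines of Python. This is a minimal word-list approach and only an assumption about how the filter might work; the block list below is a mild, hypothetical placeholder, and the function names are invented for this example.

```python
import re

# Hypothetical block list; a real deployment would load a maintained word list.
BLOCKED_WORDS = {"idiot", "stupid"}


def find_profanity(text, blocked=BLOCKED_WORDS):
    """Return the blocked words found in text, in order of first appearance."""
    found = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in blocked and word not in found:
            found.append(word)
    return found


def check_post(text):
    """Return (accepted, offending_words) for a question or answer."""
    offending = find_profanity(text)
    return (not offending, offending)
```

When `check_post` rejects a post, the offending words it returns can be shown in the alert so that the user knows what to change.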

• Similar Questions Detection
This feature reduces duplicate content on the platform. When the question a user wants answered is already available on the platform, the existing questions are suggested to the user instead of adding repeated content.
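The report does not state which similarity measure is used, so the sketch below assumes a simple one: Jaccard overlap between the word sets of two questions. The function names and the threshold of 0.4 are illustrative choices, not values taken from the project.

```python
import re


def tokens(text):
    """Lower-case a question and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets: shared words / all words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_similar(question, existing, threshold=0.4):
    """Return existing questions whose similarity meets the threshold, best first."""
    q = tokens(question)
    scored = [(jaccard(q, tokens(e)), e) for e in existing]
    return [e for score, e in sorted(scored, reverse=True) if score >= threshold]
```

Word overlap only catches near-duplicates that share vocabulary; questions that are paraphrased with different words would need a semantic measure, for example one based on WordNet.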

• Auto tagging
This feature helps in clustering content and in learning about user interests. Part-of-speech (POS) tagging is applied to the question to extract tags, and questions are classified by those tags.
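A minimal sketch of tag extraction from POS-tagged tokens follows. It assumes that tags are formed from runs of consecutive nouns, which is one plausible rule and not necessarily the one used in the project. The input is a list of (word, tag) pairs in the form returned by NLTK's `nltk.pos_tag(nltk.word_tokenize(question))`; the pairs are written by hand here so that the example runs without the NLTK data files.

```python
def extract_tags(tagged_tokens):
    """Group consecutive noun tokens (POS tags starting with 'NN') into tags."""
    tags, phrase = [], []
    for word, pos in tagged_tokens:
        if pos.startswith("NN"):
            phrase.append(word)
            continue
        if phrase:
            tags.append(" ".join(phrase))
            phrase = []
    if phrase:
        tags.append(" ".join(phrase))
    # Drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(tags))


# Hand-tagged tokens using Penn Treebank tags (illustrative input).
tagged = [
    ("Which", "WDT"), ("IAS", "NNP"), ("academy", "NN"), ("gives", "VBZ"),
    ("the", "DT"), ("best", "JJS"), ("coaching", "NN"), ("in", "IN"),
    ("chennai", "NN"), ("?", "."),
]
print(extract_tags(tagged))  # ['IAS academy', 'coaching', 'chennai']
```

Grouping adjacent nouns keeps multi-word names such as "IAS academy" together as a single tag.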

• Read At Leisure
Users can download questions and answers and read them at their leisure. Because Internet access may not always be available, the content can be downloaded and read offline.

Figure 5: Implementation Of Read At Leisure
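One way to prepare the downloadable file is sketched below in plain Python. The function name `build_download`, the plain-text layout and the file-naming rule are assumptions made for this example. In a Django view, the returned text could be sent back as an attachment by setting the `Content-Disposition` header on the response.

```python
def build_download(question, answers):
    """Return (filename, text) for an offline copy of a question and its answers."""
    lines = ["Question: " + question, ""]
    for number, answer in enumerate(answers, start=1):
        lines.append("Answer %d: %s" % (number, answer))
    # Keep only characters that are safe in a file name.
    slug = "".join(c if c.isalnum() else "_" for c in question.lower())[:40]
    return slug.strip("_") + ".txt", "\n".join(lines) + "\n"
```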

• Scoring System
To make the platform more interactive and attractive to students, an element of gamification is introduced. For example, a user is awarded credits for contributing a quality question or answer.

Figure 6: Implementation Of The Scoring System
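The crediting logic might look like the following sketch. The credit values, badge names and thresholds are hypothetical, since the report does not list the actual numbers; only the idea of awarding credits and badges comes from the text.

```python
# Hypothetical credit values; the report does not state the actual numbers.
CREDITS = {"question": 5, "answer": 10}
BADGES = [(100, "Gold"), (50, "Silver"), (10, "Bronze")]  # (minimum score, badge)


def award(scores, uid, kind):
    """Add the credits for one quality contribution and return the new score."""
    scores[uid] = scores.get(uid, 0) + CREDITS[kind]
    return scores[uid]


def badge_for(score):
    """Return the highest badge earned for a score, or None."""
    for minimum, name in BADGES:
        if score >= minimum:
            return name
    return None
```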

• BookMark
Users can save the link to a question or answer for quick access in the future, with a customized label for each saved item. This acts as a single place to keep all favorite questions and answers.

Figure 7: Implementation Of BookMark

• Edit Question/Answer
A user who finds a mistake in content he/she has posted can always edit and correct it. Our primary focus is quality content, and preserving it requires that incorrect content be corrected.

Figure 8: Implementation Of Editing Question And Answers

• Suggestion For Edits
A user who finds a grammar or spelling mistake in content posted by someone else can suggest an edit to the owner of the question or answer; the owner chooses whether to accept or reject the suggestion. Our primary focus is quality content, and preserving it requires that incorrect content be corrected.

Figure 9: Implementation Of Suggesting Editions

• Follow Question
A user who finds a question interesting can follow it to be notified of any activity related to it. This is easier than searching for the question in the news feed.

Figure 10: Implementation Of Follow Questions

• Text Summarization
This feature saves the user's time by condensing long answers into a paragraph or two while retaining the key points. Few readers have time for a 10-page answer, so the platform provides a text summarizer.

Figure 11: Implementation Of Text Summarizer
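The report does not name the summarization algorithm. As an illustration only, the sketch below assumes a simple frequency-based extractive approach: each sentence is scored by how often its words occur in the whole text, and the top-scoring sentences are kept in their original order. The length-based stop-word filter is a deliberate simplification.

```python
import re
from collections import Counter


def summarize(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if len(w) > 3)  # crude stop-word filter

    def score(sentence):
        # Sum of the text-wide frequencies of the words in this sentence.
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

Because the output consists of sentences copied from the answer, the summary never introduces wording that the author did not write.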

• Burning Topics
This feature shows the most popular and talked-about topics currently on the platform.

Figure 12: Implementation Of Burning Topics
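A possible way to compute burning topics is to count the tags of posts made within a recent time window. The sketch below assumes that each post carries a timestamp and a list of tags; the 24-hour window, the function name and the use of raw counts as the measure of importance are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime, timedelta


def burning_topics(posts, now, window_hours=24, top_n=3):
    """Return the most frequent tags among posts made within the time window.

    posts is an iterable of (timestamp, tags) pairs.
    """
    cutoff = now - timedelta(hours=window_hours)
    counts = Counter()
    for timestamp, tags in posts:
        if timestamp >= cutoff:
            counts.update(tags)
    return [tag for tag, _ in counts.most_common(top_n)]
```

Recomputing the list on each request, or on a schedule, keeps it dynamic as new questions arrive and old ones fall out of the window.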

• Spell Checker
This feature improves readability and gives users access to good-quality content. Quality content cannot be achieved if the system is filled with poorly written text, so a spell checker is needed to avoid grammar and spelling mistakes.

Figure 13: Implementation Of Spell Checker
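The project uses PyEnchant for spell checking. To keep the example free of external dependencies, the sketch below substitutes Python's standard `difflib` module and a tiny hand-written vocabulary, so it illustrates the correction flow and is not the project's implementation. With PyEnchant, the lookup would instead go through a dictionary object such as `enchant.Dict("en_US")` and its `check` and `suggest` methods.

```python
import difflib

# Tiny illustrative vocabulary; a real checker would use a full dictionary.
VOCABULARY = ["which", "could", "be", "best", "way", "to", "propose", "a", "girl"]


def correct_word(word, vocabulary=VOCABULARY):
    """Return the closest known word, or the word itself if none is close."""
    if not word.isalpha() or word.lower() in vocabulary:
        return word
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1)
    return matches[0] if matches else word


def correct_text(text):
    """Correct each whitespace-separated token of a question or answer."""
    return " ".join(correct_word(token) for token in text.split())
```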

6 Software Test Document

This chapter describes the test plan and testing approach, followed by the features to be tested and not to be tested and the testing tools and environment used, and concludes with the test cases.

6.1 Introduction

6.1.1 System Overview

The system is a completely anonymous Question & Answer platform. Users can post questions and answer questions posted by other members of the site. Features such as bookmarking of questions/answers, following questions, rating of questions/answers and rewarding users for their contributions add to the user's experience on the site.

6.1.2 Test Approach

Our test approach involves measuring the accuracy of the system's predictions and of the posting/editing of questions & answers, given our homogeneous inputs. We would also like to be able to explain any error in prediction or in the posting of questions & answers and pinpoint its source, in order to aid us in correcting the error in future builds.

6.2 Test Plan

6.2.1 Features to be Tested

Our test cases are homogeneous with respect to the questions & answers posted. However, we will monitor the validity of the predicted overall theme and the accuracy of the scoring mechanism in suppressing the effect of incorrect tagging of questions and ensuring that correct tags come through as dominant at all levels of scoring.

6.2.2 Features not to be Tested

Our test cases are modeled to be consistent with respect to the questions & answers posted, although there is a clearly dominant theme in them as well.

The test cases will also not test the absolute run time of the system, since it depends on too many extraneous factors, such as the speed of the Internet connection the computer uses and the database used.

6.2.3 Testing Tools and Environment

Our test cases are generated manually, keeping in mind each of the features that we intend to test. These homogeneous inputs are our testing tools. No specific settings are required for conducting the tests, except for a reasonably fast Internet connection.

6.3 Test Case

6.3.1 Test Case 1

1. Purpose
Posting a question ("Is" type) and tag generation

2. Inputs
Is there any harm in having Maggi everyday?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: Maggi everyday, harm
Status: Passed

4. Test Results
Tags: having Maggi everyday, harm

6.3.2 Test Case 2

1. Purpose
Tags generation

2. Inputs
How has he managed Microsoft? Has he been as great a CEO as Bill Gates?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: Bill Gates, Microsoft, managed, CEO
Status: Passed

4. Test Results
Tags: Bill Gates, Microsoft, managed, CEO

6.3.3 Test Case 3

1. Purpose
Tags generation

2. Inputs
Is Satya Nadella on board with Steve Ballmer's "One Microsoft" strategy? Or does he intend to reorganized the company?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: Steve Ballmer's "One Microsoft" strategy, Satya Nadella, reorganized, company, intend
Status: Passed

4. Test Results
Tags: Steve Ballmer's "One Microsoft" strategy, Satya Nadella, reorganize, company, intend

6.3.4 Test Case 4

1. Purpose
Tags generation

2. Inputs
To anyone that had worked with all three Microsoft CEOs, what is the stylistic contrast between Bill Gates, Steve Ballmer, and Satya Nadella?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: Steve Ballmer, Microsoft CEOs, Satya Nadella, Bill Gates, contrast
Status: Passed

4. Test Results
Tags: Steve Ballmer, Microsoft CEOs, Satya Nadella, Bill Gates, contrast

6.3.5 Test Case 5

1. Purpose
Tags generation

2. Inputs
What are some good ways to insult an IT student?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: student, insult, ways, IT
Status: Passed

4. Test Results
Tags: IIT student, gud ways, insult

6.3.6 Test Case 6

1. Purpose
Tags generation

2. Inputs
What should I do/say to my 3-year-old son, that was viciously attacked by another 3-year-old boy, who turned around then clobbered him?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: clobbered, attacked, turned, do/say, boy
Status: Passed

4. Test Results
Tags: clobbered, attacked, turned, do/say, boy

6.3.7 Test Case 7

1. Purpose
Tags generation

2. Inputs
My 1-year-old son has 40% blood and 5.7 hemoglobin. The doctor is saying that someone should donate blood for him. Is that really necessary for a small boy?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: hemoglobin, someone, saying, doctor, donate
Status: Passed

4. Test Results
Tags: hemoglobin, someone, saying, doctor, donate

6.3.8 Test Case 8

1. Purpose
Tags generation

2. Inputs
Which IAS academy gives the best reaching in chennai?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: IAS academy, reaching, chennai, gives
Status: Passed

4. Test Results
Tags: IAS academy, coaching, chennai, gives

6.3.9 Test Case 9

1. Purpose
Tags generation

2. Inputs
How do I do current affairs for UPSC IAS in 3 months?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: current affairs, UPSC IAS, months
Status: Passed

4. Test Results
Tags: UPSC IAS, affairs, months

6.3.10 Test Case 10

1. Purpose
Tags generation

2. Inputs
How do I do current affairs for UPSC IAS in 3 months?

3. Expected Outputs and Pass/Fail Criteria
Question gets posted
Tags: current affairs, UPSC IAS, months
Status: Passed

4. Test Results
Tags: UPSC IAS, affairs, months

6.3.11 Test Case 11

1. Purpose
Sentence similarity in questions

2. Inputs
What do people of India do very easily that others can't?

3. Expected Outputs and Pass/Fail Criteria
Similar question: What do people in India do very easily which can't be done by people of other nations?
Status: Passed

4. Test Results
What do people in India do very easily which can't be done by people of other nations?

6.3.12 Test Case 12

1. Purpose
Sentence similarity in questions

2. Inputs
What are some examples of paid news in Indian Media?

3. Expected Outputs and Pass/Fail Criteria
Similar question: What is the worst piece of news you have come across in the Indian Media?
Status: Passed

4. Test Results
What is the worst piece of news you have come across in the Indian Media?

6.3.13 Test Case 13

1. Purpose
Profanity check

2. Inputs
How would you reply if someone says you are an asshole?

3. Expected Outputs and Pass/Fail Criteria
Alerts user with the abusive word used in the question
Status: Passed

4. Test Results
Alerts user with the abusive word used in the question

6.3.14 Test Case 14

1. Purpose
Profanity check

2. Inputs
What is a bastard?

3. Expected Outputs and Pass/Fail Criteria
Alerts user with the abusive word used in the question
Status: Passed

4. Test Results
Alerts user with the abusive word used in the question

6.3.15 Test Case 15

1. Purpose
Profanity check

2. Inputs
Is Tyrion a bastard?

3. Expected Outputs and Pass/Fail Criteria
Alerts user with the abusive word used in the question
Status: Passed

4. Test Results
Alerts user with the abusive word used in the question

6.3.16 Test Case 16

1. Purpose
Profanity check

2. Inputs
Who is your favorite asshole and why?

3. Expected Outputs and Pass/Fail Criteria
Alerts user with the abusive word used in the question
Status: Passed

4. Test Results
Alerts user with the abusive word used in the question

6.3.17 Test Case 17

1. Purpose
Profanity check

2. Inputs
During the 1945 United Nations Conference on International Organization, Dr. Szeming Sze, a delegate from China, conferred with Norwegian and Brazilian delegates on creating an international health organization under the auspices of the new United Nations. After failing to get a resolution passed on the subject, Alger Hiss, the Secretary General of the conference, recommended using a declaration to establish such an organization. Dr.

and other delegates lobbied and a declaration passed calling for an international conference on health. The use of the word “world”, rather than “international”, emphasized the truly global nature of what the organization was seeking to achieve. The constitution of the World Health Organization was signed by all 51 countries of the United Nations, and by 10 other countries, on 22 July 1946. It thus became the first specialised agency of the United Nations to which every member subscribed. Its constitution formally came into force on the first World Health Day on 7 April 1948, when it was ratified by the 26th member state.

3. Expected Outputs and Pass/Fail Criteria
Alerts user with the abusive word used in the answer.
Status: Passed

4. Test Results
Alerts user with the abusive word used in the answer.
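The profanity checks above can be pictured as a lookup of each word against a list of abusive words. The following is a minimal sketch under that assumption; the blocklist entries are mild placeholders and the function name find_abusive_words is hypothetical, not taken from the project code.

```python
import re

# Placeholder entries standing in for a real list of abusive words.
BLOCKLIST = {"heck", "darn"}


def find_abusive_words(text, blocklist=BLOCKLIST):
    """Return the blocklisted words that occur as whole words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted(set(words) & set(blocklist))


found = find_abusive_words("What the heck is this?")
if found:
    print("Please remove the following word(s):", ", ".join(found))
```

Matching whole words rather than substrings avoids flagging harmless words that merely contain a blocklisted string.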

6.3.18 Test Case 18

1. Purpose
Spell checking

2. Inputs
Which couuld be besst way to propose a girl ?

3. Expected Outputs and Pass/Fail Criteria
Corrects couuld and besst to could and best, respectively.
Status: Passed

4. Test Results
Corrects couuld and besst to could and best, respectively.

6.3.19 Test Case 19

1. Purpose
Spell checking

2. Inputs
What do people of other nations do very easily that Indians can’t?

3. Expected Outputs and Pass/Fail Criteria
What do people of other nations do very easily that Indians cannot ?
Status: Passed

4. Test Results
What do people of other nations do very easily that Indians cannot ?

6.3.20 Test Case 20

1. Purpose
Spell checking

2. Inputs
What real change have you felt in your lives (not via media reports), ever since the Modi government came to power?

3. Expected Outputs and Pass/Fail Criteria
What real change have you felt in your lives (not via media reports), ever since the Modi government came to power?
Status: Passed

4. Test Results
What real change have you felt in your lives (not via media reports), ever since the Modi government came to power?

6.3.21 Test Case 21

1. Purpose
Spell checking

2. Inputs
What has the Modi govarnnment achieved since coming to power?

3. Expected Outputs and Pass/Fail Criteria
What has the Modi government achieved since coming to power?
Status: Passed

4. Test Results
What has the Modi government achieved since coming to power?

6.3.22 Test Case 22

1. Purpose
Spell checking

2. Inputs
What are some interasting intarnship stories at IIT’s ?

3. Expected Outputs and Pass/Fail Criteria
Alerts the user with an Invalid Question message.
Status: Passed

4. Test Results
Alerts the user with an Invalid Question message.

6.3.23 Test Case 23

1. Purpose
Spell checking

2. Inputs
What is the most beeeutiful and amazing sciantific and mathamatics facts you have come across ?

3. Expected Outputs and Pass/Fail Criteria
What is the most beautiful and amazing scientific and mathematics facts you have come across ?
Status: Passed

4. Test Results
What is the most beautiful and amazing scientific and mathematics facts you have come across ?
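The spell-checking behaviour in Test Cases 18 to 23 can be sketched with the Python standard library. This is an illustration only: the project lists PyEnchant as its spell-checking library, whereas the sketch below substitutes difflib so that it runs without extra packages. The vocabulary, the cutoff value and the function names are assumptions.

```python
import difflib

# Hypothetical mini-dictionary; a real checker would load a full word list.
VOCABULARY = ["which", "could", "be", "best", "way", "to", "propose", "a", "girl"]


def correct_word(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the word if it is known, else its closest known word, else None."""
    lowered = word.lower()
    if lowered in vocabulary:
        return word
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def correct_question(question):
    """Correct every word; return None if some word has no plausible correction."""
    corrected = []
    for word in question.split():
        fixed = correct_word(word)
        if fixed is None:
            return None  # the caller could show an "Invalid Question" alert here
        corrected.append(fixed)
    return " ".join(corrected)


print(correct_question("Which couuld be besst way to propose a girl"))
```

Returning None when no close match exists mirrors Test Case 22, where a question that cannot be corrected is rejected rather than posted.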

6.3.24 Test Case 24

1. Purpose
Bookmarking a question

2. Inputs
The user adds a label to a question, e.g. Celebrations.

3. Expected Outputs and Pass/Fail Criteria
The question, along with its answers, gets posted under the bookmark tab.
Status: Passed

4. Test Results
The question, along with its answers, gets posted under the bookmark tab.

6.3.25 Test Case 25

1. Purpose
Bookmarking an answer

2. Inputs
The user adds a label to an answer, e.g. Celebrations.

3. Expected Outputs and Pass/Fail Criteria
That particular answer, along with its question, gets posted under the bookmark tab.
Status: Passed

4. Test Results
That particular answer, along with its question, gets posted under the bookmark tab.

6.3.26 Test Case 26

1. Purpose
Reading a question at leisure

2. Inputs
The user clicks the “Reading a question at leisure” button for a question.

3. Expected Outputs and Pass/Fail Criteria
The question, along with its answers, gets downloaded into a text file.
Status: Passed

4. Test Results
The question, along with its answers, gets downloaded into a text file.

6.3.27 Test Case 27

1. Purpose
Reading an answer at leisure

2. Inputs
The user clicks the “Reading an answer at leisure” button for an answer.

3. Expected Outputs and Pass/Fail Criteria
The answer, along with its question, gets downloaded into a text file.
Status: Passed

4. Test Results
The answer, along with its question, gets downloaded into a text file.

6.3.28 Test Case 28

1. Purpose
Hiding ratings for question

2. Inputs
The user rates a question written by another user.

3. Expected Outputs and Pass/Fail Criteria
The question gets rated and the user can view the average rating and the number of raters.

4. Test Results
The question gets rated and the user can view the average rating and the number of raters.

6.3.29 Test Case 29

1. Purpose
Hiding ratings for question

2. Inputs
The user rates a question written by oneself.

3. Expected Outputs and Pass/Fail Criteria
The user is alerted that one cannot rate a question written by oneself.
Status: Passed

4. Test Results
The user is alerted that one cannot rate a question written by oneself.

6.3.30 Test Case 30

1. Purpose
Hiding ratings for answer

2. Inputs
The user rates an answer written by someone else.

3. Expected Outputs and Pass/Fail Criteria
The answer gets rated and the user can view the average rating and the number of raters.
Status: Passed

4. Test Results
The answer gets rated and the user can view the average rating and the number of raters.

6.3.31 Test Case 31

1. Purpose
Hiding ratings for answer

2. Inputs
The user rates an answer written by oneself.

3. Expected Outputs and Pass/Fail Criteria
The user is alerted that one cannot rate an answer written by oneself.
Status: Passed

4. Test Results
The user is alerted that one cannot rate an answer written by oneself.
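The rating rules checked in Test Cases 28 to 31 can be summarised in a short sketch. The class and method names below are hypothetical and do not come from the project code; the sketch only shows one way to reject self-rating while tracking the average rating and the number of raters.

```python
class Post:
    """A question or answer that users other than its author can rate."""

    def __init__(self, author):
        self.author = author
        self.ratings = {}  # maps each rater to the score that rater gave

    def rate(self, rater, score):
        """Record a rating, refusing ratings from the post's own author."""
        if rater == self.author:
            raise PermissionError("You cannot rate your own post.")
        self.ratings[rater] = score

    def summary(self):
        """Return (average rating, number of raters)."""
        count = len(self.ratings)
        average = sum(self.ratings.values()) / count if count else 0.0
        return average, count


post = Post(author="user_a")
post.rate("user_b", 4)
post.rate("user_c", 5)
print(post.summary())
```

Keeping one score per rater in a dictionary means a user who rates twice simply overwrites the earlier score instead of being counted twice.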

6.3.32 Test Case 32

1. Purpose
Follows the question

2. Inputs
User clicks follow button.

3. Expected Outputs and Pass/Fail Criteria
Notifies the user who followed the question, when an answer is added.
Status: Passed

4. Test Results
Notifies the user who followed the question, when an answer is added.

6.3.33 Test Case 33

1. Purpose
Follows the question

2. Inputs
User clicks follow button.

3. Expected Outputs and Pass/Fail Criteria
Notifies the user who followed the question, when an answer is added.
Status: Passed

4. Test Results
Notifies the user who followed the question, when an answer is added.
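The follow feature behaves like a simple observer arrangement: followers register on a question and are notified when an answer arrives. The sketch below illustrates that idea only; the class, its methods and the notify callback are hypothetical names, not the project's implementation.

```python
class Question:
    """A question that keeps a list of followers to notify about new answers."""

    def __init__(self, text):
        self.text = text
        self.followers = []
        self.answers = []

    def follow(self, user):
        """Register a follower once, ignoring repeated clicks."""
        if user not in self.followers:
            self.followers.append(user)

    def add_answer(self, answer, notify):
        """Store the answer and call notify(user, message) for each follower."""
        self.answers.append(answer)
        for user in self.followers:
            notify(user, f"New answer to: {self.text}")


inbox = []
question = Question("What is tf-idf?")
question.follow("user_a")
question.follow("user_a")
question.add_answer("A term-weighting statistic.", lambda user, message: inbox.append((user, message)))
print(inbox)
```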

6.3.34 Test Case 34

1. Purpose
Suggest Question

2. Inputs
A user suggests an edit to a question.

3. Expected Outputs and Pass/Fail Criteria
The question undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed if the owner accepts the suggestion. Otherwise it is rejected and the suggester is notified.
Status: Passed

4. Test Results
The question undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed if the owner accepts the suggestion. Otherwise it is rejected and the suggester is notified.

6.3.35 Test Case 35

1. Purpose
Suggest Answer

2. Inputs
A user suggests an edit to an answer.

3. Expected Outputs and Pass/Fail Criteria
The answer undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed if the owner accepts the suggestion. Otherwise it is rejected and the suggester is notified.
Status: Passed

4. Test Results
The answer undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed if the owner accepts the suggestion. Otherwise it is rejected and the suggester is notified.

6.3.36 Test Case 36

1. Purpose
Edit Question

2. Inputs
The user edits a question after finding it in the “My activities” tab.

3. Expected Outputs and Pass/Fail Criteria
The question undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed.
Status: Passed

4. Test Results
The question undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed.

6.3.37 Test Case 37

1. Purpose
Edit Answer

2. Inputs
The user edits an answer after finding the question in the “My activities” tab.

3. Expected Outputs and Pass/Fail Criteria
The answer undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed.
Status: Passed

4. Test Results
The answer undergoes grammar check, spell check and profanity check, and gets edited on the newsfeed.

7 Conclusion And Future Work

This chapter concludes that there is a need for an anonymous platform which brings like-minded people together to share their knowledge. It also explains the future scope of this project.

Q&A systems have been extended in recent years to encompass additional domains of knowledge. Systems have been developed to automatically answer temporal and geospatial questions, questions of definition and terminology, biographical questions, multilingual questions, and questions about the content of audio, images, and video. Current Q&A research topics include:
1) Interactivity: clarification of questions or answers
2) Answer reuse or caching
3) Knowledge representation and reasoning
4) Social media analysis with Q&A systems
5) Sentiment analysis
Our system addresses the above challenges by introducing anonymity among users and by using data mining algorithms along with machine learning and pattern matching algorithms.

As Oscar Wilde wrote, “Man is least himself when he talks in his own person. Give him a mask, and he will tell you the truth.” Our experience with the site bears this out. In today’s world, age, gender and fancy degrees earn the up-votes, while everyone else continues to struggle in a profile-oriented world. In the days to come, we, along with the Tationem team, would like to launch this product on the web and refine features such as auto-tagging, which paves the way for future features such as auto-search. These would further enhance the user experience and, at the same time, help bring more like-minded people together, making for a more intellectually engaged world.

Once this application is deployed successfully on the web, students will be able to have an anonymous identity on the site, through which they can share their knowledge with others. The aim is to connect people of similar intellectual inclination so that they can share ideas with like-minded individuals who have the same interests. In doing so, one can keep track of certain issues or topics and learn about them without being judged on one’s past, present or future. As the site is aimed mainly at the student community, it could be an add-on to their professional identity. This could be achieved by linking the site to their social networking site [16].

A Q&A Platform

Question-and-answer websites are those where questions are created, answered, edited and organized by a community of users. Question answering (Q&A) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language.

B Django

Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It’s free and open source.

C NLTK

The Natural Language Toolkit (NLTK) is used for spell checking, part-of-speech (POS) tagging, etc. NLTK has been called a wonderful tool for teaching, and working in, computational linguistics using Python, and an amazing library to play with natural language.

D Numpy

Numpy is the fundamental package for scientific computing with Python. It contains, among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

E PyEnchant

PyEnchant is a spellchecking library for Python, based on the excellent Enchant library. PyEnchant combines all the functionality of the underlying Enchant library with the flexibility of Python and a nice Pythonic object-oriented interface. It also aims to provide some higher-level functionality than is available in the C API.

F tf–idf

tf–idf, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval and text mining. The tf–idf value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general.
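The definition above can be made concrete with a small sketch. It assumes one common formulation, tf = (occurrences of the term in the document) / (document length) and idf = log(N / number of documents containing the term); other weighting variants exist, and the function name tf_idf and the toy corpus are illustrative only.

```python
import math


def tf_idf(term, document, corpus):
    """tf-idf of a term in a document; the corpus is a list of token lists."""
    tf = document.count(term) / len(document)
    containing = sum(1 for doc in corpus if term in doc)
    # A term that appears in no document gets a weight of zero.
    idf = math.log(len(corpus) / containing) if containing else 0.0
    return tf * idf


corpus = [
    "what has the modi government achieved".split(),
    "what are some examples of paid news".split(),
    "what is the worst piece of news".split(),
]

print(tf_idf("what", corpus[0], corpus))  # appears in every document
print(tf_idf("modi", corpus[0], corpus))  # appears in one document only
```

A word such as "what", which occurs in every question, receives zero weight, while a word confined to a single question receives the highest weight, which is why tf–idf is useful for picking out distinguishing terms.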

G Profanity

Use of abusive words.

H SRS

A software requirements specification (SRS) is a description of a software system to be developed, laying out functional and non-functional requirements, and may include a set of use cases that describe interactions the users will have with the software.

I MVC

Model-View-Controller (MVC) divides a given software application into three interconnected parts, so as to separate internal representations of information from the ways that information is presented to or accepted from the user.

J Python

Python is a widely used general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java. The language provides constructs intended to enable clear programs on both a small and large scale.

References

[1] Research on intelligent question-answering system. Green, Claude Cordell and Raphael, Bertram, 1967.

[2] QUALIFIER in TREC-12 Q&A Main Task. Yang, Hui and Cui, Hang and Maslennikov, Mstislav and Qiu, Long and Kan, Min-Yen and Chua, Tat-Seng, 2003.

[3] Robust question answering over the web of linked data. Yahya, Mohamed and Berberich, Klaus and Elbassuoni, Shady and Weikum, Gerhard, 2013.

[4] Why youth (heart) social network sites: The role of networked publics in teenage social life. Boyd, Danah, 2007.

[5] Performance issues and error analysis in question answering system. Moldovan, Dan and Pasca, Marius and Harabagiu, Sanda and Surdeanu, Mihai, 2003.

[6] Anonymity and the Law. Pichardo, Carlos Vilamil, 2004.

[7] Natural language question answering. Hirschman, Lynette and Gaizauskas, Robert, 2001.

[8] Wisdom in the social crowd: an analysis of quora. Wang, Gang and Gill, Konark and Mohanlal, Manish and Zheng, Haitao and Zhao, Ben Y, 2013.

[9] Learning surface text patterns for a question answering system. Ravichandran, Deepak and Hovy, Eduard, 2002.

[10] Knowing what we know: Supporting knowledge creation and sharing in social networks. Cross, Rob and Parker, Andrew and Prusak, Laurence and Borgatti, Stephen P, 2001.

[11] Complete anonymity compromises the accuracy of self-reports. Lelkes, Yphtach and Krosnick, Jon A and Marx, David M and Judd, Charles M and Park, Bernadette, 2012.

[12] QuestionHolic: Hot topic discovery and trend analysis in community question answering systems. Zhang, Zhongfeng and Li, Qiudan, Expert Systems with Applications, 2011.

[13] Natural Language Processing with Python. Bird, Steven and Klein, Ewan and Loper, Edward, O’Reilly Media.

[14] SciPy: Open source scientific tools for Python. Jones, Eric and Oliphant, Travis and Peterson, Pearu and others, 2001–

[15] Simple and Effective Question Processing using Regular Expressions and WordNet. Whidden, Chris, Dalhousie FCS Technical Report, 2005.

[16] Middleware 2010. Mascolo, Cecilia, Springer, 2010.

[17] A survey of techniques for event detection in Twitter. Atefeh, Farzindar and Khreich, Wael, Computational Intelligence, Wiley Online Library.

[18] Narcissism and implicit attention seeking: Evidence from linguistic analyses of social networking and online presentation. DeWall, C Nathan and Buffardi, Laura E and Bonser, Ian and Campbell, W Keith.