PERSONALIZED ONTOLOGY MODEL FOR WEB INFORMATION GATHERING

By

A PROJECT REPORT

Submitted to the Department of Computer Science & Engineering
in the
FACULTY OF ENGINEERING & TECHNOLOGY

In partial fulfillment of the requirements for the award of the degree
of
MASTER OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING

APRIL 2012

BONAFIDE CERTIFICATE

Certified that this project report titled "Personalized Ontology Model
for Web Information Gathering" is the bonafide work of
Mr. _____________, who carried out the research under my supervision.
Certified further, that to the best of my knowledge the work reported
herein does not form part of any other project report or dissertation
on the basis of which a degree or award was conferred on an earlier
occasion on this or any other candidate.

Signature of the Guide          Signature of the H.O.D
Name                            Name

CHAPTER 01

ABSTRACT:
As a model for knowledge description and formalization, ontologies are
widely used to represent user profiles in personalized web information
gathering. However, when representing user profiles, many models have
utilized only knowledge from either a global knowledge base or user
local information. In this paper, a personalized ontology model is
proposed for knowledge representation and reasoning over user profiles.
This model learns ontological user profiles from both a world knowledge
base and user local instance repositories. The ontology model is
evaluated by comparing it against benchmark models in web information
gathering. The results show that this ontology model is successful.

PROJECT PURPOSE:
Web-based information available has increased dramatically. How to
gather useful information from the web has become a challenging issue
for users. Current web information gathering systems attempt to satisfy
user requirements by capturing their information needs. For this
purpose, user profiles are created for user background knowledge
description. User profiles represent the concept models possessed by
users when gathering web information. A concept model is implicitly
possessed by users and is generated from their background knowledge.
While this concept model cannot be proven in laboratories, many web
ontologists have observed it in user behavior.

PROJECT SCOPE:
Ontology mining discovers interesting and on-topic knowledge from the
concepts, semantic relations, and instances in an ontology. In this
section, a 2D ontology mining method is introduced: Specificity and
Exhaustivity.
Specificity (denoted spe) describes a subject's focus on a given topic.
Exhaustivity restricts a subject's semantic space dealing with the
topic. This method aims to investigate the subjects and the strength of
their associations in an ontology.

PRODUCT FEATURES:
The ontology model in this paper provides a solution to emphasizing
global and local knowledge in a single computational model. The
findings in this paper can be applied to the design of web information
gathering systems. The model also has extensive contributions to the
fields of Information Retrieval, Web Intelligence, Recommendation
Systems, and Information Systems.
Ontology techniques, clustering and classification in particular, can
help to establish the reference, as in previously conducted work. The
clustering techniques group the documents into unsupervised clusters
based on the document features. These features, usually represented by
terms, can be extracted from the clusters. They represent the user
background knowledge discovered from the user.

INTRODUCTION:
The amount of web-based information available has increased
dramatically. How to gather useful information from the web has become
a challenging issue for users. Current web information gathering
systems attempt to satisfy user requirements by capturing their
information needs. For this purpose, user profiles are created for user
background knowledge description. User profiles represent the concept
models possessed by users when gathering web information. A concept
model is implicitly possessed by users and is generated from their
background knowledge. While this concept model cannot be proven in
laboratories, many web ontologists have observed it in user behavior.

When users read through a document, they can easily determine whether
or not it is of interest or relevance to them, a judgment that arises
from their implicit concept models. If a user's concept model can be
simulated, then a superior representation of user profiles can be
built. To simulate user concept models, ontologies (a knowledge
description and formalization model) are utilized in personalized web
information gathering. Such ontologies are called ontological user
profiles or personalized ontologies.

To represent user profiles, many researchers have attempted to discover
user background knowledge through global or local analysis. Global
analysis uses existing global knowledge bases for user background
knowledge representation. Commonly used knowledge bases include generic
ontologies (e.g., WordNet), thesauruses (e.g., digital libraries), and
online knowledge bases (e.g., online categorizations and Wikipedia).
The global analysis techniques produce effective performance for user
background knowledge extraction. However, global analysis is limited by
the quality of the knowledge base used. For example, WordNet was
reported as helpful in capturing user interest in some areas but
useless for others. Local analysis investigates user local information
or observes user behavior in user profiles. For example, Li and Zhong
discovered taxonomical patterns from the users' local text documents to
learn ontologies for user profiles. Some groups learned personalized
ontologies adaptively from the user's browsing history. Alternatively,
Sekine and Suzuki analyzed query logs to discover user background
knowledge. In some works, users were provided with a set of documents
and asked for relevance feedback. User background knowledge was then
discovered from this feedback for user profiles. However, because local
analysis techniques rely on data mining or classification techniques
for knowledge discovery, the discovered results occasionally contain
noisy and uncertain information.
As a result, local analysis suffers from ineffectiveness at capturing
formal user knowledge. From this, we can hypothesize that user
background clustering were suggested. These strategies will be
investigated in future work to solve this problem. The investigation
will extend the applicability of the ontology model to the majority of
the existing web documents and increase the contribution and
significance of the present work.

EXISTING SYSTEM:

1. Golden Model: TREC Model
The TREC model was used to demonstrate the interviewing user profiles,
which reflected user concept models perfectly. For each topic, TREC
users were given a set of documents to read and judged each as relevant
or nonrelevant to the topic. The TREC user profiles perfectly reflected
the users' personal interests, as the relevance judgments were provided
by the same people who created the topics, following the fact that only
users know their interests and preferences perfectly.

2. Baseline Model: Category Model
This model demonstrated the noninterviewing user profiles. A user's
interests and preferences are described by a set of weighted subjects
learned from the user's browsing history. These subjects are specified
with the semantic relations of superclass and subclass in an ontology.
When an OBIWAN agent receives the search results for a given topic, it
filters and reranks the results based on their semantic similarity with
the subjects. The similar documents are awarded and reranked higher on
the result list.

3. Baseline Model: Web Model
The web model was the implementation of typical semi-interviewing user
profiles. It acquired user profiles from the web by employing a web
search engine. The feature terms referred to the interesting concepts
of the topic. The noisy terms referred to the paradoxical or ambiguous
concepts.

LIMITATIONS OF EXISTING SYSTEM:
The topic coverage of TREC profiles was limited. The TREC user profiles
had good precision but relatively poor recall performance. Using web
documents for training sets has one severe drawback: web information
has much noise and many uncertainties. As a result, the web user
profiles were satisfactory in terms of recall, but weak in terms of
precision.
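
The precision and recall trade-off described above can be made concrete
with a small calculation. The following Python sketch is illustrative
only: the document identifiers and the two profiles (`narrow_profile`,
`broad_profile`) are hypothetical values invented for the example and
are not results from this report.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a collection of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # retrieved documents that are relevant
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: ids of the documents truly relevant to one topic.
relevant_docs = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

# A narrow profile returns few documents, all of them relevant.
narrow_profile = ["d1", "d2", "d3"]
# A broad, noisy profile returns many documents, half of them irrelevant.
broad_profile = ["d1", "d2", "d3", "d4", "d5", "d6",
                 "x1", "x2", "x3", "x4", "x5", "x6"]

narrow_p, narrow_r = precision_recall(narrow_profile, relevant_docs)
broad_p, broad_r = precision_recall(broad_profile, relevant_docs)
print(f"narrow: precision={narrow_p:.2f} recall={narrow_r:.2f}")
print(f"broad:  precision={broad_p:.2f} recall={broad_r:.2f}")
```

In this toy data the narrow profile has high precision and low recall,
while the broad profile shows the opposite pattern, which is the kind
of contrast reported between the TREC and web user profiles.
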
There was no negative training set generated by this model.

PROPOSED SYSTEM:
The world knowledge and a user's local instance repository (LIR) are
used in the proposed model.
1) World knowledge is commonsense knowledge acquired by people from
   experience and education.
2) An LIR is a user's personal collection of information items.
From a world knowledge base, we construct personalized ontologies by
adopting user feedback on interesting knowledge. A multidimensional
ontology mining method, Specificity and Exhaustivity, is also
introduced in the proposed model for analyzing concepts specified in
ontologies.
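
To make the idea of scoring concepts in an ontology more tangible, the
sketch below assigns a specificity value to each subject in a small
is-a taxonomy. It is an illustration under stated assumptions, not the
method defined in this report: the scoring rule (a leaf subject scores
1.0 and a parent scores a damping factor times the lowest score among
its children), the factor `THETA = 0.9`, and the toy taxonomy are all
hypothetical.

```python
THETA = 0.9  # assumed damping factor; not taken from the report

# Hypothetical toy taxonomy: parent subject -> child subjects (is-a links).
TAXONOMY = {
    "Science": ["Computer Science", "Physics"],
    "Computer Science": ["Information Retrieval", "Databases"],
    "Information Retrieval": ["Web Search"],
}


def specificity(subject, taxonomy, theta=THETA, cache=None):
    """Score how focused a subject is: 1.0 for leaves, lower for ancestors."""
    if cache is None:
        cache = {}
    if subject in cache:
        return cache[subject]
    children = taxonomy.get(subject, [])
    if not children:
        score = 1.0  # a leaf subject is treated as maximally specific
    else:
        # Assumed rule: a parent is less specific than its least specific child.
        score = theta * min(
            specificity(child, taxonomy, theta, cache) for child in children
        )
    cache[subject] = score
    return score


for name in ["Web Search", "Information Retrieval", "Computer Science", "Science"]:
    print(f"{name}: {specificity(name, TAXONOMY):.3f}")
```

Under this assumed rule, more general subjects receive lower scores,
which matches the intuition that a subject higher in the taxonomy is
less focused on a given topic.
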
The users' LIRs are then used to discover background knowledge and to
populate the personalized ontologies.

ADVANTAGES OF PROPOSED SYSTEM:
Compared with the TREC model, the Ontology model had better recall but
relatively weaker precision performance. The Ontology model discovered
user background knowledge from user local instance repositories, rather
than documents read and judged by users. Thus, the Ontology user
profiles were not as precise as the TREC user profiles. The Ontology
profiles had broad topic coverage.
The substantial coverage of possibly related topics was gained from
the use of the W