
The Modeling of Scholarly Information Services based on Citation Context

Byungkyu Kim1, Beom-Jong You2 and Jihoon Kang3

1S&T Information Center, Korea Institute of Science and Technology Information, ASI/KR/KS015/DAEJEON, South Korea
[email protected]

2S&T Information Center, Korea Institute of Science and Technology Information, ASI/KR/KS015/DAEJEON, South Korea
[email protected]

3Dept. of Computer Science and Engineering, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, South Korea
[email protected]

January 27, 2018

International Journal of Pure and Applied Mathematics, Volume 118 No. 19 2018, 2497-2510
ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu
Special Issue

Abstract

Background/Objectives: The citation index service, a representative scholarly information service, is gradually showing its limitations. To overcome this, researchers abroad are pioneering new research areas that use the citation context, the sentences around an "in-text citation". These studies extract citation contexts, classify their functions, and use them for new forms of citation analysis and citation summarization, for searching and visualizing citations, and for evaluating the quality of research results. In this paper, we study a citation context service model based on related precedent cases.

Methods/Statistical analysis: In order to develop a model of scholarly information services based on citation context, we analyzed related prior research and service cases abroad. KSCD (Korea Science Citation Database) was used as the source of the citation data and citation context data presented through this service model. The citation context data was manually extracted for this study, and citation position and citation function schemes were newly defined to classify each citation context.

Findings: We designed a new scholarly information service model based on citation context and implemented a prototype system. In addition, detailed service functions using citation context were derived. To present the service model effectively, we applied the citation context data constructed in this study and verified its applicability.

Improvements/Applications: A scholarly information service based on citation context can greatly improve the convenience of information searching for users. Based on the citation context-based service model presented in this paper, it is necessary to construct a citation context database for the entire KSCD, to develop and deploy the actual system, and to provide the service. This work requires research on automatically extracting citation contexts and classifying citation functions.

Key Words: Citation Context, Citation Sentence, Citation Function, Service Model, KSCI Plus

1 Introduction

The citation index service, a representative scholarly information service, is gradually showing its limitations. To overcome this, researchers abroad are pioneering new research areas that use the citation context, the sentences around an "in-text citation". These studies extract citation contexts, classify their functions, and use them for new forms of citation analysis and citation summarization, for searching and visualizing citations, and for evaluating the quality of research results. In this paper, we analyze overseas research and service cases that use citation context and present a new scholarly information service model based on citation context. KSCD (Korea Science Citation Database), the database underlying KSCI (Korea Science Citation Index service), was used as the base data for the service model. Papers with relatively high citation counts were selected for the trial service [1].

2 Related Research

Research on citation context is characterized by its use of additional information, such as the text of the citation context and the location of the citation, on top of the simple citation relation. The citation context is the sentence that describes, summarizes, or paraphrases the content of a cited paper around an in-text citation or citation marker such as "(Fakeman 2014)" or "[12]" when an author cites another author's paper. Various techniques such as information retrieval, automatic classification, automatic summarization, and text mining are used to process textual citation contexts. Research on citation context is proceeding in the following directions: 1) classifying the functions of citation contexts, 2) new citation analysis and citation summarization based on them, 3) searching and visualizing citation contexts, and 4) qualitative evaluation of research results. Our previous study [2] surveyed the overseas literature on citation context and found that the topic has been studied in the USA, Europe, and Asian countries. Major research institutions include the University of Michigan, the University of Maryland, and Cambridge; the co-authorship among major authors is shown in Figure 1.


Figure 1. Co-Author Analysis of Citation Context Research
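The in-text citation markers described above, such as "(Fakeman 2014)" or "[12]", can be located with simple pattern matching. The following is a minimal sketch, not the procedure used in this study (where citation contexts were extracted by hand). The regular expression, the naive sentence splitter, and the function names are assumptions made for illustration, and only explicit citations that carry a marker are detected.

```python
import re

# Assumed marker patterns: numeric markers like [12] or [3, 4],
# and author-year markers like (Fakeman 2014).
CITATION_MARKER = re.compile(
    r"\[\d+(?:\s*[,-]\s*\d+)*\]"
    r"|\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\)"
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_citation_contexts(text):
    """Return each sentence that contains at least one citation marker."""
    contexts = []
    for sentence in split_sentences(text):
        markers = CITATION_MARKER.findall(sentence)
        if markers:
            contexts.append({"sentence": sentence, "markers": markers})
    return contexts


# Hypothetical input text, written only for this example.
sample = (
    "Citation analysis is widely used. "
    "Prior work classified citation functions (Fakeman 2014). "
    "We follow the scheme in [12]. "
    "The results are discussed later."
)
contexts = extract_citation_contexts(sample)
for context in contexts:
    print(context["markers"], "->", context["sentence"])
```

Implicit citation contexts, which refer to a cited work without a marker, would need an additional step such as examining the sentences next to an explicit one.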

The citation context-related research field can be subdivided into eight sub-fields as follows. Citation statement identification / citation statement classification scheme / citation context automatic classification / citation context sentiment analysis / citation context search / citation context summary / citation context visualization / citation context.

The identification of citation statements distinguishes between explicit and implicit citation contexts depending on the presence or absence of citation markers [3,4]. The citation context classification scheme is used to extract all the components of the original text, and ParsCit is used as a tool for this extraction [5]. Automatic classification of citation contexts has been studied for a long time, and Eugene Garfield also defined a detailed classification scheme according to the purpose of the citation [6]. In computational linguistics, citations are mainly classified according to their citation function. The citation function categorizes citing sentences by the author's intention or motivation for citing the document. In other words, it expresses the concrete reason why the author made the citation in the citing statement. Because automatic classification is performed by a machine rather than by human annotators, the classification scheme should be easy for a machine to apply. A representative citation function classification is that of Simone Teufel, which defines 12 categories such as Weak and Neut and has been the basis of follow-up studies [7-10]. For citation function classification definitions, this paper refers to Ritchie's paper, which summarizes the functional classifications of citation contexts studied until 2009 [11]. Machine learning can be used to classify citation context functions automatically; techniques include SVM, kNN, and Naive Bayes.

The sentiment classification of a citation context refers to classifying the citation statement by "polarity", such as "positive" or "negative" comments. Citation context search adds citation context data to a citation index-based information service. The citation context summary outlines the cited paper based on its citation contexts; a sophisticated summary at the level of an author abstract requires a high level of information processing, and related research is actively underway [11-15]. Citation context visualization is based on the citation relations between papers, and networks can be visualized at various levels by adding information such as citation function classification, sentiment classification, and citation location. It is also possible to track how the classification of a paper changes over time in comparison with other papers [16]. Representative services that apply citation context to an information service are "Microsoft Academic" and "CiteSeerx"; they show the citation contexts together with the references [17,18].
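As an illustration of the machine learning approach mentioned above, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not the method of any cited study: the class name, the tokenizer, and the four training sentences with their labels are invented for this example, and a real classifier would need far more annotated data and richer features.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCitationClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no evidence and are skipped.
        words = [w for w in tokenize(sentence) if w in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            score = math.log(self.class_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: citing sentences with hand-assigned functions.
train_sentences = [
    "We use the method of [1]",
    "We use the toolkit from [2]",
    "This approach fails on long sentences [3]",
    "The method of [4] fails to scale",
]
train_labels = ["Use", "Use", "Weak", "Weak"]

classifier = NaiveBayesCitationClassifier().fit(train_sentences, train_labels)
print(classifier.predict("We use the parser from [9]"))
print(classifier.predict("The baseline fails on noisy text"))
```

With such a small training set the predictions only reflect word overlap; the sketch is meant to show the shape of the technique, not its accuracy.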

3 Materials and Methods

In order to develop a citation context service model, we analyzed related prior research and service cases overseas. The cases were divided into 8 areas and each area was analyzed. For the citation data and citation context data presented through the citation context service model, KSCD of the Korea Institute of Science and Technology Information (KISTI) was used. KSCD contains about 600,000 papers and 12 million references from 821 key academic journals in Korea. It also includes citation index information based on the JCR citation index standard. KSCD covers publications from 2002 to 2016 by publication year and is updated annually by KISTI. Citation context data was manually extracted for this study, and each citation context was categorized using newly defined citation location and citation function categories. The data to be processed was selected from the academic papers with a large number of citations in KSCD. Most of the highly cited items were overseas academic papers. From the original texts of the Korean journal articles citing these papers, we located the passages citing them and extracted the citation contexts together with the positions of the citations.

4 Results

This section presents the construction of the base data for the citation context-based service and the implementation of the service model. The service is named KSCI Plus, meaning an extended service of KSCI.

4.1 Construction of Citation Context Database

For the citation context-based service, 16 foreign articles and 16 domestic papers were selected from KSCD; 3,250 citations were manually extracted from the papers citing them, and the data quality was verified. For the extracted citations, all citation functions were categorized by applying the citation function classification categories. The citation location classification consists of "Abstract", "Introduction", "Background", "Conclusion" and "Other". The classification table for categorizing the citation functions is shown in Table 1 below.


Table 1. The General Characteristics (N=177)

The polarity classification according to the above classification system is as follows.

[Weak → Negative], [Description → Neutral], [Comparison, Use, Basis → Positive]
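The mapping above can be written as a simple lookup table. The following is a minimal sketch; the dictionary and function names are assumptions introduced for illustration, while the five citation functions and three polarities are those listed above.

```python
# Citation function -> polarity, as listed in the mapping above.
POLARITY_BY_FUNCTION = {
    "Weak": "Negative",
    "Description": "Neutral",
    "Comparison": "Positive",
    "Use": "Positive",
    "Basis": "Positive",
}


def polarity_of(citation_function):
    """Look up the polarity of a citation function; unknown names raise KeyError."""
    return POLARITY_BY_FUNCTION[citation_function]


for function in ("Weak", "Description", "Use"):
    print(function, "->", polarity_of(function))
```

Raising an error on an unknown function name, rather than defaulting to "Neutral", keeps labeling mistakes visible.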

4.2 Design of Service Model

The processing flow from DB construction to service is shown in Figure 2. In this study, citation function classification and citation summary extraction were actually done by hand, but the processing steps are defined on the assumption that they are processed automatically.

Figure 2. Data Processing Overview for KSCI Plus Service


KSCI Plus uses the scholarly content (meta information, references, citation index information) built in KSCD as its underlying database. First, we search for articles cited at high frequency in KSCD and extract citation contexts from the original texts of the articles citing them. The citation context is extracted in sentence or paragraph units, and additional information such as the position of the citation is added. Next, the citation function of each citation context is classified using a classification scheme, and keywords are also extracted. The extracted citation keywords are assembled into a co-occurrence network and provided as a visualization map. Citation summaries are then generated from the different citation contexts of the same article. The citation location and function classification information can be used for various purposes when providing citation contexts in the KSCI Plus service. The database ER diagram for storing and managing basic information (journal paper metadata, reference data, citation index) and citation context information (citation sentences, citation positions, citation functions) is shown in Figure 3.

Figure 3. ER diagram for KSCI Plus
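The keyword co-occurrence network mentioned in the processing flow can be sketched as a count of keyword pairs that appear in the same citation context. The following is a minimal illustration under that assumption; the function name and the sample keyword lists are hypothetical, and the paper does not specify how KSCI Plus weights or filters the edges.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(keyword_lists):
    """Count how often each unordered keyword pair shares a citation context."""
    edges = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical (a, b) form.
        unique = sorted(set(keywords))
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical keywords extracted from three citation contexts of one article.
keyword_lists = [
    ["svm", "classification", "citation"],
    ["citation", "classification"],
    ["citation", "summary"],
]
edges = cooccurrence_edges(keyword_lists)
for (a, b), weight in edges.most_common():
    print(a, "-", b, weight)
```

The resulting weighted edge list can be passed to any graph drawing tool to produce a network map like the one the service displays.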

The overall system structure for the KSCI Plus service is shown in Figure 4, which shows the base database, search and management functions, and detailed service menus.

Figure 4. Overall system structure

4.3 Service Interface of KSCI Plus

The main service pages of KSCI Plus are shown below. Figure 5 shows the information search results. Unlike a general information service, the citations of each paper are analyzed by citation function, and the citation contexts of the cited papers are provided through the service page. For example, for the first article in the search results, the number of citations is 209, the most cited category is "Use", and the largest category is "Comparison". This can be easily seen in the pie chart on the left, which reflects the color and size of each citation function class.


Figure 5. KSCI Plus Service (Search Result View)
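The per-function breakdown shown next to each search result can be derived by counting the citation function labels of an article's citation contexts. The following is a minimal sketch; the function name and the sample labels are hypothetical and do not reproduce the figures reported for any article in the service.

```python
from collections import Counter


def function_distribution(citation_functions):
    """Return {function: (count, share)} ordered from most to least frequent."""
    counts = Counter(citation_functions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: (n, n / total) for name, n in counts.most_common()}


# Hypothetical labels for the citation contexts of one article.
labels = ["Use", "Use", "Use", "Comparison"]
distribution = function_distribution(labels)
for name, (count, share) in distribution.items():
    print(f"{name}: {count} ({share:.0%})")
```

The shares are the values a pie chart would use for the slice sizes, with one color per citation function.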

Figure 6 shows the details page of a paper. On the left, the trend in the number of citations and the citation analysis are shown. The citation analysis results are presented as graphs by citation function classification, polarity classification, and citation location classification. A tag cloud built from the keywords extracted from the citation contexts is also provided and can be used to search by keyword. The top right of the screen shows a network map that visualizes the co-occurrence relationships between the keywords in the citation contexts, so that the user can easily see which subjects and content the article covers. The bottom right visualizes the co-citation relationships of the article as a network map, giving a comprehensive view of the other articles that are co-cited with it.


Figure 6. KSCI Plus Service (View Details)

5 Conclusion

In this paper, in order to develop a model of scholarly information services based on citation context, we analyzed related prior research and service cases abroad. KSCD (Korea Science Citation Database) was used as the source of the citation data and citation context data presented through this service model. The citation context data was manually extracted for this study, and citation position and citation function schemes were newly defined to classify each citation context. We designed a new scholarly information service model based on citation context and implemented a prototype system. In addition, detailed service functions using citation context were derived. To present the service model effectively, we applied the citation context data constructed in this study and verified its applicability. Information accessibility for users is expected to improve greatly if the scholarly information model based on citation context developed in this study is provided as a real service. Therefore, based on the service model presented in this paper, it is necessary to construct a citation context database for the entire KSCD, to develop and deploy the actual system, and to provide the service. There is a need for research to automatically extract the citation context and classify the citation function.

References

[1] Korea Science Citation Index service: http://ksci.kisti.re.kr. Date accessed: 11/30/2017.

[2] Kim B., Choi S., Kang M. Analysis of Research Status Based on Citation Context. International Journal of Contents, 2005, 11 (2), pp. 63-68.

[3] Kang I., Kim B. Characteristics of Citation Scopes: A Preliminary Study to Detect Citing Sentences. Computer Applications for Database, Education, and Ubiquitous Computing. Springer Berlin Heidelberg, 2012, pp. 80-85.

[4] Kang I. A Rule-based Approach to Identifying Citation Text from Korean Academic Literature. Journal of the Korean Society for Information Management, 2012, 29(4), pp. 43-60.

[5] Reference String Parsing Package. LREC, 2008.

[6] Garfield E. Can citation indexing be automated?. Statistical Association Methods for Mechanized Documentation, 1965, 239, pp. 189-192.

[7] Teufel S., Siddharthan A., Tidhar D. Automatic classification of citation function. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2006.

[8] Teufel S., Siddharthan A., Tidhar D. An annotation scheme for citation function. Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue. Association for Computational Linguistics, 2009.

[9] Teufel S., Siddharthan A., Batchelor C. Towards discipline-independent argumentative zoning: Evidence from chemistry and computational linguistics. Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2009.


[10] Athar A., Teufel S. Context-enhanced citation sentiment detection. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, 2012.

[11] Ritchie A. Citation context analysis for information retrieval. No. UCAM-CL-TR-744. University of Cambridge, Computer Laboratory, 2009.

[12] Abu-Jbara A., Radev D. Coherent Citation-Based Summarization of Scientific Papers. ACL, 2011.

[13] Qazvinian V., Radev D. Scientific paper summarization using citation summary networks. Proceedings of the 22nd International Conference on Computational Linguistics, Association for Computational Linguistics, 2008.

[14] Qazvinian V., Radev D. Identifying non-explicit citing sentences for citation-based summarization. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2010.

[15] Qazvinian V. Generating Extractive Summaries of Scientific Paradigms. J. Artif. Intell. Res., 2013, 46, pp. 165-201.

[16] Radev D., Abu-Jbara A. Rediscovering ACL Discoveries Through the Lens of ACL Anthology Network Citing Sentences. Association for Computational Linguistics, 2012.

[17] Microsoft Academic: https://academic.microsoft.com. Date accessed: 11/30/2017.

[18] CiteSeerx: http://citeseerx.ist.psu.edu/index. Date accessed: 11/30/2017.
