Perspectives of information science in the digital age
Tefko Saracevic, PhD
Rutgers University
USA
http://www.scils.rutgers.edu/~tefko
© Tefko Saracevic, Rutgers University 2
Information science:
“the science dealing with the efficient
collection, storage, and retrieval of
information”
Webster
Organization
1. Big picture – problems, solutions, social place
2. Underlying stuff – theories, phenomena
3. Structure – what is inside stuff
4. Systems stuff – information retrieval, relevance
5. People stuff – users, use, seeking, context
6. Alliances, competition – the OUCH stuff
7. Digital libraries – whose are they anyhow?
8. Conclusions – Will we have a field stuff?
1. The big picture
Problems addressed
Bit of history: Vannevar Bush (1945):
  Problem: “... the massive task of making more accessible a bewildering store of knowledge.”
  still with us & growing
Basic problem of information science: Information explosion
today: PLUS Communication explosion
… solution
Bush: “Memex ... association of ideas ... duplicate mental processes artificially.”
Technological fix to problem
Still with us: technological determinant
  tail that wags the dog
Problems & solutions: SOCIAL CONTEXT
Professional practice AND scientific inquiry related to:
  Effective communication of knowledge records - ‘literature’ - among humans in the context of social, organizational, & individual need for and use of information.
“modeling the world of publications with a practical goal of being able to deliver their content to inquirers [users] on demand.” White & McCain
Taking advantage of modern information technology
Elaboration
Knowledge records = texts, sounds, images, multimedia ... literature in given domains
  content-bearing structures
  symbol manipulations are content neutral - infrastructural to inf. sc.
Communication = human-computer-literature interface
  study of inf. science is the interface between people & literatures
Inf. need, seeking, and use = raison d'être
Effectiveness = relevance, utility
General characteristics - leitmotifs
Interdisciplinarity - relations with a number of fields
Technological imperative - driving force, as in many modern fields
Information society - social context and role in evolution - shared with many fields
2. Underlying stuff
What is information?
Intuitively well understood, but formally????
Several viewpoints, models
Shannon: source-channel-destination
  grapes into wine
Cognitive: changes in cognitive structures
  water into wine
Social: context is the king
  whatever into wine to get drunk
K(S) + ΔI = K(S + ΔS) (Brookes)
Information [structured information] when operating on a knowledge structure produces an effect whereby the knowledge structure is changed
Potential information added (Ingwersen)
Actually, it states the problem – “unoperational” in information systems
  involves mental events only
  constructivists rejected it
Information in inf science: Three senses (from narrowest to broadest)
Inf. in terms of decision involving little or no cognitive processing
  signals, bits, straightforward data - e.g. inf. theory, economics
Inf. involving cognitive processing & understanding
  understanding, matching texts
Inf. also as related to situation, task, problem-at-hand: USERS, USE
For information science (incl. information retrieval): third, broadest interpretation
3. Structure
Specialties (White & McCain)
In descending order of author co-citation (120 authors, 24 years):
  experimental retrieval
  citation analysis
  practical retrieval
  bibliometrics
  library systems, automation
  user studies and theory
  scientific communication
  OPAC’s
  general - other disciplines
  indexing theory
  communication theory
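A minimal sketch of how author co-citation can be counted, for readers unfamiliar with the measure behind the ranking above: two authors are co-cited each time both appear in the reference list of the same paper. The function name `cocitation_counts` and the toy reference lists are hypothetical and only illustrate the counting step; they are not White & McCain's data or procedure.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of authors is cited together.

    reference_lists: iterable of lists, one list of cited authors per paper.
    Returns a Counter keyed by alphabetically ordered (author, author) pairs.
    """
    counts = Counter()
    for cited in reference_lists:
        # De-duplicate within a paper and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Toy data (hypothetical): each inner list is one paper's cited authors.
papers = [
    ["Salton", "Belkin", "Ingwersen"],
    ["Salton", "Belkin"],
    ["Ingwersen", "Kuhlthau"],
]

counts = cocitation_counts(papers)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts are treated as close in the intellectual structure of the field; clustering such counts is one way to arrive at specialties like those listed.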
Structure or oeuvres
Two large sub-disciplines:
  “Domain” cluster: analytical study of literatures, their structure, communication, social context, uses
  Retrieval cluster: human-literature interface: IR systems (largest); interaction; library systems, OPACs, user studies
  within each: sub-clusters, eras
    e.g. Salton & post-Salton era
Largely not connected
  some authors in both, migrating
  BUT: lacking integrating works, authors, texts - big payout
Paradigm split in retrieval cluster
Split from early 80’s to date
System-centered
  algorithms, TREC
  continue traditional IR model
Human-(user)-centered
  cognitive, situational, user studies
  interaction models, some started in TREC
Calls for user-centered approaches & evaluation
But: most support for system work
  in the digital age support is for digital
Human vs. system
Human (user) side:
  often highly critical, even one-sided
  mantra of implications for design
  but does not deliver concretely
System side:
  mostly ignores user side & studies
  ‘tell us what to do & we will’
Issue NOT H or S approach
  even less H vs. S
  but how can H AND S work together
  major challenge for the future
4. Systems stuff
Information Retrieval
“IR: ... intellectual aspects of description of inf., ... search, ... & systems, machines...”
Calvin Mooers, 1951
How to provide users with useful information effectively?
For that objective:
1. How to organize information intellectually?
2. How to specify the search & interaction intellectually?
3. What techniques & systems to use effectively?
Streams in IR Res. & Dev.
1. Information science: Services, users, use; Human-computer interaction; Cognitive aspects
2. Computer science: Algorithms, techniques; Systems aspects
3. Information industry: Products, services, Web; Market aspects
Problems:
  ...relative isolation
  ...inadequate cooperation, transfer
IR successfully effected:
Emergence & growth of the INFORMATION INDUSTRY
Evolution of IS as a PROFESSION & SCIENCE
Many APPLICATIONS in many fields
  including on the Web – search engines
Improvements in HUMAN - COMPUTER INTERACTION
Evolution of INTERDISCIPLINARITY
IR has a long, proud history
Broadening of IR
OPACs (Online Public Access Catalogs)
Natural language processing
Summarization
Metadata representations
Text “understanding”
Hypertext, hypermedia
Multimedia - images, sounds ...
  image IR, music IR
Many human-computer interactions
Web search engines
5. People stuff
Quite a few areas
Professional services
  in organization – moving toward knowledge management, competitive intelligence
  in industry – vendors, aggregators, Internet,
Research
  user & use studies
  interaction studies
  broadening to information seeking studies, social context, collaboration
  relevance studies
  social informatics
User & use studies
Oldest area
  covers many topics, methods, orientations
  many studies related to IR
    e.g. searching, multitasking, browsing, navigation
Branching into Web use studies
  quantitative & qualitative studies
  emergence of webmetrics
Interaction
Traditional IR model concentrates on matching, not on user side & interaction
Several interaction models suggested
  Ingwersen’s cognitive, Belkin’s episode, Saracevic’s stratified model
  hard to get experiments & confirmation
Considered key to providing
  basis for better design
  understanding of use of systems
Web interactions a major new area
Relevance
Effectiveness in IR = relevance
  thus, relevance became a key notion
  and a key headache
A number of studies & reviews on:
  Nature: Framework, base?
  Manifestations: Contexts? Typologies?
  Behavior: Variables? Observations?
  Effects: Use? Evaluation?
Manifestations (types) of relevance
System or algorithmic relevance
  relation between query & objects (‘texts’) retrieved or failed to retrieve
Topical or subject relevance
Cognitive relevance or pertinence
Situational relevance or utility
  relation between the situation, task or problem at hand & texts
Motivational or affective relevance
  intent, goals, & motivation of user & “texts”
Manifestations interact dynamically
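To make the narrowest manifestation concrete, here is a small sketch of system (algorithmic) relevance as a pure query-to-text match. It assumes a deliberately crude measure, the share of query terms found in a text; the names `overlap_score`, `tokenize`, and the sample texts are hypothetical, and real IR systems use far richer weighting. Note that nothing in the score knows about the user's task, situation, or intent, which is exactly what the other manifestations add.

```python
def tokenize(text):
    """Lowercase a text and split it on whitespace into a set of terms."""
    return set(text.lower().split())


def overlap_score(query, text):
    """Fraction of query terms that also occur in the text (0.0 to 1.0)."""
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokenize(text)) / len(query_terms)


# Hypothetical mini-collection of 'texts'.
texts = {
    "t1": "information retrieval systems",
    "t2": "library automation history",
}

query = "information retrieval"

# Rank texts by the algorithmic score only, highest first.
ranked = sorted(texts, key=lambda name: overlap_score(query, texts[name]), reverse=True)
print(ranked)
```

A text ranked first here may still be of no use for the problem at hand, so a high algorithmic score does not imply situational relevance.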
Information seeking
Concentrates on broader context
  not only IR or interaction, people as they move in life & work
Number of models provided
  e.g. Kuhlthau’s stages, Vakkari’s problem situation, task complexity
Includes studies of ‘life in the round,’ making sense, information encountering, work life, information discovery
Based on concept of social construction of information
6. Alliances, competition
Relations
With a number of fields...
Strongest:
1. Librarianship
2. Computer science
Librarianship
[Library is]...“contributing to the total communication system in society. Created to maximize the utility of graphic record for the benefits of society... it achieves that goal by working with the individual and through the individual it reaches society.”
J.H.Shera, 1972
Common grounds
IS & librarianship share:
Social role in information society
Concern with effective utilization of graphic & other types of records
Research problems related to a number of topics
Transfer to & from information retrieval
Differences
IS & librarianship differ in:
Selection & definition of many problems addressed
Theoretical questions & framework
Nature & degree of experimentation
Tools and approaches used
Nature & strength of interdisciplinary relations
One field or two?
Point of many debates
Suggest: TWO fields in strong interdisciplinary relations
Not a matter of “better” or “worse” - matters little
  common arguments between many fields
Differences matter in:
  problem selection & definition
  agenda, paradigms
  theory, methodology
  practical solutions, systems
Best example: IR & library automation
Which?
Librarianship. Information science
Library and information science
Libraryandinformationscience
Information science
Information sciences
Information like in the “Information School”
Computer science
“systematic study of algorithmic processes that describe and transfer information ... The fundamental question in computing is: ‘What can be (efficiently) automated’.”
Denning et al., 1989
IS & computer science
CS primarily about algorithms
IS primarily about information and its users and use
Not in competition, but complementary
Growing number of computer scientists active in IS – particularly in IR and digital libraries
Concentrating on
  advanced IR algorithms & techniques
  digital library infrastructure & various domains
  human computer interaction
Human-computer interaction (HCI)
“Human computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them.”
ACM SIGCHI, 1993
Another interdisciplinary area
  computer sc., cognitive sc., ergonomics, ...
Interaction and IS
Two streams:
  computer-human interaction
  human-computer interaction
Modern IR is interactive
  BUT: difference between retrieval engine & retrieval interface
Many studies on:
  machine aspects of interaction
  human variables in interaction
Problem: little feedback between
Interaction very hard to evaluate - few methods yet
7. Digital libraries
LARGE & growing area
“Hot” area in R&D
  a number of large grants & projects in the US, European Union, & other countries
  but “DIGITAL” big & “libraries” small
“Hot” area in practice
  building digital collections, hybrid libraries,
  many projects throughout the world
Technical problems
Substantial - larger & more complex than anticipated:
  representing, storing & retrieving of library objects
    particularly if originally designed to be printed & then digitized
  operationally managing large collections - issues of scale
  dealing with diverse & distributed collections
    interoperability
  assuring preservation & persistence
  incorporating rights management
Digital Library Initiatives in the US (DLI)
Research consortia under National Science Foundation
  DLI 1: 1994-98, 3 agencies, $24M, six large projects
  DLI 2: 1999-2006, 8 agencies, $60+M, 77 large & small projects in various categories
‘digital library’ not defined to cover many topics & stretch ideas
  not constrained by practice
European Union
DELOS Network of Excellence on Digital Libraries
  many projects throughout European Union
    heavily technological
  many meetings, workshops
  resembles DLIs in the US
  well funded, long range
Research issues
understanding objects in DL
  representing in many formats
  non-textual materials
metadata, cataloging, indexing
conversion, digitization
organizing large collections
managing collections, scaling
preservation, archiving
interoperability, standardization
accessing, using,
DL projects in practice
Heavily oriented toward institutions
Assoc of Res Libraries (ARL) database:
  427 DL projects in 13 countries
  374 in the US
    51% in universities; 24% fed govmt; 9% hist societies; 6% regional …
  84% are explicitly retrospective; 16% technological
  1 listed from DLI (Illinois)
  no connection with DLI projects
Agendas
Most DL research agenda is set from top down
  from funding agencies to projects
  imprint of the computer science community's interest & vision
Most DL practice agendas are set from bottom up
  from institutions, incl. many libraries
  imprint of institutional missions, interests & vision
    providing access to specialized materials and collections from an institution(s) that are otherwise not accessible
    covering in an integral way a domain with a range of sources
Connection?
DL research & DL practice presently are conducted mostly independently of each other, minimally informing each other, & having slight or no connection
Parallel universes with little connection & interaction
8. Conclusions
IS contributions
IS effected handling of inf. in society
Developed an organized body of knowledge & professional competencies
Applied interdisciplinarity
IR reached a mature stage
IR penetrated many fields & human activities
Stressed HUMAN in human-computer interaction
Challenges
Adjust to the growing & changing social & organizational role of inf. & related inf. infrastructure
Play a positive role in globalization of information
Respond to technological imperative in human terms
Respond to changes from inf. to communication explosion - bringing own experiences to resolutions, particularly to the INTERNET
Join competition with quality
Join DIGITAL with LIBRARIES
Juncture
IS is at a critical juncture in its evolution
Many fields, groups ... moving into information
  big competition
  entrance of powerful players
  fight for stakes
To be a major player IS needs to progress in its:
  research & development
  professional competencies
  educational efforts
  interdisciplinary relations
Reexamination necessary