LIS510 lecture 8 Thomas Krichel 2006-11-08

Mar 27, 2015


Hailey Pereira

LIS510 lecture 8

Thomas Krichel

2006-11-08

introduction

• Reading
  – Rubin chapter 2.
  – Rubin chapter 4 until page 153
  – Library of Congress "Copyright Basics", available at http://www.copyright.gov/circs/circ1.html

• Structure
  – information science
  – information policy

Taylor’s 1966 definition

“Information science is the science that investigates the properties and behavior of information, the forces governing the flow of information, and means of processing information for optimum accessibility and usability. The processes include the origination, dissemination, collection, organization, storage, retrieval and use of information.”

Rubin’s organization

1. Information needs, information seeking, information use and information users.

2. Information storage and retrieval.

3. Defining the nature of information and its use.

4. Bibliometrics and citation analysis

5. Management and administrative issues.

?. new areas

JITA classification

• This is about the only publicly available library and information science classification scheme. See http://eprints.rclis.org/jita.html

• It was done for the E-LIS system of Library and Information Science (LIS) eprints at http://eprints.rclis.org.

area 1: information needs

• Much of this literature waffles vaguely about imprecise concepts.

• “Information seeking in context” is now popular.

• Some broad trends
  – People prefer personal to institutional sources.
  – People seldom see librarians as a source.
  – People make little effort.

Berrypicking (Bates 1989)

• Users sift through information like pickers of berries.
  – The query is constantly shifting.
  – Users may move through a variety of sources.
  – New information may give people new ideas and direction.
  – The value of information is all the bits and pieces gathered during the process.

• This contrasts sharply with information retrieval research.

Kuhlthau (1991)

• Proposes a 6-stage information seeking process
  – Initiation
  – Selection
  – Exploration
  – Formulation
  – Collection
  – Presentation

• IMHO perfectly useless.

area 2: information retrieval

• Information storage is not really much of an issue anymore.

• When I dealt with it, I took storage to include the organization of the information, which is a bit of a stretch.

• Ideally, one needs to know the retrieval needs before designing the organization of the information.

information retrieval

• This covers everything about how the user gets information out of an information system.

• It is different from data retrieval since the retrieved data has to be “relevant” to the user.

• It is very difficult to say what “relevance” is, objectively.

typical research

• This usually involves looking at a set of documents that have been classified.

• Then we can pick the computer algorithms that best separate the documents satisfying the user need from those that don’t.

• Usually this stuff is heavily mathematical/computational.

• I have been applying work from that area.

information retrieval performance

• How was it for you?

• The traditional measures are
  – precision = number of relevant documents retrieved divided by total number of retrieved documents
  – recall = number of relevant documents retrieved divided by total number of relevant documents.

• They only evaluate a search!

• I have done some work in that area.
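
The two measures can be computed directly from their definitions. The following is a minimal sketch; the function name and the document identifiers are made up for illustration and do not come from Rubin or from any particular retrieval system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: document ids the system returned
    relevant:  document ids judged relevant to the user need
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 documents returned, 5 relevant ones in the collection.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5", "d6", "d7"])
print(p, r)
```

Note that computing recall requires knowing all relevant documents in the collection, which is why such evaluations are usually run on test collections that have been judged in advance.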

information retrieval models

• They give a formal account of the retrieval process.

• There are three basic flavors
  – Boolean information retrieval
  – Vector information retrieval
  – Probabilistic information retrieval

• All are mathematical models.

• I would also add web information retrieval as a new type.
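
To make the difference between the first two flavors concrete, here is a small sketch under simplifying assumptions: a toy three-document collection invented for this example, whitespace tokenization, and raw term counts as vector weights. Real systems use more refined weighting. The probabilistic model is not shown.

```python
import math
from collections import Counter

# Hypothetical toy collection, invented for illustration.
DOCS = {
    "d1": "library catalog search",
    "d2": "web search engine ranking",
    "d3": "library web site",
}

def boolean_and(query, docs):
    """Boolean model: a document matches only if it contains every query term."""
    terms = query.split()
    return sorted(d for d, text in docs.items()
                  if all(t in text.split() for t in terms))

def cosine(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def vector_rank(query, docs):
    """Vector model: rank all documents by similarity to the query."""
    q = Counter(query.split())
    scores = {d: cosine(q, Counter(text.split())) for d, text in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))

print(boolean_and("library web", DOCS))   # documents that match exactly
print(vector_rank("library web", DOCS))   # all documents, best match first
```

The Boolean model returns an unordered set of exact matches, while the vector model orders every document, including partial matches.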

web information retrieval

• This has become big business now because finding a user’s need is a way to connect them with advertising.

• One reason Google became such a success is that it found a way to make quality web sites appear at the top.

• Basically, a quality web site is one that has many links to it from other quality sites.
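
The circular definition above (quality comes from links from quality sites) can be resolved by iteration. The sketch below is a simplified PageRank-style calculation, not Google's actual ranking; the function name, the damping value of 0.85 and the four-page web are assumptions made for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages so that links from high-scoring pages count for more.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    score = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal shares to the pages it links to.
                share = damping * score[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page without outgoing links spreads its score evenly.
                for t in pages:
                    new[t] += damping * score[page] / n
        score = new
    return score


# Hypothetical four-page web in which every other page links to "a".
web = {"a": ["b"], "b": ["a"], "c": ["a"], "d": ["a", "b"]}
scores = link_rank(web)
print(sorted(scores, key=scores.get, reverse=True))
```

Page "b" ends up close behind "a" although it has fewer incoming links, because one of its links comes from the top-scoring page.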

information storage

• It can mean the preparation of information before searching
  – which fields are searchable?
  – can there be a variety of means to rank searches?
  – is there use of a controlled vocabulary?

• It is difficult to draw general conclusions, except that advanced search features are not much used.

human-computer interface

• Tries to understand how users work with computer systems.

• The idea is to build “user-friendly” systems.

• But don’t leave that to a “computer designer” as suggested by Rubin.

• Note that information systems go way beyond computers.

• This area is usually connected to psychology.

natural language processing

• Rubin classifies this as a part of computer-human interface.

• Natural language processing is still in its infancy.

• Speech recognition is the best developed part.

• Others are working on connecting computers to the brain.

artificial intelligence

• This has been around for a while.

• The field has developed a number of theoretical tools.

• Some of them are being used in practice now. Things like RDF, the Resource Description Framework, are based on artificial intelligence theory. It is a tool to aggregate knowledge from web resources.

• There is still no practical application that demonstrates the use of AI on the web.

Area 3: defining information & its value

• There is debate on the nature of
  – data (Thomas: things that can be processed in the information system)
  – knowledge (Thomas: stuff that is in people’s heads)
  – information (something between data and knowledge). Rubin says it is meaning given to data.

• Rubin also talks about wisdom as “knowledge applied for the benefit of humanity”.

scientific view of information

• Usually information is modeled as something that reduces uncertainty.

• People have a rough idea about something, say tomorrow’s temperature.

• The information is the fact that this something will actually take a precise value, when we know what the temperature is or when we have less uncertainty.

• There is an approach to measuring information through the concept of entropy.

• Thomas used to teach such stuff.
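
As a rough illustration of the entropy approach, uncertainty can be measured in bits with Shannon's formula, H = −Σ p·log2(p), and the information gained is the drop in entropy. The temperature categories and probabilities below are invented for this sketch.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical beliefs about tomorrow's temperature: cold, mild, warm, hot.
before = [0.25, 0.25, 0.25, 0.25]   # rough idea: all four equally likely
after = [0.0, 0.5, 0.5, 0.0]        # a forecast rules out the two extremes

gained = entropy(before) - entropy(after)
print(entropy(before), entropy(after), gained)
```

Once the temperature is known for certain, the distribution puts probability 1 on a single value and the entropy falls to zero.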

value of information

• Using a probabilistic model, economists can set out an approach that puts a value on information.

• But their definition is useless for practical purposes.

• Much of the work then involves some cost/benefit analysis. In such analysis one can reach almost any result one wants.
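
For reference, the probabilistic approach can be sketched as the expected value of perfect information: the expected payoff when the decision-maker learns the state of the world before acting, minus the best expected payoff without that knowledge. The umbrella decision and all numbers below are invented; the difficulty in practice is that real payoffs and probabilities are rarely known.

```python
def value_of_information(prob, payoff):
    """Expected value of perfect information for a one-off decision.

    prob[s] is the probability of state s;
    payoff[a][s] is the payoff of action a in state s.
    """
    # Without information: commit to the single action with the best expected payoff.
    without = max(sum(prob[s] * row[s] for s in prob) for row in payoff.values())
    # With information: learn the state first, then pick the best action for it.
    with_info = sum(prob[s] * max(row[s] for row in payoff.values()) for s in prob)
    return with_info - without


# Hypothetical decision: take an umbrella or not.
prob = {"rain": 0.5, "dry": 0.5}
payoff = {
    "umbrella": {"rain": 4, "dry": 2},
    "no umbrella": {"rain": 0, "dry": 8},
}
print(value_of_information(prob, payoff))
```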

elements of value-added in libraries

• access to resources

• accuracy (for example of bibliographic data)

• browsing (like in library stacks)

• currency (things are up-to-date)

• flexibility (through human interaction)

• formatting (laying out the collection, signs)

• interfacing (probably close to flexibility)

• ordering (buy access to things)

• access to means to get to resources

area 4: bibliometrics

• This is the application of quantitative methods to the study of information resources.

• Mainly concerned with the structure of the resources. The typical example is citation analysis.

• Quantitative studies of use fall more to the first area of interest.

• An expanding area is the use of network analysis.

bibliometric laws

• Zipf’s law relates to the usage of terms in text.

• Lotka’s law relates to the number of papers written by authors.

• Bradford’s law relates to the distribution of articles in a field across a number of periodicals.
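
The slide does not state the laws themselves. In their usual textbook forms, Zipf's law says that the frequency of a term is roughly inversely proportional to its rank in the frequency table, and Lotka's law says that the number of authors with n papers is roughly 1/n² of the number with one paper. The sketch below builds the rank-frequency table needed to check Zipf's law on a text and computes Lotka's expected author counts; the function names and the default exponent of 2 are assumptions for illustration.

```python
from collections import Counter

def rank_frequency(text):
    """Return (rank, term, count) triples, most frequent term first."""
    counts = Counter(text.lower().split())
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, term, count)
            for rank, (term, count) in enumerate(ordered, start=1)]

def lotka_expected(single_paper_authors, n, exponent=2.0):
    """Expected number of authors with n papers under Lotka's inverse-power form."""
    return single_paper_authors / n ** exponent


for rank, term, count in rank_frequency("the cat and the dog and the bird"):
    print(rank, term, count)

# If 100 authors wrote one paper each, about 25 would be expected to have written two.
print(lotka_expected(100, 2))
```

Under Zipf's law, rank times count stays roughly constant down the table. This only shows up on texts of realistic length, not on a toy sentence like the one above.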

citation analysis

• This is the heart of bibliometrics.

• Two important concepts
  – bibliographic coupling means two documents share some reference
  – co-citation means two documents are cited by the same documents

• Citation analysis is also important for scientific activity evaluation.
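
Both concepts can be computed from the same citation data, read in opposite directions. The sketch below uses a made-up set of three citing documents; the names and data are purely illustrative.

```python
from itertools import combinations

# Hypothetical citation data: each document maps to the set of documents it cites.
CITES = {
    "A": {"X", "Y"},
    "B": {"Y", "Z"},
    "C": {"X", "Y"},
}

def coupling_strength(cites, a, b):
    """Bibliographic coupling: number of references that a and b share."""
    return len(cites.get(a, set()) & cites.get(b, set()))

def cocitation_counts(cites):
    """Co-citation: for each pair of cited documents, count the documents citing both."""
    counts = {}
    for refs in cites.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


print(coupling_strength(CITES, "A", "B"))
print(cocitation_counts(CITES))
```

Bibliographic coupling is fixed once the two documents are published, whereas co-citation counts can keep growing as new documents cite the pair.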

area 5: management & admin

• This is an expanding area in libraries.

• Rather than collecting physical books, libraries have to negotiate on-line access.

• Area covers all of information policy. Example problems are
  – copyright
  – censorship

• Measuring performance is part of user studies.

service evaluation

• This is an important area in libraries.

• Libraries need to demonstrate value in order to fight for their continued existence.

• They also need to examine usage of the systems that the vendors propose.

area 6: information architecture

• art and science of organizing information and its interfaces so that seekers find what they want quickly

• mainly used with respect to large web sites. It looks at the contents rather than technical factors or the look-and-feel.

• A related idea is usability

area 7: knowledge management

• this comes from the business environment

• it is a management fad that has overstayed its welcome.

information policy

• This is any
  – law
  – regulation
  – practice

• that affects the
  – creation
  – organization
  – acquisition
  – dissemination
  – evaluation

• of information.

private value of information

• Information has value for its creators.

• Some creators require that you pay them in order to use that information.

• US law encourages the private creation of information and knowledge.

• There is a market for information.

limiting access to information

• Commercial information providers are concerned about unpaid access.

• Other companies that do not primarily produce information may also be concerned about leaking of data such as
  – R&D data
  – financial data
  – product information

protecting privacy

• This is a major issue in society in general.

• Financial, health and other data are protected by law.

• In libraries, the concern has been the protection of circulation records.

• The Patriot Act has created fairly loose conditions under which law enforcement agencies can access circulation records.

freedom of information

• This refers to the idea that government information, other than
  – military secrets
  – law enforcement records
  – private medical and financial information

• should be made available to the citizens so they can scrutinize government.

• This should be an important task for public libraries with respect to local government.

private dissemination of public information

• There has been a tendency to move the distribution of government documents away from the Government Printing Office and to private companies.

• This has caused some concern, as the companies charge the taxpayers for something that has already been produced at the taxpayers’ expense.

• Such companies can copyright the information that the government could not.

example: legal information

• In principle, the text of laws and legal information should be free.

• Some of the older material is still in print form and cannot be circulated without some cost.

• Recent data could all be made available on the web.

• The judicial system does not organize the uploading and organization of data well.

national security

• Protecting the cyber infrastructure has been made a priority. But there is not much that the government can do directly to protect private installations.

• There have been some restrictions on the distribution of formerly available government data that is considered to provide information on potential terror targets.

the library awareness program

• The FBI started to monitor the use of libraries by foreign individuals in the 70s.

• Libraries were believed to be places where foreign agents could get critical intelligence to gain a technological edge.

• Since the material held in libraries is published and sold commercially, it seems quite silly to monitor its use.

the Patriot Act

• some knee-jerk legislation to try to protect the USA
  – increases power to monitor citizens’ behavior
  – authorizes roving wiretaps
  – intelligence authorities can require any “business record”
    • the person concerned in the record must not be informed
    • there is a gagging order against disseminating information about the request
    • no independent judicial review of the request

control of expressions

• There has been a long history of censorship in all countries at all times.
  – IRA/Sinn Fein
  – sexually explicit material

• Sometimes the pressure on artistic works is indirect, e.g. through funding channels.

• Libraries generally fight censorship, but they have to keep their target communities in mind.

http://openlib.org/home/krichel

Please shut down the computers now.

Thank you for your attention!