Knowledge Organization: Library Tools and Taxonomies for the Web
Jan Herd, jher@loc.gov
Business Reference Services, Science, Technology & Business Division, The Library of Congress

Dec 18, 2015


Ira Scott
Page 1: Knowledge Organization: Library Tools and Taxonomies for the Web Jan Herd jher@loc.gov Business Reference Services Science, Technology & Business Division.

Knowledge Organization: Library Tools and Taxonomies for the Web

Jan Herd jher@loc.gov

Business Reference Services

Science, Technology & Business Division

The Library of Congress

Page 3

Web is too big to organize?

One billion pages

1.5 million pages added daily

Selection of sites by collection development specialists/reference librarians

Page 4

Librarians work in corporate settings

Yahoo.com (directory)

Northern Light.com (search engine)

Amazon.com (e-book seller)

Microsoft.com

Page 5

OCLC Library Corporation Cooperatively Catalogs:

45 Million Works

350,000 Web sites and growing

Page 6

Traditional Library Tools on the Web

Medical Subject Headings 1996

Web Dewey 2000

Classification Web 2001 (LCSH & LCC)

Page 7

Importance of controlled vocabulary as metadata

American Library Association

Subject Analysis Committee (SAC)

Subcommittee on Metadata and Subject Analysis recommendations

http://www.ala.org/alcts/organization/ccs/metarept2.html

Page 8

Controlled Vocabularies: Why We Need Them

Used “behind” search engines

Standard in online databases

New adherents (i.e., Web Content Managers utilizing Taxonomies)

They Work!

Page 9

Sherry Vellucci, Associate Professor, St. John’s Univ., during the Conference on Bibliographic Control for the New Millennium:

“authority control is not only wonderful, but critical. Controlled vocabulary mediating tools should cover Subjects, Genres, Gazetteers, Names and Titles, etc.”

Page 10

Metathesauri/Subject Correlations

Unified Medical Language System (UMLS) maps over 60 medical and health care thesauri in one

http://www.nlm.nih.gov/pubs/factsheets/umlsmeta.html

Classification Web

The Library of Congress subject headings and LC classification correlations

http://classweb.loc.gov
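
The idea behind a metathesaurus can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary names (`thesaurus_a`, `thesaurus_b`), the concept identifiers, and the term pairs are all invented for the example, and nothing here reflects how UMLS or Classification Web is actually built or queried.

```python
from collections import defaultdict

# Hypothetical rows: (source vocabulary, preferred term, shared concept id).
# The identifiers and vocabulary names are invented for this sketch.
SOURCE_TERMS = [
    ("thesaurus_a", "Myocardial Infarction", "C001"),
    ("thesaurus_b", "Heart attack", "C001"),
    ("thesaurus_a", "Neoplasms", "C002"),
    ("thesaurus_b", "Cancer", "C002"),
]


def build_metathesaurus(rows):
    """Group the terms of every source vocabulary under one concept id."""
    concepts = defaultdict(dict)
    for source, term, concept_id in rows:
        concepts[concept_id][source] = term
    return dict(concepts)


def translate(term, target_source, concepts):
    """Return the target vocabulary's term for the same concept, or None."""
    for terms_by_source in concepts.values():
        if term in terms_by_source.values():
            return terms_by_source.get(target_source)
    return None


metathesaurus = build_metathesaurus(SOURCE_TERMS)
print(translate("Heart attack", "thesaurus_a", metathesaurus))
```

Keying every term to a shared concept identifier is what lets a search phrased in one vocabulary retrieve material indexed with another.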

Page 22

Mapping: Standard information exchange systems

Dublin Core to MARC

http://lcweb.loc.gov/marc/dccross.html

MARC to Dublin Core

http://www.loc.gov/marc/marc2dc.html

XMLMARC Crosswalk

http://lcweb.loc.gov/marc/marcsgml.html (Must download files)

MARC to XML to MARC Converter

http://www.logos.com/marc/default.asp
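
A crosswalk of this kind is essentially a lookup table from one scheme's elements to another's fields. The sketch below assumes a small, simplified subset of an unqualified Dublin Core to MARC 21 mapping; the tag and subfield choices should be checked against the Library of Congress crosswalk before any real use, and the function name `dc_to_marc` is hypothetical.

```python
# Simplified, assumed subset of an unqualified Dublin Core -> MARC 21 crosswalk.
# Verify tags and subfields against the Library of Congress crosswalk before use.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
}


def dc_to_marc(record):
    """Convert a dict of Dublin Core elements into (tag, subfield, value) triples."""
    fields = []
    for element, values in record.items():
        if element not in DC_TO_MARC:
            continue  # elements outside this sketch's subset are skipped
        tag, subfield = DC_TO_MARC[element]
        for value in values:
            fields.append((tag, subfield, value))
    return sorted(fields)


example = {
    "title": ["Knowledge Organization: Library Tools and Taxonomies for the Web"],
    "creator": ["Herd, Jan"],
    "subject": ["Taxonomies", "Controlled vocabularies"],
}
for tag, subfield, value in dc_to_marc(example):
    print(f"{tag} ${subfield} {value}")
```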

Page 23

Mapping: Specialized information exchange systems

Standard Industrial Classification (SIC codes)

to

North American Industry Classification System (NAICS codes)

Page 25

SIC Code Example

Major group 73 = Business services

737 = Computer programming, data processing, and other computer related services

7372 = Prepackaged software

Equivalent NAICS codes are:

Sector 51 = Information

511 = Publishing industries

5112 = Software publishers (with cross ref. to Sector 42 for reselling packaged software)
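
The example above can be expressed as a short Python sketch. The codes and titles come from the slide; the one-entry `SIC_TO_NAICS` table and the helper names are assumptions made for illustration, not the official SIC-to-NAICS concordance.

```python
# Codes and titles taken from the slide's example. The mapping table holds a
# single illustrative entry and is not the official concordance.
SIC_TITLES = {
    "73": "Business services",
    "737": "Computer programming, data processing, and other computer related services",
    "7372": "Prepackaged software",
}
NAICS_TITLES = {
    "51": "Information",
    "511": "Publishing industries",
    "5112": "Software publishers",
}
SIC_TO_NAICS = {"7372": "5112"}


def hierarchy(code, titles):
    """List each titled ancestor of a code, ending with the code itself."""
    return [
        (code[:n], titles[code[:n]])
        for n in range(2, len(code) + 1)
        if code[:n] in titles
    ]


def map_sic(sic_code):
    """Return the NAICS hierarchy for a SIC code, or [] when no mapping is known."""
    naics = SIC_TO_NAICS.get(sic_code)
    return hierarchy(naics, NAICS_TITLES) if naics else []


print(hierarchy("7372", SIC_TITLES))
print(map_sic("7372"))
```

Because both schemes are hierarchical, each shorter prefix of a code names a broader category, which is why the lookup walks the code from two digits upward.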

Page 26

Using old and new tools for knowledge organization on the Web

Water into Wine

Page 27

What is a Taxonomy?

A high-level information search device constructed to provide a means of understanding, navigating, and gaining access to intellectual capital.

Page 28

History of Taxonomies

Aristotle, 384 - 322 B.C.

Kallimachos, 305 - 240 B.C.

Library of Alexandria

Carl Linnaeus, 1707-1778

Page 29

“Classification” is used much more frequently than “Taxonomy”, in all fields of study.

Page 30

Numerous formal taxonomies are maintained by government and commercial enterprises

Page 31

Taxonomies are used in:

Customized search engines

Interfaces in web portals

Page 34

Service Codes

CODE  TITLE
A  Research and Development
B  Special Studies and Analysis - Not R&D
C  Architect and Engineering Services - Construction
D  Information Technology Services, including Telecommunication Services
E  Purchase of Structures and Facilities
F  Natural Resources and Conservation Services
G  Social Services
H  Quality Control, Testing and Inspection Services
J  Maintenance, Repair, and Rebuilding of Equipment
K  Modification of Equipment
L  Technical Representative Services
M  Operation of Government-Owned Facilities
N  Installation of Equipment
P  Salvage Services
Q  Medical Services
R  Professional, Administrative and Management Support Services
S  Utilities and Housekeeping Services
T  Photographic, Mapping, Printing, and Publication Services
U  Education and Training Services
V  Transportation, Travel and Relocation Services
W  Lease or Rental of Equipment
X  Lease or Rental of Facilities
Y  Construction of Structures and Facilities
Z  Maintenance, Repair or Alteration of Real Property

Page 37

How do we define taxonomies in a wired world?

Taxonomy: A classification of elements within a domain

Domain: a sphere of knowledge, influence, or activity

Classification: the operation of grouping elements and establishing relationships between them (or the product of that operation)

Relationships: a defined linkage between two elements

Element: an object or concept

Crandall, Mike. “Taxonomies for the Real World: The Business Imperative to Simply Content Access.” TFPL Taxonomies for Business Conference, London, Oct. 23, 2000.
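
These definitions translate almost directly into a data structure. The sketch below is one possible reading, not Crandall's own model: the class names follow the definitions above, while the `broader-than` relationship label and the sample elements are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Element:
    """An object or concept."""
    name: str


@dataclass(frozen=True)
class Relationship:
    """A defined linkage between two elements."""
    source: Element
    kind: str  # e.g. "broader-than"; the set of labels is an assumption
    target: Element


@dataclass
class Taxonomy:
    """A classification of elements within a domain."""
    domain: str
    relationships: list = field(default_factory=list)

    def relate(self, source, kind, target):
        """Record one linkage between two elements."""
        self.relationships.append(Relationship(source, kind, target))

    def narrower(self, element):
        """Elements directly below `element` in the hierarchy."""
        return [
            r.target
            for r in self.relationships
            if r.source == element and r.kind == "broader-than"
        ]


info = Element("Information")
publishing = Element("Publishing industries")
software = Element("Software publishers")

industry = Taxonomy(domain="Industry classification")
industry.relate(info, "broader-than", publishing)
industry.relate(publishing, "broader-than", software)
print([e.name for e in industry.narrower(info)])
```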

Page 38

What are Taxonomies Good For?

Taxonomies are applied to:

Items (aka resources): individual pieces of information (documents, people...

By the use of:

Metadata (aka properties, attributes): information describing types of data

Which may or may not use values from a:

Vocabulary: selection of terms, classified or sorted

To create:

Content: an item and its associated metadata

Crandall, Mike. “Taxonomies for the Real World: The Business Imperative to Simply Content Access.” TFPL Taxonomies for Business Conference, London, Oct. 23, 2000.
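
The chain from item to metadata to vocabulary to content can be sketched as follows. The vocabulary terms, the property names (`subject`, `author`), and the `tag` helper are hypothetical; the point is only that some metadata values are drawn from a controlled vocabulary while others are free text.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary for a "subject" property.
SUBJECT_VOCABULARY = {"Taxonomies", "Metadata", "Classification"}


@dataclass
class Content:
    """An item plus its associated metadata."""
    item: str  # the resource itself, e.g. a document title
    metadata: dict = field(default_factory=dict)


def tag(item, subject, author=None):
    """Attach metadata: 'subject' must come from the vocabulary, 'author' is free text."""
    if subject not in SUBJECT_VOCABULARY:
        raise ValueError(f"{subject!r} is not in the controlled vocabulary")
    metadata = {"subject": subject}
    if author is not None:
        metadata["author"] = author
    return Content(item=item, metadata=metadata)


doc = tag("Annual report 2000", "Metadata", author="J. Smith")
print(doc)
```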

Page 39

Challenges

Information management across divisions of your agency

Agency global intranets/Internet portals

Global or national document management including technical documentation

Incorporating taxonomy technology into agency technology + info. policies

Cost of building a taxonomy

Moving a taxonomy from overhead to being a core part of your agency’s information management.

Page 40

More Challenges

Certification of the taxonomy by an authoritative body.

Finding common ground across multiple taxonomies or schemas with similar terms and different meanings.

Ensuring the ongoing integrity of the taxonomy with constant maintenance.

Acceptance by developers of tagging tools.

Integrating with a legacy system and external content.

Page 41

The core expertise required for constructing a taxonomy is:

Systems Analyst who understands specifications for creating taxonomies

Domain expert/Subject expert in the subject of the taxonomy

Computational linguist, AI engineer

Linguist and/or Lexicographer

Database/Application Development Expert

Administrative Support

Review Support

Page 42

Example of a custom taxonomy marked up in XBRL:

<?xml version="1.0" encoding="utf-8"?>
<schema xmlns:xbrl="http://www.xbrl.org/core/2000-07-31"
        targetNamespace="http://www.xbrl.org/us/gaap/ci/2000-07-31">
  <import namespace="http://www.xbrl.org/core/2000-07-31/"
          schemaLocation="http://www.xbrl.org/core/2000-07-31/xbrl-meta-2000-07-31.xsd"/>
  <element name="propertyPlantAndEquipmentGrossNote.purchasedSoftwareForInternalUse" type="monetary">
    <annotation>
      <documentation>this is software that...</documentation>
      <appinfo>
        <xbrl:rollup to="ci:propertyPlantAndEquipmentNetNote.propertyPlantAndEquipmentGrossNote" weight="1" order="7.5" />
        <xbrl:label xml:lang="en">Purchased software for internal use</xbrl:label>
        <xbrl:reference name="GPSI" number="73" chapter="11" paragraph="b" subparagraph="i" />
      </appinfo>
    </annotation>
  </element>
</schema>
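
A taxonomy marked up this way can be read with ordinary XML tooling. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed, well-formed copy of the slide's example; the helper name `read_elements` and the choice of which attributes to extract are assumptions, and no XBRL-specific validation is attempted.

```python
import xml.etree.ElementTree as ET

XBRL_NS = "http://www.xbrl.org/core/2000-07-31"

# Trimmed, well-formed copy of the slide's example, used only for this sketch.
TAXONOMY_XML = """<?xml version="1.0" encoding="utf-8"?>
<schema xmlns:xbrl="http://www.xbrl.org/core/2000-07-31"
        targetNamespace="http://www.xbrl.org/us/gaap/ci/2000-07-31">
  <element name="propertyPlantAndEquipmentGrossNote.purchasedSoftwareForInternalUse"
           type="monetary">
    <annotation>
      <appinfo>
        <xbrl:rollup to="ci:propertyPlantAndEquipmentNetNote.propertyPlantAndEquipmentGrossNote"
                     weight="1" order="7.5" />
        <xbrl:label xml:lang="en">Purchased software for internal use</xbrl:label>
      </appinfo>
    </annotation>
  </element>
</schema>
"""


def read_elements(xml_text):
    """Yield (name, type, label, rollup target) for each taxonomy element."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    for element in root.iter("element"):
        label = element.find(f".//{{{XBRL_NS}}}label")
        rollup = element.find(f".//{{{XBRL_NS}}}rollup")
        yield (
            element.get("name"),
            element.get("type"),
            label.text if label is not None else None,
            rollup.get("to") if rollup is not None else None,
        )


for row in read_elements(TAXONOMY_XML):
    print(row)
```

The `rollup` target is what places the element in the hierarchy: it names the parent item into which this value is summed.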

Page 44

Recommendations:

Actively seek out existing taxonomies in the target discipline or subject area. If your needs are met in part by an existing taxonomy, use it and build on it.

Look at the intended purpose of the taxonomy and select appropriate software tools.

Consider scalability of the taxonomy. Look at the big picture and see how the taxonomy will be able to hook into others.

Consider utilizing numerical taxonomy as a schema in the metadata in order to merge documents in foreign languages.

Accommodate new standards whenever possible.

Document “Best Practices” while creating the taxonomy and review them regularly.

Maintain and update the taxonomy continually.
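
The recommendation about numerical taxonomies can be illustrated with a small sketch. It assumes that documents in different languages carry the same numeric code in their metadata; the code reuses 5112 from the NAICS example earlier in the slides, while the document titles and the French label are illustrative only.

```python
from collections import defaultdict

# Language-specific labels for one numeric code. The French label is
# illustrative, not quoted from an official source.
LABELS = {
    "5112": {"en": "Software publishers", "fr": "Éditeurs de logiciels"},
}

# Hypothetical documents tagged with the same code in different languages.
documents = [
    {"title": "Software market overview", "lang": "en", "code": "5112"},
    {"title": "Aperçu du marché du logiciel", "lang": "fr", "code": "5112"},
]


def merge_by_code(docs):
    """Group document titles by numeric code, regardless of language."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["code"]].append(doc["title"])
    return dict(groups)


merged = merge_by_code(documents)
print(merged)
print(LABELS["5112"]["en"])
```

Because the code rather than the label is the key, a search interface can show each user the label in their own language while still retrieving every document filed under that code.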

Page 45

Your Agency Taxonomy

Existing Taxonomy in your Field

Related Taxonomy of other agency in same field

Related Taxonomy of other agency hooked to one above

Electronic Document in XML

Core Schema (Describes how document is to be created)

Meta Model (Describes how taxonomies are created)

Page 46

Efficient Web information retrieval systems in the form of search engines or Web portals require continued support and improvement of:

Page 47

Web based classification and numerical taxonomic tools to use in

Web based cataloging tools such as CORC, which provides metadata based on

Taxonomies such as controlled vocabularies/thesauri which will be hooked together using

Metathesauri and standard information exchange systems such as MARC-XML

Page 48

And this is the house that Jack built…

With a wine cellar...

Page 49

Knowledge Organization: Library Tools and Taxonomies for the Web

Jan Herd jher@loc.gov

Business Reference Services

Science, Technology & Business Division

The Library of Congress