Crushing, Blending, and Stretching Data
Data Warehousing and Mining Data from Library and University Information Systems for Assessment of Library Operations: A Case Study in Progress
Ecole des sciences de l'information, Rabat, Morocco, Monday, April 13, 2009
Ray Schwartz, Systems Specialist Librarian
Cheng Library, William Paterson University, Wayne, New Jersey, USA
schwartzr2 @ wpunj.edu
Outline
• Why Assessment and Why Now?
• What is Data Mining and Data Warehousing and Why Do We Do It?
• Our Library and University
• Groups and Services
• Steps
• Reporting
Have We Always Assessed?
• Anecdotally: Yes.
• Systematically: Not usually.
– Large scale assessment of manual systems (such as serials check-in, card catalogs, and circulation files) is not practical.
– Smaller scale and directed assessment is possible.
What changed since the days of manual systems?
• For many institutions in the West, the Integrated Library System (ILS) has been in use for over 20 years.
• Larger scale assessment is now possible with the electronic systems.
– Counts of circulation transactions
– Fund codes for purchases of library materials
• Reports from vendor services
– Bibliographic utilities
– Subscription agents
– Book jobbers
What is different now?
• New services have come into existence.
– Inside libraries
    • Full-Text Databases
    • Link Resolvers
– Outside of libraries
    • Google
    • Amazon
What is Data Mining and Data Warehousing?
• Extracting data from legacy systems and other resources;
• cleaning, scrubbing and preparing data for decision support;
• maintaining data in appropriate data stores;
• accessing and analysing data using a variety of end user tools;
• and mining data for significant relationships.
• Chaffey, D., Mayer, R., Johnston, K., & Ellis-Chadwick, F. (2002). Internet Marketing: Strategy, Implementation and Practice (2nd ed.). Financial Times/ Prentice Hall.
• The primary purpose of these efforts is to provide easy access to specifically prepared data that can be used with decision support applications such as management reports, queries, decision support systems, executive information systems and data mining.
• Chaffey, D., Mayer, R., Johnston, K., & Ellis-Chadwick, F. (2002). Internet Marketing: Strategy, Implementation and Practice (2nd ed.). Financial Times/ Prentice Hall.
Of course there are many ways to measure
– Scott Nicholson’s Measurement Model
Matrix of Perspective (rows) by Topic (columns: Library System, Use)

Perspective: Internal (Library System)
– Topic "Library System": Procedures and Standards
    • Staff survey and interviews
    • Audits of collections, systems, or staff
– Topic "Use": Recorded interactions with interface & materials
    • Bibliomining
    • Transaction/Web Log Analysis
    • Observation of User Behavior

Perspective: External (User)
– Topic "Library System": Usability
    • Effectiveness of the system for the staff and institution.
– Topic "Use": Knowledge states and User citations to materials
    • How useful is the library system?
    • Focus groups, User Citation tracking
Nicholson, Scott (2004). A Conceptual framework for the holistic measurement and cumulative evaluation of library services. Journal of Documentation 60(2) p.164-181
• 19 librarians and 26 library staff
• 350,000 volumes
• 18,000 audiovisual items
• 22,000 print and electronic periodicals
• 100 general and subject specific databases
Our Systems since 2005
• Voyager ILS
• Online Periodical Database (OPD)
• Clio ILL Software
• EZProxy Server
• Banner – University ERP
• University Networked Drive K:
• University Email Server
• University Web Server
PATRON STATUS           BOOK CIRC  MEDIA CIRC  EQUIP CIRC  TOTAL CIRC  MEMBERS  BORROWERS  % BORROWING  CIRC/MEMBER  CIRC/BORROWER
UNDERGRADUATE STUDENTS      2,715         250         698       3,663      238        186          78%        15.39          19.69
GRADUATE STUDENTS             419          13          76         508       14         13          93%        36.29          39.08
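The derived columns in this table can be recomputed from the raw counts. The sketch below is one way to do that in Python. It assumes that TOTAL CIRC is the sum of the book, media, and equipment counts, that % BORROWING is borrowers divided by members, and that the two ratios divide total circulation by members and by borrowers. Those readings reproduce the figures on the slide, but the class and field names are illustrative and not taken from the original reports.

```python
from dataclasses import dataclass


@dataclass
class PatronGroup:
    """Raw circulation counts for one patron status (field names are illustrative)."""
    status: str
    book_circ: int
    media_circ: int
    equip_circ: int
    members: int    # patrons registered in the group
    borrowers: int  # assumed: members with at least one circulation

    @property
    def total_circ(self) -> int:
        return self.book_circ + self.media_circ + self.equip_circ

    @property
    def pct_borrowing(self) -> float:
        return 100 * self.borrowers / self.members

    @property
    def circ_per_member(self) -> float:
        return self.total_circ / self.members

    @property
    def circ_per_borrower(self) -> float:
        return self.total_circ / self.borrowers


# Raw counts copied from the two rows of the table.
groups = [
    PatronGroup("UNDERGRADUATE STUDENTS", 2715, 250, 698, 238, 186),
    PatronGroup("GRADUATE STUDENTS", 419, 13, 76, 14, 13),
]

for g in groups:
    print(
        f"{g.status}: total={g.total_circ}, "
        f"borrowing={g.pct_borrowing:.0f}%, "
        f"circ/member={g.circ_per_member:.2f}, "
        f"circ/borrower={g.circ_per_borrower:.2f}"
    )
```

Keeping only the raw counts as stored data and deriving the ratios on demand avoids the totals and percentages drifting out of step when a count is corrected.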
Problems with Configuration of Services
• Little to no linkage of data
• Need to search multiple services to get complete picture of serial holdings
• Multiple user IDs for authentication
Retirement of the OPD
• Serials holdings data was extracted from the OPD and added to Voyager catalog
• From Voyager catalog, serials holdings data is extracted and added to Serials Solutions A to Z list
• Authentication of ILL form is routed through the EZProxy server
• A web bug is placed in the microform request page to record each submission in the Voyager web server logfile.
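A web bug of this kind leaves one line in the web server logfile each time the page that embeds it is loaded, so counting those lines gives a tally of requests. The following is a minimal sketch, assuming the logfile uses the Common Log Format. The image path `/images/microform_bug.gif` and the sample log lines are invented for illustration and are not taken from the Voyager server.

```python
import re
from collections import Counter

# Hypothetical path of the small image (the "web bug") embedded in the form page.
WEB_BUG_PATH = "/images/microform_bug.gif"

# Common Log Format: host ident user [timestamp] "METHOD path PROTOCOL" status bytes
CLF_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def count_web_bug_hits(lines):
    """Return a Counter mapping log date (dd/Mon/yyyy) to web bug requests."""
    hits = Counter()
    for line in lines:
        m = CLF_LINE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        if path == WEB_BUG_PATH and m.group("status") in ("200", "304"):
            day = m.group("ts").split(":", 1)[0]  # text before the first colon
            hits[day] += 1
    return hits


# Invented sample lines standing in for the real logfile.
sample_log = [
    '10.0.0.5 - - [03/Mar/2008:10:15:02 -0500] "GET /images/microform_bug.gif HTTP/1.1" 200 43',
    '10.0.0.5 - - [03/Mar/2008:10:15:03 -0500] "GET /search HTTP/1.1" 200 5120',
    '10.0.0.9 - - [04/Mar/2008:14:40:11 -0500] "GET /images/microform_bug.gif?form=micro HTTP/1.1" 200 43',
]
print(count_web_bug_hits(sample_log))
```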
New Services Added
• Serials Solutions MARC Record Service
• Serials Solutions Link Resolver
• OCLC WorldCat Collection Analysis
Second Step – Set Up an Application Server
Our Systems in 2008
• Voyager ILS
• Shared Application Server
• Clio ILL Software
• EZProxy Server
• Banner – University ERP
• University Networked Drive K:
• University Email Server
• University Web Server
Systems Chart - 2008
[Diagram. Components shown: Integrated Library System (Voyager: patrons, searches, circulation, media scheduling, web server); Banner (University ERP System: SIS, HRS); University Networked Drive K: (ILL, Clio data: patrons, materials); Proxy Server (EZProxy log: off campus database hits and ILL form); University Email Server; Application Server (scripting language, web server, DBMS); www.wpunj.edu web server with scripting language (ILL form page, ER micro form, serials form); Serials Solutions (A to Z, MARC Records, Link Resolver); OCLC bibliographic utility (WorldCat, ILL, WCA); other vendors' database services and usage reports.
Legend: internal-only WPUNJ server; externally accessible WPUNJ server; non-WPUNJ server.
Current relationships: off campus database usage by patron groups; ILL patrons/materials requested; ILL patrons/materials received.]
What is an Application Server?
• A machine or its software that works in conjunction with a web server to deliver application services such as the dynamic creation of a webpage from content stored in a database. From http://www.webtools.ca.gov/help/Glossary.asp
• Web Server Software (Apache or IIS)
• Database Management System – DBMS (MySQL, Oracle, MS SQL Server)
• Scripting Language (Perl, PHP, ColdFusion, ASP)
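To make the "dynamic creation of a webpage from content stored in a database" concrete, here is a minimal sketch in Python. It is not the stack described on the slide: SQLite stands in for the DBMS and Python for the scripting language, the web server is left out, and the table, column names, and sample rows are all invented for illustration.

```python
import sqlite3
from html import escape

# SQLite stands in for the DBMS; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE database_hits (service TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO database_hits VALUES (?, ?)",
    [("Service A", 120), ("Service B", 45)],  # invented sample rows
)


def render_usage_page(db):
    """Build an HTML page on the fly from rows stored in the database."""
    rows = db.execute(
        "SELECT service, hits FROM database_hits ORDER BY hits DESC"
    ).fetchall()
    body = "\n".join(
        f"<tr><td>{escape(service)}</td><td>{hits}</td></tr>"
        for service, hits in rows
    )
    return (
        "<html><body><h1>Database usage</h1>\n"
        f"<table>\n{body}\n</table>\n</body></html>"
    )


# In a full application server, the web server would call this per request.
page = render_usage_page(conn)
print(page)
```

The page always reflects the current table contents, which is the practical difference from a static HTML report.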
Why an Application Server?
• Relevant data in logfiles needs to be in a database to be analyzed.
• Need your own DBMS to create new tables and queries.
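As an illustration of both points, the sketch below parses a few proxy log lines, loads them into a table, and runs a query against it. It assumes an NCSA-style log layout (`host ident user [time] "request" status bytes`) with the full target URL in the request. SQLite stands in for the DBMS, and the table name, usernames, and example.com hosts are invented.

```python
import re
import sqlite3
from urllib.parse import urlsplit

# Assumed layout: host ident user [timestamp] "METHOD url PROTOCOL" status bytes
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Invented sample lines standing in for a real proxy logfile.
sample_log = [
    '198.51.100.7 - user01 [12/Feb/2008:09:01:44 -0500] "GET http://db-one.example.com:80/search HTTP/1.1" 200 5123',
    '198.51.100.7 - user01 [12/Feb/2008:09:02:10 -0500] "GET http://db-one.example.com:80/record/42 HTTP/1.1" 200 8040',
    '203.0.113.20 - user02 [12/Feb/2008:11:30:05 -0500] "GET http://db-two.example.com:80/ HTTP/1.1" 200 2210',
    'this line is malformed and is skipped',
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proxy_hits "
    "(ip TEXT, username TEXT, ts TEXT, target_host TEXT, status INTEGER)"
)

for line in sample_log:
    m = LOG_LINE.match(line)
    if m is None:
        continue  # ignore lines that do not fit the assumed layout
    target_host = urlsplit(m.group("url")).hostname
    conn.execute(
        "INSERT INTO proxy_hits VALUES (?, ?, ?, ?, ?)",
        (m.group("ip"), m.group("user"), m.group("ts"),
         target_host, int(m.group("status"))),
    )

# Once the log is a table, new questions become one-line queries.
hits_per_host = conn.execute(
    "SELECT target_host, COUNT(*) FROM proxy_hits "
    "GROUP BY target_host ORDER BY target_host"
).fetchall()
print(hits_per_host)
```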
• Decide how you will use the Application Server.
• Decide on the best and most plausible configuration.
One of Our Projects
• Mining EZProxy logfiles and linking to patron statistical categories from the Voyager Patron Database
– What majors and departments are accessing which database services?
– What majors and departments are accessing the ILL services?
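Questions like these reduce to a join between the proxy log and the patron records on a shared user identifier, followed by a grouped count. The sketch below shows the idea with SQLite. The two tables, their columns, and all rows are hypothetical; the real Voyager patron schema and the way statistical categories are stored there are not shown in this presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one row per proxied request, one row per patron.
conn.executescript("""
    CREATE TABLE proxy_hits (username TEXT, service TEXT);
    CREATE TABLE patrons (username TEXT PRIMARY KEY, major TEXT);
""")
conn.executemany(
    "INSERT INTO proxy_hits VALUES (?, ?)",
    [
        ("user01", "Database One"),
        ("user01", "ILL Form"),
        ("user02", "Database One"),
        ("user03", "ILL Form"),
        ("user04", "Database Two"),  # no matching patron record
    ],
)
conn.executemany(
    "INSERT INTO patrons VALUES (?, ?)",
    [("user01", "Music"), ("user02", "Psychology"), ("user03", "Music")],
)

# LEFT JOIN keeps hits from users missing in the patron table,
# reported under 'Unknown' instead of being silently dropped.
usage_by_major = conn.execute("""
    SELECT COALESCE(p.major, 'Unknown') AS patron_major,
           h.service,
           COUNT(*) AS hits
    FROM proxy_hits AS h
    LEFT JOIN patrons AS p ON p.username = h.username
    GROUP BY patron_major, h.service
    ORDER BY patron_major, h.service
""").fetchall()

for major, service, hits in usage_by_major:
    print(f"{major:12} {service:14} {hits}")
```

Only the aggregated counts need to leave the database, so individual usernames do not have to appear in any report.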
Systems Chart - 2008
[Diagram. The 2008 systems chart repeated with the same components: Integrated Library System (Voyager), Banner (University ERP System: SIS, HRS), University Networked Drive K: (ILL, Clio data), Proxy Server (EZProxy log), University Email Server, Application Server (scripting language, web server, DBMS), www.wpunj.edu web server (ILL form page, ER micro form, serials form), Serials Solutions (A to Z, MARC Records, Link Resolver), OCLC (WorldCat, ILL, WCA), and other vendors' database services and usage reports.
Added outputs: ILL Collection and Patron Group Analyses; Off Campus Database Hits by Patron Group.]
ILL request form authentications by major – Academic year 07/08
9   M- Music
9   M- Special Programs
8   M- Psychology
7   M- Biotechnology
7   M- Political Science
6   M- Anthropology
6   M- Music - Jazz Studies
4   M- Business
4   M- Communication
4   M- Nursing