Data Management: File Systems, Databases, and Metadata

Feb 04, 2022


Karen S. Baker (1), Christy A. Troxell-Thomas (2), William G. Pooler (1)

(1) Graduate School of Library and Information Science, University of Illinois Urbana-Champaign

(2) Biology Department, University of Illinois Springfield

Assembling and organizing data often occurs over time. Differing approaches to data storage, organization, and metadata may be used at different stages of project development. A comparison is provided of file systems (#1) and relational databases (#2, #3) for heterogeneous field data projects.

Overview  

1.  File System, with files named and placed in a logical hierarchy for data storage and organization.

Strength: Change is handled with less effort for file systems than for databases; change is a property of high value at the beginning of a project.

Weakness: File systems cannot represent many-to-many relationships, which makes some analyses difficult.

2.  Relational Database Single Key (1-to-n relations), with a single key defining relations for 1-to-n queries, so multiple files can be located and opened but specific information within them cannot be pulled out. This works well for data that can be assembled in a single table, but not for queries at the variable level.

Strength: Adds structure while keeping some flexibility, so many files can be identified and accessed easily.

Weakness: There are no many-to-many relationships, so complex analysis is difficult.

3.  Relational Database Multiple Relations (n-to-n queries), with multiple keys that facilitate complex queries and allow subsets of data from multiple tables to be assembled into a single product.

Strength: Databases can query across many tables to support complex, efficient analysis.

Weakness: Databases are rigid designs with set rules, and programmatic constraints can make changes and redesign difficult.
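The difference between approaches 2 and 3 can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the schema used in the project: every table, column, project, file, and variable name below is hypothetical. A single foreign key supports the 1-to-n lookup (one project, many files), while a junction table adds the n-to-n relation (files and variables) that allows queries at the variable level.

```python
import sqlite3

# In-memory database; all names and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Approach 2 (single key, 1-to-n): one project owns many files.
CREATE TABLE project (project_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE data_file (
    file_id    INTEGER PRIMARY KEY,
    project_id INTEGER REFERENCES project(project_id),
    path       TEXT
);
-- Approach 3 (multiple keys, n-to-n): a junction table links each
-- file to the variables it contains, and each variable to its files.
CREATE TABLE variable (variable_id INTEGER PRIMARY KEY, name TEXT, units TEXT);
CREATE TABLE file_variable (
    file_id     INTEGER REFERENCES data_file(file_id),
    variable_id INTEGER REFERENCES variable(variable_id),
    PRIMARY KEY (file_id, variable_id)
);
""")

conn.executemany("INSERT INTO project VALUES (?, ?)",
                 [(1, "wetland_survey"), (2, "river_survey")])
conn.executemany("INSERT INTO data_file VALUES (?, ?, ?)", [
    (1, 1, "wetland_survey/fish_2014.csv"),
    (2, 1, "wetland_survey/vegetation_2014.csv"),
    (3, 2, "river_survey/fish_2014.csv"),
])
conn.executemany("INSERT INTO variable VALUES (?, ?, ?)", [
    (1, "water_temperature", "degC"),
    (2, "species_count", "count"),
])
conn.executemany("INSERT INTO file_variable VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (3, 1), (3, 2)])


def files_for_project(conn, project_name):
    """1-to-n query: list every file that belongs to one project."""
    rows = conn.execute(
        "SELECT f.path FROM data_file f "
        "JOIN project p ON p.project_id = f.project_id "
        "WHERE p.name = ? ORDER BY f.path",
        (project_name,),
    ).fetchall()
    return [row[0] for row in rows]


def files_with_variable(conn, variable_name):
    """n-to-n query: list files, across projects, holding one variable."""
    rows = conn.execute(
        "SELECT f.path FROM data_file f "
        "JOIN file_variable fv ON fv.file_id = f.file_id "
        "JOIN variable v ON v.variable_id = fv.variable_id "
        "WHERE v.name = ? ORDER BY f.path",
        (variable_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(files_for_project(conn, "wetland_survey"))
print(files_with_variable(conn, "water_temperature"))
```

The first query only needs the single project key; the second cannot be answered without the junction table, which is also the part of the design that is hardest to change once data has been loaded.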

Factors for making a transition

community data management readiness

personnel and resource arrangements

stable file system

small, simple data table

nascent technical infrastructure

Three Approaches to Data Organization

[Figure: three panels, one per approach; only the labels below are recoverable.]

1.  File System

Emiquon

TFSE, FBS, IRBS, Procedures

Thompson Lake - CH, Merwin - CH, Illinois River - CH, Bald Eagle Use Days, Waterfowl Abundance, Raptor Abundance, Emiquon Veg, Spunky Fish, Merwin Fish, GPS Coordinates, STRMP Fish, IlLTRMP Veg

2.  Relational Database Single Key

University of Illinois

3.  Relational Database Multiple Relations

Emiquon Partners: TFSE, UIS, UIUC, TNC, USF&WS, Dickson Mounds, INHS (FBFS, IRBFS)

Kinds of Metadata

1.  File System: Readme file, file names & headers (e.g. Box)

2.  Relational Database Single Key: One key (e.g. FileMaker Pro)

3.  Relational Database Multiple Relations: Multiple keys, data dictionaries & machine readable form (e.g. Access)

Examples of Kinds of Databases

By content type:
☑ Catalog
☑ Document-oriented
☑ Full-text
☑ Graphic
☑ Photographic
☑ Knowledge
☑ Platform stream
☑ Real-time
☐ _______________
☐ _______________

By subject:
☑ Spatial (Geographical)
☑ Temporal (Time period)
☑ Project
☑ Theme/Phenomenon
☑ Domain: Botany, Chemical, Ecological, Rivers (hydro)
☐ _______________
☐ _______________

Acknowledgement

Supported by National Science Foundation (NSF DEB, Rapid Grant# 1347077) and the Institute of Museum and Library Services (IMLS) Data Curation Education in Research Centers (DCERC, Award# RE-02-10-0004-10).

Emiquon Science Conference March 2015