Some Thoughts About the Need for Cyberinfrastructure

Jan 05, 2016

Milo Wright
Page 1: Some Thoughts About the Need for Cyberinfrastructure

• The geosciences are a strongly data-driven discipline, and large data sets are often developed by researchers and government agencies.
• The complexity of the fundamental scientific questions being addressed requires a variety of data and integrative, innovative approaches if we are to find solutions.
• Geoscientists have a tradition of sharing data, but being willing to share data if asked, or even maintaining an obscure website, accomplishes little. As a community, we also have no mechanism for sharing the work done when a third party cleans up, reorganizes, or embellishes an existing database.
• We waste a large amount of human capital in duplicative efforts, and we fall further behind by having no mechanism for existing databases to grow and evolve via community input.

Page 2: Some Definitions

Data Set: A relatively raw compilation of data (standards, formats, and completeness may be questionable)

Data Base: A mature data compilation that has been “cleaned”, standardized with input from the scientific community, and formatted for use by others (not based on proprietary software, e.g., ORACLE)

Data System: A linked and organized set of data bases, including public domain software (not platform dependent) and procedures to analyze the data

Page 3: Goals of the gravity data system effort

Construct an updated gravity database that will have excellent spatial coverage and quality and that will serve a diverse community of users, while being simple to access and flexible enough to meet the range of applications and changing needs of users.

Create a structure to provide efficient updating and to encourage additions of new data by users (i.e., craft a living database).

Construct a robust toolbox of public domain software.

Link this specialized data system into an emerging broad national (and North American) geoscience data system.

Page 4: Software Development Efforts

1. Advanced technique to remove duplicate measurements and to provide for merging of new data without creating new duplicates (new high quality data will be time-stamped)

2. Advanced technique to detect erroneous values

3. Web interface to the data system (26 students in a software engineering capstone class are involved)

4. Mapping software (GUI for GMT)

5. Modeling software (graphical interface for 2.5D models)

6. Digital processing (published USGS package - a start)

7. Data reduction equations (community consensus)
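
Items 1 and 2 above can be illustrated with a minimal sketch. The slides do not describe the actual techniques, so everything here is an assumption made for illustration: the `Station` record, the rule that two measurements are duplicates when their coordinates agree to four decimal places, the choice to keep the record with the newest time stamp, and the median-absolute-deviation screen used to flag suspect values.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Station:
    """Hypothetical gravity record; the field names are assumptions."""
    lat: float        # degrees
    lon: float        # degrees
    gravity: float    # observed gravity, mGal
    stamped: date     # time stamp assigned when the record was added


def location_key(s, decimals=4):
    """Round coordinates so near-identical positions share one key.

    Assumed duplicate rule. Positions that straddle a rounding boundary
    will not match, so a real system would need a distance test instead.
    """
    return (round(s.lat, decimals), round(s.lon, decimals))


def merge(existing, incoming):
    """Merge two station lists, keeping one record per location key.

    When two records share a key, the one with the newer time stamp wins.
    """
    best = {}
    for s in list(existing) + list(incoming):
        key = location_key(s)
        if key not in best or s.stamped > best[key].stamped:
            best[key] = s
    return list(best.values())


def flag_outliers(stations, threshold=5.0):
    """Return stations whose gravity lies far from the median.

    Distance is measured in units of the median absolute deviation (MAD).
    This is a simple stand-in, not the technique referred to in the slides.
    """
    values = [s.gravity for s in stations]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [s for s in stations if abs(s.gravity - med) / mad > threshold]
```

Under these assumptions, merging the same batch of new data a second time leaves the result unchanged, which is the property item 1 asks for. A realistic screen for item 2 would compare each value with its neighbors or with a regional model rather than with a single global median.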

Page 5: GDRP1 - welcome

Page 6: Map search

Page 7: GDRP3 - search results

Page 8: From Geological Survey of Canada website

Page 9: GDRP2 - base stations

Page 10: Base station description

Page 11: Base station picture

Page 12: GDRP4 - upload

NSF

Page 13: Status and Products

Web-based data system

Gravity database (U.S. first pass done; TCs done/online soon)

Base stations (U.S. stations available online)

Educational material (tutorial done)

Standards (committee consensus, implementation underway)

Links to other related sites (done)

Software to manipulate data (subgrids, profiles, filters, etc.) (partly done)

Modeling software (2.5D done)

North American Grids

North American Maps