How and why to document data for long-term storage; and What's special about Geographical data?

Post on 08-Feb-2016




Allan Reese, Cefas Weymouth

Cefas buildings

Weymouth

Burnham-on-Crouch

Fish disease

• Causes and progress of infection
• Diagnostics (viruses, bacteria, fungi, parasites)
• Vaccines and therapeutics (safety and efficacy)
• Epidemiology & risk assessment
• Surveillance and control
  – Fish Health Inspectors
• Emerging and exotic diseases
• Policy advice

Spring Viremia of Carp Virus

Fish farms in E&W

Who wants a database?

• I’ve got some data so I need a database
• Our demo will show you how easy it is to simultaneously search, share and retrieve information from thousands of library databases
• Project… plans to build, through networking, a database on best practices in the field
• Rapid growth in the quantity of omic data means bio-informaticians need to manage data in an efficient and reliable manner. The main focus of this course is on designing, creating and querying relational databases

Why a (relational) database?

1. large volume of data (typically gigabytes)
2. complex data structure (not matching standard application)
3. long-term use / continued accumulation or incremental update
4. total accuracy & consistency needed on micro-scale
5. frequent accesses to small subsets, ad hoc queries
6. data shared by more than one person

(University Computing 1991; Significance Dec 2007)

Extract for analysis

• Fields (variables) = columns
• Units (level of analysis) = rows
• Columns x Rows = Data table

Query -> view -> table of data -> summary or analysis
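The chain from query to summary can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the tables (site, sample), the view (sample_flat) and all values are hypothetical and do not come from any Cefas database.

```python
import sqlite3
from statistics import mean

# Hypothetical example data: table names, column names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE site (site_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sample (sample_id INTEGER PRIMARY KEY,
                         site_id INTEGER REFERENCES site(site_id),
                         length_mm REAL);
    INSERT INTO site VALUES (1, 'South'), (2, 'East');
    INSERT INTO sample VALUES (1, 1, 120.0), (2, 1, 130.0), (3, 2, 95.0);
""")

# Query -> view: one row per unit of analysis (here, one sample).
conn.execute("""
    CREATE VIEW sample_flat AS
    SELECT s.sample_id, t.region, s.length_mm
    FROM sample AS s JOIN site AS t ON t.site_id = s.site_id
""")

# View -> table of data: columns are fields, rows are units.
cursor = conn.execute(
    "SELECT sample_id, region, length_mm FROM sample_flat ORDER BY sample_id"
)
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

# Table -> summary: mean length per region, computed outside the database.
by_region = {}
for _sample_id, region, length_mm in rows:
    by_region.setdefault(region, []).append(length_mm)
summary = {region: mean(values) for region, values in by_region.items()}
```

Once the rows are extracted as a flat table, the summary step no longer depends on the database at all, which is the point of the slide.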

Mystery meat

• What tables form the raw data?
• What fields are in each table?
• Data dictionary?
• Documenting meanings or DB structure?

Table preferred when

• Scientific data probably SHOULD NOT be changed
  – or data added in batches (incremental)
• Structure NOT complex
  – replication across units allowed, but not excessive
• Levels of analysis are few (or few dominant)
• Analyses summarize whole data or samples
  – often one-offs (bespoke or user-written)
• Sorting or indexing allows very rapid access

Data table needs metadata

• Metadata standards (Dublin Core)
  – emphasis on discovery
  – list many fields
  – codebook not mentioned
• A modest suggestion
  – data table of rows and columns, with column headers
  – codebook: another table to explain headers
  – metadata: describe background, ownership etc
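One possible reading of the modest suggestion, sketched with Python's standard csv and json modules. The file contents, column names and metadata keys below are invented for illustration; the slides do not prescribe any particular layout.

```python
import csv
import io
import json

# Hypothetical file contents; every name and value here is illustrative.
data_csv = "site_id,date,temp_c\nW01,2007-06-01,14.2\nW02,2007-06-01,13.8\n"
codebook_csv = (
    "column,meaning,units\n"
    "site_id,Sampling site code,\n"
    "date,Sampling date (ISO 8601),\n"
    "temp_c,Water temperature,degrees Celsius\n"
)
metadata = {"title": "Example water temperatures", "owner": "Example owner"}


def read_table(text):
    """Parse CSV text into (list of column headers, list of row dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames, rows


data_header, data_rows = read_table(data_csv)
_, codebook_rows = read_table(codebook_csv)

# Check that every column header in the data table is explained in the codebook.
documented = {row["column"] for row in codebook_rows}
undocumented = [name for name in data_header if name not in documented]

# The metadata travels with the data as a small, readable text file.
metadata_json = json.dumps(metadata, indent=2)
```

An empty `undocumented` list means the codebook covers every header. All three pieces are plain text, so they should remain readable without the software that produced them.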

Geographical Databases

ESRI (ArcInfo) assumes

• The purpose of a GIS is to provide a spatial framework to support decisions …
• Most often, a GIS presents information in the form of maps and symbols …
• A map user is the end consumer of a GIS. This person looks at maps …
• When the Cassini spacecraft was launched, GIS was used to evaluate the risk of an accident with the plutonium generators on board

Nearer to me

GISs contain

• Data as points, lines, areas
• Location data
  – lat/long, grid refs, postcodes, TOIDs
• Representation instructions
  – scaling, icons, label position, shading

Can you get data out?

• Point and click works for pop-up labels
  – not to output a table
• Limited to the precision of the input device, including the user’s eyesight
• I want, probably, a whole layer of data, including the positions as named fields
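As a sketch of what "a whole layer, with positions as named fields" could look like, assume the layer can be exported as GeoJSON (an assumption; the slides do not name an export format). The code below flattens point features into a table with explicit longitude and latitude columns. The site names, species and coordinates are made up.

```python
import csv
import io
import json

# Hypothetical layer exported as GeoJSON; all names and coordinates are illustrative.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [-2.45, 50.61]},
   "properties": {"name": "Site A", "species": "carp"}},
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [0.82, 51.63]},
   "properties": {"name": "Site B", "species": "trout"}}
]}
"""


def layer_to_rows(collection):
    """Turn each point feature into one row, with the position as named fields."""
    rows = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Point":
            continue  # lines and areas would need a different flattening rule
        # GeoJSON stores positions as [longitude, latitude].
        longitude, latitude = geometry["coordinates"][:2]
        row = {"longitude": longitude, "latitude": latitude}
        row.update(feature["properties"])
        rows.append(row)
    return rows


rows = layer_to_rows(json.loads(geojson_text))

# Write the rows as CSV text that other analysis software can read.
fieldnames = ["longitude", "latitude", "name", "species"]
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
table_text = out.getvalue()
```

The result is an ordinary data table: one row per object, with the coordinates available as columns rather than locked inside the map display.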

How do my needs map into the database?

Lacking / hidden / difficult in GIS

• List fields associated with physical object
• Choose many objects and output data
  – eg to make proximity matrix
• Distinguish raw from constructed data
  – point-heights versus interpolated contour
• Output data values for an area
  – eg sea surface temperatures
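Once point positions are available as a table, a proximity matrix is straightforward to build outside the GIS. The sketch below uses the haversine great-circle formula with a mean Earth radius of 6371 km. Both are choices made for this illustration, not something the slides specify, and the three points are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long positions in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error before taking asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def proximity_matrix(points):
    """Return a square matrix of pairwise distances for (lat, lon) tuples."""
    return [
        [haversine_km(lat1, lon1, lat2, lon2) for (lat2, lon2) in points]
        for (lat1, lon1) in points
    ]


# Hypothetical (latitude, longitude) positions in degrees.
points = [(50.0, -2.0), (51.0, -2.0), (50.0, 0.0)]
matrix = proximity_matrix(points)
```

Here `matrix[i][j]` is the distance in kilometres between point i and point j. For grid references rather than lat/long, plain Euclidean distance on the eastings and northings would be the simpler choice.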

Request

GIS suppliers may prefer to address users’ needs by adding yet more features to the interface, or pointing to the SQL interface

I would rather they re-consider the role of the GIS as a data warehouse, from which it should be easier to select and extract data that can be analysed in other software
