Structuring Data to Facilitate Analysis
Jerry J. Vaske
Jay Beaman
Colorado State University
Warner College of Natural Resources
Human Dimensions of Natural Resources
Fort Collins, CO
Workshop at the 2008 Pathways to Success Conference: Integrating Human Dimensions into Fish & Wildlife Mgmt.
Workshop Foundation
Workshop Objectives
• Illustrate strategy for:
– Facilitating analysis of 2006 National Survey of Fishing, Hunting, and Wildlife-Associated Recreation (FHWAR)
– Increasing the usability of FHWAR data for management, planning & policy
• Compare two types of data structures:
– Flat files
– Relational Entities
Traditional Flat File
Rows = Respondents
Columns = Variables
Flat Files – Journal Article Example
Every journal article has:
• One or more authors
• Title
• Journal name
• Specifics about date of publication:
– Year
– Volume number
– Issue number
– Page numbers
• Potentially keywords
Flat File Data Structure for Journal Articles
Potential Issues with Flat Files
• Problem
– Diefenbach et al. (2005) article had 7 co-authors
– 7 columns (variables) necessary to accommodate all authors' last names
– 19 of 26 articles in flat file had only 1 or 2 authors
– 67% of author fields empty
– If first names included, even more empty fields
• Solution – Relational database
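The empty-field problem and its relational fix can be sketched in a few lines of Python. The three articles below are invented placeholders (not the 26 articles from the workshop example), and names such as `article_table` and `author_table` are assumptions made for illustration only.

```python
# Hypothetical articles: a title plus a list of author last names.
articles = [
    {"title": "Article A", "authors": ["Smith"]},
    {"title": "Article B", "authors": ["Jones", "Lee"]},
    {"title": "Article C", "authors": ["A1", "A2", "A3", "A4", "A5", "A6", "A7"]},
]

# Flat file: one row per article, one author column per possible author.
# The article with the most authors dictates the number of columns.
max_authors = max(len(a["authors"]) for a in articles)
flat_rows = [
    [a["title"]] + a["authors"] + [""] * (max_authors - len(a["authors"]))
    for a in articles
]
author_cells = len(articles) * max_authors
empty_cells = sum(row[1:].count("") for row in flat_rows)
empty_share = empty_cells / author_cells

# Relational alternative: an article entity and an author entity,
# linked by article_id. Each author is one row, so no cell is empty.
article_table = [(i, a["title"]) for i, a in enumerate(articles, start=1)]
author_table = [
    (i, position, name)
    for i, a in enumerate(articles, start=1)
    for position, name in enumerate(a["authors"], start=1)
]

print(f"Empty author fields in flat file: {empty_cells} of {author_cells}")
print(f"Rows in author entity: {len(author_table)}")
```

In the relational version the author entity grows only by the number of authors actually present, so adding first names would add one column to that entity rather than seven columns to every article row.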
Relational Databases
• Definition
– Set of tables containing data for predefined categories
– Data stored in separate files (tables) that are linked
• Terminology
– Table = Entity (E)
– Rows (tuples) in table = information about an object (e.g., journal article or respondent)
– Columns (attributes) = variables
– Two types of relations (R)
1. Set of tuples: a table with attributes (these R's store data)
2. Algebraic (Person ID in Table A = Person ID in Table B)
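The two kinds of relation can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `person` and `activity` tables, their columns, and all values are hypothetical and do not reproduce the FHWAR layout. Each table is a relation of the first kind (it stores tuples); the `ON` clause expresses the second, algebraic kind (Person ID in one table equals Person ID in the other).

```python
import sqlite3

# In-memory database holding two hypothetical entities.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, sex TEXT);
CREATE TABLE activity (person_id INTEGER, activity TEXT, days INTEGER);
INSERT INTO person VALUES (1, 'F'), (2, 'M');
INSERT INTO activity VALUES (1, 'elk', 5), (1, 'deer', 3), (2, 'elk', 7);
""")

# Algebraic relation: link the entities where the person IDs match.
rows = conn.execute("""
SELECT p.person_id, p.sex, a.activity, a.days
FROM person AS p
JOIN activity AS a ON p.person_id = a.person_id
ORDER BY p.person_id, a.activity
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

Person 1 appears once in the `person` entity but twice in the joined result, because the activity entity holds one row per activity rather than one column per activity.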
Keep Person_ID Sportsperson_Weight Sex State_of_Residence In_State
     Response Activity_Location Fish_Hunt_Type Response_Unit ;
Person_ID = PersonID ;
Sportsperson_Weight = spwgt ;
Sex = Xsex ;
State_of_Residence = put (resstate, $st2num2.) ;

* Array stores info to identify state when decompressing ;
Array a1( 2, 8 ) HUNTSTD1-HUNTSTD8 STDAYSHD1-STDAYSHD8 ;

* Array stores info to associate species with variables ;
Array gam1( 9) g1-g9 ;
Retain g1 1 g2 2 g3 3 g4 4 g5 5 g6 6 g7 7 g8 40 g9 41 ;
Array a7( 2, 9 , 8 ) bgame1d1--bgdifday9d8 ;

Do m = 1 To 2 ;
  Do j = 1 To 9 ;
    Do k = 1 To 8 ;
      If a1( 1, k) = ' ' Then Goto End7 ;
      Fish_Hunt_Type = gam1(j) ;
      If m = 1 Then Do ; Response_Unit = 1 ; End ;
      Else Do ; Response_Unit = 2 ; End ;
      Response = a7(m, j, k) ;
      Activity_Location = put(a1( 1, k), $st2num2. ) ;
      If Activity_Location = State_of_Residence Then In_State = 1 ;
      Else In_State = 0 ;
      * Outputs data for hypothesis;
      If Response > 0 Then Output ;
    End7: End ;
  End ;
End ;
run ;
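For readers who do not use SAS, the same unpacking logic can be sketched in Python. This is an illustrative analogue rather than a translation of the workshop program: the `person` record, its field names, and its values are hypothetical, and it uses 2 species and 3 state slots instead of 9 and 8. As in the SAS loops, empty state slots are skipped and only positive responses are written out.

```python
def unpivot(person):
    """Turn one wide respondent record into long-format records."""
    records = []
    # unit plays the role of m, species of gam1(j), state of a1(1, k).
    for unit, grid in enumerate(person["responses"], start=1):
        for species, per_state in zip(person["species_codes"], grid):
            for state, value in zip(person["states"], per_state):
                if state is None:      # empty state slot: nothing to unpack
                    continue
                if value > 0:          # keep only positive responses
                    records.append({
                        "person_id": person["person_id"],
                        "fish_hunt_type": species,
                        "response_unit": unit,
                        "response": value,
                        "activity_location": state,
                        "in_state": int(state == person["residence"]),
                    })
    return records


# Hypothetical respondent: resident of state 8, activity in states 8 and 56.
person = {
    "person_id": 101,
    "residence": 8,
    "states": [8, 56, None],
    "species_codes": [1, 2],
    # responses[unit][species][state slot]
    "responses": [
        [[1, 0, 0], [1, 1, 0]],
        [[4, 0, 0], [6, 2, 0]],
    ],
}

long_records = unpivot(person)
for record in long_records:
    print(record)
```

One wide row becomes several narrow rows, each carrying its own species, unit, and location, which is what allows later selections on those attributes.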
SAS Entity to SPSS Entity
Get SAS Data = 'C:\Hunt_BGspecies_States.sas7bdat'.
Add Value labels
Save Outfile = 'C:\Hunt_BGspecies_States.sav'.
Testing Hypothesis with Relational Entity
* Opens data.
GET File = 'C:\Hunt_BGspecies_States.sav'.
* Weights data.
WEIGHT BY Sportsperson_Weight.
* CO hunters (8) and WY hunters (56).
Select if (Activity_Location = 8 or Activity_Location = 56).
* Elk hunters.
Select if (Fish_Hunt_Type = 2).
* Days of participation.
Select if (Response_Unit = 2).
* ANOVA.
UNIANOVA Response BY Sex Activity_Location.
GET File = 'C:\FHWAR\Hunting_Activity.sav'.
Select if (Sub_Table_ID = 10).
WEIGHT BY Sportsperson_Weight.
Select if (Activity_Location = 8 or Activity_Location = 56).
Select if (Fish_Hunt_Type = 2).
Select if (Response_Unit = 2).
UNIANOVA Response BY Sex Activity_Location.
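The selection steps in the syntax above can be mirrored in Python on a long-format entity. This sketch only computes weighted cell means by sex and state, the descriptive ingredient of the comparison; it does not perform the ANOVA itself. All records, field names, and weights are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format records (one row per person, species, unit, state).
records = [
    {"sex": 1, "location": 8, "type": 2, "unit": 2, "response": 5, "weight": 2.0},
    {"sex": 1, "location": 8, "type": 2, "unit": 2, "response": 8, "weight": 1.0},
    {"sex": 2, "location": 56, "type": 2, "unit": 2, "response": 4, "weight": 1.0},
    {"sex": 2, "location": 30, "type": 2, "unit": 2, "response": 9, "weight": 1.0},
    {"sex": 1, "location": 56, "type": 1, "unit": 2, "response": 3, "weight": 1.0},
]

# Same three selections as the Select if statements:
# states 8 or 56, hunt type 2, response unit 2.
selected = [
    r for r in records
    if r["location"] in (8, 56) and r["type"] == 2 and r["unit"] == 2
]

# Weighted mean response per (sex, state) cell.
sums = defaultdict(lambda: [0.0, 0.0])   # cell -> [sum of w*x, sum of w]
for r in selected:
    cell = (r["sex"], r["location"])
    sums[cell][0] += r["weight"] * r["response"]
    sums[cell][1] += r["weight"]
cell_means = {cell: wx / w for cell, (wx, w) in sums.items()}

for cell, mean in sorted(cell_means.items()):
    print(cell, round(mean, 2))
```

Because species, unit, and location are ordinary columns in the entity, each selection is a one-line filter rather than a search across dozens of wide-format variables.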
Results
Conclusions
• Analyses that are difficult to perform with flat file data are possible with relational structure
• Restructuring all of 2006 FHWAR data as well as data from 1991, 1996, & 2001 would:
– Yield similar analysis capabilities
– Allow for trend analysis
– Create new practical opportunities for state agencies
Practical Opportunity
• State agencies have accurate records of license sales (e.g., hunting only, fishing only, combos)
• With potentially 100s of licenses, permits, & stamps sold, not practical to ask about specific licenses in a flat file
• Moving to relational structure for obtaining license data has advantages …
Advantages of Relational License Data
1. Can ask about actual state license sales
– All state license info can be "pre-stored" in one entity
– Size of entity would not impact other data entities
2. Questions about specific license cost not necessary; correct information pre-stored
3. Establishing relationship between state specific license sales & FHWAR data provides foundation for benchmarking / calibrating meaningful estimates based on FHWAR
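A pre-stored license entity might look like the following Python sketch. The license codes, names, costs, and purchase records are all invented for illustration; no actual state license data are represented. The respondent only reports which licenses were bought, and the cost comes from the license entity.

```python
# Hypothetical pre-stored license entity, keyed by license ID.
licenses = {
    "L01": {"name": "Resident elk", "cost": 50.0},
    "L02": {"name": "Resident fishing", "cost": 25.0},
}

# Hypothetical purchase entity: (person_id, license_id) pairs.
purchases = [(101, "L01"), (101, "L02"), (102, "L02")]

# Cost is looked up through the license ID rather than asked in the survey.
spend = {}
for person_id, license_id in purchases:
    spend[person_id] = spend.get(person_id, 0.0) + licenses[license_id]["cost"]

print(spend)
```

Adding hundreds of further licenses would lengthen only the `licenses` entity; the purchase entity and every other entity keep the same shape.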
From Analysis to Data Collection
• Entity-based models:
– facilitate analyses
– can also enhance data collection
• Currently working with software company Techneos (www.techenos.com) to implement pilot models that yield:
– more consistent data collection
– more accurate data collection