The Mangrove Information System MAIS: Managing and Integrating Interdisciplinary Research Data

Apr 22, 2023

Chapter 22

The Mangrove Information System MAIS: Managing and Integrating Interdisciplinary Research Data

U. Salzmann, G. Krause, B.P. Koch, and I. Puch Rojo

22.1 Introduction

Regular and efficient exchange of data between investigators is essential for the progress of interdisciplinary and integrative scientific research. Research data have to be easily accessible and retrievable in a structured and understandable format in order to facilitate comparative studies and make them available to a wider community. Both intra-scientific communication and transfer of scientific knowledge to stakeholders have been integral parts of the interdisciplinary research project MADAM (Mangrove Dynamics and Management), which aims at supporting environmental management in northern Brazil (Berger et al. 1999). To ensure data availability, quality and exchange, a central GIS database called MAIS (Mangrove Information System) was developed during the initial stage of the MADAM project (Koch 1997). MAIS archives and synthesizes heterogeneous data collected during 10 years of interdisciplinary research in biology, geography, biogeochemistry, socio-economy and meteorology in north Brazil.

Facilitated by modern computer performance and memory capacity, the typical scientist stores and analyzes research data on a personal workstation or local server using the resources and applications of the local system. If data are regularly backed up, this method of scientific “data management” is relatively secure and straightforward. However, the volume of valuable and often unique information and data continually increases throughout the “life cycle” of a scientific project, which should result in publications in peer-reviewed journals. Journal publications contain figures, tables and interpretations, whereas digital primary data are rarely published. The primary (raw) data and supporting information (metadata), which are stored in the investigators’ personal file systems, rapidly become unmanageable and are, at the end of each research project, in danger of being permanently “buried” in private archives. This equates to an effective loss of data and knowledge to the scientific community (Helly et al. 2003). Raw or primary research data are unique and must be stored and managed for the long term. Concerted initiatives to prevent research data loss started more than 40 years ago with the establishment of large data repositories such as the World Data Center System (WDC), which promotes open access and exchange of scientific data (e.g., Mounsey and Tomlinson 1988; Alverson and Eakin 2001; http://www.ngdc.noaa.gov/wdc). Today, numerous archiving facilities for environmental scientific data are available worldwide, ranging from large central data repositories to rather small databases addressing specific research fields, disciplines or even single research projects (e.g., Baba et al. 2004; Diepenbroek et al. 2002; Können and Koek 2005). However, to date, there are no internationally binding regulations for scientific data management. The subject of an ongoing debate is how scientific data should be managed and made available to the general scientific and public community (Klump et al. 2006; Dittert et al. 2001).

U. Saint-Paul and H. Schneider (eds.), Mangrove Dynamics and Management in North Brazil, Ecological Studies 211, DOI 10.1007/978-3-642-13457-9_22, © Springer-Verlag Berlin Heidelberg 2010

Here, we describe the concept, design and functionality of the GIS database MAIS of the MADAM project. Our main objective is to highlight the potential of such a central data management system for improving interdisciplinary research. With regard to the ongoing debate on the freedom of scientific information, we also discuss the challenges in running a project database and outline the conceptual prerequisites for the successful management of heterogeneous research data.

22.2 Implementation of a GIS-Database

During the initial planning stage of MAIS, questionnaires were distributed and meetings organized to fully assess the project collaborators’ needs and expectations with regard to scientific data management. The interdisciplinary character of the MADAM project resulted in the production of extremely heterogeneous datasets, developed from zoological, botanical, geochemical and meteorological measurements as well as the socio-economic census. This heterogeneity placed high demands on the design of the MAIS database. The following major requirements and objectives for central project data management were identified:

(1) Long-term data availability in Brazil and Germany (preferably via the Internet)

(2) Secure and long-term data storage

(3) High quality of supporting information (metadata)

(4) Protection of scientific ownership (data privacy)

(5) Use of geo-referenced data to facilitate spatial data comparison

(6) Flexibility and adaptability to specific project requirements

(7) User-friendly graphical user interface (GUI) for data search and visualization

As the MADAM programme aims at delivering its scientific outcomes to relevant stakeholders, the project database has also been regarded as a tool for supporting management decisions in north Brazil. Therefore, considerable effort has been devoted to the development of a user-friendly GUI and a clear description of metadata, which should also be understandable to non-scientists. MAIS was implemented in 1995 when standards in information technology were relatively low compared with today, and when MADAM was still in its initial programme phase. This necessitated the design of an upgradeable, flexible database, adaptable to varying project requirements and progress in information technology.

22.3 Database Model and Data Management

Research data within the MADAM project were managed on different levels. Data storage and processing on the investigators’ personal computers guaranteed a maximum of data privacy and adaptability to individual research needs. Specific data analyses and individual software were employed at this level. If necessary, users were advised by database administrators on how to archive data professionally to facilitate a subsequent data transfer to the central project database. Before importing into MAIS, the investigators’ raw data were restructured and redundancies were removed. Each dataset was catalogued and documented using standardized metadata, which assured internal consistency and unambiguous description. Data were distributed through the intranet and thereby could be made accessible to all project members.

The MAIS database was initially developed on Microsoft Access Version 97–2000 for Windows. Access is one of the most popular relational database management systems, which is easily programmable and has a very user-friendly interface and fast search engine (Viescas 2004). The Access database was connected to ArcView 3.2 using a special GIS interface, programmed in Avenue, which enabled visualization and basic analysis of geo-referenced data. The master version of the MAIS database and GIS interface was based in Germany and copies were regularly distributed to servers in Brazil. In 2002, we migrated MAIS onto a platform-independent and entirely internet-based data management system, which facilitated data distribution and did not require expensive software licensing. We employed the open source software MySQL (http://www.mysql.com) for database management and MapServer (http://mapserver.gis.umn.edu) for the visualization of geo-referenced data. The technical update also increased data security and privacy. Protection against plagiarism was a major concern of many project members storing their research data in a central data management system. Therefore, if requested by the author, unpublished data were protected with a username and password. Such protection of data ownership is particularly important for ongoing research projects as they contain large amounts of sensitive and unpublished research data or preliminary work. While many datasets in MAIS are password protected, their meta-information remains accessible to all users, providing information about the status of ongoing research within the MADAM project.
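The access rule described above (metadata always visible, data values released only when the dataset is unprotected or the correct password is supplied) can be sketched as follows. This is a minimal illustration and not the MAIS implementation: the `Dataset` class, its field names and the password-hashing scheme are assumptions introduced here.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash so that no plain-text password is stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


@dataclass
class Dataset:
    """Hypothetical dataset record: metadata is public, values may be protected."""
    metadata: dict
    values: list
    salt: Optional[bytes] = None
    password_hash: Optional[bytes] = None

    def protect(self, password: str) -> None:
        """Optional protection, chosen by the dataset's author."""
        self.salt = os.urandom(16)
        self.password_hash = hash_password(password, self.salt)

    def read(self, password: Optional[str] = None) -> dict:
        """Always return metadata; return values only if access is allowed."""
        result = {"metadata": self.metadata, "values": None}
        if self.password_hash is None:
            result["values"] = self.values  # unprotected dataset
        elif password is not None and hmac.compare_digest(
            hash_password(password, self.salt), self.password_hash
        ):
            result["values"] = self.values  # correct password supplied
        return result


if __name__ == "__main__":
    ds = Dataset({"title": "hypothetical crab survey", "status": "unpublished"}, [1.0, 2.0])
    ds.protect("secret")
    print(ds.read())          # metadata only
    print(ds.read("secret"))  # metadata and values
```

Keeping the metadata outside the protected part is what allows other project members to see that a dataset exists and whom to ask for access.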

The design and implementation of the MAIS database followed general standards for software and database development (e.g., Lang and Lockemann 1995). The database was fully normalized and structured using a relational data model (Codd 1990). A full normalization implies (e.g., Carleton et al. 2005): (1) elimination of repeating groups and redundant data; (2) elimination of columns not dependent on key fields, which uniquely identify each record; and (3) isolation of independent and semantically related multiple relationships.
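To make the normalization steps concrete, the following minimal sketch splits a flat, redundant measurement table into separate station, parameter and measurement tables. It uses Python's built-in `sqlite3` module purely for illustration (MAIS itself ran on Access and later MySQL); all table names, column names and values are hypothetical.

```python
import sqlite3

# Hypothetical flat export: station and parameter details are repeated
# on every measurement row (illustrative values only).
FLAT_ROWS = [
    ("station A", -0.85, -46.60, "salinity", "psu", "2001-03-01", 28.5),
    ("station A", -0.85, -46.60, "water temperature", "deg C", "2001-03-01", 27.9),
    ("station B", -0.90, -46.70, "salinity", "psu", "2001-03-02", 31.0),
]


def build_schema(conn: sqlite3.Connection) -> None:
    """Create one table per entity; every non-key column depends on its key."""
    conn.executescript("""
        CREATE TABLE station (
            station_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            latitude REAL NOT NULL,
            longitude REAL NOT NULL
        );
        CREATE TABLE parameter (
            parameter_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE,
            unit TEXT NOT NULL
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            station_id INTEGER NOT NULL REFERENCES station(station_id),
            parameter_id INTEGER NOT NULL REFERENCES parameter(parameter_id),
            sampled_on TEXT NOT NULL,
            value REAL NOT NULL
        );
    """)


def load_flat_rows(conn: sqlite3.Connection, rows) -> None:
    """Store each station and parameter once, then link measurements by key."""
    for station, lat, lon, param, unit, date, value in rows:
        conn.execute(
            "INSERT OR IGNORE INTO station (name, latitude, longitude) VALUES (?, ?, ?)",
            (station, lat, lon),
        )
        conn.execute(
            "INSERT OR IGNORE INTO parameter (name, unit) VALUES (?, ?)",
            (param, unit),
        )
        conn.execute(
            "INSERT INTO measurement (station_id, parameter_id, sampled_on, value) "
            "SELECT s.station_id, p.parameter_id, ?, ? "
            "FROM station s, parameter p WHERE s.name = ? AND p.name = ?",
            (date, value, station, param),
        )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_schema(conn)
    load_flat_rows(conn, FLAT_ROWS)
    for table in ("station", "parameter", "measurement"):
        print(table, conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0])
```

In this layout a station's coordinates or a parameter's unit are stored once, so a correction has to be made in only one place.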

MAIS data are grouped in three main units: natural science, social science, and climate data (derived from climate data loggers) (Fig. 22.1). A separate publication unit links project papers and reports with respective research data. The primary data are described by metadata standardized following the “Global Change Master Directory” (http://gcmd.nasa.gov). MAIS metadata provide information on project, staff, method, parameter, equipment and sample type. Geographical latitudes and longitudes are assigned to each dataset, which allows a spatial synthesis and comparison of research results. An internal species key connected to each biological dataset enables data retrieval on different taxonomic levels. The species key follows the nomenclature provided by the Integrated Taxonomic Information System, ITIS (http://www.itis.gov).
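One way a species key can support retrieval on different taxonomic levels is a self-referencing taxon table queried with a recursive common table expression. The sketch below assumes such a design; the table layout, key values and abundance figures are hypothetical and are not taken from MAIS or ITIS. The taxon names are those mentioned in this chapter's reference list.

```python
import sqlite3

# Hypothetical species key: (taxon_id, parent_id, rank, name).
TAXA = [
    (1, None, "family", "Ocypodidae"),
    (2, 1, "genus", "Ucides"),
    (3, 2, "species", "Ucides cordatus"),
    (4, None, "family", "Tetraodontidae"),
    (5, 4, "genus", "Colomesus"),
    (6, 5, "species", "Colomesus psittacus"),
]

# Hypothetical biological records: (taxon_id, station, abundance).
RECORDS = [(3, "station A", 12), (3, "station B", 7), (6, "station A", 4)]


def build(conn: sqlite3.Connection) -> None:
    """Create and fill the taxon hierarchy and the record table."""
    conn.executescript("""
        CREATE TABLE taxon (
            taxon_id INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES taxon(taxon_id),
            rank TEXT NOT NULL,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE record (
            taxon_id INTEGER NOT NULL REFERENCES taxon(taxon_id),
            station TEXT NOT NULL,
            abundance INTEGER NOT NULL
        );
    """)
    conn.executemany("INSERT INTO taxon VALUES (?, ?, ?, ?)", TAXA)
    conn.executemany("INSERT INTO record VALUES (?, ?, ?)", RECORDS)


def records_for_taxon(conn: sqlite3.Connection, name: str):
    """Return records for the named taxon and all taxa below it."""
    return conn.execute(
        """
        WITH RECURSIVE subtree(taxon_id) AS (
            SELECT taxon_id FROM taxon WHERE name = ?
            UNION ALL
            SELECT t.taxon_id FROM taxon t
            JOIN subtree s ON t.parent_id = s.taxon_id
        )
        SELECT t.name, r.station, r.abundance
        FROM record r JOIN taxon t ON t.taxon_id = r.taxon_id
        WHERE r.taxon_id IN (SELECT taxon_id FROM subtree)
        ORDER BY r.station
        """,
        (name,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build(conn)
    print(records_for_taxon(conn, "Ocypodidae"))  # family-level query
    print(records_for_taxon(conn, "Colomesus"))   # genus-level query
```

The same query works at any rank because it walks down from whatever taxon is named.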

The heterogeneity of datasets generated by the MADAM project was a major challenge and required a flexible and dynamic data model. This was particularly true for the integration of fishery and socio-economic census data for which we had to denormalize the relational database to optimize performance and size of database queries and applications. For the same reason, most data on the mangrove crab Ucides cordatus were managed in a separate database unit, which is used for fisheries assessments (Araujo 2006; Chap. 19).

Fig. 22.1 Schematic design of MAIS database showing grouping of main data and metadata and accessibility through different graphical user interfaces

A multilevel menu-based, user-friendly web interface with applications for advanced data retrieval was developed and made accessible through the intranet of the Leibniz-Center of Tropical Marine Ecology in Bremen and Brazil. The MAIS graphical user interface consists of data retrieval forms and a graphical tool supported by MapServer to visualize geospatial data (Fig. 22.2). After login, different web-based forms provide for each data unit (climate data logger, publication, social and natural science data) access to all data tables of the relational system. The forms allow the user to retrieve individually queried subsets of research results. Queries are based on the combination of the following fields, which can be selected using drop-down lists of available values:

(1) project; (2) researcher; (3) parameter/parameter group; (4) site/station; (5) method; (6) taxon/group (with advanced search on different taxonomic levels); (7) time period (start, end); (8) author (for publication); (9) title and year of publication; (10) keywords (publication).
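A retrieval form of this kind typically turns the selected fields into a parameterized SQL query, skipping fields the user left empty. The sketch below illustrates that pattern only; the view name `dataset_view`, the column names and the subset of fields covered are assumptions, not the MAIS schema.

```python
# Hypothetical mapping from form fields to SQL conditions. Only listed
# fields are accepted, so user input never becomes part of the SQL text.
FIELD_TO_CONDITION = {
    "project": "project = ?",
    "researcher": "researcher = ?",
    "parameter": "parameter = ?",
    "site": "site = ?",
    "method": "method = ?",
    "start": "sampled_on >= ?",
    "end": "sampled_on <= ?",
}


def build_query(filters: dict):
    """Combine the filled-in form fields into one SQL statement plus parameters."""
    unknown = set(filters) - set(FIELD_TO_CONDITION)
    if unknown:
        raise ValueError(f"unsupported form fields: {sorted(unknown)}")
    clauses, params = [], []
    for field, condition in FIELD_TO_CONDITION.items():
        value = filters.get(field)
        if value is None or value == "":
            continue  # field left empty in the form
        clauses.append(condition)
        params.append(value)
    sql = "SELECT * FROM dataset_view"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


if __name__ == "__main__":
    sql, params = build_query({"site": "station A", "start": "2000-01-01", "method": ""})
    print(sql)
    print(params)
```

The returned pair can be passed directly to a database cursor's `execute` method, which keeps the values separate from the query text.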

The result pages are interactive in providing additional metadata, such as advanced project or parameter descriptions for every dataset on point-and-click in a pop-up window. A MapServer module allows the integration and visualization of geo-referenced data in thematic maps (Fig. 22.2). MapServer was developed by the University of Minnesota as an Open Source development environment for building spatially enabled Internet applications (http://mapserver.gis.umn.edu). With MapServer, we created thematic maps, which allowed the user to browse through GIS data stored in MAIS. The maps are fully dynamic and different layers can be added and zoomed (see example in Chap. 19, Fig. 19.3). However, the mapping tool provided by MapServer does not replace a full geographical information system and we still employed ArcView to conduct advanced geospatial analyses and used MapServer to publish the thematic maps.

Fig. 22.2 MAIS examples showing login page, main menu, data retrieval form and thematic map of study area (panels: MAIS Security Login; MAIS Menu & Description; Data Retrieval & Metadata Information; Visualisation in Dynamic Maps)

22.4 MAIS: A Tool for Supporting Interdisciplinary Research?

Archiving and managing scientific data in central databases is a time-consuming and costly endeavor. Professional scientific data management requires a clear separation of data archiving and integration from “data gathering” and analysis, which is the responsibility of the respective investigator. In fixed-term projects in particular, both tasks, database management and research, often compete for the same funding, and databases are managed at the expense of “real” science. This raises the question whether central project data management is really worthwhile in terms of costs and benefits.

Besides assuring long-term data storage, MAIS aimed at being flexible and dynamic to actively support interdisciplinary research within the running project. This was particularly important for bridging natural science and social science data. The flexible design of MAIS greatly facilitated the storage of heterogeneous datasets, including those originating from social science and fisheries biology. Instead of reorganising the data to fit into a predefined archive structure, we modified the project database to meet the requirements of specific scientific demands. This flexibility made MAIS a useful tool for supporting interdisciplinary science.

MAIS was successfully applied by MADAM researchers to analyze, synthesize and visualize project data (e.g., Glaser and Diele 2004, 2005; Goch et al. 2005; Krumme et al. 2005, 2007; Araujo 2006; Chap. 19). Close cooperation and regular communication between database administrators and researchers were a prerequisite for successful data management. Both sides benefited from this cooperation. While administrators needed to understand the structure of research data, staff, and in particular MSc and PhD students, benefited from professional advice in scientific data management and analysis.

Although MAIS had great potential for initiating and supporting interdisciplinary science, the overall number of project investigators who regularly used the project database was rather limited. The problems of acceptance of research databases are well known in the scientific database management community (e.g., French et al. 1990; Gray et al. 2005; Grobe and Diepenbroek 2006). The reasons for these problems are manifold, and combined efforts towards a better collaboration can be made on both the investigator and database management side. Several attempts were made to make MAIS more popular by introducing user-friendly and efficient applications. In the following, we discuss three major problem areas that we identified while working with MAIS. We also define the prerequisites needed to ensure successful scientific data management within research projects.

22.4.1 Quality Control and Improved Analysis Tools

Much effort in MAIS went into metadata quality and the design of a user-friendly GUI. Advanced metadata standards, which facilitate the exploration of existing data, as well as improved analysis and visualization tools, are key factors for successful scientific data management in the coming decade (Gray et al. 2005). The increasing heterogeneity of data in interdisciplinary projects puts even higher demands on the quality of metadata. Data must be self-describing and must follow international standards. Good metadata are central for data analysis, data visualization and data sharing among different disciplines (Gray et al. 2005). However, to recognize the benefits of central scientific data management, the user must also be able to retrieve, interchange, compare, analyze and visualize data efficiently. Unfortunately, available database applications are often insufficient and cumbersome and do not address the investigators’ specific needs. Failures in the design of a user-friendly man–machine interface are not only caused by technical limitations but also by a lack of communication between database managers and scientists. Whereas metadata standards in MAIS reached a high level, database tools for retrieving, analyzing and visualizing geo-referenced data were rather limited. More sophisticated applications could have significantly increased the viability of MAIS for project investigators, but their implementation would have surpassed the financial and technical scope of the MADAM project.

22.4.2 Appropriate Support and Funding

One of the biggest nontechnical barriers hampering the efficient operation of scientific databases is the low attention that researchers and funding bodies often pay to scientific data management. This results in an insecure funding situation, which is a major threat to long-term archives. Whereas the production of data is often well funded, its management is chronically underfunded (French et al. 1990). As a result, scientific databases are often managed part-time by regularly changing scientific staff and students, who are primarily interested in data production rather than in its management. MAIS was also affected by this lack of continuity.

22.4.3 Intellectual Property Rights and Better Incentives for Data Sharing

Protection of intellectual property rights and free exchange of information are subjects of an ongoing controversial debate within the scientific community (e.g., Dittert et al. 2001; Klump et al. 2006). In fact, researchers have little incentive to release their unpublished datasets into central data management systems. The fear of plagiarism combined with the lack of binding standards for citing database sources are major reasons that prevent researchers from publishing their data in central databases. However, the ability of investigators to share data is vital to the progress of interdisciplinary and integrative scientific research (Helly et al. 2003). In MAIS, we protected the property rights by offering an optional password system, which could be selected by researchers to protect their unpublished data. While most project members, in particular PhD students, felt confident with this solution, it had the major drawback that the number of password-protected datasets quickly exceeded the number of freely accessible ones. Once a dataset was protected by a password, it proved very difficult to obtain permission from authors to release it. The high number of password-protected datasets ultimately reduced the capability and usability of MAIS. Our experience underlined that, within a research project, binding rules for data transfer and release are essential for successful central data management. These regulations must include timetables and deadlines. Today, many funding agencies, research organizations or projects actively encourage data sharing and transfer to data centres (e.g., Natural Environment Research Council, http://www.nerc.ac.uk/research/sites/data/policy.asp, or US Geological Survey, http://www.usgs.gov/foia/). Internationally binding regulations, however, are still missing and many principal investigators still refuse to archive their data in appropriate databases (Dittert et al. 2001). As long as standards for data citations are missing and data collectors are not adequately credited in a way comparable to journal publication standards, internationally binding rules are not applicable.

22.5 Concluding Remarks

There is a growing need for central and integrative data management solutions in interdisciplinary research projects, where research data must be continuously available and freely exchangeable. The Mangrove Information System (MAIS) has proven to be a useful tool for promoting interdisciplinary research and data syntheses within the MADAM project. The GIS database MAIS managed heterogeneous datasets on biology, chemistry, geography and socioeconomics collected in north Brazil over a period of 10 years. Research data were accessible for the Brazilian and German project members through the Internet by a user-friendly graphical user interface.

Although MAIS has been successfully used for research data synthesis, the project database did not work to full capacity and some project investigators showed little interest in a further use of a central data management system. A general unwillingness of investigators to share data, coupled with a critical attitude towards databases, is a common phenomenon in the scientific community. Binding regulations, such as making data sharing part of the funding policy, are an effective way to improve data availability and to increase the quality of scientific databases. However, the introduction of such regulations (and sanctions) must be accompanied by efforts to give better incentives for scientists to release their data into central data management systems and to create the necessary metadata. Such key incentives include binding standards and regulations for the citation of archived datasets, which should be supported by technical mechanisms to track usage of archived data.

References

Alverson K, Eakin CM (2001) Making sure that the world’s palaeodata do not get buried. Nature 412:269

Araujo A (2006) Fishery statistics and commercialisation of the mangrove crab, Ucides cordatus (L.), in Braganca – Para-Brazil. PhD thesis, University of Bremen, Bremen

Baba S, Gordon C, Kainuma M, Ayivor JS, Dahdouh-Guebas F, Brown M (2004) The Global Mangrove Database and Information System (GLOMIS): present status and future trends. In: Vanden Berghe E, Costello MJ, Heip C, Levitus S, Pissierssens P (eds) Proceedings ‘The Colour of Ocean Data’: international symposium on oceanographic data and information management with special attention to biological data, Brussels, Belgium, November 25–27, 2002. IOC Workshop Report 188. UNESCO, Paris, pp 3–14

Berger U, Glaser MEL, Koch BP, Krause G, Lara R, Saint-Paul U, Schories D, Wolff M (1999) An integrated approach to mangrove dynamics and management. J Coast Conserv 5:125–134

Carleton CJ, Dahlgren RA, Tate KW (2005) A relational database for the monitoring and analysis of watershed hydrologic functions: I. Database design and pertinent queries. Comput Geosci 31:393–402

Codd EF (1990) The relational model for database management, version 2. Addison-Wesley, Reading

Diepenbroek M, Grobe H, Reinke M, Schindler U, Schlitzer R, Sieger R, Wefer G (2002) PANGAEA – an information system for environmental sciences. Comput Geosci 28:1201–1210

Dittert N, Diepenbroek M, Grobe H (2001) Scientific data must be made available to all. Nature 414:393

French JC, Jones AK, Pfaltz JL (1990) Scientific Database Management (Final Report). Report of the Invitational NSF Workshop on Scientific Database Management, Technical Report 90–21, Department of Computer Science, University of Virginia, Charlottesville, VA

Glaser M, Diele K (2004) Asymmetric Outcomes: Assessing the biological economic and social sustainability of a mangrove crab fishery, Ucides cordatus (Ocypodidae), in North Brazil. Ecol Econ 49:361–373

Glaser M, Diele K (2005) Resultados assimetricos: Avaliando aspectos centrais da sustentabilidade biologica, economica e social da pesca de caranguejo, Ucides cordatus (Ocypodidae). In: Glaser M, Cabral N, Ribeiro AL (eds) Gente, ambiente e pesquisa: Manejo transdisciplinar no manguezal, Belem, pp 51–68

Goch YG, Krumme U, Saint-Paul U, Zuanon JAS (2005) Seasonal and diurnal changes in the fish fauna composition of a mangrove lake in the Caete estuary, north Brazil. Amazonia 18:299–315

Gray J, Liu DT, Nieto-Santisteban M, Szalay AS, DeWitt D, Heber G (2005) Scientific data management in the coming decade. CTWatch Quarterly 1(1). http://www.ctwatch.org/quarterly/articles/2005/02/scientific-data-management/

Grobe H, Diepenbroek M (2006) Der Wert von Daten liegt in ihrer Nutzung. GMIT Geowissenschaftliche Mitteilungen 25:31–32

Helly J, Staudigel H, Koppers A (2003) Scalable models of data sharing in the earth sciences. Geochem Geophys Geosyst 4:1010

Klump J, Bertelmann R, Brase J, Diepenbroek M, Grobe H, Hock H, Lautenschlager M, Schindler U, Sens I, Wächter J (2006) Data publication in the Open Access initiative. Data Sci J 5:79–83

Können GP, Koek FB (2005) Description of the CLIWOC database. Clim Change 73:117–130

Koch BP (1997) Konzeption und Abgleich der Projektdatenbank für das ökosystemare Forschungsprojekt “Mangrove Dynamics and Management”. Dipl thesis, University of Oldenburg, Oldenburg

Krumme U, Keuthen H, Barletta M, Saint-Paul U, Villwock W (2005) Contribution to the feeding ecology of predatory wingfin anchovy Pterengraulis atherinoides (L.) in north Brazilian mangrove creeks. J Appl Ichthyol 21:469–477

Krumme U, Keuthen H, Saint-Paul U, Villwock W (2007) Contribution to the feeding ecology of the banded puffer fish Colomesus psittacus (Tetraodontidae) in north Brazilian mangrove creeks. Braz J Biol 67:383–392

Lang SM, Lockemann PC (1995) Datenbankeinsatz. Springer, Heidelberg

Mounsey H, Tomlinson RF (eds) (1988) Building databases for global science. Taylor & Francis, London

Viescas JL (2004) Microsoft access 2003. Inside out. Microsoft, Redmond, WA
