Integrating marine ecology data for scientific analysis and resource management: A community database prototype (Hosted by NCEAS)
- O. J. Reichman
- Matthew B. Jones
- Mark P. Schildhauer
NSF Proposal for this project.
Graduate Student: 24th February to 16th June 2003
Data integration is increasingly recognized as a difficult but essential precursor to synthetic ecological analyses. The inability to easily integrate data becomes critical when the data also have major practical implications for policy planning or decision-making. In this proposal, we describe a prototype community database repository to accommodate a variety of heterogeneously structured marine ecological data sets. This internet-accessible database will provide a valuable data resource for investigators researching long-term and large-scale ecological processes in the marine environment.
The data will come from a multi-agency consortium of marine ecologists, analysts and data managers who are keen to make their data readily available in a common framework. These data are from academic, government, foundation, and private sources, and deal with the rocky intertidal zone at 61 sites throughout central and southern California. Aside from their intrinsic scientific interest, these data sets are critical in tracking changes in biological communities and invaluable for guiding and informing the preservation and development of California's nearshore environment.
We will use structured metadata to support querying of, and access to, the contents of the database. This will be achieved by implementing an XML-based Document Type Definition (DTD) designed specifically for ecological data, called Ecological Metadata Language (EML). This content specification is based on recommendations from several committees of ecological researchers and technologists. We will also provide conversion tools that enable researchers to convert EML to HTML, or to transform EML into two other useful metadata formats: Dublin Core and the FGDC's Content Standard for Digital Geospatial Metadata (CSDGM).
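As a rough illustration of what such a conversion tool might do, the sketch below maps a simplified, EML-like record onto Dublin Core elements using Python's standard library. The sample record, its element names, and the `EML_TO_DC` mapping are assumptions made for this example; they are not taken from the EML specification or from the project's own tools.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, simplified EML-like record. Real EML documents are richer,
# and their element names depend on the EML version in use.
EML_SAMPLE = """
<eml>
  <dataset>
    <title>Rocky intertidal species counts</title>
    <creator>Example Monitoring Group</creator>
    <abstract>Counts of intertidal species at a monitoring site.</abstract>
    <keyword>rocky intertidal</keyword>
    <keyword>California</keyword>
  </dataset>
</eml>
"""

# Assumed mapping from the simplified element names to Dublin Core elements.
EML_TO_DC = {
    "title": "title",
    "creator": "creator",
    "abstract": "description",
    "keyword": "subject",
}


def eml_to_dublin_core(eml_text: str) -> ET.Element:
    """Build a Dublin Core record from the mapped children of <dataset>."""
    dataset = ET.fromstring(eml_text).find("dataset")
    if dataset is None:
        raise ValueError("no <dataset> element found")
    record = ET.Element("metadata")
    for child in dataset:
        dc_name = EML_TO_DC.get(child.tag)
        if dc_name is None:
            continue  # skip elements with no assumed Dublin Core equivalent
        dc_elem = ET.SubElement(record, f"{{{DC_NS}}}{dc_name}")
        dc_elem.text = (child.text or "").strip()
    return record


if __name__ == "__main__":
    ET.register_namespace("dc", DC_NS)  # serialize with the usual dc: prefix
    print(ET.tostring(eml_to_dublin_core(EML_SAMPLE), encoding="unicode"))
```

A table-driven mapping keeps the crosswalk easy to review and extend. A production converter would more likely be written as an XSLT stylesheet or driven by a published crosswalk between the two standards.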
The prototype database will be developed on a UNIX system and will be network-accessible via the Web. We will strive to base all tools and services on open standards and to avoid hardware or operating system dependencies.
We will also develop software tools that enable scientists to effectively document their own data for contribution to the prototype repository. We will develop some basic data quality control processing routines that will cross check the data contents based on the documentation provided in the metadata. We will also develop extensions to the general ecological content specification that will enhance the utility of these tools for the marine ecology discipline. Our methodology, however, will be general and extensible, permitting future extensions to be developed in support of other specialties.
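The metadata-driven cross check described above can be sketched as follows: attribute descriptions taken from the metadata (here a plain dictionary) are used to validate each value in a data table. The column names, types, ranges, and sample rows are all invented for illustration and do not describe the project's actual data or routines.

```python
import csv
import io

# Hypothetical attribute descriptions, as they might be read from a data
# set's metadata: an expected type plus optional minimum and maximum values.
ATTRIBUTES = {
    "site_id": {"type": str},
    "species": {"type": str},
    "count": {"type": int, "min": 0},
    "tide_height_m": {"type": float, "min": -1.0, "max": 3.0},
}

# Invented sample data; line 1 is the header row.
SAMPLE = (
    "site_id,species,count,tide_height_m\n"
    "S01,Mytilus californianus,12,0.4\n"
    "S02,Pisaster ochraceus,-3,0.2\n"
    "S03,Anthopleura elegantissima,seven,5.0\n"
)


def check_rows(csv_text, attributes):
    """Return (line_number, column, reason) for each value that fails a check."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in attributes if name not in header]
    if missing:
        return [(1, name, "column missing from data") for name in missing]

    problems = []
    for line_no, row in enumerate(reader, start=2):
        for name, spec in attributes.items():
            raw = row[name]
            if raw is None or raw == "":
                problems.append((line_no, name, "missing value"))
                continue
            try:
                value = spec["type"](raw)
            except ValueError:
                problems.append((line_no, name, f"not a valid {spec['type'].__name__}"))
                continue
            if "min" in spec and value < spec["min"]:
                problems.append((line_no, name, f"below minimum {spec['min']}"))
            if "max" in spec and value > spec["max"]:
                problems.append((line_no, name, f"above maximum {spec['max']}"))
    return problems


if __name__ == "__main__":
    for line_no, column, reason in check_rows(SAMPLE, ATTRIBUTES):
        print(f"line {line_no}, column {column}: {reason}")
```

Because the rules live in the metadata rather than in the code, the same routine could in principle be reused for any data set whose attributes are documented in this way.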
We envision the database repository as the basis for a growing resource that will be invaluable for coastal researchers and policy makers. The insights gained from this information resource will represent novel theoretical and practical advances for ecology as a whole.
This material is based upon work supported by the National Science Foundation under Grant No. DBI99-04777. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).