THE SIGNIFICANCE OF SPATIAL DATA UNCERTAINTY FOR ECOLOGICAL APPLICATIONS

Spatial distributions of environmental variables are important for a variety of ecological modeling applications. Geographically distributed ecological phenomena may be modeled digitally as spatially referenced points, lines, and areas with associated variables as attributes. If the phenomenon is spatially continuous, the database objects that represent it are the result of the particular field model employed. For this class of models, a single value is defined at every location in the data set. This project is primarily concerned with field models.

No spatial data set is a perfect representation of reality. Fidelity is lost in the specification, acquisition, and production of data; the exact characteristics of this information loss usually cannot be known. Furthermore, data collected at a relatively coarse spatial scale may lack sufficient information to represent an ecological process occurring at a finer scale. Uncertainty therefore exists regarding the difference between the spatial data set and reality. The question arises: how does this uncertainty in spatial data affect the results of an ecological model application?

Hypothetical example: A habitat model exists to identify areas on the landscape that can support the species of interest, and to rate the quality of those areas for the species. The underlying assumption in applying the model is that the input spatial data are of high enough quality to capture landscape pattern at the process scale of the species. If only coarse (low-quality) spatial data are available, the results of the model are uncertain because of limitations in data quality (resolution, extent, location, definition, measurement, etc.).

FRAMEWORK FOR HANDLING UNCERTAINTY

Uncertainty in the results of models can be quantified using one or several approaches identified in the following framework. Uncertainty exists in both the model specification and the data used in the model application. While analytically deriving the uncertainty for a particular spatial data set might be ideal, in practice this is frequently not possible. The methods offered here rely upon developing probability models for spatial uncertainty and simulating alternative, equally probable realizations.

A. Model Uncertainty
Understanding of the role of environmental factors in any particular ecological process is incomplete. There is uncertainty about the values used for model parameters, as well as about whether the parameters employed completely account for the ecological process of interest (animal population survival, biogeochemical fluxes, fire or insect dispersion). One can use sensitivity analysis to understand the potential contribution of each of these model components to the uncertainty in the model output. For example, if changing a particular model parameter by small amounts results in a large fluctuation in the result, the model is sensitive to that parameter.
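The sensitivity test described above can be sketched as a one-at-a-time perturbation. This is only an illustration under stated assumptions: the habitat index, its two parameters, and all numeric values are hypothetical and do not come from any particular ecological model.

```python
# Hypothetical toy model: every name and value here is illustrative only.
def habitat_suitability(canopy_cover, distance_to_water_km, params):
    """Toy index in [0, 1]: rises with canopy cover, falls with distance to water."""
    cover_term = min(1.0, canopy_cover / params["cover_saturation"])
    water_term = max(0.0, 1.0 - distance_to_water_km / params["max_water_distance_km"])
    return cover_term * water_term


def one_at_a_time_sensitivity(model, inputs, params, rel_step=0.05):
    """Relative change in output per relative change in each parameter.

    Assumes the baseline output is nonzero.
    """
    baseline = model(*inputs, params)
    sensitivities = {}
    for name, value in params.items():
        perturbed = dict(params)
        perturbed[name] = value * (1.0 + rel_step)  # nudge one parameter by 5%
        change = model(*inputs, perturbed) - baseline
        sensitivities[name] = (change / baseline) / rel_step
    return baseline, sensitivities


params = {"cover_saturation": 0.8, "max_water_distance_km": 2.0}
baseline, sens = one_at_a_time_sensitivity(habitat_suitability, (0.4, 1.5), params)

# List parameters from most to least influential at this input location.
for name, s in sorted(sens.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {s:+.2f}")
```

A parameter whose value has a large magnitude here is one to which the toy model is sensitive at this particular input. A fuller analysis would repeat the test across the range of inputs found on the landscape.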

B. Data Error
A data error approach attempts to identify the elements in the development of spatial data that lead to error. Spatial uncertainty is seen as the outcome of a variety of factors, including differences between:
observers;
measurements; and
definitions.
One can seek to describe error for some portion of each of these input factors.

C. Data--Underdescribed Spatial Variability (i.e., what's missing in a coarse map)
When a variable is sampled too sparsely to capture its spatial variation, information about the actual distribution is lost. This missing information may affect the ability of the data set to adequately characterize the variable for a particular ecological application. To model this properly, one must obtain an understanding of the spatial structure of the variability.
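One common way to summarize spatial structure, though the text does not name a specific method, is the empirical semivariogram, which measures how dissimilar values become as the distance between them grows. The sketch below assumes a regularly spaced, densely sampled transect. The transect is synthetic and the function name is hypothetical.

```python
import random


def empirical_semivariogram(values, max_lag):
    """Semivariance for integer lags 1..max_lag along a regularly spaced transect."""
    gamma = {}
    for lag in range(1, max_lag + 1):
        sq_diffs = [(values[i] - values[i + lag]) ** 2
                    for i in range(len(values) - lag)]
        gamma[lag] = sum(sq_diffs) / (2 * len(sq_diffs))
    return gamma


# Synthetic stand-in for a dense field transect: a moving average of white
# noise, which is spatially autocorrelated over roughly `window` samples.
rng = random.Random(42)
window = 20
noise = [rng.gauss(0.0, 1.0) for _ in range(500 + window)]
transect = [sum(noise[i:i + window]) / window for i in range(500)]

gamma = empirical_semivariogram(transect, max_lag=30)
for lag in (1, 5, 10, 20, 30):
    print(f"lag {lag:2d}: semivariance {gamma[lag]:.4f}")
```

Semivariance that is low at short lags and levels off at longer ones suggests the distance over which the variable is correlated. A coarse map sampled at wider spacing than that distance would miss this structure.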

D. Data--Uncertainty of Measured Position
Location is a critical attribute of spatial data. Typically, position is defined in at least a two-dimensional coordinate system. The measurement of location cannot be absolutely precise, and the positional accuracy may have a spatial component. Uncertainty in measured position may play a role in ecological model outcomes.
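As a rough illustration of how positional uncertainty can change an outcome, the sketch below perturbs a point location many times and records how often it falls inside a habitat patch. It assumes independent Gaussian error in each coordinate, with no spatial component. The patch geometry, error magnitude, and function names are all hypothetical.

```python
import random


def perturb_point(x, y, sigma, rng):
    """Add independent Gaussian error (standard deviation sigma) to each coordinate."""
    return x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)


def in_habitat_patch(x, y):
    """Hypothetical process: is the point inside a 100 m x 100 m square patch?"""
    return 0.0 <= x <= 100.0 and 0.0 <= y <= 100.0


def membership_probability(x, y, sigma, n_runs=2000, seed=0):
    """Fraction of perturbed positions that land inside the patch."""
    rng = random.Random(seed)
    hits = sum(in_habitat_patch(*perturb_point(x, y, sigma, rng))
               for _ in range(n_runs))
    return hits / n_runs


# Same assumed positional error (10 m) for a point deep inside the patch
# and for a point 2 m from its eastern edge.
interior = membership_probability(50.0, 50.0, sigma=10.0)
edge = membership_probability(98.0, 50.0, sigma=10.0)
print(f"interior point: {interior:.2f}, edge point: {edge:.2f}")
```

Under these assumptions, the interior point is almost always classified as inside the patch, while the edge point's classification is close to a coin toss. Positional error matters most near boundaries.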

Techniques have been developed that can assist several of these approaches. One important concept is that of error propagation. One uses statistical parameters to generate a number of spatial realizations of the data. A process is run on each of these spatial realizations. The process could be the ecological model itself, a GIS algorithm or set of processing commands, or a set of pattern metrics. By studying the variation in the process outcomes across the realizations, one may derive an understanding of the variation due to uncertainty in the data or a model parameter. This general technique is useful for all of the approaches listed earlier. For approach A, repeated model runs using a variety of parameter settings will generate a range of outcomes; outcome variability identifies the range of model uncertainty. For approach B, parameters for generating realizations may come from measures of input error. Approach C uses higher-density samples from a subset that is characteristic of the entire spatial domain of interest. This subset could be "maplets", transects, plots, or points. For approach D, point locations may be perturbed for multiple runs of the model.
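The error propagation loop described above might look like the following minimal sketch. It assumes independent Gaussian error in every cell, which ignores the spatial autocorrelation that real data error usually has. The grid, the error magnitude, and the "process" (fraction of cells above a threshold) are hypothetical stand-ins.

```python
import random
import statistics


def make_realization(grid, error_sd, rng):
    """One realization: the grid plus independent Gaussian error per cell."""
    return [[v + rng.gauss(0.0, error_sd) for v in row] for row in grid]


def suitable_fraction(grid, threshold):
    """Stand-in process: fraction of cells at or above a threshold."""
    cells = [v for row in grid for v in row]
    return sum(v >= threshold for v in cells) / len(cells)


def propagate(grid, error_sd, process, n_realizations=500, seed=1):
    """Run the process on each realization and collect the outcomes."""
    rng = random.Random(seed)
    return [process(make_realization(grid, error_sd, rng))
            for _ in range(n_realizations)]


# Hypothetical 20 x 20 surface whose values rise from west to east.
grid = [[100.0 + 5.0 * col for col in range(20)] for _ in range(20)]

nominal = suitable_fraction(grid, 150.0)  # result if the data were error-free
outcomes = propagate(grid, error_sd=10.0,
                     process=lambda g: suitable_fraction(g, 150.0))
mean = statistics.mean(outcomes)
sd = statistics.stdev(outcomes)
print(f"nominal {nominal:.3f}, mean {mean:.3f}, sd {sd:.3f}")
```

The spread of the outcomes indicates how much the process result varies under the assumed error model. A gap between the nominal result and the mean across realizations would suggest that data error biases the result as well as adding noise.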

Differences between the spatial structure of the data and the structure of the reality it represents must be understood in order to adequately model the impact of spatial uncertainty on ecological model applications. The key to the success of this project is to underscore the importance of this concept to the ecological community and to identify research approaches to better understand and measure the role of spatial uncertainty in ecological modeling.



Ashton Shortridge is a Ph.D. student in the Department of Geography at the University of California at Santa Barbara. He is a researcher at the National Center for Geographic Information and Analysis at UCSB, where his focus has been on identifying, modeling, and reporting spatial data uncertainty. http://www.ncgia.ucsb.edu/~ashton/home.html