Sampling Vector Polygon Data with a Point Grid Overlay

This example uses the R sp and PBSmapping packages to extract a regularly-spaced grid of attribute samples from the Bailey's Ecoregions map for the continental United States.

Bailey's Ecoregions delineate areas of common climactic and vegetation characteristics across the continent. They are available as a digital spatial dataset (various formats) from from the United States National Atlas data portal.

Download data and R Code for this example

Project Requirement:

An NCEAS scientist required a regularly-spaced grid of Bailey's Ecoregions measurements at different sampling frequencies. A straightforward geospatial processing task,can be completed using geospatial processing methods in the spand PBSMapping packages.

Note: we use the PBSMapping package in this example because they provide a straightforward interface to various geospatial processing (including map plotting) operations. There is a trade-off: incoming geospatial files must be converted into PBSMapping-specific data objects (e.g., the PolySet) prior to processing, and then, when necessary, converted to the desired output format prior to creating output files.

This Use Case demonstrates the reading, displaying, and sampling of the Bailey's Ecoregions file and the spatial data sets derived from this map.

1) Read Bailey's Ecoregions Shape File. Extract and display attribute subset

The Bailey's Ecoregion polygons have nine descriptive attributes. In this example, example, we extract/sample thee of them using the sp package overlay() method, and two point grids that we will create in the next step of this example. On the right is the R script, on the left is a portion of the reduced @data slot of the Bailey's Ecoregions SpatialPolygonsDataFrame object.

Bailey's Ecoregion Polygons: USA

Baileys EcoRegion map


'data.frame': 1690 obs. of 3 variables:
 $ ECO_US_ : num 1246 1247 1248 1249 1250 ...
 $ ECO_US_ID: num 135 136 1402 15 1403 ...
 $ ECOCODE : Factor w/ 164 levels: 
 "-212A","-212B",..: 1 116 51...

 ECO_US_ ECO_US_ID ECOCODE
0 1246 135 -212A
1 1247 136 M212A
2 1248 1402 -242A
3 1249 15 -242A
4 1250 1403 -242A
5 1251 14 M242B
6 1252 1404 -242A
7 1253 1405 -242A
8 1254 1 M242A
9 1255 1406 -242A
10 1256 1407 -242A
 

   library(sp)
   library(rgdal)   
   library(PBSmapping)
   library(maptools)
                                                                                               
# Read ESRI Shape File into SpatialPolygonsDataFrame
# rgdal readOGR() method gets projection information.

   EcoRegion <- readOGR("BaileysEcoRegionUsaGP.shp",
                        "BaileysEcoRegionUsaGP")
   theProjection <- proj4string(EcoRegion)

# Make copy of Spatial Polygon file's Data slot,
# including ONLY columns 3 through 5, corresponding
# to ECO_US_ and ECO_US_ID, and  ECOCODE identifiers.
# Then, create a copy of the Spatial Polygon file,
# using the data subset.

   NewData <- EcoRegion@data[,3:5]
   EcoRegionTrunc <- EcoRegion
   EcoRegionTrunc@data <- NewData

# Display the structure of NewData,
# and then the first 10 rows of the table.

   str(NewData)
   print(NewData[1:10,])
    
# Plot Bailey's Ecoregions polygon, with PBSmapping methods

   EcoRegionPS <- SpatialPolygons2PolySet(EcoRegion)                                
   plotPolys(EcoRegionPS,xlab="Deg W Longitude",
             ylab="Deg N Latitude",
             main="Bailey's Ecoregions for USA",
             projection="LL")                

2) Create the Sampling Point Grids: Two spatial resolutions

Next, use sp package GridTopology() and SpatialGrid() method to generate two sampling grids: with two degree and one degree spatial resolution. Images on the left are (top to bottom): the one-degree grid points, inside (orange) and outside (green) of the USA boundary; one degree grid (blue) and two-degree grid (red) inside the USA boundary.

Baileys EcoRegions In / Out of USA

Baileys EcoRegions One / Two Deg grid


     
# Generate a point grid spanning the Baileys Ecoregion polygons:
# create grids at one-degree and two-degree spatial resolutions

   boundingVals <- EcoRegionTrunc@bbox
   deltaLong <- as.integer((boundingVals[1,2] -
                        boundingVals[1,1]) + 1.5)
   deltaLat <- as.integer((boundingVals[2,2] - 
                      boundingVals[2,1]) + 1.5)

# First, the one-degree grid

   gridRes <- 1.0
   gridSizeX <- deltaLong / gridRes
   gridSizeY <- deltaLat / gridRes

# GridTopology object is basis for sampling grid

   GridTopoOneDeg <- GridTopology(boundingVals[,1],
                                  c(gridRes,gridRes),
                                  c(gridSizeX,gridSizeY))

# Create the sampling point grid...
# Use sp / overlay() to extract Bailey's values at grid points. 
# Need to promote the point grid to a Spatial Points object.

   SampGridOneDeg <- SpatialGrid(GridTopoOneDeg,
                                 proj4string = CRS(theProjection))
   SampPointsOneDeg <- as(as(SampGridOneDeg,"SpatialPixels"),
                                             "SpatialPoints")
   
# Create the two-degree grid by changing one parameter....

   gridRes <- 2.0

   gridSizeX <- deltaLong / gridRes
   gridSizeY <- deltaLat / gridRes
   GridTopoTwoDeg <- GridTopology(boundingVals[,1],
                                 c(gridRes,gridRes),
                                 c(gridSizeX,gridSizeY))
   SampGridTwoDeg <- SpatialGrid(GridTopoTwoDeg,
                                 proj4string = CRS(theProjection))
   SampPointsTwoDeg <- as(as(SampGridTwoDeg,"
                SpatialPixels"),"SpatialPoints")   

3) Perform point sampling / overlay operations. 'Clip' grids to USA Border


# It is straightforward to select ONLY the grid points falling within the
# United States border. Use sp overlay(), which returns a data frame in which 
# attributes for points OUTSIDE of the bounding (Bailey's) polygon have NA values
 
   OneDegPointsInUSA <- overlay(EcoRegionTrunc,SampPointsOneDeg)
   TwoDegPointsInUSA <- overlay(EcoRegionTrunc,SampPointsTwoDeg)   

# Filter out the 'outside USA border' points,
# and add spatial coordinates to create SpatialPointsDataFrame
  
   flags = !is.na(OneDegPointsInUSA[,"ECO_US_"]) 
   OneDegPtsInUSA <- OneDegPointsInUSA[flags,]
   OneDegPtsInUSA_SP = cbind(OneDegPtsInUSA,
                            SampPointsOneDeg[flags,]@coords)
   coordinates(OneDegPtsInUSA_SP) <- cbind(OneDegPtsInUSA_SP[,4],
                                           OneDegPtsInUSA_SP[,5])                                     
   
# promote the data frame to SpatialPointsDataFrame
# by adding the spatial coordinates of corresponding points.   

   flags = !is.na(TwoDegPointsInUSA[,"ECO_US_"]) 
   TwoDegPtsInUSA <- TwoDegPointsInUSA[flags,]  
   TwoDegPtsInUSA_SP = cbind(TwoDegPtsInUSA,
                             SampPointsTwoDeg[flags,]@coords)
   coordinates(TwoDegPtsInUSA_SP) <- cbind(TwoDegPtsInUSA_SP[,4],
                                           TwoDegPtsInUSA_SP[,5])                                 
  
# Plot the one-degree point sets against Bailey's Ecoregion background
# To display the points on a plot under PBSMapping, convert them to Event Lists

   NumPts <- length(SampPointsOneDeg@coords[,1])
   OneDegEventsAll <- as.EventData(data.frame(EID=1:NumPts,
                                   X=SampPointsOneDeg@coords[,1],
                                   Y=SampPointsOneDeg@coords[,2],
                                   projection = "LL"))
  
   NumPts <- length(OneDegPtsInUSA_SP@coords[,1])
   OneDegEventsUSA <- as.EventData(data.frame(EID=1:NumPts,
                                   X=OneDegPtsInUSA_SP@coords[,1],
                                   Y=OneDegPtsInUSA_SP@coords[,2],
                                   projection = "LL"))
                                   
   NumPts <- length(TwoDegPtsInUSA_SP@coords[,1])
   TwoDegEventsUSA <- as.EventData(data.frame(EID=1:NumPts,
                                   X=TwoDegPtsInUSA_SP@coords[,1],
                                   Y=TwoDegPtsInUSA_SP@coords[,2],
                                   projection = "LL"))                                
 

4) Plot clipped grids, then write them as Point Shape Files

Sampling Result Spreadsheet


#
Formal class 'SpatialPointsDataFrame'
 [package "sp"] with 5 slots
 ..@ data :'data.frame': 
      206 obs. of 4 variables:
 .. ..$ EID :
     num [1:206] 3 4 5 6 ...
 .. ..$ X : 
     num [1:206] -121 -119 -117 -115  ...
 .. ..$ Y :
     num [1:206] 48.5 48.5 48.5 48.5  ...
 .. ..$ ecoCode: Factor w/ 114 levels:
 "-212F","-212H",..: 88 111 ...
 .. ..- attr(*, "data_types")= chr [1:4]
 "N" "N" "N" "C"
 ..@ coords.nrs : num(0) 
 ..@ coords : num [1:206, 1:2]
 -121 -119 -117 -115 -113 ...
 .. ..- attr(*, "dimnames")=List of 2
 .. .. ..$ : NULL
 .. .. ..$ : chr [1:2]:
             "coords.x1" "coords.x2"
 ..@ bbox : num [1:2, 1:2]:
      -122.8 26.5 -68.8 48.5
 .. ..- attr(*, "dimnames")=List of 2
 .. .. ..$ : chr [1:2]:
    "coords.x1" "coords.x2"
 .. .. ..$ : chr [1:2] "min" "max"
 ..@ proj4string:Formal class 'CRS'
 [package "sp"] with 1 slots
 ..@ projargs: chr NA
#
# Note that EventData class
 does not retain the point attributes
#
Classes ‘EventData’ and 'data.frame':
 206 obs. of 3 variables:
 $ EID: int[1:206] 1 2 3 4 
 $ X: num[1:206]: -121 -119 -117 -115 ...
 $ Y: num[1:206]: 48.5 48.5 48.5 48.5 ...

# Use PBSmapping plotPolys/addPoints to create plots:        
# Plot both  of one-degree point grid over base map
# Then, add thee two-degree grid to the map.

   dev.set(1)                                
   plotPolys(EcoRegionPS,xlab="Deg W Longitude",
             ylab="Deg N Latitude",
             main="One Deg Grid: All (Green) / In USA (Orange)",
             projection="LL")             
   XL1 = as.numeric(range(OneDegEventsAll$X))
   YL1 = as.numeric(range(OneDegEventsAll$Y))
   addPoints(OneDegEventsAll,xlim=XL1, ylim=YL1,
             col="green",pch=20,cex=.75)         
   
   XL1 = as.numeric(range(OneDegEventsUSA$X))
   YL1 = as.numeric(range(OneDegEventsUSA$Y))
   addPoints(OneDegEventsUSA,xlim=XL1, ylim=YL1,
             col="orange",pch=20,cex=.75)      

   message("hit key to see the one and two degree grids")
      
# Create a second plot with the one and two degree grids
# Add the two-degree point grid

   dev.set(1)
   plotPolys(EcoRegionPS,xlab="Deg W Longitude",
             ylab="Deg N Latitude",
             main="Grids: One Deg (Blue) / Two Deg (Red)",
             projection="LL")             

   addPoints(OneDegEventsUSA,xlim=XL1, ylim=YL1,
             col="blue",pch=20,cex=.75)
   message("hit key to overlay the two-degree grid..")
   browser()

   XL2 = as.numeric(range(TwoDegEventsUSA$X))
   YL2 = as.numeric(range(TwoDegEventsUSA$Y))
   addPoints(TwoDegEventsUSA,xlim=XL2, ylim=YL2,
             col="red",pch=20,cex=0.75)   
   message("hit key to write sample grids to disk..")
   browser()

# Write final results to ESRI Point Shape File.
# unlike the maptools-based writeXXXShape() methods,
# rgdal-based writeOGR writes the spatial object's 
# projection information (.prj Shape File component)
   writeOGR(OneDegPtsInUSA_SP,"BaileyEcoRegionGridOneDeg.shp",
           "BaileyEcoRegionGridOneDeg",driver="ESRI Shapefile")
   writeOGR(TwoDegPtsInUSA_SP,"BaileyEcoRegionGridTwoDeg.shp",
           "BaileyEcoRegionGridTwoDeg",driver="ESRI Shapefile")   

Learning More:

The USFS Ecoregions Center: US Forest Service Natural Resource Assessment and Analysis Ecoregions Center web site

The CRAN (R) Task View summarizes the R Analytical Environment's spatial data analysis and management tools..

This ESRI White Paper describes the Shape File format.

This Web Site describes the standard .dbf format which is a component of the ESRI Shape File.

Point of Contact for this Use Case: reeves [at] nceas [dot] ucsb [dot] edu (Rick Reeves), NCEAS Scientific Programmer
This Use Case was updated May, 2010.