Generate Region Of Interest (ROI) Polygons for Point Data With R

This case presents an R script that: 1) Reads and displays in an outline map ESRI point and polygon Shape Files for a three-county Central California region; 2) demonstrates two different R programming methods for generating the Convex Hulls bounding the polygons 'belonging' to each county, 3) calculates the areas of each county and Convex Hull polygon.

Getting Started:

A Convex Hull polygon is defined as the smallest convex polygon bounding all of the members of a point set. Ecologists often calculate convex hull polygons from point data measurements (such as species foraging data) as estimates of feeding ranges and other population attributes.

As with many R programming tasks, there is more than one way (and associated set of R packages) to calculate Convex Hulls. In this use case, we will demonstrate two methods: from the sp and maptools packages to read and convert the incoming shape files into plot-friendly Polygon data objects. Routines from the splancs package determine the polygons containing each point (the well-defined 'point-in-polygon' problem).

This example uses the following R packages:

  • The base grDevices package provides base graphics services to other R packages, including point, line, and polygon drawing and color and pattern control.
  • PBSMapping provides two-dimensional spatial analysis, transformation, and plotting functions typically found in Geographic Information Systems. This package was developed by scientists conducting statistical fisheries research at the Pacific Biological Station in Nanaimo, British Columbia.
  • splancs provides functions for spatial and space-time point pattern analysis of polygons.
  • sp is a critical base package for spatial operations in R. It provides definitions for basic spatial classes (points, lines, polygons, pixels, and grids) in an attempt to unify the way R packages represent and manage these sorts of data. It also includes several core functions for creating and manipulating these data structures. The sp package is automatically loaded when the splancs package is used. 

If you have not already added these packages to your local R installation, you can do so using the Packages/Install Packages option in the R GUI toolbar (MS Windows XP users), or using the following statement at the R command line:

install.packages(c("maptools","splancs","PBSmapping"))

# sp loaded automatically with maptools or splancs

For tips on installing R on the Ubuntu Linux platform widely used at NCEAS visit the link:

To run thesample code discussed here, download and expand this .zip archive: (ConvexHullRDemoFiles.zip). This file includes the R script and input files.

Working through the example program: CalculateConvexHull.r

The ConvexHullDemo() function:

  1. Reads and converts an ESRI polygon Shape File and an ASCII .csv format point location file into graphics-compatible formats,
  2. Assigns the point locations to their bounding polygons using a Point-in-Polygon algorithm
  3. Generates convex hull polygons around several groups of points
  4. Displays the input and calculated spatial features (e.g., convex hulls) using a simple map graphic.
  5. ConvexHullDemo() demonstrates two different R programming approaches for geospatial data processing and display: routines from the maptools library (combined with methods from the base grDisplay graphics library , and routines from the PBSmapping package.

    First, configure the R libraries, read the spatial data files into R geospatial data objects, and plot them using base R graphics tools

    ConvexHullDemo <- function()
    {
    #
    # Load the required libraries:
    # 
       library(maptools)   
       library(PBSmapping) 
       library(splancs)
    #
    #debug(ConvexHullDemo)
    #browser()
    #
    # read in two spatial datasets: Three California county polygons
    #
       TriCounty <-readShapeSpatial("TriCountyPolyDD.shp")    
       sites <- read.csv("ThreeCountyPoints.csv") 
    # 
    # Create a 1-dim matrix of unique county keys
    # FIPS code is the unique key for counties 
    # Note for later: The first key,"079", corresponts to SLO County
    #
       indexes = as.matrix(TriCounty$COUNTYFIPS)
    #
    # get number of points in the 'sites' data set
    # 
       len = length(sites$LatDD) 
    # 
    # promote the point data frame to an SpatialPointsDataFrame object 
    # so that we can add new attributes to the data frame.
    # 
       sites2 = sites
       coordinates(sites2) = c("LonDD","LatDD") 
    #
    # get the lat/lon coords for the points
    #
       xy=coordinates(sites2)
    #
    # append an attribute column to the point SPDF to hold the covering polygon
    #   
       ar = array(0,len)
       sites2$inpoly = as.vector(ar)  
       TcPolys = Map2poly(TriCounty)
    # 
    # Lets have a look at the data: plot all the polygons and then all the points 
    # 
       plot(TcPolys)
       points(xy,pch="*") 
    

    Here are the baseline counties and their point locations

    Three-County Polygon View

    Next, compute Point-in-Polygon, convex hull, and convex hull area for San Luis Obispo County:

    # Loop through all of the individual polygons in the shape file. 
    
    # (extracted from the incoming point shapefile) that fall within the 
    # boundaries of the current polygon. 
    
    # Next, perform the convex hull calculations: First, demonstrate the chull() method 
    # using the single San Luis Obispo County polygon (and the points that it bounds).
    # The San Luis Obispo County points have the greatest spatial extent. SLO county
    # is the first polygon in TcPolys, so the remaining 'subsites' object contains the 
    # correct points. 
    #
       ThisPoly = TcPolys[[1]] 
    
       inside = inout(xy,ThisPoly)
    #
    # assign the FIPS code for current polygon to the attribute column of all points falling within.
    # Note for later 
       sites2$inpoly[inside] = indexes[1]  # SLO County key = "079" 
    #
    # create a two-column matrix containing coordinates of sites within current polygon
    #
      subsites = as.matrix(cbind(xy[inside==TRUE,1],xy[inside==TRUE,2]))
    #
       hullpts = chull(subsites)
    # 
    # chull() returns a vector of integer coordinates. So, for plotting purposes,
    # close the 'polygon' by appending the first to the last.
    #
       hullpts = c(hullpts,hullpts[1])
    #
    # plot the hull by referencing the 'subset of the subset' points and connecting the dots.
    #
       lines(subsites[hullpts, ],col="green")
    #
       matChull = as.matrix(subsites[hullpts, ])
    #
    

    Here is the result of point-in-polygon and convex hull calculations for San Luis Obispo County, in which the point data exhibit the greatest spatial dispersion.

    San Luis Obispo County Polygon


    Then, compute the convex hull area:
    # Next, compute the area of the convex hull 
    # Tis requires some preprocessing as we need to construct
    # a polygon whose vertices are geographic coordinates and
    # which is projected into UTM Coordinates. 
    # To do so, we use methods and data objects from the 
    # PBSMapping package. 
    #
    # First,create a matrix containing the geographic coordinates
    # of the point coverage elements that make up the convex hull.
    #
    # Create PBSMapping data types: Polygon and Polygons (list of Polygon)
    #
       poly = Polygon(matChull)
       ID = c("P1") # character vector label required by Polygons data type
    # 
    # we need to create a PolySet object in order to use the calcArea method.. 
    # Remember that 'mat'is an R 2-D matrix with the closed coordinates
    # (last point = first point) for a single polygon
    #
       ps = appendPolys(NULL,matChull,1,1,FALSE)
    #
    # to compute area in units of kilometers ** 2, set the projection
    # attribute of the PolySet to UTM and Zone 9.
    #
       attr(ps,"projection") = "LL"
       attr(ps,"zone") = 9
       psUTM = convUL(ps)
    #
       polygonArea = calcArea(psUTM,rollup=1)
    #
    # polygonArea is a compound data type, see them with the command attributes(polygonArea)
    # we will print just the $area attribute.
    #
       message("SLO County convex hull area (using chull()) is %f km**2.",polygonArea$area)
    

    Here is the Convex Hull for SLO County:

    Convex Hull for SLO County points

    Finally, perform the same set of operations using methods from the PBSmapping package:

    #
    # Now, calculate the Convex Hull for the same county using only
    # the spatial data processing routines in PBSmapping. 
    # Note that we used some PBSmapping routines in the first example. 
    #
    # remember, 'sites' is the data frame containing the three-county point data
    #
    # To compute the convex hull, convert the subsites data frame (with SLO county 'inpoly'
    # field marking the points of interest) to a PBSmapping-format 
    # EventData object with longitude and latitude columns named X and Y, respectively.
    #
       SloSites = sites2[sites2$inpoly == indexes[1],]
    #
    # Lets now use the  the PBSMapping methods for reading and displaying ESRI Shapefiles
    #
    # Often, it is necessary to use a new data object/format when using a different R geospatial package. 
    # In this case, the PBSMapping package uses the EventData and PolySet data objects to 
    # handle point and polygon data, respectively
    #
    # First, store the point data in EventSet objects.
    #
    # The calcConvexHull() requires an EventData (data frame) objectleng
    # We create one using the @coords slot from the SloSites SpatialPointsDataFrame object.
    # To see the underlying structure of SloSites, use '> Str(SloSites)'
    #
       len = length(sites2@data[,1])
       AllCountyEventSet <- as.EventData(data.frame(EID=1:len,X=sites2@coords[,1],Y=sites2@coords[,2],projection="LL"))
       len = length(SloSites@data[,1])
       SloEventSet <- as.EventData(data.frame(EID=1:len,X=SloSites@coords[,1],Y=SloSites@coords[,2],projection="LL",zone=9),projection="LL",zone=9)
    #
    # compute the convex hull for SLO county points; results stored in a PolySet data object.
    #
       convexHullPoly = calcConvexHull(SloEventSet)
    #
    # set the projection attribues to: Latitude/Longitude, UTM Zone 9
    #
       attr(convexHullPoly ,"projection") = "LL"
       attr(convexHullPoly ,"zone") = 9
    #
    #  Read the Polygon shape file into a SpatialPolygonsDataFrame, and convert this 
    #  to a PolySet object compatible with the PBSMapping plotPolys function. 
    #  Set projection attributes to match the convex hull polygon.
    #
       CountyPolysSP <- readShapePoly("TriCountyPolyDD", proj4string=CRS("+proj=longlat +zone=9"))
       CountyPolysPS <- SpatialPolygons2PolySet(CountyPolysSP)
    #
    # Open a second graphics device: dev.set(1) opens the first available new device
    #
       dev.set(1)
    #
    #  Plot the County polygons
    #
       plotPolys(CountyPolysPS,projection=TRUE)
    #
    # plot only the SLO points in the eventSet, using addPoints()
    # 
       addPoints(SloEventSet)
    #
    # Next, draw the latest version of the convex hull on the new map. 
    #
       addPolys(convexHullPoly,border = "red")
    #
    # Finally, compute the area of the convex hull. 
    # First, convert the coordinates to UTL so that area calculations are in units of km ** 2.
    #
       convexHullPolyUTM = convUL(convexHullPoly)
       polygonArea = calcArea(convexHullPolyUTM)
    #
      print(sprintf("SLO County convex hull area (using calcConvexHull()) is %f km**2.",polygonArea$area))
    #
    # Done
    # 
    print("end of example")
    }
    

    Here are the results of the different Convex Hull calculations: First, as generated using grDevices/chull()

    San Luis Obispo County Polygon

    SLO County convex hull area (using chull()) is 6596.811490 km**2.

     

    Next, the convex hull as generated by PBSmapping/CalcConvexHull()

    San Luis Obispo County Polygon

    SLO County convex hull area (using CalcConvexHull()) is 6596.811490 km**2.

    Learning More:

    The R Programming Language / CRAN Task View: Analysis of Spatial Data

    This Use Case was originally created June, 2007 / updated June, 2010.

     

     

scicomp: