|
| 1 | +--- |
| 2 | +title: Estimating densities from a point pattern |
| 3 | +maintainer: Thierry Onkelinx |
| 4 | +date: '2017-06-28' |
| 5 | +output: html_document |
| 6 | +params: |
| 7 | + cellsize: 20 |
| 8 | +--- |
| 9 | + |
| 10 | +In this example we focus on a set of 10450 coordinates in a small area. The goal is to estimate the local density of points, expressed as the number of point per unit area. The raw coordinates are given in [WGS84 (EPSG:4326)](https://epsg.io/4326), which is a geodetic coordinate system. That is not suited for calculating distances, so we need to re-project the points into a local projected coordinate system. In this case we use [Lambert72 (EPSG:3170)](https://epsg.io/31370). Next we calculate the density. To visualise the density, we have to transform the results back in to WGS84. |
| 11 | + |
| 12 | +The data used in this example is real data by centred to a different location for privacy reasons. The dataset is available on [GitHub](https://github.com/ThierryO/my_blog/tree/master/data/20170628). |
| 13 | + |
| 14 | +First we must read the data into R. Plotting the raw data helps to check errors in the data. |
| 15 | + |
| 16 | +```{r read-data, fig.cap = "Raw data"} |
| 17 | +points <- read.delim("point-pattern/points.txt", sep = " ") |
| 18 | +library(ggmap) |
| 19 | +map <- get_map( |
| 20 | + location = c(lon = mean(points$lon), lat = mean(points$lat)), |
| 21 | + zoom = 17, |
| 22 | + maptype = "satellite", |
| 23 | + source = "google" |
| 24 | +) |
| 25 | +ggmap(map) + |
| 26 | + geom_point(data = points, alpha = 0.1, colour = "blue", shape = 4) |
| 27 | +``` |
| 28 | + |
| 29 | +The next step is to convert the dataset in to a `SpatialPoints` object with WGS84 project and re-project it into Lambert72. `sp::CRS()` defines the coordinate systems. `sp::coordinates()<-` is an easy way to convert a `data.frame` into a `SpatialPointsDataFrame`, but without specifying a coordinate system. Therefore we need to override the `proj4string` slot with the correct coordinate system. `sp::spTransform()` converts the spatial object from the current coordinate system to another coordinate system. |
| 30 | + |
| 31 | +```{r reproject} |
| 32 | +library(sp) |
| 33 | +crs_wgs84 <- CRS("+init=epsg:4326") |
| 34 | +crs_lambert <- CRS("+init=epsg:31370") |
| 35 | +coordinates(points) <- ~lon + lat |
| 36 | +points@proj4string <- crs_wgs84 |
| 37 | +points_lambert <- spTransform(points, crs_lambert) |
| 38 | +``` |
| 39 | + |
| 40 | +Once we have the points into a projected coordinate system, we can calculate the densities. We start by defining a grid. `cellsize` is the dimension of the square grid cell in the units of the projected coordinate system. Meters in case of Lambert72. The boundaries of the grid are defined using `pretty()`, which turns a vector of numbers into a "pretty" vector with rounded numbers. The combination of the boundaries and the cell size determine the number of grid cells `n` in each dimension. `diff()` calculates the difference between to adjacent numbers of a vector. The density is calculated with `MASS::kde2d()` based on the vectors with the longitude and latitude, the number of grid cells in each dimension and the boundaries of the grid. This returns the grid as a list with elements `x` (a vector of longitude coordinates of the centroids), `y` (a vector of latitude coordinates of the centroids) and `z` (a matrix with densities). The values in `z` are densities for the 'average' point per unit area. When we multiply the value `z` with the area of the grid cell and sum all of them we get 1. So if we multiple `z` with the number of points we get the density of the points per unit area. |
| 41 | + |
| 42 | +We use [`dplyr::mutate()`](http://dplyr.tidyverse.org/) to convert it into a `data.frame`. The last two steps convert the centroids into a set of coordinates for square polygons. |
| 43 | + |
| 44 | +```{r density} |
| 45 | +library(MASS) |
| 46 | +library(dplyr) |
| 47 | +xlim <- range(pretty(points_lambert$lon)) + c(-100, 100) |
| 48 | +ylim <- range(pretty(points_lambert$lat)) + c(-100, 100) |
| 49 | +n <- c( |
| 50 | + diff(xlim), |
| 51 | + diff(ylim) |
| 52 | +) / params$cellsize + 1 |
| 53 | +dens <- kde2d( |
| 54 | + x = points_lambert$lon, |
| 55 | + y = points_lambert$lat, |
| 56 | + n = n, |
| 57 | + lims = c(xlim, ylim) |
| 58 | +) |
| 59 | +dx <- diff(dens$x[1:2]) |
| 60 | +dy <- diff(dens$y[1:2]) |
| 61 | +sum(dens$z * dx * dy) |
| 62 | +dens <- expand.grid( |
| 63 | + lon = dens$x, |
| 64 | + lat = dens$y |
| 65 | +) %>% |
| 66 | + mutate( |
| 67 | + density = as.vector(dens$z) * length(points_lambert), |
| 68 | + id = seq_along(density) |
| 69 | + ) %>% |
| 70 | + merge( |
| 71 | + data.frame( |
| 72 | + x = dx * (c(0, 0, 1, 1, 0) - 0.5), |
| 73 | + y = dy * (c(0, 1, 1, 0, 0) - 0.5) |
| 74 | + ) |
| 75 | + ) %>% |
| 76 | + mutate( |
| 77 | + lon = lon + x, |
| 78 | + lat = lat + y |
| 79 | + ) |
| 80 | +``` |
| 81 | + |
| 82 | +In order to visualise the result, we have to re-project the coordinates back to WGS84. Then we can display the raster with a web based background image. |
| 83 | + |
| 84 | +```{r ggmap, fig.cap = "Static image of density"} |
| 85 | +coordinates(dens) <- ~lon + lat |
| 86 | +dens@proj4string <- crs_lambert |
| 87 | +dens_wgs <- spTransform(dens, crs_wgs84) %>% |
| 88 | + as.data.frame() |
| 89 | +ggmap(map) + |
| 90 | + geom_polygon(data = dens_wgs, aes(group = id, fill = density), alpha = 0.5) + |
| 91 | + scale_fill_gradientn( |
| 92 | + "density\n(#/m²)", |
| 93 | + colours = rev(rainbow(100, start = 0, end = .7)), |
| 94 | + limits = c(0, NA) |
| 95 | + ) |
| 96 | +``` |
| 97 | + |
| 98 | +Using `leaflet` to generate a map was a bit more laborious. Using the `data.frame dens_wgs`directly failed. So we converted the `data.frame` in a `SpatialPolygonsDataframe`, which is a combination of a `SpatialPolygons` and a `data.frame`. The `SpatialPolygons` consists of a list of `Polygons`, one for each row of the `data.frame`. A `Polygons` object consist of a list of one or more `Polygon` object. In this example a single polygon which represents the grid cell. |
| 99 | + |
| 100 | +```{r convert-to-Spatial-Polygons} |
| 101 | +dens_sp <- lapply( |
| 102 | + unique(dens_wgs$id), |
| 103 | + function(i){ |
| 104 | + filter(dens_wgs, id == i) %>% |
| 105 | + select(lon, lat) %>% |
| 106 | + Polygon() %>% |
| 107 | + list() %>% |
| 108 | + Polygons(ID = i) |
| 109 | + } |
| 110 | +) %>% |
| 111 | + SpatialPolygons() %>% |
| 112 | + SpatialPolygonsDataFrame( |
| 113 | + data = dens_wgs %>% |
| 114 | + distinct(id, density), |
| 115 | + match.ID = FALSE |
| 116 | + ) |
| 117 | +``` |
| 118 | + |
| 119 | +`leaflet` requires a predefined function with a colour pallet. We use `leaflet::colorNumeric()` to get a continuous pallet. Setting `stroke = FALSE` removes the borders of the polygon. `fillOpacity` sets the transparency of the polygons. |
| 120 | + |
| 121 | +```{r leaflet, fig.cap = "Dynamic map of density"} |
| 122 | +library(leaflet) |
| 123 | +pal <- colorNumeric( |
| 124 | + palette = rev(rainbow(100, start = 0, end = .7)), |
| 125 | + domain = c(0, dens_sp$density) |
| 126 | +) |
| 127 | +leaflet(dens_sp) %>% |
| 128 | + addTiles() %>% |
| 129 | + addPolygons(color = ~pal(density), stroke = FALSE, fillOpacity = 0.5) %>% |
| 130 | + addLegend(pal = pal, values = ~density) |
| 131 | +``` |
| 132 | + |
0 commit comments