Skip to content

Commit e09be23

Browse files
committed
add rmd file on estimating densities from point patterns
1 parent e9436f6 commit e09be23

3 files changed

Lines changed: 10585 additions & 0 deletions

File tree

source/_site.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,8 @@ navbar:
3939
href: "data-handling-large-files-R.html"
4040
- text: "Read data from INBO databases (SQL Server) from R"
4141
href: "data-handling-database-R.html"
42+
- text: "Estimate density from point pattern"
43+
href: "point-pattern-R.html"
4244
- text: "Git tutorials"
4345
menu:
4446
- text: "Git hands-on session using Github Desktop"

source/point-pattern-R.Rmd

Lines changed: 132 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,132 @@
1+
---
2+
title: Estimating densities from a point pattern
3+
maintainer: Thierry Onkelinx
4+
date: '2017-06-28'
5+
output: html_document
6+
params:
7+
cellsize: 20
8+
---
9+
10+
In this example we focus on a set of 10450 coordinates in a small area. The goal is to estimate the local density of points, expressed as the number of point per unit area. The raw coordinates are given in [WGS84 (EPSG:4326)](https://epsg.io/4326), which is a geodetic coordinate system. That is not suited for calculating distances, so we need to re-project the points into a local projected coordinate system. In this case we use [Lambert72 (EPSG:3170)](https://epsg.io/31370). Next we calculate the density. To visualise the density, we have to transform the results back in to WGS84.
11+
12+
The data used in this example is real data by centred to a different location for privacy reasons. The dataset is available on [GitHub](https://github.com/ThierryO/my_blog/tree/master/data/20170628).
13+
14+
First we must read the data into R. Plotting the raw data helps to check errors in the data.
15+
16+
```{r read-data, fig.cap = "Raw data"}
17+
points <- read.delim("point-pattern/points.txt", sep = " ")
18+
library(ggmap)
19+
map <- get_map(
20+
location = c(lon = mean(points$lon), lat = mean(points$lat)),
21+
zoom = 17,
22+
maptype = "satellite",
23+
source = "google"
24+
)
25+
ggmap(map) +
26+
geom_point(data = points, alpha = 0.1, colour = "blue", shape = 4)
27+
```
28+
29+
The next step is to convert the dataset in to a `SpatialPoints` object with WGS84 project and re-project it into Lambert72. `sp::CRS()` defines the coordinate systems. `sp::coordinates()<-` is an easy way to convert a `data.frame` into a `SpatialPointsDataFrame`, but without specifying a coordinate system. Therefore we need to override the `proj4string` slot with the correct coordinate system. `sp::spTransform()` converts the spatial object from the current coordinate system to another coordinate system.
30+
31+
```{r reproject}
32+
library(sp)
33+
crs_wgs84 <- CRS("+init=epsg:4326")
34+
crs_lambert <- CRS("+init=epsg:31370")
35+
coordinates(points) <- ~lon + lat
36+
points@proj4string <- crs_wgs84
37+
points_lambert <- spTransform(points, crs_lambert)
38+
```
39+
40+
Once we have the points into a projected coordinate system, we can calculate the densities. We start by defining a grid. `cellsize` is the dimension of the square grid cell in the units of the projected coordinate system. Meters in case of Lambert72. The boundaries of the grid are defined using `pretty()`, which turns a vector of numbers into a "pretty" vector with rounded numbers. The combination of the boundaries and the cell size determine the number of grid cells `n` in each dimension. `diff()` calculates the difference between to adjacent numbers of a vector. The density is calculated with `MASS::kde2d()` based on the vectors with the longitude and latitude, the number of grid cells in each dimension and the boundaries of the grid. This returns the grid as a list with elements `x` (a vector of longitude coordinates of the centroids), `y` (a vector of latitude coordinates of the centroids) and `z` (a matrix with densities). The values in `z` are densities for the 'average' point per unit area. When we multiply the value `z` with the area of the grid cell and sum all of them we get 1. So if we multiple `z` with the number of points we get the density of the points per unit area.
41+
42+
We use [`dplyr::mutate()`](http://dplyr.tidyverse.org/) to convert it into a `data.frame`. The last two steps convert the centroids into a set of coordinates for square polygons.
43+
44+
```{r density}
45+
library(MASS)
46+
library(dplyr)
47+
xlim <- range(pretty(points_lambert$lon)) + c(-100, 100)
48+
ylim <- range(pretty(points_lambert$lat)) + c(-100, 100)
49+
n <- c(
50+
diff(xlim),
51+
diff(ylim)
52+
) / params$cellsize + 1
53+
dens <- kde2d(
54+
x = points_lambert$lon,
55+
y = points_lambert$lat,
56+
n = n,
57+
lims = c(xlim, ylim)
58+
)
59+
dx <- diff(dens$x[1:2])
60+
dy <- diff(dens$y[1:2])
61+
sum(dens$z * dx * dy)
62+
dens <- expand.grid(
63+
lon = dens$x,
64+
lat = dens$y
65+
) %>%
66+
mutate(
67+
density = as.vector(dens$z) * length(points_lambert),
68+
id = seq_along(density)
69+
) %>%
70+
merge(
71+
data.frame(
72+
x = dx * (c(0, 0, 1, 1, 0) - 0.5),
73+
y = dy * (c(0, 1, 1, 0, 0) - 0.5)
74+
)
75+
) %>%
76+
mutate(
77+
lon = lon + x,
78+
lat = lat + y
79+
)
80+
```
81+
82+
In order to visualise the result, we have to re-project the coordinates back to WGS84. Then we can display the raster with a web based background image.
83+
84+
```{r ggmap, fig.cap = "Static image of density"}
85+
coordinates(dens) <- ~lon + lat
86+
dens@proj4string <- crs_lambert
87+
dens_wgs <- spTransform(dens, crs_wgs84) %>%
88+
as.data.frame()
89+
ggmap(map) +
90+
geom_polygon(data = dens_wgs, aes(group = id, fill = density), alpha = 0.5) +
91+
scale_fill_gradientn(
92+
"density\n(#/m²)",
93+
colours = rev(rainbow(100, start = 0, end = .7)),
94+
limits = c(0, NA)
95+
)
96+
```
97+
98+
Using `leaflet` to generate a map was a bit more laborious. Using the `data.frame dens_wgs`directly failed. So we converted the `data.frame` in a `SpatialPolygonsDataframe`, which is a combination of a `SpatialPolygons` and a `data.frame`. The `SpatialPolygons` consists of a list of `Polygons`, one for each row of the `data.frame`. A `Polygons` object consist of a list of one or more `Polygon` object. In this example a single polygon which represents the grid cell.
99+
100+
```{r convert-to-Spatial-Polygons}
101+
dens_sp <- lapply(
102+
unique(dens_wgs$id),
103+
function(i){
104+
filter(dens_wgs, id == i) %>%
105+
select(lon, lat) %>%
106+
Polygon() %>%
107+
list() %>%
108+
Polygons(ID = i)
109+
}
110+
) %>%
111+
SpatialPolygons() %>%
112+
SpatialPolygonsDataFrame(
113+
data = dens_wgs %>%
114+
distinct(id, density),
115+
match.ID = FALSE
116+
)
117+
```
118+
119+
`leaflet` requires a predefined function with a colour pallet. We use `leaflet::colorNumeric()` to get a continuous pallet. Setting `stroke = FALSE` removes the borders of the polygon. `fillOpacity` sets the transparency of the polygons.
120+
121+
```{r leaflet, fig.cap = "Dynamic map of density"}
122+
library(leaflet)
123+
pal <- colorNumeric(
124+
palette = rev(rainbow(100, start = 0, end = .7)),
125+
domain = c(0, dens_sp$density)
126+
)
127+
leaflet(dens_sp) %>%
128+
addTiles() %>%
129+
addPolygons(color = ~pal(density), stroke = FALSE, fillOpacity = 0.5) %>%
130+
addLegend(pal = pal, values = ~density)
131+
```
132+

0 commit comments

Comments
 (0)