Posts Tagged ‘SDM’

When Google Street View helps studying species geographical distribution

This post is about how the Google Street View (GSV) database could be used to gather data describing the geographical distribution of species. There have been some attempts to use the panoramic imagery provided by GSV in social science [1] and preventive medicine [2] but, to my knowledge, very few ecological applications have been published so far.

A pine processionary moth silk nest. Photo by Jérôme Rousselet.

Obviously, only organisms that can be reliably detected from the road can be surveyed using street imagery. We have recently published our first results [Rousselet et al 2013 PLOS ONE e74918] showing how GSV imagery can be used to perform in silico sampling of species occurrences. Our biological model was the pine processionary moth Thaumetopoea pityocampa, a species easily visible from the road during winter because it builds white silk nests in the foliage of its host trees. The journal is open access, so readers can get the paper for free.

Rousselet, J., Imbert, C.-E., Dekri, A., Garcia, J., Goussard, F., Vincent, B., Denux, O., Robinet, C., Dorkeld, F., Roques, A., Rossi, J.-P., 2013, Assessing Species Distribution Using Google Street View: A Pilot Study with the Pine Processionary Moth, PLoS One 8(10):e74918.

A press release was issued today (in French) and is available from my institute’s web site.

Mapping species spatial distribution using spatial inference and prediction requires a lot of data. Occurrence data are generally not easily available from the literature and are very time-consuming to collect in the field. For that reason, we designed a survey to explore to what extent large-scale databases such as Google Maps and Google Street View could be used to derive valid occurrence data. We worked with the Pine Processionary Moth (PPM) Thaumetopoea pityocampa because the larvae of that moth build silk nests that are easily visible. The presence of the species at one location can therefore be inferred from visual records derived from the panoramic views available from Google Street View. We designed a standardized procedure for evaluating the presence of the PPM on a sampling grid covering the landscape under study. The outputs were compared to field data. We investigated two landscapes using grids of different extent and mesh size. Data derived from Google Street View were highly similar to field data in the large-scale analysis based on a square grid with a mesh of 16 km (96% of matching records). Using a 2 km mesh size led to a strong divergence between field and Google-derived data (46% of matching records). We conclude that the Google database might provide useful occurrence data for mapping the distribution of species whose presence can be visually evaluated, such as the PPM. However, the accuracy of the output strongly depends on the spatial scales considered and on the sampling grid used. Other factors, such as the coverage of the Google Street View network relative to the sampling grid size and the spatial distribution of host trees relative to the road network, may also play a determining role.
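The percentage of matching records quoted above is simply the share of grid cells where both sources agree on presence or absence. A minimal sketch in R of that comparison follows; the vector names gsv and field are hypothetical and the values are made up for illustration, they are not the data from the study.

```r
# Presence (1) / absence (0) per grid cell; illustrative values only,
# not the data from the study.
gsv   <- c(1, 1, 0, 0, 1, 0, 1, 1, 0, 0)   # scored from Street View imagery
field <- c(1, 1, 0, 1, 1, 0, 1, 0, 0, 0)   # scored in the field

# Share of grid cells where both methods give the same answer
matching <- 100 * mean(gsv == field)

# Cross-tabulation shows where the disagreements come from
# (e.g. nests recorded in the field but missed in the imagery)
confusion <- table(gsv = gsv, field = field)

print(matching)
print(confusion)
```

The off-diagonal cells of the table separate the two kinds of mismatch, which is worth checking before blaming either the imagery or the grid.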

1. Odgers CL, Caspi A, Bates CJ, Sampson RJ, Moffitt TE (2012) Systematic social observation of children’s neighborhoods using Google street view: a reliable and cost-effective method. J Child Psychol Psychiatry 53: 1009-1017.
2. Rundle AG, Bader MDM, Richards CA, Neckerman KM, Teitler JO (2011) Using Google street view to audit neighborhood environments. Am J Prev Med 40: 94-100.


Retrieve spatial coordinates from a location’s name

It is a common task to retrieve the spatial coordinates (longitude and latitude) from the sole name of a place. This occurs for example when one wants to map spatial occurrences of a species that are reported in the literature in the form of a list of places.

This work is greatly facilitated by the R package gooJSON developed by Christopher Steven Marcum.

Let’s say that we want to retrieve the coordinates of a place reported as Fort Myers in Florida where J. C. Denmark reported the presence of Homalodisca vitripennis Germar, 1821 in 1957.

a<-gooadd(address = list("Fort Myers","Florida"))

[1] -81.87231 26.64063 0.00000

Here we are: the Google API tells us (through R) that Fort Myers, Florida, is located at longitude -81.87231 and latitude 26.64063.

Now imagine that we want to repeat that operation 500 times because we have a bunch of places where our study species has been recorded. The goomap function can be put into a loop and the Google API queried 500 times, but here comes the twist: Google will not answer 500 times in a row and will return no data for some records. This problem can be solved simply by slowing down the loop so that the Google API does not receive your calls too quickly and answers every query. This is done by adding a call to the function Sys.sleep in the loop, which suspends execution of R expressions for a given number of seconds. I used Sys.sleep(0.2) and it worked perfectly!
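The throttled loop described above might look like the following sketch. geocode_one is a hypothetical stand-in for whatever geocoding call you use (it is not part of gooJSON); here it only returns fake coordinates so that the example runs offline. The place names other than Fort Myers are placeholders too.

```r
# Hypothetical stand-in for the real geocoding call: it returns
# c(longitude, latitude) and makes no network request.
geocode_one <- function(place) {
    c(lon = -80 - nchar(place) / 10, lat = 25 + nchar(place) / 10)
}

places <- c("Fort Myers, Florida", "Gainesville, Florida", "Tampa, Florida")

# One row per place, filled in as the answers come back
coords <- matrix(NA_real_, nrow = length(places), ncol = 2,
                 dimnames = list(places, c("lon", "lat")))

for (i in seq_along(places)) {
    # tryCatch keeps the loop alive if one query fails; that row stays NA
    coords[i, ] <- tryCatch(geocode_one(places[i]),
                            error = function(e) c(NA_real_, NA_real_))
    Sys.sleep(0.2)   # pause so the remote service is not queried too quickly
}

print(coords)
```

Rows that are still NA at the end can be retried in a second pass, possibly with a longer pause.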

Christopher Steven Marcum (2012). gooJSON: Google JSON Data Interpreter for R. R package version 1.0.01.

Categories: R

The MIGCLIM R package

I just saw the paper by Engler et al (2012) published a few days ago presenting the R version of MIGCLIM. The MIGCLIM R package allows the integration of dispersal constraints into projections of species distribution models. This is a very promising tool and I will test it as soon as possible.

Robin Engler, Wim Hordijk, Antoine Guisan (2012) The MIGCLIM R package – seamless integration of dispersal constraints into projections of species distribution models. Ecography. In press. DOI: 10.1111/j.1600-0587.2012.07608.x Article first published online: 3 AUG 2012

Categories: R

Playing with R within a GRASS environment

14 August 2012

Although there are some powerful GIS utilities in R, I prefer using GRASS to manage my GIS data while I use R to perform scientific computing.

In fact GRASS and R can be very simply interfaced by means of the R package spgrass6.

Under Linux (Kubuntu Natty Narwhal in my case), I open the console, launch GRASS by typing grass and then select a location (see the GRASS manual for details).


Once GRASS is running, R can be launched from within the GRASS console…

From now on I am working in R and I can use the spgrass6 package to, for example, read rasters from or write rasters to the GRASS system.

As an example we will consider the case of the pine shoot beetle Tomicus piniperda (Coleoptera: Curculionidae: Scolytinae). Readers can find the whole story in Horn et al. (2012). T. piniperda is present throughout Europe. Interestingly, it has long been assumed to be present in North Africa too, although molecular studies have recently shown that T. piniperda only rarely occurs in that region (Horn et al., 2006, 2009).

We have a set of localities where T. piniperda was either recorded as present or searched for and not found (true absence). GRASS is used to host the data and produce the graphical outputs.

Sites where T. piniperda was either present (black circle) or absent (open circle).

Starting R from the GRASS console and using the dismo package allows us to fit an SDM (Species Distribution Model) and build a raster of the probabilities of presence, e.g. using the function predict. This raster layer can then be converted to presence/absence and the result written to the GRASS system using the function writeRAST6 from the spgrass6 package.
The raster can now be plotted using the GRASS interface and its utilities.
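Turning predicted probabilities into presence/absence comes down to applying a threshold. The sketch below uses a plain matrix in place of the raster and an arbitrary cutoff of 0.5; both are assumptions made for illustration, since the post does not say which threshold was used.

```r
# Toy "raster" of predicted probabilities of presence (values are made up)
prob <- matrix(c(0.05, 0.20, 0.55,
                 0.40, 0.75, 0.90,
                 NA,   0.10, 0.60),
               nrow = 3, byrow = TRUE)

threshold <- 0.5   # arbitrary cutoff, chosen for illustration only

# 1 = predicted present, 0 = predicted absent; cells with no data stay NA
pres_abs <- ifelse(prob >= threshold, 1L, 0L)

print(pres_abs)
```

The choice of threshold matters a great deal for the resulting map, so it is worth trying several values before the layer is converted and handed to writeRAST6.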

GRASS window showing the occurrences of T. piniperda and the associated predicted distribution derived from an SDM run in R

Horn, A., Kerdelhué, C., Lieutier, F., Rossi, J.-P. (2012). Predicting the distribution of the two bark beetles Tomicus destruens and Tomicus piniperda in Europe and the Mediterranean region. Agricultural and Forest Entomology, in press, DOI: 10.1111/j.1461-9563.2012.00576.x.

Horn, A., Roux-Morabito, G., Lieutier, F. & Kerdelhué, C. (2006) Phylogeographic structure and past history of the circum-Mediterranean species Tomicus destruens Woll. (Coleoptera: Scolytinae). Molecular Ecology, 15, 1603–1615.

Horn, A., Stauffer, C., Lieutier, F. & Kerdelhué, C. (2009) Complex postglacial history of the temperate bark beetle Tomicus piniperda L. (Coleoptera, Scolytinae). Heredity, 103, 238–247.

Computing the “Multivariate Environmental Similarity Surfaces” (MESS) index in R

13 August 2012

Edit (3 October 2012): the mess function is now part of the dismo package developed by Robert J. Hijmans, Steven Phillips, John Leathwick and Jane Elith (dismo 0.7-23).

This post is about computing the “Multivariate Environmental Similarity Surfaces” index, aka MESS, in R. The MESS index was proposed by Elith et al (2010) [Methods in Ecology & Evolution 2010, 1, 330–342]. MESS can be computed with the Maxent software but I wanted to perform all my data processing within the R/GRASS environment. For that reason I wrote an R function called mess.R.

Note that the function requires the R package raster developed by R. Hijmans and J. van Etten. Both the function and a tutorial can be downloaded from my site.

The present post provides some explanations, the mess.R function and some examples.

1- What is MESS?
MESS stands for Multivariate Environmental Similarity Surfaces. It is a similarity index that reports how close a point, described by a set of environmental attributes, is to the distribution of these attributes within a population of reference points. The MESS approach was proposed by Elith et al (2010). It works as BIOCLIM does but also yields negative values, which makes it possible to differentiate levels of dissimilarity. This is particularly valuable for points whose values lie outside the range of the reference points.
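To make the definition concrete, here is a small base-R sketch of the index for single points, following the piecewise rule of Elith et al (2010). The helper names similarity_1d and mess_point and the reference values are hypothetical, and counting reference values less than or equal to the point (rather than strictly smaller) is an assumption of this sketch.

```r
# Similarity of value p to the reference values v for ONE variable.
# f is the percentage of reference values that are <= p.
similarity_1d <- function(p, v) {
    f <- 100 * sum(v <= p) / length(v)
    if (f == 0)  return(100 * (p - min(v)) / (max(v) - min(v)))  # below the range: negative
    if (f <= 50) return(2 * f)
    if (f < 100) return(2 * (100 - f))
    100 * (max(v) - p) / (max(v) - min(v))                       # at or above the maximum: <= 0
}

# MESS of a point = the smallest similarity across all variables
mess_point <- function(point, ref) {
    min(mapply(similarity_1d, point, as.data.frame(ref)))
}

# Made-up reference set: 5 sites, 2 variables
ref <- data.frame(temp = c(10, 12, 14, 16, 18),
                  rain = c(500, 600, 700, 800, 900))

inside  <- mess_point(c(14, 700), ref)   # both values sit at the median
outside <- mess_point(c(20, 700), ref)   # temperature beyond the reference range

print(c(inside = inside, outside = outside))
```

The second point gets a negative score because one of its variables lies outside the reference range, and the further outside it lies, the more negative the score becomes.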

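2- The mess.R function

mess<-function(X,V,full=TRUE){
    messi<-function(p,v){    # similarity of value p to the reference values v
        f<-100*length(which(v<=p))/length(v)    # % of reference values <= p
        if(f==0) simi<-100*(p-min(v))/(max(v)-min(v))
        if(0<f & f<=50) simi<-2*f
        if(50<=f & f<100) simi<-2*(100-f)
        if(f==100) simi<-100*(max(v)-p)/(max(v)-min(v))
        return(simi)
    }
    E<-getValues(X) ; r_mess<-X    # cell values, one column per layer
    for (i in 1:(dim(E)[2])) {
        e<-data.frame(E[,i]) ; v<-V[,i]
        r_mess[[i]][]<-apply(X=e, MARGIN=1, FUN=messi, v=v)
    }
    rmess<-r_mess[[1]] ; E<-getValues(r_mess)
    rmess[]<-apply(X=E, MARGIN=1, FUN=min)    # MESS = minimum across layers
    if(full==TRUE) {
        out<-addLayer(r_mess,rmess)
        names(out)<-c(names(X),"mess")    # names() replaces the older layerNames()
    }
    if(full==FALSE) out<-rmess
    return(out)
}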

The function mess.R computes the MESS index for a set of raster objects. Raster objects are defined in the package raster developed by Hijmans & van Etten (2011). mess.R therefore requires that the package raster be properly installed and loaded using the command library(raster).

The following example uses the Bradypus data hosted in the package dismo (Hijmans et al., 2012). As with raster, dismo must be installed and loaded.

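filename<-paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='')
bradypus<-read.table(filename, header=TRUE, sep=',')
bradypus<-bradypus[,2:3]    # keep the longitude and latitude columns
files<-list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE )
predictors<-stack(files)
predictors<-dropLayer(x=predictors,i=9)    # drop the 9th layer (biome, a categorical variable)
reference_points<-extract(predictors, bradypus)    # predictor values at the occurrence points
mss<-mess(predictors, reference_points, full=TRUE)
plot(mss)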

This is the result:

Elith J., Kearney M. & Phillips S. (2010). The art of modelling range-shifting species. Methods Ecol. Evol. 1: 330-342.

Hijmans R.J. & van Etten J. (2011). raster: Geographic analysis and modeling with raster data. R package version 1.9-58.

Hijmans R.J., Phillips S., Leathwick J. & Elith J. (2012). dismo: Species distribution modeling. R package version 0.7-17.

Categories: R