Title:  Species Distribution Modeling 

Description:  Methods for species distribution modeling, that is, predicting the environmental similarity of any site to that of the locations of known occurrences of a species. 
Authors:  Robert J. Hijmans, Steven Phillips, John Leathwick and Jane Elith 
Maintainer:  Robert J. Hijmans <[email protected]> 
License:  GPL (>=3) 
Version:  1.314 
Built:  20241004 03:14:20 UTC 
Source:  https://github.com/rspatial/dismo 
This package implements a few species distribution models, including an R link to the 'maxent' model, and native implementations of Bioclim and Domain. It also provides a number of functions that can assist in using Boosted Regresssion Trees.
A good place to start is the vignette, which you can access by typing vignette('sdm', 'dismo')
In addition there are a number of functions, such sampling background points, kfold sampling, and for model evaluation (AUC) that are useful for these and for other species distribution modeling methods available in R (e.g. GLM, GAM, and RandomForest).
Robert J. Hijmans, Steven Phillips, John Leathwick and Jane Elith
Distribution data for Solanum acaule (a plant species that occurs in the high Andes of Peru and Bolivia).
Downloaded from GBIF with the gbif
function. For use in the 'species distribution modeling' vignette.
data(acaule)
data(acaule)
A number of sites with presence or absence of the shortfinned eel (Anguilla australis) in New Zealand, and environmental data at these sites; and gridded data of the environmental variables for the study area.
type  variable name  values  mean and range 
Reach  LocSed  weighted average of proportional cover of bed sediment  1=mud, 2=sand, 3=fine gravel; 4=coarse gravel; 5=cobble; 6=boulder; 7=bedrock; 3.77, 1 to 7 
Segment  SegSumT  Summer air temperature (degrees C)  16.3, 8.9 to 19.8 
SegTSeas  Winter air temperature (degrees C), normalised with respect to SegJanT  0.36, 4.2 to 4.1  
SegLowFlow  segment low flow (m3/sec), fourth root transformed  1.092, 1.0 to 4.09  
Downstream  DSDist  distance to coast (km)  74, 0.03 to 433.4 
DSDam  presence of known downstream obstructions, mostly dams  0.18, 0 or 1  
DSMaxSlope  maximum downstream slope (degrees)  3.1, 0 to 29.7  
Upstream / catchment  USAvgT  average temperature in catchment (deg C) compared to segment, normalised with respect to SegJanT  0.38, 7.7 to 2.2 
USRainDays  days/month with rain greater than 25 mm  1.22, 0.21 to 3.30  
USSlope  average slope in the upstream catchment (degrees)  14.3, 0 to 41.0  
USNative  area with indigenous forest (proportion)  0.57, 0 to 1  
Fishing method  fishing method in five classes: electric, net, spot, trap & mixture  NA  
data(Anguilla_train) data(Anguilla_test) data(Anguilla_grids)
data(Anguilla_train) data(Anguilla_test) data(Anguilla_grids)
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
The Bioclim algorithm has been extensively used for species distribution modeling. Bioclim is the classic 'climateenvelopemodel'. Although it generally does not perform as good as some other modeling methods (Elith et al. 2006) and is unsuited for predicting climate change effects (Hijmans and Graham, 2006). It is still used, however, among other reasons because the algorithm is easy to understand and thus useful in teaching species distribution modeling.
The BIOCLIM algorithm computes the similarity of a location by comparing the values of environmental variables at any location to a percentile distribution of the values at known locations of occurrence ('training sites'). The closer to the 50th percentile (the median), the more suitable the location is. The tails of the distribution are not distinguished, that is, 10 percentile is treated as equivalent to 90 percentile.
In this R implementation, percentile scores are between 0 and 1, but predicted values larger than 0.5 are subtracted from 1. Then, the minimum percentile score across all the environmental variables is computed (i.e. this is like Liebig's law of the minimum, except that high values can also be limiting factors). The final value is subtracted from 1 and multiplied with 2 so that the results are between 0 and 1. The reason for this transformation is that the results become more like that of other distribution modeling methods and are thus easier to interpret. The value 1 will rarely be observed as it would require a location that has the median value of the training data for all the variables considered. The value 0 is very common as it is assigned to all cells with a value of an environmental variable that is outside the percentile distribution (the range of the training data) for at least one of the variables.
In the predict function, you can choose to ignore one of the tails of the distribution (e.g, to make low rainfall a limiting factor, but not high rainfall),
bioclim(x, p, ...)
bioclim(x, p, ...)
x 
Raster* object or matrix 
p 
two column matrix or SpatialPoints* object 
... 
Additional arguments 
An object of class 'Bioclim' (inherits from DistModelclass
)
Robert J. Hijmans
Nix, H.A., 1986. A biogeographic analysis of Australian elapid snakes. In: Atlas of Elapid Snakes of Australia. (Ed.) R. Longmore, pp. 415. Australian Flora and Fauna Series Number 7. Australian Government Publishing Service: Canberra.
Booth, T.H., H.A. Nix, J.R. Busby and M.F. Hutchinson, 2014. BIOCLIM: the first species distribution modelling package, its early applications and relevance to most current MAXENT studies. Diversity and Distributions 20: 19
Elith, J., C.H. Graham, R.P. Anderson, M. Dudik, S. Ferrier, A. Guisan, R.J. Hijmans, F. Huettmann, J. Leathwick, A. Lehmann, J. Li, L.G. Lohmann, B. Loiselle, G. Manion, C. Moritz, M. Nakamura, Y. Nakazawa, J. McC. Overton, A.T. Peterson, S. Phillips, K. Richardson, R. ScachettiPereira, R. Schapire, J. Soberon, S. Williams, M. Wisz and N. Zimmerman, 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29: 129151. doi:10.1111/j.2006.09067590.04596.x
Hijmans R.J., and C.H. Graham, 2006. Testing the ability of climate envelope models to predict the effect of climate change on species distributions. Global change biology 12: 22722281. doi:10.1111/j.13652486.2006.01256.x
predict, maxent, domain, mahal
logo < stack(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(48.243420, 48.243420, 47.985820, 52.880230, 49.531423, 46.182616, 54.168232, 69.624263, 83.792291, 85.337894, 74.261072, 83.792291, 95.126713, 84.565092, 66.275456, 41.803408, 25.832176, 3.936132, 18.876962, 17.331359,7.048974, 13.648543, 26.093446, 28.544714, 39.104026, 44.572240, 51.171810, 56.262906, 46.269272, 38.161230, 30.618865, 21.945145, 34.390047, 59.656971, 69.839163, 73.233228, 63.239594, 45.892154, 43.252326, 28.356155) , ncol=2) bc < bioclim(logo, pts) #or v < extract(logo, pts) bc < bioclim(v) p1 < predict(logo, bc) p2 < predict(logo, bc, tails=c('both', 'low', 'high')) #or #sp < SpatialPoints(pts) #bc < bioclim(logo, pts)
logo < stack(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(48.243420, 48.243420, 47.985820, 52.880230, 49.531423, 46.182616, 54.168232, 69.624263, 83.792291, 85.337894, 74.261072, 83.792291, 95.126713, 84.565092, 66.275456, 41.803408, 25.832176, 3.936132, 18.876962, 17.331359,7.048974, 13.648543, 26.093446, 28.544714, 39.104026, 44.572240, 51.171810, 56.262906, 46.269272, 38.161230, 30.618865, 21.945145, 34.390047, 59.656971, 69.839163, 73.233228, 63.239594, 45.892154, 43.252326, 28.356155) , ncol=2) bc < bioclim(logo, pts) #or v < extract(logo, pts) bc < bioclim(v) p1 < predict(logo, bc) p2 < predict(logo, bc, tails=c('both', 'low', 'high')) #or #sp < SpatialPoints(pts) #bc < bioclim(logo, pts)
Function to create 'bioclimatic variables' from monthly climate data.
biovars(prec, tmin, tmax, ...)
biovars(prec, tmin, tmax, ...)
prec 
vector, matrix, or RasterStack/Brick of precipitation data 
tmin 
vector, matrix, or RasterStack/Brick of minimum temperature data 
tmax 
vector, matrix, or RasterStack/Brick of maximum temperature data 
... 
Additional arguments 
Input data is normaly monthly. I.e. there should be 12 values (layers) for each variable, but the function should also work for e.g. weekly data (with some changes in the meaning of the output variables. E.g. #8 would then not be for a quater (3 months), but for a 3 week period).
Depending on the class of the input data, an object of class 'vector', 'matrix' or 'RasterBrick' with 19 variables (columns, layers)
bio1 = Mean annual temperature
bio2 = Mean diurnal range (mean of max temp  min temp)
bio3 = Isothermality (bio2/bio7) (* 100)
bio4 = Temperature seasonality (standard deviation *100)
bio5 = Max temperature of warmest month
bio6 = Min temperature of coldest month
bio7 = Temperature annual range (bio5bio6)
bio8 = Mean temperature of the wettest quarter
bio9 = Mean temperature of driest quarter
bio10 = Mean temperature of warmest quarter
bio11 = Mean temperature of coldest quarter
bio12 = Total (annual) precipitation
bio13 = Precipitation of wettest month
bio14 = Precipitation of driest month
bio15 = Precipitation seasonality (coefficient of variation)
bio16 = Precipitation of wettest quarter
bio17 = Precipitation of driest quarter
bio18 = Precipitation of warmest quarter
Robert J. Hijmans
tmin < c(10,12,14,16,18,20,22,21,19,17,15,12) tmax < tmin + 5 prec < c(0,2,10,30,80,160,80,20,40,60,20,0) biovars(prec, tmin, tmax) tmn = tmx = prc = brick(nrow=1, ncol=1) tmn < setValues(tmn, t(matrix(c(10,12,14,16,18,20,22,21,19,17,15,12)))) tmx < tmn + 5 prc < setValues(prc, t(matrix(c(0,2,10,30,80,160,80,20,40,60,20,0)))) b < biovars(prc, tmn, tmx) as.matrix(b)
tmin < c(10,12,14,16,18,20,22,21,19,17,15,12) tmax < tmin + 5 prec < c(0,2,10,30,80,160,80,20,40,60,20,0) biovars(prec, tmin, tmax) tmn = tmx = prc = brick(nrow=1, ncol=1) tmn < setValues(tmn, t(matrix(c(10,12,14,16,18,20,22,21,19,17,15,12)))) tmx < tmn + 5 prc < setValues(prc, t(matrix(c(0,2,10,30,80,160,80,20,40,60,20,0)))) b < biovars(prc, tmn, tmx) as.matrix(b)
Make a box plot of model evaluation data, i.e., the model predictions for known presence and absence points.
Arguments:
x
Object of class ModelEvaluation
...
Additional arguments that can be passed to boxplot
Robert J. Hijmans
Function to calculate deviance given two vectors of observed and predicted values. Requires a family argument which is set to binomial by default
calc.deviance(obs, pred, weights = rep(1,length(obs)), family="binomial", calc.mean = TRUE)
calc.deviance(obs, pred, weights = rep(1,length(obs)), family="binomial", calc.mean = TRUE)
obs 
a vector with observed values 
pred 
a vector with predicted values that correspond the the values in obs 
weights 
a vector of weight values 
family 
One of "binomial", "bernoulli", "poisson", "laplace", or "gaussian" 
calc.mean 
Logical. If 
John R. Leathwick and Jane Elith
The Circle hull model predicts that a species is present at sites inside the smallest circle that can contain a set of training points, and absent outside that circle.
circleHull(p, ...)
circleHull(p, ...)
p 
point locations (presence). Two column matrix, data.frame or SpatialPoints* object 
... 
Additional arguments. See details 
An object of class 'CircleHull' (inherits from DistModelclass
)
Robert J. Hijmans
circles, convHull, rectHull, predict
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] ch < circleHull(train) predict(ch, test) plot(r) plot(ch, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(ch, r, progress='') plot(pr) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) # to get the polygons: p < polygons(ch) p
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] ch < circleHull(train) predict(ch, test) plot(r) plot(ch, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(ch, r, progress='') plot(pr) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) # to get the polygons: p < polygons(ch) p
The Circles Range model predicts that a species is present at sites within a certain distance from a training point, and absent further away.
## S4 method for signature 'matrix' circles(p, d, lonlat, n=360, r=6378137, dissolve=TRUE, ...) ## S4 method for signature 'SpatialPoints' circles(p, d, lonlat, n=360, r=6378137, dissolve=TRUE, ...)
## S4 method for signature 'matrix' circles(p, d, lonlat, n=360, r=6378137, dissolve=TRUE, ...) ## S4 method for signature 'SpatialPoints' circles(p, d, lonlat, n=360, r=6378137, dissolve=TRUE, ...)
p 
point locations (presence). Two column matrix, data.frame or SpatialPoints* object 
d 
numeric. The radius of each circle in meters. A single number or a vector with elements corresponding to rows in 
lonlat 
logical. Are these longitude/latitude data? If missing this is taken from the 
n 
integer. How many vertices in the circle? Default is 360 
r 
numeric. Radius of the earth. Only relevant for longitude/latitude data. Default is 6378137 m 
dissolve 
logical. Dissolve overlapping circles. Setting this to 
... 
additional arguments, none implemented 
An object of class 'CirclesRange' (inherits from DistModelclass
)
Robert J. Hijmans
predict, geoDist, convHull, maxent, domain,
mahal, convHull
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] cc < circles(train, lonlat=FALSE) predict(cc, test) plot(r) plot(cc, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(cc, r, progress='') plot(r, legend=FALSE) plot(pr, add=TRUE, col='blue') points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) # to get the polygons: p < polygons(cc) p # to compute the Circular Area Range of a species (see Hijmans and Spooner, 2001) pts < train*10 ca100 < polygons(circles(pts, d=100, lonlat=FALSE)) ca150 < polygons(circles(pts, d=150, lonlat=FALSE)) sum(area(ca150)) / (pi*150^2) sum(area(ca100)) / (pi*100^2) par(mfrow=c(1,2)) plot(ca100); points(pts) plot(ca150); points(pts)
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] cc < circles(train, lonlat=FALSE) predict(cc, test) plot(r) plot(cc, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(cc, r, progress='') plot(r, legend=FALSE) plot(pr, add=TRUE, col='blue') points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) # to get the polygons: p < polygons(cc) p # to compute the Circular Area Range of a species (see Hijmans and Spooner, 2001) pts < train*10 ca100 < polygons(circles(pts, d=100, lonlat=FALSE)) ca150 < polygons(circles(pts, d=150, lonlat=FALSE)) sum(area(ca150)) / (pi*150^2) sum(area(ca100)) / (pi*100^2) par(mfrow=c(1,2)) plot(ca100); points(pts) plot(ca150); points(pts)
The Convex hull model predicts that a species is present at sites inside the convex hull of a set of training points, and absent outside that hull. I.e. this is the spatial convex hull, not an environmental hull.
convHull(x, ...)
convHull(x, ...)
x 
point locations (presence). Two column matrix, data.frame or SpatialPoints* object 
... 
Additional arguments. See details 
You can supply an argument n (>= 1) to get n convex hulls around subsets of the points. You can also set n=1:x, to get a set of overlapping polygons consisting of 1 to x parts. I.e. the first polygon has 1 part, the second has 2 parts, and x has x parts.
An object of class 'ConvexHull' (inherits from DistModelclass
)
Robert J. Hijmans
predict, geoDist, maxent, domain, mahal
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] ch < convHull(train) predict(ch, test) plot(r) plot(ch, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(ch, r, progress='') plot(pr) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) # to get the polygons: p < polygons(ch) p
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] ch < convHull(train) predict(ch, test) plot(r) plot(ch, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(ch, r, progress='') plot(pr) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) # to get the polygons: p < polygons(ch) p
Evaluate a model for intervals of distances to the nearest point in a reference dataset for presence data and a sample of the absence data selected to have a low spatial sorting bias (obtained with pwdSample).
dcEvaluate(p, a, reference, lonlat=TRUE, binsize=15, predp, preda, model, predictors, fun=predict)
dcEvaluate(p, a, reference, lonlat=TRUE, binsize=15, predp, preda, model, predictors, fun=predict)
p 
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations 
a 
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations 
reference 
as above for reference point locations to which distances are computed 
lonlat 
Logical. Use 
binsize 
postive integer. How many presence points in each distance bin? 
predp 
p 
preda 
a 
model 
m 
predictors 
pr 
fun 
function 
list with Evaluation objects
Robert J. Hijmans
Create a density plots of presence and absence data
A density plot. Presence data are in red, and absence data (if available) are in blue.
density(x, ...)
x 
Object of class 'ModelEvaluation' or of a class that inherits from 'DistModel, (such as 'MaxEnt', 'Bioclim')  
... 
Additional arguments that can be passed to plot  
Robert J. Hijmans
Parent class for a number of distribution models defined in the dismo package (those created by bioclim, domain, maxent and mahal
). This is a virtual Class, no objects may be direclty created from it.
presence
:presence data used
absence
:absence or background data used
hasabsence
:Logical indicating whether there is any absence data
Robert J. Hijmans
The Domain algorithm (Carpenter et al. 1993) that has been extensively used for species distribution modeling. It is included here for that reason but please note that it generally does not perform very well in model comparison (Elith et al. 2006, Hijmans and Graham, 2006). The Domain algorithm computes the Gower distance between environmental variables at any location and those at any of the known locations of occurrence ('training sites'). For each variable the minimum distance between a site and any of the training points is taken. To integrate over environmental variables, the maximum distance to any of the variables is used. This distance is subtracted from one, and (in this R implementation) values below zero are truncated so that the scores are between 0 (low) and 1 (high).
domain(x, p, ...)
domain(x, p, ...)
x 
Raster* object or matrix 
p 
two column matrix or SpatialPoints* object 
... 
Additional arguments 
An object of class 'Domain' (inherits from DistModelclass
)
Robert J. Hijmans
Carpenter G., A.N. Gillison and J. Winter, 1993. Domain: a flexible modelling procedure for mapping potential distributions of plants and animals. Biodiversity Conservation 2:667680.
Elith, J., C.H. Graham, R.P. Anderson, M. Dudik, S. Ferrier, A. Guisan, R.J. Hijmans, F. Huettmann, J. Leathwick, A. Lehmann, J. Li, L.G. Lohmann, B. Loiselle, G. Manion, C. Moritz, M. Nakamura, Y. Nakazawa, J. McC. Overton, A.T. Peterson, S. Phillips, K. Richardson, R. ScachettiPereira, R. Schapire, J. Soberon, S. Williams, M. Wisz and N. Zimmerman, 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29: 129151. doi:10.1111/j.2006.09067590.04596.x
Hijmans R.J., and C.H. Graham, 2006. Testing the ability of climate envelope models to predict the effect of climate change on species distributions. Global change biology 12: 22722281. doi:10.1111/j.13652486.2006.01256.x
predict, maxent, bioclim, mahal
logo < stack(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(48.243420, 48.243420, 47.985820, 52.880230, 49.531423, 46.182616, 54.168232, 69.624263, 83.792291, 85.337894, 74.261072, 83.792291, 95.126713, 84.565092, 66.275456, 41.803408, 25.832176, 3.936132, 18.876962, 17.331359,7.048974, 13.648543, 26.093446, 28.544714, 39.104026, 44.572240, 51.171810, 56.262906, 46.269272, 38.161230, 30.618865, 21.945145, 34.390047, 59.656971, 69.839163, 73.233228, 63.239594, 45.892154, 43.252326, 28.356155), ncol=2) d < domain(logo, pts) p < predict(d, logo)
logo < stack(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(48.243420, 48.243420, 47.985820, 52.880230, 49.531423, 46.182616, 54.168232, 69.624263, 83.792291, 85.337894, 74.261072, 83.792291, 95.126713, 84.565092, 66.275456, 41.803408, 25.832176, 3.936132, 18.876962, 17.331359,7.048974, 13.648543, 26.093446, 28.544714, 39.104026, 44.572240, 51.171810, 56.262906, 46.269272, 38.161230, 30.618865, 21.945145, 34.390047, 59.656971, 69.839163, 73.233228, 63.239594, 45.892154, 43.252326, 28.356155), ncol=2) d < domain(logo, pts) p < predict(d, logo)
Very simple mechanistic model for plants.
ecocrop(crop, tmin, tavg, prec, rainfed=TRUE, ...) getCrop(name) data(ECOcrops)
ecocrop(crop, tmin, tavg, prec, rainfed=TRUE, ...) getCrop(name) data(ECOcrops)
crop 
An object of class 'ECOCROP', or the name of a crop as in getCrop 
tmin 
Vector of monthly minimum temperature (degrees C) 
tavg 
Vector of monthly average temperature (degrees C) 
prec 
Vector of monthly precipitation (mm) 
rainfed 
Logical. If 
... 
Additinal arguments 
name 
Name of a crop (character). If missing a data.frame with all crop names is returned 
Object of class ECOCROP
Robert J. Hijmans
ecocrop('potato', 5:16, 15:26, runif(12)*100) getCrop('Acacia brachystachya Benth.') crop < getCrop('Hot pepper') ecocrop(crop, 5:16, 15:26, rainfed=FALSE) # with spatial data tmin = tavg = prec = brick(nrow=1, ncol=1) tmin < setValues(tmin, t(matrix(5:16))) tavg < tmin + 5 prec < setValues(prec, t(matrix(15:26))) crop < getCrop('Hot pepper') ecocrop(crop, tmin, tavg, prec, rainfed = FALSE)
ecocrop('potato', 5:16, 15:26, runif(12)*100) getCrop('Acacia brachystachya Benth.') crop < getCrop('Hot pepper') ecocrop(crop, 5:16, 15:26, rainfed=FALSE) # with spatial data tmin = tavg = prec = brick(nrow=1, ncol=1) tmin < setValues(tmin, t(matrix(5:16))) tavg < tmin + 5 prec < setValues(prec, t(matrix(15:26))) crop < getCrop('Hot pepper') ecocrop(crop, tmin, tavg, prec, rainfed = FALSE)
Simple generic limiting factor based model, in the tradition of the PLANTGRO model (Hackett, 1991)
## S4 method for signature 'matrix,matrix' ecolim(x, y, extrapolate=TRUE, ...)
## S4 method for signature 'matrix,matrix' ecolim(x, y, extrapolate=TRUE, ...)
x 
numeric matrix with driver variables (each column has values for the variables). Values have to be in ascending order 
y 
numeric matrix with responses (between 0 and 1), one column for each column in 
extrapolate 
logical. Should the model extrapolate beyond the extremes of x? If 
... 
Additional arguments. None implemented 
Robert J. Hijmans
Hackett, C., 1991. PLANTGRO, a software package for coarse prediction of plant growth. CSIRO, Melbourne, Australia
# get predictor variables fnames < list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE ) env < stack(fnames) bio1 < c(200,250,400,450) bio12 < c(0,1000, 3000, 4000) r1 < c(0, 1, 1, 0) r2 < c(0, 0, 1, 1) x < cbind(bio1, bio12) y < cbind(r1, r2) e < ecolim(x, y) plot(e, lwd=2, col='red') p < predict(e, env) plot(p) # no extrapolation: ef < ecolim(x, y, extrapolate=FALSE) pf < predict(ef, env) plot(pf) occurence < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') occ < read.table(occurence, header=TRUE, sep=',')[,1] fold < kfold(occ, k=5) occtest < occ[fold == 1, ] occtrain < occ[fold != 1, ] bg < randomPoints(env, 1000) ## Not run: # An approach to optimize the values based on # some known presences and (here random) absences # for the same species as in the maxent example # intial parameters v < c(200, 250, 400, 450, 0, 1000, 3000, 4000) # function to be minimized f < function(p) { x[] < p # numbers must go up if ( any(x[1,] < x[nrow(x), ]) ) return(Inf) e < ecolim(x, y) # we are minimizing, hence 1AUC 1evaluate(e, p=occtrain, a=bg, x=env)@auc } # patience... set.seed(0) z < optim(v, f) x[] < z$par eco < ecolim(x, y) evaluate(eco, p=occtest, a=bg, x=env) set.seed(0) pwd < pwdSample(occtest,bg,occtrain) ptest < occtest[!is.na(pwd),] atest < bg[na.omit(pwd),] evaluate(eco, p=ptest, a=atest, x=env) p2 < predict(eco, env) plot(p2) ## End(Not run)
# get predictor variables fnames < list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE ) env < stack(fnames) bio1 < c(200,250,400,450) bio12 < c(0,1000, 3000, 4000) r1 < c(0, 1, 1, 0) r2 < c(0, 0, 1, 1) x < cbind(bio1, bio12) y < cbind(r1, r2) e < ecolim(x, y) plot(e, lwd=2, col='red') p < predict(e, env) plot(p) # no extrapolation: ef < ecolim(x, y, extrapolate=FALSE) pf < predict(ef, env) plot(pf) occurence < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') occ < read.table(occurence, header=TRUE, sep=',')[,1] fold < kfold(occ, k=5) occtest < occ[fold == 1, ] occtrain < occ[fold != 1, ] bg < randomPoints(env, 1000) ## Not run: # An approach to optimize the values based on # some known presences and (here random) absences # for the same species as in the maxent example # intial parameters v < c(200, 250, 400, 450, 0, 1000, 3000, 4000) # function to be minimized f < function(p) { x[] < p # numbers must go up if ( any(x[1,] < x[nrow(x), ]) ) return(Inf) e < ecolim(x, y) # we are minimizing, hence 1AUC 1evaluate(e, p=occtrain, a=bg, x=env)@auc } # patience... set.seed(0) z < optim(v, f) x[] < z$par eco < ecolim(x, y) evaluate(eco, p=occtest, a=bg, x=env) set.seed(0) pwd < pwdSample(occtest,bg,occtrain) ptest < occtest[!is.na(pwd),] atest < bg[na.omit(pwd),] evaluate(eco, p=ptest, a=atest, x=env) p2 < predict(eco, env) plot(p2) ## End(Not run)
Evaluation of models with presence/absence data. Given a vector of presence and a vector of absence values (or a model and presence and absence points and predictors), confusion matrices are computed (for varying thresholds), and model evaluation statistics are computed for each confusion matrix / threshold. See the description of class ModelEvaluationclass
for more info.
evaluate(p, a, model, x, tr, ...)
evaluate(p, a, model, x, tr, ...)
p 
presence points (x and y coordinates or SpatialPoints* object). Or, if Or, a matrix with values to compute predictions for 
a 
absence points (x and y coordinates or SpatialPoints* object). Or, if Or, a matrix with values to compute predictions for 
model 
any fitted model, including objects inheriting from 'DistModel'; not used when 
x 
Optional. Predictor variables (object of class Raster*). If present, 
tr 
Optional. a vector of threshold values to use for computing the confusion matrices 
... 
Additional arguments for the predict function 
An object of ModelEvaluationclass
Robert J. Hijmans
Fielding, A.H. and J.F. Bell, 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24:3849
## See ?maxent for an example with real data. # this is a contrived example: # p has the predicted values for 50 known cases (locations) # with presence of the phenomenon (species) p < rnorm(50, mean=0.7, sd=0.3) # a has the predicted values for 50 background locations (or absence) a < rnorm(50, mean=0.4, sd=0.4) e < evaluate(p=p, a=a) threshold(e) plot(e, 'ROC') plot(e, 'TPR') boxplot(e) density(e) str(e)
## See ?maxent for an example with real data. # this is a contrived example: # p has the predicted values for 50 known cases (locations) # with presence of the phenomenon (species) p < rnorm(50, mean=0.7, sd=0.3) # a has the predicted values for 50 background locations (or absence) a < rnorm(50, mean=0.4, sd=0.4) e < evaluate(p=p, a=a) threshold(e) plot(e, 'ROC') plot(e, 'TPR') boxplot(e) density(e) str(e)
Preparing data for model testing with the ROCR package.
evaluateROCR(model, p, a, x)
evaluateROCR(model, p, a, x)
model 
any fitted model, including objects inheriting from 'DistModel' 
p 
presence points (x and y coordinates or SpatialPoints* object). Or, if Or, a matrix with values to compute predictions for 
a 
absence points (x and y coordinates or SpatialPoints* object). Or, if Or, a matrix with values to compute predictions for 
x 
optional. predictor variables, if present, 
An object of class "prediction" (defined in the ROCR package)
Robert J. Hijmans
Make a ROC curve, or a plot of a threshold dependent measure against threshold values
usage: plot(x, y, ...)
x 
Object of class ModelEvaluation  
y 
Character. Either 'ROC' or a threshold dependent measure such as 'kappa', 'TPR'  
... 
Additional arguments that can be passed to plot  
Robert J. Hijmans
ModelEvaluationclass, density, pairs, plot
# p = the predicted value for 50 known cases (locations) with presence of the phenomenon (species) p = rnorm(50, mean=0.7, sd=0.3) # b = the predicted value for 50 known cases (locations) with absence of the phenomenon (species) a = rnorm(50, mean=0.4, sd=0.4) e = evaluate(p=p, a=a) plot(e, 'ROC') plot(e, 'kappa') plot(e, 'FPR') plot(e, 'prevalence')
# p = the predicted value for 50 known cases (locations) with presence of the phenomenon (species) p = rnorm(50, mean=0.7, sd=0.3) # b = the predicted value for 50 known cases (locations) with absence of the phenomenon (species) a = rnorm(50, mean=0.4, sd=0.4) e = evaluate(p=p, a=a) plot(e, 'ROC') plot(e, 'kappa') plot(e, 'FPR') plot(e, 'prevalence')
This function downloads species occurence records from the Global Biodiversity Information Facility (GBIF) data portal. You can download either a single species (if you append a '*' to the species name) or a subspecies of comparable level. You can download the data for an entire genus by using species='*'
. Before using this function, please first check the GBIF data use agreement and see the note below about how to cite these data.
gbif(genus, species="", ext=NULL, args=NULL, geo=TRUE, sp=FALSE, removeZeros=FALSE, download=TRUE, ntries=5, nrecs=300, start=1, end=Inf)
gbif(genus, species="", ext=NULL, args=NULL, geo=TRUE, sp=FALSE, removeZeros=FALSE, download=TRUE, ntries=5, nrecs=300, start=1, end=Inf)
genus 
character. genus name 
species 
character. species name. Use '*' to download the entire genus. Append '*' to the species name to get all naming variants (e.g. with and witout species author name) and subtaxa 
ext 
Extent object to limit the geographic extent of the records. An extent can be created using functions like 
args 
character. Additional arguments to refine the query. See query parameters in https://www.gbif.org/developer/occurrence for more details 
geo 
logical. If 
sp 
logical. If 
removeZeros 
logical. If 
download 
logical. If 
ntries 
integer. How many times should the function attempt to download the data, if an invalid response is returned (perhaps because the GBIF server is very busy) 
nrecs 
integer. How many records to download in a single request (max is 300)? 
start 
integer. Record number from which to start requesting data 
end 
integer. Last record to request 
data frame
Under the terms of the GBIF data user agreement, users who download data agree to cite a DOI. Citation rewards datapublishing institutions and individuals and provides support for sharing open data [1][2]. You can get a DOI for the data you downloaded by creating a "derived" dataset. For this to work, you need to keep the "datasetKey" variable in your dataset.
Robert J. Hijmans
https://www.gbif.org/occurrence
## Not run: gbif('solanum', download=FALSE) gbif('solanum', 'acaule', download=FALSE) gbif('Batrachoseps', '' , down=FALSE) gbif('Batrachoseps', 'luciae', down=FALSE) g < gbif('Batrachoseps', 'luciae', geo=TRUE) plot(g$lon, g$lat) gs < gbif('Batrachoseps', 'luciae', sp=TRUE) plot(gs) ## End(Not run)
## Not run: gbif('solanum', download=FALSE) gbif('solanum', 'acaule', download=FALSE) gbif('Batrachoseps', '' , down=FALSE) gbif('Batrachoseps', 'luciae', down=FALSE) g < gbif('Batrachoseps', 'luciae', geo=TRUE) plot(g$lon, g$lat) gs < gbif('Batrachoseps', 'luciae', sp=TRUE) plot(gs) ## End(Not run)
Calculates a gradient boosting (gbm) object with a fixed number of trees. The optimal number of trees can be identified using gbm.step or some other procedure. Mostly used as a utility function, e.g., when being called by gbm.simplify. It takes as input a dataset and arguments selecting x and y variables, learning rate and tree complexity.
gbm.fixed(data, gbm.x, gbm.y, tree.complexity = 1, site.weights = rep(1, nrow(data)), verbose = TRUE, learning.rate = 0.001, n.trees = 2000, bag.fraction = 0.5, family = "bernoulli", keep.data = FALSE, var.monotone = rep(0, length(gbm.x)))
gbm.fixed(data, gbm.x, gbm.y, tree.complexity = 1, site.weights = rep(1, nrow(data)), verbose = TRUE, learning.rate = 0.001, n.trees = 2000, bag.fraction = 0.5, family = "bernoulli", keep.data = FALSE, var.monotone = rep(0, length(gbm.x)))
data 
data.frame 
gbm.x 
indices of the predictors in the input dataframe 
gbm.y 
index of the response in the input dataframe 
tree.complexity 
the tree depth  sometimes referred to as interaction depth 
site.weights 
by default set equal 
verbose 
to control reporting 
learning.rate 
controls speed of the gradient descent 
n.trees 
default number of trees 
bag.fraction 
varies random sample size for each new tree 
family 
can be any of "bernoulli", "poisson", "gaussian", or "laplace" 
keep.data 
Logical. If 
var.monotone 
constrain to positive (1) or negative monontone (1) 
object of class gbm
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
Calculates a gradient boosting (gbm) object in which model complexity is determined using a training set with predictions made to a withheld set. An initial set of trees is fitted, and then trees are progressively added testing performance along the way, using gbm.perf until the optimal number of trees is identified.
As any structured ordering of the data should be avoided, a copy of the data set is BY DEFAULT randomly reordered each time the function is run.
gbm.holdout(data, gbm.x, gbm.y, learning.rate = 0.001, tree.complexity = 1, family = "bernoulli", n.trees = 200, add.trees = n.trees, max.trees = 20000, verbose = TRUE, train.fraction = 0.8, permute = TRUE, prev.stratify = TRUE, var.monotone = rep(0, length(gbm.x)), site.weights = rep(1, nrow(data)), refit = TRUE, keep.data = TRUE)
gbm.holdout(data, gbm.x, gbm.y, learning.rate = 0.001, tree.complexity = 1, family = "bernoulli", n.trees = 200, add.trees = n.trees, max.trees = 20000, verbose = TRUE, train.fraction = 0.8, permute = TRUE, prev.stratify = TRUE, var.monotone = rep(0, length(gbm.x)), site.weights = rep(1, nrow(data)), refit = TRUE, keep.data = TRUE)
data 
data.frame 
gbm.x 
indices of the predictors in the input dataframe 
gbm.y 
index of the response in the input dataframe 
learning.rate 
typically varied between 0.1 and 0.001 
tree.complexity 
sometimes called interaction depth 
family 
"bernoulli","poisson", etc. as for gbm 
n.trees 
initial number of trees 
add.trees 
number of trees to add at each increment 
max.trees 
maximum number of trees to fit 
verbose 
controls degree of screen reporting 
train.fraction 
proportion of data to use for training 
permute 
reorder data to start with 
prev.stratify 
stratify selection for presence/absence data 
var.monotone 
allows constraining of response to monotone 
site.weights 
set equal to 1 by default 
refit 
refit the model with the full data but id'd no of trees 
keep.data 
keep copy of the data 
A gbm object
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
Tests whether interactions have been detected and modelled, and reports the relative strength of these. Results can be visualised with gbm.perspec
The function assesses the magnitude of 2nd order interaction effects in gbm models fitted with interaction depths greater than 1. This is achieved by:
1. forming predictions on the linear scale for each predictor pair;
2. fitting a linear model that relates these predictions to the predictor pair, with the the predictors fitted as factors;
3. calculating the mean value of the residuals, the magnitude of which increases with the strength of any interaction effect;
4. results are stored in an array;
5. finally, the n most important interactions are identified, where n is 25
gbm.interactions(gbm.object, use.weights=FALSE, mask.object)
gbm.interactions(gbm.object, use.weights=FALSE, mask.object)
gbm.object 
A gbm object 
use.weights 
Logical. If 
mask.object 
a gbm object describing sample intensity 
object of class gbm
Takes a gbm boosted regression tree object produced by gbm.step and plots a perspective plot showing predicted values for two predictors as specified by number using x and y. Values for all other variables are set at their mean by default but values can be specified by giving a list consisting of the variable name and its desired value, e.g., c(name1 = 12.2, name2 = 57.6)
gbm.perspec(gbm.object, x = 1, y = 2, pred.means = NULL, x.label = NULL, x.range = NULL, y.label = NULL, z.label = "fitted value", y.range = NULL, z.range = NULL, leg.coords = NULL, ticktype = "detailed", theta = 55, phi = 40, smooth = "none", mask = FALSE, perspective = TRUE, ...)
gbm.perspec(gbm.object, x = 1, y = 2, pred.means = NULL, x.label = NULL, x.range = NULL, y.label = NULL, z.label = "fitted value", y.range = NULL, z.range = NULL, leg.coords = NULL, ticktype = "detailed", theta = 55, phi = 40, smooth = "none", mask = FALSE, perspective = TRUE, ...)
gbm.object 
object of class gbm 
x 
the first variable to be plotted 
y 
the second variable to be plotted 
pred.means 
allows specification of values for other variables 
x.label 
allows manual specification of the x label 
x.range 
manual range specification for the x variable 
y.label 
and y label 
z.label 
default z label 
y.range 
and the y 
z.range 
allows control of the vertical axis 
leg.coords 
can specify coords (x, y) for legend 
ticktype 
specifiy detailed types  otherwise "simple" 
theta 
rotation 
phi 
and elevation 
smooth 
controls smoothing of the predicted surface 
mask 
controls masking using a sample intensity model 
perspective 
controls whether a contour or perspective plot is drawn 
... 
allows the passing of additional arguments to plotting routine useful options include shade, ltheta, lphi for controlling illumination and cex for controlling text size  cex.axis and cex.lab have no effect 
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
Function to plot gbm response variables, with the option of adding a smooth representation of the response if requested additional options in this version allow for plotting on a common scale. Note hat fitted functions are centered by subtracting their mean.
gbm.plot(gbm.object, variable.no=0, smooth=FALSE, rug=TRUE, n.plots=length(pred.names), common.scale=TRUE, write.title=TRUE, y.label="fitted function", x.label=NULL, show.contrib=TRUE, plot.layout=c(3, 4), ...)
gbm.plot(gbm.object, variable.no=0, smooth=FALSE, rug=TRUE, n.plots=length(pred.names), common.scale=TRUE, write.title=TRUE, y.label="fitted function", x.label=NULL, show.contrib=TRUE, plot.layout=c(3, 4), ...)
gbm.object 
a gbm object  could be one from gbm.step 
variable.no 
the var to plot  if zero then plots all 
smooth 
Logical. If 
rug 
Logical. If 
n.plots 
plot the first n most important preds 
common.scale 
Logical. If 
write.title 
Logical. If 
y.label 
the default yaxis label 
x.label 
the default xaxis label 
show.contrib 
Logical. If 
plot.layout 
define the default layout for graphs on the page 
... 
other arguments to pass to the plotting useful options include cex.axis, cex.lab, etc. 
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
Plots the fitted values from a gbm object returned by any of the model fitting options. This can give a more reliable guide to the shape of the fitted surface than can be obtained from the individual functions, particularly when predictor variables are correlated and/or samples are unevenly distributed in environmental space. Allows masking out of absences to enable focus on sites with high predicted values.
gbm.plot.fits(gbm.object, v=0, mask.presence=FALSE, use.factor=FALSE )
gbm.plot.fits(gbm.object, v=0, mask.presence=FALSE, use.factor=FALSE )
gbm.object 
a gbm object 
v 
variable numbers to be plotted (if 0 then all are plotted) 
mask.presence 
Logical. If 
use.factor 
Logical. If 
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
The function takes an inital crossvalidated model as produced by gbm.step and then assesses the potential to remove predictors using kfold cross validation. This done for each fold, removing the lowest contributing predictor, and repeating this process for a set number of steps. After the removal of each predictor, the change in predictive deviance is computed relative to that obtained when using all predictors. The function returns a list containing the mean change in deviance and its standard error as a function of the number of variables removed. Having completed the cross validation, it then identifies the sequence of variable to remove when using the full data set, testing this up to the number of steps used in the crossvalidation phase of the analysis with results reported to the screen.
The function returns a table containing the order in which variables are to be removed and some vectors, each of which specifies the predictor column numbers in the original dataframe  the latter can be used as an argument to gbm.step e.g., gbm.step(data = data, gbm.x = simplify.object$pred.list[[4]]... would implement a new analysis with the original predictor set, minus its four lowest contributing predictors.
gbm.simplify(gbm.object, n.folds = 10, n.drops = "auto", alpha = 1, prev.stratify = TRUE, eval.data = NULL, plot = TRUE)
gbm.simplify(gbm.object, n.folds = 10, n.drops = "auto", alpha = 1, prev.stratify = TRUE, eval.data = NULL, plot = TRUE)
gbm.object 
a gbm object describing sample intensity 
n.folds 
number of times to repeat the analysis 
n.drops 
can be automatic or an integer specifying the number of drops to check 
alpha 
controls stopping when n.drops = "auto" 
prev.stratify 
use prevalence stratification in selecting evaluation data 
eval.data 
an independent evaluation data set  leave here for now 
plot 
plot results 
A list with these elements: deviance.summary, deviance.matrix, drop.count, final.drops, pred.list, and gbm.call = gbm.call))
John R. Leathwick and Jane Elith
Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
Function to assess the optimal number of boosting trees using kfold cross validation. This is an implementation of the crossvalidation procedure described on page 215 of Hastie et al (2001).
The data is divided into 10 subsets, with stratification by prevalence if required for presence/absence data. The function then fits a gbm model of increasing complexity along the sequence from n.trees
to n.trees + (n.steps * step.size)
, calculating the residual deviance at each step along the way. After each fold processed, the function calculates the average holdout residual deviance and its standard error and then identifies the optimal number of trees as that at which the holdout deviance is minimised. It fits a model with this number of trees, returning it as a gbm model along with additional information from the crossvalidation selection process.
gbm.step(data, gbm.x, gbm.y, offset = NULL, fold.vector = NULL, tree.complexity = 1, learning.rate = 0.01, bag.fraction = 0.75, site.weights = rep(1, nrow(data)), var.monotone = rep(0, length(gbm.x)), n.folds = 10, prev.stratify = TRUE, family = "bernoulli", n.trees = 50, step.size = n.trees, max.trees = 10000, tolerance.method = "auto", tolerance = 0.001, plot.main = TRUE, plot.folds = FALSE, verbose = TRUE, silent = FALSE, keep.fold.models = FALSE, keep.fold.vector = FALSE, keep.fold.fit = FALSE, ...)
gbm.step(data, gbm.x, gbm.y, offset = NULL, fold.vector = NULL, tree.complexity = 1, learning.rate = 0.01, bag.fraction = 0.75, site.weights = rep(1, nrow(data)), var.monotone = rep(0, length(gbm.x)), n.folds = 10, prev.stratify = TRUE, family = "bernoulli", n.trees = 50, step.size = n.trees, max.trees = 10000, tolerance.method = "auto", tolerance = 0.001, plot.main = TRUE, plot.folds = FALSE, verbose = TRUE, silent = FALSE, keep.fold.models = FALSE, keep.fold.vector = FALSE, keep.fold.fit = FALSE, ...)
data 
input data.frame 
gbm.x 
indices or names of predictor variables in 
gbm.y 
index or name of response variable in 
offset 
offset 
fold.vector 
a fold vector to be read in for cross validation with offsets 
tree.complexity 
sets the complexity of individual trees 
learning.rate 
sets the weight applied to inidivudal trees 
bag.fraction 
sets the proportion of observations used in selecting variables 
site.weights 
allows varying weighting for sites 
var.monotone 
restricts responses to individual predictors to monotone 
n.folds 
number of folds 
prev.stratify 
prevalence stratify the folds  only for presence/absence data 
family 
family  bernoulli (=binomial), poisson, laplace or gaussian 
n.trees 
number of initial trees to fit 
step.size 
numbers of trees to add at each cycle 
max.trees 
max number of trees to fit before stopping 
tolerance.method 
method to use in deciding to stop  "fixed" or "auto" 
tolerance 
tolerance value to use  if method == fixed is absolute, if auto is multiplier * total mean deviance 
plot.main 
Logical. plot holdout deviance curve 
plot.folds 
Logical. plot the individual folds as well 
verbose 
Logical. control amount of screen reporting 
silent 
Logical. to allow running with no output for simplifying model) 
keep.fold.models 
Logical. keep the fold models from cross valiation 
keep.fold.vector 
Logical. allows the vector defining fold membership to be kept 
keep.fold.fit 
Logical. allows the predicted values for observations from crossvalidation to be kept 
... 
Logical. allows for any additional plotting parameters 
object of S3 class gbm
This and other boosted regression trees (BRT) functions in the dismo package do not work if you use only one predictor. There is an easy work around: make a dummy variable with a constant value and then fit a model with two predictors, the one of interest and the dummy variable, which will be ignored by the model fitting as it has no useful information.
John R. Leathwick and Jane Elith
Hastie, T., R. Tibshirani, and J.H. Friedman, 2001. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. SpringerVerlag, New York Elith, J., J.R. Leathwick and T. Hastie, 2009. A working guide to boosted regression trees. Journal of Animal Ecology 77: 80281
data(Anguilla_train) # reduce data set to speed things up a bit Anguilla_train = Anguilla_train[1:200,] angaus.tc5.lr01 < gbm.step(data=Anguilla_train, gbm.x = 3:14, gbm.y = 2, family = "bernoulli", tree.complexity = 5, learning.rate = 0.01, bag.fraction = 0.5)
data(Anguilla_train) # reduce data set to speed things up a bit Anguilla_train = Anguilla_train[1:200,] angaus.tc5.lr01 < gbm.step(data=Anguilla_train, gbm.x = 3:14, gbm.y = 2, family = "bernoulli", tree.complexity = 5, learning.rate = 0.01, bag.fraction = 0.5)
A wrapper around the Google geocoding webservice. It returns 0 to n matches. It is important to be as precise as possible, e.g. always include the country in the locality description.
geocode(x, oneRecord=FALSE, extent=NULL, progress='', geocode_key, ...)
geocode(x, oneRecord=FALSE, extent=NULL, progress='', geocode_key, ...)
x 
A vector of locality descriptions 
oneRecord 
Logical. If 
extent 
An Extent object, or an object that can be coerced to one, to bias the search towards that region 
progress 
Character. Valid values are "" (no progress indicator), "text" or "window" 
geocode_key 
character. Your Google API key for geocoding (and billing). See https://developers.google.com/maps/documentation/geocoding/getapikey) 
... 
additional arguments (currently none implemeted) 
data.frame
with the following fields:
originalPlace 
the locality description as provided (in argument 
interpretedPlace 
the locality as interpreted by the Google API 
lon 
longitude 
lat 
latitude 
lonmin 
minimum longitude of the bounding box 
lonmax 
maximum longitude of the bounding box 
latmin 
minimum latitude of the bounding box 
latmax 
maximum latitude of the bounding box 
uncertainty 
distance from 
It is important to compare fields originalPlace
and interpretedPlace
as the Google interpretation of a (perhaps vague) locality description can be very speculative
Robert J. Hijmans
## Not run: geocode(c('1600 Pennsylvania Ave NW, Washington DC', 'Luca, Italy', 'Kampala')) geocode(c('San Jose', 'San Jose, Mexico')) geocode(c('San Jose', 'San Jose, Mexico'), oneRecord=TRUE) ## End(Not run)
## Not run: geocode(c('1600 Pennsylvania Ave NW, Washington DC', 'Luca, Italy', 'Kampala')) geocode(c('San Jose', 'San Jose, Mexico')) geocode(c('San Jose', 'San Jose, Mexico'), oneRecord=TRUE) ## End(Not run)
The geographic distance model predicts that the likelyhood of presence is highest near places where a species has been observed. It can be used as a nullmodel to calibrate crossvalidation scores with.
The predicted values are the inverse distance to the nearest known presence point. Distances smaller than or equal to zero are set to 1 (highest score).
geoDist(p, ...)
geoDist(p, ...)
p 
point locations (presence). Two column matrix, data.frame or SpatialPoints* object 
... 
Additional arguments. You must supply a lonlat= argument (logical), unless p is a SpatialPoints* object and has a valid CRS (coordinate reference system). You can also supply an additional argument 'a' for absence points (currently ignored). Argument 'a' should be of the same class as argument 'p' 
An object of class 'GeographicDistance' (inherits from DistModelclass
)
Robert J. Hijmans
predict, convHull, maxent, domain, mahal, voronoiHull, geoIDW
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) colnames(pts) < c('x', 'y') train < pts[1:12, ] test < pts[13:20, ] gd < geoDist(train, lonlat=FALSE) p < predict(gd, r) ## Not run: plot(p) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) ## End(Not run)
r < raster(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) colnames(pts) < c('x', 'y') train < pts[1:12, ] test < pts[13:20, ] gd < geoDist(train, lonlat=FALSE) p < predict(gd, r) ## Not run: plot(p) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2) ## End(Not run)
Retrieve a 'Google Map' that can be used as a background for plotting points and other spatial data.
The projection of the returned Raster object is "Mercator" (unless you use lonlat=TRUE
), and other spatial data may need to be transformed before it can be plotted on top of the Google map. You can use the Mercator
function to transform points from longitude/latitude to Mercator. For SpatialLines and SpatialPolygons objects, use spTransform
in the rgdal
package.
This function uses the Google static maps webservice, and is based on functions by Markus Loecher for the RgoogleMaps
package.
gmap(x, exp=1, type='terrain', filename='', style=NULL, scale=1, zoom=NULL, size=c(640, 640), rgb=FALSE, lonlat=FALSE, map_key, geocode_key, ...) Mercator(p, inverse = FALSE)
gmap(x, exp=1, type='terrain', filename='', style=NULL, scale=1, zoom=NULL, size=c(640, 640), rgb=FALSE, lonlat=FALSE, map_key, geocode_key, ...) Mercator(p, inverse = FALSE)
x 
a textual locality description, or an Extent object (with longitude/latitude coordinates), or an object that can be coerced to one (such as a Raster* or Spatial* object), in any (known) coordinate system 
exp 
numeric. An expansion factor to enlarge (by multiplication) the extent specified by 
type 
character. Choose from 'roadmap', 'satellite', 'hybrid', 'terrain' 
filename 
character. Filename (optional). You can open the resulting file in a GIS program 
style 
character. Additional style arguments. See https://developers.google.com/maps/documentation/mapsstatic/overview?csw=1#StyledMapFeatures.
Note that certain style features do not work in combination with (the default) 
scale 
1 or 2. Using 2 doubles the number of pixels returned (and thus gives you better image quality if you need a large image) 
zoom 
integer between 0 (the whole world) to 21 (very small area), centered on the center of the extent 
size 
vector of two integers indicating the number of columns and rows that is requested (what is returned depends on other factors as well). Maximum values are 
rgb 
logical. If 
lonlat 
logical. If 
map_key 
character. Your Google API key for mapping (and billing). See https://developers.google.com/maps/documentation/javascript/getapikey) 
geocode_key 
character. Your Google API key for geocoding (and billing). Only relevant if 
... 
additional parameters 
p 
Points. A twocolumn matrix, or a SpatialPoints object 
inverse 
Should the inverse projection be done (from Mercator to longitude/latitude?) 
If argument x
is a textual locality description, the geocode
function is used to retrieve the extent that should be mapped.
Change the type to 'roadmap' if the map returned says "sorry we have no imagery here"; or use a larger extent.
The returned RasterLayer has a Mercator projection. To plot points (or lines or polygons) on top of it, these need to be transformed first.
A matrix of longitude/latitude data can be transformed with the Mercator function used in the example below. 'Spatial*' objects can be transformed with spTransform
p < spTransform(x, "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs")
RasterLayer
Robert Hijmans and Sebastien Rochette, based on code by Markus Loecher, Sense Networks <markus at sensenetworks.com> in the RgoogleMaps package
## Not run: mymapkey = "pktHVbDiymfUL" mygeokey = "Skxe99adfKeax" library(rgdal) # from a maxtrix with lon/lat points x < runif(30)*10 + 40 y < runif(30)*10  20 xy < cbind(x, y) g < gmap(xy, type='hybrid', map_key=mymapkey) plot(g, inter=TRUE) points(Mercator(xy) , col='red', pch=20) # or from an Extent object e < extent( 121.9531 , 120.3897 , 35.36 , 36.61956 ) # you can also get an Extent object by clicking on the map twice after using: # drawExtent() r < gmap(e, map_key=mymapkey) plot(r, interpolate=TRUE) # transform points to Mercator for plotting on top of map: pt < matrix(c(121, 36), ncol=2) ptm < Mercator(pt) points(ptm, cex=3, pch=20, col='blue') Mercator(ptm, inverse=TRUE) # transform Spatial objects to Mercator for plotting on top of map # here for points, but particularly relevant for lines and polygons pt < data.frame(pt) coordinates(pt) < ~X1 + X2 proj4string(pt) <"+proj=longlat +datum=WGS84 +ellps=WGS84" ptm2 < spTransform(pt, CRS("+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs")) points(ptm, col='red', pch='x', cex=3) # get a map using names g = gmap('Australia', map_key=mymapkey, geocode_key=mygeokey) plot(g, inter=TRUE) gs = gmap('Sydney, New South Wales, Australia', type='satellite', map_key=mymapkey, geocode_key=mygeokey) plot(gs, inter=TRUE) gs = gmap('Sydney, Australia', type='satellite', exp=3, map_key=mymapkey, geocode_key=mygeokey) plot(gs, inter=TRUE) gs = gmap('Sydney, Australia', type='hybrid', zoom=10, scale=2, map_key=mymapkey, geocode_key=mygeokey) plot(gs, inter=TRUE) # styles: g < gmap("Brooklyn", style="feature:road.localelement:geometryhue:0x00ff00saturation:100 &style=feature:landscapeelement:geometrylightness:100", type='roadmap', map_key=mymapkey, geocode_key=mygeokey) plot(g) ## End(Not run)
## Not run: mymapkey = "pktHVbDiymfUL" mygeokey = "Skxe99adfKeax" library(rgdal) # from a maxtrix with lon/lat points x < runif(30)*10 + 40 y < runif(30)*10  20 xy < cbind(x, y) g < gmap(xy, type='hybrid', map_key=mymapkey) plot(g, inter=TRUE) points(Mercator(xy) , col='red', pch=20) # or from an Extent object e < extent( 121.9531 , 120.3897 , 35.36 , 36.61956 ) # you can also get an Extent object by clicking on the map twice after using: # drawExtent() r < gmap(e, map_key=mymapkey) plot(r, interpolate=TRUE) # transform points to Mercator for plotting on top of map: pt < matrix(c(121, 36), ncol=2) ptm < Mercator(pt) points(ptm, cex=3, pch=20, col='blue') Mercator(ptm, inverse=TRUE) # transform Spatial objects to Mercator for plotting on top of map # here for points, but particularly relevant for lines and polygons pt < data.frame(pt) coordinates(pt) < ~X1 + X2 proj4string(pt) <"+proj=longlat +datum=WGS84 +ellps=WGS84" ptm2 < spTransform(pt, CRS("+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs")) points(ptm, col='red', pch='x', cex=3) # get a map using names g = gmap('Australia', map_key=mymapkey, geocode_key=mygeokey) plot(g, inter=TRUE) gs = gmap('Sydney, New South Wales, Australia', type='satellite', map_key=mymapkey, geocode_key=mygeokey) plot(gs, inter=TRUE) gs = gmap('Sydney, Australia', type='satellite', exp=3, map_key=mymapkey, geocode_key=mygeokey) plot(gs, inter=TRUE) gs = gmap('Sydney, Australia', type='hybrid', zoom=10, scale=2, map_key=mymapkey, geocode_key=mygeokey) plot(gs, inter=TRUE) # styles: g < gmap("Brooklyn", style="feature:road.localelement:geometryhue:0x00ff00saturation:100 &style=feature:landscapeelement:geometrylightness:100", type='roadmap', map_key=mymapkey, geocode_key=mygeokey) plot(g) ## End(Not run)
Sample points from xy, using a grid (raster) as stratification. Up to n points are sampled from each stratum (cell). For "chessboard" sampling (i.e. sampling from half the cells), use the argument chess='black'
, or chess='white'
.
gridSample(xy, r, n=1, chess='')
gridSample(xy, r, n=1, chess='')
xy 
A two column matrix or data.frame with x and y coordinates (or longitude and latitude), or a SpatialPoints* object 
r 
Raster* object 
n 
Maximum number of samples per cell 
chess 
Character. ”, 'black', or 'white'. If 'black' or 'white', "chessboard" sampling is used. I.e. only the 'white' fields, or only the 'black' fields are sampled. Cell number 1 (the upper left corner of 
A two column matrix with x and y coordinates (or longitude and latitude)
Robert J. Hijmans
x < rnorm(1000, 10, 5) y < rnorm(1000, 50, 5) xy < cbind(x,y) res < 5 r < raster(extent(range(xy[,1]), range(xy[,2])) + res) res(r) < res samp < gridSample(xy, r, n=1) plot(xy, cex=0.1) points(samp, pch='x', col='red')
x < rnorm(1000, 10, 5) y < rnorm(1000, 50, 5) xy < cbind(x,y) res < 5 r < raster(extent(range(xy[,1]), range(xy[,2])) + res) res(r) < res samp < gridSample(xy, r, n=1) plot(xy, cex=0.1) points(samp, pch='x', col='red')
Inversedistance weighted predictions for presence/absence data. Computed with the gstat function from the gstat package.
geoIDW(p, a, ...)
geoIDW(p, a, ...)
p 
Presence points. Two column matrix, data.frame, or a SpatialPoints* object 
a 
Absence points. Must be of the same class as p 
... 
Addtional arguments. None implemented 
An object of class InvDistWeightModel (inherits from DistModelclass
)
Robert J. Hijmans
r < raster(system.file("external/rlogo.grd", package="raster")) # presence points p < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) # absence points a < matrix(c(30, 23, 5, 5, 31, 33, 91, 63, 60, 88, 93, 97, 65, 68, 85, 97, 35, 32, 29, 55, 3, 8, 19, 71, 49, 36, 69, 41, 20, 28, 18, 9, 5, 9, 25, 71, 8, 32, 46, 60), ncol=2) idw < geoIDW(p, a) prd < predict(r, idw) ## Not run: plot(prd) points(p) points(a, pch='x') ## End(Not run)
r < raster(system.file("external/rlogo.grd", package="raster")) # presence points p < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) # absence points a < matrix(c(30, 23, 5, 5, 31, 33, 91, 63, 60, 88, 93, 97, 65, 68, 85, 97, 35, 32, 29, 55, 3, 8, 19, 71, 49, 36, 69, 41, 20, 28, 18, 9, 5, 9, 25, 71, 8, 32, 46, 60), ncol=2) idw < geoIDW(p, a) prd < predict(r, idw) ## Not run: plot(prd) points(p) points(a, pch='x') ## End(Not run)
kfold partitioning of a data set for model testing purposes. Each record in a matrix (or similar data structure) is randomly assigned to a group. Group numbers are between 1 and k
.
kfold(x, k=5, by)
kfold(x, k=5, by)
x 
a vector, matrix, data.frame, or Spatial object 
k 
number of groups 
by 
Optional argument. A vector or factor with subgroups (e.g. species). Its length should be the same as the number of records in x 
a vector with group assignments
Robert J. Hijmans
#library(disdat) #data(NSWtrain) ## a single species #srsp1 < subset(NSWtrain, spid=='srsp1') #kfold(srsp1, k = 5) ## all species #k = kfold(NSWtrain, k=5, by=NSWtrain$spid) #k[NSWtrain$spid=='srsp1'] ## each group has the same number of records ##(except for adjustments if the number of records divided by k is not an integer) #table(k[NSWtrain$spid=='srsp1']) #k[NSWtrain$spid=='ousp5']
#library(disdat) #data(NSWtrain) ## a single species #srsp1 < subset(NSWtrain, spid=='srsp1') #kfold(srsp1, k = 5) ## all species #k = kfold(NSWtrain, k=5, by=NSWtrain$spid) #k[NSWtrain$spid=='srsp1'] ## each group has the same number of records ##(except for adjustments if the number of records divided by k is not an integer) #table(k[NSWtrain$spid=='srsp1']) #k[NSWtrain$spid=='ousp5']
Distribution model based on the Mahalanobis distance. The predictions are (1distance). I.e. the highest possible value is 1, and there will likely be large negative numbers.
mahal(x, p, ...)
mahal(x, p, ...)
x 
Raster* object or matrix 
p 
two column matrix or SpatialPoints* object 
... 
Additional arguments. Currently not used 
An object of class Mahalanobis (inherits from DistModelclass
)
Robert J. Hijmans
predict, maxent, bioclim, domain
logo < stack(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(48.243420, 48.243420, 47.985820, 52.880230, 49.531423, 46.182616, 54.168232, 69.624263, 83.792291, 85.337894, 74.261072, 83.792291, 95.126713, 84.565092, 66.275456, 41.803408, 25.832176, 3.936132, 18.876962, 17.331359, 7.048974, 13.648543, 26.093446, 28.544714, 39.104026, 44.572240, 51.171810, 56.262906, 46.269272, 38.161230, 30.618865, 21.945145, 34.390047, 59.656971, 69.839163, 73.233228, 63.239594, 45.892154, 43.252326, 28.356155), ncol=2) # fit model m < mahal(logo, pts) # make a prediction predict(m, logo[1]) x < predict(m, logo) # or x < predict(logo, m) via raster::predict # plot(x > 0)
logo < stack(system.file("external/rlogo.grd", package="raster")) #presence data pts < matrix(c(48.243420, 48.243420, 47.985820, 52.880230, 49.531423, 46.182616, 54.168232, 69.624263, 83.792291, 85.337894, 74.261072, 83.792291, 95.126713, 84.565092, 66.275456, 41.803408, 25.832176, 3.936132, 18.876962, 17.331359, 7.048974, 13.648543, 26.093446, 28.544714, 39.104026, 44.572240, 51.171810, 56.262906, 46.269272, 38.161230, 30.618865, 21.945145, 34.390047, 59.656971, 69.839163, 73.233228, 63.239594, 45.892154, 43.252326, 28.356155), ncol=2) # fit model m < mahal(logo, pts) # make a prediction predict(m, logo[1]) x < predict(m, logo) # or x < predict(logo, m) via raster::predict # plot(x > 0)
Build a "MaxEnt" (Maximum Entropy) species distribution model (see references below). The function uses environmental data for locations of known presence and for a large number of 'background' locations. Environmental data can be extracted from raster files. The result is a model object that can be used to predict the suitability of other locations, for example, to predict the entire range of a species.
Background points are sampled randomly from the cells that are not NA
in the first predictor variable, unless background points are specified with argument a
.
This function uses the MaxEnt species distribution model software by Phillips, Dudik and Schapire.
## S4 method for signature 'Raster,ANY' maxent(x, p, a=NULL, factors=NULL, removeDuplicates=TRUE, nbg=10000, ...) ## S4 method for signature 'SpatialGridDataFrame,ANY' maxent(x, p, a=NULL, removeDuplicates=TRUE, nbg=10000, ...) ## S4 method for signature 'data.frame,vector' maxent(x, p, args=NULL, path, ...) ## S4 method for signature 'missing,missing' maxent(x, p, silent=FALSE, ...)
## S4 method for signature 'Raster,ANY' maxent(x, p, a=NULL, factors=NULL, removeDuplicates=TRUE, nbg=10000, ...) ## S4 method for signature 'SpatialGridDataFrame,ANY' maxent(x, p, a=NULL, removeDuplicates=TRUE, nbg=10000, ...) ## S4 method for signature 'data.frame,vector' maxent(x, p, args=NULL, path, ...) ## S4 method for signature 'missing,missing' maxent(x, p, silent=FALSE, ...)
x 
Predictors. Raster* object or SpatialGridDataFrame, containing grids with predictor variables. These will be used to extract values from for the point locations. 
p 
Occurrence data. This can be a data.frame, matrix, SpatialPoints* object, or a vector. If If 
a 
Background points. Only used if 
nbg 
Number of background points to use. These are sampled randomly from the cells that are not 
factors 
character. Which (if any) variables should be considered as categorical? Either by (layer)name or by index. Only used when argument 'x' is a Raster* object because it is not needed in other cases as you can set the appropriate class to the variables in the data.frame 
args 
charater. Additional argument that can be passed to MaxEnt. See the MaxEnt help for more information. The R maxent function only uses the arguments relevant to model fitting. There is no point in using args='outputformat=raw' when *fitting* the model; but you can use arguments relevant for *prediction* when using the predict function. Some other arguments do not apply at all to the R implementation. An example is 'outputfiletype', because the 'predict' function has its own 'filename' argument for that 
removeDuplicates 
Boolean. If 
path 
character. Optional argument to set where you want the MaxEnt output files to be stored. This allows you to permanently keep these files. If not supplied the MaxEnt files will be stored in a temporary file. These are the files that are shown in a browser when typing the model name or when you use "show(model)" 
silent 
Boolean. If 
... 
Additional arguments 
An object of class 'MaxEnt' (inherits from DistModelclass
). Or a 'MaxEntReplicates' object if you use 'replicates=' as part of the args
argument. If the function is run without any arguments a boolean value is returned (TRUE
if maxent.jar was found).
Steven Phillips and Robert J. Hijmans
https://biodiversityinformatics.amnh.org/open_source/maxent/
Steven J. Phillips, Miroslav Dudik, Robert E. Schapire, 2004. A maximum entropy approach to species distribution modeling. Proceedings of the TwentyFirst International Conference on Machine Learning. p. 655662.
Steven J. Phillips, Robert P. Anderson, Robert E. Schapire, 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190:231259.
Jane Elith, Steven J. Phillips, Trevor Hastie, Miroslav Dudik, Yung En Chee, Colin J. Yates, 2011. A statistical explanation of MaxEnt for ecologists. Diversity and Distributions 17:4357. doi:10.1111/j.14724642.2010.00725.x
# test if you can use maxent maxent() if (maxent()) { # get predictor variables fnames < list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE ) predictors < stack(fnames) #plot(predictors) # file with presence points occurence < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') occ < read.table(occurence, header=TRUE, sep=',')[,1] # witholding a 20% sample for testing fold < kfold(occ, k=5) occtest < occ[fold == 1, ] occtrain < occ[fold != 1, ] # fit model, biome is a categorical variable me < maxent(predictors, occtrain, factors='biome') # see the maxent results in a browser: me # use "args" # me2 < maxent(predictors, occtrain, factors='biome', args=c("J", "P")) # plot showing importance of each variable plot(me) # response curves # response(me) # predict to entire dataset r < predict(me, predictors) # with some options: # r < predict(me, predictors, args=c("outputformat=raw"), progress='text', # filename='maxent_prediction.grd') plot(r) points(occ) #testing # background data bg < randomPoints(predictors, 1000) #simplest way to use 'evaluate' e1 < evaluate(me, p=occtest, a=bg, x=predictors) # alternative 1 # extract values pvtest < data.frame(extract(predictors, occtest)) avtest < data.frame(extract(predictors, bg)) e2 < evaluate(me, p=pvtest, a=avtest) # alternative 2 # predict to testing points testp < predict(me, pvtest) head(testp) testa < predict(me, avtest) e3 < evaluate(p=testp, a=testa) e3 threshold(e3) plot(e3, 'ROC') }
# test if you can use maxent maxent() if (maxent()) { # get predictor variables fnames < list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE ) predictors < stack(fnames) #plot(predictors) # file with presence points occurence < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') occ < read.table(occurence, header=TRUE, sep=',')[,1] # witholding a 20% sample for testing fold < kfold(occ, k=5) occtest < occ[fold == 1, ] occtrain < occ[fold != 1, ] # fit model, biome is a categorical variable me < maxent(predictors, occtrain, factors='biome') # see the maxent results in a browser: me # use "args" # me2 < maxent(predictors, occtrain, factors='biome', args=c("J", "P")) # plot showing importance of each variable plot(me) # response curves # response(me) # predict to entire dataset r < predict(me, predictors) # with some options: # r < predict(me, predictors, args=c("outputformat=raw"), progress='text', # filename='maxent_prediction.grd') plot(r) points(occ) #testing # background data bg < randomPoints(predictors, 1000) #simplest way to use 'evaluate' e1 < evaluate(me, p=occtest, a=bg, x=predictors) # alternative 1 # extract values pvtest < data.frame(extract(predictors, occtest)) avtest < data.frame(extract(predictors, bg)) e2 < evaluate(me, p=pvtest, a=avtest) # alternative 2 # predict to testing points testp < predict(me, pvtest) head(testp) testa < predict(me, avtest) e3 < evaluate(p=testp, a=testa) e3 threshold(e3) plot(e3, 'ROC') }
Compute multivariate environmental similarity surfaces (MESS), as described by Elith et al., 2010
mess(x, v, full=FALSE, filename='', ...)
mess(x, v, full=FALSE, filename='', ...)
x 
Raster* object 
v 
matrix or data.frame containing the reference values. Each column should correspond to one layer of the Raster* object 
full 
logical. If 
filename 
character. Output filename (optional) 
... 
additional arguments as for 
v
can be obtained for a set of points using extract
.
A RasterBrick with layers corresponding to the input layers and an additional layer with the mess values (if full=TRUE
and nlayers(x) > 1
) or a RasterLayer with the MESS values (if full=FALSE
).
JeanPierre Rossi <[email protected]>, Robert Hijmans, Paulo van Breugel
Elith J., M. Kearney M., and S. Phillips, 2010. The art of modelling rangeshifting species. doi:10.1111/j.2041210X.2010.00036.xMethods in Ecology and Evolution 1:330342.
set.seed(9) r < raster(ncol=10, nrow=10) r1 < setValues(r, (1:ncell(r))/10 + rnorm(ncell(r))) r2 < setValues(r, (1:ncell(r))/10 + rnorm(ncell(r))) r3 < setValues(r, (1:ncell(r))/10 + rnorm(ncell(r))) s < stack(r1,r2,r3) names(s) < c('a', 'b', 'c') xy < cbind(rep(c(10,30,50), 3), rep(c(10,30,50), each=3)) refpt < extract(s, xy) ms < mess(s, refpt, full=TRUE) plot(ms) ## Not run: filename < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') bradypus < read.table(filename, header=TRUE, sep=',') bradypus < bradypus[,2:3] files < list.files(path=paste(system.file(package="dismo"),'/ex', sep=''), pattern='grd', full.names=TRUE ) predictors < stack(files) predictors < dropLayer(x=predictors,i=9) reference_points < extract(predictors, bradypus) mss < mess(x=predictors, v=reference_points, full=TRUE) plot(mss) ## End(Not run)
set.seed(9) r < raster(ncol=10, nrow=10) r1 < setValues(r, (1:ncell(r))/10 + rnorm(ncell(r))) r2 < setValues(r, (1:ncell(r))/10 + rnorm(ncell(r))) r3 < setValues(r, (1:ncell(r))/10 + rnorm(ncell(r))) s < stack(r1,r2,r3) names(s) < c('a', 'b', 'c') xy < cbind(rep(c(10,30,50), 3), rep(c(10,30,50), each=3)) refpt < extract(s, xy) ms < mess(s, refpt, full=TRUE) plot(ms) ## Not run: filename < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') bradypus < read.table(filename, header=TRUE, sep=',') bradypus < bradypus[,2:3] files < list.files(path=paste(system.file(package="dismo"),'/ex', sep=''), pattern='grd', full.names=TRUE ) predictors < stack(files) predictors < dropLayer(x=predictors,i=9) reference_points < extract(predictors, bradypus) mss < mess(x=predictors, v=reference_points, full=TRUE) plot(mss) ## End(Not run)
Class to store results of model crossvalidation with presence/absence (0/1) data
presence
:presence data used
absence
:absence data used
np
:number of presence points
na
:number of absence points
auc
:Area under the receiver operator (ROC) curve
pauc
:pvalue for the AUC (for the Wilcoxon test W statistic
cor
:Correlation coefficient
pcor
:pvalue for correlation coefficient
t
:vector of thresholds used to compute confusion matrices
confusion
:confusion matrices
prevalence
:Prevalence
ODP
:Overall diagnostic power
CCR
:Correct classification rate
TPR
:True positive rate
TNR
:True negative rate
FPR
:False positive rate
FNR
:False negative rate
PPP
:Positive predictive power
NPP
:Negative predictive power
MCR
:Misclassification rate
OR
:Oddsratio
kappa
:Cohen's kappa
Robert J. Hijmans
Fielding, A. H. & J.F. Bell, 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24: 3849
Liu, C., M. White & G. Newell, 2011. Measuring and comparing the accuracy of species distribution models with presenceabsence data. Ecography 34: 232243.
Compute niche equivalency for two species following Warren et al. (2009). The statistic ranges from 0 to 1.
nicheEquivalency(sp1, sp2, predictors, n=99, model=maxent, verbose=TRUE, ...)
nicheEquivalency(sp1, sp2, predictors, n=99, model=maxent, verbose=TRUE, ...)
sp1 
coordinates for species 1 (matrix with (x, y) or (lon, lat), or SpatialPoints) 
sp2 
coordinates for species 2 (matrix with (x, y) or (lon, lat), or SpatialPoints) 
predictors 
Raster object with environmental variables 
n 
integer. Number of randomizations 
model 
function. modeling algorithm to me used 
verbose 
logical. If 
... 
additional arguments (none) 
numeric
Brian Anacker. Based on a similar function in by Christoph Heibl in the phyloclim package
Warren, D.L., R.E. Glor, M. Turelli, 2008. Environmental niche equivalency versus conservatism: quantitative approaches to niche evolution. Evolution 62:28682883.
Compute niche overlap from predictions of species distributions with the 'I' or 'D' similarity statistic of Warren et al. (2009). The statistic ranges from 0 (no overlap) to 1 (the distributions are identical).
nicheOverlap(x, y, stat='I', mask=TRUE, checkNegatives=TRUE)
nicheOverlap(x, y, stat='I', mask=TRUE, checkNegatives=TRUE)
x 
RasterLayer with nonnegative values (predictions of the probability that a site is suitable for a species) 
y 
RasterLayer with nonnegative values, as above 
stat 
character either 'I' or 'D' to get the statistic with that name 
mask 
logical. If 
checkNegatives 
logical. If 
numeric
Based on SDMTools::Istat by Jeremy VanDerWal
Warren, D.L., R.E. Glor, M. Turelli, and D. Funk. 2009. Environmental niche equivalency versus conservatism: quantitative approaches to niche evolution. Evolution 62:28682883; Erratum: Evolution 65: 1215
r1 < raster(nr=18, nc=36) r2 < raster(nr=18, nc=36) set.seed(0) r1[] < runif(ncell(r1)) r2[] < runif(ncell(r1)) nicheOverlap(r1, r2)
r1 < raster(nr=18, nc=36) r2 < raster(nr=18, nc=36) set.seed(0) r1[] < runif(ncell(r1)) r2[] < runif(ncell(r1)) nicheOverlap(r1, r2)
Pair plots of presence and absence (background) data.
pairs(x, v=NULL, pa='pa', hist=TRUE, cor=TRUE)
x 
Object of class DistModel or derived from that class (such as 'MaxEnt', 'Bioclim')  
v 
numeric, to select a subset of pairs, e.g. v=1:3 to plot only the first three variables  
pa 
Character. Either 'pa', 'p', or 'a' to show presence and absence, presence, or absence data respectively  
hist 
logical. If TRUE a histogram of the values is shown on the diagonal 

cor 
logical. If TRUE the correlation coefficient is shown in the upper panels 

Robert J. Hijmans
Plot predictor values for occurrence (presence and absence) data in a DistModel (or derived) object.
usage: plot(x, y, ...)
x 
Object of class DistModel or from a class that inherits from it  
y 
missing  
... 
Additional arguments that can be passed to plot  
Robert J. Hijmans
Extract values from a Raster* object for point locations. This function adds a few options that can be useful in the context of species distribution modeling to extract
function in the raster package.
pointValues(x, p, a, uniquecells = TRUE, na.rm = TRUE)
pointValues(x, p, a, uniquecells = TRUE, na.rm = TRUE)
x 
A Raster* object 
p 
Points. Twocolum matrix or data.frame; or a SpatialPoints* object 
a 
Additional points (absences). Twocolum matrix or data.frame; or a SpatialPoints* object 
uniquecells 
Logical. If 
na.rm 
Logical. If 
matrix
Robert J. Hijmans
Make a RasterLayer with a prediction based on a model object of class the inherits from 'DistModel', including: Bioclim, Domain, MaxEnt, Mahalanobis, and GeographicDistance. Predictions with model objects that do not inherit from DistModel can be made using the similar predict
function in the 'raster' package.
Provide a Raster* object with the independent variables. The names of the layers in the Raster* object should include those expected by the model.
A RasterLayer or, (if x
is a matrix), a vector.
predict(object, x, ext=NULL, filename='', progress='text', ...)
object 
A fitted model of class Bioclim, Domain, MaxEnt, ConvexHull, or Mahalanobis (classes that inherit from DistModel)  
x 
A Raster* object or a data.frame  
ext 
An extent object to limit the prediction to a subregion of 'x'. Or an object that can be coerced to an Extent object by extent; such as a Raster* or Spatial* object  
filename 
Output filename for a new raster; if NA the result is not written to a file but returned with the RasterLayer object, in the data slot  
progress 
Character. Valid values are "" (no progress bar), "text" and "windows" (on that platform only)  
... 
Additional model specific arguments. And additional arguments for file writing as for writeRaster  
For maxent models, there is an additional argument 'args'
used to pass arguments (options) to the maxent software. See the help page for maxent
for more information.
For bioclim models, there is an additional argument 'tails'
which you can use to ignore the left or right tail of the percentile distribution for a variable. If supplied, tails should be a character vector with a length equal to the number of variables used in the model. Valid values are "both" (the default), "low" and "high". For example, if you have a variable x with an observed distribution between 10 and 20 and you are predicting the bioclim value for a value 25, the default result would be zero (outside of all observed values); but if you use tail='low', the high (right) tail is ignored and the value returned will be 1.
For geoDist models, there is an additional argument fun
that allows you to use your own (inverse) distance function, and argument scale=1
that allows you to scale the values (distances smaller than this value become one, and the others are divided by this value before computing the inverse distance).
Robert J. Hijmans
For spatial predictions with GLM, GAM, BRT, randomForest, etc., see predict in the Raster package.
To fit a model that can be used with this predict method, see maxent, bioclim, mahal, domain, geoDist, convHull
Extent object: extent
logo < stack(system.file("external/rlogo.grd", package="raster")) pts < matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85, 66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31, 22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2) b < bioclim(logo, pts) # prediction for a subregion e < extent(30,90,20,60) p < predict(b, logo, progress='text', ext=e) plot(p)
logo < stack(system.file("external/rlogo.grd", package="raster")) pts < matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85, 66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31, 22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2) b < bioclim(logo, pts) # prediction for a subregion e < extent(30,90,20,60) p < predict(b, logo, progress='text', ext=e) plot(p)
Simple helper function to prepare data for model fitting
prepareData(x, p, b, factors, xy=FALSE)
prepareData(x, p, b, factors, xy=FALSE)
x 
Raster* object 
p 
presence points 
b 
background (or absence) points 
factors 
vectors indicating which variables are factors (using layer names or numbers) 
xy 
logical. If 
data.frame with nlayers(x)+1
columns and nrow(p) + nrow(b)
rows. The first column, 'pb' indicates whether a record represents presence '1' or background '0' values. The other columns have the values from the Raster* object.
Robert J. Hijmans
Select pairs of points from two sets (without replacement) that have a similar distance to their nearest point in another set of points.
For each point in "fixed
", a point is selected from "sample
" that has a similar distance (as defined by threshold
) to its nearest point in "reference
" (note that these are likely to be different points in reference
). The select point is either the nearest point nearest=TRUE
, or a randomly select point nearest=FALSE
that is within the threshold distance. If no point within the threshold distance is found in sample
, the point in fixed
is dropped.
Hijmans (2012) proposed this sampling approach to remove 'spatial sorting bias' (ssb
) from evaluation data used in crossvalidation of presenceonly species distribution models. In that context, fixed
are the testingpresence points, sample
the testingabsence (or testingbackground) points, and reference
the trainingpresence points.
pwdSample(fixed, sample, reference, tr=0.33, nearest=TRUE, n=1, lonlat=TRUE, warn=TRUE)
pwdSample(fixed, sample, reference, tr=0.33, nearest=TRUE, n=1, lonlat=TRUE, warn=TRUE)
fixed 
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations for which a pair should be found in 
sample 
as above for point locations from which to sample to make a pair with a point from 
reference 
as above for reference point locations to which distances are computed 
n 
How many pairs do you want for each point in 
tr 
Numeric, normally below 1. The threshold distance for a pair of points (one of 
nearest 
Logical. If 
lonlat 
Logical. Use 
warn 
Logical. If 
A matrix of nrow(fixed) and ncol(n), that indicates, for each point (row) in fixed
which point(s) in sample
it is paired to; or NA
if no suitable pair was available.
Robert J. Hijmans
Hijmans, R.J., 2012. Crossvalidation of species distribution models: removing spatial sorting bias and calibration with a nullmodel. Ecology 93: 679688
ref < matrix(c(54.5,38.5, 2.5, 9.5, 45.5, 1.5, 9.5, 4.5, 10.5, 10.5), ncol=2) fix < matrix(c(56.5, 30.5, 6.5, 14.5, 25.5, 48.5, 14.5, 2.5, 14.5, 11.5, 17.5, 11.5), ncol=2) r < raster() extent(r) < c(110, 110, 45, 45) r[] < 1 set.seed(0) sam < randomPoints(r, n=50) par(mfrow=c(1,2)) plot(sam, pch='x') points(ref, col='red', pch=18, cex=2) points(fix, col='blue', pch=20, cex=2) i < pwdSample(fix, sam, ref, lonlat=TRUE) i sfix < fix[!is.na(i), ] ssam < sam[i[!is.na(i)], ] ssam plot(sam, pch='x', cex=0) points(ssam, pch='x') points(ref, col='red', pch=18, cex=2) points(sfix, col='blue', pch=20, cex=2) # try to get 3 pairs for each point in 'fixed' pwdSample(fix, sam, ref, lonlat=TRUE, n=3)
ref < matrix(c(54.5,38.5, 2.5, 9.5, 45.5, 1.5, 9.5, 4.5, 10.5, 10.5), ncol=2) fix < matrix(c(56.5, 30.5, 6.5, 14.5, 25.5, 48.5, 14.5, 2.5, 14.5, 11.5, 17.5, 11.5), ncol=2) r < raster() extent(r) < c(110, 110, 45, 45) r[] < 1 set.seed(0) sam < randomPoints(r, n=50) par(mfrow=c(1,2)) plot(sam, pch='x') points(ref, col='red', pch=18, cex=2) points(fix, col='blue', pch=20, cex=2) i < pwdSample(fix, sam, ref, lonlat=TRUE) i sfix < fix[!is.na(i), ] ssam < sam[i[!is.na(i)], ] ssam plot(sam, pch='x', cex=0) points(ssam, pch='x') points(ref, col='red', pch=18, cex=2) points(sfix, col='blue', pch=20, cex=2) # try to get 3 pairs for each point in 'fixed' pwdSample(fix, sam, ref, lonlat=TRUE, n=3)
Null model based on randomization of locations as suggested by Raes and ter Steege (2007).
nullRandom(x, model, n=25, rep=25, pa=FALSE)
nullRandom(x, model, n=25, rep=25, pa=FALSE)
x 
data.frame with environmental predictor values for collecting localities 
model 
Model function that creates a model of class 'DistModel' 
n 
Sample size 
rep 
Number of repetitions 
pa 
Boolean. Prensenceonly or presence/background model (e.g. Maxent) 
List with n
object of class ModelEvaluationclass
Robert J. Hijmans
Raes, N. & H. ter Steege, 2007. A nullmodel for significance testing of presenceonly species distribution models. Ecography 30:727736.
predictors < stack(list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE )) occurence < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') occ < read.table(occurence, header=TRUE, sep=',')[,1] x < extract(predictors, occ) nr < nullRandom(x, bioclim, n=25, rep=25, pa=FALSE) mean(sapply(nr, function(x)x@auc))
predictors < stack(list.files(path=paste(system.file(package="dismo"), '/ex', sep=''), pattern='grd', full.names=TRUE )) occurence < paste(system.file(package="dismo"), '/ex/bradypus.csv', sep='') occ < read.table(occurence, header=TRUE, sep=',')[,1] x < extract(predictors, occ) nr < nullRandom(x, bioclim, n=25, rep=25, pa=FALSE) mean(sapply(nr, function(x)x@auc))
Generate random points that can be used to extract background values ("randomabsence"). The points are sampled (without replacement) from the cells that are not 'NA
' in raster 'mask
'.
If the coordinate reference system (of mask
) is longitude/latitude, sampling is weighted by the size of the cells. That is, because cells close to the equator are larger than cells closer to the poles, equatorial cells have a higher probability of being selected.
randomPoints(mask, n, p, ext=NULL, extf=1.1, excludep=TRUE, prob=FALSE, cellnumbers=FALSE, tryf=3, warn=2, lonlatCorrection=TRUE)
randomPoints(mask, n, p, ext=NULL, extf=1.1, excludep=TRUE, prob=FALSE, cellnumbers=FALSE, tryf=3, warn=2, lonlatCorrection=TRUE)
mask 
Raster* object. If the object has cell values, cells with 
n 
integer. Number of points 
p 
Presence points (if provided, random points won't be in the same cells (as defined by mask) 
ext 
Extent object. Can be used to restrict sampling to a spatial extent 
extf 
numeric. Multiplyer to adjust the size of extent 'ext'. The default increases of 1.1 increases the extent a little (5% at each side of the extent) 
excludep 
logical. If 
prob 
logical. If 
cellnumbers 
logical. If 
tryf 
numeric > 1. Multiplyer used for initial sample size from which the requested sample size is extracted after removing NA points (outside of mask) 
warn 
integer. 2 or higher gives most warnings. 0 or lower gives no warnings if sample size 
lonlatCorrection 
logical. If 
matrix with coordinates, or, if cellnumbers=TRUE
, a vector with cell numbers.
Robert J. Hijmans
The Rectangular Hull model predicts that a species is present at sites inside the minimum (rotated) bounding rectangle of a set of training points, and absent outside that rectangle.
rectHull(p, ...)
rectHull(p, ...)
p 
point locations (presence). Two column matrix, data.frame or SpatialPoints* object 
... 
Additional arguments. See details 
You can supply an argument n (>= 1) to get n hulls around subset of the points. This uses kmeans to form clusters. To reproduce the clusters you may need to use set.seed
.
An object of class 'RectangularHull' (inherits from DistModelclass
)
Robert J. Hijmans. Using an algorithm by whuber and Bangyou on gis.stackexchange.com
predict, circleHull, convHull, maxent, domain, mahal
r < raster(system.file("external/rlogo.grd", package="raster")) # presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] rh < rectHull(train) predict(rh, test) plot(r) plot(rh, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(rh, r, progress='') plot(pr) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2)
r < raster(system.file("external/rlogo.grd", package="raster")) # presence data pts < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) train < pts[1:12, ] test < pts[13:20, ] rh < rectHull(train) predict(rh, test) plot(r) plot(rh, border='red', lwd=2, add=TRUE) points(train, col='red', pch=20, cex=2) points(test, col='black', pch=20, cex=2) pr < predict(rh, r, progress='') plot(pr) points(test, col='black', pch=20, cex=2) points(train, col='red', pch=20, cex=2)
Generate 'response plots', i.e. single variable response curves for a model
response(x, ...)
response(x, ...)
x 
Model object that inherits from 'DistModel', e.g. 'MaxEnt'. Also works for some other models (e.g. GLM) 
... 
Additional arguments. See Details 
var 
Variable to be plotted (if NULL, all variables will be plotted)  
at 
Function to indicate at what level the other variables should be. E.g. median (the default), mean, min, max. Note that currently the function fails when 'mean' is used and one of the variables is a factor. Instead of a function you can also supply a singlerow matrix with values for each of the variables. If NULL, partial response plots as are computed (this can take a lot of time). Partial response plots show, for each variable, the average prediction for all data points that were used to fit the model (or other data, if supplied), for 10 regularly spaced values of the variable.  
range 
'pa' (default) or 'p'. Show responses for the range of the presence data (p) or presence and absence (background) data (pa) for models that use absence data.  
expand 
percentage to expand the range of values with. Default is 10  
rug 
Logical. If TRUE (the default) a 'rug' of deciles is plotted on the horizontal axes) 

data 
data.frame with data to use, normally this is the data used to fit the model and does not needs to be supplied as they are available from the model object  
fun 
predict function. The default is "predict"; but you may change this to e.g., function(x, y, ...) predict(x, y, type='response', ...).  
... 
Additional graphical parameters  
Used for the sideeffect of a plot
Robert J. Hijmans
Determine "spatial sorting bias", or the difference between two point data sets in the average distance to the nearest point in a reference dataset.
ssb(p, a, reference, lonlat=TRUE, avg=TRUE)
ssb(p, a, reference, lonlat=TRUE, avg=TRUE)
p 
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations 
a 
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations 
reference 
as above for reference point locations to which distances are computed 
lonlat 
Logical. Use 
avg 
Logical. If 
matrix with two values. 'dp': the average distance from a point in p
to the nearest point in reference
and 'da': the average distance from a point in a
to the nearest point in reference
.
Distance is in meters if lonlat=TRUE
, and in mapunits (typically also meters) if lonlat=FALSE
Robert J. Hijmans
Hijmans, R.J., 2012. Crossvalidation of species distribution models: removing spatial sorting bias and calibration with a nullmodel. Ecology 93: 679688.
ref < matrix(c(54.5,38.5, 2.5, 9.5, 45.5, 1.5, 9.5, 4.5, 10.5, 10.5), ncol=2) p < matrix(c(56.5, 30.5, 6.5, 14.5, 25.5, 48.5, 14.5, 2.5, 14.5, 11.5, 17.5, 11.5), ncol=2) r < raster() extent(r) < c(110, 110, 45, 45) r[] < 1 set.seed(0) a < randomPoints(r, n=50) b < ssb(p, a, ref) # distances in km b / 1000 # an index of spatial sorting bias (1 is no bias, near 0 is extreme bias) b[1] / b[2]
ref < matrix(c(54.5,38.5, 2.5, 9.5, 45.5, 1.5, 9.5, 4.5, 10.5, 10.5), ncol=2) p < matrix(c(56.5, 30.5, 6.5, 14.5, 25.5, 48.5, 14.5, 2.5, 14.5, 11.5, 17.5, 11.5), ncol=2) r < raster() extent(r) < c(110, 110, 45, 45) r[] < 1 set.seed(0) a < randomPoints(r, n=50) b < ssb(p, a, ref) # distances in km b / 1000 # an index of spatial sorting bias (1 is no bias, near 0 is extreme bias) b[1] / b[2]
Find a threshold (cutoff) to transform model predictions (probabilities, distances, or similar values) to a binary score (presence or absence).
## S4 method for signature 'ModelEvaluation' threshold(x, stat='', sensitivity=0.9, ...)
## S4 method for signature 'ModelEvaluation' threshold(x, stat='', sensitivity=0.9, ...)
x 
A ModelEvaluation object (see 
stat 
character. To select a particular threshold (see section 'value' for possible values) 
sensitivity 
numeric between 0 and 1. For the fixed sensitivity threshold 
... 
Additional arguments. None implemented 
data.frame with the following columns:
kappa: the threshold at which kappa is highest ("max kappa")
spec_sens: the threshold at which the sum of the sensitivity (true positive rate) and specificity (true negative rate) is highest
no_omission: the highest threshold at which there is no omission
prevalence: modeled prevalence is closest to observed prevalence
equal_sens_spec: equal sensitivity and specificity
sensitivty: fixed (specified) sensitivity
Robert J. Hijmans and Diego NietoLugilde
## See ?maxent for an example with real data. # this is a contrived example: # p has the predicted values for 50 known cases (locations) # with presence of the phenomenon (species) p < rnorm(50, mean=0.7, sd=0.3) # b has the predicted values for 50 background locations (or absence) a < rnorm(50, mean=0.4, sd=0.4) e < evaluate(p=p, a=a) threshold(e)
## See ?maxent for an example with real data. # this is a contrived example: # p has the predicted values for 50 known cases (locations) # with presence of the phenomenon (species) p < rnorm(50, mean=0.7, sd=0.3) # b has the predicted values for 50 background locations (or absence) a < rnorm(50, mean=0.4, sd=0.4) e < evaluate(p=p, a=a) threshold(e)
Create Voronoi polygons for a set of points. (These are also known Thiessen polygons, and Nearest Neighbor polygons; and the technique used is referred to as Delauny triangulation.)
## S4 method for signature 'ANY' voronoi(x, ext, eps=1e09, ...)
## S4 method for signature 'ANY' voronoi(x, ext, eps=1e09, ...)
x 
SpatialPoints* or two column matrix with x and y coordinates 
ext 
Extent. Can be used to set the corners of the rectangular window enclosing the triangulation. The default is the data range plus 10 percent. See 
eps 
Numerical tolerance used in triangulation. See 
... 
Additional arguments (none) 
SpatialPolygonsDataFrame
This method is based on the link[deldir]{deldir}
function by Rolf Turner and code by Carson Farmer
# points p < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) v < voronoi(p) v
# points p < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) v < voronoi(p) v
Voronoi polygons for presence/absence data
## S4 method for signature 'matrix,matrix' voronoiHull(p, a, ext=NULL, dissolve=FALSE, crs=NA, ...) ## S4 method for signature 'data.frame,data.frame' voronoiHull(p, a, ext=NULL, dissolve=FALSE, crs=NA, ...) ## S4 method for signature 'SpatialPoints,SpatialPoints' voronoiHull(p, a, ext=NULL, dissolve=FALSE, ...)
## S4 method for signature 'matrix,matrix' voronoiHull(p, a, ext=NULL, dissolve=FALSE, crs=NA, ...) ## S4 method for signature 'data.frame,data.frame' voronoiHull(p, a, ext=NULL, dissolve=FALSE, crs=NA, ...) ## S4 method for signature 'SpatialPoints,SpatialPoints' voronoiHull(p, a, ext=NULL, dissolve=FALSE, ...)
p 
Presence points. Two column 
a 
Absence points. Must be of the same class as 
ext 
Extent to limit or expand the area of interest 
dissolve 
Boolean. Dissolve (aggregate) polygons? 
crs 
character or CRS object. PROJ.4 notation coordinate reference system 
... 
Additional arguments passed to 
A VoronoiHull object (inherits from DistModelclass
)
This function is only correct when using a planar coordinate reference system (not longitude/latitude).
Robert J. Hijmans
r < raster(system.file("external/rlogo.grd", package="raster")) # presence points p < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) # absence points a < matrix(c(30, 23, 5, 5, 31, 33, 91, 63, 60, 88, 93, 97, 65, 68, 85, 97, 35, 32, 29, 55, 3, 8, 19, 71, 49, 36, 69, 41, 20, 28, 18, 9, 5, 9, 25, 71, 8, 32, 46, 60), ncol=2) v < voronoiHull(p, a) x < predict(r, v) ## Not run: plot(x) points(p, col='black', pch=20, cex=2) points(a, col='red', pch=20, cex=2) ## End(Not run)
r < raster(system.file("external/rlogo.grd", package="raster")) # presence points p < matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4, 95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45, 7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2) # absence points a < matrix(c(30, 23, 5, 5, 31, 33, 91, 63, 60, 88, 93, 97, 65, 68, 85, 97, 35, 32, 29, 55, 3, 8, 19, 71, 49, 36, 69, 41, 20, 28, 18, 9, 5, 9, 25, 71, 8, 32, 46, 60), ncol=2) v < voronoiHull(p, a) x < predict(r, v) ## Not run: plot(x) points(p, col='black', pch=20, cex=2) points(a, col='red', pch=20, cex=2) ## End(Not run)