kernel density function in R gives unequal x and y resolution - r

I hope you can help me with a problem that I've had for a while now. I need to make a bias file for MaxEnt, for which I have used this tutorial: https://scottrinnan.wordpress.com/2015/08/31/how-to-construct-a-bias-file-with-r-for-use-in-maxent-modeling/ and altered it to my own situation. However, I'm stuck now...
I need to use the kde2d function to create a 2d kernel density estimation and then turn that into a raster. However, the raster that is created has a different resolution for x and y. This is a problem, since I have to use it in MaxEnt which won't accept unequal x and y resolutions.
This is what I did:
library(raster) #raster(), values(), coordinates()
library(MASS) #kde2d()
biasraster <- raster("file.tif") #load raster with all the occurrences
presences <- which(values(biasraster) == 1)
pres.locs <- coordinates(biasraster)[presences, ]
dens <- kde2d(pres.locs[,1], pres.locs[,2], n = c(nrow(biasraster), ncol(biasraster))) #2d kernel density function on the biasraster
dens.ras <- raster(dens) #create raster from kde2d function
The original resolution of the biasraster is 0.00833333 for both x and y, but the resolution for dens.ras has changed to 0.0104052, 0.00833333 (x,y) (so the y resolution is the correct one).
As can probably be seen from the question, I am a total noob when it comes to coding (in r). I've been trying to figure out what to do for about a week now but I can't find any answers that seem to work, so I hope someone here can help me.

I had the same problem. I could only solve it by resampling:
dens.ras <- resample(dens.ras, climdat, method="bilinear") #give the bias file the same resolution as climdat
In addition, I also needed to crop dens.ras:
densmod <- crop(dens.ras, extent(climdat))
With these additional steps, my bias file worked well.
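A possible way to avoid the mismatch in the first place, offered as a sketch rather than a tested fix for the original data: kde2d takes n as the number of grid points in the (x, y) directions, and by default it spans only the range of the input points. Passing the column count first and setting lims to the cell centres of the target grid should give equal spacing. The grid dimensions, origin and cell size below are made-up stand-ins for biasraster.

```r
library(MASS)  # provides kde2d

# Hypothetical grid standing in for biasraster: 120 columns x 80 rows, 0.5-degree cells
res    <- 0.5
ncol_r <- 120
nrow_r <- 80
xmin   <- -10
ymin   <- 30
xmax   <- xmin + ncol_r * res
ymax   <- ymin + nrow_r * res

# Made-up presence locations inside that extent
set.seed(1)
px <- runif(200, xmin, xmax)
py <- runif(200, ymin, ymax)

# n is (x, y), i.e. (columns, rows); lims are the centres of the outermost cells
dens <- kde2d(px, py,
              n    = c(ncol_r, nrow_r),
              lims = c(xmin + res / 2, xmax - res / 2,
                       ymin + res / 2, ymax - res / 2))

dx <- diff(dens$x)[1]  # grid spacing in x
dy <- diff(dens$y)[1]  # grid spacing in y
```

With a real raster, the same idea would use ncol(biasraster), nrow(biasraster), res(biasraster) and the values from extent(biasraster). It is still worth checking res() and extent() on the result of raster(dens) before handing it to MaxEnt.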

Related

how to intersect an interpolated surface z=f(x,y) with z=z0 in R

I found some posts and discussions about the above, but I'm not sure... could someone please check if I am doing anything wrong?
I have a set of N points of the form (x,y,z). The x and y coordinates are independent variables that I choose, and z is the output of a rather complicated (and of course non-analytical) function that uses x and y as input.
My aim is to find a set of values of (x,y) where z=z0.
I looked up this kind of problem in R-related forums, and it appears that I need to interpolate the points first, perhaps using a package like akima or fields.
However, it is less clear to me: 1) if that is necessary, or the basic R functions that do the same are sufficiently good; 2) how I should use the interpolated surface to generate a correct matrix of the desired (x,y,z=z0) points.
E.g. this post seems somewhat related to the problem I am describing, but it looks extremely complicated to me, so I am wondering whether my simpler approach is correct.
Please see below some example code (not the original one, as I said the generating function for z is very complicated).
I would appreciate if you could please comment / let me know if this approach is correct / suggest a better one if applicable.
df <- merge(data.frame(x=seq(0,50,by=5)),data.frame(y=seq(0,12,by=1)),all=TRUE)
df["z"] <- (df$y)*(df$x)^2
ta <- xtabs(z~x+y,df)
contour(ta,nlevels=20)
contour(ta,levels=c(1000))
#why are the x and y axes [0,1] instead of showing the original values?
#and how accurate is the algorithm that draws the contour?
li2 <- as.data.frame(contourLines(ta,levels=c(1000)))
#this extracts the contour data, but all (x,y) values are wrong
require(akima)
s <- interp(df$x,df$y,df$z)
contour(s,levels=c(1000))
li <- as.data.frame(contourLines(s,levels=c(1000)))
#at least now the axis values are in the right range; but are they correct?
require(fields)
image.plot(s)
#fancier, but same problem - are the values correct? better than the akima ones?
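On the [0,1] axes: contour() and contourLines() fall back to a 0..1 scale when they are given only a matrix, so one option is to pass the x and y vectors explicitly. The following is a minimal sketch with the same example function, using base R only. The contour is found by linear interpolation between grid nodes, so how close the points are to z = 1000 depends on how fine the grid is.

```r
# Same example surface as above, built directly as a matrix
x <- seq(0, 50, by = 5)
y <- seq(0, 12, by = 1)
z <- outer(x, y, function(x, y) y * x^2)   # rows follow x, columns follow y

# Pass x and y so the returned coordinates are in the original units
cl <- contourLines(x, y, z, levels = 1000)

# Stack all contour pieces into one data frame of (x, y) points on z = 1000
li <- do.call(rbind, lapply(cl, function(l) data.frame(x = l$x, y = l$y)))

# Rough check: evaluate the known function at the extracted points
li$z_check <- li$y * li$x^2
head(li)
```

The z_check column is only available here because the example function is known; with the real, expensive function one could evaluate a handful of the extracted points to judge the interpolation error.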

R: Is it possible to plot a grid from x, y spatial coordinates?

I've been working with a spatial model which contains 21,000 grid cells of unequal size (i by j, where i is [1:175] and j is [1:120]). I have the latitude and longitude values in two separate arrays (lat_array, lon_array) of i and j dimensions.
Plotting the coordinates:
> plot(lon_array, lat_array, main='Grid Coordinates')
Result:
My question: Is it possible to plot these spatial coordinates as a grid rather than as points? Does anyone know of a package or function that might be able to do this? I haven't been able to find anything online to this nature.
Thanks.
First of all it is always a bit dangerous to plot inherently spherical coordinates (lat,long) directly in the plane. Usually you should project them in some way, but I will leave it for you to explore the sp package and the function spTransform or something like that.
I guess in principle you could simply use the deldir package to calculate the Dirichlet tessellation of your points, which would give you a nice grid. However, you need a bounding region for this to avoid large cells radiating out from the border of your region. I personally use spatstat to call deldir so I can't give you the direct commands in deldir, but in spatstat I would do something like:
library(spatstat)
plot(lon_array, lat_array, main='Grid Coordinates')
W <- clickpoly(add = TRUE) # Now click the region that contains your grid
i_na <- is.na(lon_array) | is.na(lat_array) # Index of NAs
X <- ppp(lon_array[!i_na], lat_array[!i_na], window = W)
grid <- dirichlet(X)
plot(grid)
I have not tested this yet and I will update this answer once I get the chance to test it with some artificial data. A major problem is the size of your dataset which may take a long time to calculate the Dirichlet tessellation of. I have only tried to call dirichlet on dataset of size up to 3000 points...
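If the tessellation turns out to be too slow, a different and much cheaper technique (not the Dirichlet approach above) is to connect neighbouring grid nodes with lines, using base graphics only. This is a sketch on a small made-up curvilinear grid; lon_array and lat_array here are hypothetical stand-ins with the same i by j layout as in the question. Note that it draws lines through the given coordinates, so if those are cell centres the lines are not the cell boundaries.

```r
# Hypothetical stand-ins for lon_array and lat_array: a small curvilinear grid
i <- 1:6
j <- 1:4
lon_array <- outer(i, j, function(i, j) i + 0.1 * j^2)
lat_array <- outer(i, j, function(i, j) j + 0.05 * i^2)

# Empty plot with the right limits, then one line per column and one per row
plot(lon_array, lat_array, type = "n",
     xlab = "lon", ylab = "lat", main = "Grid Coordinates")
matlines(lon_array, lat_array, lty = 1, col = "grey40")        # lines along i, one per j
matlines(t(lon_array), t(lat_array), lty = 1, col = "grey40")  # lines along j, one per i
```

NA entries simply break the lines at that node, so no separate NA index should be needed for this variant.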

options to allow heavily-weighted points on a map to overwhelm other points with low weights

what are some good kriging/interpolation idea/options that will allow heavily-weighted points to bleed over lightly-weighted points on a plotted R map?
the state of connecticut has eight counties. i found the centroid of each and want to plot the poverty rates of these eight counties. three of the counties are very populated (about 1 million people) and the other five counties are sparsely populated (about 100,000 people). since the three densely-populated counties have more than 90% of the total state population, i would like those three densely-populated counties to completely "overwhelm" the map and impact other points across the county borders.
the Krig function in the R fields package has a lot of parameters and also covariance functions that can be called, but i'm not sure where to start?
here is reproducible code to quickly produce a hard-bordered map and then three differently-weighted maps. hopefully i can just make changes to this code, but perhaps it requires something more complex like the geoRglm package? two of the three weighted maps look almost identical, despite one being 10x as weighted as the other..
https://raw.githubusercontent.com/davidbrae/swmap/master/20141001%20how%20to%20modify%20the%20Krig%20function%20so%20a%20huge%20weight%20overwhelms%20nearby%20points.R
thanks!!
edit: here's a picture example of the behavior i want-
Disclaimer: I am not an expert on kriging. Kriging is complex and takes a good understanding of the underlying data, the method and the purpose to achieve the correct result. You may wish to try to get input from @whuber [on the GIS Stack Exchange or contact him through his website (http://www.quantdec.com/quals/quals.htm)] or another expert you know.
That said, if you just want to achieve the visual effect you requested and are not using this for some sort of statistical analysis, I think there are some relatively simple solutions.
EDIT:
As you commented, though the suggestions below to use the theta and smoothness arguments do even out the prediction surface, they apply equally to all measurements and thus do not extend the "sphere of influence" of more densely populated counties relative to less densely populated ones. After further consideration, I think there are two ways to achieve this: by altering the covariance function to depend on population density, or by using weights, as you have. Your weighting approach, as I wrote below, alters the error term of the kriging function. That is, it inversely scales the nugget variance.
As you can see in the semivariogram image, the nugget is essentially the y-intercept, or the error between measurements at the same location. Weights affect the nugget variance (sigma2) as sigma2/weight. Thus, greater weights mean less error at small-scale distances. This does not, however, change the shape of the semivariance function or have much effect on the range or sill.
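To make the sigma2/weight scaling concrete, here is a small arithmetic sketch. The values of sigma2 and size are made up for illustration and are not taken from the fitted model; only the weight pattern is the one used in the Krig calls in this answer.

```r
sigma2  <- 4     # hypothetical nugget variance
size    <- 10    # hypothetical weight for the three populous counties
weights <- c(size, 1, 1, 1, 1, size, size, 1)

# Effective nugget variance at each county centroid
nugget_per_point <- sigma2 / weights
nugget_per_point
```

The heavily weighted points end up with a tenth of the error variance, so the surface is pulled closer to their values at those locations, but the range over which they influence their neighbours is untouched.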
I think that the best solution would be to have your covariance function depend on population. However, I'm not sure how to accomplish that and I don't see any arguments to Krig to do so. I tried playing with defining my own covariance function as in the Krig example, but only got errors.
Sorry I couldn't help more!
Another great resource to help understand kriging is: http://www.epa.gov/airtrends/specialstudies/dsisurfaces.pdf
As I said in my comment, the sill and nugget values as well as the range of the semivariogram are things you can alter to affect the smoothing. By specifying weights in the call to Krig, you are altering the variance of the measurement errors. That is, in a normal use, weights are expected to be proportional to the accuracy of the measurement value so that higher weights represent more accurate measurements, essentially. This isn't actually true with your data, but it may be giving you the effect you desire.
To alter the way your data is interpolated, you can adjust two (and many more) parameters in the simple Krig call you are using: theta and smoothness. theta adjusts the semivariance range, meaning that measured points farther away contribute more to the estimates as you increase theta. Your data range is
range <- data.frame(lon=range(ct.data$lon),lat=range(ct.data$lat))
range[2,]-range[1,]
lon lat
2 1.383717 0.6300484
so, your measurement points vary by ~1.4 degrees lon and ~0.6 degrees lat. Thus, you can play with specifying your theta value in that range to see how that affects your result. In general, a larger theta leads to more smoothing since you are drawing from more values for each prediction.
Krig.output.wt <- Krig( cbind(ct.data$lon,ct.data$lat) , ct.data$county.poverty.rate ,
weights=c( size , 1 , 1 , 1 , 1 , size , size , 1 ),Covariance="Matern", theta=.8)
r <- interpolate(ras, Krig.output.wt)
r <- mask(r, ct.map)
plot(r, col=colRamp(100) ,axes=FALSE,legend=FALSE)
title(main="Theta = 0.8", outer = FALSE)
points(cbind(ct.data$lon,ct.data$lat))
text(ct.data$lon, ct.data$lat-0.05, ct.data$NAME, cex=0.5)
Gives:
Krig.output.wt <- Krig( cbind(ct.data$lon,ct.data$lat) , ct.data$county.poverty.rate ,
weights=c( size , 1 , 1 , 1 , 1 , size , size , 1 ),Covariance="Matern", theta=1.6)
r <- interpolate(ras, Krig.output.wt)
r <- mask(r, ct.map)
plot(r, col=colRamp(100) ,axes=FALSE,legend=FALSE)
title(main="Theta = 1.6", outer = FALSE)
points(cbind(ct.data$lon,ct.data$lat))
text(ct.data$lon, ct.data$lat-0.05, ct.data$NAME, cex=0.5)
Gives:
Adding the smoothness argument changes the smoothness of the Matern covariance function used for your predictions. The default is 0.5, which corresponds to an exponential covariance; larger values give a smoother surface.
Krig.output.wt <- Krig( cbind(ct.data$lon,ct.data$lat) , ct.data$county.poverty.rate ,
weights=c( size , 1 , 1 , 1 , 1 , size , size , 1 ),
Covariance="Matern", smoothness = 0.6)
r <- interpolate(ras, Krig.output.wt)
r <- mask(r, ct.map)
plot(r, col=colRamp(100) ,axes=FALSE,legend=FALSE)
title(main="Theta unspecified; Smoothness = 0.6", outer = FALSE)
points(cbind(ct.data$lon,ct.data$lat))
text(ct.data$lon, ct.data$lat-0.05, ct.data$NAME, cex=0.5)
Gives:
This should give you a start and some options, but you should look at the manual for fields. It is pretty well-written and explains the arguments well.
Also, if this is in any way quantitative, I would highly recommend talking to someone with significant spatial statistics know how!
Kriging is not what you want. (It is a statistical method for accurate--not distorted!--interpolation of data. It requires preliminary analysis of the data--of which you do not have anywhere near enough for this purpose--and cannot accomplish the desired map distortion.)
The example and the references to "bleed over" suggest considering an anamorph or area cartogram. This is a map which will expand and shrink the areas of the county polygons so that they reflect their relative population while retaining their shapes. The link (to the SE GIS site) explains and illustrates this idea. Although its answers are less than satisfying, a search of that site will reveal some effective solutions.
Lots of interesting comments and leads above.
I took a look at the Harvard dialect survey to get a sense for what you are trying to do first. I must say really cool maps. And before I start in on what I came up with...I've looked at your work on survey analysis before and have learned quite a few tricks. Thanks.
So my first take pretty quickly was that if you wanted to do spatial smoothing by way of kernel density estimation then you need to be thinking in terms of point process models. I'm sure there are other ways, but that's where I went.
So what I do below is grab a very generic US map and convert it into something I can use as a sampling window. Then I create random samples of points within that region, just pretend those are your centroids. After I attach random values to those points and plot it up.
I just wanted to test this conceptually, which is why I didn't go through the extra steps to grab cbsa's and also sorry for not projecting, but I think these are the fundamentals. Oh and the smoothing in the dialect study is being done over the whole country. I think. That is the author is not stratifying his smoothing procedure within polygons....so I just added states at the end.
code:
library(sp)
library(spatstat)
library(RColorBrewer)
library(maps)
library(maptools)
# grab us map from R maps package
usMap <- map("usa")
usIds <- usMap$names
# convert to spatial polygons so this can be used as a window below
usMapPoly <- map2SpatialPolygons(usMap,IDs=usIds)
# just select us with no islands
usMapPoly <- usMapPoly[names(usMapPoly)=="main",]
# create a random sample of points on which to smooth over within the map
pts <- spsample(usMapPoly, n=250, type='random')
# just for a quick check of the map and sampling locations
plot(usMapPoly)
points(pts)
# create values associated with points, be sure to play around with
# these after you get the map it's fun
vals <-rnorm(250,100,25)
valWeights <- vals/sum(vals)
ptsCords <- data.frame(pts@coords)
# create window for the point pattern object (ppp) created below
usWindow <- as.owin(usMapPoly)
# create spatial point pattern object
usPPP <- ppp(ptsCords$x,ptsCords$y,marks=vals,window=usWindow)
# create colour ramp
col <- colorRampPalette(brewer.pal(9,"Reds"))(20)
# the plots, here is where the Gaussian kernel density estimation magic happens
# if you want a continuous legend on one of the sides get rid of ribbon=FALSE
# and be sure to play around with sigma
plot(Smooth(usPPP,sigma=3,weights=valWeights),col=col,main=NA,ribbon=FALSE)
map("state",add=TRUE,fill=FALSE)
example no weights:
example with my trivial weights
There is obviously a lot of work in between this and your goal of making this type of map reproducible at various levels of spatial aggregation and sample data, but good luck it seems like a cool project.
p.s. initially I did not use any weighting, but I suppose you could provide weights directly to the Smooth function. Two example maps above.

Difference between two density plots

Is there a simple way to plot the difference between two probability density functions?
I can plot the pdfs of my data sets (both are one-dimensional vectors with roughly 11000 values) on the same plot together to get an idea of the overlap/difference but it would be more useful to me if I could see a plot of the difference.
something along the lines of the following (though this obviously doesn't work):
> plot(density(data1)-density(data2))
I'm relatively new to R and have been unable to find what I'm looking for on any of the forums.
Thanks in advance
This should work:
plot(x =density(data1, from= range(c(data1, data2))[1],
to=range(c(data1, data2))[2] )$x,
y= density(data1, from= range(c(data1, data2))[1],
to=range(c(data1, data2))[2] )$y-
density(data2, from= range(c(data1, data2))[1],
to=range(c(data1, data2))[2] )$y )
The trick is to make sure the densities have the same limits. Then you can plot their differences at the same locations. My understanding of the need for identical limits comes from having made the error of not taking that step when answering a similar question on R-help several years ago. Too bad I couldn't remember the right arguments.
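The same idea can be written so that each density is computed only once. This is a minimal sketch on simulated data; data1 and data2 below are made-up stand-ins for the real vectors.

```r
# Simulated stand-ins for the two data sets (about 11000 values each)
set.seed(42)
data1 <- rnorm(11000, mean = 0,   sd = 1)
data2 <- rnorm(11000, mean = 0.5, sd = 1.2)

# Common limits, so both densities are evaluated on the same x grid
rng <- range(c(data1, data2))
d1  <- density(data1, from = rng[1], to = rng[2])
d2  <- density(data2, from = rng[1], to = rng[2])

# Plot the pointwise difference, with a reference line at zero
plot(d1$x, d1$y - d2$y, type = "l",
     xlab = "x", ylab = "density(data1) - density(data2)")
abline(h = 0, lty = 2)
```

Each call still picks its own bandwidth by default; if the two curves should be smoothed identically, the same bw value can be passed to both calls.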
It looks like you need to spend a little time learning how to use R (or any other language, for that matter). Help files are your friend.
From the output of ?density :
Value [i.e. the data returned by the function]
If give.Rkern is true, the number R(K), otherwise an object with class
"density" whose underlying structure is a list containing the
following components.
x the n coordinates of the points where the density is estimated.
y the estimated density values. These will be non-negative, but can
be zero [remainder of "value" deleted for brevity]
So, do:
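rng <- range(c(data1, data2)) #common limits, so both densities share one x grid
foo <- density(data1, from = rng[1], to = rng[2])
bar <- density(data2, from = rng[1], to = rng[2])
plot(foo$x, foo$y - bar$y, type = "l")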

How to make topographic map from sparse sampling data?

I need to make a topographic map of a terrain for which I have only fairly sparse samples of (x, y, altitude) data. Obviously I can't make a completely accurate map, but I would like one that is in some sense "smooth". I need to quantify "smoothness" (probably the reciprocal of the average of the square of the surface curvature) and I want to minimize an objective function that is the sum of two quantities:
The roughness of the surface
The mean square distance between the altitude of the surface at the sample point and the actual measured altitude at that point
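One common way to write down such an objective is the thin plate smoothing spline criterion. The form below is a sketch under assumptions that are not in the description above: roughness is measured by the integrated squared second derivatives of the surface f, z_i is the measured altitude at sample point (x_i, y_i), and a trade-off parameter lambda balances the two terms.

```latex
J(f) = \frac{1}{n}\sum_{i=1}^{n}\bigl(f(x_i, y_i) - z_i\bigr)^2
       + \lambda \iint \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) dx\, dy
```

A larger lambda favours a smoother surface and a smaller one favours fidelity to the samples.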
Since what I actually want is a topographic map, I am really looking for a way to construct contour lines of constant altitude, and there may be some clever geometric way to do that without ever having to talk about surfaces. Of course I want contour lines also to be smooth.
Any and all suggestions welcome. I'm hoping this is a well-known numerical problem. I am quite comfortable in C and have a working knowledge of FORTRAN. About Matlab and R I'm fairly clueless.
Regarding where our samples are located: we're planning on roughly even spacing, but we'll take more samples where the topography is more interesting. So for example we'll sample mountainous regions more densely than a plain. But we definitely have some choices about sampling, and could take even samples if that simplifies matters. The only issues are
We don't know how much terrain we'll need to map in order to find features that we are looking for.
Taking a sample is moderately expensive, on the order of 10 minutes. So sampling a 100x100 grid could take a long time.
Kriging interpolation may be of some use for smoothly interpolating your sparse samples.
R has many different relevant tools. In particular, have a look at the spatial view. A similar question was asked in R-Help before, so you may want to look at that.
Look at the contour functions. Here's some data:
x <- seq(-3,3)
y <- seq(-3,3)
z <- outer(x,y, function(x,y,...) x^2 + y^2 )
An initial plot is somewhat rough:
contour(x,y,z, lty=1)
Bill Dunlap suggested an improvement: "It often works better to fit a smooth surface to the data, evaluate that surface on a finer grid, and pass the result to contour. This ensures that contour lines don't cross one another and tends to avoid the spurious loops that you might get from smoothing the contour lines themselves. Thin plate splines (Tps from library("fields")) and loess (among others) can fit the surface."
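library("fields")
fit <- Tps(as.matrix(expand.grid(x=x,y=y)), as.vector(z)) #thin plate spline fit to the grid points
contour(predictSurface(fit)) #called predict.surface() in older versions of fields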
This results in a very smooth plot, because it uses Tps() to fit the data first, then calls contour. It ends up looking like this (you can also use filled.contour if you want it to be shaded):
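The quoted suggestion also mentions loess as a way to fit the surface. Here is a minimal sketch of the same fit, refine and contour sequence using only base R, with the same example data; the span value and the size of the finer grid are arbitrary choices for illustration.

```r
# Same coarse example data as above, in long format
x <- seq(-3, 3)
y <- seq(-3, 3)
grid   <- expand.grid(x = x, y = y)
grid$z <- grid$x^2 + grid$y^2

# Fit a local quadratic surface to the coarse samples
fit <- loess(z ~ x + y, data = grid, span = 0.5, degree = 2)

# Evaluate the fitted surface on a finer grid inside the sampled region
xf <- seq(-3, 3, length.out = 61)
yf <- seq(-3, 3, length.out = 61)
zf <- predict(fit, newdata = expand.grid(x = xf, y = yf))
zf <- matrix(zf, nrow = length(xf))   # rows follow xf, columns follow yf

# Contour the refined surface
contour(xf, yf, zf, lty = 1)
```

For irregularly spaced samples the coarse grid would be replaced by the raw (x, y, altitude) points, and the span would need tuning to the sampling density.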
For the plot, you can use either lattice (as in the above example) or the ggplot2 package. Use the geom_contour() function in that case. An example can be found here (ht Thierry):
ds <- matrix(rnorm(100), nrow = 10)
library(reshape)
molten <- melt(data = ds)
library(ggplot2)
ggplot(molten, aes(x = X1, y = X2, z = value)) + geom_contour()
There is an excellent review of contouring algorithms; you might need to mesh the surface first to interpolate onto a grid.
Maybe you can use GEOMap, geomapdata and gtm, together with Matrix, SparseM and slam, in R.
