I would like to plot visits to a website by screen resolution using R's hexbin package. My current data set is in the following format:
Width Height Visits
1366 768 507089
1280 800 401926
1024 768 293210
... ... ...
The complete dataset is available here: http://cl.ly/1t132z0u0l1W
I'd like to plot Width on the x-axis, Height on the y-axis and use Visits to modify cell size and/or color. Hexbin includes a hexTapply function that outputs an array with which we can define cell values, but it does not perform as expected on my data.
I am using the following code:
library(hexbin)
screenres <- read.csv(file="screenres2013.csv",header=TRUE,sep=",")
width <- screenres$Width
height <- screenres$Height
visits <- screenres$Visits
hbin <- hexbin(width,height,xbins=20,IDs=TRUE)
P <- plot(hbin,type='n')
hVisits <- hexTapply(hbin,visits,sum)
pushHexport(P$plot.vp)
grid.hexagons(hbin,style='lattice',use.count=FALSE,cell.at=hVisits)
After executing grid.hexagons(), I receive the following error:
Error in unit(x, default.units) : 'x' and 'units' must have length > 0
In addition: Warning message:
In min(cnt, na.rm = TRUE) : no non-missing arguments to min; returning Inf
If I then modify hVisits as follows:
hVisits[1] <- 1
and retry grid.hexagons(), it executes as expected.
I have not encountered this error with any of the examples in the documentation. I don't understand what is causing the issue or why the small modification to hVisits has any effect on the function's execution. Is this a problem with my data or am I missing a crucial step?
I'm very new to R (downloaded it yesterday), so please let me know whether any additional information would be helpful. Thanks in advance for your time.
My dataset contains animal locations and IDs. I am trying to compute home ranges (HR) using kernel density estimation. Because the dataset is huge, I split it into two.
> library(sp)
> library(adehabitatHR)
> head(temp)
id x y
92 10 480147.6 3112738
93 10 480081.6 3112663
94 10 479992.6 3112667
95 10 479972.4 3112759
96 10 479931.7 3112758
97 10 479970.7 3112730
Each dataset has 99586 observations which include 190 unique IDs. As a result, I am unable to produce a reproducible dataset.
When I use the kernelUD function, the computation runs without problems. When I try to get the 95% home range, I get an error.
> kernel_temp<- kernelUD(temp)
> kernel_95 <- getverticeshr(kernel_temp, percent = 95)
Error in getverticeshr.estUD(x[[i]], percent, ida = names(x)[i], unin, :
The grid is too small to allow the estimation of home-range.
You should rerun kernelUD with a larger extent parameter
I searched for this problem and found a suggested solution: passing a grid built from the coordinates of the points. Creating the grid coordinates, however, gives another error.
> x <- seq(min(temp$x),max(temp$x),by=1.)
> y <- seq(min(temp$y),max(temp$y),by=1.)
> xy <- expand.grid(x=x,y=y)
> gc()
> coordinates(xy) <- ~x+y
Error: cannot allocate vector of size 6.7 Gb
I have a Windows system with 32 GB of RAM. Checking my processes, I can see that RAM is still available, but R is unable to allocate it.
Moving ahead, I passed an arbitrary grid value just to see whether it would work, but I still get the same error.
> kernel_temp<- kernelUD(temp, grid = 1000)
> kernel_95 <- getverticeshr(kernel_temp, percent = 95)
Error in getverticeshr.estUD(x[[i]], percent, ida = names(x)[i], unin, :
The grid is too small to allow the estimation of home-range.
You should rerun kernelUD with a larger extent parameter
When I expand the xy grid- I see my observations are
which is huge. I wanted to know if there was any easier way of computing the HR or passing the grid function without the grid being so huge?
Any help is greatly appreciated. :)
EDIT-
I tried extent = 2 and have the same problem.
> kernel_temp<- kernelUD(temp, extent = 2)
> kernel_95 <- getverticeshr(kernel_temp, percent = 95)
Error in getverticeshr.estUD(x[[i]], percent, ida = names(x)[i], unin, :
The grid is too small to allow the estimation of home-range.
You should rerun kernelUD with a larger extent parameter
After a few more consultations with friends and colleagues, I found the answer.
When you have numerous locations, the best way to calculate HR with KDE is to experiment with the grid size and the extent: lowering the grid and increasing the extent is what worked here.
In this case, I was able to calculate HR with:
kernelUD(locs_year,grid = 500, h="href", extent = 5)
I also tried other settings, including grid = 1000, without success; grid = 500, extent = 5 was the sweet spot.
Thank you for your help! Maybe someday this answer will be useful to someone. :)
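For anyone wondering why the 1-unit grid in the question could not be allocated while grid = 500 is manageable, here is a rough base-R estimate. The coordinate ranges are made up for illustration (they are not taken from the data above), and cells_for_spacing, cells_for_count and gib are hypothetical helper names. It also assumes that a numeric grid argument yields on the order of grid x grid cells, which is my reading of the kernelUD documentation rather than something I have verified.

```r
# Number of grid nodes when the spacing is fixed (as with seq(..., by = 1)).
cells_for_spacing <- function(xrange, yrange, by) {
  nx <- floor(diff(xrange) / by) + 1
  ny <- floor(diff(yrange) / by) + 1
  nx * ny
}

# Approximate number of nodes when the grid size is fixed instead.
# Assumption: a numeric grid of n gives roughly n x n nodes.
cells_for_count <- function(n) n * n

# Hypothetical extent in map units, NOT the real data.
xr <- c(470000, 500000)
yr <- c(3100000, 3130000)

fine   <- cells_for_spacing(xr, yr, by = 1)
coarse <- cells_for_count(500)

# Two double-precision coordinates per node, 8 bytes each.
gib <- function(cells) cells * 2 * 8 / 1024^3
cat(sprintf("1-unit spacing: %.0f nodes, about %.1f GiB\n", fine, gib(fine)))
cat(sprintf("grid = 500:     %.0f nodes, about %.4f GiB\n", coarse, gib(coarse)))
```

The point is only the order of magnitude: with a fixed spacing the node count grows with the area covered, while a fixed grid size keeps it bounded no matter how large the extent is.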
I would like to draw a very large heatmap with the great ComplexHeatmap library.
The initial matrix has more than 2000 rows and 50+ columns with integer values running from -3 to +3.
I encountered the Error: C stack usage is too close to the limit issue immediately - this might be the limitation of the underlying recursive (?) algorithm.
I found the jitter parameter as a solution - after some struggling with prlimit and ulimit stack adjustment.
So everything is almost OK now, except that the jitter parameter randomizes the row clustering on each execution.
This makes it hard to check the consistency of the resulting heatmaps in my pipeline.
I realized that I can access the input matrix stored in the heatmap object.
For example:
hm <- Heatmap(input_matrix,
name = "Monstre Heatmap etc.",
# ... long parameter list ...
show_row_dend = TRUE,
jitter = TRUE,
# ... further parameters
)
> # access the stored data and compare it with the input:
> identical(input_matrix, hm@matrix)
[1] TRUE
Is there a field in the heatmap object that exposes the jittered matrix?
Per request, this is reiterating my comments above:
The jitter introduced in the Heatmap function when you set jitter = TRUE can be kept reproducible when you set a fixed random seed prior to running the function, e.g. set.seed(123).
If jitter = TRUE, random values from uniform distribution between 0 and 1e-10 are generated, as per the Heatmap documentation, so you could probably just introduce jitter into the matrix yourself (using a defined random seed), prior to running Heatmap and get the same result, such as:
input_matrix <- input_matrix + runif(length(input_matrix), 0, 1e-10),
as you mentioned yourself already.
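A minimal sketch of that idea in base R, without ComplexHeatmap itself. The helper name add_jitter, the seed value and the stand-in matrix are assumptions for illustration; only the jitter range (0 to 1e-10) comes from the text above.

```r
# Add tiny, reproducible jitter to a matrix before passing it to Heatmap().
# add_jitter is a hypothetical helper, not part of ComplexHeatmap.
add_jitter <- function(m, seed = 123, amount = 1e-10) {
  set.seed(seed)                       # fixed seed, so the jitter is the same on every run
  m + runif(length(m), 0, amount)      # matrix + vector keeps the dimensions of m
}

# Stand-in for the real input: integers from -3 to 3.
set.seed(1)
input_matrix <- matrix(sample(-3:3, 2000 * 50, replace = TRUE), nrow = 2000)

jittered_a <- add_jitter(input_matrix)
jittered_b <- add_jitter(input_matrix)
identical(jittered_a, jittered_b)      # same seed, so the two results match
```

The jittered matrix can then be passed to Heatmap with jitter = FALSE, which also means you have the shifted matrix in hand instead of looking for it inside the heatmap object. Note that calling set.seed inside the helper resets the global random number state.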
Regarding the memory issue, you may try to install and use fastcluster, a faster drop-in replacement for hclust that may also consume less memory.
For ComplexHeatmap to use it, it may be necessary to run ht_opt("fast_hclust" = TRUE) prior to running Heatmap. To reset to defaults, use ht_opt(RESET = TRUE).
Concerning the legend, you can configure the limits of the color legend yourself (see: https://jokergoo.github.io/ComplexHeatmap-reference/book/legends.html)
As a side note, I found no issue when generating a 20000 x 50 matrix with 50% identical rows or columns with jitter = FALSE, so I am not sure your stack issues are directly caused by this.
I hope you can help me with a problem I can't find a way to overcome. Sorry if I make some mistakes while writing this post; my English is a bit rusty right now.
Here is the situation. I have .shp data that I want to analyze in R. Each .shp file contains either lines, representing the lines of traps we set to catch octopuses, or points located directly over those lines, representing where we captured one.
The question I'm trying to answer is: are octopuses statistically grouped or not?
After a bit of investigation, it seems that I need to use R and the linearK function to answer that question, using the packages maptools, spatstat and sp.
Here is the code I'm using in RStudio:
Loading the libraries
library(spatstat)
library(maptools)
library(sp)
Creating a linnet object with the track
t1<- as.linnet(readShapeSpatial("./20170518/t1.shp"))
I get the following warning but it seems to work
Warning messages:
1: use rgdal::readOGR or sf::st_read
2: use rgdal::readOGR or sf::st_read
Plotting it to be sure everything is ok
plot(t1)
Creating a ppp object with the points
p1<- as.ppp(readShapeSpatial("./20170518/p1.shp"))
I get the same warning here, but the real problems start when I try to plot it:
> plot(p1)
Error in if (!is.vector(xrange) || length(xrange) != 2 || xrange[2L] < :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: Interpretation of arguments maxsize and markscale has changed (in spatstat version 1.37-0 and later). Size of a circle is now measured by its diameter.
2: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
3: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
4: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
5: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
6: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
7: In plot.ppp(x, ..., multiplot = FALSE, do.plot = FALSE) :
All mark values are NA; plotting locations only.
Now what is left is to join the objects into an lpp object and analyze it with the linearK function
> pt1 <- lpp(p1,t1)
> linearK(pt1)
Function value object (class ‘fv’)
for the function r -> K[L](r)
......................................
Math.label Description
r r distance argument r
est {hat(K)[L]}(r) estimated K[L](r)
......................................
Default plot formula: .~r
where “.” stands for ‘est’
Recommended range of argument r: [0, 815.64]
Available range of argument r: [0, 815.64]
This is my situation right now. What I don't know is why the plot function is not working with my ppp object and how to interpret the output of the linearK function. help(linearK) didn't provide any clue. Since I have a lot of tracks, each with its set of points, my desired outcome would be some kind of summary like: x tracks analyzed, a grouped, b dispersed and c unknown.
Thank you for your time; I'll greatly appreciate it if you can help me solve this problem.
Edit: Here is a link to a zip file containing all the shp files of one day, both tracks and points, and a txt file with my code. https://drive.google.com/open?id=0B0uvwT-2l4A5ODJpOTdCekIxWUU
First two pieces of general advice: (1) each time you create a complicated object, print it at the terminal, to see if it is what you expected. (2) When you get an error, immediately type traceback() and copy the output. This will reveal exactly where the error is detected.
A ppp object must include a specification of the study region (window). In your code, the object p1 is created by converting data of class SpatialPointsDataFrame, which does not include a specification of the study region. The conversion, done by the function as.ppp.SpatialPointsDataFrame, produces an object of class ppp in which the window is guessed by taking the bounding box of the coordinates. Unfortunately, in your example, there is only one data point in p1, so the default bounding box is a rectangle of width 0 and height 0. [This would have been revealed by printing p1.] Such objects can usually be handled by spatstat, but this particular object triggers a bug in the function plot.solist, which expects windows to have non-zero size. I will fix the bug, but...
In your case, I suggest you do
Window(p1) <- Window(t1)
immediately after creating p1. This will ensure that p1 has the window that you probably intended.
If all else fails, read the spatstat vignette on shapefiles...
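To illustrate the zero-size window without needing spatstat, here is a small base-R sketch. The bounding_box helper and the coordinates are made up for illustration; it only mimics the idea of guessing a window from the bounding box of the points and is not the code that as.ppp runs.

```r
# Bounding box of a set of points, as a stand-in for the guessed window.
# bounding_box is a hypothetical helper.
bounding_box <- function(x, y) list(xrange = range(x), yrange = range(y))

# One hypothetical capture location (made-up coordinates).
one_point <- bounding_box(x = 712345.6, y = 4012345.6)
diff(one_point$xrange)   # width of the box for a single point
diff(one_point$yrange)   # height of the box for a single point

# With two or more distinct points the box has non-zero size.
two_points <- bounding_box(x = c(712345.6, 712400.0),
                           y = c(4012345.6, 4012300.0))
diff(two_points$xrange)
```

With a single point both ranges collapse to that point, which is why taking the window from the track, as suggested above, is the safer choice.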
I have managed to find a solution. As Adrian Baddeley noticed, there was a problem with the owin object. That problem seems to be bypassed (not really solved) if I create the ppp object manually instead of converting my set of points.
I have also replaced the readShapeSpatial function with rgdal::readOGR, since the former is deprecated, which was the reason for the warnings I was getting.
This is the R script I'm using right now, commented for clarity:
# first install spatstat, maptools and sp
# load them
library(spatstat)
library(maptools)
library(sp)
# create a vector of folders, will add more when everything works fine
folders=c("20170518")
for(f in folders){
  # read all shp from that folder, both points and tracks
  pointfiles <- list.files(paste("./",f,"/points", sep=""), pattern="\\.shp$")
  trackfiles <- list.files(paste("./",f,"/tracks", sep=""), pattern="\\.shp$")
  # for each point and track pair
  for(i in seq_along(pointfiles)){
    # create a linnet object with the track
    t<- as.linnet(rgdal::readOGR(paste("./",f,"/tracks/",trackfiles[i], sep="")))
    #plot(t)
    # read the set of points that will become a ppp object
    pre_p<-rgdal::readOGR(paste("./",f,"/points/",pointfiles[i], sep=""))
    #plot(p)
    # obtain the coordinates of the current set of points
    c<-coordinates(pre_p)
    # create vector of x coords
    xc=c()
    # create vector of y coords
    yc=c()
    # not a very good way to fill my vectors but it works for my study area
    for(v in c){
      if(v>4000000){yc<-c(yc,v)}
      else {if(v<4000000 && v>700000){xc<-c(xc,v)}}
    }
    print(xc)
    print(yc)
    # create a ppp object using the vectors of x and y coords, and a window object
    # extracted from my set of points
    p=ppp(xc,yc,Window(as.ppp(pre_p)))
    # join them into an lpp object
    pt <- lpp(p,t)
    #plot(pt)
    # analyze it with the linearK function, nsim=9 for testing purposes
    # envelope.lpp is the method for analyzing linear point patterns
    assign(paste("results",f,i,sep="_"),envelope.lpp(pt, nsim=9, fun=linearK))
  } # end for each points & track set
} # end for each day of study
So, as you can see, this script tests each pair of points and track for CSR, for each day, and it is working fine right now. Unfortunately, I have not yet managed to create a report or something report-like with the results (or even to fully understand them); I'll keep working on that. Of course, I welcome any advice you have, since this is my first attempt with R and many newbie mistakes will happen.
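As a starting point for the summary, here is a rough sketch of how each result could be labelled from its envelope. It assumes each result can be turned into a data frame with columns obs, lo and hi (the observed curve and the simulation limits), which is how I understand envelope objects to be laid out. The helper classify_envelope, the labels and the two fake results are made up for illustration. With nsim=9 the labels are only indicative, not a formal test.

```r
# Label one result by comparing the observed curve with the envelope limits.
# classify_envelope is a hypothetical helper; it expects columns obs, lo and hi.
classify_envelope <- function(env) {
  above <- any(env$obs > env$hi, na.rm = TRUE)   # observed K above the envelope
  below <- any(env$obs < env$lo, na.rm = TRUE)   # observed K below the envelope
  if (above && below) {
    "mixed"
  } else if (above) {
    "grouped"
  } else if (below) {
    "dispersed"
  } else {
    "no departure detected"
  }
}

# Two fake results standing in for real envelopes (made-up numbers).
r <- seq(0, 100, by = 10)
fake_grouped <- data.frame(r = r, obs = r * 1.5, lo = r * 0.8, hi = r * 1.2)
fake_random  <- data.frame(r = r, obs = r,       lo = r * 0.8, hi = r * 1.2)

results <- list(track_1 = fake_grouped, track_2 = fake_random)
labels <- vapply(results, classify_envelope, character(1))
table(labels)   # counts per label, i.e. the kind of summary asked for
```

Collecting the envelopes in a list instead of using assign() makes this kind of tabulation straightforward.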
The script and the shp files with the updated folder structure can be found here (113 KB size)
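One possible simplification of the threshold-based loop that fills xc and yc: as far as I know, coordinates() returns a matrix with one row per point and one column per coordinate, so the columns can be taken directly. The matrix below is a made-up stand-in for that return value, not real data.

```r
# Stand-in for the matrix returned by coordinates(): one row per point,
# x in the first column and y in the second (made-up values).
coords <- matrix(c(712345.6, 4012345.6,
                   712400.0, 4012300.0,
                   712380.2, 4012410.9),
                 ncol = 2, byrow = TRUE)

xc <- coords[, 1]   # all x coordinates, in point order
yc <- coords[, 2]   # all y coordinates, in point order
```

This does not depend on the numeric range of the study area, and it keeps each x paired with its own y.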
I am using the plotrix package to make polar plots from my measurements.
It looks like even when I provide measurements for all the polar coordinates from 1 to 360 degrees (or, equivalently, 0 to 359), the first and last points are not connected. For example:
require(plotrix)
polar.plot(seq(1,360),polar.pos=1:360,radial.lim=c(0,361),rp.type="l")
A quick and dirty fix I found was to add one more position, so 361 instead of 360, as in
polar.plot(seq(1,360),polar.pos=0:360,radial.lim=c(0,361),rp.type="l")
which gives warning messages.
Warning messages:
1: In cos(radial.pos[i, ]) * lengths[i, ] :
longer object length is not a multiple of shorter object length
2: In sin(radial.pos[i, ]) * lengths[i, ] :
longer object length is not a multiple of shorter object length
Are there any alternatives? Showing warning messages to my end users is not something I would like to do :)
I would like to thank you for your reply
Regards
Alex
It's going to connect the points in order. So, if you want the final vertical line back to the origin, you need to add a data point at the end of the vector to make it do so. The warning you got arises because you added an extra value to one coordinate vector but not the other, so the two are not of equal length. R recycled the shorter vector to fill it out, which happened to give you what you wanted, but warned you that it was doing so.
polar.plot(c(seq(1,360), 1),
polar.pos = c(1:360, 1),
radial.lim = c(0,361),
rp.type = "l")
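The recycling behaviour can be reproduced without plotrix. This base-R sketch uses the same vector lengths as in the question; close_loop is a hypothetical helper name.

```r
values <- 1:360      # 360 radii
angles <- 0:360      # 361 angles: one more than the radii

# Element-wise arithmetic on vectors of different lengths recycles the shorter
# one and warns when the longer length is not a multiple of the shorter.
msg <- tryCatch({ values * angles; "no warning" },
                warning = function(w) conditionMessage(w))
msg

# Closing the curve by repeating the first point keeps both vectors the same length.
close_loop <- function(v) c(v, v[1])   # hypothetical helper
closed_values <- close_loop(values)
closed_angles <- close_loop(1:360)
length(closed_values) == length(closed_angles)
```

Passing closed_values and closed_angles to polar.plot is equivalent to the call above, just with the repetition made explicit.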
I have to simulate an image with a white crack on a black background. So I defined a function that takes a matrix of zeros and sets some consecutive points to one.
The function is the following:
crepa<-function(matrice) {
start<-sample(1:ncol(matrice),1)
matrice[1,start]<-1
for (i in 2:nrow(matrice)) {
alpha<-sample(c(-1,0,1),1)
succ<-start+alpha
if (succ==(ncol(matrice)+1)) succ==ncol(matrice)
if (succ==0) succ==1
matrice[i,succ]<-1
start<-succ
}
matrice<-as.matrix(matrice)
}
To check whether the function works well, I applied it repeatedly to the following matrix:
m<-matrix(0,64,64)
imma<-crepa(m)
par(mar=rep(0,4))
image(t(imma), axes = FALSE, col = grey(seq(0, 1, length = 256)))
In most cases the result is correct. However, in a few cases I run into this error:
Error in `[<-`(`*tmp*`, i, succ, value = 1) : subscript out of bounds
These two lines:
if (succ==(ncol(matrice)+1)) succ==ncol(matrice)
if (succ==0) succ==1
Should be:
if (succ==(ncol(matrice)+1)) succ=ncol(matrice)
if (succ==0) succ=1
In case you still can't see it, you've used the equality test == when you should use assignment = or <-.
The error message told me it had to be the element going off the matrix, so I started printing out the values of succ and then noticed it wasn't being reset within the right range, and only then did I spot the mistake. I probably looked at the code ten times without noticing. I also figured that kind of error was more likely with a small matrix, and so tested with a 6x6 matrix which meant I could be more likely to see it than with a 64x64!
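For completeness, here is one way the corrected function could look. This is a sketch rather than the original author's code: it clamps the column index with min and max instead of the two if statements, and the name crepa_fixed is made up. It is tried on a 6x6 matrix, as in the debugging described above.

```r
# Draw a one-pixel-wide "crack" from the top row to the bottom row of a matrix.
# crepa_fixed is a hypothetical name for the corrected function.
crepa_fixed <- function(matrice) {
  start <- sample(seq_len(ncol(matrice)), 1)      # random starting column
  matrice[1, start] <- 1
  for (i in seq_len(nrow(matrice))[-1]) {         # rows 2..n, empty if only one row
    succ <- start + sample(c(-1, 0, 1), 1)        # move left, stay, or move right
    succ <- min(max(succ, 1), ncol(matrice))      # clamp to valid columns (assignment, not ==)
    matrice[i, succ] <- 1
    start <- succ
  }
  matrice                                         # return the matrix visibly
}

m <- matrix(0, 6, 6)
imma <- crepa_fixed(m)
rowSums(imma)   # one crack pixel in every row
```

A small matrix hits the edges often, so repeated runs exercise the clamping much more quickly than a 64x64 one would.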